Discussion:
[arch-general] [arch-dev-public] AUR ToS (aka making AUR user names public)
Mauro Santos via arch-general
2017-03-05 21:12:55 UTC
Hi,
I was recently contacted by a Polish researcher asking for a list of AUR
account names. I did not expect this to be controversial but a couple of
Trusted Users raised concerns on IRC, so I decided to move this to the
public mailing list and discuss the whole topic in generality. I would
like to hear more opinions but please read the whole email and give it a
second thought before simply bringing up the usual privacy arguments
mentioned below.
My original question was: Are we fine with sharing the list of AUR
account names (only user names, no real names or email addresses) with
a researcher that seems trustworthy and agrees to not share the data in
any form other than the resulting anonymized statistics?
In this particular case, we are talking about Dorota Celinska [1] from
the University of Warsaw, Faculty of Economic Sciences [2], see [3] for
a list of her publications and [4] for a summary of her research project
funded recently by the Polish National Science Centre. She needs the
list of user names to perform a segmentation analysis, including users
which were active on the older AUR releases but do not show any
activity on AUR 4. She would also like to use the user names as
identifiers to establish connections with other platforms, such as
GitHub.
The next question is: Would it make sense to even make this data
publicly available? Would it make sense to extend our RPC interface such
that one can search for user names? GitHub, for example, already
provides such an interface [5]. Let me quickly summarize some arguments:
* User names are mostly identifiers. It is questionable whether they
can/should be considered personal/private information. Maybe this can
only be answered by a lawyer, though.
* The user names of all accounts with any kind of public activity, like
uploading a package, filing a request, writing a comment, are public
already.
* After logging into the aurweb interface, you can already check whether
an account with a given user name exists because the account details
page URIs have the form https://aur.archlinux.org/account/$username.
This means that for any platform providing a list of user names (such
as GitHub), you can "establish connections" with the AUR already.
* Principle of data economy: We should not share any kind of information
we do not need to share.
* Sharing user names lowers the threshold for sharing other information
which is considered more confidential.
* Users can (and should) already use crawlers to fetch the user names.
For example, the user names of all package maintainers and comment
authors appear on the package details pages. The names of all users
filing package requests appear in the mailing list archives etc.
* We do not have ToS so we better not share anything.
I, personally, find the second last argument a very weak one. Telling
users to build crawlers scraping and brute-forcing our HTML pages makes
life difficult for both them and us. What do you think?
On the other side of the coin, the last argument is a very good one and
it brings me to my last point. Independently of the outcome of this
discussion, I think we should add some ToS that users need to agree upon
when registering. It should contain information on liability and on
privacy. Is anybody willing to write a draft? Do we need the support of
a lawyer here?
Thank you for your time and have a nice Sunday!
Regards,
Lukas
[1] http://coin.wne.uw.edu.pl/dcelinska/en/
[2] https://www.wne.uw.edu.pl/index.php/en/
[3] http://coin.wne.uw.edu.pl/dcelinska/en/pages/publications.html
[4] https://ncn.gov.pl/sites/default/files/listy-rankingowe/2016-03-15/streszczenia/337724-en.pdf
[5] https://developer.github.com/v3/users/
I'd say err on the side of caution and don't share. Even though the
usernames are public and easy to find by scraping them from the
website/mailing list/etc., handing over the whole database of usernames on
a silver platter, which is what is being asked, is a whole different story.
Is there any community/website that provides a full list of registered
usernames on request?

There is also the question of how useful that data would be. Without any
other data such as email addresses, the username list is useless: you have
no guarantee that user foo on GitHub is the same person as user foo on the
AUR/Wiki/Forum or user foo somewhere else. In this case I'd also have to
agree that sharing usernames lowers the threshold for sharing other
information.

It also doesn't fit with their stated research goals: only GitHub and
projects associated with scraping data from GitHub are mentioned. Why
would they want to throw the AUR usernames into the mix?
--
Mauro Santos
Ralf Mardorf
2017-03-05 23:36:35 UTC
Giving away any data is bad, period.
I hate this fashion that nowadays every "expert" holding a share is
granted access to data, that even the NSA isn't getting that easy.
Starting to give away such data to "researchers" is evil, let alone
that all that "serious" statistics are just bullshit.
No exceptions in regards to privacy!
An "exception" is a violation of privacy.
Why not give away telephone numbers? Everybody anyway could dial each
available number, so it doesn't matter to give away the numbers,
right? ;)
Leonid Isaev
2017-03-06 01:14:02 UTC
Isn't Arch BBS already providing a list of usernames?

In general, though, I'd say follow the principle of least effort. Why not just
publish the list of usernames and that's all? This way, new users can easily
grep for them and don't need scrapers, and "researchers" can have fun...
Post by Ralf Mardorf
Giving away any data is bad, period.
I hate this fashion that nowadays every "expert" holding a share is
granted access to data, that even the NSA isn't getting that easy.
Starting to give away such data to "researchers" is evil, let alone
that all that "serious" statistics are just bullshit.
No exceptions in regards to privacy!
An "exception" is a violation of privacy.
Why not give away telephone numbers? Everybody anyway could dial each
available number, so it doesn't matter to give away the numbers,
right? ;)
Oh, please. Not the usual NSA crap again.

Cheers,
--
Leonid Isaev
Tinu Weber
2017-03-06 08:44:12 UTC
Post by Leonid Isaev
Isn't Arch BBS already providing a list of usernames?
The BBS's user list is only available to logged-in users. Although that
is certainly not an extended privacy measure, it still prevents random
people who just "pass by" from extracting the user list.
Post by Leonid Isaev
In general, though, I'd say follow the principle of least effort. Why not just
publish the list of usernames and that's all? This way, new users can easily
grep for them and don't need scrapers, and "researchers" can have fun...
Because anonymisation: even if one dataset in isolation may look
unsuspicious from a privacy POV, if combined with other datasets, it may
suddenly reveal information that was not intended to be public.

I admit that a simple one-column list of user nick names may probably
not really be joinable with other datasets or -tables in any useful
manner, but it is still not always obvious how data can be (ab)used
(see also [1]).
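
To make the joining concern concrete, here is a minimal Python sketch.
Everything in it is an assumption for illustration: the two lists are made
up, the function name is hypothetical, and nothing is fetched from the AUR
or any other service. It only shows how little work it takes to match two
plain username lists once both are available:

```python
# Hypothetical data: neither list comes from a real service.
aur_names = ["foo", "Bar", "baz_dev", "qux"]
other_names = ["bar", "foo", "someone_else"]


def shared_names(left, right):
    """Return the names that occur in both lists, compared case-insensitively."""
    right_folded = {name.casefold() for name in right}
    return sorted({name.casefold() for name in left} & right_folded)


matches = shared_names(aur_names, other_names)
print(matches)  # ['bar', 'foo']
```

A match here only says that the same string was registered twice; it says
nothing about whether the same person is behind both accounts.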

I would not give out the user list. Even if there are means for
everybody to somehow obtain the data (with enough effort from their
side), it is not the same thing as simply handing it out conveniently
prepared and formatted.

Best,
Tinu


[1] http://archive.wired.com/politics/security/commentary/securitymatters/2007/12/securitymatters_1213
Leonidas Spyropoulos via arch-general
2017-03-06 08:58:11 UTC
I was under the impression that the AUR git interface is just one big
git repo. Yes it checks out only the package you clone but the
references contain all packages (and commits). Am I mistaken about this?

Regards,
--
Leonidas Spyropoulos

A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Leonid Isaev
2017-03-06 09:14:52 UTC
Post by Tinu Weber
Because anonymisation: even if one dataset in isolation may look
unsuspicious from a privacy POV, if combined with other datasets, it may
suddenly reveal information that was not intended to be public.
I admit that a simple one-column list of user nick names may probably
not really be joinable with other datasets or -tables in any useful
manner, but it is still not always obvious how data can be (ab)used
(see also [1]).
I would not give out the user list. Even if there are means for
everybody to somehow obtain the data (with enough effort from their
side), it is not the same thing as simply handing it out conveniently
prepared and formatted.
See, the rule should be that private information is the one that is manifestly
marked so. For example, a password or a secret key is private information which
you never ever disclose to anyone.

But a username is by definition open. Therefore, if your privacy relies on a
web service not disclosing usernames, you haven't considered the threat model
carefully enough.

What I'm saying is just another example of avoiding security through obscurity:
don't rely on a web service not advertising your usernames, if this is an
issue, make each username a random string (which defeats the attack [1]).
Post by Tinu Weber
[1] http://archive.wired.com/politics/security/commentary/securitymatters/2007/12/securitymatters_1213
Cheers,
--
Leonid Isaev
Ralf Mardorf
2017-03-06 09:03:21 UTC
Post by Leonid Isaev
Isn't Arch BBS already providing a list of usernames?
In general, though, I'd say follow the principle of least effort. Why
not just publish the list of usernames and that's all? This way, new
users can easily grep for them and don't need scrapers, and
"researchers" can have fun...
Post by Ralf Mardorf
Giving away any data is bad, period.
I hate this fashion that nowadays every "expert" holding a share is
granted access to data, that even the NSA isn't getting that easy.
Starting to give away such data to "researchers" is evil, let alone
that all that "serious" statistics are just bullshit.
No exceptions in regards to privacy!
An "exception" is a violation of privacy.
Why not give away telephone numbers? Everybody anyway could dial
each available number, so it doesn't matter to give away the numbers,
right? ;)
Oh, please. Not the usual NSA crap again.
I did not write about the NSA. I only pointed out that even the NSA
doesn't get all the data as a gift. Why should a researcher get such
data as a gift? You are seemingly already that used to data mining and
offended privacy, that it's good and natural from your point of view,
if data is misused and any concerns are just crap in your opinion. That
usernames are used in public and maybe even a list might be already
published, is different to actively give the same data away to
"researchers" and to formally allow them to use the data. You seem not
to understand the principle of privacy. If you don't lock the street
door, this does not automatically indicate that you want people to come
into your house and take away your property. Btw. what is the aim of
the research and how could the research be used or possibly misused? We
don't need to care about such questions, if data isn't given away as a
matter of principle.
Bennett Piater
2017-03-06 09:38:43 UTC
Post by Ralf Mardorf
I did not write about the NSA. I only pointed out that even the NSA
doesn't get all the data as a gift. Why should a researcher get such
data as a gift? You are seemingly already that used to data mining and
offended privacy, that it's good and natural from your point of view,
if data is misused and any concerns are just crap in your opinion. That
usernames are used in public and maybe even a list might be already
published, is different to actively give the same data away to
"researchers" and to formally allow them to use the data. You seem not
to understand the principle of privacy. If you don't lock the street
door, this does not automatically indicate that you want people to come
into your house and take away your property. Btw. what is the aim of
the research and how could the research be used or possibly misused? We
don't need to care about such questions, if data isn't given away as a
matter of principle.
I also don't think that the list should be published.
--
GPG fingerprint: 871F 1047 7DB3 DDED 5FC4 47B2 26C7 E577 EF96 7808
Henrik Danielsson via arch-general
2017-03-06 09:52:51 UTC
I guess I'll be the devil's advocate. I see no privacy issues in handing
over a list of already public information. You could deny it for practical
reasons though, if you simply could not be bothered to scrape/export such a
list yourself. Denying or allowing won't stop anyone from obtaining the
list.

If users were concerned about their usernames being public, they shouldn't
have submitted them publicly. Public information is public, deal with it
and stop being so paranoid, they're gonna get you anyway. ;)
Ralf Mardorf
2017-03-06 10:18:42 UTC
Post by Henrik Danielsson via arch-general
I guess I'll be the devil's advocate. I see no privacy issues in
handing over a list of already public information. You could deny it
for practical reasons though, if you simply could not be bothered to
scrape/export such a list yourself. Denying or allowing won't stop
anyone from obtaining the list.
If users were concerned about their usernames being public, they
shouldn't have submitted them publicly. Public information is public,
deal with it and stop being so paranoid, they're gonna get you
anyway. ;)
Privacy is a principle. You seem not to understand the difference
between giving somebody data with the formal permission to use this data
and data that simply is available for everybody, but not explicitly
handed over to somebody. Paranoia isn't involved in my concern.
Henrik Danielsson via arch-general
2017-03-06 11:20:19 UTC
Post by Ralf Mardorf
Privacy is a principle. You seem not to understand the difference
between giving somebody data with the formal permission to use this data
and data that simply is available for everybody, but not explicitly
handed over to somebody. Paranoia isn't involved in my concern.
My standpoint is that privacy does not apply to this kind of public
information, simply because it's not private and by no means sensitive
(people freely chose the username and other visible info they posted, no?).
Thus, no, I see no difference and really no point in even considering
trying to keep such information private.

What anyone does with the freely available information posted in the AUR is
up to them ("mining" it or handing it over to someone else included), we
could not do anything about it anyway, nor would I even care if I was in
that list or not, since there seems to be no ToS between the one submitting
that information and the one publishing it. Since it was freely submitted
without any terms, I can simply not find any restrictions on its usage.

Yes, we should have a ToS to at least keep the principle of privacy alive.
But let's face it, real privacy online has been dead for long, if it ever
existed.

If there was a ToS, the situation would perhaps have been different, at
least legally. I'm no legal expert of course, but to me it makes perfect
sense that if you posted something on the internet, in a very public space,
you can have no expectations of keeping any of that information private in
any way, nor any information easily associated with it.
No, I don't see that as a problem, at least not if you never explicitly
agreed that information would not be shared. What I really want to keep
private I don't post anywhere.
Mauro Santos via arch-general
2017-03-06 11:53:34 UTC
Post by Henrik Danielsson via arch-general
Post by Ralf Mardorf
Privacy is a principle. You seem not to understand the difference
between giving somebody data with the formal permission to use this data
and data that simply is available for everybody, but not explicitly
handed over to somebody. Paranoia isn't involved in my concern.
My standpoint is that privacy does not apply to this kind of public
information, simply because it's not private and by no means sensitive
(people freely chose the username and other visible info they posted, no?).
Thus, no, I see no difference and really no point in even considering
trying to keep such information private.
What anyone does with the freely available information posted in the AUR is
up to them ("mining" it or handing it over to someone else included), we
could not do anything about it anyway, nor would I even care if I was in
that list or not, since there seems to be no ToS between the one submitting
that information and the one publishing it. Since it was freely submitted
without any terms, I can simply not find any restrictions on its usage.
Yes, we should have a ToS to at least keep the principle of privacy alive.
But let's face it, real privacy online has been dead for long, if it ever
existed.
If there was a ToS, the situation would perhaps have been different, at
least legally. I'm no legal expert of course, but to me it makes perfect
sense that if you posted something on the internet, in a very public space,
you can have no expectations of keeping any of that information private in
any way, nor any information easily associated with it.
No, I don't see that as a problem, at least not if you never explicitly
agreed that information would not be shared. What I really want to keep
private I don't post anywhere.
I think the point here is not so much privacy, as I believe everyone
recognizes that the information that was asked for (the full list of
usernames) is public and can be scraped.

The point here is handing over the full list of usernames on request. Do
note that in their research proposal[1] they specifically mention
scraping information from github. That information is public, github
does have an API to query that information, but they still have to
scrape it, I suppose that implies github does not hand it over wholesale
on request, why should we? This might be due to their ToS or they know
something we don't.

[1]
https://ncn.gov.pl/sites/default/files/listy-rankingowe/2016-03-15/streszczenia/337724-en.pdf
--
Mauro Santos
Ralf Mardorf
2017-03-06 12:13:32 UTC
Post by Mauro Santos via arch-general
I think the point here is not so much privacy, as I believe everyone
recognizes that the information that was asked for (the full list of
usernames) is public
It's not per se forbidden to take a photo of a public location, it's
even allowed to take the photo and to publish the photo, if a girl
randomly is on that photo, too. It is forbidden to provide a collection
of such photos to somebody else, who needs such photos for a porn
website. Now "research" isn't "porn", but subtleties could make it hard
to decide how to handle something like this. That something is public,
doesn't mean that privacy could be ignored.
Mauro Santos via arch-general
2017-03-06 13:32:22 UTC
Post by Ralf Mardorf
Post by Mauro Santos via arch-general
I think the point here is not so much privacy, as I believe everyone
recognizes that the information that was asked for (the full list of
usernames) is public
It's not per se forbidden to take a photo of a public location, it's
even allowed to take the photo and to publish the photo, if a girl
randomly is on that photo, too. It is forbidden to provide a collection
of such photos to somebody else, who needs such photos for a porn
website. Now "research" isn't "porn", but subtleties could make it hard
to decide how to handle something like this. That something is public,
doesn't mean that privacy could be ignored.
I'm not saying privacy doesn't matter, it does. The usernames are there
for everyone to see, there is no expectation of privacy on that, or the
comments on packages.

What I feel is the crux of the problem here is handing the list (or
database) of users wholesale. I believe you have framed the main
question better than I have in one of your replies :)
--
Mauro Santos
Henrik Danielsson via arch-general
2017-03-06 12:45:37 UTC
Post by Mauro Santos via arch-general
I think the point here is not so much privacy, as I believe everyone
recognizes that the information that was asked for (the full list of
usernames) is public and can be scraped.
The point here is handing over the full list of usernames on request. Do
note that in their research proposal[1] they specifically mention
scraping information from github. That information is public, github
does have an API to query that information, but they still have to
scrape it, I suppose that implies github does not hand it over wholesale
on request, why should we? This might be due to their ToS or they know
something we don't.
It would be rather interesting to see what they could come up with from
that correlation.

I think, perhaps a bit cynically, the reason github may not hand over that
data directly is likely that they don't want to do some of the work of the
researchers for them. As you said, the data is there, the format matters
less if they're going to massage it into something else later anyway, so
why bother with the effort of compiling it on their [github] own time?

We could simply deny the AUR username request for the same reason, or no
reason at all. Since some people seem uncomfortable about what could be
derived from a potential correlation of publicly available data, that's
most likely the safest way to go.
Mauro Santos via arch-general
2017-03-06 13:36:39 UTC
Post by Henrik Danielsson via arch-general
It would be rather interesting to see what they could come up with from
that correlation.
Probably nothing meaningful. As I've said before you have no way of
knowing if user foo on github is the same as user foo on the AUR.
--
Mauro Santos
Henrik Danielsson via arch-general
2017-03-06 13:46:20 UTC
Post by Mauro Santos via arch-general
Probably nothing meaningful. As I've said before you have no way of
knowing if user foo on github is the same as user foo on the AUR.
True, but you could make a decent guess based on how many coincidences
there are surrounding those names.
Relations between names could be interesting even if the people behind
them are not the same.
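
The "coincidences" idea can be sketched as a simple overlap score. This is
only an illustration under assumptions: the function name and the package
and repository names below are hypothetical, and a high score is a hint,
not proof, that two accounts belong to the same person.

```python
def overlap_score(aur_packages, other_repos):
    """Jaccard similarity between two sets of project names (0.0 to 1.0)."""
    a = {name.casefold() for name in aur_packages}
    b = {name.casefold() for name in other_repos}
    if not a and not b:
        # Two empty profiles share nothing we could count as a coincidence.
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical profiles for a user "foo" on two platforms.
score = overlap_score(["foo-git", "bar"], ["bar", "baz"])
print(round(score, 2))
```

Any threshold for calling two accounts "probably the same" would be a
judgment call; the score only counts shared project names.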
Ralf Mardorf
2017-03-06 14:01:16 UTC
Post by Henrik Danielsson via arch-general
We could simply deny the AUR username request for the same reason,
or no reason at all. Since some people seem uncomfortable about what
could be derived from a potential correlation of publicly available
data, that's most likely the safest way to go.
Even if all users agreed to hand out a username list, why risk
a possible issue for some research that seems to gain nothing for the
Arch community and, as far as I can see, not even for humankind? To be
honest, I can't name a real issue, I can only imagine very abstract
issues. I don't understand that research at all. Most likely nothing bad
would happen by handing out a list, but to avoid a "Now, why didn't I
think of that?"-issue the easiest solution seems to be to reject such
requests in general, at least as long as it's not obvious that the
research is "good" (whatever this means) for the Arch community and/or
humankind or the universe in general.
Henrik Danielsson via arch-general
2017-03-06 14:07:53 UTC
Permalink
Post by Ralf Mardorf
Post by Henrik Danielsson via arch-general
We could simply deny the AUR username request for the same reason,
or no reason at all. Since some people seem uncomfortable about what
could be derived from a potential correlation of publicly available
data, that's most likely the safest way to go.
Even if all users would agree to hand out a username list, why risking
a possible issue for some research, that seems to gain nothing for the
Arch community and as far as I can see even not for human kind? To be
honest, I can't name a real issue, I only could imagine very abstract
issues. I don't understand that research at all. Much likely nothing bad
would happen by handing out a list, but to avoid a "Now, why didn't I
think of that?"-issue the easiest solution seems to reject such
requests in general, at least as long as it's not obviously that the
research is "good" (what ever this means) for the Arch community and/or
human kind or the universe in general.
Well, there's probably a lot of research results we did not know the
positive [or any] effects of beforehand.
I also doubt we'll find some drastically new improved way of life
because of this, but not all research aims for that.
Satisfying curiosity would be enough reason for most research IMHO.
Learning there is nothing there is also learning.
Martin Kühne via arch-general
2017-03-06 14:18:43 UTC
Permalink
Post by Ralf Mardorf
Much likely nothing bad
would happen by handing out a list, but to avoid a "Now, why didn't I
think of that?"-issue the easiest solution seems to reject such
requests in general, at least as long as it's not obviously that the
research is "good" (what ever this means) for the Arch community and/or
human kind or the universe in general.
So you're admitting that you can't come up with a real concern and are
opposing just for the sake of opposing.

I think it's important to discuss such issues a lot in general,
because they improve our reasoning.

But just to take away the outcome of your well-meaning pessimism:
There could be a white hair in that soup just as there could indeed be
a black hair in the opposite colored "Now, why didn't I think of
that?" soup of well-meaning optimism you warn against. If you can
follow this rather heavily metaphoric dish of thought, it looks like
we learned something today, in that case.

cheers!
mar77i
Ralf Mardorf
2017-03-06 14:53:59 UTC
Permalink
Post by Martin Kühne via arch-general
Post by Ralf Mardorf
Much likely nothing bad
would happen by handing out a list, but to avoid a "Now, why didn't I
think of that?"-issue the easiest solution seems to reject such
requests in general, at least as long as it's not obviously that the
research is "good" (what ever this means) for the Arch community
and/or human kind or the universe in general.
So you're admitting that you can't come up with a real concern and are
opposing just for the sake of opposing.
Wrong! Protection of privacy is something that requires much thinking
and much weighing. Abstract imagination of issues is reason enough to
deny such a request, as long as the researcher doesn't plausibly
explain the benefit of the research. If somebody wants to hand out
the requested data, this person should provide more easy-to-understand
information that isn't too long to read.
Post by Martin Kühne via arch-general
I also doubt we'll find some drastically new improved way of life
because of this, but not all research aims for that.
Satisfying curiosity would be enough reason for most research IMHO.
Learning there is nothing there is also learning.
Curiosity about what? How many equal nicknames were used on AUR and
github and what kind of software is related to those nicknames? A
researcher is seriously interested in this information? Not in
something else?

How do you know that this research is about learning something
positive?

We, the Internet and/or phone home app users already suffer from much
misused data. It's reasonable to be sceptical with regard to protection of
privacy.

Does anybody have the slightest idea about the aim of this research?

"anonymized statistics" and "establish connections" are abstract
phrases. Not abstract is that those claims are contradictory, without
the need of much abstract concerns or paranoia.

In the end I don't care, since I have more or less given up hoping that
people nowadays are interested in really thinking about protection of
privacy, hence I'll opt out. I only wanted to point out my doubts. Done.

Regards,
Ralf
Martin Kühne via arch-general
2017-03-06 15:06:30 UTC
Permalink
Post by Ralf Mardorf
Has got somebody the slightest idea about the aim of this research?
good question.
Post by Ralf Mardorf
"anonymized statistics" and "establish connections" are abstract
phrases. Not abstract is that those claims are contradictory, without
the need of much abstract concerns or paranoia.
none of these are crimes. and xxxjavaturtle69xxx actually writes
vectorgraphics in java.
Post by Ralf Mardorf
In the end I don't care, since I more or less have given up that
nowadays people are interested in really thinking about protection of
privacy, hence I'll op out, I only wanted to point out my doubts. Done.
Your approach kind of reminds me about how statistics is a much
misunderstood field. It doesn't matter what or how you record
statistics, you are always going to get rid of most of the data for
the sake of having a general overview, and nothing keeps you from
misinterpreting the results you get. That *still* doesn't make the
tools completely useless, as it's great for grouping many data points
into individual sectors. Of course it's not a simple topic, but you
can't fit a one-opt-out-fits-all-opt-outs approach to the problem
domain and think you're done?

cheers!
mar77i
Ralf Mardorf
2017-03-06 15:22:50 UTC
Permalink
Hi,

ok a last reply to this topic.

Since the usernames are public anyway, there must be a reason to ask for a
list.

- politeness?
- laziness?
- something related to laws?
- ??

Perhaps the research has nothing to do with AUR and github, but e.g.
with a method, maybe an algorithm to "establish connections", perhaps
for manipulation purposes? I can imagine a lot of "good" but just as
well "evil" reasons.
Post by Martin Kühne via arch-general
Post by Ralf Mardorf
Has got somebody the slightest idea about the aim of this research?
good question.
Post by Ralf Mardorf
"anonymized statistics" and "establish connections" are abstract
phrases. Not abstract is that those claims are contradictory, without
the need of much abstract concerns or paranoia.
none of these are crimes. and xxxjavaturtle69xxx actually writes
vectorgraphics in java.
Researchers sometimes misuse real records, not to harm those who
originally own those records, they just want to test processes that
later should be used for something that isn't related to those
"test" records.
Post by Martin Kühne via arch-general
Post by Ralf Mardorf
In the end I don't care, since I more or less have given up that
nowadays people are interested in really thinking about protection of
privacy, hence I'll op out, I only wanted to point out my doubts. Done.
Your approach kind of reminds me about how statistics is a much
misunderstood field. It doesn't matter what or how you record
statistics, you will always going to get rid of most of the data for
the sake of having a general overview, and nothing keeps you from
misinterpreting the results you get. That *still* doesn't make the
tools completely useless, as it's great for grouping many data points
into individual sectors. Of course it's not a simple topic, but you
can't fit a one-opt-out-fits-all-opt-outs approach to the problem
domain and think you're done?
No, as already pointed out, we don't know what this research is for.
Who says that the target of the research is statistics? This statistics
thingy perhaps is just to test or train something that later should be
used for something completely different.

Now I'll use my imagination for continuing a music project done with Linux.

Regards,
Ralf
Guus Snijders via arch-general
2017-03-06 10:26:25 UTC
Permalink
Op 6 mrt. 2017 10:52 schreef "Henrik Danielsson via arch-general" <
arch-***@archlinux.org>:

I guess I'll be the devil's advocate. I see no privacy issues in handing
over a list of already public information. You could deny it for practical
reasons though, if you simply could not be bothered to scrape/export such a
list yourself. Denying or allowing won't stop anyone from obtaining the
list.


I'd say don't.
It's not that the information cannot be obtained otherwise, but I believe
it makes a legal difference.

Also, I don't see any advantage for ArchLinux in handing over this info.
If they really want this info to profile AUR users and contributors, they
can either compile their own info (using git or scraping), or they could
use an opt-in mechanism and ask the users if they want to participate.

I know it's not directly a privacy issue, but I find it scary
nonetheless... (especially since they expressed the wish to consolidate the
data with other websites such as github).



Mvg, Guus Snijders
Martin Kühne via arch-general
2017-03-06 10:39:33 UTC
Permalink
On Mon, Mar 6, 2017 at 11:26 AM, Guus Snijders via arch-general
Post by Guus Snijders via arch-general
Op 6 mrt. 2017 10:52 schreef "Henrik Danielsson via arch-general" <
Post by Henrik Danielsson via arch-general
I guess I'll be the devil's advocate. I see no privacy issues in handing
over a list of already public information You could deny it for practical
reasons though, if you simply could not be bothered to scrape/export such a
list yourself. Denying or allowing won't stop anyone from obtaining the
list.
Gaetan's criticism applies to you here, now. please designate
paragraphs of text which you reply to.
Post by Guus Snijders via arch-general
I know it's not directly an privacy issue, but I find it scary
nonetheless... (especially since they expressed the wish to consolidate the
data with other websites such as github).
This is exactly the argument I was struggling to come up with. As
far as I followed the discussion, this was the first time (I
realized?) someone clearly disconnected the argument from the privacy
discussion. Put this way, it makes sense to me, too. For practical
purposes, we'd be handing it over along with a note of "do whatever you
want with it, we don't care". Turns out we do care what someone else
does with the data.

cheers!
mar77i
mike lojkovic via arch-general
2017-03-06 11:05:14 UTC
Permalink
I really don't want to make it any easier for someone to spam me based on
correlations between account names.

On Mon, Mar 6, 2017 at 4:39 AM, Martin Kühne via arch-general <
Post by Martin Kühne via arch-general
On Mon, Mar 6, 2017 at 11:26 AM, Guus Snijders via arch-general
Post by Guus Snijders via arch-general
Op 6 mrt. 2017 10:52 schreef "Henrik Danielsson via arch-general" <
Post by Henrik Danielsson via arch-general
I guess I'll be the devil's advocate. I see no privacy issues in handing
over a list of already public information You could deny it for practical
reasons though, if you simply could not be bothered to scrape/export such a
list yourself. Denying or allowing won't stop anyone from obtaining the
list.
Gaetan's criticism applies to you here, now. please designate
paragraphs of text which you reply to.
Post by Guus Snijders via arch-general
I know it's not directly an privacy issue, but I find it scary
nonetheless... (especially since they expressed the wish to consolidate the
data with other websites such as github).
This is exactly for the argument I was struggling to come up with. As
far as I followed the discussion, this was the first time (I
realized?) someone clearly disconnected the argument from the privacy
discussion. Put this way, it makes sense to me, too. For the practical
implications we'd hand over along, with a note of "do whatever you
want with it we don't care". Turns out we do care what someone else
does with the data.
cheers!
mar77i
Henrik Danielsson via arch-general
2017-03-06 11:21:49 UTC
Permalink
2017-03-06 11:39 GMT+01:00 Martin Kühne via arch-general <
Post by Martin Kühne via arch-general
Gaetan's criticism applies to you here, now. please designate
paragraphs of text which you reply to.
I was not replying to anyone in particular. Gaetan? Sorry, you lost me
there.
Martin Kühne via arch-general
2017-03-06 11:58:57 UTC
Permalink
On Mon, Mar 6, 2017 at 12:21 PM, Henrik Danielsson via arch-general
Post by Henrik Danielsson via arch-general
I was not replying to anyone in particular. Gaetan? Sorry, you lost me
there.
It may not have appeared in the same thread for you, but here we go
[0] context, and the mail I was replying to was [1], for which the
former applies.

cheers!
mar77i

[0] http://www.mail-archive.com/arch-dev-***@archlinux.org/msg25123.html
[1] http://www.mail-archive.com/arch-***@archlinux.org/msg43113.html
Henrik Danielsson via arch-general
2017-03-06 12:58:20 UTC
Permalink
2017-03-06 12:58 GMT+01:00 Martin Kühne via arch-general
Post by Martin Kühne via arch-general
On Mon, Mar 6, 2017 at 12:21 PM, Henrik Danielsson via arch-general
Post by Henrik Danielsson via arch-general
I was not replying to anyone in particular. Gaetan? Sorry, you lost me
there.
It may not have appeared in the same thread for you, but here we go
[0] context, and the mail I was replying to was [1], for which the
former applies.
cheers!
mar77i
You are right, those did not show up as a thread for me, or even in my
inbox. Thank you for that.
I suppose I should have quoted part of the original message, but it
was on a list I'm not a subscriber on and the quote in Mauro's
mail did not render as a quote normally does in Gmail, hence I was not
sure it would render correctly if I messed up copying it (HTML
entities and all). I've never been able to make heads or tails of what
mailing lists actually consider new or continued threads, or even of
navigating archives. :(
Ralf Mardorf
2017-03-06 11:50:41 UTC
Permalink
Post by Martin Kühne via arch-general
Post by Guus Snijders via arch-general
I know it's not directly an privacy issue, but I find it scary
nonetheless... (especially since they expressed the wish to
consolidate the data with other websites such as github).
This is exactly for the argument I was struggling to come up with. As
far as I followed the discussion, this was the first time (I
realized?) someone clearly disconnected the argument from the privacy
discussion. Put this way, it makes sense to me, too. For the practical
implications we'd hand over along, with a note of "do whatever you
want with it we don't care". Turns out we do care what someone else
does with the data.
This doesn't disconnect it from the privacy reasoning, this is a
privacy issue, too.
Post by Martin Kühne via arch-general
It's not that the information cannot be obtained otherwise, but I
believe it makes a legal difference.
This is the whole point. It makes a difference to explicitly provide a
list under somebody's responsibility, isolated from the individual
responsibility of the individual user, and by doing this quasi allow
the list to be used for information processing. It's not allowed to
download and misuse a photo that is published on a homepage. The photo
is accessible to everybody, but not necessarily free for use.
Usernames are readable by everybody, but this doesn't imply that it's
allowed to use the usernames for information processing. It might also
not be forbidden. What matters is that the responsibility for having a
username lies with the user, and the responsibility for collecting and
processing the data lies with the "researcher" and not with a third
party providing a list.
YANG Ling via arch-general
2017-03-07 03:08:26 UTC
Permalink
Post by Mauro Santos via arch-general
Hi,
I was recently contacted by a Polish researcher asking for a list of AUR
account names. I did not expect this to be controversial but a couple of
Trusted Users raised concerns on IRC, so I decided to move this to the
public mailing list and discuss the whole topic in generality. I would
like to head more opinions but please read the whole email and give it a
second thought before simply bringing up the usual privacy arguments
mentioned below.
My original questions was: Are we fine with sharing the list of AUR
accounts names (only user names, no real names or email addresses) with
a researcher that seems trustworthy and agrees to not share the data in
any form other than the resulting anonymized statistics?
In this particular case, we are talking about Dorota Celinska [1] from
the University of Warsaw, Faculty of Economic Sciences [2], see [3] for
a list of her publications and [4] for a summary of her research project
funded recently by the Polish National Science Centre. She needs the
list of user names to perform a segmentation analysis, including users
which were active on the older AUR releases both do not show any
activity on AUR 4. She would also like to use the user names as
identifiers to establish connections with other platforms, such as
GitHub.
The next question is: Would it make sense to even make this data
publicly available? Would it make sense to extend our RPC interface such
that one can search for users names? GitHub, for example, already
provides such an interface [5]. Let me quickly summarize some arguments
* User names are mostly identifiers. It is questionable whether they
can/should be considered personal/private information. Maybe this can
only be answered by a lawyer, though.
* The user names of all accounts with any kind of public activity, like
uploading a package, filing a request, writing a comment, are public
already.
* After logging into the aurweb interface, you can already check whether
an account with a given user name exists because the account details
page URIs have the form https://aur.archlinux.org/account/$username.
This means that for any platform providing a list of user names (such
as GitHub), you can "establish connections" with the AUR already.
* Principle of data economy: We should not share any kind of information
we do not need to share.
* Sharing user names lowers the threshold for sharing other information
which is considered more confidential.
* Users can (and should) already use crawlers to fetch the user names.
For example, the user names of all package maintainers and comment
authors appear on the package details pages. The names of all users
filing package requests appear in the mailing list archives etc.
* We do not have ToS so we better not share anything.
I, personally, find the second last argument a very weak one. Telling
users to build crawlers scraping an brute-forcing our HTML pages makes
life difficult for both them and us. What do you think?
On the other side of the coin, the last argument is a very good one and
it brings me to my last point. Independently of the outcome of this
discussion, I think we should add some ToS that users need to agree upon
when registering. It should contain information on liability and on
privacy. Is anybody willing to write a draft? Do we need the support of
a lawyer here?
Thank you for your time and have a nice Sunday!
Regards,
Lukas
[1] http://coin.wne.uw.edu.pl/dcelinska/en/
[2] https://www.wne.uw.edu.pl/index.php/en/
[3] http://coin.wne.uw.edu.pl/dcelinska/en/pages/publications.html
[4] https://ncn.gov.pl/sites/default/files/listy-rankingowe/2016-03-15/streszczenia/337724-en.pdf
[5] https://developer.github.com/v3/users/
I'd say err on the caution side and don't share, even though the
usernames are public and easy to find by scraping them from the
website/mailing list/etc, handing the whole database of usernames in a
silver platter is a whole different story, which is what is being asked.
Is there any community/website that provides a full list of registered
usernames on request?
There is also the question of how useful that data would be, without any
other data such as email the username list is useless, you have no
guarantee that user foo on github is the same person as user foo on the
AUR/Wiki/Forum or user foo somewhere else. In this case I'd also have to
agree that sharing usernames lowers the threshold for sharing other
information.
It also doesn't fit with their stated research goals, only github and
projects associated with scraping data from github are mentioned, why
would they want to throw the AUR usernames in the mix?
--
Mauro Santos
Hi all,
Shall we focus on Lukas's questions?
Post by Mauro Santos via arch-general
My original questions was: Are we fine with sharing the list of AUR
accounts names (only user names, no real names or email addresses) with
a researcher that seems trustworthy and agrees to not share the data in
any form other than the resulting anonymized statistics?
→ The first question: Are we fine with sharing the user names?
Post by Mauro Santos via arch-general
The next question is: Would it make sense to even make this data
publicly available? Would it make sense to extend our RPC interface such
that one can search for users names? GitHub, for example, already
provides such an interface [5]. Let me quickly summarize some arguments
→ The second question: Would it make sense to even make this data publicly available?
Post by Mauro Santos via arch-general
I think we should add some ToS that users need to agree upon
when registering. It should contain information on liability and on
privacy. Is anybody willing to write a draft? Do we need the support of
a lawyer here?
→ The third question: Shall we add some ToS that users need to agree upon when registering?

My opinions:

1. The first question: Are we fine with sharing the user names?
I am fine. But I think some agreements should be made before sharing
the data.
2. The second question: Would it make sense to even make this data
publicly available?
No, it is not OK. Please check this wiki [1]. A login name or nickname
is personally identifiable information (PII).
3. The third question: Shall we add some ToS that users need to agree
upon when registering?
Yes, it is better to have ToS.

[1]: https://www.wikiwand.com/en/Personally_identifiable_information
Eli Schwartz via arch-general
2017-03-07 03:46:14 UTC
Permalink
Post by YANG Ling via arch-general
Hi all,
Shall we focus on Lukas's questions?
Yes, let's.

[skipped - pointlessly quoted and then repeated questions]
Post by YANG Ling via arch-general
1. The first question: Are we fine with sharing the user names?
I am fine. But I think some agreements should be made before sharing
the data.
There is no need to be fine or not, the user names are *already* public
(with the exception of people who have never uploaded a package, left a
comment, filed a package request, or indeed visibly interacted with the
AUR in any way).
Post by YANG Ling via arch-general
2. The second question: Would it make sense to even make this data
publicly available?
No, it is not OK. Please check this wiki [1]. Login name or nickname
is Personally identifiable information (PII).
Okay... firstly, thanks for the strange Wikipedia proxy....

Stating a tautology does not advance this discussion. No one thought for
a moment that usernames weren't somehow "personally identifying
information".
Lukas elaborated upon this question, by providing actual arguments for
and against. By ignoring 90% of what he said, you are stripping the
discussion of most meaningful context, and replacing it with some vague
buzzwords.
Post by YANG Ling via arch-general
3. The third question: Shall we add some ToS that users need to agree
upon when registering?
Yes, it is better to have ToS.
This wasn't even the question. Lukas said we should have a ToS, and he
*asked* if anyone was willing to draft one.

...

I really don't understand why people seem to have a paranoia issue with
other people having an efficient interface to data that is already
there. Researchers and Peeping Toms can already find out all this
information by hitting the AUR server a lot and scraping HTML responses.
Offering the *same* data with less overhead can only serve to ease
server congestion (on "our" end) and reduce the *time expended*
reinventing the username list (on "their" end).
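To make the "efficient interface" point concrete, here is a minimal sketch of collecting maintainer names from a package search response instead of scraping HTML. It assumes the AUR RPC search endpoint (`/rpc/?v=5&type=search&arg=...`) and a `Maintainer` field in each result. The helper names and the sample payload are hypothetical, and no network request is made here.

```python
import json
from urllib.parse import urlencode

# Assumed base URL of the AUR RPC interface.
RPC_BASE = "https://aur.archlinux.org/rpc/"


def build_search_url(term):
    """Build a package search URL for the given term (assumes RPC v5)."""
    query = urlencode({"v": 5, "type": "search", "arg": term})
    return f"{RPC_BASE}?{query}"


def maintainers_from_response(payload):
    """Extract the distinct maintainer names from a JSON search response."""
    data = json.loads(payload)
    names = {result.get("Maintainer") for result in data.get("results", [])}
    names.discard(None)  # packages without a maintainer carry no name
    return sorted(names)


# Hypothetical response body, shaped like a search result.
sample = json.dumps({
    "results": [
        {"Name": "pkg-a", "Maintainer": "alice"},
        {"Name": "pkg-b", "Maintainer": None},
        {"Name": "pkg-c", "Maintainer": "bob"},
        {"Name": "pkg-d", "Maintainer": "alice"},
    ]
})
print(build_search_url("foo"))
print(maintainers_from_response(sample))  # ['alice', 'bob']
```

Note that this only ever yields names that are already public as package maintainers; accounts without any visible activity would not appear.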

Do we wish to penalize all researchers for the evil habit of extracting
personally identifiable information, by making them slog through the
process of compiling their information? Knowing full well that it won't
actually stop them (for good or for ill)?

Do we even owe anything to the relevant users? Since there is no ToS, an
argument could be made that we all agreed to share whatever information
we have in fact shared, without asking for qualifications about what the
Arch Linux project intended to *do* with our usernames etc.
(The usual IANAL applies.)

tl;dr
Let us emulate the forums, and provide a username list only accessible
to logged-in AUR users.
--
Eli Schwartz
Ralf Mardorf
2017-03-07 05:48:08 UTC
Permalink
Post by Eli Schwartz via arch-general
Let us emulate the forums, and provide a username list only accessible
to logged-in AUR users.
So you recommend that AUR should deviate from the Arch related mailing
lists. Note that Mailman mailing lists can be set up to "The subscribers
list is only available to the list members" as done for
https://lists.archlinux.org/listinfo/aur-general and then the members
can decide on their own whether they are visible in this list [1].
However, https://lists.archlinux.org/listinfo/arch-general is set up to
"The subscribers list is only available to the list administrator".

So it was always the default, and still is the default, not to provide
such lists. There are sane reasons why those options are available, but
we are at a point where nobody cares about privacy anymore and you
recommend that providing such a list is appropriate. For what reason?
Why not keep what always was and still is the default for Arch, simply
to respect privacy?

[1]
"Conceal yourself from subscriber list?

When someone views the list membership, your email address is normally
shown (in an obscured fashion to thwart spam harvesters). If you do not
want your email address to show up on this membership roster at all,
select Yes for this option."
Ralf Mardorf
2017-03-07 06:08:21 UTC
Permalink
Post by Ralf Mardorf
Post by Eli Schwartz via arch-general
Let us emulate the forums, and provide a username list only accessible
to logged-in AUR users.
So you recommend that AUR should deviate from the Arch related mailing
lists. Note, mailman mailing list could be set up to "The subscribers
list is only available to the list members" as done for
https://lists.archlinux.org/listinfo/aur-general and then the members
could decide on their own, if they are visible by this list [1].
However, https://lists.archlinux.org/listinfo/arch-general is set up to
"The subscribers list is only available to the list administrator".
So it was always the default and still is the default, not to provide
such lists. There are sane reasons why those options are available, but
we are at a point were nobody cares anymore about privacy and you
recommend that providing such a list is appropriate. For what reason?
Why not keeping what always was and still is default for Arch, simply
to respect privacy?
[1]
"Conceal yourself from subscriber list?
When someone views the list membership, your email address is normally
shown (in an obscured fashion to thwart spam harvesters). If you do not
want your email address to show up on this membership roster at all,
select Yes for this option."
PS:

And btw. stop calling people who care about privacy "paranoid". I for
example decided not to hide my membership and email address.

Another question: Are you willing to become the one responsible for
providing such a list in the legal sense?
YANG Ling via arch-general
2017-03-07 06:18:14 UTC
Permalink
Post by Eli Schwartz via arch-general
Post by YANG Ling via arch-general
Hi all,
Shall we focus on Lukas's questions?
Yes, let's.
[skipped - pointlessly quoted and then repeated questions]
Sorry, I'm not familiar with the rules here. I thought it was necessary
to keep all the original text.
Post by Eli Schwartz via arch-general
Post by YANG Ling via arch-general
2. The second question: Would it make sense to even make this data
publicly available?
No, it is not OK. Please check this wiki [1]. Login name or nickname
is Personally identifiable information (PII).
Okay... firstly, thanks for the strange Wikipedia proxy....
Oops, I forgot to paste the original wiki link. Wikiwand is a tool which
can beautify Wikipedia pages. If that bothers you, here is the original
link: https://en.wikipedia.org/wiki/Personally_identifiable_information
Post by Eli Schwartz via arch-general
tl;dr
Let us emulate the forums, and provide a username list only accessible
to logged-in AUR users.
Actually I am not *strongly* against it. I just feel it is not necessary
to do it.
Here is my logic:
If something can make Archlinux better, then we do it.
If something can make Archlinux worse, then we don't do it.
If we cannot tell whether something makes Archlinux better or worse,
then it is not necessary to do it.

In this case, what does Archlinux or its community gain from sharing
usernames with researchers? I cannot see merits.

Well, I am not an Archlinux expert, nor a Trusted User. Maybe you can see
merits which are not obvious to me.
If most Trusted Users feel it is OK to share usernames, I am fine with it.

Don't get me wrong. I simply hope Archlinux becomes better and better.

--
Allen Yang
Neven Sajko via arch-general
2017-03-08 19:57:12 UTC
Permalink
This discussion is pointless without legal advice. Without it,
disclosing user information (even if it is public) does not seem like
such a good idea.
Neven Sajko via arch-general
2017-03-08 20:02:24 UTC
Permalink
Post by Neven Sajko via arch-general
This discussion is pointless without legal advice. Without it
disclosing user information (even if it is public) does not seem like
such a good idea.
Not that I advocate paying a lawyer just for this issue, it would be
simpler to let her scrape AUR ;)

... But there probably should be some TOS ...
Joakim Hernberg
2017-03-08 20:36:35 UTC
Permalink
On Wed, 8 Mar 2017 21:02:24 +0100
Post by Neven Sajko via arch-general
Post by Neven Sajko via arch-general
This discussion is pointless without legal advice. Without it
disclosing user information (even if it is public) does not seem
like such a good idea.
Not that I advocate paying a lawyer just for this issue, it would be
simpler to let her scrape AUR ;)
... But there probably should be some TOS ...
IMO, why are we even discussing this? If she wants to do research, she
can do it: set it up and scrape. Why should we compile a list and send
it..? :S
--
Joakim
Leonid Isaev
2017-03-08 20:45:58 UTC
Permalink
Post by Neven Sajko via arch-general
... But there probably should be some TOS ...
Why? Would a ToS be a legally binding document? If yes, it will constrain Arch,
which is not good. If no, then it's just a meaningless text. I understand that
companies have ToS because they want to cover their back legally, but Arch is
different in this regard...

Cheers,
--
Leonid Isaev
fnodeuser
2017-03-07 08:36:53 UTC
Permalink
Eli Schwartz,

no one is paranoid here. we do not want security and privacy issues.

YANG Ling,

sharing the requested data will not benefit us in any way.

Lukas Fleischer,

why are you talking on her behalf? why did she send a message to you instead of one to the ML?
why is she not answering any questions in this ML?
Bartłomiej Piotrowski
2017-03-07 08:55:54 UTC
Permalink
Post by fnodeuser
why are you talking on her behalf? why did she send a message to you instead of one to the ML?
why is she not answering any questions in this ML?
Why do you keep breaking threads on our mailing lists? Why are you
incapable of using one e-mail address? Why do you think we particularly
care about your opinion?

BP
fnodeuser
2017-03-07 09:03:25 UTC
Permalink
Bartłomiej Piotrowski,

it is the same email address that i have been using since the beginning.

what opinions?

i never talk with opinions. i always talk with facts.
Jelle van der Waa
2017-03-07 09:22:16 UTC
Permalink
Bartłomiej Piotrowski,
it is the same email address that i have been using since the beginning.
what opinions?
i never talk with opinions. i always talk with facts.
You broke the thread again... If you want to be taken seriously, at
least reply in the thread and provide actual facts and arguments.
--
Jelle van der Waa
YANG Ling via arch-general
2017-03-07 09:46:06 UTC
Permalink
Post by fnodeuser
Bartłomiej Piotrowski,
it is the same email address that i have been using since the beginning.
what opinions?
i never talk with opinions. i always talk with facts.
Hi. You may not have noticed that the way you currently reply breaks
this thread. Which mail client are you using?
I believe your mail client has a "Reply-To-All" function. Try using that
function to avoid breaking this thread.

--
Allen Yang
fnodeuser
2017-03-07 10:06:14 UTC
Permalink
test
Bennett Piater
2017-03-07 10:08:45 UTC
Permalink
Reply-To is not what you are supposed to look into; look at In-Reply-To!
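To illustrate the In-Reply-To point: list archives and mail clients usually thread a reply by its In-Reply-To and References headers, which point at the Message-ID of the message being answered. A minimal sketch using Python's standard email package follows; the helper name make_reply, the addresses, and the example.org domain are made up for illustration and are not how any particular mail client does it.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def make_reply(original: EmailMessage, body: str, sender: str) -> EmailMessage:
    """Build a reply that keeps the thread intact (hypothetical helper)."""
    reply = EmailMessage()
    reply["From"] = sender
    reply["To"] = str(original["To"])  # assumption: answer goes to the list address

    subject = str(original["Subject"] or "")
    if not subject.lower().startswith("re:"):
        subject = f"Re: {subject}"
    reply["Subject"] = subject

    # Every message gets its own fresh Message-ID.
    reply["Message-ID"] = make_msgid(domain="example.org")

    # In-Reply-To names the direct parent; References carries the whole chain.
    parent_id = str(original["Message-ID"])
    reply["In-Reply-To"] = parent_id
    earlier_refs = str(original.get("References", ""))
    reply["References"] = f"{earlier_refs} {parent_id}".strip()

    reply.set_content(body)
    return reply
```

A reply composed as a brand-new message has neither header, which is why it shows up as a separate thread even if the subject line is the same.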
test
--
GPG fingerprint: 871F 1047 7DB3 DDED 5FC4 47B2 26C7 E577 EF96 7808
fnodeuser
2017-03-07 10:36:44 UTC
Permalink
test
Martin Kühne via arch-general
2017-03-07 11:00:49 UTC
Permalink
test
This worked.
Thanks for your effort.

cheers!
mar77i