Web lists-archives.com

Re: salsa.debian.org maintenance (GitLab 11.1.4 upgrade, external storage migration)




Hello,

Andrey Rahmatullin:
> On Fri, Aug 17, 2018 at 10:12:00AM +0000, u wrote:
>>>> While I understand the simplicity of using $company's cloud storage, I'd
>>>> rather not rely on some external company and in particular not on this
>>>> one. This company does not exactly represent what I would call ethical,
>>>> non-proprietary, and decentralized.
>>> Is that a problem?>> Rhetorical questions are a means of communication which are not only a
>> means of linguistic manipulation, but also counterproductive in such a
>> discussion. May you reformulate?
> They are not rhetorical.
In that case let me answer your question "Is that a problem?". Note that
ideally I would not want to create a gigantic discussion thread with
people throwing virtual mud at each other and unfortunately this thread
has all the potential to end up in exactly that kind of scenario. So
I'll try to stay factual and hope others will do the same. <3

First of all, I'm very grateful that Salsa exists and works so well.
Thank you for working on it. Thank you too for thinking about solving
I/O problems and for considering proxying in first place. And thank you
for announcing the technical possibility of switching to another storage.

Besides the bad taste in Jonas' mouth, I want to try to frame where the
bad taste in my mouth comes from by asking some questions. Note that I
have no ready responses to these questions, but I think we need to ask them.

If I understand correctly this storage was enabled on August 11th.

Consent
-------

I feel like we're currently balancing on a thin cobweb of fait accompli.
Are such decisions team internal or do they require the consent of the
project?

Should we make it known and visible to people who use Salsa that some of
their data might be stored at a 3rd party company? Is this a consent issue?

How does this entire issue relate to the GDPR?

  - I'm not knowledgeable enough in this area so I cannot reply to this
    question myself.

How does this relate to our Social Contract?

  - 4. "Our priorities are our users and free software". This is very
    vague but one could argue that this paragraph is not entirely
    respected.

Data
----

Google Cloud storage is tightly linked to their AI & big data analytics
features which I personally find highly questionable. As this
intelligent cute monster feeds on data and metadata, it's part of its
ecosystem to provide free services in order to get more free food.
(Mentioning this because Marco d'Itri was raising the issue of having to
pay for storage.)

Even though the initial email mentioned that only files that are already
publicly accessible are stored there, we have no means to know if this
changes at some point due to some configuration modification on our
side, be it accidental or not. How can we guarantee it won't? Is this
process currently transparent? And if not, how can we make it so?

Have Salsa maintainers enabled the least invasive privacy features for
this service? [0] Is there a means for us as Salsa users to know when
those change? Is there a means to know what they are exactly? I'm not
only concerned about sending identifying information about Salsa users
but also about making it easier for a 3rd party to do metadata analysis.
(I agree this can already be done with data that is public, so this is
not a privacy issue per se. But: it cannot be done as easily, hence my
concern is more about providing free food, i.e. cheap work [1][2] to a
company, and again another concern is about consenting to do so.)

Access
------

While it has been said that all access is currently proxied and no user
identifying data provided to Google, how can we know this remains the
case in the future? Will we be consulted for consent? Is this process
transparent, i.e. can we see this configuration publicly? And if not how
can we make it so?

Thanks for thinking about this kind of proxying in first place, Salsa
team! By proxying, accessing this data will work over Tor and in places
where Google exerts censorship, I suppose and hope. If ever the Salsa
team considers changing this setup, or Google cloud storage changes
their inner functioning, we might lack the possibility for anonymous
access which should be taken into account in order to provide free,
decentralized, captcha-free access to Debian's source code repositories.

Transparency
------------

I believe that privacy should always be the default, as you cannot
opt-out of already leaked information after the fact: there is no way to
go backwards in time. Hence I believe that such issues need to be
treated carefully, transparently and consensually.

To end with, I believe that a self-hosted solution would address all of
these issues.

Cheers,
Ulrike


PS: I have no interest in replying to the question "Are they ethical,
non-proprietary, and decentralized?" on this mailing list as I believe
it will not advance the discussion any further.

Notes:

[0] https://cloud.google.com/security/privacy/
[1]
https://www.ucpress.edu/book/9780520299931/a-history-of-the-world-in-seven-cheap-things
this is about the notion of cheapness and its importance for capitalist
development.
[2]
https://usbeketrica.com/article/quatre-metaphores-pour-une-politique-de-la-donne
this is about why data can be considered as work, see metaphor n° 4.