Tech

By Johannes Ernst

https://reb00ted.org/tech/

  • 2020-02-16

    MyData Stakeholder Segmentation (Draft)

    For the purposes of our ongoing discussions in the MyData Silicon Valley Hub about a potential North America conference later this year, here is my attempt at segmenting the stakeholders.

    |                        | Prime mover | Follower | Neutral | Adversary |
    | ---------------------- | ----------- | -------- | ------- | --------- |
    | Product & Services     |             |          |         |           |
    | Channel & Distribution |             |          |         |           |
    | Catalysts              |             |          |         |           |
    | Customers              |             |          |         |           |

    where:

    • Prime movers: innovators, inventors, people and organizations that proactively push the vision forward and do things the first time they have ever been done, not waiting for others.
    • Followers: people and organizations who are willing to do things consistent with the vision but only after others have pioneered the way first.
    • Neutral: people and organizations who don’t care about the vision.
    • Adversaries: people and organizations whose vision is fundamentally different and whose agenda is opposed to ours.

    and:

    • Product & Services: creators of apps, platforms, integration products, support and the like.
    • Channel & Distribution: systems integrators, value-added resellers, app stores, retail etc.
    • Catalysts: press, analysts, event organizers, activists, MyData Global itself, governments / regulators, investors.
    • Customers: buyers and users of products and services (consumers, enterprises, governments).

    What do you think?

  • 2020-02-16

    German data for German firms, according to … Microsoft?

    In the US, we think of our struggle over data ownership as a conflict between large, unaccountable companies (like Facebook) and us as individuals. But it is more complex than that as soon as you look beyond the US.

    Take the German federal government, for example. How does your sovereignty as a nation look to you, if data is the new oil of the 21st century, but most of that data ends up in clouds operated by American (or Chinese) companies? Critical infrastructure entirely dependent on the goodwill of one (or two) other countries? Who can see anything you do there? Or turn it off in case of a conflict? Sounds like a disaster waiting to happen.

    So what do you do? You might team up with fellow nations, like other EU members, and pass regulations such as the GDPR, which erode the exclusivity US companies have over data. Or spearhead a project called Gaia-X, which is intended to be a European alternative to American (and Chinese) “clouds”, with the stated goal of regaining data sovereignty.

    And into this fight steps Brad Smith, now Microsoft president, who is quoted (in German) in the press as saying:

    German data should serve German companies – not just a handful of companies on the US west coast or the Chinese east coast.

    (I will ignore here that this comes across as quite racist, and in the case of Germany, one should not make that mistake, even if it comes from an American.)

    Clearly, Microsoft has identified an opportunity to make a bundle here by selling to countries like Germany that are attempting to set up their own clouds, and we know this because the quote comes from the very top of the company, not from some regional sales manager.

    But the striking part of the quote, “should serve German companies” (not “people”, or “Germany, the country”), tells us clearly what the German government has in mind here and to whom the pitch is directed: use data to bolster German companies in international competition.

    While we all benefit from new rules such as the GDPR, and their enforcement in Europe as in the recent Irish proceedings against Facebook, it’s clear that we, as individuals, are merely accidental beneficiaries.

    It’s really about big company competition, supported by national governments. Let’s not forget that. If they were to accommodate each other somehow, I bet the push for privacy and GDPR-like things would evaporate in a heartbeat.

  • 2020-02-06

    When privacy and agency are in conflict

    In the privacy-related communities I hang out in, we often use the phrase “privacy and agency” as a label for the totality of what we want.

    But what if those two cannot be had at the same time? What if more privacy, in practice, means I need third parties to take a larger role, thereby reducing my agency? Or what if I have more agency and can do more things in more ways that I solely determine, but only at the cost of less privacy?

    Unbelievable?

    If so, then look no further than the recent public discussion (dispute?) between the founders of the Signal and Matrix messaging systems, Moxie Marlinspike and Matthew Hodgson. The essence of their arguments, and I paraphrase:

    • Moxie: you can’t build a private messaging system that’s competitive as a consumer app unless a single party, such as the Signal project, takes responsibility and ownership of the whole thing. Lots of privacy, but for the user it’s take it or leave it. Link to full post.
    • Matthew: decentralization, on all levels including code forks and potentially insecure (non-private!) deployments, is an essential requirement to avoid single points of failure: critical people or components turning bad. Link to full post.

    This is a high-quality conversation, and we can all be very happy that it is conducted openly and in a spirit of finding the truth. Go read both pieces and ponder the arguments; it’s very much worth your while.

    Who is right?

    IMHO, both are. I don’t know whether all the tradeoffs described are as unavoidable and unmitigable as they are made out to be in those posts; maybe more innovation in technology, and in particular in governance, could alleviate some of them.

    However, the basic idea of a tradeoff between them is valid. The Signal and Matrix projects make different choices on that spectrum, both for valid reasons.

    If they need to do that, chances are that everybody else who cares about providing products and services with privacy and agency for the user faces similar tradeoffs. It would serve us well to acknowledge that in every discussion on those points, and to respect others who have the same goals as we do but make different tradeoffs.

    The most important point, however, is this: it shows how important it is to have both projects, or a plurality of projects addressing similar requirements but making different tradeoffs. Because that gives us, the users, you and me, the agency to make our own choices based on our own preferences. Including the choice to forego some agency in some aspects in favor of more privacy.

    Which is the most important aspect of agency of them all.

  • 2020-01-15

    Comments and questions on the JLINC protocol for Information Sharing Agreements

    Updated 2020-01-24 with answers from Victor, slightly edited for formatting purposes. Thanks, =vg!

    My friends Victor and Jim at JLINC have published a set of technical documents that show how to implement “Information Sharing Agreements” – contractual agreements between two parties, where one party receives information, such as personal information, from the other party and commits to only use the received data in accordance with the agreement.

    This is basically a respectful, empowering form of today’s widespread, one-sided “I must consent to anything” click-through agreement that every website forces us to sign. It’s respectful because:

    • it is negotiated, rather than unilaterally imposed, as is the default on the internet today;
    • the existence of the agreement, and which parties it binds, can be cryptographically proven by both parties;
    • there’s a full audit log on both sides, and so it would be difficult to “wiggle out of” the agreement;
    • it can’t be unilaterally changed after the fact, only terminated.
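
    To make the cryptographic-proof and audit-log bullets above a bit more concrete, here is a minimal sketch of the general mechanism: both parties sign a record that references the agreement by its hash, and each subsequent record points back at the previous one. This is illustration only, in TypeScript on Node.js; the field names are mine, not JLINC’s actual wire format.

    ```typescript
    // Illustration only: field names and record layout are hypothetical,
    // not the actual JLINC wire format. Requires Node.js for the crypto module.
    import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

    // Each party holds its own Ed25519 key pair.
    const me = generateKeyPairSync("ed25519");
    const business = generateKeyPairSync("ed25519");

    // The agreement text is identified by its hash (content addressing).
    const agreementText = "Recipient may use the e-mail address only to deliver the newsletter.";
    const agreementHash = createHash("sha256").update(agreementText).digest("hex");

    // A record both parties sign; previousId chains records into an audit log.
    interface AgreementRecord {
      agreementHash: string;
      previousId: string | null; // id of the previous record, null for the first
      payload: string;           // e.g. shared data, or a permission change
    }

    function recordId(r: AgreementRecord): string {
      return createHash("sha256").update(JSON.stringify(r)).digest("hex");
    }

    function signRecord(r: AgreementRecord, privateKey: KeyObject): string {
      // For Ed25519 keys, Node's sign() takes null as the algorithm.
      return sign(null, Buffer.from(recordId(r)), privateKey).toString("base64");
    }

    // First record: both parties countersign acceptance of the agreement.
    const first: AgreementRecord = { agreementHash, previousId: null, payload: "ISA accepted" };
    const mySignature = signRecord(first, me.privateKey);
    const theirSignature = signRecord(first, business.privateKey);

    // Either side (or an auditor) can later prove who signed what.
    console.log(
      "both signatures verify:",
      verify(null, Buffer.from(recordId(first)), me.publicKey, Buffer.from(mySignature, "base64")) &&
      verify(null, Buffer.from(recordId(first)), business.publicKey, Buffer.from(theirSignature, "base64"))
    );
    ```

    A second record that shares actual data would set previousId to the id of this first record, which is what makes after-the-fact tampering detectable.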

    So as I read through the documents, I had some questions, and as usual, I blog them :-) in random sequence. ~~I will add answers to this post as I find out about them.~~ Answers in-lined.

    • Q: Why is a separate DID method required? I don’t quite understand what is unique about JLINC DIDs that other forms of DIDs can’t do, too.

      • A: The W3C DID working group has specified a “common data model, a URL format, and a set of operations for DIDs, DID documents, and DID methods.” This by itself does nothing - individual DID methods conforming to this model then need to be specified and implemented. See here. There are various DID methods (including `did:jlinc`) listed in the DID method registry. We believe our method is better for *our* needs and use cases – and besides, we understand that one ;-)
    • Q: To create a JLINC DID, I need to post something to which URL? The spec says /register but doesn’t identify a hostname. Can it be any? Or is that intended to be a centralized service, perhaps run by JLINC, the company?

      • A: Anyone could read our public spec and create their own registry, but we have put up a testnet and made it available via an [open source Node module](https://github.com/jlinclabs/jlinc-did-client). The example config file in the above repo contains the correct testnet URL. When we feel the W3C DID model has stabilized sufficiently we will make available a production-version public registry.
    • Q: How do the identifiers that the two parties use for the JLINC protocol relate to identifiers they may use for other types of interaction, e.g. some other protocols within the decentralized / self-sovereign identity universe? Is a given user supposed to have a variety of them for different purposes?

      • A: This is a question that is being addressed by the W3C DID-resolver community group, in which we are participating. We will make available a JLINC DID resolver when that spec has been published. Every DID contains a (presumably registered) DID method as its second colon-separated value (e.g. “did:jlinc:SOME-UNIQUE-STRING”) so you will be able to resolve any DID whose method your resolver is configured for.
    • Q: Why is a ledger and its associated ledger provider required? (Actually, maybe it is optional. But the spec says “may submit it to a Ledger of their choice to establish non-repudiation”, so that implies the ledger is required for that purpose.)

      • A: Supporting audit ledgers is part of our plan but has not yet been implemented.
    • Q: There is already a previousId in each exchange. Wouldn’t that be sufficient for non-repudiation if the two parties keep their own records?

      • A: Theoretically yes, but a third-party audit record contemporaneous with each data-transfer event would guard against any nefarious record manipulation that might become possible if there should turn out to be some cryptographic weakness discovered.
    • Q: There is also the role of an “audit provider”. How is it different from a “ledger provider”? And if it is, why do we need both?

      • A: Those are two names for the same thing.
    • Q: Are, by virtue of the ledger, the Information Sharing Agreements themselves, essentially public or at least leaked to an uninvolved third party? Can I use JLINC to privately agree on an Information Sharing Agreement without telling others about it? If so, what functionality do I lose?

      • A: For most purposes we envision using Standard Information Sharing Agreements (SISAs) that are published publicly, and we are looking for a suitable standards body to work out a format for those and perhaps publish some useful ones, modeled along the lines of Creative Commons. But JLINC will work fine with any agreement, most likely identified with a content-addressed URL, but conceivably even a private legal agreement between two parties, identified only by its hash value.
    • Q: When an AgreementURI is used to merely point to the legal text that defines the agreement, rather than incorporating it into the exchanged JSON, would it make sense to also at least include a hash of the agreement text? That way, a party cannot so easily wiggle out of the agreement by causing the hoster of the agreement text to make modifications, or claim to have agreed to a different version of the agreement.

      • A: Yes, ISAs are always identified by their hashes, usually via a content-addressed URL like IPFS or some similar scheme that includes a hash of the content as part of the address.
    • Q: There’s a field descendedFrom in various examples, which isn’t documented and is always the text string null. What might that be for?

      • A: The JLINC protocol has been rapidly evolving as we build stuff and discover ambiguities and possible efficiencies in it. That field is obsolete.
    • Q: How would a permissionEvent work in practice? Wouldn’t that require the underlying legal text to change? Is there a description somewhere?

      • A: The ISA should specify that the data-custodian agrees and will respect the rights-holder’s choices as they are transmitted via permission events. Then each permission change event is transmitted under the existing ISA, same as with data events.
    • Q: Could one use JLINC to govern data that’s much longer, or much more complex, than the typical small set of name-value pairs used for user registration data on consumer websites? Can I use it, say, for the first chapter of my Great American Novel I am sending to a publisher, permitting them to only read it themselves but not publish it yet, or to send my MRIs to a new doctor?

      • A: Yes, absolutely.
    • Q: In a successful relationship between a Me and a B, to use the Me2B Alliance’s terminology, it appears that the “data kimono” is gradually opened by the Me to the B. For example, the Me may first visit a website without an account, then register (and provide their name and e-mail address) and a month later, buy something (which requires a shipping address and a credit card number, but only until the purchase is delivered and the data can be deleted again). In the JLINC world, does this require a different Information Sharing Agreement on each step? (particularly for the deletion after shipment?)

      • A: No – see the permissionEvent question above.
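
    Two of the answers above (resolving a DID based on its method, and identifying an ISA by the hash of its text) boil down to generic mechanics that are easy to illustrate. The following is a rough sketch in TypeScript; the registry URL and function names are made up and not taken from the JLINC spec.

    ```typescript
    // Rough sketch only: the registry URL and resolver shape are hypothetical,
    // not part of the JLINC spec. Requires Node.js 18+ (global fetch).
    import { createHash } from "node:crypto";

    // "did:jlinc:SOME-UNIQUE-STRING" -> { method: "jlinc", id: "SOME-UNIQUE-STRING" }
    function parseDid(did: string): { method: string; id: string } {
      const [scheme, method, ...rest] = did.split(":");
      if (scheme !== "did" || !method || rest.length === 0) {
        throw new Error(`not a DID: ${did}`);
      }
      return { method, id: rest.join(":") };
    }

    // A resolver only handles the DID methods it has been configured for.
    type Resolver = (id: string) => Promise<unknown>;
    const resolvers: Record<string, Resolver> = {
      // Placeholder URL; the real testnet URL lives in the example config of
      // the jlinc-did-client repository.
      jlinc: async (id) => (await fetch(`https://registry.example/dids/${id}`)).json(),
    };

    async function resolveDid(did: string): Promise<unknown> {
      const { method, id } = parseDid(did);
      const resolver = resolvers[method];
      if (!resolver) throw new Error(`no resolver configured for did:${method}`);
      return resolver(id);
    }

    // An ISA identified by the hash of its text: either party can re-check that
    // the text being pointed at is really the agreement that was signed.
    function agreementMatchesId(agreementText: string, agreementId: string): boolean {
      return createHash("sha256").update(agreementText).digest("hex") === agreementId;
    }
    ```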
  • 2020-01-13

    Want to buy an aged Twitter account?

    From a spam e-mail:

    Aged Twitter 2009 to 2015 Accounts For Sale - check new thread for new prices

    The accounts are empty or with less than 50 followers.

    2008 - 10$ Per Account
    2009 - 9$ Per Account
    2010 - 8$ Per Account
    2011 - 7$ Per Account
    2012 - 6$ Per Account
    2013 - 5$ Per Account
    2014 - 4$ Per Account
    2015 - 3$ Per Account

    Assuming those accounts actually exist, I can think of some political maneuverers who would likely be interested. I’m a bit surprised at the prices.

  • 2020-01-13

    Downloading all your data and new security risks

    I’ve been playing around with the new data download features that major on-line providers like Twitter, Facebook or Google have been forced to provide to us Californians since January 1, 2020, under the California Consumer Privacy Act (CCPA).

    It’s amazing what kinds of data they have. For example, from the Facebook download I learned that dozens of car dealerships all over the country (like, say, in Texas, where I definitely have never gone car shopping) have my name and address. How – I have no idea.

    But talk about putting all your eggs in one basket. In Google’s case, a single ZIP file contains all your e-mail over a decade or more, your pictures, your private messages, your location history – everything you ever used any Google product for, and many things you never thought Google recorded about you.

    If this one file fell into the hands of somebody nefarious, you’d probably be in serious trouble – from possible financial fraud to blackmail on multiple levels, in particular in less-liberal countries, of which there are unfortunately more and more. The trouble would likely be much bigger than if somebody “merely” logged into your account: because all the info is there in one place, you don’t have to look for it, you can write scripts against it and immediately analyze it.
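
    To illustrate the point about scripts: once such an archive is unzipped locally, a handful of lines is enough to sweep all of it for anything of interest. The path and keyword below are made up, and this isn’t tied to any particular provider’s export format.

    ```typescript
    // Illustration of how trivially a downloaded archive can be mined once it is
    // all in one place. The directory path and the keyword are hypothetical.
    import { readdirSync, readFileSync, statSync } from "node:fs";
    import { join } from "node:path";

    // Recursively list every file under a directory.
    function* walk(dir: string): Generator<string> {
      for (const entry of readdirSync(dir)) {
        const path = join(dir, entry);
        if (statSync(path).isDirectory()) yield* walk(path);
        else yield path;
      }
    }

    const keyword = "passport";
    for (const file of walk("./my-unzipped-archive")) {
      if (!/\.(json|html|txt|csv)$/i.test(file)) continue;
      if (readFileSync(file, "utf8").toLowerCase().includes(keyword)) {
        console.log(`${file} mentions "${keyword}"`);
      }
    }
    ```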

    As Andrew Carnegie, and then Mark Twain, said: “And Then Watch That Basket!”. The trouble is … do we? I mean … before I started jotting down this post, I don’t recall having seen a single discussion of this threat anywhere, and I usually pay attention to this kind of thing.

    It’s hard to secure that kind of access. To be sure, I’m all in favor of me and you being able to know every last bit of what big companies record about us, and get that data and use it somewhere else. But that power sure comes with a lot of potential dangers.

    I fully expect a wave of “GDPR attacks” and “CCPA attacks” to occur, all focused on getting your full archive from major service providers and “monetizing” it in various ways, plus enabling whatever any secret police in some jurisdiction – and I use the word “jurisdiction” loosely here – can come up with.

    What’s the alternative? Well, those service providers not having all that data about me in the first place! Instead, they should only be “borrowing” it from me: just the parts they need, for something I agree to, for as long as they actually need it. Then, no bulk upload or download is necessary, and we don’t have this high-risk security problem in the first place.