Tech

By Johannes Ernst

https://reb00ted.org/tech/

  • 2023-03-25

    My wish list for future ActivityPub standardization and related activities in the Fediverse “commons”

    Note: Newer version of this list is here.

    There is some movement in the W3C to perhaps reactivate working on ActivityPub, the protocol that lets Fediverse apps such as Mastodon talk to each other and form a large, decentralized, social network.

    I posted this to their mailing list recently. Sorry if this is a bit cryptic; it was written for a very specific technical audience. I have added a few annotations in square brackets to provide context.

    What am I missing in my wish list?


    Putting on my product developer hat, here’s what I want for Christmas:

    A single-document basic [Fediverse] interop spec, i.e.

    • when I have written code for everything it says, my code will interoperate in some basic fashion with all other software that has implemented this document
    • no need to consult or understand other implementations
    • may be quite basic, e.g. text content only, only a minimal set of activity types
    • enables implementors to have an “MVP”-style interop at lowest-possible engineering effort (including the time required to read/understand the specs!)
    • This could be done as a “minimal profile” of a stack that contains a subset of AP [ActivityPub], AS [ActivityStreams], and Webfinger

    A test suite for that profile

    • suitable to add to my automated test suite
    • over time, this test suite can grow beyond the minimal profile

    A branding program for products that have passed the test suite

    • As an implementor, you get to put the sticker on your product.
    • In particular, in the places in the product where users “connect” to other servers in the Fediverse, like “Visa” is displayed at the POS terminal
    • I believe this will become critical if/when larger orgs with potentially different value systems connect to the Fediverse

    A set of web “intent buttons” for Like, Follow, Post, etc that work across sites

    • like they exist for centralized social media
    • as easy to use for the user
    • we can argue how this can be accomplished technically. I have opinions, but for this wish list, they are immaterial, as long as I get those buttons :-)

    A standardized way to express terms under which content is provided

    • As I understand it, Bob Wyman calls that a Rights Expression Language
    • This probably should start with a use case collection

    A design for search that meets the requirements of all relevant parties and points of view

    • This is probably far less a technical problem than one of successful communication

    A design to reduce certain loads

    • Fan-out
    • Video
    • (Not my area of expertise, so I don’t have details)

    Improved identity management across the Fediverse

    • Easy-to-use single-sign-on across servers. Use case: I use several apps for different content types (like micro blog and video). Bonus: they all post from the same identifier
    • Easy-to-use persona management. Use case: I have a personal and a work account, bonus if they can be on the same server
    • Identifiers not tied to the domain name system

    Some of this squarely falls in the domain of this group [that would be the W3C’s Social Web Interest Community Group], some is adjacent. It could be pulled in, or it can be done somewhere else. I don’t particularly care about that either, as long as it gets done and done consistently with the rest.

    Now I’m sure you all are going to tell me why I can’t have all those things for Christmas, and certainly not this Christmas. But I can wish, no? (More seriously, I think they are all essential for continued successful growth of the ActivityPub network as new parties connect)

  • 2023-03-22

    Open networks are more valuable than closed ones: the case for the Fediverse over Twitter and Facebook

    Networks are everywhere, and they are valuable. Consider:

    • The road network. Imagine you’d need a different car for each piece of road. Life as we know it today would be utterly impossible without it.
    • The phone network. To pick two random use cases, without it you couldn’t call customer service or summon an ambulance.
    • The Visa credit card network (and its competitors). You would have to use cash instead, but arguably everybody accepting the same currency forms a network, too, and without that, we’d be back to barter. Which would be really inconvenient.
    • The world-wide-web. Some of us are old enough to remember the times before. No on-demand news, music, entertainment, chatting, reservations, e-commerce and all the others.

    Generally, larger networks are more valuable than smaller networks: if you are the only person in the world who has a telephone, that phone is not worth much. If there are 2 people with phones, you can at least call each other. With 3 people, 3 different conversations can be had. With 4, it’s 6. With 100, it’s 100*99/2 = 4950 possible conversations, not counting multi-party conference calls. This quadratic growth of value with the size of the network applies to all networks, according to Metcalfe’s Law.
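    To make the counting concrete, here is a minimal sketch of the pairwise-conversation arithmetic from the paragraph above. The function name is made up for illustration; it simply counts distinct pairs, n*(n-1)/2, which is the quadratic growth the paragraph describes.

```python
def possible_conversations(people: int) -> int:
    """Count the distinct pairs among `people` phone owners: n * (n - 1) / 2."""
    return people * (people - 1) // 2


# Network sizes used in the paragraph above.
for people in (1, 2, 3, 4, 100):
    print(people, "phones ->", possible_conversations(people), "possible conversations")
```

    Doubling the number of phones roughly quadruples the number of possible conversations, which is why network size dominates everything else in this kind of estimate.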

    But in this post, I want to look at another dimension of networks that impacts their values, and that is whether the network is “open” or “closed”. There are lots of details one could consider, but for our purposes here, we define a spectrum with two extremes, and lots of gray in the middle:

    • Fully open: Anybody can connect to the network and do what they like, nobody's permission is required.
    • Somewhere in between: the gray area between the two extremes.
    • Entirely closed: Who may connect, and what they may do on the network, is closely controlled by the network proprietor.

    There can be all sorts of network proprietors, but for simplicity in this post, assume it’s a single entity, like Meta.

    Here are some examples:

    • Fully open: The public road system. Buyers and sellers using cash. The world-wide web.
    • Somewhere in between: Buyers and sellers using Visa. Facebook. Twitter.
    • Entirely closed: Roads on a private golf course. Internal company accounting system. The old AOL walled garden.

    If you had two networks that are otherwise identical in size, structure and function, except that one is open and the other one is closed, which of those two is more valuable?

    Valuable to whom?

    Fully open, valuable to:
    • Platform proprietor: no, does not exist
    • Network users: yes

    Entirely closed, valuable to:
    • Platform proprietor: yes
    • Network users: yes

    It’s clear that if both networks produce the same amount of total value, the open network is more valuable to its users (such as individuals and organizations), for the simple reason that there is no network proprietor who needs to get paid! The value entirely accrues to the network participants.

    But there’s more to it: Cory Doctorow recently coined the term enshittification to describe the inevitable march of platform/network proprietors, over time, to siphon off an ever-larger percentage of value generated by their network, to the detriment of its users. So the older a closed network, the less value it provides to its users. (Facebook users experience this every day: ever more ads, ever less genuine human engagement. While, for its business users, ad prices go up.) In an open network, on the other hand, the value that accrues to the users does not deteriorate over time.

    And finally: could AOL, the online service, ever have provided the same value as the open web? Of course absolutely not! Open networks allow many more technologists and entrepreneurs to innovate in a gazillion different ways that would never be possible in a closed network. As closed networks mature, not only do they enshittify, but they also further and further discourage innovation by third parties, while the opposite is true for open networks.

    Which brings us to the Fediverse. Which is more valuable today: the decentralized, open social network called the Fediverse (with its thousands of independently operated Mastodon and other instances), or the poster child of closed social networks, Facebook?

    Clearly, Facebook. That’s because by all counts, Facebook today has, to an order of magnitude, about 1000 times the number of users of the Fediverse. Same for Twitter, which has maybe 100 times the number of users of the Fediverse.

    But the network effect is the only thing the closed social platforms have going for them. All other parts of the value proposition favor the open social network alternative. Think of this:

    • The Fediverse extracts far less / no value: no annoying ads, no user manipulation favoring the business model of the network proprietor.
    • More functionality: it’s one interoperable social network with apps that emulate Twitter, Facebook, Medium, Reddit, Goodreads, and many others! In the same network.
    • It’s entirely open for innovation, and innovators are building furiously as we speak. By its nature, it’s permanently locked open for innovation, and there is no danger of ever getting cut off from an API, facing sudden connection charges or drawing the wrath of a gazillionaire.

    So by the time the Fediverse has sufficient numbers of users, it’s game over for proprietary social networks. This is true for both user categories in social networks: individuals and businesses. (I assume here that businesses and the Fediverse will find a way of accommodating each other, mainly by businesses behaving respectfully. If not, there simply will be no businesses in the Fediverse.) Individuals will get more value from the open network, and businesses will be far more profitable because there is no network operator to pay and many products and services pop up all the time that won’t in the closed network.

    Note that the critical “sufficient number of users” can likely be substantially smaller than the user populations of those closed networks today, because all value accrues to users and it’s not diminished by value extraction from a network proprietor. For many of my own use cases, in many niches the Fediverse has critical mass today already.

    Can the user advantage be overcome across the board? We will have to see. But if we add up just the numbers of active users of organizations that have publicly announced Fediverse plans as of the date that I’m writing this, or even have products already in the market – Flipboard, Medium, Mozilla, Tumblr, WordPress and more – we’re already in the high hundreds of millions.

    Those numbers look awfully close to the user numbers necessary to overcome Metcalfe’s Law.

    tl;dr: The time to take the Fediverse seriously, for individuals and businesses, is now. The value of the Fediverse for everybody is much higher than the value of any closed, proprietary social network – other than the proprietary social network companies themselves. And we won’t cry for them very much.

    Note: FediForum is next week, where we’ll discuss this.

  • 2023-03-10

    Meta’s decentralized social plans confirmed. Is Embrace-Extend-Extinguish of the Fediverse next?

    Casey Newton at Platformer reports he has e-mail confirmation from Meta that:

    [Meta is] exploring a standalone decentralized social network for sharing text updates. We believe there’s an opportunity for a separate space where creators and public figures can share timely updates about their interests (Source).

    Their new app is codenamed P92, and according to a separate report by Moneycontrol:

    will support ActivityPub, the decentralised social networking protocol powering Twitter rival Mastodon and other federated apps (Source).

    It will also:

    be Instagram-branded and will allow users to register/login to the app through their Instagram credentials.

    First, the good news:

    This is a huge validation of the decentralized social network known as the Fediverse, built around a set of internet protocol standards that include ActivityPub, ActivityStreams, WebFinger as well as a set of commonly implemented unofficial extensions. The Fediverse has been around for some years, but recently came to more widespread prominence through its leading implementation, Mastodon, as the leading alternative to the increasingly erratic (and increasingly many other things, but I digress…) Twitter.

    That’s because only when alternatives are actually beginning to look like they might become serious threats to incumbents – and Meta is the market-leading incumbent in social media by far – do incumbents start paying attention and then connect to them. Or, as it may be the case here, simply leak that they might be connecting in the future but never actually will. We don’t know which of those will turn out to be true, but it doesn’t matter: both validate the Fediverse as a serious competitor to Meta.

    This is on the heels of recent Fediverse adoption by companies such as Mozilla, Medium, Cloudflare and Flipboard. Apple now has Mastodon content previews in iMessage. Even Microsoft was spotted in the Fediverse a few days ago.

    But:

    I have some Brooklyn Bridges for sale. You get a Brooklyn Bridge for free if you believe that a company like Meta would connect to the Fediverse, and be a perfect citizen the way the Fediverse expects you to be today. Including:

    • No ads;
    • No tracking;
    • No algorithms that favor business outcomes for Meta over your wellbeing;
    • Respect for different cultures, minorities, non-mainstream behavior etc.;
    • A rich toolset for filtering and blocking according to what you decide you want to filter and block, not Meta;
    • The ability to move from one host to another without having to build your network from scratch;
    • The ability to pick who is your system administrator and moderator, from models that are essentially centrally managed to full-fledged self-managed, user-owned cooperatives;
    • The ability, and encouragement, to innovate with new apps;
    • and so forth.

    Instead, based on the history of technology, the chances are overwhelming that such an app would be used by Meta with an embrace, extend and extinguish strategy, at the end of which the Fediverse would either have become irrelevant or effectively been taken over by Meta. So the much-heralded alternative to Twitter would become … Meta? I sure hope not.

    If you think that is unlikely, read up on some of the historical examples listed on that Wikipedia page. Merely being based on open standards and having a million+ strong user community does not protect you at all. Instead, I would say the attack happens every single time a network dominated by an incumbent (here: social media) is threatened by a more open network. And it succeeds, at least partially, more often than not. Here it is Meta’s $100b+ business that’s under strategic threat; of course they will protect it and use any means they can think of to do so.

    It does not help that the Fediverse today is chronically underfunded and correspondingly has difficulty competing at the same speed as somebody like Meta can. Actually, “unfunded” is a better term because the amounts are so small. There are many unpaid contributions, the Fediverse largely being open source and all, but I’d be surprised if more than $10m per year is spent in total on the entire Fediverse today; likely it’s far less. If Meta can burn more than $10b – that’s one entire annual Fediverse spend every 8 hours! – on a very doubtful Metaverse project, they surely could find the same amount of money to protect their core business.
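    For anyone who wants to check the “every 8 hours” figure, here is the back-of-the-envelope calculation as a small sketch. It assumes the $10b is an annual spend and takes the $10m-per-year guess above as the Fediverse total; both are rough estimates, and the variable names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

meta_metaverse_spend = 10_000_000_000  # USD per year (assumption: the $10b figure is annual)
fediverse_spend = 10_000_000           # USD per year (the upper-bound guess from the text)

# How many hours it takes Meta to spend one annual Fediverse budget.
hours_per_fediverse_budget = HOURS_PER_YEAR / (meta_metaverse_spend / fediverse_spend)
print(f"{hours_per_fediverse_budget:.2f} hours")
```

    At exactly $10b the result is a little under nine hours; since the spend is “more than $10b”, eight hours is a fair round number.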

    And that’s just one of the many issues we need to solve to protect, and grow, the beautiful thing we currently have with the Fediverse.

    So what shall we do about all this?

    (I welcome your comments – in the Fediverse! Find me at @j12t@social.coop.)

    (Also, I’m co-organizing FediForum, an online unconference at the end of March, where we will surely discuss this and other issues. And celebrate the Fediverse, because there is much to celebrate! Join us?)

  • 2023-01-29

    What if Apple’s headset is a smashing success?

    Signs point to Apple announcing its first headset in the next few months. This would be a major new product for Apple – and the industry beyond – but there is very little excitement in the air.

    We can blame Meta for that. After buying Oculus, iterating over the product for almost 9 years since, and reportedly spending more than $10 billion a year on it, their VR products remain a distinct Meh. I bought a Quest 2 myself, and while it definitely has some interesting features (I climbed Mt Everest, in VR!), it mostly sits on the shelf, gathering dust.

    So the industry consensus is that Apple’s won’t amount to much either. If Meta couldn’t find compelling use cases, the thinking goes, Apple won’t either, because there aren’t any! (Other than some limited forms of gaming and some niche enterprise ones.)

    I think this line of thinking would be a mistake.

    My argument: Apple understands their customers and works down their use cases better than anybody. If Apple works on a new product category for many years – and signs are that they have – and then finally decides that the product is ready, chances are, it is. Their track record on new products is largely unblemished since the return of Jobs about 25 years ago:

    • fruity fun design for a computer (iMac) – success
    • digital music player (iPod) – smashing success
    • smartphone (iPhone) – so successful it killed and reinvented an entire industry
    • tablet (iPad) – success
    • watch (Apple Watch) – success
    • … and many smaller products, like headsets, voice assistance, Keynote etc.

    Looking for a major dud in those 25 years, I can’t really find one. (Sure, some smaller things like the Twentieth Anniversary Mac – but that was always a gimmick, not a serious product line.)

    It appears that based on their history, betting against Apple’s headset is not a smart move. Even if we can’t imagine why an Apple headset would be compelling before we see it: we non-Apple people didn’t predict iPhone either, but once we saw it, it was “immediately” obvious.

    So let’s turn this around. What about we instead assume the headset will be a major success? Then what?

    I believe this would transform the entire technology industry profoundly. For historical analogies, I would have to go back all the way to the early 80’s when graphical user interfaces first became widely used – coincidentally (or not) an Apple accomplishment: they represented a fundamentally different way of interacting with computers than the text terminals that came before them. Xerox PARC gave that demo to many people. Nobody saw the potential and ran with it; only Apple did. And they pulled a product together that caused the entire industry to transform. Terminals are still in use, but only by very few people for very specific tasks (like system administrators).

    What if AR/VR interfaces swept the world as the GUI swept the PC?

    I believe they can, if somebody relentlessly focuses on use cases and really makes them work. I built my first 3D prototype in VRML in 1997. It was compelling back then and it would be today. Those uses can be found, I’m quite certain.

    Based on everything we’ve seen, it’s clear that Meta won’t find them. Hanging out with your friends who don’t look like your friends in some 3D universe is just not it. But if anybody can do it, it’s Apple.

    So I’m very much looking forward to seeing what they came up with, and I think you should be, too.

  • 2023-01-24

    Activity Streams graphical model

    All you need is a gazillionaire doing strange things to some internet platform, and all of a sudden decentralized social media soars in adoption. So lots of people are suddenly seriously looking at how to contribute, myself included.

    Core to this is the ActivityPub standard, and real-world implementations that mix it with additional independently defined protocols, such as what Mastodon does.

    None of them are particularly easy to understand. So I did a bit of drawing just to make it clearer (for myself) what kind of data can be shipped around in the Fediverse. To be clear, this is only a small part of the overall stack, but an important one.

    Here are some diagrams. They are essentially inheritance diagrams that show what kinds of activities there are, and actors, etc. Posted here in case they are useful for others, too.

    And here’s how to interpret my homegrown graphical notation. (I made it up for my dissertation eons ago, and used it ever since. It has certain advantages over, say, UML or traditional ERA diagram styles. IMHO :-))

  • 2022-10-24

    The Push-Pull Publish-Subscribe Pattern (PuPuPubSub, or shorter: P3Sub)

    (Updated Dec 14, 2022 with clarifications and a subscriber implementation note.)

    Preface

    The British government clearly has more tolerance for humor when naming important things than the W3C does. Continuing in that fashion, hence this name.

    The Problem

    The publish-subscribe pattern is well known, but in some circumstances, it suffers from two important problems:

    1. When a subscriber is temporarily not present, or cannot be reached, sent events are often lost. This can happen, for example, if the subscriber computer reboots, falls off the network, goes to sleep, has DNS problems and the like. Once the subscriber recovers, it is generally not clear what needs to happen for the subscriber to catch up to the events it may have missed. It is not even clear whether it has missed any. Similarly, it is unclear for how long the publisher needs to retry to send a message; it may be that the subscriber has permanently gone away.

    2. Subscriptions are often set up as part of the following pattern:

      • A resource on the Web is accessed. For example, a user reads an article on a website, or a software agent fetches a document.
      • Based on the content of the obtained resource, a decision is made to subscribe to updates to that resource. For example, the user may decide that they are interested in updates to the article on the website they just read.
      • There is a time lag between the time the resource has been accessed, and when the subscription becomes active, creating a race condition during which update events may be missed.

    While these two problems are not always significant, there are important circumstances in which they are, and this proposal addresses those circumstances.

    Approach to the solution

    We augment the publish-subscribe pattern in the following way:

    1. All events, as well as the content of the resource whose changes are supposed to be tracked, are time-stamped. Also, each event identifies the event that directly precedes it (that way, the subscriber can detect if it missed something). Alternatively, a monotonically increasing sequence number could be used.

    2. The publisher stores the history of events emitted so far. For efficiency reasons, this may be shortened to some time window reaching to the present, as appropriate for the application; for example, all events in the last month. (Similar to how RSS/Atom feeds are commonly implemented.)

    3. The publisher provides a query interface to the subscriber to that history, with a “since” time parameter, so the subscriber can obtain the sequence of events emitted since a certain time. (Actually, since “right after” the provided time, not including the provided time itself.)

    4. When subscribing, in addition to the callback address, the subscriber provides to the publisher:

      • a time stamp, and
      • a subscription id.

    Further, the actual sending of an event from the publisher to the subscriber is considered to be a performance optimization, rather than core to the functionality. This means that if the event cannot be successfully conveyed (see requirements above), it is only an inconvenience and inefficiency rather than a cause of lost data.
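    As a rough illustration of points 1 to 3, here is a minimal in-memory sketch of the publisher’s event history. The names (Event, EventHistory, since, trim) are made up for this example and are not part of the proposal; timestamps are plain numbers, and each event records the timestamp of its direct predecessor.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    timestamp: float           # when the event was emitted
    previous: Optional[float]  # timestamp of the directly preceding event, None for the first
    payload: str


class EventHistory:
    """Publisher-side event log with a 'since' query (hypothetical API)."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, timestamp: float, payload: str) -> Event:
        previous = self._events[-1].timestamp if self._events else None
        if previous is not None and timestamp <= previous:
            raise ValueError("timestamps must be strictly increasing")
        event = Event(timestamp, previous, payload)
        self._events.append(event)
        return event

    def since(self, timestamp: float) -> List[Event]:
        # "Right after" the given time: an event at exactly `timestamp` is excluded.
        return [e for e in self._events if e.timestamp > timestamp]

    def trim(self, oldest_kept: float) -> None:
        # Optional shortening of the history to a recent time window.
        self._events = [e for e in self._events if e.timestamp >= oldest_kept]


history = EventHistory()
for ts, text in [(1.0, "a"), (2.0, "b"), (3.0, "c")]:
    history.append(ts, text)
print([e.payload for e in history.since(1.0)])
```

    A sequence number could replace the timestamp pair without changing the shape of the query.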

    Details

    About the race condition

    1. The future subscriber accesses resource R and finds time stamp T0. For example, a human reads a web page whose publication date is April 23, 2021, 23:00:00 UTC.

    2. After some time passes, the subscriber decides to subscribe. It does this with the well-known subscription pattern, but in addition to providing a callback address, it also provides time stamp T0 and a unique (can be random) subscription id. For example, a human’s hypothetical news syndication app may provide an event update endpoint to the news website, and time T0.

    3. The publisher sets up the subscription, and immediately checks whether any events should have been sent between (after) T0 and the present. (It can do that because it stores the update history.) If so, it emits those events to the subscriber, in sequence, before continuing with regular operations. As a result, there is no more race condition between subscription and event.

    4. When sending an event, the publisher also sends the subscription id.
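    The four steps above can be sketched roughly as follows. This is an in-memory toy under stated assumptions, not a definitive implementation: the Publisher class, its method names and the callback signature are all invented for illustration, and “callbacks” are plain Python functions rather than network endpoints.

```python
import uuid
from typing import Callable, Dict, List, Tuple

Event = Tuple[float, str]                # (timestamp, payload)
Callback = Callable[[str, Event], None]  # receives (subscription_id, event)


class Publisher:
    """Toy publisher that replays missed events at subscription time."""

    def __init__(self) -> None:
        self.history: List[Event] = []
        self.subscriptions: Dict[str, Callback] = {}

    def publish(self, timestamp: float, payload: str) -> None:
        event = (timestamp, payload)
        self.history.append(event)  # the history is the source of truth
        for subscription_id, callback in list(self.subscriptions.items()):
            try:
                callback(subscription_id, event)
            except Exception:
                # Delivery is only an optimization; the subscriber can catch up later.
                pass

    def subscribe(self, callback: Callback, since: float, subscription_id: str) -> None:
        # Step 3: first emit, in order, everything that happened after `since` ...
        for event in self.history:
            if event[0] > since:
                callback(subscription_id, event)
        # ... then continue with regular operations.
        self.subscriptions[subscription_id] = callback


received: List[Tuple[str, Event]] = []
publisher = Publisher()
publisher.publish(100.0, "article published")  # T0: the version the reader saw
publisher.publish(105.0, "typo fixed")         # happens before the subscription exists
publisher.subscribe(lambda sid, ev: received.append((sid, ev)),
                    since=100.0, subscription_id=uuid.uuid4().hex)
publisher.publish(110.0, "correction added")
print([ev[1] for _, ev in received])
```

    The update at time 105 falls into the gap between reading and subscribing, yet still reaches the subscriber because the subscription carries T0.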

    About temporary unavailability of the subscriber

    1. After a subscription is active, assume the subscriber disappears and new events cannot be delivered. The publisher may continue to attempt to deliver events for as long as it likes, or stop immediately.

    2. When the subscriber re-appears, it finds the time of the last event it had received from the publisher, say time T5. It queries the event history published by the publisher with parameter T5 to find out what events it missed. It processes those events and then re-subscribes with a later starting time stamp corresponding to the last event it received (say T10). When it re-subscribes, it uses a different subscription id and cancels the old subscription.

    3. After the subscriber has re-appeared, it ignores/rejects all incoming events with the old subscription id.
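    Here is one way the recovery steps above might look on the subscriber side. Everything in this sketch is hypothetical: InMemoryPublisher is only a stand-in so the example runs on its own, and the Subscriber class, its recover method and the use of random UUIDs as subscription ids are assumptions made for illustration.

```python
import uuid
from typing import Callable, Dict, List, Tuple

Event = Tuple[float, str]  # (timestamp, payload)


class InMemoryPublisher:
    """Stand-in publisher for this example only."""

    def __init__(self) -> None:
        self.history: List[Event] = []
        self.callbacks: Dict[str, Callable[[str, Event], None]] = {}

    def events_since(self, since: float) -> List[Event]:
        return [e for e in self.history if e[0] > since]

    def subscribe(self, callback, since: float, subscription_id: str) -> None:
        for event in self.events_since(since):
            callback(subscription_id, event)
        self.callbacks[subscription_id] = callback

    def unsubscribe(self, subscription_id: str) -> None:
        self.callbacks.pop(subscription_id, None)


class Subscriber:
    def __init__(self, publisher: InMemoryPublisher, since: float) -> None:
        self.publisher = publisher
        self.last_seen = since          # the only state the subscriber must keep
        self.processed: List[str] = []
        self.subscription_id = ""
        self._subscribe()

    def _subscribe(self) -> None:
        # Switch ids first, so events replayed during subscribe are accepted.
        self.subscription_id = uuid.uuid4().hex
        self.publisher.subscribe(self.on_event, self.last_seen, self.subscription_id)

    def on_event(self, subscription_id: str, event: Event) -> None:
        if subscription_id != self.subscription_id:
            return  # belongs to an old subscription: ignore
        if event[0] <= self.last_seen:
            return  # already processed
        self.processed.append(event[1])
        self.last_seen = event[0]

    def recover(self) -> None:
        """Run when the subscriber re-appears after an outage."""
        old_id = self.subscription_id
        for timestamp, payload in self.publisher.events_since(self.last_seen):
            self.processed.append(payload)
            self.last_seen = timestamp
        self._subscribe()                   # new id, later starting time stamp
        self.publisher.unsubscribe(old_id)  # cancel the old subscription


publisher = InMemoryPublisher()
publisher.history.append((1.0, "a"))
subscriber = Subscriber(publisher, since=0.0)
old_id = subscriber.subscription_id
publisher.history += [(2.0, "b"), (3.0, "c")]  # emitted while the subscriber was away
subscriber.recover()
subscriber.on_event(old_id, (4.0, "late delivery on old subscription"))
print(subscriber.processed)
```

    The late delivery on the old subscription id is dropped; if it matters, it will show up again through the new subscription or the next history query.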

    Subscriber implementation notes

    • The subscriber receives events exclusively through a single queue for incoming events. This makes implementing an incoming-event handler very simple, as it can simply process events in order.

    • The event queue maintains the timestamp of the last event it successfully added. When a new event arrives, the queue accepts this event but only if the new event is the direct follower of the last event it successfully added. If it is not, the incoming event is discarded. (This covers both repeatedly received events and when some events were missed.)

    • The subscriber also maintains a timer with a countdown from the last time an event was successfully added to the incoming queue. (The time constant of the timer is application-specific, and may be adaptive.) When the timeout occurs, the subscriber queries the publisher, providing the last successful timestamp. If no updates are found, nothing happens. If updates are found, it is fair to consider the existing subscription to have failed. Then:

      • The subscriber itself inserts the obtained “missed” events into its own incoming event queue from where they are processed.
      • The subscriber cancels the existing subscription.
      • The subscriber creates a new subscription, with the timestamp of the most recent successfully-inserted event.
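    The queue rule and the timeout check in these notes can be sketched as below. This is a simplified illustration: IncomingQueue, offer and catch_up are hypothetical names, the timer itself is left out (catch_up is what would run when it fires), and the publisher query is passed in as a plain function. It assumes each event carries the timestamp of its direct predecessor.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass(frozen=True)
class Event:
    timestamp: float
    previous: float  # timestamp of the directly preceding event (or of the resource)
    payload: str


class IncomingQueue:
    """Single queue through which the subscriber receives all events."""

    def __init__(self, last_timestamp: float) -> None:
        self.last_timestamp = last_timestamp  # last event successfully added
        self.pending: Deque[Event] = deque()

    def offer(self, event: Event) -> bool:
        # Accept only the direct follower of the last accepted event.
        if event.previous != self.last_timestamp:
            return False  # duplicate, or something in between was missed
        self.pending.append(event)
        self.last_timestamp = event.timestamp
        return True


def catch_up(queue: IncomingQueue, fetch_since: Callable[[float], List[Event]]) -> bool:
    """Run on timeout. Returns True if missed events were found, in which case
    the caller should cancel the existing subscription and create a new one."""
    missed = fetch_since(queue.last_timestamp)
    for event in missed:
        queue.offer(event)
    return bool(missed)


e1, e2, e3 = Event(1.0, 0.0, "a"), Event(2.0, 1.0, "b"), Event(3.0, 2.0, "c")
publisher_history = [e1, e2, e3]

queue = IncomingQueue(last_timestamp=0.0)
first = queue.offer(e1)      # direct follower: accepted
duplicate = queue.offer(e1)  # received twice: discarded
gap = queue.offer(e3)        # e2 was missed: discarded
found = catch_up(queue, lambda t: [e for e in publisher_history if e.timestamp > t])
print([e.payload for e in queue.pending])
```

    Because discarded events are always recoverable from the publisher’s history, the queue can afford to be this strict.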

    Observations

    • Publishers do not need to remember subscriber-specific state. (Thanks, Kafka, for showing us!) That makes it easy to implement the publisher side.

    • From the perspective of the publisher, delivery of events to subscribers that can receive callbacks, and those that need to poll, both work. (It sort of emulates RSS, except that a starting time parameter is provided by the client, instead of a uniform window decided on by the publisher as in RSS.)

    • Subscribers only need to keep a time stamp as state, something they probably have already anyway.

    • Subscribers can implement a polling or push strategy, or dynamically change between those, without the risk of losing data.

    • Publishers are not required to push out events at all. If they don’t, this protocol basically falls back to polling. This is inefficient but much better than the alternative and can also be used in places where, for example, firewalls prevent event pushing.

    Feedback?

    Would love your thoughts!