2020-05-07

Trust through Transparency for Apps

By Johannes Ernst

https://reb00ted.org/tech/20200507-trust-through-transparency-for-apps/

What do you need to know so you can confidently trust a piece of technology, such as an app supposedly helping fight COVID-19?

That question is at the heart of Project App Assay. It applies to all technology, but is particularly important for the COVID-19 apps, because many of them collect so much information about our health, our friends, our locations and activities around the clock.

Here is a proposal.

First: the key questions that need answering, I think, are:

Is the app effective? If it is not effective at what it does, such as helping fight the virus, there is no point in using it, and you should not trust it to help with your life or the lives of the people around you. Specifically:
- Does it do what it says it does and is it good at it? E.g. if it says it tracks contacts via Bluetooth, does it do that and do it well (and nothing else)?
- Does that help with the virus? E.g. if the app provides medical advice, it would be pointless if the advice it dispensed made no difference to your health or the health of the people around you.
What are the downsides of me using the app? These range from the mundane, like will it drain my phone’s battery quickly, to the profound: e.g. will the people promoting the app use the collected personal data for purposes other than fighting the virus? Perhaps even use it against me now or at some point in the future, e.g. by jacking up insurance rates or finding other members of my persecuted religious minority?

These are critical questions we all ask ourselves when faced with the decision to use or not use an app.

As we analyze COVID-19 apps at Project App Assay, we have observed that the authors of those apps make many claims about how their apps answer these questions, but that’s all they are: claims by the creators of the app, who have an obvious self-interest. Can those claims be trusted? Clearly, it would be nice if we had more to go on.

So I have come up with the following rating scheme. It looks like this:

	Self-asserted, few details	Self-asserted, comprehensive	Comprehensively audited	Demonstrably uses best practices
Effectiveness
Technology
Operations
Governance

Let me explain:

Effectiveness: what do we know about whether the app is effective? This includes whether its advertised features work, and what we know about whether it indeed helps push back the virus.
Technology: what do we know about the technology, including its algorithms, which data it collects, which protocols and cryptography it uses, and the like?
Operations: what do we know about how the deployed system is operated, e.g. how often security reviews are performed, who has access to cryptographic secrets, and whether system administrators are vetted?
Governance: who makes decisions, and how are they made, about all aspects of the app and the data it generates? How is dissent handled on the governance team? (E.g. is there a whistleblower process?)

We then rate each dimension with the possible values of:

Self-asserted, few details: the app creator provides few or no details on the subject; no third party has validated those claims.
Self-asserted, comprehensive: the app creator provides comprehensive information on the subject, but no independent, credible third party has validated those claims.
Comprehensively audited: the claims have been validated by an independent, credible third party, and found to be largely correct with no major discrepancies.
Demonstrably uses best practices: the third-party validation confirms that the app follows best industry practices.
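
For readers who want to record such ratings in a machine-readable form, here is a minimal sketch in Python. It is only one possible encoding of the matrix above, not something Project App Assay has published: the names `Level`, `Dimension`, `LABELS` and `render` are made up for illustration, and the example values at the end are invented and do not describe any real app.

```python
from enum import Enum, IntEnum
from typing import Mapping


class Level(IntEnum):
    """Rating values, ordered left to right as in the matrix."""
    SELF_ASSERTED_FEW_DETAILS = 0
    SELF_ASSERTED_COMPREHENSIVE = 1
    COMPREHENSIVELY_AUDITED = 2
    BEST_PRACTICES = 3


class Dimension(Enum):
    """The four dimensions that are rated."""
    EFFECTIVENESS = "Effectiveness"
    TECHNOLOGY = "Technology"
    OPERATIONS = "Operations"
    GOVERNANCE = "Governance"


# Column headers as they appear in the matrix.
LABELS = {
    Level.SELF_ASSERTED_FEW_DETAILS: "Self-asserted, few details",
    Level.SELF_ASSERTED_COMPREHENSIVE: "Self-asserted, comprehensive",
    Level.COMPREHENSIVELY_AUDITED: "Comprehensively audited",
    Level.BEST_PRACTICES: "Demonstrably uses best practices",
}


def render(rating: Mapping[Dimension, Level]) -> str:
    """Return the matrix as tab-separated text with one 'x' per row."""
    # Every dimension must be rated; refuse partial evaluations.
    missing = [d.value for d in Dimension if d not in rating]
    if missing:
        raise ValueError("unrated dimensions: " + ", ".join(missing))
    lines = ["\t" + "\t".join(LABELS[level] for level in Level)]
    for dim in Dimension:
        cells = ["x" if rating[dim] == level else "" for level in Level]
        lines.append(dim.value + "\t" + "\t".join(cells))
    return "\n".join(lines)


# Invented values for illustration only; not an assessment of any real app.
example = {
    Dimension.EFFECTIVENESS: Level.SELF_ASSERTED_COMPREHENSIVE,
    Dimension.TECHNOLOGY: Level.COMPREHENSIVELY_AUDITED,
    Dimension.OPERATIONS: Level.SELF_ASSERTED_FEW_DETAILS,
    Dimension.GOVERNANCE: Level.SELF_ASSERTED_FEW_DETAILS,
}

print(render(example))
```

The sketch deliberately rates each dimension on its own and does not combine them into a single score, since the scheme itself does not define one.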

As an example, the evaluation of a simple hypothetical app (only) dispensing health advice that gained high marks might look like this:

	Self-asserted, few details	Self-asserted, comprehensive	Comprehensively audited	Demonstrably uses best practices
Effectiveness
Technology
Operations
Governance

This would be the evaluation for the health advice app, if, for example:

- the health advice was sourced from respectable medical sources (e.g. CDC) with back links to the source, and had been reviewed for correctness by the CDC.
- it was developed in the open, for example as an open-source project, with a large and diverse developer community. If the developer community is large, diverse and functional, it effectively performs the audit function itself, and gravitates to following best technology practices.
- operations for this app were minimal and transparent, so this dimension is a non-issue.
- governance of the app was performed in the open, such as in public meetings or on public mailing lists.

In contrast, the evaluation for a similar app with low marks could look like this:

	Self-asserted, few details	Self-asserted, comprehensive	Comprehensively audited	Demonstrably uses best practices
Effectiveness
Technology
Operations
Governance

This would happen if, for example:

- the health advice had no discernible source, and no review had been performed by medical professionals.
- the app was provided as a “black box” of which nothing is known other than what the developers claim about it, and they have publicly said little.
- nothing is known about who is involved in operations or governance of the app, or what decisions are being made on an ongoing basis.

Of course, it is entirely possible that an app could receive low marks although it is effective and does not harm users in any way.

However, for a public health emergency like COVID-19, I can think of few good reasons why apps should keep their technology or governance secret. And as most of these apps need large-scale adoption to be effective, I can think of few better ways to gain user trust than evaluations that sit all the way to the right in this matrix.

I would love your feedback on Twitter.