This post provides more details on the “Dynamic Quarantine” exit path from the COVID-19
pandemic that I listed in a previous post.
The problem
We need to reduce transmission of the virus to a level where the number of infected
people at any time shrinks, rather than grows.
Absent vaccines or other medications, this requires reduction of in-person contact
between people (“social distancing”).
However, this makes normal functioning of the economy largely impossible. For example, the
state of California just ordered all “non-essential businesses” to be closed. While this
may work in the short term, the longer the lock-down continues, the more things “break”:
from mass unemployment and resulting poverty/defaults/bankruptcies, to the availability of
replacement parts and eventually essentials such as food.
Such “social distancing” may need to continue until a vaccine is available, which may take
many months (12-18 months is a common estimate). It is unclear how to keep the economy
functioning enough for such an extended period of time.
We need better ideas.
The basic idea
Instead of a blanket shutdown of all “non-essential” businesses, confining “everybody”
to their residence, we could shut down only those businesses in which infection is likely,
and confine only those people to isolation whose likelihood of infecting somebody is
higher than a certain threshold. In this approach, those likelihoods are dynamically
determined by means of data collection mostly through mobile phones, and an algorithm that
produces a corresponding score for each person from the collected data.
The likelihood of a subject infecting somebody is determined as a function of what is known about
the health of the subject so far, plus a history of the subject’s interactions with other people
and those people’s likelihood of infecting somebody.
By tracking this information in real time, the blanket closure of businesses and blanket
shelter-in-place of the population can be avoided, and instead be replaced with a sharp,
pinpointed focus on isolating those that are most likely contributing to the spread of the
disease. The remainder of the economy and population can continue to function.
Certain parameters in the algorithm can be tuned to provide different tradeoffs between
reducing spread and inhibiting (or not) the economy.
Infectiousness score
The infectiousness score in this approach is an estimate for the likelihood that one
person infects another when exposed for a certain time period (e.g. 5 min).
For our purposes here, the infectiousness score is a number between 0 and 1, where
0 means: not infectious (e.g. because a highly reliable test has just cleared the subject)
and 1 means: known to be maximally infectious (e.g. because viral loads have been found to
be high, and the subject behaves promiscuously).
Details
A few definitions first:
P
: a person (aka subject)
S(P,t)
: the infectiousness score of person P at time t. Ranges between 0 (not infectious) and
1 (maximally infectious).
The core algorithm is as follows. It deals with direct infection between two people only,
but an extension is discussed below.
- At each time unit (e.g. every hour), S(P,t) is calculated as a function of:
  - S(P,t-1): the infectiousness score of the person at the time prior;
  - S(Pi,τ): the infectiousness scores of all people Pi that the subject interacted with
    in the time period τ = (t-tw) ... (t-1) (where tw is a parameter that determines the
    length of the time window that’s being considered; selection of this parameter
    depends on characteristics of the disease, such as incubation times, as well as the
    characteristics of enacted community interventions such as availability, frequency
    and accuracy of testing);
  - a rating of the subject’s current health derived from the subject’s self-assessment;
  - a rating of the subject’s current health based on information from the future (see
    below).
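The post does not specify the function itself. As a minimal sketch, assuming each input can be treated as an independent chance of being infectious and combined with a "noisy-OR" rule, one update step might look like the following. The names `Encounter`, `exposure`, `carry_over` and `update_score` are hypothetical, and the input based on information from the future is left out because it is handled by rewriting history.

```python
from dataclasses import dataclass


@dataclass
class Encounter:
    """One interaction inside the time window (t - tw) ... (t - 1)."""
    other_score: float  # S(Pi, tau) of the other person at the time of the encounter
    exposure: float     # hypothetical weight in [0, 1] for duration and distance


def update_score(prev_score, encounters, self_assessment, carry_over=0.95):
    """Return S(P, t) from S(P, t-1), recent encounters and a self-assessment.

    All inputs and the result are in [0, 1]. The noisy-OR combination is an
    illustrative assumption, not a formula from the post.
    """
    # Probability that none of the sources makes the subject infectious.
    not_infectious = 1.0 - carry_over * prev_score
    for e in encounters:
        not_infectious *= 1.0 - e.exposure * e.other_score
    not_infectious *= 1.0 - self_assessment
    return 1.0 - not_infectious
```

A tuned model would also need to decide how often the same encounter may contribute while it stays inside the window; this sketch counts whatever encounters it is given.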
Rewriting history:
- Test results come in with a delay (e.g. one day between tTest and current time t).
Once available, the estimate for the infectiousness of the subject between tTest and t
will be “overwritten” with an updated, more accurate estimate for that already-passed
time period that takes the results of the test into account.
- Similarly, subjects may be infectious prior to experiencing any symptoms. Once symptoms
are apparent, all prior estimates of infectiousness of the subject will be
recalculated over some time window whose length is determined by some assumptions
about the disease (incubation time, time of infectiousness prior to symptoms etc).
- When subject P’s history is rewritten, the histories (and current scores) of all
subjects that have previously taken the history of subject P into account for their
own scores need to be recalculated and rewritten, now using the rewritten history.
This may happen recursively. History may be overwritten repeatedly for a given
subject, which again triggers rewrites for other subjects. (More efficient algorithms
producing the same result can be found.)
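The recursive rewrite can be sketched as a worklist over a "who used whose history" graph. This is only one possible shape, under stated assumptions: `dependents` and the `recompute` callback are hypothetical, and propagation stops once a rewrite changes a score by less than `tolerance`, which assumes that changes shrink as they spread. The `max_updates` cap is a safety net in case they do not.

```python
from collections import deque


def propagate_rewrite(start, dependents, recompute,
                      tolerance=1e-3, max_updates=10_000):
    """Recompute every subject whose score depends on a rewritten history.

    dependents: dict mapping a subject to the subjects that used its history.
    recompute:  hypothetical callback; recomputes one subject's history and
                returns the absolute change in that subject's current score.
    Returns the list of subjects recomputed, in order.
    """
    queue = deque(dependents.get(start, ()))
    recomputed = []
    while queue and len(recomputed) < max_updates:
        subject = queue.popleft()
        change = recompute(subject)
        recomputed.append(subject)
        # Keep spreading only if the rewrite changed something noticeable.
        if change > tolerance:
            queue.extend(dependents.get(subject, ()))
    return recomputed
```

Note that two people who met each other appear in each other's dependents, so a subject can be recomputed more than once in a single propagation.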
Additional potential inputs to the algorithm:
- A rating of a subject’s interventions that may modify their infectiousness, such as:
  - wearing a mask;
  - intentionally exhaling at others;
  - etc.
So far, we have assumed that transmission can only occur between two people in the same
location. However, there are other forms of transmission, such as:
- transmission via a contaminated surface, within the time interval during which the
virus remains active on that surface;
- transmission via air droplets in an enclosed space, within a certain time interval.
To account for these forms of transmission, the algorithm is extended to also include
estimates of the infectiousness of objects in certain locations. Similar to people,
these objects have an infectiousness score that is a function of which people (and
their scores) have interacted with it in times prior, its previous infectiousness
score and the passage of time.
The score of objects in the vicinity is considered as part of the algorithm to update
S(P,t) in a corresponding manner to that of people.
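As an illustration of an object score that depends on its previous value, on the people who interacted with it and on the passage of time, here is a minimal sketch. The exponential decay, the `half_life_hours` and `deposit` parameters and the function name are assumptions made for this example, not values or rules from the post.

```python
def update_object_score(prev_score, visitor_scores, hours_elapsed,
                        half_life_hours=6.0, deposit=0.5):
    """Return the infectiousness score of an object or location, in [0, 1]."""
    # Contamination fades as time passes (exponential decay, an assumption).
    decayed = prev_score * 0.5 ** (hours_elapsed / half_life_hours)
    # Each visitor may re-contaminate the object in proportion to their score.
    clean = 1.0 - decayed
    for s in visitor_scores:
        clean *= 1.0 - deposit * s
    return 1.0 - clean
```

The resulting score could then be passed into the update of S(P,t) in the same way as the score of a person encountered at that location.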
User experience
- Users run an app on their mobile phones.
- From time to time, the app asks the user how they feel. Specifically, it asks
about symptoms related to COVID-19, such as fever, fatigue, cough etc.
- The app’s main screen shows an easy-to-understand visual representation of the
likely infectiousness score, such as a color code (e.g. green: unlikely to infect).
- When the app reports a score above a certain threshold, the subject goes into
shelter-in-place or quarantine. (Legal questions about whether this is voluntary
or legally required are out of scope for this discussion; certainly regulations such
as “must be sheltered-in-place unless score is green” would be possible.)
- Before two (or more) people meet in person, they can agree on the maximum score that
a participant may have in order to take part in the meeting. (Such a maximum score
may also be legally mandated.) The participants check each other’s scores before the
meeting.
- Before a business admits a customer (or employee) onto the premises, it requires the
customer or employee to share their score. Access is denied if the score is above a
certain threshold. The business may also deny access to visitors who do not have a
score or are unwilling to display it.
- When the user gets tested, they enable the testing provider to add the test results
to their record so that they can be used to calculate the score going forward.
- Depending on the implementation choices made, the mobile phone may need to be
connected to the internet, to a local WiFi network and/or have Bluetooth on as sender
or receiver or both.
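The color code and the admission check described above can be sketched in a few lines. The cut-off values, the three-color scheme and the function names are hypothetical, and treating a missing score as grounds for denial is one possible policy choice rather than a requirement.

```python
def color_for(score, green_max=0.1, yellow_max=0.3):
    """Map a score in [0, 1] to a color shown on the app's main screen."""
    if score <= green_max:
        return "green"   # unlikely to infect
    if score <= yellow_max:
        return "yellow"
    return "red"


def may_enter(score, max_allowed):
    """Decide whether a visitor is admitted to a meeting or business.

    score is None when the visitor has no score or declines to share it;
    this sketch denies access in that case.
    """
    return score is not None and score <= max_allowed
```

The same `may_enter` check serves both the meeting case (participants agree on `max_allowed`) and the business case (the business, or a regulation, sets it).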
Assumptions / challenges
- Test results can be brought into the system in a way that defeats tampering: we cannot
allow a subject to fake negative test results, for example, or to eliminate positive
test results from consideration.
- Individuals may be tempted to fake their scores in order to enter a certain venue,
for example by displaying a static screenshot on their phone instead of their
live score. Technical means (e.g. timestamping the display, or simultaneously broadcasting
the score via wireless networking) can be employed to make this more difficult. This
approach would also use technical means (e.g. public keys, app stores) to prevent
“rogue apps” with false scores from participating.
- In a naive implementation, the entire record of every subject (potentially the entire
world population) would be centrally collected. This would create a privacy nightmare and
enable substantial future harm from dangers that are not biological in nature. So we
assume that the implementation would need to be performed in a fashion that does not have
a central point of data collection.
- Location accuracy for this app is paramount. The absolute coordinates are less important,
but relative coordinates between two subjects need to be determined as well as possible,
as a distance of 2ft vs 8ft makes a substantial difference to the likelihood of
transmission. This could be addressed with technical means (e.g. Bluetooth, NFC), user
input (e.g. verify / enter into the app the people currently in close proximity) or a
combination.
- The space in which an encounter occurs is highly relevant. For example, a 10 min
contact at 6ft inside a small, enclosed space without ventilation has dramatically
different transmission characteristics than contact of the same duration and distance
in open nature with a slight wind. This also could be addressed with technical means
(e.g. mapping information), user input (e.g. enter into the app whether the surroundings
are enclosed space, ventilated, open window, city street, open nature etc) or a
combination.
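To make the "timestamping the display" idea concrete, here is a minimal sketch of a score that is authenticated and expires quickly, so that an old screenshot fails verification. It uses a shared-secret HMAC from the Python standard library purely to keep the example short; the public keys mentioned above would replace it in practice, since a verifier that holds the shared secret could also forge scores. The token layout, function names and `max_age_seconds` are assumptions.

```python
import hashlib
import hmac
import json


def issue_score_token(score, issued_at, key):
    """Return (payload, tag) for a score displayed at time issued_at (seconds)."""
    payload = json.dumps({"score": score, "issued_at": issued_at},
                         sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag


def verify_score_token(payload, tag, key, now, max_age_seconds=60):
    """Return the score if the token is authentic and fresh, else None."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return None  # payload was altered, or the wrong key was used
    data = json.loads(payload)
    if not 0 <= now - data["issued_at"] <= max_age_seconds:
        return None  # stale, e.g. a screenshot taken earlier
    return data["score"]
```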
Approach to Privacy
It appears possible to keep most information needed for the functioning of the system
on individual users’ mobile phones without requiring a centralized data repository:
- The algorithm can run locally on local data.
- Detection of other people in the neighborhood can be performed via local wireless networking
(e.g. WiFi, zeroconf, Bluetooth).
- The communication between mobile phones of people in an encounter to exchange scores can be
performed using secure end-to-end encryption between the phones using any networking technology
including through a centralized backend. This would not compromise privacy significantly.
- To trigger history rewrites in other phones, those connections to other phones can be
remembered and re-activated (including identity / encryption keys). This may use
some existing centralized communication network (e.g. instant messenger) or a decentralized
alternative with a distributed hash table for lookup, for example.
- None of the functionality or communication requires more than pseudonymous identity.
No centralized account or identity verification is required, with the potential exception
of entering verified testing results. However, in this case, the identity correlation
remains local on the user’s device and is never shared beyond it.
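A rough sketch of what "local data under a pseudonymous identity" could look like on one phone follows. The class and its fields are hypothetical; the point is only that an encounter record needs nothing beyond a random pseudonym, a time and a score, and that the list of recent contacts (needed to trigger history rewrites) can be derived from it locally.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class LocalEncounterLog:
    """Encounter history kept only on the user's own device."""
    # Random identifier with no link to a name, account or phone number.
    pseudonym: str = field(default_factory=lambda: secrets.token_hex(16))
    encounters: list = field(default_factory=list)

    def record(self, other_pseudonym, time, other_score):
        """Store one encounter: who (pseudonymously), when, and their score."""
        self.encounters.append((other_pseudonym, time, other_score))

    def contacts_since(self, time):
        """Pseudonyms to notify if this user's history is rewritten from `time` on."""
        return {p for p, t, _ in self.encounters if t >= time}
```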
Public health reporting and management
- The app can report scores to the public health authorities, who can then track the
spread of the disease, both actual and as a best-guess estimate, in real time.
- For privacy reasons, scores do not need to be associated with other identifying
attributes, although it may be advantageous to share demographic info such as age,
and approximate (maybe rasterized) geographic location of the subject.
- Key parameters of the algorithm – e.g. thresholds for “acceptable” scores for
certain activities – could be centrally updated by the public health authorities,
in order to “shape” the progression of the disease in real time.
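One way to read "approximate (maybe rasterized) geographic location" is to snap coordinates to a coarse grid before reporting. The sketch below does that and builds a report that carries no identifier; the 0.1 degree cell size, the ten-year age bands and the field names are assumptions chosen for illustration.

```python
import math


def rasterize(lat, lon, cell_degrees=0.1):
    """Snap coordinates to the south-west corner of their grid cell."""
    lat_cell = math.floor(lat / cell_degrees) * cell_degrees
    lon_cell = math.floor(lon / cell_degrees) * cell_degrees
    # Rounding removes floating-point noise from the multiplication.
    return round(lat_cell, 6), round(lon_cell, 6)


def build_report(score, age, lat, lon):
    """Build a report with coarse demographics and location, and no identity."""
    decade = (age // 10) * 10
    return {
        "score": score,
        "age_band": f"{decade}-{decade + 9}",
        "cell": rasterize(lat, lon),
    }
```

A larger `cell_degrees` gives more privacy at the cost of less geographic detail for the health authorities.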
Algorithmic improvements
- The intentional distribution of data and computation for privacy reasons, instead of
centrally collecting it all, needs to be weighed against the need to continually debug
and improve the algorithm.
- To be able to understand the functioning of the algorithm in the field, and to make
improvements, it appears sufficient to report the time histories of scores centrally,
including rewritten histories. It does not appear necessary to identify the specific other
people whose scores were used as input to the algorithm, nor the locations where
encounters took place.
- Should more detailed information be required, collecting it from a relatively small
sample of volunteers should be sufficient.