For other articles on maths and slides for talks I have given, visit my Thoughts on Maths page.
With the global spread of the coronavirus (Covid-19), scientists, doctors and public health officials are racing to study the spread of the disease in an effort to understand how it might affect people in different countries. You might be wondering how the spread of diseases is studied (hint: it's maths). I hope this article gives you a glimpse of this.
The starting point: exponential models
You will have studied exponential models of population at school:
\[P(n)=a(1+b)^{n}\]
Here, \(n\) is time (e.g. in days), \(P(n)\) represents the population at time \(n\), and \(a\) and \(b\) are parameters. We can interpret the parameters as follows:
- \(a\) is the initial population
- \(b\) is the growth/decay rate: when time \(n\) increases by 1, the population multiplies by \(1+b\). In other words, a \((b\times 100)\)% increase/decrease.
The last part means that we can also write \[ P(n)=(1+b)P(n-1)\] i.e. the population at time \(n\) is obtained by multiplying the population at time \(n-1\) by \((1+b)\). This also shows that \(P(n)\) is a geometric sequence with common ratio \((1+b)\) (you might have seen before that exponential models give rise to geometric sequences).
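To see that the two ways of writing the model agree, here is a minimal Python sketch. The function names and the values \(a=100\), \(b=0.1\) are my own choices for illustration and are not taken from any real data.

```python
def population_closed_form(a, b, n):
    """P(n) = a * (1 + b)**n, the exponential model written directly."""
    return a * (1 + b) ** n


def population_recurrence(a, b, n):
    """Same model, built up one step at a time using P(n) = (1 + b) * P(n - 1)."""
    p = a  # P(0) is the initial population a
    for _ in range(n):
        p = (1 + b) * p
    return p


# Illustrative values only: start with 100 individuals and grow by 10% per step.
a, b = 100.0, 0.1
for n in range(6):
    print(n, population_closed_form(a, b, n), population_recurrence(a, b, n))
```

Up to floating-point rounding, both columns print the same numbers, which is the geometric sequence with first term \(a\) and common ratio \(1+b\).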
We now wish to apply this to epidemics: \(P(n)\) is the population of people carrying the disease. However, this gives rise to a problem in the model.
- If \(b<0\), we have exponential decay - the population of people carrying the disease decreases to \(0\). Clearly, this is what we want to happen to the disease but is not happening at the moment. So the model doesn't reflect reality.
- If \(b>0\), we have exponential growth - the population of people carrying the disease increases very quickly and without limit, so it would eventually exceed the total number of people there are. This can't reflect reality either (hopefully!)
We need a more sophisticated model beyond exponential models!
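To get a feel for how quickly exponential growth stops making sense, here is a small sketch that counts the steps until the modelled infected population is larger than the whole population. The function name and the numbers in the example are hypothetical and chosen only to illustrate the point.

```python
def steps_until_exceeds(a, b, total_population):
    """Smallest n for which a * (1 + b)**n is greater than total_population.

    Assumes a > 0 and b > 0 (exponential growth); otherwise the loop
    below would never end, so we refuse those inputs.
    """
    if a <= 0 or b <= 0:
        raise ValueError("need a > 0 and b > 0 for exponential growth")
    p, n = a, 0
    while p <= total_population:
        p *= 1 + b  # one step of P(n) = (1 + b) * P(n - 1)
        n += 1
    return n


# Illustrative only: one infected person, doubling every step (b = 1),
# in a population of 1000 people.
print(steps_until_exceeds(1, 1.0, 1000))
```

In this example the model claims more than 1000 infected people after only 10 steps (since \(2^{10}=1024\)), even though only 1000 people exist.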
The SIR model
The gold standard of epidemiological models is the SIR model, which was introduced by Kermack and McKendrick in 1927. It considers three related populations and how they interact with each other. The three populations are
- Susceptible individuals - those who have not contracted the disease before.
- Infected individuals - those who currently have the disease.
- Recovered individuals - those who have recovered from the disease.
It's important to mention the distinction between "Susceptible" individuals and "Recovered" individuals. For many diseases (such as measles), if an individual contracts the disease, their body produces antibodies which prevent them from being reinfected. This does not apply to the flu because the influenza virus constantly mutates, meaning that antibodies do not offer lasting protection. It's unclear at this moment whether it applies to 2019-nCoV. Somewhat morbidly, we also include the people who died in the "recovered" population - obviously they also cannot be reinfected. It is therefore more accurate to think of the third population as the "recovered/removed" population.
Let
- \(N\) be the total population of individuals;
- the population of susceptible individuals be \(S(n)\);
- the population of infected individuals be \(I(n)\);
- the population of recovered/removed individuals be \(R(n)\).
The infected population \(I(n)\) will be modelled using something similar to an exponential model. We start from the formula \(I(n)=(1+b)I(n-1)\) for an exponential model. In each time step, some susceptible individuals become infected while some infected individuals recover. This suggests we can write \(b=\alpha-\gamma\), where \(\alpha\) is the "infection rate" and \(\gamma\) is the "recovery rate".
However, we would expect the infection rate to depend on the proportion of the total population that is susceptible, \(\frac{S(n-1)}{N}\), since how fast the infected population grows depends on how many susceptible people there are. We replace \(\alpha\) with \(\beta \frac{S(n-1)}{N}\), giving the following formula \[I(n) = \left(1+\beta \frac{S(n-1)}{N} - \gamma\right) I(n-1). \]
This also gives us the formulas for the other populations. People who are susceptible either stay susceptible or become infected, hence \[S(n)=S(n-1) - \beta \frac{S(n-1)}{N}I(n-1).\] Similarly, the recovered/removed population grows by the number of infected individuals who recover, in other words \[R(n)=R(n-1) + \gamma I(n-1).\]
These three equations for \(S(n)\), \(I(n)\), and \(R(n)\) form the SIR model.
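The three update rules can be run directly on a computer. Below is a minimal sketch in Python; the function name `simulate_sir`, its arguments and the parameter values in the example are my own assumptions for illustration, not estimates for any real disease. Note that nothing in the formulas stops the populations from being fractions, and for very large \(\beta\) or \(\gamma\) the discrete model can even produce negative values, so this is only meant for modest parameter values.

```python
def simulate_sir(total_population, initial_infected, beta, gamma, days):
    """Iterate the discrete-time SIR equations for the given number of days.

    Returns three lists S, I, R where index n holds the value at time n.
    Assumes nobody has recovered at time 0.
    """
    N = total_population
    S = [N - initial_infected]
    I = [initial_infected]
    R = [0.0]
    for _ in range(days):
        new_infections = beta * S[-1] / N * I[-1]  # people moving from S to I
        new_removals = gamma * I[-1]               # people moving from I to R
        S.append(S[-1] - new_infections)
        I.append(I[-1] + new_infections - new_removals)
        R.append(R[-1] + new_removals)
    return S, I, R


# Illustrative values only: 1000 people, 1 initially infected.
S, I, R = simulate_sir(1000.0, 1.0, beta=0.5, gamma=0.1, days=60)
for n in range(0, 61, 10):
    print(n, round(S[n], 1), round(I[n], 1), round(R[n], 1))
```

A useful sanity check is that \(S(n)+I(n)+R(n)\) stays equal to \(N\) for every \(n\): whatever leaves one population enters another.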
Understanding the parameters
The graph below shows the susceptible, infected and recovered/removed populations for each \(n\). You can see the characteristic shape of the curves.
I used GeoGebra to make this plot. You can access the worksheet here: https://www.geogebra.org/classic/yujqgy5b. Drag the sliders for the parameters to see what happens.
green = susceptible, red = infected, blue = recovered/removed
The only parameters of the SIR model are \(\beta\) and \(\gamma\). In particular, the ratio \(R_{0}= \beta/\gamma \) is of special interest. It is called the basic reproduction number.
- If \(R_{0}\gt\frac{N}{S(n-1)}\), then \(\beta \frac{S(n-1)}{N} > \gamma \) and hence \(I(n) >I(n-1)\). This means that people are becoming infected faster than they are recovering, so the population of infected individuals is increasing. When this is the case, the disease is in an epidemic phase.
- If \(R_{0}=\frac{N}{S(n-1)}\), then \(\beta \frac{S(n-1)}{N} = \gamma \) and hence \(I(n)=I(n-1)\). This means that the overall population of infected individuals is unchanging: the number of people newly infected is the same as the number of people newly recovered. This is the endemic phase.
- If \(R_{0}\lt\frac{N}{S(n-1)}\), then \(\beta \frac{S(n-1)}{N} < \gamma \) and hence \(I(n) \lt I(n-1)\). This means that the overall population of infected individuals is decreasing: people are recovering faster than they are becoming infected. If this is the case, then the disease will die out.
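To see where these three cases come from, subtract \(I(n-1)\) from both sides of the formula for \(I(n)\) and use \(\beta=\gamma R_{0}\): \[I(n)-I(n-1)=\left(\beta\frac{S(n-1)}{N}-\gamma\right)I(n-1)=\gamma\left(R_{0}\frac{S(n-1)}{N}-1\right)I(n-1).\] Assuming \(\gamma>0\) and \(I(n-1)>0\), the sign of the change in the infected population is the sign of \(R_{0}\frac{S(n-1)}{N}-1\). This is positive, zero or negative exactly when \(R_{0}\) is greater than, equal to or less than \(\frac{N}{S(n-1)}\).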
This gives a very simple target for public health officials: take steps to make \(R_{0}<\frac{N}{S(n-1)} \).
This can be done by:
1) Reducing \(\beta \), the rate at which individuals are infected. Steps to do this include: improving awareness, imposing travel restrictions, and reporting and quarantining cases.
2) Increasing \(\gamma\), the rate at which people recover. Steps to do this include: researching the disease to find effective methods of treatment.
3) Reducing \( S(n-1) \), the size of the susceptible population. This is what vaccination achieves - a vaccinated individual is no longer susceptible and hence can be considered part of the recovered/removed population.
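As a rough illustration of option 3, we can rearrange the target \(R_{0}<\frac{N}{S(n-1)}\) into \(\frac{S(n-1)}{N}<\frac{1}{R_{0}}\). In other words, the fraction of the population that is not susceptible must be more than \(1-\frac{1}{R_{0}}\). The sketch below computes this threshold. The function name is hypothetical, the \(R_{0}\) values are made up for illustration, and the calculation assumes a vaccine that makes every vaccinated person completely non-susceptible.

```python
def minimum_non_susceptible_fraction(r0):
    """Fraction of the population that must be non-susceptible so that R0 < N/S.

    Rearranging R0 < N/S gives S/N < 1/R0, so the non-susceptible
    fraction 1 - S/N has to be larger than 1 - 1/R0.
    """
    if r0 <= 0:
        raise ValueError("r0 must be positive")
    # If R0 <= 1, the infected population cannot grow even when
    # everyone is susceptible, so the threshold is 0.
    return max(0.0, 1.0 - 1.0 / r0)


# Illustrative R0 values only, not estimates for any particular disease.
for r0 in (1.5, 2.0, 4.0):
    print(r0, minimum_non_susceptible_fraction(r0))
```

The larger \(R_{0}\) is, the larger the share of the population that has to be moved out of the susceptible group before the infected population starts to fall.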
Further ideas
How else can the model be used? We can understand more about the disease by studying the data of the epidemic as it unfolds. For example, we can see data on the population of susceptible, infected and recovered/removed individuals day to day. This allows us to get estimates for the parameters \(\beta\) and \(\gamma\). We can then deduce information about how the disease is spreading e.g. whether it's spreading through the air. This allows scientists and doctors to give suitable recommendations to the public on what precautions to take.
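Here is one naive way the estimation could be sketched, assuming we had perfect day-to-day counts (real data is far messier, and real studies use more careful statistical methods). Rearranging the SIR equations gives \(\gamma = \frac{R(n)-R(n-1)}{I(n-1)}\) and \(\beta = \frac{\left(S(n-1)-S(n)\right)N}{S(n-1)I(n-1)}\) for each day, and we can average these over all days. The function name and the synthetic data are my own assumptions for illustration.

```python
from statistics import mean


def estimate_beta_gamma(S, I, R, N):
    """Average the day-by-day values of beta and gamma implied by the data."""
    beta_samples, gamma_samples = [], []
    for n in range(1, len(S)):
        if S[n - 1] <= 0 or I[n - 1] <= 0:
            continue  # the rearranged formulas would divide by zero
        beta_samples.append((S[n - 1] - S[n]) * N / (S[n - 1] * I[n - 1]))
        gamma_samples.append((R[n] - R[n - 1]) / I[n - 1])
    if not beta_samples:
        raise ValueError("need at least one day with S > 0 and I > 0")
    return mean(beta_samples), mean(gamma_samples)


# Synthetic "data": generated from the SIR equations themselves with
# made-up parameters, so we know what the answer should be.
N = 1000.0
true_beta, true_gamma = 0.5, 0.1
S, I, R = [N - 1.0], [1.0], [0.0]
for _ in range(30):
    new_infections = true_beta * S[-1] / N * I[-1]
    new_removals = true_gamma * I[-1]
    S.append(S[-1] - new_infections)
    I.append(I[-1] + new_infections - new_removals)
    R.append(R[-1] + new_removals)

beta_hat, gamma_hat = estimate_beta_gamma(S, I, R, N)
print(beta_hat, gamma_hat)
```

Because the synthetic data follows the model exactly, the estimates come out equal to the parameters used to generate it, up to rounding. With real data the daily values would scatter, and the spread would tell us how well the model fits.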
In the above discussion, we considered what is called the "discrete-time" case. In other words, time advances one day at a time. In practice, the continuous-time case would usually be used. However, this requires knowledge of advanced calculus!
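For the curious, here is a sketch of what the continuous-time version looks like. If time is a continuous variable \(t\) and the day-to-day differences in the three SIR equations are replaced by derivatives, we get the differential equations \[\frac{dS}{dt}=-\beta\frac{S I}{N},\qquad \frac{dI}{dt}=\beta\frac{S I}{N}-\gamma I,\qquad \frac{dR}{dt}=\gamma I.\] Adding the three right-hand sides gives \(0\), so \(S+I+R\) stays equal to \(N\), just as in the discrete-time case.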
For more details and a discussion of the continuous-time case, see these slides produced by a professor at the University of Hong Kong whom I know well. Also see the Wikipedia article.