
Ordinary Differential Equations (14): Epidemic Models and Epidemiology
Mathematical epidemiology from first principles. Build the SIR and SEIR models, derive R0 and the herd-immunity threshold, fit COVID-style scenarios with asymptomatic transmission and time-varying interventions.
In early 2020 the entire world watched a small system of ordinary differential equations decide policy. “Flatten the curve” was not a slogan; it was the intuition of a specific equation. Herd immunity was not a guess; it was the threshold $1 - 1/R_0$ derived in a single line. The SIR model — four lines of math, written down in 1927 by Kermack and McKendrick — turned out to be precise enough to drive trillion-dollar decisions.
This chapter builds that machinery from scratch. We start with the basic SIR model, derive every threshold and final-size relation analytically, and then layer on the realism: incubation periods (SEIR), asymptomatic transmission, vaccinations, and time-varying interventions (a stylised COVID-style scenario). Throughout, the goal is not to believe a forecast but to understand which mechanism a given parameter controls.

What You Will Learn#
- The SIR model as a 3-equation system: compartments, parameters, and the basic reproduction number $R_0 = \beta/\gamma$
- The threshold theorem ($R_0 > 1 \Leftrightarrow$ outbreak) and the final-size relation $S_\infty = S_0 e^{-R_0(1 - S_\infty)}$
- Closed-form expressions for the peak height $I^* = 1 - 1/R_0 - \ln R_0 / R_0$ and the herd-immunity threshold $1 - 1/R_0$
- The SEIR variant, latent-period dependence, and why it slows the initial growth rate
- COVID-19 extensions: asymptomatic compartment, intervention-induced time-varying $R_e(t)$ , reporting iceberg
- Network-level reproduction numbers and the role of super-spreaders
- Practical fitting: what $R_0$ , doubling time, and serial interval actually measure
Prerequisites: phase-plane analysis from Chapter 7 , nonlinear stability from Chapter 8 , numerical methods from Chapter 11 .
The SIR Model#
Split the population into three compartments:
- $S$ — susceptible (can catch the disease)
- $I$ — infectious (currently transmitting)
- $R$ — removed (recovered with immunity, isolated, or dead)
Two parameters carry all the physics:
- $\beta$ — transmission coefficient: average effective contacts per unit time, times probability of transmission per contact.
- $\gamma$ — removal rate: $1/\gamma$ is the average duration of infectiousness.
The total $S + I + R \equiv N$ is conserved (there are no births or deaths in the closed model), so the system is genuinely two-dimensional.
The basic reproduction number $R_0$ #
$$ \dot I \approx (\beta - \gamma) I, $$ $$ \boxed{\;R_0 \equiv \frac{\beta}{\gamma} > 1.\;} $$Threshold theorem. The disease-free equilibrium is locally stable iff $R_0 < 1$ . This number has a beautifully concrete meaning: it is the expected number of secondary infections caused by one typical infectious individual placed into a fully susceptible population. If each infected person infects fewer than one other, the chain dies out. If more than one, it explodes — at first.
What controls the peak?#
$$ \frac{dI}{dS} = \frac{\gamma}{\beta S/N} - 1 = \frac{1}{R_0\,S/N} - 1. $$ $$ i(s) = i_0 + (s_0 - s) + \frac{1}{R_0}\ln\frac{s}{s_0}. $$ $$ \boxed{\;I^* \;\approx\; 1 - \frac{1}{R_0} - \frac{\ln R_0}{R_0}.\;} $$And the time of peak is when $S$ first crosses $1/R_0$ . Both quantities are determined by $R_0$ alone (with $\gamma$ setting the timescale).
Final size: who escapes?#
$$ \boxed{\;S_\infty = S_0\,\exp\!\bigl[-R_0\,(1 - S_\infty/N)\bigr].\;} $$For $R_0 = 2.5$ and $S_0 \approx N$ , the equation gives $S_\infty / N \approx 0.107$ — about 89% of the population is infected by the end. Surprising, but a one-line consequence of the math.

Top-left: classic SIR time series for $R_0 = 2.5,\ 1/\gamma = 10$ d — $S$ collapses, $I$ peaks then dies, $R$ saturates. Top-right: phase portrait in $(S, I)$ — trajectories enter the upper half-plane, peak when crossing the vertical line $S = 1/R_0$ , and spiral toward the $S$ -axis. Bottom-left: cumulative incidence $1 - S$ asymptotes to the final-size value (red dashed line). Bottom-right: $S_\infty$ and $1 - S_\infty$ as functions of $R_0$ — the curve is steep just above 1, meaning small changes in $R_0$ cause huge changes in eventual reach.
| |
Sensitivity to $R_0$ #
$R_0$ is the lever that determines outcomes. A factor-of-two change in $R_0$ can multiply peak demand on hospitals five-fold and pull the peak forward by weeks. The sensitivity is also analytic, which is rare.

Left: a family of $I(t)$ curves at $R_0 \in \{1.2, 1.6, 2.0, 2.5, 3.5, 5.0\}$ for fixed $1/\gamma = 10$ d. Higher $R_0$ -> higher and earlier peak. Middle: peak fraction infected (red) and time of peak (blue) versus $R_0$ . The black dashed line is the analytical formula $1 - 1/R_0 - \ln R_0 / R_0$ — spot on. Right: total fraction infected at the end (red) and the herd-immunity threshold $1 - 1/R_0$ (green dashed). Note that the final size always overshoots HIT — the epidemic does not stop the instant immunity reaches the threshold; it stops when transmission can no longer sustain itself, by which time many “extra” infections have happened.
Vaccination and Herd Immunity#

For $R_e \leq 1$ we need $v \geq 1 - 1/R_0$ . This is the herd-immunity threshold (HIT): the minimum fraction that must be immune for an introduced case to die out, on average.
| Disease | Typical $R_0$ | HIT |
|---|---|---|
| Influenza | 1.5 | 33% |
| COVID-19 (Wuhan strain) | 2.5 | 60% |
| SARS | 3.0 | 67% |
| COVID-19 (Delta) | 5.0 | 80% |
| Smallpox | 6.0 | 83% |
| Mumps | 10 | 90% |
| Measles | 15 | 93% |
For high-$R_0$ diseases combined with imperfect vaccines this can become infeasible ($v > 1$ ).

Top-left: $I(t)$ for vaccination coverages $v = 0,\ 0.20,\ 0.40,\ HIT,\ 0.85$ . At $v = HIT$ the curve is barely an outbreak; at $v = 0.85$ no outbreak occurs at all. Top-right: peak and additional infections versus coverage; the green region is post-HIT. Bottom-left: HIT versus $R_0$ with example diseases marked. Bottom-right: required coverage $v$ as a function of $R_0$ for vaccine efficacies $\mathrm{VE} \in \{0.5, 0.7, 0.9, 1.0\}$ — the “infeasible” red shading shows where high $R_0$ + low VE makes herd immunity unattainable through vaccination alone.
The picture also justifies the public-health emphasis on getting the laggards vaccinated: under a high-$R_0$ disease, the last 10% matters as much as the first 60%.
The SEIR Model#
$$\dot S = -\frac{\beta SI}{N}, \quad \dot E = \frac{\beta SI}{N} - \sigma E, \quad \dot I = \sigma E - \gamma I, \quad \dot R = \gamma I.$$The transition rate $\sigma$ has $1/\sigma$ = average latent duration. The basic reproduction number is unchanged: $R_0 = \beta/\gamma$ . So, do incubation periods matter?
$$ r^2 + (\sigma + \gamma)\,r + \sigma(\gamma - \beta) = 0. $$ $$ r_{\text{SEIR}} = \frac{1}{2}\!\left[-(\sigma + \gamma) + \sqrt{(\sigma + \gamma)^2 + 4\sigma\gamma(R_0 - 1)}\right] < r_{\text{SIR}} = \beta - \gamma. $$The latent stage slows the early exponential growth, even though $R_0$ is unchanged. So at fixed $R_0$ , SEIR predicts a later, slightly lower peak than SIR. Equivalently: doubling time depends on the generation interval $T_g \approx 1/\sigma + 1/\gamma$ , not just on $R_0$ .

Top-left: full SEIR trajectory at $R_0 = 3,\ 1/\sigma = 5\ \text{d},\ 1/\gamma = 7\ \text{d}$ . Top-right: SEIR vs SIR at the same $R_0$ — SEIR peaks later and is slightly lower. Bottom-left: family of SEIR $I(t)$ for $1/\sigma \in \{0.5, 2, 5, 10, 20\}$ d — longer latent periods produce later, broader peaks. Bottom-right: initial growth rate $r$ as a function of latent period; doubling time on the right axis. As $\sigma \to \infty$ (instantaneous transition), SEIR recovers SIR.
A COVID-Style Extension#
Real epidemics need more than SEIR. COVID-19 introduced four mathematical wrinkles:
- Asymptomatic transmission. A fraction $p$ of infections never develop symptoms but still spread (with reduced infectiousness $\kappa$ ).
- Time-varying $\beta$ . Lockdowns, mask mandates, and behavioural changes alter transmission.
- Reporting iceberg. Only a fraction of true cases gets detected; reported = $\rho \cdot I_s$ with $\rho < 1$ .
- Variants. New strains restart the dynamics with a fresh $R_0$ .
We want $R_e(t) < 1$ . Two ways: shrink $\beta$ (interventions) or shrink $S/N$ (immunity).

Top-left: a stylised four-phase scenario — baseline $\beta = 0.6$ for 30 days, lockdown $\beta = 0.18$ for 30 days, partial relaxation $\beta = 0.30$ , then variant + relaxation $\beta = 0.40$ . Top-right: effective $R_e(t)$ — the lockdown phase pushes $R_e$ below 1 (green region) and the third wave climbs above 1 again. Bottom-left: reporting iceberg — the dashed black line is what surveillance would report ($\rho I_s$ ), far below the true infectious prevalence $I_s + I_a$ (purple). Bottom-right: counterfactual comparison of cumulative infected with vs without interventions. The gap is the cases averted — the policy benefit, expressed as a fraction of the population.
This is not a fit to real data — it is a clean cartoon of the structure that real fits use. The same equations, fitted to actual reported case time series with Bayesian inference, drove official forecasts in 2020-2022.
Network and Heterogeneous Models#
$$ R_0 = \frac{\beta}{\gamma}\,\frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle}. $$For scale-free networks ($P(k) \propto k^{-\alpha}$ with $2 < \alpha < 3$ ), $\langle k^2 \rangle$ diverges with system size. This means the epidemic threshold goes to zero — on such networks, even very low transmissibility can sustain an outbreak. Super-spreaders dominate the early dynamics in real epidemics; targeting them with contact tracing is disproportionately effective.
Two conceptual lessons:
- Average contact rate underestimates risk; the variance matters as much.
- Equal-coverage interventions waste resources; prioritise high-degree nodes.
Applications and Limits#
Where the math wins#
- Order-of-magnitude forecasting. Will hospitals overflow in 4 weeks? SEIR with the right $R_0$ gives a usable answer.
- Threshold reasoning. “How much vaccination do we need?” -> $1 - 1/R_0$ , divided by VE. No fit needed.
- Counterfactual comparison. “How many lives did the lockdown save?” -> Run with and without intervention; difference is the answer (subject to $R_0$ uncertainty).
Where it loses#
- Exact predictions. $R_0$ varies between settings, populations, weather, behavioural changes. Exact case counts beyond a few weeks are not reliable.
- Heterogeneity ignored. Age structure, geography, household clustering all matter; mean-field models give the wrong peak timing in detail.
- Behavioural feedback. People reduce contacts when they see the news. This is a closed-loop system, and ignoring the feedback inflates predicted peaks.
The right way to use the math is as a structured language for arguing about scenarios, not as a literal forecast.
Summary#
| Concept | Key formula |
|---|---|
| Basic reproduction number | $R_0 = \beta / \gamma$ |
| Outbreak threshold | $R_0 > 1$ |
| Peak fraction infectious | $I^* \approx 1 - 1/R_0 - \ln R_0 / R_0$ |
| Final size relation | $S_\infty = S_0\,e^{-R_0(1 - S_\infty/N)}$ |
| Herd-immunity threshold | $1 - 1/R_0$ |
| Imperfect vaccine | $v \geq (1 - 1/R_0)/\mathrm{VE}$ |
| SEIR initial growth | $r$ from $r^2 + (\sigma + \gamma)r + \sigma(\gamma - \beta) = 0$ |
| Effective $R$ | $R_e(t) = R_0(t)\,S(t)/N$ |
| Network $R_0$ | $\beta\,\langle k^2 - k\rangle / (\gamma\,\langle k \rangle)$ |
Exercises#
Conceptual.
- Why does the SIR final size overshoot the herd-immunity threshold? Make the argument both intuitive and analytical.
- The serial interval and the generation interval are both proxies for “time between successive infections”. Define them, and explain why they differ.
- Two countries report the same case-doubling time. Could they have very different $R_0$ ? Use SEIR to argue.
Computational.
- Solve the SIR system for $R_0 = 1.05, 1.5, 3.0$ and check the final-size formula numerically against the transcendental equation.
- For SEIR at $R_0 = 3$ , plot the generation interval $1/\sigma + 1/\gamma$ versus the doubling time; verify that doubling time grows linearly with generation interval at fixed $R_0$ .
- Implement the asymptomatic COVID model, fit it to your country’s first-wave reported time series with two free parameters ($\beta$ and start time), and compute $R_e(t)$ .
Programming.
- Animate the family of SIR phase portraits as $R_0$ slides from 0.5 to 5.0.
- Build a stochastic SIR (Gillespie algorithm) and compare its mean trajectory with the deterministic ODE for population sizes $N = 10^2,\ 10^4,\ 10^6$ . When does the deterministic limit kick in?
- Implement an age-structured SIR with two age classes (children, adults) and a 2x2 contact matrix. Compute the next-generation matrix and find $R_0$ as its dominant eigenvalue.
References#
- Kermack & McKendrick, “A contribution to the mathematical theory of epidemics,” Proc. Roy. Soc. A 115 (1927)
- Anderson & May, Infectious Diseases of Humans, Oxford University Press (1991)
- Diekmann, Heesterbeek & Britton, Mathematical Tools for Understanding Infectious Disease Dynamics, Princeton (2013)
- Keeling & Rohani, Modeling Infectious Diseases in Humans and Animals, Princeton (2008)
- Brauer, Castillo-Chavez & Feng, Mathematical Models in Epidemiology, Springer (2019)
- Ferguson et al., “Impact of non-pharmaceutical interventions to reduce COVID-19 mortality,” Imperial College Report 9 (2020)
- Pastor-Satorras & Vespignani, “Epidemic spreading in scale-free networks,” Phys. Rev. Lett. 86 (2001)
ODE Foundations 18 parts
- 01 Ordinary Differential Equations (1): Origins and Intuition
- 02 Ordinary Differential Equations (2): First-Order Methods
- 03 Ordinary Differential Equations (3): Higher-Order Linear Theory
- 04 Ordinary Differential Equations (4): The Laplace Transform
- 05 Ordinary Differential Equations (5): Power Series and Special Functions
- 06 Ordinary Differential Equations (6): Linear Systems and the Matrix Exponential
- 07 Ordinary Differential Equations (7): Stability Theory
- 08 Ordinary Differential Equations (8): Nonlinear Systems and Phase Portraits
- 09 Ordinary Differential Equations (9): Chaos Theory and the Lorenz System
- 10 Ordinary Differential Equations (10): Bifurcation Theory
- 11 Ordinary Differential Equations (11): Numerical Methods
- 12 Ordinary Differential Equations (12): Boundary Value Problems
- 13 Ordinary Differential Equations (13): Introduction to Partial Differential Equations
- 14 Ordinary Differential Equations (14): Epidemic Models and Epidemiology you are here
- 15 Ordinary Differential Equations (15): Population Dynamics
- 16 Ordinary Differential Equations (16): Fundamentals of Control Theory
- 17 Ordinary Differential Equations (17): Physics and Engineering Applications
- 18 Ordinary Differential Equations (18): Frontiers and Series Finale