S, I, and R populations explained in Epidemiologic modeling


Growth models, sometimes also called “population models”, became very prominent in 2020, as the coronavirus epidemic raged all over the world. A population model estimates the future state of a population as fractions: how large a part of the initial population will be in a certain state (compartment) on a given date.

The models have their roots in classical differential equations. As mathematical modeling of epidemics is essentially about biological growth (and the transmission of an illness-causing pathogen), a differential equation was the natural choice as a basis.

This blog post first explains the background in differential equations, and then goes on to show how the SIR model uses them.

Why are differential equations used?

Differential equations describe the quantity of something in terms of its “steepness of change”. The equations contain derivatives: the slope of the target variable with respect to another variable, such as time. A differential equation thus predicts the amount of the target variable from rates of change. The parameters of the equation set how strongly each influence acts, and so guide the solution. Solving a differential equation numerically is about looping iteratively through the whole timeline, always applying the current change to the running value of the target variable.

For example, dropping a rock can be presented as a differential equation: the rock is pulled by gravity. Gravity is the external thing that doesn’t change, but it changes the rock’s velocity and, through it, the rock’s position. The rock goes toward the surface of Earth until it hits the surface and stays in place.
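The “looping through the timeline” idea can be sketched in a few lines of Python. This is a minimal illustration, not a production solver: the function name drop_rock, the 20 m drop height and the step size dt are all assumptions made up for the example, and air resistance is ignored.

```python
G = 9.81  # gravitational acceleration near Earth's surface, m/s^2

def drop_rock(height_m, dt=0.01):
    """Step a dropped rock forward in time until it reaches the ground.

    Returns (time_of_impact_s, speed_at_impact_m_per_s). No air resistance.
    """
    t, y, v = 0.0, height_m, 0.0   # time, height above ground, downward speed
    while y > 0.0:
        v += G * dt                # gravity changes the velocity ...
        y -= v * dt                # ... and the velocity changes the position
        t += dt
    return t, v

# Hypothetical 20 m drop, chosen only for illustration.
t_hit, v_hit = drop_rock(20.0)
print(f"impact after about {t_hit:.2f} s at about {v_hit:.1f} m/s")
```

A smaller dt brings the result closer to the exact formula, at the cost of more loop iterations.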

Differential and an integral – they make a pair

Differentiation is the operation that takes you from the target function to its rate of change. Example:

dv/dt = d f(t) / dt

Substitute the falling-body equation for f(t). Then take the derivative of each term in f(t), according to the rules of differentiation. You can learn the rules from any calculus tutorial, such as MIT 18.01 Single Variable Calculus (video).

Integrals work the other way round: they take you from the equation of change to the value of the target function. A round of integration practically makes your function more complex: it “peels” off a derivative and replaces it with a higher power of the variable.

If you have the rate of change function known, you can find out the values of the target function as a function of time, by taking the integral of the derivative function.

Differentiation and integration are complementary operations.


One minor note: calculating the integral also needs an initial quantity, C, to be chosen. The reason we need to tell the integral what C was is that the integral function cannot “know” how much of something had already accumulated at the start of the calculation. C is called the constant of integration; it is the one free parameter of the integral function.
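As a worked example of where C comes from, take a falling body with constant acceleration g and no air resistance. The symbols v_0 and y_0 (initial speed and initial position, both measured downward) are names introduced here only for illustration:

```latex
\frac{dv}{dt} = g
\quad\Longrightarrow\quad
v(t) = \int g \, dt = g\,t + C_1, \qquad C_1 = v(0) = v_0

\frac{dy}{dt} = v(t)
\quad\Longrightarrow\quad
y(t) = \int (g\,t + v_0) \, dt = \tfrac{1}{2} g\,t^2 + v_0\,t + C_2, \qquad C_2 = y(0) = y_0
```

Each round of integration raises the power of t by one and brings in one constant that only the starting conditions can fix.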

Drop the ball! Newtonian example

Let’s take a simple example. In physics, we know that gravity is a force, and all objects react to it. Gravity is a conservative, two-way force: when you drop a rock, the rock in fact also pulls Earth toward itself, but since Earth is vastly more massive than the rock, it seems as if only Earth pulls the rock towards the ground.

In everyday practical situations, it’s a planet that pulls all objects near it, towards the center of the planet. Space travel is different: after the initial escape from Earth’s gravity, space flights need to take into account other things (gravity assist).

The gravitational acceleration (called ‘g’, a number) doesn’t change much, since it is created by the mass of Earth. Near Earth’s surface g = 9.81 m/s². Thus gravity is in a practical sense quite constant: the objects we are talking about move at most some 30-40 kilometers above the surface of Earth, and over that range the value of g changes very little (by about one percent). In practice, we use g = 9.81 m/s² for calculations.

We could use a table to mechanically calculate the very first seconds:
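A minimal sketch of such a table in Python, assuming a body released from rest, constant g and no air resistance. The function name fall_table and the choice of five seconds are made up for this example:

```python
G = 9.81  # m/s^2, treated as constant

def fall_table(seconds=5):
    """Return rows (t, speed, distance) for a body released from rest."""
    rows = []
    for t in range(seconds + 1):
        speed = G * t                 # v = g*t
        distance = 0.5 * G * t ** 2   # s = 1/2*g*t^2
        rows.append((t, speed, distance))
    return rows

print(f"{'t (s)':>5} {'v (m/s)':>9} {'s (m)':>9}")
for t, speed, distance in fall_table():
    print(f"{t:>5} {speed:>9.2f} {distance:>9.2f}")
```

The speed column grows by the same 9.81 m/s every second, while the distance column grows faster and faster.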

But wait.. the simple gravity model doesn’t look credible?!

This model of gravity is one of the cornerstones of Newtonian physics. However, in the real world this model alone would imply that a rock dropped from a high altitude keeps gaining speed all the way down and hits the ground at a huge speed. That’s not true. By observation, we know that falling objects reach a limited speed, called ‘terminal velocity’.

Adjustment of rock-throw equation with two foes

The reason is simple: gravity is not the only force in the rock-throw. Since the rock is coming through Earth’s atmosphere, it is moving through matter, a fluid. So in reality there are also 2 forces counteracting gravity:

  • drag
  • buoyancy

Equilibrium of forces: acceleration hits zero

The terminal velocity (for a falling rock) is reached when the upward forces (drag and buoyancy) together equal the downward pull of gravity. The net sum of forces acting on the object is then zero => acceleration is zero, and thus velocity (whose rate of change is the acceleration) stays the same. All good.

Resistance is created by the object pushing away a mass of air. The object, as it is ‘sweeping’ directly downwards, pushes all the air molecules in its path away to the sides. The same thing happens in any other direction of motion:

  • a car driving forward on a road pushes away air molecules, as the car goes
  • there’s actually no difference between a falling object pushing air and a car pushing away air
  • the direction (up-down, or left-right) doesn’t “matter”
  • ..but whereas the “dropping object” keeps moving towards the center of Earth, a car eventually stops if the engine doesn’t provide a forward force to keep it going
  • thus a car needs energy from fuel, whereas a dropping object gets its “fuel” from the gravity of Earth
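The balance between gravity and air resistance can be sketched numerically. The example below assumes the common quadratic drag law (drag force = ½·ρ·Cd·A·v²) and leaves buoyancy out, since it is tiny for a rock in air. The mass, drag coefficient and cross-section area are illustrative guesses, not measurements:

```python
import math

G = 9.81          # m/s^2
RHO_AIR = 1.225   # kg/m^3, air density at sea level (assumed)

def terminal_velocity(mass, drag_coeff, area):
    """Speed at which quadratic drag equals weight: m*g = 1/2*rho*Cd*A*v^2."""
    return math.sqrt(2.0 * mass * G / (RHO_AIR * drag_coeff * area))

def fall_with_drag(mass, drag_coeff, area, duration=60.0, dt=0.001):
    """Euler-step the downward speed of a body released from rest."""
    k = 0.5 * RHO_AIR * drag_coeff * area   # drag force = k * v^2
    v = 0.0
    for _ in range(int(duration / dt)):
        accel = G - (k / mass) * v * v       # gravity minus drag
        v += accel * dt
    return v

# Illustrative values only: a 1 kg, roughly spherical rock.
MASS, CD, AREA = 1.0, 0.47, 0.008
v_term = terminal_velocity(MASS, CD, AREA)
v_sim = fall_with_drag(MASS, CD, AREA)
print(f"formula: {v_term:.1f} m/s, simulated after 60 s: {v_sim:.1f} m/s")
```

The simulated speed stops growing once drag has caught up with gravity, and settles at the value the force-balance formula gives directly.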

Back to SIR model – with new understanding

So, using the variables and binding the compartments together we can:

  • define a time-dependent function for each individual compartment’s value (amount of population likely to be in S, I or R at time t)
  • make the compartments work logically in conjunction (“connecting” the compartments)
  • keep the logic watertight – by requiring that the sum of population be constant.
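In the most common textbook form of SIR (the Kermack-McKendrick model), the three points above take the following shape. Only two parameters appear here, a transmission rate β and a recovery rate γ, with N as the total population. These are the conventional symbols and are an assumption on my part, not necessarily the parametrization used later in this blog series:

```latex
\frac{dS}{dt} = -\beta \frac{S I}{N}, \qquad
\frac{dI}{dt} = \beta \frac{S I}{N} - \gamma I, \qquad
\frac{dR}{dt} = \gamma I

\frac{dS}{dt} + \frac{dI}{dt} + \frac{dR}{dt} = 0
\quad\Longrightarrow\quad
S(t) + I(t) + R(t) = N \ \text{for all } t
```

Every term that leaves one compartment enters the next one with the opposite sign, which is exactly what keeps the total population constant.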

Why are SIR and SEIR useful modeling methods for epidemics?

The beauty and usefulness of SIR, SEIR and other variants of the model are:

  • by adjusting the free parameters early on in an epidemic, one can estimate the whole epidemic wave’s shape
  • different levels of infections can be estimated
  • epidemiologists, governments and the general public can be kept aware of different scenarios
  • as there are new facts learned from the field, the model parameters can be adjusted accordingly and new forecasts made instantly (this is why testing for the infection is also important; naturally the proper healthcare of an individual is put first)
  • the date when the epidemic starts to level off (slow growth in end stage of an epidemic) can be estimated

The SIR model can be used to estimate the peak level of infected individuals, that is, the largest fraction of the population that is infected at the same time. It can also estimate the fraction of the population that will eventually get the disease. Neither figure is necessarily 100%. Not even close. There are a few things that make them lower than 100% of the population:

  • the virus is too aggressive, limiting its own spread by killing the “I” (infected) persons too quickly
  • a vaccine comes out that, once given, immunizes “S” people so that they skip the “I” stage and go to “R” (being safe from infection)
  • the virus mutates to a less potent form in the population, affecting the parameters

4 parameters in SIR model

  • alpha
  • beta
  • gamma
  • mu

These parameters are explained better in the next part of this blog series, where a computer-based R language simulation is shown.

‘S’ compartment – susceptible (disease-free)


Epidemiologic models are a set of functions that draw the curves of various “populations” during an epidemic. Typically, in times of no epidemic, the population is considered healthy (regarding a particular pathogen). Pathogens that cause epidemics may also not have existed for long in human-transmissible form. The coronavirus epidemic of 2019 is a prime example: until the nCov virus initially jumped from animals to humans, sometime in late 2019, it apparently wasn’t registered as a dire threat to humankind. Sometimes pathogens can be widely distributed in the animal kingdom but be of no danger to humans, and vice versa.

However, in reality, unless vaccinated, the population indeed is in the “S” state: susceptible to getting infected. This is business as usual, a rather academic definition indeed. It basically means that normal people are susceptible to new pathogens, since we have not yet invented a global, universal vaccination against all possibly harmful viruses.

I – Infected

The next compartment, “I”, means the pathogen has invaded a person. The virus is present in the body in sufficient amounts that the person will typically soon start showing signs of illness.

SIR and others are also called compartmental models: the populations are the compartments. People still stay the same, essentially, but they get labeled (and counted in statistical models) differently according to their factual status of having or not having the illness (or, as in the case of “R”, having had it and passed into the Resolved population).

R – Resolved (cured or died)

The R population is usually considered incapable of infecting the S population; thus, once resolved, a person is immune to reinfection and does not infect others. In medicine this is, however, a case-by-case thing, again dependent on the real biological and systemic properties of the virus.


Epidemiological models (SIR) in nCov and Other Epidemics


An epidemic is an incident, in time, where typically a large proportion of a population gets ill.

The coronavirus pandemic that began in 2019 is causing an epidemic. The virus was originally detected sometime in December 2019, and we are still (March 20th 2020) in the midst of the rising tide of infection cases. The virus causing it is called specifically ‘Severe acute respiratory syndrome coronavirus 2’, or SARS-CoV-2 for short. Another alias for the exact same virus is nCov. The disease that results from this virus is called COVID-19.

The virus – cause of the epidemic


The biological root of an epidemic is called a pathogen. With the novel coronavirus pandemic, it’s a virus in the “corona family”.

There have been corona viruses in the wild before this 2019-2020 epidemic.

Viruses are small objects, lifeless per se, that carry either a DNA or an RNA code and can drift into and hijack a working cell’s production mechanism, so that the cell starts producing copies of the virus. Thus the normal functioning of the cells is interrupted and the virus population starts to grow.

The coronavirus (nCov) leading to the disease COVID-19 is an RNA virus. Thus the replication message it carries is in the form of ribonucleic acid. See Wikipedia: RNA virus.

As one virus can produce many new viruses, the growth curve of the number of viruses is exponential in shape. It’s similar to the chain reaction of nuclear fission, the mechanism of nuclear weapons. Many biological processes are exponential.

The growth often also has a natural limiting factor, so there is resistance. In human bodies, resistance may come in the form of the immune system fighting back the spread of the virus. A virus may also simply exhaust the host, or exhaust a critical resource that it needs to replicate, leading to either sustained levels of viral presence or a decay of the level.
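A minimal sketch contrasting unchecked exponential growth with growth that runs into a limit. The logistic equation used for the limited case is a standard textbook stand-in chosen here for illustration, not a model of any particular virus; the growth rate and the capacity are made-up numbers:

```python
def grow(n0, rate, steps, capacity=None, dt=0.1):
    """Euler-step dn/dt = rate*n, or rate*n*(1 - n/capacity) if a capacity is given."""
    n = n0
    history = [n]
    for _ in range(steps):
        limit = 1.0 if capacity is None else 1.0 - n / capacity
        n += rate * n * limit * dt
        history.append(n)
    return history

# Hypothetical numbers: start from 1 unit, 200 steps of 0.1 time units each.
unlimited = grow(n0=1.0, rate=1.0, steps=200)
limited = grow(n0=1.0, rate=1.0, steps=200, capacity=1000.0)
print(f"unlimited: {unlimited[-1]:.3g}, limited: {limited[-1]:.1f}")
```

Both curves start out almost identical; the limited one bends over and flattens as it approaches the capacity, which is the “sustained level” case described above.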

The pathogen causes the symptoms and the capability to transmit the disease to another person. The branch of medicine and science that deals with epidemics is called epidemiology.

Mathematical models for viral epidemics

There is a lot of mathematics that is useful in modeling these epidemics. Some of the maths is actually quite simple, and can perhaps be understood better with computer simulation.

There are a few “main ideas” in viral outbreak simulations:

  • differential equations (DE for short, or ‘ODE’ for ordinary differential equations)
  • agent-based simulation
  • AI models, such as using autoencoders [Wikipedia: autoencoder]

The simplest epidemic models choose variables that predict the numbers of people in various stages of the disease. People move (permanently) from one compartment towards the final compartment, which is ‘Recovered’. A recovered person means one who has either become immune (healthy), or died.

Everyone who gets infected thus eventually ends up in the Recovered state. Read naively, this suggests the epidemic goes through 100% of the people, so that for an individual the question wouldn’t be “whether I will get infected”, but “when will I get infected”. The SIR equations themselves do not require that, though: the outbreak can fade out while part of the population is still Susceptible.

In real life, there are actually only a few things that can potentially prevent an infection from ever happening. One of those is that a vaccine is found during the epidemic. This would “freeze” the situation (the number of people allocated to each compartment), given that nations have the funds to provide vaccination and that everyone is willing to get vaccinated.

Thus an epidemic has a few interesting elements to it:

  • properties of the virus
  • sociology of a population, among which the virus is spreading
  • remedies available to stop the virus spreading
  • effectiveness of communicating the correct information and situational awareness to target population
  • availability and cost of the cure, if a person has gotten Infected

One of the most famous models, a set of differential equations, is called the SIR model. SIR is a “compartmental model”: it places each person into exactly one compartment at any given time. In SIR, for example, people can be:

  • Susceptible
  • Infected
  • Resolved

Actual, recognizable individuals (single people) are not “tracked” in these models; rather, the numbers of people in each compartment are calculated as a function of time. So the model itself doesn’t identify individuals who are infected, but when it is used it is fed with real numbers. The statistics of infected (tested) people give epidemiologists, citizens and any stakeholders a lot of important information during the management and containment of an epidemic.

Population models produce numeric results that can be plotted as curves.

Contained population: sum S+I+R

There’s one particular limitation set in the SIR model, by design: the sum of the compartmentalized populations is constant, and equal to the initial population of the study:

  • in SIR model, summing S+I+R is always constant => equal to the initial population
  • thus in SIR model, births are not allowed
  • “R” includes both cured (immune) and deaths

These models were largely formulated in 1927.

Recipe for using SIR epidemiologic model

  • initialize all 3 compartments to values (populations)
  • define 4 parameters for the differential equations
  • there will be 3 differential equations, one for each population
  • in SIR, the populations are S=susceptible (healthy), I = infected, R = recovered
  • run an ODE solver algorithm, usually provided as part of your programming language of choice
  • for example, the R language has the “deSolve” library with its ode() function
  • for the R language, there are also ready-made code libraries for the particular SIR model; for example, one called EpiDynamics
  • ode() or the appropriate modeling function returns as result the values of each function (corresponding to one function per compartment)
  • you can plot the functions, all on a same diagram (axes t for time, and autoscaling Y axis as per quantity) to get an overall image of how the epidemic turns out
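The recipe above can be sketched end to end. To keep the example self-contained it is written in plain Python with a small hand-written fixed-step Runge-Kutta (RK4) solver standing in for a library routine such as ode() from deSolve. It also uses the textbook two-parameter form of SIR (a transmission rate beta and a recovery rate gamma) and not the four parameters mentioned in the recipe. All numbers are illustrative assumptions, not fitted to any real epidemic:

```python
def sir_derivatives(state, beta, gamma):
    """Right-hand side of the SIR equations; state = (S, I, R) as head counts."""
    s, i, r = state
    n = s + i + r
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return (-new_infections, new_infections - recoveries, recoveries)

def rk4_step(state, dt, beta, gamma):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * k for b, k in zip(base, slope))
    k1 = sir_derivatives(state, beta, gamma)
    k2 = sir_derivatives(shifted(state, k1, dt / 2), beta, gamma)
    k3 = sir_derivatives(shifted(state, k2, dt / 2), beta, gamma)
    k4 = sir_derivatives(shifted(state, k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Return a list of (t, S, I, R) rows from day 0 to `days`."""
    state = (float(s0), float(i0), float(r0))
    rows = [(0.0, *state)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        state = rk4_step(state, dt, beta, gamma)
        rows.append((step * dt, *state))
    return rows

# Illustrative numbers only: 1000 people, one initial case.
rows = simulate_sir(s0=999, i0=1, r0=0, beta=0.3, gamma=0.1, days=160)
peak_t, _, peak_i, _ = max(rows, key=lambda row: row[2])
final_s = rows[-1][1]
print(f"peak: {peak_i:.0f} infected around day {peak_t:.0f}")
print(f"never infected by day 160: {final_s:.0f} of 1000")
```

The rows can be handed to any plotting tool to draw the three curves on one diagram. Note that with these made-up parameters a part of the population is still in S when the outbreak has faded.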

Some suggested reading

Btw. Getting R for experimenting with math modeling – it’s a snap! It took me less than 15 minutes to install both free RStudio and the underlying R programming environment. Definitely recommended!