Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add tutorial on disease burden #76

Merged
merged 23 commits into from
Feb 3, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,7 @@ episodes:
- modelling-interventions.Rmd
- compare-interventions.Rmd
- vaccine-comparisons.Rmd

- disease-burden.Rmd

# Information for Learners
learners:
Expand Down
306 changes: 306 additions & 0 deletions episodes/disease-burden.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,306 @@
---
title: 'Modelling disease burden'
teaching: 40 # teaching time in minutes
exercises: 10 # exercise time in minutes
---

:::::::::::::::::::::::::::::::::::::: questions

- How can we model disease burden and healthcare demand?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Understand when mathematical models of transmission can be separated from models of burden
- Generate estimates of disease burden and healthcare demand using a burden model

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: prereq

+ Complete tutorial on [Simulating transmission](../episodes/simulating-transmission.md)

:::::::::::::::::::::::::::::::::


## Introduction

In previous tutorials we have used mathematical models to generate trajectories of infections, but we may also be interested in measures of disease. That could be measures of health in the population such as mild versus severe infections, or measures important to contingency planning of services being overwhelmed such as hospitalisations.

In the case of hospitalisations, mathematical models comprised of ODEs may include a compartment for individuals in hospital (see for example the [Vacamole model](../episodes/compare-interventions.md#vacamole-model)). For diseases like Ebola, hospitalised cases contribute to new infections and therefore it is necessary to keep track of hospitalised individuals so their contribution to infection can be modelled. When hospitalised cases contribute little to transmission (i.e. if most transmission happens early in the infection and severe illness typically occurs later on) we can separate out models of transmission from models of burden. In other wrods, we can first run an epidemic model to simulate infections, then use these values to simulate disease burden as a follow-on analysis.

In this tutorial we will translate new infections generated from a transmission model to hospital admissions and number of hospitalised cases over time. We will use the in `{epidemics}` package to simulate disease trajectories, access to social contact data with `{socialmixr}` and `{tidyverse}` for data manipulation and plotting.

```{r, warning = FALSE, message = FALSE}
library(epiparameter)
library(epidemics)
library(socialmixr)
library(tidyverse)
```


:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor

Inline instructor notes can help inform instructors of timing challenges
associated with the lessons. They appear in the "Instructor View".


::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::


## A burden model

We will extend the influenza example in the tutorial [Simulating transmission](../episodes/simulating-transmission.md) to calculate hospitalisations over time. In this example our **transmission model** is an SEIR model comprised of ODEs. Our **burden model** will convert new infections to hospitalisations over time by summing up all the new hospitalisations we expect to occur on each day according to the delay from infection to hospitalisation. We can do this summation using a calculation known as a convolution.

We will first obtain the new infections through time from our influenza example (see tutorial on [Simulating transmission](../episodes/simulating-transmission.md) for code to generate `output_plot`) using `new_infections()` in `{epidemics}`

```{r, echo = FALSE, message = FALSE}
# load contact and population data from socialmixr::polymod
polymod <- socialmixr::polymod
contact_data <- socialmixr::contact_matrix(
polymod,
countries = "United Kingdom",
age.limits = c(0, 20, 40),
symmetric = TRUE
)

# prepare contact matrix
contact_matrix <- t(contact_data$matrix)

# prepare the demography vector
demography_vector <- contact_data$demography$population
names(demography_vector) <- rownames(contact_matrix)

# initial conditions: one in every 1 million is infected
initial_i <- 1e-6
initial_conditions_inf <- c(
S = 1 - initial_i, E = 0, I = initial_i, R = 0, V = 0
)

initial_conditions_free <- c(
S = 1, E = 0, I = 0, R = 0, V = 0
)

# build for all age groups
initial_conditions <- rbind(
initial_conditions_inf,
initial_conditions_free,
initial_conditions_free
)
rownames(initial_conditions) <- rownames(contact_matrix)

# prepare the population to model as affected by the epidemic
uk_population <- epidemics::population(
name = "UK",
contact_matrix = contact_matrix,
demography_vector = demography_vector,
initial_conditions = initial_conditions
)

# time periods
preinfectious_period <- 3.0
infectious_period <- 7.0
basic_reproduction <- 1.46

# rates
infectiousness_rate <- 1.0 / preinfectious_period
recovery_rate <- 1.0 / infectious_period
transmission_rate <- basic_reproduction / infectious_period

# run an epidemic model using `epidemic()`
output_plot <- epidemics::model_default(
population = uk_population,
transmission_rate = transmission_rate,
infectiousness_rate = infectiousness_rate,
recovery_rate = recovery_rate,
time_end = 600, increment = 1.0
)

```

```{r}
new_cases <- new_infections(output_plot, by_group = FALSE)
head(new_cases)
```

To convert the new infections to hospitalisations we need to parameter distributions to describe the following processes :

+ the time from infection to admission to hospital,

+ the time from admission to discharge from hospital.

We will use the function `epiparameter()` from `{epiparameter}` package to define and store parameter distributions for these processes.

```{r, message = FALSE}
# define delay parameters
infection_to_admission <- epiparameter(
disease = "COVID-19",
epi_name = "infection to admission",
prob_distribution = create_prob_distribution(
prob_distribution = "gamma",
prob_distribution_params = c(shape = 5, scale = 4),
discretise = TRUE
)
)
```

To visualise this distribution we can create a density plot :

```{r}
x_range <- seq(0, 60, by = 0.1)
density_df <- data.frame(days = x_range,
density_admission = density(infection_to_admission,
x_range))

ggplot(data = density_df, aes(x = days, y = density_admission)) +
geom_line(linewidth = 1.2) +
theme_bw() +
labs(
x = "infection to admission (days)",
y = "pdf"
)

```


::::::::::::::::::::::::::::::::::::: challenge

## Define distribution for admission to discharge

Using the function `epiparameter()` define the distribution for admission to discharge as a Gamma distribution with shape = 10 and scale = 2 and plot the density of this distribution.

:::::::::::::::::::::::: solution

```{r, message = FALSE}
admission_to_discharge <- epiparameter(
disease = "COVID-19",
epi_name = "admission to discharge",
prob_distribution = create_prob_distribution(
prob_distribution = "gamma",
prob_distribution_params = c(shape = 10, scale = 2),
discretise = TRUE
)
)

x_range <- seq(0, 60, by = 0.1)
density_df <- data.frame(days = x_range,
density_discharge = density(admission_to_discharge,
x_range))


ggplot(data = density_df, aes(x = days, y = density_discharge)) +
geom_line(linewidth = 1.2) +
theme_bw() +
labs(
x = "admission to discharge (days)",
y = "pdf"
)

```

:::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::

To convert the new infections to number in hospital over time we will do the following:

1. Calculate the expected numbers of new infections that will be hospitalised using the infection hospitalisation ratio (IHR)
2. Calculate the estimated number of new hospitalisations at each point in time using the infection to admission distribution
3. Calculate the estimated number of discharge at each point in time using the admission to discharge distribution
4. Calculate the number in hospital at each point in time as the difference between the cumulative sum of hospital admissions and the cumulative sum of discharges so far.


#### 1. Calculate the expected numbers of new hospitalisations using the infection hospitalisation ratio (IHR)
```{r}
ihr <- 0.1 # infection-hospitalisation ratio

# calculate expected numbers with convolution:
hosp <- new_cases$new_infections * ihr
```

#### 2. Calculate the estimated number of new hospitalisations using the infection to admission distribution

To estimate the number of new hospitalisations we use a method called convolution.

::::::::::::::::::::::::::::::::::::: callout
### What is convolution?

If we want to know how people are admitted to hospital on day $t$, then we need to add up the number of people admitted on day $t$ but infected on day $t-1$, day $t-2$, day $t-3$ etc. We therefore need to sum over the distribution of delays from infection to admission. If $f_j$ is the probability an infected individual who will be hospitalised will be admitted to hospital $j$ days later, and $I_{t-j}$ is the number of individuals infected on day $t-j$, then the total admissions on day $t$ is equal to $\sum_j I_{t-j} f_j$. This type of rolling calculation is known as a convolution (see this [Wolfram article](https://mathworld.wolfram.com/Convolution.html) for some mathematical detail). There are different methods to calculate convolutions, but we will use the built in R function `convolve()`to perform the summation efficiently from the number of infections and the delay distribution.

::::::::::::::::::::::::::::::::::::::::::::::::

The function `convolve()` requires inputs of two vectors which will be convolved and `type`. Here we will specify `type = "open"`, this fills the vectors with 0s to ensure they are the same length.

The inputs to be convolved are the expected number of infections that will end up hospitalised (`hosp`) and the density values of the distribution of infection to admission times. We will calculate the density for the minimum possible value (0 days) up to the tail of the distribution (here defined as the 99.9th quantile, i.e. it is very unlikely any cases will be hospitalised after a delay this long).

Convolution requires one of the inputs to be reversed, in our case we will reverse the density distribution of infection to admission times. This is because if people infected earlier in time get admitted today, it means they've had a longer delay from infection to admission than someone infected more recently. In effect the infection to admission delay tells us how far we have to 'rewind' time to find previously new infections that are now showing up as hospital admissions.

```{r}
# define tail of the delay distribution
tail_value_admission <- quantile(infection_to_admission, 0.999)

hospitalisations <- convolve(hosp,
rev(density(infection_to_admission,
0:tail_value_admission)),
type = "open")[seq_along(hosp)]
```


#### 3. Calculate the estimated number of discharges

Using the same approach as above, we convolve the hospitalisations with the distribution of admission to discharge times to obtain the estimated number of new discharges.


```{r}
tail_value_discharge <- quantile(admission_to_discharge, 0.999)
discharges <- convolve(hospitalisations, rev(density(admission_to_discharge,
0:tail_value_discharge)),
type = "open")[seq_along(hospitalisations)]
```

#### 4. Calculate the number in hospital as the difference between the cumulative sumo of hospitalisation and the cumulative sum of discharges

We can use the R function `cumsum()` to calculate the cumulative number of hospitalisations and discharges. The difference between these two quantities gives is the current number of people in hospital.

```{r}
# calculate the current number in hospital
in_hospital <- cumsum(hospitalisations) - cumsum(discharges)
```

We create a data frame to plot our outcomes using `pivot_longer()`:
```{r}
# create data frame
hosp_df <- cbind(new_cases, in_hospital = in_hospital,
hospitalisations = hospitalisations)
# pivot longer for plotting
hosp_df_long <- pivot_longer(hosp_df, cols = new_infections:hospitalisations,
names_to = "outcome", values_to = "value")


ggplot(data = hosp_df_long) +
geom_line(
aes(
x = time,
y = value,
colour = outcome
),
linewidth = 1.2
) +
theme_bw() +
labs(
x = "Time (days)",
y = "Individuals"
)
```

## Summary

In this tutorial we estimated the number of people in hospital based on daily new infections from a transmission model. Other measures of disease burden can be calculated using outputs of transmission models, such as Disability-Adjusted Life-Years (DALYs). Calculating measures of burden from transmission models is one way we can utilise mathematical modelling in health economic analyses. As a follow-up, you might also be interested in [this how to guide](https://epiverse-trace.github.io/howto/analyses/reconstruct_transmission/estimate_infections.html), which shows how we can take reported hospitalisations or deaths over time and 'de-convolve' them to get an estimate of the original infection events.

::::::::::::::::::::::::::::::::::::: keypoints

- Transmission models should include disease burden when it is important for onward transmission
- Outputs of transmission models can be used as inputs to models of burden


::::::::::::::::::::::::::::::::::::::::::::::::
Loading
Loading