
This assumes that positive disease counts are stratified by a population grouping, e.g. geography or age, and that estimates of the size of each group's population are available for the period the counts cover. Normalising by population size makes the groups directly comparable.

Usage

normalise_count(
  raw = i_incidence_data,
  pop = i_population_data,
  ...,
  population_unit = 1e+05,
  normalise_time = FALSE
)

Arguments

raw

The count data

A dataframe containing the following columns:

  • count (positive_integer) - Positive case counts associated with the specified time frame

  • time (ggoutbreak::time_period + group_unique) - A (usually complete) set of singular observations per unit time as a time_period

Any grouping allowed.

pop

The population data must be grouped in the same way as raw.

A dataframe containing the following columns:

  • population (positive_integer) - Size of population

Any grouping allowed.

...

Not used.

population_unit

The population unit to which the count data are normalised, e.g. 1e+05 (the default) gives rates per 100K people.
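As a minimal sketch of the arithmetic this argument controls, assume the rate is count / population * population_unit. That assumption is consistent with the count.per_capita values in the example output on this page, but it is not taken from the package source. The helper name per_capita_rate is hypothetical and is not part of ggoutbreak.

```r
# Hypothetical helper, NOT part of ggoutbreak: illustrates the assumed formula
# rate = count / population * population_unit
per_capita_rate = function(count, population, population_unit = 1e5) {
  count / population * population_unit
}

# Counts and populations taken from the first two rows of the example output
rates = per_capita_rate(
  count = c(24, 8),
  population = c(3077000, 3348600)
)
print(rates)
```

With the default unit, the result reads as cases per 100K people. Passing population_unit = 1 would give a plain per-person rate instead.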

normalise_time

By default (FALSE) the incidence rate is kept in the same time units as the input data. If this parameter is set to TRUE, the incidence rates are calculated per year. If it is given as a lubridate period string, e.g. "1 week", the incidence is calculated over that time period.
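The effect of changing the time unit can be sketched in base R, assuming rates scale linearly with the length of the period. How ggoutbreak performs this conversion internally, including the year length it uses when normalise_time = TRUE, is not stated here, so the hypothetical helper rescale_rate below only illustrates the idea.

```r
# Hypothetical helper, NOT part of ggoutbreak: rescales a rate from one
# time unit to another, assuming the rate scales linearly with period length
rescale_rate = function(rate, from_days, to_days) {
  rate * to_days / from_days
}

daily_rate = 0.78  # illustrative value: cases per 100K per day
weekly_rate = rescale_rate(daily_rate, from_days = 1, to_days = 7)
print(weekly_rate)
```

In the package itself the equivalent request would be made by passing normalise_time = "1 week" to normalise_count().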

Value

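A dataframe with per capita incidence rates, expressed per population_unit people. The example output on this page shows the rate itself in a count.per_capita column. The dataframe contains the following columns: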

  • population (positive_integer) - Size of population

  • count (positive_integer) - Positive case counts associated with the specified time frame

  • time (ggoutbreak::time_period + group_unique) - A (usually complete) set of singular observations per unit time as a time_period

  • population_unit (double) - The population unit on which the per capita incidence rate is calculated

  • time_unit (lubridate::as.period) - The time period over which the per capita incidence rate is calculated

Any grouping allowed.

Examples

tmp = ggoutbreak::england_covid %>%
  ggoutbreak::normalise_count(ggoutbreak::england_demographics) %>%
  dplyr::glimpse()
#> Rows: 26,790
#> Columns: 9
#> Groups: class [19]
#> $ date             <date> 2023-12-09, 2023-12-09, 2023-12-09, 2023-12-09, 2023…
#> $ class            <fct> 00_04, 05_09, 10_14, 15_19, 20_24, 25_29, 30_34, 35_3…
#> $ count            <int> 24, 8, 8, 4, 21, 20, 29, 36, 41, 59, 53, 54, 56, 54, …
#> $ denom            <dbl> 771, 771, 771, 771, 771, 771, 771, 771, 771, 771, 771…
#> $ time             <time_prd> 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, …
#> $ population       <int> 3077000, 3348600, 3413100, 3218900, 3414400, 3715400,…
#> $ count.per_capita <dbl> 0.77998050, 0.23890581, 0.23439102, 0.12426605, 0.615…
#> $ population_unit  <dbl> 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+0…
#> $ time_unit        <Period> 1d 0H 0M 0S, 1d 0H 0M 0S, 1d 0H 0M 0S, 1d 0H 0M 0S…