This assumes positive disease counts are stratified by a population grouping, e.g. geography or age, and that estimates of the size of each group's population during the relevant time period are available. Normalising by population size makes rates comparable between groups of different sizes.

Usage

normalise_count(
  raw = i_incidence_data,
  pop = i_population_data,
  ...,
  population_unit = 1e+05,
  normalise_time = FALSE
)

Arguments

raw

The count data - a dataframe with columns:

  • count (positive_integer) - Positive case counts associated with the specified time frame

  • time (ggoutbreak::time_period + group_unique) - A (usually complete) set of singular observations per unit time as a time_period

Any grouping allowed.

pop

The population data, which must be grouped in the same way as raw - a dataframe with columns:

  • population (positive_integer) - Size of population

Any grouping allowed.

...

not used

population_unit

The population unit to which the count data are normalised, e.g. 1e+05 (the default) gives rates per 100K population.
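The arithmetic behind population_unit can be sketched in base R. This is a minimal illustration and not the package's implementation: the two small data frames are hypothetical stand-ins built from the first two rows of the example output on this page, and the formula count / population * population_unit is inferred from that output.

```r
# Hypothetical stand-in data (first two age classes from the example output)
counts <- data.frame(
  class = c("00_04", "05_09"),
  count = c(24L, 8L)
)
pops <- data.frame(
  class = c("00_04", "05_09"),
  population = c(3077000L, 3348600L)
)

# Rates are expressed per this many people (100K, matching the default)
population_unit <- 1e5

# Match each count to the population of the same group
joined <- merge(counts, pops, by = "class")

# Assumed formula: cases per person, scaled up to the population unit
joined$count.per_capita <- joined$count / joined$population * population_unit

print(joined)
```

With these inputs the computed rates agree with the count.per_capita values shown for the same two age classes in the example output, which is why the formula is assumed to hold.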

normalise_time

The default behaviour for normalising is to keep it in the same time units as the input data. If this parameter is set to TRUE the incidence rates are calculated per year. If given as a lubridate period string e.g. "1 week" then the incidence is calculated over that time period.
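To make the effect of normalise_time concrete, here is a rough base R sketch of rescaling a rate from one time unit to another. The helper rescale_rate is hypothetical and not part of ggoutbreak, and treating a year as 365.25 days is an assumption; the package may handle period lengths differently.

```r
# Hypothetical helper: convert a rate observed per `from_days` days
# into the equivalent rate per `to_days` days by simple proportion
rescale_rate <- function(rate, from_days, to_days) {
  rate * to_days / from_days
}

# A daily rate per 100K, taken from the example output on this page
daily <- 0.77998050

# Equivalent of asking for a "1 week" period
weekly <- rescale_rate(daily, from_days = 1, to_days = 7)

# Equivalent of a per-year rate, assuming a 365.25 day year
yearly <- rescale_rate(daily, from_days = 1, to_days = 365.25)

print(c(daily = daily, weekly = weekly, yearly = yearly))
```

The proportional scaling only makes sense when the input rate is roughly constant over the longer period, so a per-year figure derived from a single day's count is best read as an annualised rate rather than a forecast.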

Value

A dataframe of incidence rates expressed per population_unit, containing the following columns:

  • population (positive_integer) - Size of population

  • count (positive_integer) - Positive case counts associated with the specified time frame

  • time (ggoutbreak::time_period + group_unique) - A (usually complete) set of singular observations per unit time as a time_period

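Any grouping allowed. As the example output below shows, the result also carries count.per_capita, population_unit and time_unit columns.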

Examples

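# Join the age-stratified counts to the matching population estimates and
# normalise using the defaults (per 100K population, original time unit)
tmp = ggoutbreak::england_covid %>%
  ggoutbreak::normalise_count(ggoutbreak::england_demographics) %>%
  dplyr::glimpse()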
#> Rows: 26,790
#> Columns: 9
#> Groups: class [19]
#> $ date             <date> 2023-12-09, 2023-12-09, 2023-12-09, 2023-12-09, 2023…
#> $ class            <fct> 00_04, 05_09, 10_14, 15_19, 20_24, 25_29, 30_34, 35_3…
#> $ count            <int> 24, 8, 8, 4, 21, 20, 29, 36, 41, 59, 53, 54, 56, 54, 
#> $ denom            <dbl> 771, 771, 771, 771, 771, 771, 771, 771, 771, 771, 771…
#> $ time             <time_prd> 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, 
#> $ population       <int> 3077000, 3348600, 3413100, 3218900, 3414400, 3715400,
#> $ count.per_capita <dbl> 0.77998050, 0.23890581, 0.23439102, 0.12426605, 0.615…
#> $ population_unit  <dbl> 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+0…
#> $ time_unit        <Period> 1d 0H 0M 0S, 1d 0H 0M 0S, 1d 0H 0M 0S, 1d 0H 0M 0S…