This assumes positive disease counts are stratified by a population grouping, e.g. geography or age, and that estimates of the size of each population group are available for the same time period. Normalising by population size makes incidence comparable between groups of different sizes.
Usage
normalise_count(
  raw = i_incidence_data,
  pop = i_population_data,
  ...,
  population_unit = 1e+05,
  normalise_time = FALSE
)
Arguments
- raw
The count data
A dataframe containing the following columns:
count (positive_integer) - Positive case counts associated with the specified time frame
time (ggoutbreak::time_period + group_unique) - A (usually complete) set of singular observations per unit time as a `time_period`
Any grouping allowed.
- pop
The population data must be grouped in the same way as `raw`.
A dataframe containing the following columns:
population (positive_integer) - Size of population
Any grouping allowed.
- ...
not used
- population_unit
The population unit that the count data are normalised to, e.g. per 100K (the default, 1e+05)
- normalise_time
The default behaviour for normalising is to keep it in the same time units as the input data. If this parameter is set to `TRUE` the incidence rates are calculated per year. If given as a lubridate period string, e.g. "1 week", then the incidence is calculated over that time period.
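One plausible reading of normalise_time is as a change of time unit applied to the rate. The base R sketch below illustrates that arithmetic only and is not the package's implementation: `rescale_rate()` is a hypothetical helper, time units are expressed as numbers of days, and treating a year as 365.25 days is an assumption rather than documented behaviour.

```r
# Hypothetical helper: convert a rate expressed per `from_days` days
# into a rate per `to_days` days (assumes the rate is constant over time).
rescale_rate <- function(rate, from_days, to_days) {
  rate * to_days / from_days
}

# Cases per 100K per day (a rounded value, for illustration only).
daily_rate <- 0.78

# Per-week and per-year equivalents of the same rate.
weekly_rate <- rescale_rate(daily_rate, from_days = 1, to_days = 7)
yearly_rate <- rescale_rate(daily_rate, from_days = 1, to_days = 365.25) # assumed year length

print(c(daily = daily_rate, weekly = weekly_rate, yearly = yearly_rate))
```

The population unit is unaffected by this step; only the time period the rate refers to changes.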
Value
A dataframe with incidence rates per population_unit people, per time_unit.
A dataframe containing the following columns:
population (positive_integer) - Size of population
count (positive_integer) - Positive case counts associated with the specified time frame
time (ggoutbreak::time_period + group_unique) - A (usually complete) set of singular observations per unit time as a `time_period`
population_unit (double) - The population unit on which the per capita incidence rate is calculated
time_unit (lubridate::as.period) - The time period over which the per capita incidence rate is calculated
Any grouping allowed.
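As a rough illustration of what the returned rate represents, the base R sketch below joins counts to populations on a shared grouping column and scales the result to the population unit. It is not the package's implementation: the data frames `raw` and `pop` and the helper `per_capita_rate()` are hypothetical, and the formula count / population * population_unit is inferred from the Examples output, where the result appears as a `count.per_capita` column.

```r
# Hypothetical helper: cases per `population_unit` people.
per_capita_rate <- function(count, population, population_unit = 1e5) {
  count / population * population_unit
}

# Toy inputs using the first three age classes shown in the Examples output.
raw <- data.frame(
  class = c("00_04", "05_09", "10_14"),
  count = c(24L, 8L, 8L)
)
pop <- data.frame(
  class = c("00_04", "05_09", "10_14"),
  population = c(3077000L, 3348600L, 3413100L)
)

# Join on the shared grouping column, then normalise the counts.
joined <- merge(raw, pop, by = "class")
joined$count.per_capita <- per_capita_rate(joined$count, joined$population)
joined$population_unit <- 1e5

print(joined)
```

Because both inputs share the `class` column, each count is matched to the population of its own group, which is why `pop` needs the same grouping as `raw`.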
Examples
library(dplyr) # provides the %>% pipe used below
tmp = ggoutbreak::england_covid %>%
  ggoutbreak::normalise_count(ggoutbreak::england_demographics) %>%
  dplyr::glimpse()
#> Rows: 26,790
#> Columns: 9
#> Groups: class [19]
#> $ date             <date> 2023-12-09, 2023-12-09, 2023-12-09, 2023-12-09, 2023…
#> $ class            <fct> 00_04, 05_09, 10_14, 15_19, 20_24, 25_29, 30_34, 35_3…
#> $ count            <int> 24, 8, 8, 4, 21, 20, 29, 36, 41, 59, 53, 54, 56, 54, …
#> $ denom            <dbl> 771, 771, 771, 771, 771, 771, 771, 771, 771, 771, 771…
#> $ time             <time_prd> 1409, 1409, 1409, 1409, 1409, 1409, 1409, 1409, …
#> $ population       <int> 3077000, 3348600, 3413100, 3218900, 3414400, 3715400,…
#> $ count.per_capita <dbl> 0.77998050, 0.23890581, 0.23439102, 0.12426605, 0.615…
#> $ population_unit  <dbl> 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+0…
#> $ time_unit        <Period> 1d 0H 0M 0S, 1d 0H 0M 0S, 1d 0H 0M 0S, 1d 0H 0M 0S…