Delayed GAM reporting model function generator
Usage
gam_delayed_reporting(
  window,
  max_delay = 40,
  ...,
  knots_fn = ~gam_knots(.x, window, ...)
)
Arguments
- window
controls the knot spacing in the GAM (if the default knots_fn is used)
- max_delay
the maximum delay we expect to model
- ...
Named arguments passed on to gam_knots:
data
the function will be called with incidence data - a dataframe with columns:
count (positive_integer) - Positive case counts associated with the specified time frame
time (ggoutbreak::time_period + group_unique) - A (usually complete) set of singular observations per unit time as a `time_period`
Any grouping allowed.
k
alternative to window; if k is given then the behaviour of the knots will be similar to the default mgcv::s(..., k = ...) parameter.
...
currently not used
- knots_fn
a function that takes the data as an input and returns a set of integers as time points for GAM knots, for the s(time) term. The default here provides a roughly equally spaced grid determined by window, but a user supplied function could do anything. The input to this function is the raw dataframe of data that will be considered for one model fit. It is guaranteed to have at least a time and a count column.
column. It is possible to
Value
a list with 2 entries, model_fn and predict, suitable as the input for poisson_gam_model(model_fn = ..., predict = ...).
Details
This function is used to configure a delayed reporting GAM model. The model is of the form:
count ~ s(time, bs = "cr", k = length(kts)) + s(log(tau), k = 4, pc = max_delay)
where tau is the difference between the time at which the time series was observed and the time of the data point within that series, and we have multiple observations of the same time series. This function helps specify the knots of the GAM and the maximum expected delay.
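To make the definition of tau concrete, the following sketch computes it for a toy data set in which the same three time points are observed on two later occasions. The data frame and its values are invented for illustration; only the column names time, count and obs_time follow this page, and the package may derive tau differently internally.

```r
# Hypothetical toy data: times 1..3 reported at observation times 4 and 6.
# Counts for recent time points grow as delayed reports arrive.
obs <- data.frame(
  time     = c(1, 2, 3, 1, 2, 3),
  obs_time = c(4, 4, 4, 6, 6, 6),
  count    = c(10, 8, 3, 10, 9, 7)
)

# tau: delay between the data point and the moment the series was observed.
obs$tau <- obs$obs_time - obs$time
print(obs)
```

Each time point then appears once per observation, with a different tau, which is what allows the s(log(tau)) term to capture the effect of reporting delay separately from the s(time) trend.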
Examples
data = test_delayed_observation %>% dplyr::group_by(obs_time)
cfg = gam_delayed_reporting(14,40)
fit = cfg$model_fn(data)
summary(fit)
#>
#> Family: Negative Binomial(331.222)
#> Link function: log
#>
#> Formula:
#> count ~ s(time, bs = "cr", k = length(kts)) + s(log(tau), k = 4,
#> pc = max_delay)
#>
#> Parametric coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 2.76943 0.01968 140.7 <2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Approximate significance of smooth terms:
#> edf Ref.df Chi.sq p-value
#> s(time) 6.99 7 222105 <2e-16 ***
#> s(log(tau)) 3.00 3 31119 <2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> R-sq.(adj) = 0.993 Deviance explained = 99.7%
#> -REML = 8726.6 Scale est. = 1 n = 3240