The vignette("quickstart")
and
vignette("calibration")
articles work with simulated data,
but it is more interesting to fit models to real data. The International
Infectious Disease Data Archive (IIDDA) is a useful place to look
for incidence, mortality, and population data for illustrating
macpan2
. This archive contains data from this data digitization
project, which we will use here. What we find is that with real
data, even with relatively simple models and calibration problems,
issues can arise that require careful thought.
Setup
As usual we need the following packages. If you don’t have them,
please get them in the usual way (i.e. using
install.packages
).
In addition, we need the iidda.api
package.
if (!require(iidda.api)) {
repos = c(
"https://canmod.r-universe.dev"
, "https://cran.r-project.org"
)
install.packages("iidda.api", repos = repos)
}
api_hook = iidda.api::ops_staging
And we set a few options for convenience.
options(iidda_api_msgs = FALSE, macpan2_verbose = FALSE)
One Scarlet Fever Outbreak in Ontario
The following code will pull data for a single scarlet fever outbreak in Ontario that ended in 1930.
scarlet_fever_ontario = api_hook$filter(resource_type = "CANMOD CDI"
, iso_3166_2 = "CA-ON" ## get ontario data
, time_scale = "wk" ## weekly incidence data only
, disease = "scarlet-fever"
# get data between 1929-08-01 and 1930-10-01
, period_end_date = "1929-08-01..1930-10-01"
)
print(scarlet_fever_ontario)
#> # A tibble: 61 × 19
#> iso_3166 iso_3166_2 location location_type period_start_date period_end_date
#> <chr> <chr> <chr> <chr> <date> <date>
#> 1 CA CA-ON ONT. province 1929-07-28 1929-08-03
#> 2 CA CA-ON ONT. province 1929-08-04 1929-08-10
#> 3 CA CA-ON ONT. province 1929-08-11 1929-08-17
#> 4 CA CA-ON ONT. province 1929-08-18 1929-08-24
#> 5 CA CA-ON ONT. province 1929-08-25 1929-08-31
#> 6 CA CA-ON ONT. province 1929-09-01 1929-09-07
#> 7 CA CA-ON ONT. province 1929-09-08 1929-09-14
#> 8 CA CA-ON ONT. province 1929-09-15 1929-09-21
#> 9 CA CA-ON ONT. province 1929-09-22 1929-09-28
#> 10 CA CA-ON ONT. province 1929-09-29 1929-10-05
#> # ℹ 51 more rows
#> # ℹ 13 more variables: period_mid_date <date>, days_this_period <dbl>,
#> # time_scale <chr>, disease <chr>, nesting_disease <chr>,
#> # historical_disease_family <chr>, historical_disease <chr>,
#> # historical_disease_subclass <chr>, dataset_id <chr>,
#> # original_dataset_id <chr>, cases_this_period <dbl>, population <dbl>,
#> # population_reporting <dbl>
This is what the data looks like.
SIR Model
We will begin by pulling the simple sir model from the model library.
sir = mp_tmb_library("starter_models", "sir", package = "macpan2")
print(sir)
#> ---------------------
#> Default values:
#> ---------------------
#> matrix row col value
#> beta 0.2
#> gamma 0.1
#> N 100.0
#> I 1.0
#> R 0.0
#>
#> ---------------------
#> Before the simulation loop (t = 0):
#> ---------------------
#> 1: S ~ N - I - R
#>
#> ---------------------
#> At every iteration of the simulation loop (t = 1 to T):
#> ---------------------
#> 1: mp_per_capita_flow(from = "S", to = "I", rate = infection ~ I *
#> beta/N)
#> 2: mp_per_capita_flow(from = "I", to = "R", rate = recovery ~ gamma)
Modify Model for Reality
Once we take any library model off the shelf we need to modify it a
bit to be more realistic. We do this with the
mp_tmb_insert()
function, which inserts new default values
and model expressions.
We first change the population size to the population of Ontario at the time.
sf_sir = mp_tmb_insert(sir
, default = list(N = median(scarlet_fever_ontario$population))
)
Second, our SIR model assumes that all of the cases are recorded in
the data, but this is not realistic. We therefore modify the model to
include under-reporting, by creating a case reports
variable that is given by the product of incidence (i.e. weekly
infection
rate) and a reporting probability.
sf_sir = mp_tmb_insert(sf_sir
## insert this expression ...
, expressions = list(reports ~ infection * report_prob)
## at the end (i.e. Infinity) of the expressions evaluated
## 'during' each iteration of the simulation loop ...
, at = Inf
, phase = "during"
## add a new default value for the reporting probability
, default = list(report_prob = 1/300)
)
Here we can print out the modified model to see if our changes were made successfully.
print(sf_sir)
#> ---------------------
#> Default values:
#> ---------------------
#> matrix row col value
#> beta 2.000000e-01
#> gamma 1.000000e-01
#> N 3.392636e+06
#> I 1.000000e+00
#> R 0.000000e+00
#> report_prob 3.333333e-03
#>
#> ---------------------
#> Before the simulation loop (t = 0):
#> ---------------------
#> 1: S ~ N - I - R
#>
#> ---------------------
#> At every iteration of the simulation loop (t = 1 to T):
#> ---------------------
#> 1: mp_per_capita_flow(from = "S", to = "I", rate = infection ~ I *
#> beta/N)
#> 2: mp_per_capita_flow(from = "I", to = "R", rate = recovery ~ gamma)
#> 3: reports ~ infection * report_prob
Preparing the Data
The first step in preparing data for macpan2
is to
simulate from the model that you are considering. Here we simulate the
"reports"
variable because it corresponds to the reported
incidence (the number of new cases per time-step, which in our case is
one week, multiplied by a reporting probability).
sir_simulator = mp_simulator(sf_sir
, time_steps = 5
, outputs = "reports"
)
head(mp_trajectory(sir_simulator))
#> matrix time row col value
#> 1 reports 1 0 0 0.0006666665
#> 2 reports 2 0 0 0.0007333330
#> 3 reports 3 0 0 0.0008066662
#> 4 reports 4 0 0 0.0008873327
#> 5 reports 5 0 0 0.0009760658
The next step is to get our data into a format that is compatible
with the format of these simulations. In particular, we need
matrix
, time
, and value
columns.
We can omit the row
and col
columns, because
all of the ‘matrices’ in the model are 1-by-1 scalars.
observed_data = (scarlet_fever_ontario
## select the variables to be modelled -- a time-series of case reports.
|> select(period_end_date, cases_this_period)
## change the column headings so that they match the columns
## in the simulated trajectories.
|> mutate(matrix = "reports")
|> rename(value = cases_this_period)
## create a `time` column with the time-step IDs that will correspond
## to the time-steps in the simulation. this column heading also
## must match the column with the time-steps in the simulated trajectories
|> mutate(time = seq_along(period_end_date))
)
print(head(observed_data))
#> # A tibble: 6 × 4
#> period_end_date value matrix time
#> <date> <dbl> <chr> <int>
#> 1 1929-08-03 23 reports 1
#> 2 1929-08-10 27 reports 2
#> 3 1929-08-17 47 reports 3
#> 4 1929-08-24 24 reports 4
#> 5 1929-08-31 24 reports 5
#> 6 1929-09-07 45 reports 6
Set up the Optimizer
Now we can create an object that can be calibrated.
sir_cal = mp_tmb_calibrator(
spec = sf_sir
, data = observed_data
## name the trajectory variable, with a name that
## is the same in both the spec and the data
, traj = "reports"
## fit the following parameters
, par = c("beta", "gamma", "I", "report_prob")
)
Here we assert that we will fit beta
,
gamma
, (the initial value of) I
, and
report_prob
. Note that we can only choose to fit parameters
in the default
list of the model spec. In particular,
I
is in the default list because the model requires the
initial number of infectious individuals, whereas S
is not
because it is derived before the simulation loop as
S ~ N - I - R
.
print(sf_sir)
#> ---------------------
#> Default values:
#> ---------------------
#> matrix row col value
#> beta 2.000000e-01
#> gamma 1.000000e-01
#> N 3.392636e+06
#> I 1.000000e+00
#> R 0.000000e+00
#> report_prob 3.333333e-03
#>
#> ---------------------
#> Before the simulation loop (t = 0):
#> ---------------------
#> 1: S ~ N - I - R
#>
#> ---------------------
#> At every iteration of the simulation loop (t = 1 to T):
#> ---------------------
#> 1: mp_per_capita_flow(from = "S", to = "I", rate = infection ~ I *
#> beta/N)
#> 2: mp_per_capita_flow(from = "I", to = "R", rate = recovery ~ gamma)
#> 3: reports ~ infection * report_prob
Run the Optimization
sir_opt = mp_optimize(sir_cal)
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
Not off to a good start. Those warnings are not necessarily bad, but they might get us thinking.
Examine the fit
print(sir_opt)
#> $par
#> params params params params
#> 29.0058616 28.8891098 2.6406276 0.2895818
#>
#> $objective
#> [1] 424.55
#>
#> $convergence
#> [1] 1
#>
#> $iterations
#> [1] 118
#>
#> $evaluations
#> function gradient
#> 200 119
#>
#> $message
#> [1] "function evaluation limit reached without convergence (9)"
OK, things are not great. The convergence
code is
1
, indicating that the model did not converge
(convergence == 0
is good). Examining the parameter
estimates, which are stored internally in the calibrator object, things
get worse.
mp_tmb_coef(sir_cal, conf.int = TRUE)
#> term mat row col default type estimate std.error
#> 1 params beta 0 0 0.200000000 fixed 29.0058616 17.616797
#> 2 params.1 gamma 0 0 0.100000000 fixed 28.8891098 17.616948
#> 3 params.2 I 0 0 1.000000000 fixed 2.6406276 3.204148
#> 4 params.3 report_prob 0 0 0.003333333 fixed 0.2895818 0.175993
#> conf.low conf.high
#> 1 -5.5224261 63.5341493
#> 2 -5.6394732 63.4176929
#> 3 -3.6393864 8.9206416
#> 4 -0.0553582 0.6345218
That doesn’t look right! Those are very high beta
and
gamma
values, and the confidence intervals are enormous and
overlap zero.
But the model fit doesn’t look all that bad.
fitted_data = mp_trajectory_sd(sir_cal, conf.int = TRUE)
(observed_data
|> ggplot()
+ geom_point(aes(time, value))
+ geom_line(aes(time, value)
, data = fitted_data
)
+ geom_ribbon(aes(time, ymin = conf.low, ymax = conf.high)
, data = fitted_data
, alpha = 0.2
, colour = "red"
)
+ theme_bw()
+ facet_wrap(~matrix, ncol = 1, scales = "free")
)
What is going on? Can we modify this model and/or fitting procedure to get both reasonable parameter estimates and a reasonable trajectory?
Fixing the Optimization Problem
One way to address the lack of convergence of the optimizer is to
assess the degree to which the parameter estimates are biologically
reasonable. Take the recovery rate, which is estimated as
gamma ~ 26.5
per time-step. Given that a time-step is one
week in this model, this estimate implies that individuals recover from
scarlet fever in 1/26.5
weeks – much less than a day. A
quick search suggests to me that this recovery rate is not reasonable
for scarlet fever, and that a rough guess that the infectious stage
lasts one week is much more plausible than 1/26.5
weeks.
Now look at the transmission rate, beta
. It is also
estimated to be pretty large, but what is more interesting is the large
standard errors for both beta
and gamma
. These
standard errors suggest that the estimates are not precise. However, the
estimated correlation between beta
and gamma
is very close to 1
, suggesting that many values for these
parameters would fit the data well as long as they are both of similar
magnitude.
mp_tmb_fixef_cov(sir_cal) |> cov2cor()
#> beta gamma I report_prob
#> beta 1.0000000 1.0000000 -0.9997652 0.9995361
#> gamma 1.0000000 1.0000000 -0.9997636 0.9995384
#> I -0.9997652 -0.9997636 1.0000000 -0.9989125
#> report_prob 0.9995361 0.9995384 -0.9989125 1.0000000
This diagnosis, if correct, suggests that the data are not
sufficiently informative to identify values for both beta
and gamma
. Resolving such identifiability issues is often
best done by introducing prior information, such as our rough guess that
gamma
is close to 1
. The simplest way to
include such prior information is to move gamma
out of the
pars
argument to mp_tmb_calibrator
and into
the default
argument, as below.
sir_cal_assume_gamma = mp_tmb_calibrator(
spec = sf_sir
, data = observed_data
## name the trajectory variable, with a name that
## is the same in both the spec and the data
, traj = "reports"
## fit the following parameters
, par = c("beta", "I", "report_prob")
, default = list(gamma = 1)
)
This calibration specification does not try to jointly fit both
beta
and gamma
, but rather fits only
beta
while assuming that gamma = 1
.
sir_opt_assume_gamma = mp_optimize(sir_cal_assume_gamma)
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
print(sir_opt_assume_gamma)
#> $par
#> params params params
#> 1.122440e+00 1.669046e+03 1.123683e-02
#>
#> $objective
#> [1] 426.6981
#>
#> $convergence
#> [1] 0
#>
#> $iterations
#> [1] 81
#>
#> $evaluations
#> function gradient
#> 124 81
#>
#> $message
#> [1] "both X-convergence and relative convergence (5)"
More warnings, but now the optimizer converges. And the standard errors in the coefficient table and fixed effect correlations seem more plausible.
mp_tmb_coef(sir_cal_assume_gamma)
#> term mat row col default type estimate std.error
#> 1 params beta 0 0 0.200000000 fixed 1.122440e+00 0.001789219
#> 2 params.1 I 0 0 1.000000000 fixed 1.669046e+03 50.635015008
#> 3 params.2 report_prob 0 0 0.003333333 fixed 1.123683e-02 0.000192139
mp_tmb_fixef_cov(sir_cal_assume_gamma) |> cov2cor()
#> beta I report_prob
#> beta 1.0000000 -0.8155308 -0.7603637
#> I -0.8155308 1.0000000 0.5966437
#> report_prob -0.7603637 0.5966437 1.0000000
And the fit looks similar to the four-parameter model, which is consistent with our diagnosis of non-identifiability.
fitted_data = mp_trajectory_sd(sir_cal_assume_gamma, conf.int = TRUE)
(observed_data
|> ggplot()
+ geom_point(aes(time, value))
+ geom_line(aes(time, value)
, data = fitted_data
)
+ geom_ribbon(aes(time, ymin = conf.low, ymax = conf.high)
, data = fitted_data
, alpha = 0.2
, colour = "red"
)
+ theme_bw()
+ facet_wrap(~matrix, ncol = 1, scales = "free")
)
Caution: Ben, what is that paper that you always recommend about being careful when prior information is included without prior uncertainty? This one (Elderd, Dukic, and Dwyer 2006)
Learning the Functional form of Time Variation in Transmission (New!)
Let’s get a bit more data to see two seasons.
scarlet_fever_ontario = api_hook$filter(resource_type = "CANMOD CDI"
, iso_3166_2 = "CA-ON" ## get ontario data
, time_scale = "wk" ## weekly incidence data only
, disease = "scarlet-fever"
# get data between 1929-08-01 and 1931-10-01
, period_end_date = "1929-08-01..1931-10-01"
)
(scarlet_fever_ontario
|> ggplot(aes(period_mid_date, cases_this_period))
+ geom_line() + geom_point()
+ ggtitle("Scarlet Fever Incidence in Ontario, Canada")
+ theme_bw()
)
observed_data = (scarlet_fever_ontario
## select the variables to be modelled -- a time-series of case reports.
|> select(period_end_date, cases_this_period)
## change the column headings so that they match the columns
## in the simulated trajectories.
|> mutate(matrix = "reports")
|> rename(value = cases_this_period)
## create a `time` column with the time-step IDs that will correspond
## to the time-steps in the simulation. this column heading also
## must match the column with the time-steps in the simulated trajectories
|> mutate(time = seq_along(period_end_date))
)
print(head(observed_data))
#> # A tibble: 6 × 4
#> period_end_date value matrix time
#> <date> <dbl> <chr> <int>
#> 1 1929-08-03 23 reports 1
#> 2 1929-08-10 27 reports 2
#> 3 1929-08-17 47 reports 3
#> 4 1929-08-24 24 reports 4
#> 5 1929-08-31 24 reports 5
#> 6 1929-09-07 45 reports 6
Prepare to fit to the data. We make a function so that we can easily update the dimension of the radial basis.
make_rbf_calibrator = function(dimension) {
mp_tmb_calibrator(
spec = sf_sir
, data = observed_data
, traj = "reports"
## -----------------------------
## this is the key bit
, tv = mp_rbf("beta", dimension)
## -----------------------------
, par = c("gamma", "I", "report_prob")
)
}
And it is also convenient to make a function for plotting the results
plot_fit = function(cal_object) {
fitted_data = mp_trajectory_sd(cal_object, conf.int = TRUE)
(observed_data
|> ggplot()
+ geom_point(aes(time, value))
+ geom_line(aes(time, value)
, data = fitted_data
, colour = "red"
)
+ geom_ribbon(aes(time, ymin = conf.low, ymax = conf.high)
, data = fitted_data
, alpha = 0.2
, colour = "red"
)
+ theme_bw()
)
}
Now we try fitting for a number of different dimensions.
sir_cal = make_rbf_calibrator(dimension = 1)
mp_optimize(sir_cal)
#> $par
#> params params params params
#> 1.1161463 88.4704036 0.8660326 0.1269302
#>
#> $objective
#> [1] 2757.747
#>
#> $convergence
#> [1] 1
#>
#> $iterations
#> [1] 133
#>
#> $evaluations
#> function gradient
#> 200 133
#>
#> $message
#> [1] "function evaluation limit reached without convergence (9)"
plot_fit(sir_cal)
sir_cal = make_rbf_calibrator(dimension = 2)
mp_optimize(sir_cal)
#> $par
#> params params params params params
#> 0.03614755 37.88296337 13.70413466 -1.73565077 -3.36834735
#>
#> $objective
#> [1] 2769.557
#>
#> $convergence
#> [1] 1
#>
#> $iterations
#> [1] 119
#>
#> $evaluations
#> function gradient
#> 200 120
#>
#> $message
#> [1] "function evaluation limit reached without convergence (9)"
plot_fit(sir_cal)
sir_cal = make_rbf_calibrator(dimension = 3)
mp_optimize(sir_cal)
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> $par
#> params params params params params
#> 7.445217e-01 6.323353e+03 5.707895e-03 -1.592204e-01 -4.668529e-02
#> params
#> 9.091554e-01
#>
#> $objective
#> [1] 1249.741
#>
#> $convergence
#> [1] 0
#>
#> $iterations
#> [1] 70
#>
#> $evaluations
#> function gradient
#> 96 71
#>
#> $message
#> [1] "relative convergence (4)"
plot_fit(sir_cal)
sir_cal = make_rbf_calibrator(dimension = 4)
mp_optimize(sir_cal)
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> $par
#> params params params params params params
#> 1.30734342 106.01354219 0.07401727 0.42630267 -0.06942185 0.35248532
#> params
#> 0.01001478
#>
#> $objective
#> [1] 1228.817
#>
#> $convergence
#> [1] 1
#>
#> $iterations
#> [1] 116
#>
#> $evaluations
#> function gradient
#> 200 116
#>
#> $message
#> [1] "function evaluation limit reached without convergence (9)"
plot_fit(sir_cal)
sir_cal = make_rbf_calibrator(dimension = 5)
mp_optimize(sir_cal)
#> $par
#> params params params params params params
#> 0.02023712 23.88310851 23.63162276 -3.47579664 1.65114417 -5.44649398
#> params params
#> 1.86159848 -5.85211557
#>
#> $objective
#> [1] 1039.075
#>
#> $convergence
#> [1] 1
#>
#> $iterations
#> [1] 120
#>
#> $evaluations
#> function gradient
#> 200 120
#>
#> $message
#> [1] "function evaluation limit reached without convergence (9)"
plot_fit(sir_cal)
sir_cal = make_rbf_calibrator(dimension = 6)
mp_optimize(sir_cal)
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> $par
#> params params params params params params
#> 0.1270159 128.1590763 128.3625346 -7.2152311 3.2483740 -3.8894784
#> params params params
#> 0.4432874 -1.0387206 -2.2276442
#>
#> $objective
#> [1] 1029.754
#>
#> $convergence
#> [1] 1
#>
#> $iterations
#> [1] 103
#>
#> $evaluations
#> function gradient
#> 200 103
#>
#> $message
#> [1] "function evaluation limit reached without convergence (9)"
plot_fit(sir_cal)
sir_cal = make_rbf_calibrator(dimension = 7)
mp_optimize(sir_cal)
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
#> NA/NaN function evaluation
#> $par
#> params params params params params
#> 1.971535e-02 7.543047e+03 4.314971e-03 3.005822e-01 -2.356461e+00
#> params params params params params
#> 8.875691e-01 -4.221472e+00 8.522903e-01 -2.034926e+00 6.156743e-01
#>
#> $objective
#> [1] 951.6669
#>
#> $convergence
#> [1] 0
#>
#> $iterations
#> [1] 75
#>
#> $evaluations
#> function gradient
#> 160 76
#>
#> $message
#> [1] "relative convergence (4)"
plot_fit(sir_cal)
Interestingly we have now managed to fit gamma
mp_tmb_coef(sir_cal)
#> term mat row col default type estimate std.error
#> 1 params gamma 0 0 0.100000000 fixed 1.971535e-02 6.841928e-03
#> 2 params.1 I 0 0 1.000000000 fixed 7.543047e+03 1.851632e+03
#> 3 params.2 report_prob 0 0 0.003333333 fixed 4.314971e-03 3.499032e-05
#> 4 params.3 time_var_beta 0 0 -0.006264538 fixed 3.005822e-01 1.627780e-01
#> 5 params.4 time_var_beta 1 0 0.001836433 fixed -2.356461e+00 1.402708e-01
#> 6 params.5 time_var_beta 2 0 -0.008356286 fixed 8.875691e-01 1.122606e-01
#> 7 params.6 time_var_beta 3 0 0.015952808 fixed -4.221472e+00 1.333442e-01
#> 8 params.7 time_var_beta 4 0 0.003295078 fixed 8.522903e-01 1.810992e-01
#> 9 params.8 time_var_beta 5 0 -0.008204684 fixed -2.034926e+00 1.118681e-01
#> 10 params.9 time_var_beta 6 0 0.004874291 fixed 6.156743e-01 2.701168e-01
This is a much slower recovery rate. Expected time in the R box is about one year! Not believeable, but the point is to make it easier to fit models so you can try more things.
mp_tmb_calibrator(
spec = sf_sir
, data = observed_data
, traj = "reports"
, tv = mp_rbf("beta", dimension = 7)
, par = c("gamma", "I", "report_prob")
, outputs = c("reports", "beta")
)
#> ---------------------
#> Before the simulation loop (t = 0):
#> ---------------------
#> 1: S ~ N - I - R
#> 2: outputs_var_beta ~ group_sums(values_var_beta * time_var_beta[col_indexes_beta], row_indexes_beta, outputs_var_beta)
#>
#> ---------------------
#> At every iteration of the simulation loop (t = 1 to T):
#> ---------------------
#> 1: beta ~ exp(outputs_var_beta[time_step(1)])
#> 2: infection ~ S * I * beta/N
#> 3: recovery ~ I * gamma
#> 4: S ~ S - infection
#> 5: I ~ I + infection - recovery
#> 6: R ~ R + recovery
#> 7: reports ~ infection * report_prob
#>
#> ---------------------
#> After the simulation loop (t = T + 1):
#> ---------------------
#> 1: sim_reports ~ rbind_time(reports, obs_times_reports)
#>
#> ---------------------
#> Objective function:
#> ---------------------
#> ~-sum(dpois(obs_reports, clamp(sim_reports)))