Prerequisites
This document assumes knowledge of compartmental epidemic models. We will not be defining terms and symbols that we consider to be widely understood by mathematical epidemiologists.
Document Goals
This document describes a specification for a data structure to represent compartmental models. The goal of this document is not to define a spec for a user interface for constructing compartmental models. Instead, we define the data structure here so that interfaces for creating instances of compartmental models can be designed without needing to grapple with issues of how to represent and store model definitions.
User Stories
This spec design has been guided by several high-level user stories, which we outline briefly.
As a modeller, I want to use software that provides an extensive set of modelling capabilities, so that I can address specific and changing public health needs.
As a modeller, I want to build complex models out of simpler – easier to understand – modular sub-models, so that I can more easily modify the structure of my model as public health needs change.
As a developer, I want to be able to quickly add a new modelling capability, so that I can support public health modellers in as timely a manner as possible.
These and other user-stories were developed by working with public-health modellers during the COVID-19 pandemic in Canada.
Labelled Partitions (in progress)
Before describing a data structure for representing Compartmental Models, we need to define the concept of labelled partitions. This concept allows for a consistent, flexible, compact, and readable mechanism for identifying model variables and sets of variables. For example, in a spatially structured model we might want to refer to all states in a particular location (e.g. Toronto) and a specific state within that location (e.g. susceptible individuals in Toronto). Labelled partitions can be stored as a data frame. The rows of the data frame represent the things being represented. Our primary example of a ‘thing’ is a variable in a compartmental model (e.g. number of susceptible individuals in Toronto, or the force of infection). The columns of the data frame represent different ways to partition the rows.
#> Epi Vax
#> 1 S unvax
#> 2 E unvax
#> 3 I unvax
#> 4 R unvax
#> 5 S vax
#> 6 E vax
#> 7 I vax
#> 8 R vax
Non-empty values in each cell must contain only letters, numbers, underscores, and must start with a letter. Empty labels are zero-length strings that can be used to indicate that some partitions are not applicable to some variables. The purpose for these restrictions is to facilitate the construction of strings and character vectors that summarize different aspects of the labelled partition. When taken together, these summaries can be inverted to restore the full labelled partition and so they represent zero information loss. This equivalence allows us to go back-and-forth between the two representations without loosing information, but perhaps gaining convenience. There are three types of summaries: the names, the name, and the labels. The names of a labelled partition are the column names of the corresponding data frame.
p$names()
#> [1] "Epi" "Vax"
The name of a labelled partition is a dot-concatenation of the names,
p$name()
#> [1] "Epi.Vax"
The labels of a labelled partition is a dot-concatenation of the columns.
p$labels()
#> [1] "S.unvax" "E.unvax" "I.unvax" "R.unvax" "S.vax" "E.vax" "I.vax"
#> [8] "R.vax"
These labels give a unique single character string for referring to
each variable. With only the labels and one of either the names or the
name, one may recover the labelled partition. The labels provide
convenient names for the variables – i.e. rownames. By convention we use
UpperCamelCase
for partition names and a modified form of snake_case for
variable labels. Our modification of snake case allows for single
uppercase letters in order to accommodate the convention in epidemiology
for using single uppercase letters to refer to state variables. For
example, S
, I
, and R
, as well as
I_mild
and I_severe
, would be consistent with
our modified snake case style.
Compartmental Models
A compartmental model can be represented as a directory with the following four files.
-
variables
– A CSV file describing the variables in the model. -
derivations
– JSON file with an array of instructions on how to update thevariables
at each iteration of a model simulation. -
flows
– A CSV file describing what variables are state variables and the magnitudes of flows to and from compartments. -
settings
– A JSON file with an object of elements that determine how files 1-3 are used by the software.
The following sections describe each of these files in more detail.
Variables and Partitions
The variables.csv
file has a row for each variable in
the model and one column for each labelled partition of the variables. A
labelled partition is a character vector of labels that describes the
variables. Each row in this file must be unique so that it uniquely
identifies each variable.
For example, here is a list of variables for the standard SIR model.
#> Epi
#> 1 S
#> 2 I
#> 3 R
#> 4 N
#> 5 beta
#> 6 foi
#> 7 gamma
This example has a single labelled partition, Epi
. This
partition uniquely identifies each variable and so we do not need other
partitions to comply with the uniqueness requirement.
To stratify by another factor (e.g. age groups or locations) we can add other labelled partitions.
We emphasize that the list with the single Epi
partition
did not violate this spec, as it is not required to add a partition to
make a distinction between state variables and parameters. The spec
distinguishes Variable Roles like this in
the settings
file. In this example, the addition of the
Type
and Infect
partitions are for
convenience. The point is that the user is free to add any number of
partitions even if they are not required for uniqueness.
We use the term names for strings that identify partitions and the term labels for strings that identify variables with respect to a particular partition.
Note that we do not provide a column for the numeric quantity of each variable because this is a spec for a compartmental model data structure. By working with pure model structure objects, we can combine different atomic models to more easily produce models with more structure. When it comes time to use a model, the computational engine can be used to associate model variables with numerical quantities.
Required Partitions
There is a component, settings$required_partitions
,
which is a character vector of the names of the partitions that are used
to uniquely identify each variable. In the previous example with
Type
, Infect
, and Epi
partitions
we have settings$required_partitions == "Epi"
. We refer to
the labels associated with required partitions as required labels. Every
variable must have at least one non-empty required label. The required
partitions are used to generate Variable
Names.
Atomic and Non-Atomic Models
We refer to models with exactly one required partition as atomic. One
may combine two atomic models to get another model with two required
labelled partitions. An example combination of atomic models is the SEIR
model stratified by the states of an atomic vaccination model with
required partition, Vax
.
#> Vax
#> 1 unvax
#> 2 vax
#> 3 dose_rate
In this atomic model unvax
and vax
are
state variables representing the numbers of individuals who are
unvaccinated and vaccinated. The dose_rate
variable is a
parameter giving the proportion of unvaccinated individuals who receive
a vaccination each day.
We combine these two atomic models to produce a non-atomic model with
required partitions, Epi
and Vax
.
#> Epi Vax
#> 1 S unvax
#> 2 E unvax
#> 3 I unvax
#> 4 R unvax
#> 5 beta unvax
#> 6 alpha
#> 7 gamma
#> 8 N unvax
#> 9 foi unvax
#> 10 S vax
#> 11 E vax
#> 12 I vax
#> 13 R vax
#> 14 beta vax
#> 15 alpha
#> 16 gamma
#> 17 N vax
#> 18 foi vax
#> 19 dose_rate
Note here that the state variables, S
, E
,
I
, and R
, as well as the transmission rate,
beta
, are stratified by vaccination status, but the latency
and recovery parameters, alpha
and gamma
, are
not stratified indicating that latency and recovery is not influenced by
vaccination status. Similarly the dose_rate
is also not
stratified by epidemiological state.
Consider one more example with different symptomatic statuses, where infectious individuals have either mild or severe symptoms and these different statuses recover at different rates.
#> Epi Symp Vax
#> 1 S unvax
#> 2 E unvax
#> 3 I mild unvax
#> 4 I severe unvax
#> 5 R unvax
#> 6 beta unvax
#> 7 S vax
#> 8 E vax
#> 9 I mild vax
#> 10 I severe vax
#> 11 R vax
#> 12 beta vax
#> 13 alpha
#> 14 gamma mild
#> 15 gamma severe
#> 16 infectiousness mild
#> 17 infectiousness severe
#> 18 dose_rate
Variable Names
Although each row of the variables
list provides a
unique identifier for each variable, it is convenient to be able to
combine the required labels into a single string. We call this string
the variable name.
The name of a variable is the dot-delimited concatenation of the
partitions in settings$required_partitions
associated with
the variable. For example, the following table contains the labelled
partitions of the
Epi
-by-Symp
-by-Vax
variable list
with a fourth column giving the variable names.
#> Epi Symp Vax variable_names
#> 1 S unvax S..unvax
#> 2 E unvax E..unvax
#> 3 I mild unvax I.mild.unvax
#> 4 I severe unvax I.severe.unvax
#> 5 R unvax R..unvax
#> 6 beta unvax beta..unvax
#> 7 S vax S..vax
#> 8 E vax E..vax
#> 9 I mild vax I.mild.vax
#> 10 I severe vax I.severe.vax
#> 11 R vax R..vax
#> 12 beta vax beta..vax
#> 13 alpha alpha..
#> 14 gamma mild gamma.mild.
#> 15 gamma severe gamma.severe.
#> 16 dose_rate ..dose_rate
This choice of a dot delimiter leads to syntactically valid R names even if the name begins with one or more dot. Because dots are not allowed in the labels themselves, this delimiter also makes parsing the labels into their respective partitions unambiguous. Note that it is impossible to have names with all dots because all variables must have at least one non-blank label.
The biggest downside to the choice of a dot delimiter is that it conflicts with the tidyverse style, which requires using dots only to specify S3 method dispatch. The reason for this guideline is to reduce the chances of ambiguity when the S3 machinery searches for methods. This disadvantage is in our opinion outweighed by the advantages that we list above, because model variables will not be involved in S3 dispatch.
Variable names are invertible, in that a vector of variable names can be combined with a vector of partition names to reproduce the associated labelled partitions. This fact allows us to choose either the labelled partitions or names representation depending on convenience or necessity.
Variable Roles
Variable Names are used in lists that
identify variables with specific roles in the model. There are at least
two such lists in the settings.json
file.
-
settings$state_variables
– Character vector giving names of variables, each of which tracks the number of individuals in a compartment. -
settings$rate_variables
– Character vector giving names of variables, each of which determines the per-capita rate of flow between one or more pairs of compartments.
Additional variable lists may be used to accommodate such things as absolute flow rates as well as sources and sinks.
All state variables and flow variables must be scalars.
Partition Sets
The required_paritions
component is a set of partitions
that are used to uniquely identify each variable and to generate
invertible variable names. The user is free to define other
partition_sets
so that they can refer to groups of
variables by name. These sets can be used to filter the variables for a
specific purpose. Such partition_sets
are useful when
combining models.
Variable names can be used with respect to a particular partition
set. Variable names with respect to some partition sets do not uniquely
identify each variable. For example, Epi.Symp == "I.mild"
occurs twice in the following example.
#> Epi Symp Vax Epi.Symp Epi.Vax
#> 1 S unvax S. S.unvax
#> 2 E unvax E. E.unvax
#> 3 I mild unvax I.mild I.unvax
#> 4 I severe unvax I.severe I.unvax
#> 5 R unvax R. R.unvax
#> 6 beta unvax beta. beta.unvax
#> 7 S vax S. S.vax
#> 8 E vax E. E.vax
#> 9 I mild vax I.mild I.vax
#> 10 I severe vax I.severe I.vax
#> 11 R vax R. R.vax
#> 12 beta vax beta. beta.vax
#> 13 alpha alpha. alpha.
#> 14 gamma mild gamma.mild gamma.
#> 15 gamma severe gamma.severe gamma.
#> 16 dose_rate . .dose_rate
The ability to refer to multiple variables with a single name is
useful in many circumstances. Typically such non-unique names are unique
within groups of variables. For example,
Epi.Symp == "I.mild"
is unique within
Vax == "vax"
and within Vax == "unvax"
.
As the examples above have illustrated, the name of a partition set
is the dot-delimited concatenation of the names of the partitions in the
set. For example, Epi.Symp
is the name of the partition set
containing the Epi
and Symp
partitions.
Derived Variables
Now that we have a solid mechanism for representing, referring to,
and grouping model variables, we can begin to consider defining Flow between Compartments. When
the rate of flow from one compartment to another is given by a model
parameter (e.g. recovery rate, gamma
), then it is
straightforward to describe the flow between those compartments
(e.g. I
to R
). However, when the flow rate is
a function of parameters and state variables (e.g. force of infection),
we first need to define how to compute such a rate.
In this section we describe a data structure for representing mathematical expressions for deriving model variables that must be recomputed at each iteration of a simulation. Flow rates such as the force of infection is an example of such a derived variable. There are many other examples, including sums of groups of (e.g. vaccinated) compartments, model summaries such as \(\mathcal{R}_0\), and convolutions of infected compartments for modelling under-reporting and reporting delays.
Each derived variable must be represented by a row in the
variables
list. Derived variables are distinguished from
input variables by being associated with an element in a list,
derivations
. This list contains objects that describe how
to compute each derived variable. Each object contains the following
components.
-
simulation_phase
[required] – Specifies when in the simulation the variable should be evaluated. Must be one of four values: “before”, “during_pre_update”, “during_post_update”, or “after”.-
before
: Variable is computed before the simulation loop. Intended for variables that must be computed once but will remain constant throughout the simulation. -
during_pre_update
: Variable is re-computed during each loop of the simulation before the state variable has been updated. Intended for variables that depend on the state during the previous timestep. -
during_post_update
: Variable is re-computed during each loop of the simulation after the state variable has been updated. Intended for variables that depend on the state during the current timestep. -
after
: Variable is computed after the simulation loop. Intended for modifying simulation outputs to be comparable with observed data.
-
-
group_partition
[optional] – The name of a partition set to use for filtering and grouping the variables so that (1) only a subset of variables are utilized in order to produce the derived variables and (2) a set of derived variables can be produced at the same time by producing one derived variable per group. Whengroup_partition
is missing, all variables are potentially utilized by a single derived variable. -
group_names
[required ifgroup_partition
is not missing] – A vector of variable names with respect to thegroup_partition
, each of which will define a single group. Thegroup_names
also determine the labels of variables that distinguish groups from each other. There are three use cases.-
group_names
andgroup_partition
are both missing – All variables are potentially utilized by a single derived variable -
length(group_names) == 1L
– A filtered subset of variables get utilized by a single derived variable -
length(group_names) == n
– The variables in each ofn
groups are used to produce one ofn
derived variables
-
-
output_partition
[optional] – The name of a partition set to use for declaring the labels of the derived variables within each group (seegroup_partition
andgroup_names
for producing one derived variable per group). If nooutput_partition
is given then all required partitions are used. -
output_names
[optional] – A vector of variable names with respect to theoutput_partition
, each of which will determine the labels associated with theoutput_partition
of each derived variable within its group. Ifgroup_names
are missing thenoutput_names
must be a single variable name with respect to theoutput_partition
for determining the labels for the derived variable. Otherwise, thei
th element ofoutput_names
determines the labels for the derived variable produced by thei
th element ofgroup_names
. -
input_partition
[optional] – The name of a partition set to use for matching variables to arguments in theexpression
. Every variable name with respect to theinput_partition
must be both unique and exist within each group. If noinput_partition
is given then all required partitions are used. -
arguments
[required ifexpression
has arguments other then...
] – A vector of variable names with respect to theinput_partition
. -
argument_dots
[required ifexpression
has a...
argument] – A vector of variable names with respect to theinput_partition
for finding variables to pass to a...
argument inexpression
, if it exists. -
expression
[required] – A mathematical function for producing one derived variable per group. The names of the arguments of this function are a subset of variable names with respect to theinput_partition
and optionally a...
argument.
TODO: Perhaps this should be a ‘compressed’ data format for
derivations
, and the simple
derived_var_name = function(...) ...
should be used as the
base case? The compressed format can be used to form the basis of an
interface design?
The spec for this file will be hard to understand on first reading. To understand it better we provide several examples designed to justify and clarify the description in the above bullet list.
We begin with defining the force of infection for an SEIR model with the following variable list.
#> Epi
#> 1 S
#> 2 I
#> 3 R
#> 4 N
#> 5 beta
#> 6 foi
#> 7 gamma
The force of infection for this model can be represented as follows.
list(
output_partition = "Epi",
output_names = "foi",
input_partition = "Epi",
math_function = function(S, E, I, R, beta) {
I * beta / sum(S, E, I, R)
}
)
Because this is an atomic
model, by specifying output_partition = "Epi"
and
output_names "foi"
, we are really just saying that the name
of the output variable is "foi"
. More technically we would
say that the derived variable will have the label "foi"
in
the "Epi"
partition. The input_partition
and
math_function
go together. The math_function
describes how to compute the derived variable, foi
, in
terms of other variables. The input_partition
gives the
name of a partition set to use for
matching model variables to the arguments of this function. For example,
the argument list contains S
, which is a label in the
Epi
partition set.
This description may seem unnecessarily complex, but it is useful for representing more complex models as we will see. Also remember that this is a spec for a data structure and not a user interface – it is likely that a user interface for specifying this component would make it much simpler to do so.
Updating derived variable specifications when taking model products
is reasonably straightforward with this data structure. For example,
consider stratifying S
, E
, I
,
R
, and beta
by geographic location.
#> Epi Location
#> 1 S montreal
#> 2 E montreal
#> 3 I montreal
#> 4 R montreal
#> 5 beta montreal
#> 6 S toronto
#> 7 E toronto
#> 8 I toronto
#> 9 R toronto
#> 10 beta toronto
#> 11 alpha
#> 12 gamma
#> 13 montreal_to_toronto_rate
#> 14 toronto_to_montreal_rate
In this new model, the functional form of the force of infection remains constant and we simply update and add some of the partition sets and variable names.
list(
group_partition = "Location",
group_names = c("montreal", "toronto"),
output_partition = "Epi",
output_names = c("foi", "foi"),
input_partition = "Epi",
math_function = function(S, E, I, R, beta) {
I * beta / sum(S, E, I, R)
}
)
This will define expressions to produce derived variables called
foi.montreal
and foi.toronto
. To get this list
we did the following.
- Added geographical grouping information to specify how to stratify the force of infection.
- Repeat the entry in
output_names
once for each location.
This model product assumes that individuals in Toronto cannot be infected by individuals in Montreal. This is a reasonable assumption for this spatial stratification, but in models where the stratification does not isolate compartments from each other we need to add an additional derived variable that sums the force of infection components over the strata. Consider for example the SEIR model stratified by vaccination status.
#> Epi Vax
#> 1 S unvax
#> 2 E unvax
#> 3 I unvax
#> 4 R unvax
#> 5 beta unvax
#> 6 alpha
#> 7 gamma
#> 8 N unvax
#> 9 foi unvax
#> 10 S vax
#> 11 E vax
#> 12 I vax
#> 13 R vax
#> 14 beta vax
#> 15 alpha
#> 16 gamma
#> 17 N vax
#> 18 foi vax
#> 19 dose_rate
We can compute the force of infection components just as we did for the spatial model.
list(
group_partition = "Vax",
group_names = c("unvax", "vax"),
output_partition = "Epi",
output_names = c("foi", "foi"),
input_partition = "Epi",
math_function = function(S, E, I, R, beta) {
I * beta / sum(S, E, I, R)
}
)
But to complete the model product we must add a derived variable that
takes the sum of all of the Epi == "foi"
variables.
list(
output_partition = "epi.vax",
output_names = "foi.total",
input_partition = "epi.vax",
input_names_dots = c("foi.unvax", "foi.vax"),
math_function = function(...) sum(...)
)
This will produce a single derived variable called
foi.total
, which can be used as the force of infection for
all vaccination strata.
There are other operations on model space besides model products. Another such operation is to convert a single compartment into several compartments. An example of this operation is to break up the infectious compartment into several symptomatic states.
#> Epi EpiEffective Symp Type
#> 1 S S
#> 2 E E
#> 3 I mild component
#> 4 I severe component
#> 5 I vector
#> 6 I I
#> 7 R R
#> 8 beta beta
#> 9 alpha alpha
#> 10 gamma gamma mild
#> 11 gamma gamma severe
#> 12 infectiousness mild component
#> 13 infectiousness severe component
#> 14 infectiousness vector
#> 15 foi foi
Notice the "Type"
partition that indicates the different
forms with which the symptomatic-status-structured variables are
represented. Here the scalar components of I
and
infectiousness
can be referred to, or one can refer to
vectors that contain all components. The components are
I.mild.component
, I.severe.component
,
infectiousness.mild.component
, and
infectiousness.severe.component
. The vectors are
I..vector
and infectiousness..vector
. The
variable I..effective
refers to the the weighted sum given
by the inner product of I..vector
and
infectiousness..vector
.
Here we assume that the component
types are input
variables, and that we derive the other I
and
infectiousness
variables with the following
definitions.
list(
group_partition = "Epi.Symp",
group_names = c("I.mild", "I.severe", "infectiousness.mild", "infectiousness.severe"),
output_partition = "Epi.Type",
output_names = c("I.vector", "infectiousness.vector"),
input_partition = "Symp",
input_names_dots = c("mild", "severe"),
math_function = function(...) c(...)
)# list(
# #group_partition = "Epi",
# #group_names = c("I", "I", "infectiousness.mild", "infectiousness.severe"),
# output_partition = "Epi.EpiEffective",
# output_names = "I.I",
# input_partition = "Epi.Symp",
# input_names_dots = c("I.mild", "I.severe", "infectiousness.mild", "infectiousness.severe"),
# math_function = function(...) {
# l = unlist(list(...))
# n = length(l)
# sum(l[1:(n/2)] * l[((n/2)+1):n])
# }
# )
list(
group_partition = "Type",
group_names = "vector",
output_partition = "EpiEffective",
output_names = "I",
input_partition = "Epi",
math_function = function(I, infectiousness) {
sum(I * infectiousness)
}
)list(
output_partition = "Epi",
output_names = "foi",
input_partition = "Epi",
math_function = function(S, E, I, R)
)
The ordering of the derivations
list must be
topologically sorted so that the dependencies of each expression appear
earlier in the list. This topological sorting requirement presents a
technical challenge when models are combined, as it becomes necessary to
be able to merge lists of derivations
so that they remain
topologically sorted.
Flow between Compartments
We have another data frame, flows
, with rows describing
how individuals flow between pairs of compartments. There are four
required columns in this data frame, from
, to
,
flow
, and type
.
-
from
– The variable name of the compartment from which individuals are flowing. -
to
– The variable name of the compartment to which individuals are flowing. -
flow
– The name of the variable governing the magnitude of the flow. -
type
– A type describing the way in which the magnitude of flow is quantified. The following options are available.-
"per_capita"
– The flow fromfrom
and toto
equalsfrom * flow
. -
"absolute"
– The flow fromfrom
and toto
equalsflow
. -
"per_capita_inflow"
– The flow toto
equalsfrom * flow
. -
"per_capita_outflow"
– The flow fromfrom
equalsfrom * flow
. -
"absolute_inflow"
– The flow toto
equalsflow
. -
"absolute_outflow"
– The flow fromfrom
equalsflow
.
-
For example, consider this SIR model.
#> epi
#> 1 S
#> 2 I
#> 3 R
#> 4 beta
#> 5 gamma
#> 6 foi
We would have the following flows.csv
file.
#> from to flow type
#> 1 S I foi per_capita
#> 2 I R gamma per_capita
Notice that all state variables appear in this table, and that it contains no parameters. This is the mechanism for encoding whether a model variable is a state variables or a parameter.
A more interesting example is the following
Epi
-by-Vax
model.
#> Epi Vax
#> 1 S unvax
#> 2 I unvax
#> 3 R unvax
#> 4 beta unvax
#> 5 S vax
#> 6 I vax
#> 7 R vax
#> 8 beta vax
#> 9 alpha
#> 10 gamma
#> 11 foi
#> 12 dose_rate
The flows
data frame for these variables could look like
the following.
#> from to flow type
#> 1 S.unvax I.unvax foi.total per_capita
#> 2 I.unvax R.unvax gamma. per_capita
#> 3 S.vax I.vax foi.total per_capita
#> 4 I.vax R.vax gamma. per_capita
#> 5 S.unvax S.vax .dose_rate per_capita
#> 6 I.unvax I.vax .dose_rate per_capita
#> 7 R.unvax R.vax .dose_rate per_capita
Flows in Structured Models
Structured models can have a large number of flows making the above
flows
data structure very long. For example, an SEIR model
with three symptomatic status’, five vaccination status’, in 30
geographic locations would have over 750 flows depending on the details
of the vaccination and geographic models. The length of the list is not
in-and-of-itself the main problem with this data structure. A bigger
problem is that the data structure does not preserve the model
structure, making it difficult to visualize the underlying simplicity.
For example, if the model structure was preserved in the
flows
data it would be possible to automatically construct
box diagrams for communicating the simplicity of a seemingly complex
compartmental model.
On the other hand, the benefit of the more explicit data structure in [Flows between Compartments] is that it is easy to read each line. For example, the following line clearly tells me that unvaccinated exposed individuals flow to unvaccinated infectious individuals with mild symptoms at a rate that is independent of both vaccination status and symptomatic status.
#> from to flow type
#> 1 E..unvax I.mild.unvax alpha.. per_capita
In this section we add more columns to the data structure for encoding model structure. These additional columns make it possible to shorten the list and retain model structure information, at the cost of making the meaning of each line more difficult to decode. However, if required, as we will see, the structured version can always be converted to the simpler but longer unstructured version. But it is not possible to go from unstructured to structured.
The first three additional columns contain partition sets to resolve
the names in from
, to
, and flow
.
Consider the following example.
#> from to flow type from_partition to_partition flow_partition
#> 1 S E foi per_capita Epi Epi Epi
Consider this example within the context of an SEIR model stratified
by two vaccination status’. In this case this data structure does not
allow us to distinguish between models where S.unvax
flows
only to E.unvax
from one where it is also allowed to flow
to E.vax
. To address this problem, we need three more
partition columns for matching up valid S
-E
pairs. The following table provides an example of this.
#> from to flow type from_partition to_partition flow_partition
#> 1 S E foi per_capita Epi Epi Epi
#> from_to_partition from_flow_partition to_flow_partition
#> 1 Vax
File Formats
Both CSV files have the following format.
- Column delimiter is a comma.
- There are no quotes anywhere in the file.
- There are no special missing value tokens.
- The first row gives the names of the partitions.
- There are no row names.
Additionally, the only characters that are allowed in this file are uppercase and lowercase letters, digits, underscores, and commas. The presence of any other character is sufficient to invalidate the file.
The settings.json
file contains a single object with the
following key-value pairs.
-
required_partitions
– array of strings -
state_variables
– array of strings -
flow_variables
– array of strings
The derivations.json
file contains a single array of
objects. Each of these objects has the following required key-value
pairs.
-
output_names
– array of strings -
expression
– string
Each of these objects has the following optional key-value pairs, although some of these optional pairs become required given the existence of others.
-
filter_names
– array of strings -
group_names
– array of strings -
input_names
– array of strings -
input_dots
– array of strings -
filter_partition
– string -
group_partition
– string -
output_partition
– string -
input_partition
– string