Contributing to macpan2

Thank you for contributing to macpan2. Pull requests and issues are welcome.

Developers can see here for documentation useful to those who will contribute code.

Developer Installation
C++ Development
C++ Standards
Adding Engine Functions
Developer Installation on Windows
Test Suite
Changelog Management
Testing Installability of Specific Commits
Log Files

Developer Installation

Developers and contributors should clone this repository and run make at the command line in the top-level directory. The following make rules give more control over the build process.

make quick-install     # for changes that only modify R source
make quick-doc-install # for changes that modify R source and roxygen comments
make quick-test        # quick-doc-install + run-examples + run-tests
make run-examples      # help file checks only (without package rebuild)
make run-tests         # run scripts in tests (without package rebuild)
make full-install      # for all changes, including changes to C++ source
make src-update        # push changes to dev.cpp to macpan2.cpp (see below)
make enum-update       # register new C++ engine functions on the R-side
make engine-doc-update # generate roxygen comments from comments in dev.cpp
make doc-update        # roxygenize
make pkg-build         # build the R package
make pkg-install       # install the R package from the build
make pkg-check         # R package checks

C++ Development

In most R packages with compiled code, developers edit the source files to be compiled in the src directory. In macpan2 there is a single file in that directory called macpan2.cpp, which is generated automatically from the file misc/dev/dev.cpp. This setup allows for quicker C++ development cycles, because developers can edit misc/dev/dev.cpp and then use this file in tests without needing to re-install the package with the new source. In particular, a hello-world example could use dev.cpp as follows.

library(macpan2)
macpan2:::dev_compile() ## compile dev.cpp
options(macpan2_dll = "dev")
sir = mp_tmb_library("starter_models", "sir", package = "macpan2")
mp_simulator(sir, time_steps = 100, outputs = "I")

To update src/macpan2.cpp to the state of misc/dev/dev.cpp, run make src-update.

Running with misc/dev/dev.cpp will print out debugging information in a verbose manner, whereas src/macpan2.cpp will not. The src-update make rule removes the #define MP_VERBOSE flag at the top of the file.
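As a rough illustration of how a compile-time flag like this can gate debugging output, here is a minimal sketch. Only the MP_VERBOSE name comes from the text above; the debug_message helper is hypothetical and is not taken from dev.cpp.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// dev.cpp is described as carrying this flag at the top of the file;
// the src-update make rule is described as removing it.
#define MP_VERBOSE

// Hypothetical helper (not from macpan2): prints only in verbose builds
// and reports whether anything was printed.
inline bool debug_message(const std::string& msg) {
#ifdef MP_VERBOSE
    std::cout << "[debug] " << msg << std::endl;
    return true;
#else
    (void)msg; // avoid an unused-parameter warning in quiet builds
    return false;
#endif
}
```

Deleting the #define line turns debug_message into a no-op without touching any call sites, which is the effect the src-update rule is described as having on the verbose output.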

We #include both Rcpp.h and TMB.hpp, which increases the possibility of namespace clashes. Our approach to addressing this is with include-guarding. We assume that TMB takes precedence and so we include Rcpp first and then un-define any names in Rcpp that we want to use from TMB instead. Here is the example of the dnorm function.

#include <Rcpp.h>
#ifdef dnorm
#undef dnorm
#endif
#include <TMB.hpp>

When you attempt to use functions from TMB when adding an engine function, you should be aware that you might need to do some include-guarding. You will find out via compilation errors.
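To see why the #undef matters without installing Rcpp or TMB, here is a self-contained sketch. Both "libraries" are stand-ins: the function-like macro and the second_library namespace are hypothetical and only mimic the shape of the clash.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for a macro left behind by an earlier header (hypothetical;
// the real definition comes from the headers pulled in by Rcpp.h).
#define dnorm(x) legacy_dnorm(x)

// Include-guarding step: remove the macro before the second library is seen.
// Without it, the three-argument definition below would not compile, because
// the preprocessor would try to expand it as the one-argument macro.
#ifdef dnorm
#undef dnorm
#endif

// Stand-in for the second library, playing the role of TMB (hypothetical).
namespace second_library {
// Normal density, written out so the sketch has no dependencies.
inline double dnorm(double x, double mean, double sd) {
    assert(sd > 0.0);
    const double pi = 3.14159265358979323846;
    const double z = (x - mean) / sd;
    return std::exp(-0.5 * z * z) / (sd * std::sqrt(2.0 * pi));
}
}
```

The same pattern applies to any other name that both sets of headers claim: add one #ifdef/#undef/#endif group per name between the two #include lines.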

C++ Standards

We are targeting support for both C++14 and C++17. This means, for example, that we cannot use std::variant because it was introduced in C++17.
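Where C++17 code might reach for a variant to hold either real-valued or integer data, a small tagged struct is one alternative that compiles under both standards. The sketch below is illustrative only; the Value type and its members are hypothetical and are not taken from dev.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical tagged value: holds either real-valued data or integer data.
// Uses only C++14 features, so it builds under C++14 and C++17 alike.
struct Value {
    enum class Kind { Real, Int };
    Kind kind;
    std::vector<double> real_data; // meaningful when kind == Kind::Real
    std::vector<int> int_data;     // meaningful when kind == Kind::Int

    static Value make_real(std::vector<double> x) {
        Value v;
        v.kind = Kind::Real;
        v.real_data = std::move(x);
        return v;
    }

    static Value make_int(std::vector<int> x) {
        Value v;
        v.kind = Kind::Int;
        v.int_data = std::move(x);
        return v;
    }

    // Access the integer alternative; asserts that it is the active one.
    const std::vector<int>& as_int() const {
        assert(kind == Kind::Int);
        return int_data;
    }

    // Number of stored elements, whichever alternative is active.
    std::size_t size() const {
        return kind == Kind::Real ? real_data.size() : int_data.size();
    }
};
```

The trade-off against a real variant is that both members always exist, so the struct is larger and the tag must be kept in sync by hand.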

Adding Engine Functions

If you need a function for defining model simulations that is not currently supported by the C++ TMB engine, here is how you can add one.

This is a reasonably advanced topic in that it involves using the TMB C++ framework. However, for simple functions you might find that you can work by analogy with existing functions.

Declare a new function by adding it to the macpan2_func enum in dev.cpp. These declarations must be of the following form:

MP2_{UPPERCASE-FUNCTION-LABEL} = {UNIQUE-INTEGER} // {FUNCTION-TYPE}: {R-SIDE-FUNCTION-NAME}({ARG-1, ARG-2, ...})

Here are some examples:

MP2_MULTIPLY = 3 // binop: `*`(x, y)
MP2_LOG = 7 // fwrap: log(x)
MP2_ROUND_BRACKET = 8 // paren: `(`(...)
MP2_SUM = 12 // fwrap: sum(...)

These examples illustrate the three FUNCTION-TYPEs:

binop : Functions that will be used as binary operators
fwrap : Functions with arguments wrapped in round brackets
paren : Functions that will be used as parentheses
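Put together, the four example declarations might sit in the enum roughly as follows. This is a sketch, not a copy of dev.cpp: the comma placement is an assumption (follow whatever the existing enum does), and the function_type helper is hypothetical.

```cpp
#include <cassert>
#include <string>

// The four declarations from the examples above, placed in an enum.
enum macpan2_func {
    MP2_MULTIPLY = 3,      // binop: `*`(x, y)
    MP2_LOG = 7,           // fwrap: log(x)
    MP2_ROUND_BRACKET = 8, // paren: `(`(...)
    MP2_SUM = 12           // fwrap: sum(...)
};

// Hypothetical helper (not part of macpan2): maps an enum value to its
// FUNCTION-TYPE. Case labels in a switch must be distinct, so reusing an
// integer for two functions would be a compile-time error in a switch
// like this one.
inline std::string function_type(macpan2_func f) {
    switch (f) {
        case MP2_MULTIPLY: return "binop";
        case MP2_LOG: return "fwrap";
        case MP2_ROUND_BRACKET: return "paren";
        case MP2_SUM: return "fwrap";
    }
    return "unknown";
}
```

The distinct-label rule is one practical reason the integer in each declaration has to be unique.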

Check to see if your function should be added to one or more of several lists of functions that get treated in similar ways:

mp_math : Add if your function can only take numerical matrices as arguments, and cannot take integer vectors.
mp_elementwise_binop : Add if your function is an elementwise binary operator.
mp_history : Add if your function depends on having a first argument being a matrix with saved history (e.g., a lag function).

Add the function body as an item in the following switch structure:

switch (table_x[row] + 1) {...}

Here is a very simple example of a function that extracts and returns the diagonal of the argument.

case MP2_FROM_DIAG: // from_diag
  m = args[0].diagonal();
  return m;

The arguments to your function are contained in the args object. The first argument is args[0], the second is args[1], etc. The number of arguments that are passed is given by n.

Each of these arguments, if it is a matrix (or integer vector), has an index giving its position within the complete list, valid_vars (or valid_int_vecs), of matrices (or integer vectors) in the model. The index2mats vector gives you these indexes, with index2mats[0] giving the position for the first matrix, etc. The index2what vector gives you information about the type of argument. Argument i is a matrix if index2what[i] = 0, an integer vector if index2what[i] = 1, or invalid if index2what[i] = -1.

Sometimes you will want to assert that an argument is a matrix or an integer vector, depending on the context. The method args.get_as_mat(i) will return the ith argument if it is a matrix, and throw an error otherwise. The method args.get_as_int(i) will return the ith argument as an integer vector, converting to an integer vector if necessary. Check out the ArgList class for other methods that might be useful.
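The type check behind get_as_mat can be pictured with a toy stand-in. ToyArgList, its fields, and the vector-of-vectors Matrix below are hypothetical and far simpler than the real ArgList class; only the 0/1/-1 coding of index2what and the return-or-throw behaviour come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

typedef std::vector<std::vector<double> > Matrix; // toy matrix: rows of doubles
typedef std::vector<int> IntVec;

// Toy stand-in for ArgList (hypothetical; the real class lives in dev.cpp).
struct ToyArgList {
    std::vector<Matrix> mats;     // matrix payload, one slot per argument
    std::vector<IntVec> int_vecs; // integer-vector payload, one slot per argument
    std::vector<int> index2what;  // 0 = matrix, 1 = integer vector, -1 = invalid

    void add_mat(const Matrix& m) {
        mats.push_back(m);
        int_vecs.push_back(IntVec());
        index2what.push_back(0);
    }

    void add_int_vec(const IntVec& v) {
        mats.push_back(Matrix());
        int_vecs.push_back(v);
        index2what.push_back(1);
    }

    // Return argument i if it is a matrix, and throw otherwise.
    const Matrix& get_as_mat(std::size_t i) const {
        if (i >= index2what.size() || index2what[i] != 0)
            throw std::invalid_argument("argument is not a matrix");
        return mats[i];
    }
};

// Toy version of the from_diag body: the diagonal as a one-column matrix.
inline Matrix from_diag(const ToyArgList& args) {
    const Matrix& x = args.get_as_mat(0);
    Matrix m;
    for (std::size_t i = 0; i < x.size() && i < x[i].size(); ++i)
        m.push_back(std::vector<double>(1, x[i][i]));
    return m;
}
```

Calling get_as_mat up front, as from_diag does here, turns a wrongly typed argument into an immediate error instead of a confusing failure later in the function body.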

Names for intermediate matrices and integer vectors are defined earlier in dev.cpp, in the section marked Available Local Variables. There you will find names like m and m1 for matrices, v and v1 for integer vectors, and more including other types like bool, int, and Type. The Type type is particularly important for making the automatic differentiation provided by TMB work properly. Variable names must be defined in this Available Local Variables section, and not inside the case statement for each particular function.
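One plausible reason for this convention (an assumption; the text above does not give one) is that C++ does not allow a jump to a later case label to skip over an initialized declaration in the same scope. Declaring shared locals once, before the switch, sidesteps the problem. The sketch below uses hypothetical names throughout and plain doubles in place of TMB types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum toy_func { TOY_SUM = 1, TOY_COUNT = 2 }; // hypothetical function ids

inline double toy_eval(int func_id, const std::vector<double>& x) {
    // "Available local variables": declared once, before the switch.
    double total;
    int n;

    switch (func_id) {
        case TOY_SUM:
            // Writing `double total = 0.0;` here instead would make the jump
            // to `case TOY_COUNT` skip an initialization, which is ill-formed.
            total = 0.0;
            for (std::size_t i = 0; i < x.size(); ++i) total += x[i];
            return total;
        case TOY_COUNT:
            n = static_cast<int>(x.size());
            return static_cast<double>(n);
        default:
            return 0.0;
    }
}
```

Wrapping each case body in its own braces would also avoid the error, but a shared declaration block keeps the names uniform across all engine functions.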

Your function can also make use of the current time index, t, which increments as the simulation loop iterates.

Every function must return a matrix. Integer vectors are not allowed to be returned (although they can be modified, but that is another story). If your function is called for its side effect (e.g., MP2_PRINT), you should return an empty matrix.

Developer Installation on Windows

Developers using make on Windows could encounter the following compilation error.

Fatal error: can't write xxx bytes to section .text of macpan2.o: 'file too big' as: macpan2.o: too many sections

To resolve this, you may need to pass the -Wa,-mbig-obj compiler flag to GCC via the Makeconf file located in your R installation directory (typically C:/Program Files/R/[your version of R]/etc/x64).

Locate the CXXFLAGS macro in the Makeconf file. It will look something like the following.

CXXFLAGS = -O2 -Wall $(DEBUGFLAG) -mfpmath=sse -msse2 -mstackrealign $(LTO)

Append the -Wa,-mbig-obj flag to the end of this line and save the file. You will likely need to make this change using a Windows Administrator role.

CXXFLAGS = -O2 -Wall $(DEBUGFLAG) -mfpmath=sse -msse2 -mstackrealign $(LTO) -Wa,-mbig-obj

You should now be able to use make as described. Note this change might increase the compilation time (~2 min) as described here. It would be nice to be able to set this flag globally for all Windows developers. An attempt was made to update the Makefile with this additional line, CXXFLAGS := $(CXXFLAGS) -Wa,-mbig-obj, as suggested here, but it was not successful.

Test Suite

We use the testthat package. Tests are located here.

To run tests interactively (e.g., in RStudio), please run the following code once in your R session.

source("tests/testthat/setup.R")

This will:

Load packages that are assumed throughout the test suite.
Set options(macpan2_verbose = FALSE).
Generate a cache of objects that can be (and are) reused in different tests.

After running the setup, you can also get access to a few useful functions for managing the testing cache. Three examples of usage follow.

You can check where the cache got placed by running the following.

test_cache_dir()

Here is an example of reading in a simulated trajectory of the infection variable over five time steps from the library SIR model.

test_cache_read("TRAJ-sir_5_infection.rds")

To get a list of all objects in the cache, run the following.

test_cache_list()

Changelog Management

We use semi-automated construction of NEWS.md, which is updated using the following command.

make NEWS.md

This system generates and maintains version metadata and release notes for the package using three scripts located in misc/build. It produces commit-version-map.txt, version-bumps.txt, and NEWS.md in the project root. commit-version-map.txt records the version number, commit hash, and date for each commit on the main branch. version-bumps.txt extracts the most recent commit for each version from that map. NEWS.md combines these version bumps with optional developer-written content in news-narratives.md, adding GitHub compare links between versions. The scripts update these files incrementally for efficiency. See misc/build and the root-level files commit-version-map.txt, version-bumps.txt, and NEWS.md.

Please do not check in any of these generated txt files.
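The rule that version-bumps.txt keeps the most recent commit for each version can be sketched as follows. This is not the actual script in misc/build: the CommitEntry struct, the ISO-style date strings, and the version_bumps function are all assumptions made for illustration, and the real file layouts are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory form of one record in commit-version-map.txt.
struct CommitEntry {
    std::string version;
    std::string hash;
    std::string date; // assumed YYYY-MM-DD, so string order matches date order
};

// For each version, keep the entry with the latest date.
inline std::map<std::string, CommitEntry> version_bumps(
        const std::vector<CommitEntry>& commits) {
    std::map<std::string, CommitEntry> latest;
    for (std::size_t i = 0; i < commits.size(); ++i) {
        const CommitEntry& c = commits[i];
        std::map<std::string, CommitEntry>::iterator it = latest.find(c.version);
        if (it == latest.end()) {
            latest[c.version] = c;          // first commit seen for this version
        } else if (c.date > it->second.date) {
            it->second = c;                 // a more recent commit replaces it
        }
    }
    return latest;
}
```

Keying the result by version also makes an incremental update cheap in principle, since only versions touched by new commits need to change.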

Testing Installability of Specific Commits

To test whether a specific commit installs cleanly, use:

misc/build/test-install.sh <commit-hash>

For example, to test a version listed in version-bumps.txt:

misc/build/test-install.sh ded98a20184b9e382521472a8de90951a6cc3359

This script will:

Abort if you have any uncommitted changes (staged or unstaged).
Check out the given commit in detached head mode.
Run make quick-doc-install.
Restore any files that were modified during installation (e.g., roxygen2 tends to automatically update the DESCRIPTION file).
Return to your previous branch or commit.
Append (or update) the result (OK, FAIL, or CHECKOUT-FAIL) in install-tests.txt.

The install-tests.txt file contains one line per tested commit:

<commit-hash> <OK|FAIL|CHECKOUT-FAIL>

Please do not check in this generated install-tests.txt file.
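If you want to inspect the results programmatically, the one-line-per-commit format is simple to parse. The sketch below is not part of macpan2; read_install_tests is a hypothetical helper that takes the file contents as a string and keeps one status per commit hash.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parse lines of the form "<commit-hash> <OK|FAIL|CHECKOUT-FAIL>".
// If a hash appears more than once, the later line wins.
inline std::map<std::string, std::string> read_install_tests(
        const std::string& text) {
    std::map<std::string, std::string> results;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string hash, status;
        if (!(fields >> hash >> status)) continue; // blank or one-field line
        if (status == "OK" || status == "FAIL" || status == "CHECKOUT-FAIL")
            results[hash] = status;                // ignore unknown statuses
    }
    return results;
}
```

Storing results in a map keyed by hash mirrors the append-or-update behaviour described above: recording a commit a second time replaces its earlier status rather than adding a duplicate.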

Log Files

Every simulation generates or overwrites a log file. The default location is described here. The path of the log file is created when the simulator is created. So if the simulator is saved to a file (e.g., stored with saveRDS), there is a risk that when it is reloaded the path to the log file will no longer exist. If this happens macpan2 will try to recreate it, but this might fail. If the log file path is not valid for any of these reasons, the log file will be written to .macpan2/bail-out/log.txt in the current working directory. Log files are used internally by macpan2 when producing error messages that originate within an engine function. This mechanism of getting messages from C++ to R is not ideal, but provides a workaround for the limitation that TMB cannot report back strings (I would welcome being wrong so that we could simplify this part of the code).
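The fallback rule described above can be pictured with a short sketch. This is not macpan2's actual logic (which also tries to recreate the original path first); choose_log_path is a hypothetical helper, and only the bail-out location is taken from the text.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Fallback location named above, relative to the current working directory.
const char* const BAIL_OUT_LOG = ".macpan2/bail-out/log.txt";

// Hypothetical helper: return `preferred` if it can be opened for appending,
// and the bail-out path otherwise.
// Note: probing in append mode creates the file if it does not exist yet.
inline std::string choose_log_path(const std::string& preferred) {
    assert(!preferred.empty());
    std::ofstream probe(preferred.c_str(), std::ios::app);
    return probe.is_open() ? preferred : std::string(BAIL_OUT_LOG);
}
```

A path inside a directory that no longer exists, as can happen after reloading a saved simulator on another machine, fails the probe and falls through to the bail-out location.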
