Obtain period midpoints and average daily rates for count data

Usage

period_averager(
  data,
  count_col = "cases_this_period",
  start_col = "period_start_date",
  end_col = "period_end_date",
  norm_col = NULL,
  norm_const = 1e+05,
  keep_raw = TRUE,
  keep_cols = names(data)
)

Arguments

data: Data frame in which each row contains, at minimum, a period start date, a period end date, and a count.
count_col: Character, name of count data column.
start_col: Character, name of start date column.
end_col: Character, name of end date column.
norm_col: Character, name of column giving data for normalization. A good option is often population_reporting, which is a column in many datasets containing the total size of the reference population for the count data. To avoid normalization set norm_col to NULL, which is the default.
norm_const: Numeric value for multiplying the daily_rate column if a norm_col is supplied. By default this is 1e5, which corresponds to daily_rate having units of count per day per 100,000 individuals if the norm_col represents the reference population size.
keep_raw: Logical value indicating whether to include all *_col columns in the output, even if they are not listed in keep_cols, and to place them first in the column order. The default is TRUE.
keep_cols: Character vector containing the names of columns in the input data to retain in the output. All columns are retained by default.
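
The interaction between keep_raw and keep_cols can be sketched in base R. This is only an illustration of the behaviour described above, not the package's actual implementation: the helper name select_input_cols is hypothetical, and the ordering of the raw columns (start, end, count, then norm) is an assumption inferred from the example output on this page.

```r
# Hypothetical helper (not part of the package): chooses which input columns
# to carry into the output, following the keep_raw / keep_cols descriptions.
select_input_cols <- function(data, count_col, start_col, end_col,
                              norm_col = NULL, keep_raw = TRUE,
                              keep_cols = names(data)) {
  # A NULL norm_col simply drops out of c().
  raw_cols <- c(start_col, end_col, count_col, norm_col)
  # union() keeps the raw columns first and appends any new keep_cols entries.
  cols <- if (keep_raw) union(raw_cols, keep_cols) else keep_cols
  data[, cols, drop = FALSE]
}

toy <- data.frame(
  disease = "senioritis",
  period_start_date = as.Date("2023-04-03"),
  period_end_date = as.Date("2023-04-09"),
  cases_this_period = 61,
  location = "college"
)

kept <- select_input_cols(
  toy,
  count_col = "cases_this_period",
  start_col = "period_start_date",
  end_col = "period_end_date",
  keep_cols = c("disease", "location")
)
names(kept)
```

With keep_raw = FALSE the same call would return only the disease and location columns.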

Value

Data frame containing the following fields.

Columns from the original dataset specified using keep_raw and keep_cols.
year : Year of the period_start_date.
num_days : Length of the period in days from the beginning of the period_start_date to the end of the period_end_date.
period_mid_time : Timestamp of the middle of the period.
period_mid_date : Date containing the period_mid_time.
daily_rate : Daily count rate. By default, daily_rate = count_col / num_days. If norm_col is specified, daily_rate = norm_const * count_col / num_days / norm_col. In these formulas, norm_const is a numeric constant, num_days is a derived numeric column, and count_col and norm_col stand for the input data columns named by those arguments.
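
The derived fields can be reproduced by hand for a single period. The sketch below uses the first row of the example data on this page and plain base R; it is a worked illustration of the definitions above rather than the function's internal code. The population value of 50000 is hypothetical, and the use of UTC for the midpoint is an assumption made to keep the arithmetic free of daylight saving shifts.

```r
# First period of the example data on this page.
start_date <- as.Date("2023-04-03")
end_date   <- as.Date("2023-04-09")
count      <- 61

# Inclusive length: from the beginning of the start date to the end of the end date.
num_days <- as.numeric(end_date - start_date) + 1

# Midpoint: half the period length after midnight on the start date.
# UTC is an assumption made for this sketch.
period_mid_time <- as.POSIXct(format(start_date), tz = "UTC") + num_days / 2 * 86400
period_mid_date <- as.Date(period_mid_time)

# Unnormalized rate: count per day.
daily_rate <- count / num_days

# Normalized rate: count per day per 100,000 individuals.
population <- 50000   # hypothetical reference population size
norm_const <- 1e5
daily_rate_per_100k <- norm_const * count / num_days / population

num_days    # 7
daily_rate  # 8.714286
```

The unnormalized value agrees with the first daily_rate entry in the example output, and the midpoint falls at noon on 2023-04-06.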

Examples

set.seed(666)
# Ten consecutive weekly periods (Monday to Sunday) with random counts
data <- data.frame(
  disease = "senioritis",
  period_start_date = seq(as.Date("2023-04-03"), as.Date("2023-06-05"), by = 7),
  period_end_date = seq(as.Date("2023-04-09"), as.Date("2023-06-11"), by = 7),
  cases_this_period = sample(0:100, 10, replace = TRUE),
  location = "college"
)

period_averager(data, keep_raw = TRUE, keep_cols = c("disease", "location"))
#>    period_start_date period_end_date cases_this_period    disease location year
#> 1         2023-04-03      2023-04-09                61 senioritis  college 2023
#> 2         2023-04-10      2023-04-16                95 senioritis  college 2023
#> 3         2023-04-17      2023-04-23                10 senioritis  college 2023
#> 4         2023-04-24      2023-04-30                27 senioritis  college 2023
#> 5         2023-05-01      2023-05-07                13 senioritis  college 2023
#> 6         2023-05-08      2023-05-14                 4 senioritis  college 2023
#> 7         2023-05-15      2023-05-21                11 senioritis  college 2023
#> 8         2023-05-22      2023-05-28                32 senioritis  college 2023
#> 9         2023-05-29      2023-06-04                49 senioritis  college 2023
#> 10        2023-06-05      2023-06-11                 2 senioritis  college 2023
#>    num_days     period_mid_time period_mid_date daily_rate
#> 1         7 2023-04-06 12:00:00      2023-04-06  8.7142857
#> 2         7 2023-04-13 12:00:00      2023-04-13 13.5714286
#> 3         7 2023-04-20 12:00:00      2023-04-20  1.4285714
#> 4         7 2023-04-27 12:00:00      2023-04-27  3.8571429
#> 5         7 2023-05-04 12:00:00      2023-05-04  1.8571429
#> 6         7 2023-05-11 12:00:00      2023-05-11  0.5714286
#> 7         7 2023-05-18 12:00:00      2023-05-18  1.5714286
#> 8         7 2023-05-25 12:00:00      2023-05-25  4.5714286
#> 9         7 2023-06-01 12:00:00      2023-06-01  7.0000000
#> 10        7 2023-06-08 12:00:00      2023-06-08  0.2857143