Obtain period midpoints and average daily rates for count data

Usage

period_averager(
  data,
  count_col = "cases_this_period",
  start_col = "period_start_date",
  end_col = "period_end_date",
  norm_col = NULL,
  norm_const = 1e+05,
  keep_raw = TRUE,
  keep_cols = names(data)
)

Arguments

data

Data frame in which each row contains, at minimum, a period start date, a period end date, and a count variable.

count_col

Character, name of count data column.

start_col

Character, name of start date column.

end_col

Character, name of end date column.

norm_col

Character, name of column giving data for normalization. A good option is often population_reporting, which is a column in many datasets containing the total size of the reference population for the count data. To avoid normalization set norm_col to NULL, which is the default.

norm_const

Numeric value for multiplying the daily_rate column if a norm_col is supplied. By default this is 1e5, which corresponds to daily_rate having units of count per day per 100,000 individuals if the norm_col represents the reference population size.

keep_raw

Logical value indicating whether to force all *_col columns into the output, even if they are not listed in keep_cols, and to place them first in the column order. The default is TRUE.

keep_cols

Character vector containing the names of columns in the input data to retain in the output. All columns are retained by default.
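
The interaction between keep_raw and keep_cols can be mimicked in base R. The following is only a sketch and not the package's implementation: select_cols is a hypothetical helper, and the order of the raw columns (start, end, count) is an assumption taken from the example output on this page.

```r
# Hypothetical helper for illustration only; not part of the package.
select_cols <- function(data, raw_cols, keep_cols = names(data), keep_raw = TRUE) {
  # With keep_raw = TRUE the raw columns come first, followed by any
  # columns in keep_cols that are not already listed among them.
  cols <- if (keep_raw) union(raw_cols, keep_cols) else keep_cols
  data[, cols, drop = FALSE]
}

toy <- data.frame(
  disease = "senioritis",
  period_start_date = as.Date("2023-04-03"),
  period_end_date = as.Date("2023-04-09"),
  cases_this_period = 61,
  location = "college"
)

# Assumed ordering of the raw columns (start, end, count).
raw <- c("period_start_date", "period_end_date", "cases_this_period")
selected <- select_cols(toy, raw, keep_cols = c("disease", "location"))
names(selected)
```

Using union() keeps the raw columns at the front and avoids duplicates when a raw column is also named in keep_cols.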

Value

Data frame containing the following fields.

  • Columns from the original dataset specified using keep_raw and keep_cols.

  • year : Year of the period_start_date.

  • num_days : Length of the period in days from the beginning of the period_start_date to the end of the period_end_date.

  • period_mid_time : Timestamp of the middle of the period.

  • period_mid_date : Date containing the period_mid_time.

  • daily_rate : Daily count rate, which by default is given by daily_rate = count_col / num_days. If norm_col is specified, then daily_rate = norm_const * count_col / num_days / norm_col. In these formulas, norm_const is a numeric constant, num_days is a derived numeric column, and count_col and norm_col are columns supplied in the input data.
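
The derived fields above can be reproduced by hand for a single period, which may help when checking results. This is a minimal base R sketch under stated assumptions, not the package's code: manual_period_summary is a hypothetical helper, and the use of UTC for the midpoint timestamp is an assumption, since the time zone is not specified here.

```r
# Hypothetical helper for illustration only; not part of the package.
manual_period_summary <- function(count, start, end, norm = NULL, norm_const = 1e5) {
  # Both endpoint days are counted: the period runs from the beginning of
  # `start` to the end of `end`.
  num_days <- as.numeric(difftime(end, start, units = "days")) + 1

  # Midpoint = midnight at the start of `start` plus half the period length.
  # UTC is an assumption made here to avoid daylight-saving shifts.
  start_time <- as.POSIXct(format(start), tz = "UTC")
  period_mid_time <- start_time + num_days / 2 * 86400

  # Unnormalized rate, optionally rescaled by the reference population.
  daily_rate <- count / num_days
  if (!is.null(norm)) daily_rate <- norm_const * daily_rate / norm

  data.frame(
    year = as.integer(format(start, "%Y")),
    num_days = num_days,
    period_mid_time = period_mid_time,
    period_mid_date = as.Date(period_mid_time, tz = "UTC"),
    daily_rate = daily_rate
  )
}

# One week with 61 cases, with and without normalization by a
# made-up reference population of 50,000.
res <- manual_period_summary(61, as.Date("2023-04-03"), as.Date("2023-04-09"))
res_norm <- manual_period_summary(61, as.Date("2023-04-03"), as.Date("2023-04-09"),
                                  norm = 50000)
```

For the unnormalized case, the period has 7 days and the rate is 61 / 7. With the made-up population of 50,000 and the default norm_const, the rate becomes 1e5 * 61 / 7 / 50000, that is, cases per day per 100,000 individuals.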

Examples

set.seed(666)
data <- data.frame(disease = "senioritis"
 , period_start_date = seq(as.Date("2023-04-03"), as.Date("2023-06-05"), by = 7)
 , period_end_date = seq(as.Date("2023-04-09"), as.Date("2023-06-11"), by = 7)
 , cases_this_period = sample(0:100, 10, replace = TRUE)
 , location = "college"
)

period_averager(data, keep_raw = TRUE, keep_cols = c("disease", "location"))
#>    period_start_date period_end_date cases_this_period    disease location year
#> 1         2023-04-03      2023-04-09                61 senioritis  college 2023
#> 2         2023-04-10      2023-04-16                95 senioritis  college 2023
#> 3         2023-04-17      2023-04-23                10 senioritis  college 2023
#> 4         2023-04-24      2023-04-30                27 senioritis  college 2023
#> 5         2023-05-01      2023-05-07                13 senioritis  college 2023
#> 6         2023-05-08      2023-05-14                 4 senioritis  college 2023
#> 7         2023-05-15      2023-05-21                11 senioritis  college 2023
#> 8         2023-05-22      2023-05-28                32 senioritis  college 2023
#> 9         2023-05-29      2023-06-04                49 senioritis  college 2023
#> 10        2023-06-05      2023-06-11                 2 senioritis  college 2023
#>    num_days     period_mid_time period_mid_date daily_rate
#> 1         7 2023-04-06 12:00:00      2023-04-06  8.7142857
#> 2         7 2023-04-13 12:00:00      2023-04-13 13.5714286
#> 3         7 2023-04-20 12:00:00      2023-04-20  1.4285714
#> 4         7 2023-04-27 12:00:00      2023-04-27  3.8571429
#> 5         7 2023-05-04 12:00:00      2023-05-04  1.8571429
#> 6         7 2023-05-11 12:00:00      2023-05-11  0.5714286
#> 7         7 2023-05-18 12:00:00      2023-05-18  1.5714286
#> 8         7 2023-05-25 12:00:00      2023-05-25  4.5714286
#> 9         7 2023-06-01 12:00:00      2023-06-01  7.0000000
#> 10        7 2023-06-08 12:00:00      2023-06-08  0.2857143