Compute observation periods with a specified event density
get_observation_periods.Rd
For each patient, get_observation_periods() computes the time periods during which the patient meets a specified density of given healthcare events.
Usage
get_observation_periods(
  table,
  dates,
  window_size,
  min_utilization,
  min_event_dates_per_period = 2,
  min_period_length = 30
)
Arguments
- table
A table of events which includes a person_id column and at least one date column. Often this will be a table from a PicnicHealth data set such as visit, cohort_measurement_occurrence, measurement_occurrence, cohort_procedure_occurrence, procedure_occurrence, cohort_drug_era, document, etc.; these can be filtered for specific entities before passing to this function. Only one row per patient per event start date will be kept (see Details).
- dates
A character vector of length 1 or 2 containing the name(s) of the date variable(s) for the events in table. If two variable names are passed, the first must give the start of each event and the second must give the end of each event. End dates may be NA. If only one date column is specified, events are assumed to begin and end on the same day.
- window_size
Integer number of days in rolling window.
- min_utilization
For a given date to appear in an observation period, a patient must have at least this many events within window_size/2 days of the date. In other words, an observation period is a period of time over which a patient has at least min_utilization events every window_size days.
- min_event_dates_per_period
Minimum number of distinct days with events that must have occurred during a candidate period. Default is 2; changing this value is not generally necessary.
- min_period_length
Minimum period length in days, inclusive of period start and end dates.
Value
A tibble with the following columns:
person_id
same as the person_id from the table argument
period_start
first day of the observation period
period_end
last day of the observation period
period_length
length of the period in days, inclusive of the period start and end dates
n_event_dates
number of distinct days in the period on which the patient had events
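As a small illustration of the inclusive period_length convention described above, the following base R sketch computes the length of a single period. The two dates are made up for the example and do not come from any data set.

```r
# Hypothetical period bounds, for illustration only
period_start <- as.Date("2020-01-01")
period_end   <- as.Date("2020-01-30")

# Inclusive length: both the start day and the end day are counted,
# so one day is added to the plain difference between the dates
period_length <- as.integer(period_end - period_start) + 1L
period_length
```

Under this convention a period that starts and ends on the same day has a length of 1.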
Details
This function takes a table of events, a specification of which of its columns are dates, and a window_size as input.
First, duplicate events on the same day are removed: the function keeps only one event per start date per patient (if end dates are provided, the latest end date is kept). From this point forward, a "count of events" for a patient is really the count of distinct event start dates for the patient.
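The de-duplication step can be sketched in base R as follows. This is only an illustration of the rule as stated, not the package's actual implementation; the events data frame and the column names start_date and end_date are assumptions made for the example.

```r
# Hypothetical events table (names and values are illustrative assumptions)
events <- data.frame(
  person_id  = c(1, 1, 1, 2),
  start_date = as.Date(c("2020-01-01", "2020-01-01", "2020-02-15", "2020-03-01")),
  end_date   = as.Date(c("2020-01-02", "2020-01-05", NA, "2020-03-01"))
)

# Sort by patient and start date, with the latest end date first within each
# patient/start-date group (negating the numeric date sorts it descending)
ord <- order(events$person_id, events$start_date, -as.numeric(events$end_date))
sorted <- events[ord, ]

# Keep the first row of each patient/start-date group, i.e. the latest end date
deduped <- sorted[!duplicated(sorted[c("person_id", "start_date")]), ]
deduped
```

Here patient 1 has two events starting on 2020-01-01; only the one ending on 2020-01-05 is kept, leaving three rows in total.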
For each patient, the function then computes the time periods when their "healthcare utilization" is at least min_utilization. A patient's healthcare utilization on date x is defined as the number of distinct days on which they had an event start date within a window of window_size days centered on x (i.e. window_size/2 days before x and window_size/2 days after x, inclusive).
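The utilization definition above can be illustrated for one patient and one date with a short base R sketch. The helper utilization() is hypothetical and written only for this example; in particular, how the package handles an odd window_size is not stated here, so the sketch simply uses window_size / 2 as the half-width.

```r
# Hypothetical helper: number of distinct event start dates within
# window_size / 2 days of date x (inclusive on both sides)
utilization <- function(event_dates, x, window_size) {
  half <- window_size / 2
  event_dates <- unique(event_dates)  # count distinct days only
  sum(abs(as.numeric(event_dates - x)) <= half)
}

# Made-up event start dates for a single patient (one date is duplicated)
event_dates <- as.Date(c("2020-01-01", "2020-01-01", "2020-02-15", "2020-06-01"))

utilization(event_dates, as.Date("2020-01-20"), window_size = 120)
```

With a 120-day window centered on 2020-01-20, the distinct dates 2020-01-01 and 2020-02-15 fall inside the window and 2020-06-01 does not, so the utilization on that date is 2.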
Next, periods with fewer than min_event_dates_per_period events are removed.
Furthermore, periods are clipped to the first and last events in them: the
extra days before the first event and after the last event are stripped, such
that the first (respectively, last) date of the period is the date of the
first (resp., last) event within it.
Finally, periods shorter than min_period_length days are removed.
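The three post-processing steps (minimum event dates, clipping, minimum length) can be sketched in base R for a single patient. The function postprocess() and the candidates data frame are hypothetical and exist only for this illustration; the package may organize these steps differently internally.

```r
# Made-up event start dates for one patient
event_dates <- as.Date(c("2020-01-10", "2020-02-20", "2020-03-05", "2020-09-01"))

# Hypothetical candidate periods, as if produced by the utilization step
candidates <- data.frame(
  start = as.Date(c("2020-01-01", "2020-08-15")),
  end   = as.Date(c("2020-04-01", "2020-09-20"))
)

postprocess <- function(candidates, event_dates,
                        min_event_dates_per_period = 2,
                        min_period_length = 30) {
  event_dates <- sort(unique(event_dates))
  rows <- lapply(seq_len(nrow(candidates)), function(i) {
    inside <- event_dates[event_dates >= candidates$start[i] &
                            event_dates <= candidates$end[i]]
    # Step 1: drop periods with too few distinct event dates
    if (length(inside) < min_event_dates_per_period) return(NULL)
    # Step 2: clip the period to its first and last event dates
    out <- data.frame(
      period_start  = min(inside),
      period_end    = max(inside),
      n_event_dates = length(inside)
    )
    out$period_length <- as.integer(out$period_end - out$period_start) + 1L
    # Step 3: drop periods shorter than the minimum length (inclusive days)
    if (out$period_length < min_period_length) return(NULL)
    out
  })
  do.call(rbind, rows)  # NULL entries (dropped periods) are ignored
}

result <- postprocess(candidates, event_dates)
result
```

In this example the second candidate contains a single event date and is dropped, while the first is clipped to 2020-01-10 through 2020-03-05, a period of 56 days with three event dates.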
For more information and examples, refer to vignette("observation_periods").
Examples
if (FALSE) { # \dontrun{
# Assumption: %>% and filter() are taken from dplyr
library(dplyr)

# Periods when patients had at least one visit every 120 days
get_observation_periods(
  table = ds$visit,
  dates = c("visit_start_date", "visit_end_date"),
  window_size = 120,
  min_utilization = 1
)

# Periods when patients had at least three hemoglobin measurements every 365 days
ds$lab_result %>%
  # use the concept_id corresponding to hemoglobin
  filter(measurement_concept_id == "29124cfe-a4a8-5939-8d36-3f43f0017600") %>%
  get_observation_periods(
    dates = "collection_date",
    window_size = 365,
    min_utilization = 3
  )
} # }