Minimal checklist

Preprocessing Experience Sampling Method (ESM) data can be complex, particularly for newcomers to the field. To assist researchers, we have compiled a list of minimal checks to perform and to document in a preprocessing report. This list serves as a foundation for ensuring data quality and consistency. It is a starting point rather than an exhaustive guide, as most ESM datasets require additional checks.

List of essential preprocessing steps

Step 1: Import data and preliminary preprocessing

Import and merge data sources.
Rename, relabel and reformat variables.
Check identification variables: one id value per participant, no missing values.
Check for duplicated observations.
Missing values coding: all missing values must be coded as ‘NA’.
Look at the distribution of missing values in the dataframe.
Look for observations with inconsistent patterns of missing values (the ‘vis_miss()’ function from the ‘visdat’ package can be a starting point).
Check coherence between variable type and values (e.g., range of values) and between variable type and missing values (e.g., no missing values expected in participant-level variables).
Compute the ‘valid’ variable to flag invalid observations (see the “Flag (in)valid observations” topic).
Compute time-related variables (e.g., observation number, beep number) used for later checks/analysis.
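
Several of the Step 1 checks can be sketched in a few lines. The example below is a minimal illustration in Python (standard library only), not a prescribed implementation. The column names ('id', 'day', 'beep', 'mood'), the missing-value codes (-99 and an empty string), the 1-7 item range and the validity rule are all assumptions made for the example; 'None' plays the role of ‘NA’.

```python
from collections import Counter

# Hypothetical raw export: one dict per observation, already sorted by time
# within participant. -99 and "" stand in for the export's missing-value codes.
rows = [
    {"id": "p1", "day": 1, "beep": 1, "mood": 4},
    {"id": "p1", "day": 1, "beep": 2, "mood": -99},
    {"id": "p1", "day": 1, "beep": 2, "mood": -99},  # duplicated observation
    {"id": "p2", "day": 1, "beep": 1, "mood": ""},
    {"id": "p2", "day": 1, "beep": 2, "mood": 9},    # outside the 1-7 range
]
MISSING_CODES = {-99, ""}
MOOD_RANGE = (1, 7)  # assumed response scale

# Identification variable: list rows whose id is missing.
rows_without_id = [r for r in rows if r["id"] in (None, "")]

# Duplicated observations: same participant, day and beep.
key_counts = Counter((r["id"], r["day"], r["beep"]) for r in rows)
duplicated_keys = [key for key, n in key_counts.items() if n > 1]

# Keep the first occurrence of each observation.
seen = set()
clean = []
for r in rows:
    key = (r["id"], r["day"], r["beep"])
    if key not in seen:
        seen.add(key)
        clean.append(dict(r))

obs_counter = Counter()
for r in clean:
    # Recode every missing-value code to a single marker (None stands for NA).
    if r["mood"] in MISSING_CODES:
        r["mood"] = None
    # Observation number within participant, used for later checks.
    obs_counter[r["id"]] += 1
    r["obs_n"] = obs_counter[r["id"]]
    # Assumed validity rule: the item is answered and within its range.
    r["valid"] = r["mood"] is not None and MOOD_RANGE[0] <= r["mood"] <= MOOD_RANGE[1]

# Distribution of missing values per participant.
missing_per_id = Counter(r["id"] for r in clean if r["mood"] is None)

print(duplicated_keys, dict(missing_per_id))
```

In a real study the validity rule would follow the criteria chosen for the ‘valid’ variable rather than the single-item rule used here.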

Step 2: Design and sampling scheme

Check sampling scheme (e.g., number of days, number of beeps per day).
Time of the day the beeps were sent.
Number of beeps sent per participant.
Delay between two consecutive beeps.
If missed (unanswered) beeps are not recorded, consider filling them in as rows of missing observations.
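
As an illustration of the Step 2 checks, the sketch below counts the beeps recorded per participant, computes the delay between consecutive beeps within a day, and fills in unrecorded beeps with placeholder rows. It assumes a fixed design (here 1 day with 3 beeps per day) and hypothetical column names ('id', 'day', 'beep', 'sent'); designs with a variable number of beeps would need a different way of building the expected grid.

```python
from collections import Counter, defaultdict
from datetime import datetime

N_DAYS = 1         # hypothetical design: 1 day ...
BEEPS_PER_DAY = 3  # ... with 3 beeps per day
FMT = "%Y-%m-%d %H:%M"

# Hypothetical export in which p1's missed beep 2 has no row at all.
records = [
    {"id": "p1", "day": 1, "beep": 1, "sent": "2024-01-08 09:00"},
    {"id": "p1", "day": 1, "beep": 3, "sent": "2024-01-08 15:30"},
    {"id": "p2", "day": 1, "beep": 1, "sent": "2024-01-08 09:10"},
    {"id": "p2", "day": 1, "beep": 2, "sent": "2024-01-08 12:10"},
    {"id": "p2", "day": 1, "beep": 3, "sent": "2024-01-08 16:40"},
]

# Number of beeps recorded per participant.
n_beeps = Counter(r["id"] for r in records)

# Delay (in minutes) between consecutive beeps, within participant and day
# so that overnight gaps are not counted as delays.
times = defaultdict(list)
for r in records:
    times[(r["id"], r["day"])].append(datetime.strptime(r["sent"], FMT))

delays_min = {}
for key, stamps in times.items():
    stamps.sort()
    delays_min[key] = [(b - a).total_seconds() / 60 for a, b in zip(stamps, stamps[1:])]

# Fill in unrecorded beeps: build the full id x day x beep grid and insert a
# placeholder row (sent = None) wherever the export has no observation.
existing = {(r["id"], r["day"], r["beep"]): r for r in records}
full = []
for pid in sorted(n_beeps):
    for day in range(1, N_DAYS + 1):
        for beep in range(1, BEEPS_PER_DAY + 1):
            placeholder = {"id": pid, "day": day, "beep": beep, "sent": None}
            full.append(existing.get((pid, day, beep), placeholder))

print(dict(n_beeps), len(full))
```

After filling, every participant has the same number of rows, which makes later computations such as compliance rates and lagged variables easier to get right.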

Step 3: Participants’ response behaviors

Time of the day the beeps were started and ended.
Delay with which participants started answering beeps.
Delay between two consecutive valid beeps.
Response rates (e.g., number of beeps answered per day).
Compliance rates.
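
The response-behavior checks in Step 3 can be sketched as follows. The example assumes hypothetical columns 'sent' (time the beep was sent), 'start' (time the participant opened it, missing if unanswered) and 'valid'. It also assumes one common definition of compliance, the number of valid beeps divided by the number of beeps sent; other definitions exist, and the one used should be reported.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"


def minutes_between(start, end):
    """Minutes elapsed between two timestamp strings."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


# Hypothetical data, sorted by time within participant.
beeps = [
    {"id": "p1", "sent": "2024-01-08 09:00", "start": "2024-01-08 09:04", "valid": True},
    {"id": "p1", "sent": "2024-01-08 12:00", "start": None, "valid": False},
    {"id": "p1", "sent": "2024-01-08 15:00", "start": "2024-01-08 15:12", "valid": True},
    {"id": "p2", "sent": "2024-01-08 09:00", "start": "2024-01-08 09:01", "valid": True},
    {"id": "p2", "sent": "2024-01-08 12:00", "start": None, "valid": False},
    {"id": "p2", "sent": "2024-01-08 15:00", "start": None, "valid": False},
]

start_delay = defaultdict(list)  # minutes between sending and starting
valid_gap = defaultdict(list)    # minutes between consecutive valid beeps
n_sent = defaultdict(int)
n_valid = defaultdict(int)
last_valid_start = {}

for b in beeps:
    pid = b["id"]
    n_sent[pid] += 1
    if b["start"] is not None:
        start_delay[pid].append(minutes_between(b["sent"], b["start"]))
    if b["valid"]:
        n_valid[pid] += 1
        # Gap measured between start times (an assumption of this sketch).
        if pid in last_valid_start:
            valid_gap[pid].append(minutes_between(last_valid_start[pid], b["start"]))
        last_valid_start[pid] = b["start"]

# Assumed definition: compliance = valid beeps / beeps sent.
compliance = {pid: n_valid[pid] / n_sent[pid] for pid in n_sent}

for pid, rate in compliance.items():
    print(pid, f"{rate:.2f}", start_delay[pid])
```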

Step 4: Compute and transform variables

Compute scores/variables of interest.
Check computed variables (e.g., range of values, missing values, etc.).
Psychometric properties (if applicable).
Consider centering/lagging variables of interest.
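
A minimal sketch of Step 4, assuming a hypothetical two-item scale ('item1', 'item2') scored as the mean of its items. It shows person-mean centering and a lag-1 variable. The lag is only filled when the previous row is the immediately preceding beep of the same participant on the same day, which is a frequent but not universal choice; whether lags may cross the night is a design decision.

```python
from statistics import mean

# Hypothetical data, sorted by participant, day and beep.
rows = [
    {"id": "p1", "day": 1, "beep": 1, "item1": 2, "item2": 4},
    {"id": "p1", "day": 1, "beep": 2, "item1": 4, "item2": 6},
    {"id": "p1", "day": 2, "beep": 1, "item1": None, "item2": 5},
    {"id": "p2", "day": 1, "beep": 1, "item1": 1, "item2": 1},
    {"id": "p2", "day": 1, "beep": 2, "item1": 3, "item2": 3},
]
ITEMS = ("item1", "item2")

# Scale score: mean of the items, missing if any item is missing.
for r in rows:
    values = [r[item] for item in ITEMS]
    r["score"] = None if None in values else mean(values)

# Person means computed from the non-missing scores.
scores_by_id = {}
for r in rows:
    if r["score"] is not None:
        scores_by_id.setdefault(r["id"], []).append(r["score"])
person_mean = {pid: mean(scores) for pid, scores in scores_by_id.items()}

# Person-mean centering: deviation from the participant's own mean.
for r in rows:
    r["score_pmc"] = None if r["score"] is None else r["score"] - person_mean[r["id"]]

# Lag-1 score: value at the previous beep, same participant and same day only.
prev = None
for r in rows:
    is_next_beep = (
        prev is not None
        and prev["id"] == r["id"]
        and prev["day"] == r["day"]
        and prev["beep"] == r["beep"] - 1
    )
    r["score_lag1"] = prev["score"] if is_next_beep else None
    prev = r

for r in rows:
    print(r["id"], r["day"], r["beep"], r["score"], r["score_pmc"], r["score_lag1"])
```

The lag check on consecutive beep numbers only works as intended when missed beeps are present as rows; otherwise a lag can silently skip over a missed beep.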

Step 5: Descriptive statistics and visualization

Descriptive statistics of variables of interest.
Distribution across all persons and at person-level (e.g., skewed, bimodal) of variables of interest.
Check some participants’ time series for trends, variance, etc.
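
Numerical summaries can complement the plots in Step 5. The sketch below computes an overall mean, person-level descriptives and a least-squares slope over observation number as a crude trend indicator, and lists participants whose scores never vary. The data and names are hypothetical, each participant is assumed to have at least two observations, and the slope is not a substitute for inspecting the time series visually.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical long-format data: (participant, observation number, score).
obs = [
    ("p1", 1, 3), ("p1", 2, 4), ("p1", 3, 5), ("p1", 4, 6),
    ("p2", 1, 4), ("p2", 2, 4), ("p2", 3, 4), ("p2", 4, 4),
]


def slope(points):
    """Least-squares slope of score on observation number (crude trend check)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    numerator = sum((x - mx) * (y - my) for x, y in points)
    denominator = sum((x - mx) ** 2 for x in xs)
    return numerator / denominator


# Descriptives across all persons.
overall_mean = mean([score for _, _, score in obs])

# Person-level descriptives.
series = defaultdict(list)
for pid, obs_n, score in obs:
    series[pid].append((obs_n, score))

summary = {}
for pid, points in series.items():
    ys = [y for _, y in points]
    summary[pid] = {
        "mean": mean(ys),
        "sd": stdev(ys),
        "min": min(ys),
        "max": max(ys),
        "slope": slope(points),
    }

# Participants with no within-person variance deserve a closer look.
no_variance = [pid for pid, s in summary.items() if s["sd"] == 0]

print(overall_mean, no_variance)
```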