Minimal checklist
Preprocessing experience sampling method (ESM) data can be complex, particularly for newcomers to the field. To assist researchers, we have compiled a list of minimal checks to perform and to document in a preprocessing report. This list provides a foundation for ensuring data quality and consistency. It is a starting point rather than an exhaustive guide, as most ESM datasets require additional checks.
List of essential preprocessing steps
Step 1: Import data and preliminary preprocessing
- Import and merge data sources.
- Rename, relabel and reformat variables.
- Check identification variables: one id value per participant, no missing values.
- Check for duplicated observations.
- Missing values coding: all missing values must be coded as ‘NA’.
- Look at the distribution of missing values in the dataframe.
- Look for observations with inconsistent patterns of missing values (the ‘vis_miss()’ function from the ‘visdat’ package can be a starting point).
- Check coherence between variable type and values (e.g., range of values) and between variable type and missing values (e.g., no missing values expected in participant-level variables).
- Compute the ‘valid’ variable to flag invalid observations (see the “Flag (in)valid observations” topic).
- Compute time-related variables (e.g., observation number, beep number) used for later checks/analysis.
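Several of the Step 1 checks can be sketched in a few lines. The example below is an illustration in plain Python, not a reference implementation: the column names (‘id’, ‘sent’, ‘mood’), the toy data, and the set of missing-value codes are all assumptions, and Python's None stands in for ‘NA’. Keeping the first of two duplicated rows is also an assumption; duplicates should be inspected before any are dropped.

```python
from collections import Counter

# Hypothetical toy data: one dict per observation; column names are assumptions.
rows = [
    {"id": "p1", "sent": "2024-01-01 09:00", "mood": "4"},
    {"id": "p1", "sent": "2024-01-01 12:00", "mood": "-99"},
    {"id": "p1", "sent": "2024-01-01 12:00", "mood": "-99"},  # duplicated row
    {"id": "p2", "sent": "2024-01-01 09:05", "mood": ""},
    {"id": "",   "sent": "2024-01-01 09:10", "mood": "3"},    # missing id
]

# Assumed missing-value codes; the real ones depend on the ESM software used.
MISSING_CODES = {"", "NA", "-99"}

def recode_missing(row):
    """Return a copy of the row with every missing code replaced by None."""
    return {k: (None if v in MISSING_CODES else v) for k, v in row.items()}

clean = [recode_missing(r) for r in rows]

# Identification variable: positions of rows without an id.
missing_id = [i for i, r in enumerate(clean) if r["id"] is None]

# Duplicated observations: same participant and same sent time.
key_counts = Counter((r["id"], r["sent"]) for r in clean)
duplicated_keys = [key for key, n in key_counts.items() if n > 1]

# Share of missing values per variable.
missing_share = {
    col: sum(r[col] is None for r in clean) / len(clean) for col in clean[0]
}

# Keep the first occurrence of each (id, sent) pair.
deduped, seen_keys = [], set()
for r in clean:
    key = (r["id"], r["sent"])
    if key not in seen_keys:
        seen_keys.add(key)
        deduped.append(r)

# Observation number within participant (rows assumed sorted by sent time).
obs_counter = Counter()
for r in deduped:
    obs_counter[r["id"]] += 1
    r["obs_n"] = obs_counter[r["id"]]
```

Rows flagged in ‘missing_id’ or ‘duplicated_keys’ are candidates for manual inspection rather than automatic removal.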
Step 2: Design and sampling scheme
- Check sampling scheme (e.g., number of days, number of beeps per day).
- Time of the day the beeps were sent.
- Number of beeps sent per participant.
- Delay between two consecutive beeps.
- If missed (unanswered) beeps are not recorded, consider filling them in as observations with missing values.
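The sampling-scheme checks above can be illustrated as follows. This is a minimal sketch under stated assumptions: the design (1 day, 3 beeps per day), the column names, and the toy records are hypothetical, and the fill-in step only works when each recorded beep carries its day and beep number.

```python
from datetime import datetime
from itertools import product

FMT = "%Y-%m-%d %H:%M"
N_DAYS, BEEPS_PER_DAY = 1, 3  # assumed design

# Hypothetical records: for p2, the unanswered midday beep was never stored.
records = [
    {"id": "p1", "day": 1, "beep": 1, "sent": "2024-01-01 09:00", "mood": 4},
    {"id": "p1", "day": 1, "beep": 2, "sent": "2024-01-01 13:00", "mood": 2},
    {"id": "p1", "day": 1, "beep": 3, "sent": "2024-01-01 17:30", "mood": 5},
    {"id": "p2", "day": 1, "beep": 1, "sent": "2024-01-01 09:10", "mood": 3},
    {"id": "p2", "day": 1, "beep": 3, "sent": "2024-01-01 17:20", "mood": 4},
]

ids = sorted({r["id"] for r in records})

# Number of recorded beeps per participant, to compare with the design.
n_expected = N_DAYS * BEEPS_PER_DAY
n_recorded = {pid: sum(r["id"] == pid for r in records) for pid in ids}
incomplete = [pid for pid in ids if n_recorded[pid] < n_expected]

def delays_minutes(pid):
    """Delays in minutes between consecutive recorded beeps of one participant."""
    times = sorted(
        datetime.strptime(r["sent"], FMT) for r in records if r["id"] == pid
    )
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]

delays = {pid: delays_minutes(pid) for pid in ids}

# Fill in unrecorded beeps as rows whose values are missing (None).
index = {(r["id"], r["day"], r["beep"]): r for r in records}
filled = [
    index.get(
        (pid, day, beep),
        {"id": pid, "day": day, "beep": beep, "sent": None, "mood": None},
    )
    for pid, day, beep in product(
        ids, range(1, N_DAYS + 1), range(1, BEEPS_PER_DAY + 1)
    )
]
```

After filling in, every participant has the same number of rows, which makes response and compliance rates straightforward to compute.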
Step 3: Participants’ response behaviors
- Time of the day the beeps were started and ended.
- Delay with which participants started answering beeps.
- Delay between two consecutive valid beeps.
- Response rates (e.g., number of beeps answered per day).
- Compliance rates.
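As an illustration of the response-behavior checks, the sketch below computes the delay before participants started answering and a per-participant compliance rate. Definitions of compliance vary between studies; here it is assumed to be the share of sent beeps that were answered. The column names, the toy data, and the 10-minute threshold for a late response are hypothetical.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"
MAX_DELAY_MIN = 10  # assumed threshold for flagging late responses

# Hypothetical beeps: 'start' is None when the beep was not answered.
beeps = [
    {"id": "p1", "sent": "2024-01-01 09:00", "start": "2024-01-01 09:02"},
    {"id": "p1", "sent": "2024-01-01 13:00", "start": None},
    {"id": "p1", "sent": "2024-01-01 17:00", "start": "2024-01-01 17:12"},
    {"id": "p2", "sent": "2024-01-01 09:00", "start": "2024-01-01 09:01"},
]

def minutes_between(earlier, later):
    """Minutes elapsed between two timestamps given as strings."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return delta.total_seconds() / 60

# Delay with which each answered beep was started.
for b in beeps:
    b["delay"] = (
        None if b["start"] is None else minutes_between(b["sent"], b["start"])
    )

# Answered beeps that were started later than the assumed threshold.
late = [b for b in beeps if b["delay"] is not None and b["delay"] > MAX_DELAY_MIN]

def compliance(pid):
    """Share of the beeps sent to one participant that were answered."""
    own = [b for b in beeps if b["id"] == pid]
    return sum(b["start"] is not None for b in own) / len(own)

compliance_rates = {pid: compliance(pid) for pid in sorted({b["id"] for b in beeps})}
```

Whether late responses should be flagged as invalid depends on the study protocol; the sketch only identifies them.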
Step 4: Compute and transform variables
- Compute scores/variables of interest.
- Check computed variables (e.g., range of values, missing values, etc.).
- Psychometric properties (if applicable).
- Consider centering/lagging variables of interest.
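Centering and lagging can be sketched as below. The example uses person-mean centering and a lag of one beep, and assumes that a lag should not carry over from one day to the next or across a skipped beep; that is a design choice, not a rule. The column names and toy data are hypothetical.

```python
from statistics import mean

# Hypothetical long-format data, assumed sorted by id, day, and beep.
rows = [
    {"id": "p1", "day": 1, "beep": 1, "mood": 4},
    {"id": "p1", "day": 1, "beep": 2, "mood": None},
    {"id": "p1", "day": 1, "beep": 3, "mood": 2},
    {"id": "p1", "day": 2, "beep": 1, "mood": 3},
    {"id": "p2", "day": 1, "beep": 1, "mood": 5},
]

# Person means, ignoring missing values.
person_mean = {}
for pid in {r["id"] for r in rows}:
    observed = [r["mood"] for r in rows if r["id"] == pid and r["mood"] is not None]
    person_mean[pid] = mean(observed)

# Person-mean centered score: deviation from the participant's own mean.
for r in rows:
    r["mood_c"] = None if r["mood"] is None else r["mood"] - person_mean[r["id"]]

# Lag-1 score: value at the previous beep of the same participant and day.
previous = None
for r in rows:
    is_consecutive = (
        previous is not None
        and previous["id"] == r["id"]
        and previous["day"] == r["day"]
        and previous["beep"] + 1 == r["beep"]
    )
    r["mood_lag1"] = previous["mood"] if is_consecutive else None
    previous = r
```

Checking the lagged variable against the raw data for a few participants is a quick way to confirm that the first beep of each day has no lagged value.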
Step 5: Descriptive statistics and visualization
- Descriptive statistics of variables of interest.
- Distribution of variables of interest across all persons and at the person level (e.g., skewed, bimodal).
- Check some participants’ time series for trends, variance, etc.
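A minimal sketch of sample-level and person-level descriptive statistics is shown below. The scores and participant ids are hypothetical, and flagging participants with zero variance is only one example of a person-level check; plots of individual time series remain necessary for spotting trends.

```python
from statistics import mean, stdev

# Hypothetical scores per participant, in time order; None marks a missed beep.
scores = {
    "p1": [3, 4, None, 2, 5, 3],
    "p2": [4, 4, 4, 4],
    "p3": [1, 5, 1, 5, 2],
}

def describe(values):
    """Basic descriptive statistics, ignoring missing values."""
    observed = [v for v in values if v is not None]
    return {
        "n": len(observed),
        "mean": mean(observed),
        "sd": stdev(observed) if len(observed) > 1 else None,
        "min": min(observed),
        "max": max(observed),
    }

# Across all persons.
overall = describe([v for values in scores.values() for v in values])

# At the person level.
per_person = {pid: describe(values) for pid, values in scores.items()}

# Participants who always gave the same answer (no within-person variance).
no_variance = [pid for pid, d in per_person.items() if d["sd"] == 0]
```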