Minimal checklist

Preprocessing ESM (Experience Sampling Method) data can be complex, particularly for newcomers to the field. To assist researchers, we have compiled a list of minimal checks to perform and to document in a preprocessing report. This list provides a foundation for ensuring data quality and consistency. It is a starting point rather than an exhaustive guide: most ESM datasets require additional checks.

List of essential preprocessing steps

Step 1: Import data and preliminary preprocessing

  • Import and merge data sources.
  • Rename, relabel and reformat variables.
  • Check identification variables: one id value per participant, no missing values.
  • Check for duplicated observations.
  • Missing values coding: all missing values must be coded as ‘NA’.
  • Look at the distribution of missing values in the dataframe.
  • Look for observations that have inconsistent missing values (the ‘vis_miss()’ function from the ‘visdat’ package can be a starting point).
  • Check coherence between variable type and values (e.g., range of values) and between variable type and missing values (e.g., no missing values expected in participant-level variables).
  • Compute the ‘valid’ variable to flag invalid observations (see the “Flag (in)valid observations” topic).
  • Compute time-related variables (e.g., observation number, beep number) used for later checks/analysis.
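
Several of the Step 1 checks can be sketched in base R. The example below is a minimal illustration on toy data, not a definitive implementation: the column names (‘id’, ‘day’, ‘beep’, ‘mood’), the -99 missing code, the 1 to 7 item range, and the rule used to build ‘valid’ are all assumptions made for the example.

```r
# Toy ESM data; column names and the -99 missing code are hypothetical.
esm <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2),
  day  = c(1, 1, 1, 2, 1, 1, 2),
  beep = c(1, 2, 2, 1, 1, 2, 1),
  mood = c(4, -99, -99, 6, 9, 3, 5)  # assumed item range: 1 to 7
)

# Recode the sentinel missing code to NA.
esm$mood[esm$mood == -99] <- NA

# Identification variable: no missing ids allowed.
stopifnot(!anyNA(esm$id))

# Duplicated observations: same participant, day and beep.
dup   <- duplicated(esm[, c("id", "day", "beep")])
n_dup <- sum(dup)
esm   <- esm[!dup, ]

# Proportion of missing values per variable.
prop_missing <- colMeans(is.na(esm))

# Coherence check: values outside the assumed 1 to 7 range.
out_of_range <- !is.na(esm$mood) & (esm$mood < 1 | esm$mood > 7)

# Illustrative 'valid' flag: answered and within range (assumed rule).
esm$valid <- !is.na(esm$mood) & !out_of_range

# Observation number within participant, after sorting in time order.
esm <- esm[order(esm$id, esm$day, esm$beep), ]
esm$obs_n <- ave(seq_len(nrow(esm)), esm$id, FUN = seq_along)
```

In a real dataset the ‘valid’ rule would follow the criteria chosen for the study rather than the simple rule assumed here.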

Step 2: Design and sampling scheme

  • Check sampling scheme (e.g., number of days, number of beeps per day).
  • Time of the day the beeps were sent.
  • Number of beeps sent per participant.
  • Delay between two consecutive beeps.
  • If missed (unanswered) beeps are not recorded, consider filling them in as missing observations.
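
The Step 2 checks can be illustrated with a small base R sketch. It assumes a hypothetical design of 2 days with 3 beeps per day, a dataset in which only answered beeps were recorded, and made-up column names (‘id’, ‘day’, ‘beep’, ‘sent’); adapt these to the actual sampling scheme.

```r
# Toy data: only answered beeps are present (hypothetical names and times).
answered <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  day  = c(1, 1, 2, 2, 1, 1, 1, 2, 2),
  beep = c(1, 3, 1, 2, 1, 2, 3, 2, 3),
  sent = as.POSIXct(c(
    "2024-01-01 09:00", "2024-01-01 17:30",
    "2024-01-02 09:10", "2024-01-02 13:00",
    "2024-01-01 09:05", "2024-01-01 13:15", "2024-01-01 17:00",
    "2024-01-02 13:30", "2024-01-02 17:45"
  ), tz = "UTC")
)
answered <- answered[order(answered$id, answered$sent), ]

# Number of recorded beeps per participant.
beeps_per_id <- table(answered$id)

# Hour of the day at which each beep was sent.
answered$hour <- as.numeric(format(answered$sent, "%H"))

# Delay (minutes) between consecutive recorded beeps, within participant and day.
answered$delay_min <- ave(
  as.numeric(answered$sent), answered$id, answered$day,
  FUN = function(x) c(NA, diff(x)) / 60
)

# Fill in missed beeps: build the full design grid, then left-join the data.
grid <- expand.grid(id = unique(answered$id), day = 1:2, beep = 1:3)
full <- merge(grid, answered, by = c("id", "day", "beep"), all.x = TRUE)
full <- full[order(full$id, full$day, full$beep), ]
full$answered <- !is.na(full$sent)
```

After the merge, ‘full’ has one row per scheduled beep, and rows with ‘answered == FALSE’ stand for the missed beeps.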

Step 3: Participants’ response behaviors

  • Time of the day the beeps were started and ended.
  • Delay with which participants started answering beeps.
  • Delay between two consecutive valid beeps.
  • Response rates (e.g., number of beeps answered per day).
  • Compliance rates.
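
A minimal base R sketch of the Step 3 checks follows. The column names (‘sent’, ‘start’, ‘end’), the toy timestamps, and the 15-minute lateness threshold are assumptions made for illustration; unanswered beeps are represented by a missing ‘start’ time.

```r
# Toy data: one row per sent beep; NA start/end means the beep was missed.
sent <- as.POSIXct("2024-01-01 09:00", tz = "UTC") +
  c(0, 4, 24, 28, 0, 4, 24, 28) * 3600
beeps <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2, 2),
  day  = c(1, 1, 2, 2, 1, 1, 2, 2),
  sent = sent
)
beeps$start <- beeps$sent + c(60, NA, 300, 1200, 120, 90, NA, NA)
beeps$end   <- beeps$start + c(90, NA, 120, 60, 80, 100, NA, NA)

# Time of the day at which responses were started.
beeps$start_time <- format(beeps$start, "%H:%M")

# Delay (minutes) between the beep being sent and the response being started.
beeps$delay_min <- as.numeric(difftime(beeps$start, beeps$sent, units = "mins"))

# Flag late responses (hypothetical 15-minute threshold).
beeps$late <- !is.na(beeps$delay_min) & beeps$delay_min > 15

# Compliance rate per participant: answered beeps / sent beeps.
beeps$answered <- !is.na(beeps$start)
compliance <- tapply(beeps$answered, beeps$id, mean)

# Number of beeps answered per participant and day.
per_day <- aggregate(answered ~ id + day, data = beeps, FUN = sum)

# Delay (hours) between two consecutive answered beeps, within participant.
ans <- beeps[beeps$answered, ]
ans$gap_h <- ave(as.numeric(ans$start), ans$id,
                 FUN = function(x) c(NA, diff(x))) / 3600
```

Here compliance is computed against all sent beeps; if invalid observations have been flagged, the same computation can be restricted to valid ones.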

Step 4: Compute and transform variables

  • Compute scores/variables of interest.
  • Check computed variables (e.g., range of values, missing values, etc.).
  • Psychometric properties (if applicable).
  • Consider centering/lagging variables of interest.
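
The following base R sketch illustrates score computation, person-mean centering, and lagging for Step 4. The items ‘sad’ and ‘down’, the 1 to 7 range, and the choice to lag within participant and day (so that the first beep of a day has no lagged value) are assumptions for the example, not recommendations.

```r
# Toy data with two hypothetical items rated from 1 to 7.
esm <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2, 2),
  day  = c(1, 1, 2, 2, 1, 1, 2, 2),
  beep = c(1, 2, 1, 2, 1, 2, 1, 2),
  sad  = c(2, 4, 3, NA, 5, 6, 4, 5),
  down = c(4, 4, 5, NA, 5, 4, 6, 5)
)
esm <- esm[order(esm$id, esm$day, esm$beep), ]

# Score of interest: mean of the two items (NA if any item is missing).
esm$score <- rowMeans(esm[, c("sad", "down")])

# Check the computed variable: range and number of missing values.
stopifnot(all(esm$score >= 1 & esm$score <= 7, na.rm = TRUE))
n_missing <- sum(is.na(esm$score))

# Person mean and person-mean-centered (within-person) score.
esm$score_pm <- ave(esm$score, esm$id,
                    FUN = function(v) mean(v, na.rm = TRUE))
esm$score_cw <- esm$score - esm$score_pm

# Lagged score (previous beep), computed within participant and day.
esm$score_lag <- ave(esm$score, esm$id, esm$day,
                     FUN = function(v) c(NA, head(v, -1)))
```

The data must be sorted in time order before lagging; otherwise the lagged values will not refer to the previous beep.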

Step 5: Descriptive statistics and visualization

  • Descriptive statistics of variables of interest.
  • Distribution of variables of interest across all persons and at the person level (e.g., skewed, bimodal).
  • Check some participants’ time series for trends, variance, etc.
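
As a rough sketch of Step 5, the base R code below computes overall and person-level descriptives and a simple linear trend per participant. The variable names (‘id’, ‘obs’, ‘mood’) and the toy values are hypothetical, and a linear slope is only one crude way to screen for trends.

```r
# Toy data: 3 participants, 5 observations each (hypothetical values).
esm <- data.frame(
  id   = rep(1:3, each = 5),
  obs  = rep(1:5, times = 3),
  mood = c(1, 2, 3, 4, 5,
           4, 4, 4, 4, 4,
           5, 3, NA, 3, 5)
)

# Descriptive statistics across all persons.
overall <- c(
  mean      = mean(esm$mood, na.rm = TRUE),
  sd        = sd(esm$mood, na.rm = TRUE),
  n_missing = sum(is.na(esm$mood))
)

# Person-level descriptives and linear trend over observation number.
person <- do.call(rbind, lapply(split(esm, esm$id), function(d) {
  data.frame(
    id    = d$id[1],
    n     = sum(!is.na(d$mood)),
    mean  = mean(d$mood, na.rm = TRUE),
    sd    = sd(d$mood, na.rm = TRUE),
    slope = unname(coef(lm(mood ~ obs, data = d))[2])
  )
}))

# Participants with (near-)zero variance deserve a closer look.
no_variance_ids <- person$id[person$sd < 1e-8]
```

For visual inspection, base R calls such as `hist(esm$mood)` or `plot(mood ~ obs, data = esm[esm$id == 1, ], type = "b")` give a first look at the distribution and at one participant's time series.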