Terminology

Throughout this website, we illustrate the topics with variations of a synthetic dataset that has been simulated to reflect a typical ESM dataset. Below, we describe the dataset and give an overview of the design and sampling scheme used to simulate it. We then present two tables that describe the variables included in the datasets used in the different parts of the website: Table 1 lists the variables in the original simulated dataset, and Table 2 lists the variables created for specific topics. If you are looking for a specific term, we recommend pressing “CTRL+F” to open your browser’s search bar.

Dataset(s) description

The original dataset was generated to mimic the structure of a dataset from a study of heterosexual romantic couples (i.e., an ESM dyadic design). This simulated study focuses on the affect dynamics of romantic partners, taking into account the context (location and contact with the partner). In addition, the study assumes that couples are randomly allocated to one of two treatments, so the dataset includes a baseline phase and a treatment phase. Here are the general characteristics of the dataset:

  • Sample: there are 30 dyads (couples), each composed of two persons with different roles (‘role’=1, ‘role’=2). Hence, there are 60 participants, aged between 20 and 70 years. The compliance rate of each participant ranges from .25 to 1, with an average of .74. Dyads were assigned to one of two treatment conditions (‘dyad_cond’=0 and ‘dyad_cond’=1).
  • Sampling scheme: in the simulated dataset, each participant received 10 beeps a day for 7 days, between 10 a.m. and 7 p.m. The beeps follow a fixed (time-contingent) sampling scheme in which 1 beep was sent every hour. The “data collection” started on 02/02/2018 and lasted about a year and a half.
  • Self-report variables: there are 6 simulated items based on 1-100 sliders (3 positive affect (PA) items and 3 negative affect (NA) items), 1 simulated categorical question (‘location’) with 4 possible answers, and 1 simulated dichotomous variable (‘contact’).
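
To make the sampling scheme concrete, here is a minimal Python sketch (it is not the R script used to simulate the data) that builds the hourly beep schedule of one participant and computes a compliance rate. The function names `beep_schedule` and `compliance_rate`, the first day used in the example, the number of answered beeps, and the assumption of one row per scheduled beep are all illustrative and do not come from the dataset.

```python
from datetime import datetime, timedelta


def beep_schedule(first_day, n_days=7, n_beeps=10, first_hour=10):
    """Return the scheduled beep times: one beep per hour, n_beeps per day."""
    schedule = []
    for day in range(n_days):
        day_start = first_day + timedelta(days=day, hours=first_hour)
        for beep in range(n_beeps):
            schedule.append(day_start + timedelta(hours=beep))
    return schedule


def compliance_rate(n_answered, n_scheduled):
    """Proportion of scheduled beeps that were answered."""
    return n_answered / n_scheduled


# Hypothetical first day for one participant (assumption for illustration).
schedule = beep_schedule(datetime(2018, 2, 2))

# 7 days x 10 beeps per day.
beeps_per_participant = len(schedule)

# Assuming one row per scheduled beep: 30 dyads x 2 partners x beeps.
total_rows = 30 * 2 * beeps_per_participant

print(schedule[0], schedule[9])  # first and last beep of day 1
print(beeps_per_participant, total_rows)
print(compliance_rate(52, beeps_per_participant))  # 52 answered beeps is made up
```

With 10 hourly beeps starting at 10 a.m., the last beep of each day falls at 7 p.m., which matches the time window described above.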

The script that generates the general dataset is in the file “src/main_data_sim.R”. This dataset is generated without any issues, to illustrate a typical ESM dyadic study. Across the different topics, the general dataset is modified (e.g., by creating specific variables or by introducing issues such as duplicated answers) to showcase the methods covered in each topic, while keeping the general structure described here. For reproducibility, each of the datasets used in the topics can be downloaded from the top of the corresponding topic’s webpage.

Dataset variables

Below, you can see the first 6 rows of the original dataset:

  dyad role obsno id age cond_dyad           scheduled                sent
1    1    1     1  1  40     condB 2018-10-17 08:00:08 2018-10-17 08:00:11
2    1    1     2  1  40     condB 2018-10-17 09:00:01 2018-10-17 09:00:22
3    1    1     3  1  40     condB 2018-10-17 09:59:56 2018-10-17 10:00:08
4    1    1     4  1  40     condB 2018-10-17 10:59:48 2018-10-17 10:59:52
5    1    1     5  1  40     condB 2018-10-17 12:00:12 2018-10-17 12:00:15
6    1    1     6  1  40     condB 2018-10-18 07:59:47 2018-10-18 08:00:08
                start                 end contact PA1 PA2 PA3 NA1 NA2 NA3
1                <NA>                <NA>      NA  NA  NA  NA  NA  NA  NA
2                <NA>                <NA>      NA  NA  NA  NA  NA  NA  NA
3                <NA>                <NA>      NA  NA  NA  NA  NA  NA  NA
4 2018-10-17 11:00:12 2018-10-17 11:03:01       0   1  11  25  10  16  28
5                <NA>                <NA>      NA  NA  NA  NA  NA  NA  NA
6                <NA>                <NA>      NA  NA  NA  NA  NA  NA  NA
  location
1     <NA>
2     <NA>
3     <NA>
4        A
5     <NA>
6     <NA>

Table 1 describes the variables that can be found in the original dataset:

Name         Description
dyad         Dyad identification number
id           Participant identification number
role         Role of the participant within the dyad (makes the partners distinguishable)
age          Age of the participant
nationality  Nationality of the participant
dyad_cond    Treatment condition to which the dyad was assigned: 0=control group, 1=experimental group
obsno        Serial number of the ESM questionnaire (beep), indicating the order of the observations
phase        Phase of the treatment: 0=baseline, 1=treatment
scheduled    Timestamp (e.g., “2023/04/14 10:23:47”) at which the ESM questionnaire was scheduled
sent         Timestamp (e.g., “2023/04/14 10:23:47”) at which the ESM questionnaire was sent
start        Timestamp (e.g., “2023/04/14 10:23:47”) at which the participant opened the ESM questionnaire
end          Timestamp (e.g., “2023/04/14 10:23:47”) at which the participant finished the ESM questionnaire
PA1          A positive affect (PA) item with a slider scale (1-100)
PA2          A positive affect (PA) item with a slider scale (1-100)
PA3          A positive affect (PA) item with a slider scale (1-100)
NA1          A negative affect (NA) item with a slider scale (1-100)
NA2          A negative affect (NA) item with a slider scale (1-100)
NA3          A negative affect (NA) item with a slider scale (1-100)
location     Categorical item with 4 possible answers (home, work, public space, other)
contact      Dichotomous item (1=contact, 0=no contact)
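
The value ranges listed above can be turned into a simple sanity check. The following Python sketch is only an illustration: the function name `check_row` and the representation of a row as a dictionary are assumptions, and the check is not the validity rule used in the topics of this website. Missing answers are represented as `None`.

```python
SLIDER_ITEMS = ("PA1", "PA2", "PA3", "NA1", "NA2", "NA3")
BINARY_ITEMS = ("contact", "phase")


def check_row(row):
    """Return a list of range problems found in one ESM observation."""
    problems = []
    for item in SLIDER_ITEMS:
        value = row.get(item)
        # Sliders range from 1 to 100; None means the beep was not answered.
        if value is not None and not 1 <= value <= 100:
            problems.append(f"{item} outside 1-100: {value}")
    for item in BINARY_ITEMS:
        value = row.get(item)
        if value is not None and value not in (0, 1):
            problems.append(f"{item} is not 0 or 1: {value}")
    return problems


# Answers of the one completed beep in the rows printed above.
answered = {"contact": 0, "PA1": 1, "PA2": 11, "PA3": 25,
            "NA1": 10, "NA2": 16, "NA3": 28}
print(check_row(answered))

# A made-up row with two out-of-range values.
print(check_row({"PA1": 0, "contact": 2}))
```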

Computed variables

Finally, Table 2 describes the variables that are created within the topics of this website. We focus on the variables that appear in multiple topics:

Name                 Description
valid                Indicates whether an observation is valid, based on predefined rules: 1=valid, 0=invalid
valid_var            Number of valid variables within a row (among the variables of interest)
year                 Year element of the timestamps (e.g., scheduled, sent)
month                Month element of the timestamps (e.g., scheduled, sent)
day                  Day number element of the timestamps (e.g., scheduled, sent)
hour                 Hour element of the timestamps (e.g., scheduled, sent)
minute               Minute element of the timestamps (e.g., scheduled, sent)
wday                 Weekday number of the timestamps (e.g., scheduled, sent)
obsno                Beep number of the observation, indicating its serial order
beepno               Beep number within a day
beeptime             Number of minutes since midnight
daycum               Day number since the first beep sent to the participant
daycum_dyad          Day number since the first beep sent to the dyad
weekcum              Week number since the first beep sent to the participant
weekday              Indicates whether the observation took place on a weekday or a weekend day
duration             Data collection duration (in days) for each participant
daynr_study          Day number since the start of the data collection across all participants
continuoustime       Time elapsed, in a chosen unit (e.g., 30 minutes, 1 hour), between the first observation of a participant (continuoustime=0) and each subsequent observation
continuoustime_dyad  Same as ‘continuoustime’, but the first beep of the dyad is taken as the starting time
period               Moment of the day (e.g., morning, afternoon, evening)
PA1_inv              Reverse-scored item PA1
NA3_inv              Reverse-scored item NA3
delay_sent_min       Time interval (in minutes) between the time the beep was scheduled and the time it was sent to the participant
delay_start_min      Time interval (in minutes) between the time the beep was sent and the time the participant started it
delay_end_min        Time interval (in minutes) between the time the participant started the beep and the time the participant finished it
PA1_gc               PA1 grand-mean centered
PA1_pc               PA1 person-mean centered
PA1_lag_pc           PA1 lagged and person-mean centered
PA1_dc               PA1 dyad-mean centered
cr_flag              Flags an observation as a careless response (cr_flag = 1) or not (cr_flag = 0) based on a predefined metric (e.g., long-string, response time)
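
As an illustration of how some of these variables can be derived, here is a minimal Python sketch. It is not the code used in the topics. The timestamps and the PA1 value are those of the one answered beep in the rows printed earlier on this page. Everything else is an assumption made for the example: computing `beeptime` from `sent` in whole minutes, the ISO weekday numbering (Monday=1), the reversal formula (minimum + maximum - score), taking the grand mean over all observations, and the made-up PA1 values used for centering.

```python
from datetime import datetime
from statistics import mean

FMT = "%Y-%m-%d %H:%M:%S"


def minutes_between(earlier, later):
    """Time interval between two timestamps, in minutes."""
    return (later - earlier).total_seconds() / 60


# Timestamps of the answered beep shown in the printed rows.
scheduled = datetime.strptime("2018-10-17 10:59:48", FMT)
sent = datetime.strptime("2018-10-17 10:59:52", FMT)
start = datetime.strptime("2018-10-17 11:00:12", FMT)
end = datetime.strptime("2018-10-17 11:03:01", FMT)

delay_sent_min = minutes_between(scheduled, sent)
delay_start_min = minutes_between(sent, start)
delay_end_min = minutes_between(start, end)

# Timestamp elements (here taken from 'sent'; an assumption).
year, month, day = sent.year, sent.month, sent.day
hour, minute = sent.hour, sent.minute
wday = sent.isoweekday()          # ISO numbering: Monday=1 ... Sunday=7
beeptime = hour * 60 + minute     # whole minutes since midnight

# Reverse-scoring a 1-100 slider item: minimum + maximum - score.
PA1 = 1
PA1_inv = (1 + 100) - PA1

# Centering, using made-up PA1 values keyed by participant id.
pa1 = {1: [10, 20, 30], 2: [40, 60]}
grand_mean = mean(v for values in pa1.values() for v in values)
PA1_gc = {pid: [v - grand_mean for v in values] for pid, values in pa1.items()}
PA1_pc = {pid: [v - mean(values) for v in values] for pid, values in pa1.items()}

print(delay_sent_min, delay_start_min, delay_end_min)
print(wday, beeptime, PA1_inv)
print(PA1_gc, PA1_pc)
```

Person-mean centering subtracts each participant’s own mean, so the centered values of every participant average to zero, whereas grand-mean centering subtracts the same overall mean from every observation.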