Terminology

Throughout this website, for illustrative purposes, we use variations of a synthetic dataset simulated to reflect a typical experience sampling method (ESM) dataset. Below, we describe the dataset and give an overview of the design and sampling scheme used to simulate it. We then present two tables that broadly describe the variables included in the datasets used in the different parts of the website. Specifically, Table 1 lists the variables included in the original simulated dataset, and Table 2 lists the variables created for specific topics on the website. If you are looking for a specific term, we recommend using “CTRL+F” to open the search bar of your browser.

Dataset(s) description

The original dataset was generated to mimic the structure of a dataset from a study of heterosexual romantic couples (i.e., an ESM dyadic design). This simulated study focuses on the affect dynamics of romantic partners, taking into account the context (location and contact with the partner). In addition, the study assumes that couples are randomly allocated to one of two treatments, so the dataset includes a baseline phase and a treatment phase. The general characteristics of the dataset are as follows:

Sample: there are 30 dyads (couples), each composed of two persons with different roles (‘role’=1, ‘role’=2). Hence, there are 60 participants, aged between 20 and 70 years. The compliance rate of each participant ranges from .25 to 1, with an average of .74. Dyads were assigned to one of two treatment conditions (‘dyad_cond’=0 and ‘dyad_cond’=1).
Sampling scheme: the simulated dataset follows a sampling scheme in which participants answered for 7 days, with 10 beeps a day between 10 a.m. and 7 p.m. The beeps follow a fixed-contingent (or time-contingent) sampling scheme in which 1 beep was sent every hour. The “data collection” started on 02/02/2018 and lasted about a year and a half.
Self-report variables: there are 6 simulated items based on 1-100 sliders (3 positive affect (PA) items and 3 negative affect (NA) items), 1 simulated categorical question (‘location’) with 4 possible answers, and 1 simulated dichotomous variable (‘contact’).
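
The sampling scheme described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the code in “src/main_data_sim.R”: the names `schedule`, `start_date`, `n_days` and `beep_hours` are hypothetical, the participant is assumed to start on the first day of data collection, and the few seconds of jitter that real timestamps show are ignored.

```r
# Illustrative beep schedule for ONE participant (assumed names, not the
# authors' simulation code): 7 days, one beep per hour from 10:00 to 19:00.
start_date <- as.Date("2018-02-02")  # study start date mentioned in the text
n_days     <- 7
beep_hours <- 10:19                  # 10 a.m. to 7 p.m. inclusive = 10 beeps

# expand.grid varies its first argument fastest, so rows are ordered
# by day and then by hour within each day.
schedule <- expand.grid(hour = beep_hours, day = seq_len(n_days))

# Build the scheduled timestamp from the day offset and the hour
schedule$scheduled <- as.POSIXct(
  paste(start_date + schedule$day - 1, sprintf("%02d:00:00", schedule$hour)),
  tz = "UTC"
)
schedule$obsno  <- seq_len(nrow(schedule))              # serial beep number
schedule$beepno <- schedule$hour - min(beep_hours) + 1  # beep number within day

head(schedule)
```

The resulting grid has one row per scheduled beep; repeating it for each participant (with participant-specific start dates) would give the long format used in the tables below.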

The script that generates the general dataset is included in the file “src/main_data_sim.R”. This dataset is generated free of data issues, to illustrate a typical ESM dyadic study. Across the different topics, the general dataset is modified (e.g., by creating specific variables or by introducing issues such as duplicated answers) to showcase the functionality of each topic, while keeping the general structure described here. For reproducibility, each of the datasets used in the topics can be downloaded from the top of the corresponding topic’s webpage.

Dataset variables

Below, you can see the first 6 rows of the original dataset:

  dyad role obsno id age cond_dyad           scheduled                sent
1    1    1     1  1  40     condB 2018-10-17 08:00:08 2018-10-17 08:00:11
2    1    1     2  1  40     condB 2018-10-17 09:00:01 2018-10-17 09:00:22
3    1    1     3  1  40     condB 2018-10-17 09:59:56 2018-10-17 10:00:08
4    1    1     4  1  40     condB 2018-10-17 10:59:48 2018-10-17 10:59:52
5    1    1     5  1  40     condB 2018-10-17 12:00:12 2018-10-17 12:00:15
6    1    1     6  1  40     condB 2018-10-18 07:59:47 2018-10-18 08:00:08
                start                 end contact PA1 PA2 PA3 NA1 NA2 NA3
1                <NA>                <NA>      NA  NA  NA  NA  NA  NA  NA
2                <NA>                <NA>      NA  NA  NA  NA  NA  NA  NA
3                <NA>                <NA>      NA  NA  NA  NA  NA  NA  NA
4 2018-10-17 11:00:12 2018-10-17 11:03:01       0   1  11  25  10  16  28
5                <NA>                <NA>      NA  NA  NA  NA  NA  NA  NA
6                <NA>                <NA>      NA  NA  NA  NA  NA  NA  NA
  location
1     <NA>
2     <NA>
3     <NA>
4        A
5     <NA>
6     <NA>

Table 1 describes the variables that can be found in the original dataset:

Name	Description
dyad	Dyad identification number
id	Participant identification number
role	Role of a participant within a dyad (makes the partners distinguishable)
age	Age of the participant
nationality	Nationality of the participant
dyad_cond	Treatment condition to which the dyad was assigned, as follows: 0=control group, 1=experimental group
obsno	ESM questionnaire (beep) number of the observation, indicating its serial order
phase	Phase of the treatment: 0=baseline, 1=treatment
scheduled	Timestamp (e.g., “2023/04/14 10:23:47”) of when the ESM questionnaire was scheduled
sent	Timestamp (e.g., “2023/04/14 10:23:47”) of when the ESM questionnaire was sent
start	Timestamp (e.g., “2023/04/14 10:23:47”) of when the ESM questionnaire was opened by the participant
end	Timestamp (e.g., “2023/04/14 10:23:47”) of when the ESM questionnaire was completed by the participant
PA1	A positive affect (PA) item with a slider scale (1-100)
PA2	A positive affect (PA) item with a slider scale (1-100)
PA3	A positive affect (PA) item with a slider scale (1-100)
NA1	A negative affect (NA) item with a slider scale (1-100)
NA2	A negative affect (NA) item with a slider scale (1-100)
NA3	A negative affect (NA) item with a slider scale (1-100)
location	Categorical item with 4 possible answers (home, work, public space, other)
contact	Dichotomous item (1=contact, 0=no contact)
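
As a small illustration of how these variables relate to the compliance rate mentioned earlier, the sketch below computes compliance per participant as the proportion of beeps with a non-missing ‘start’ timestamp. This definition is an assumption made for illustration (the website may define compliance differently), and the data frame `toy` contains made-up values, not rows from the simulated dataset.

```r
# Made-up example data: 2 participants with 4 beeps each.
# A missing 'start' means the questionnaire was never opened.
toy <- data.frame(
  id    = c(1, 1, 1, 1, 2, 2, 2, 2),
  start = as.POSIXct(
    c(NA, "2018-10-17 09:00:30", NA, "2018-10-17 11:00:12",
      "2018-10-17 08:01:00", "2018-10-17 09:02:00",
      "2018-10-17 10:00:40", NA),
    tz = "UTC"
  )
)

# Assumed definition: share of beeps that were opened, per participant
compliance <- tapply(!is.na(toy$start), toy$id, mean)
compliance
```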

Computed variables

Finally, Table 2 describes the variables that are created within the topics of this website. We focus on the variables that are used in multiple topics:

Name	Description
valid	Indicates whether an observation is valid, based on predefined rules: 1=valid, 0=invalid
valid_var	Number of valid variables within a row (among the variables of interest)
year	Year element of the timestamps (e.g., scheduled, sent)
month	Month element of the timestamps (e.g., scheduled, sent)
day	Day number element of the timestamps (e.g., scheduled, sent)
hour	Hour element of the timestamps (e.g., scheduled, sent)
minute	Minute element of the timestamps (e.g., scheduled, sent)
wday	Weekday number element of the timestamps (e.g., scheduled, sent)
obsno	Beep number of the observation, indicating its serial order
beepno	Beep number within a day
beeptime	Number of minutes since midnight
daycum	Day number since the first beep sent to the participant
daycum_dyad	Day number since the first beep sent to the dyad
weekcum	Week number since the first beep of the participant
weekday	Indicates whether the observation occurred on a weekday or a weekend day
duration	Data collection duration (in days) for each participant
daynr_study	Day number since the start of the data collection across all participants
continuoustime	Time elapsed, in a chosen unit (e.g., 30 minutes, 1 hour), between the first observation of a participant (continuoustime=0) and each subsequent observation
continuoustime_dyad	Same as ‘continuoustime’, except that the first beep of the dyad is taken as the starting time
period	Moment of the day (e.g., morning, afternoon, evening)
PA1_inv	Reverse-scored item PA1
NA3_inv	Reverse-scored item NA3
delay_sent_min	Time interval (in minutes) between the time the beep was scheduled and the time the beep was sent to the participant
delay_start_min	Time interval (in minutes) between the time the beep was sent and the time the beep was started by the participant
delay_end_min	Time interval (in minutes) between the moment the beep was started and the time the beep was finished by the participant
PA1_gc	PA1 grand-mean centered
PA1_pc	PA1 person-mean centered
PA1_lag_pc	PA1 lagged and person-mean centered
PA1_dc	PA1 dyad-mean centered
cr_flag	Flags an observation as a careless response (cr_flag = 1) or not (cr_flag = 0), based on a predefined metric (e.g., long-string, response time)
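
To make some of the definitions in this table concrete, here is a minimal base R sketch that derives a few of the computed variables from a toy data frame. It is not the code used in the topics: the values in `toy` are made up, the rows are assumed to be sorted by time within each participant, ‘daycum’ is assumed to start at 1, reverse scoring assumes the 1-100 slider range, and the lag is taken over consecutive rows of the person-centered variable (the topics may handle day boundaries or the order of lagging and centering differently).

```r
# Made-up data: 2 participants, 3 beeps each (2 on day one, 1 on day two)
toy <- data.frame(
  id        = c(1, 1, 1, 2, 2, 2),
  scheduled = as.POSIXct(
    c("2018-10-17 08:00:00", "2018-10-17 09:00:00", "2018-10-18 08:00:00",
      "2018-10-17 08:00:00", "2018-10-17 09:00:00", "2018-10-18 08:00:00"),
    tz = "UTC"
  ),
  PA1       = c(10, 20, 30, 50, 70, 90)
)
toy$sent  <- toy$scheduled + 15   # sent 15 seconds after being scheduled
toy$start <- toy$sent + 120       # opened 2 minutes after being sent
toy$end   <- toy$start + 180      # completed 3 minutes after being opened

# Delays, in minutes
toy$delay_sent_min  <- as.numeric(difftime(toy$sent,  toy$scheduled, units = "mins"))
toy$delay_start_min <- as.numeric(difftime(toy$start, toy$sent,      units = "mins"))
toy$delay_end_min   <- as.numeric(difftime(toy$end,   toy$start,     units = "mins"))

# Time variables derived from the scheduled timestamp
toy$beeptime <- as.numeric(format(toy$scheduled, "%H")) * 60 +
  as.numeric(format(toy$scheduled, "%M"))               # minutes since midnight
day_num      <- as.numeric(as.Date(toy$scheduled))      # calendar day as a number
toy$daycum   <- ave(day_num, toy$id, FUN = function(d) d - min(d) + 1)
toy$beepno   <- ave(seq_along(toy$id), toy$id, day_num, FUN = seq_along)

# Reverse scoring on a 1-100 scale: (min + max) - value
toy$PA1_inv <- 101 - toy$PA1

# Grand-mean and person-mean centering
toy$PA1_gc <- toy$PA1 - mean(toy$PA1, na.rm = TRUE)
toy$PA1_pc <- toy$PA1 - ave(toy$PA1, toy$id,
                            FUN = function(x) mean(x, na.rm = TRUE))

# Lag by one row within participant (assumption: no day-boundary handling)
toy$PA1_lag_pc <- ave(toy$PA1_pc, toy$id,
                      FUN = function(x) c(NA, head(x, -1)))

toy
```

Using `ave()` keeps every derived column aligned with the original rows, which is convenient when several grouping levels (participant, day, dyad) are needed in the same data frame.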