ESM Preprocessing Gallery – affect

Affect scores

Packages: dplyr, ggplot2, psych, lme4, mlVAR, DescTools

One wildly studied topic with the ESM method is the dynamics of affects. Research focuses on specific emotions (e.g., anger, joy) or latent constructs such as positive affect (PA) and negative affect (NA). Below, we present how to compute the usual scores that can be found in the literature. In particular, we demonstrate how to compute some of the affect scores presented in Dejonckheere et al. (2019), summarized in table 1 of this article.

Time-varying scores

Among the most common time-to-time scores used are positive affect (PA) and negative affect (NA). They are often computed by averaging the items associated with those dimensions. We propose 2 methods: one using base R functions and one using the dplyr package. For the latter, we need to use the ‘rowwise() function’ to force the computation of the mean one row at a time. An important decision to make is whether the computation should ignore the missing values or not (argument na.rm=TRUE). In other words, it is about deciding if the score can be computed with only a part of the associated items whenever there are missing values.

PA_vars = c("PA1","PA2","PA3")
NA_vars = c("NA1","NA2","NA3")

data$PA = apply(data[,PA_vars], 1, mean)
data$NA_ = apply(data[,NA_vars], 1, mean)

It occurs that PA and NA are computed differently, such as taking the maximum or minimum value. To do so, we only need to change the function used, from ‘mean()’ to ‘max()’ or ‘min()’.

PA_vars = c("PA1","PA2","PA3")
NA_vars = c("NA1","NA2","NA3")

data$PA = apply(data[,PA_vars], 1, max)
data$NA_ = apply(data[,NA_vars], 1, max)

Person-level mean and standard deviation

After having computed the time-varying PA and NA, we can compute different person-level scores based on them. Two usual ones are the mean and the standard deviation (or variance). For both, we show how to compute it with the base R function (in particular the ‘ave()’ function) and dplyr functions.

Person-level mean of PA and NA

data$PA_mean = ave(data$PA, data$id, FUN = function(x) mean(x, na.rm=TRUE))
data$NA_mean = ave(data$NA_, data$id, FUN = function(x) mean(x, na.rm=TRUE))

Person-level standard deviation of PA and NA

data$PA_sd = ave(data$PA, data$id, FUN = function(x) sd(x, na.rm=TRUE))
data$NA_sd = ave(data$NA_, data$id, FUN = function(x) sd(x, na.rm=TRUE))

Relative variability of PA and NA

In Mestdagh et al (2018), the authors introduced new indicators to investigate the variability of the PA and NA scores. In contrast to the simple variability score, this new measure is not confounded with the average of the score as it takes into account the maximum possible variability (variance or standard deviation) given the observed mean. The functions come from a package in Mestdagh et al (2018). To use this function you have to first download the supplementary materials. Then, extract the “relativeVariability_1.0.tar.gz” file, store it in your working folder and install the package with the ‘install.packages()’ function, as follows:

install.packages("relativeVariability_1.0.tar.gz", repos = NULL, type="source")

To compute the relative standard deviation of PA, we proceed as follows:

library(relativeVariability)
min=1
max=100
data$PA_sd = ave(data$PA, data$id, FUN = function(x) relativeSD(x, min, max))

MSSD PA and NA

The mean squared successive difference (MSSD) is also part of the scores that capture the variability of affects. In particular, MSSD is the average difference in emotional intensity (often for PA and NA) for two consecutive measurement occasions. The formula is the following:

\[ \sqrt{\frac1n\sum (x_t - x_{t-1})^2} \]

Based on this formula, here are two methods (base R functions and dplyr functions) to compute it for PA:

data$PA_MSSD = ave(data$PA, data$id, FUN = function(x) sqrt(mean((x - lag(x))^2, na.rm=TRUE)))

Auto-regression PA and NA (AR)

Briefly, the autoregressive (AR) effect is the carry-over effect of one variable between time t-1 and time t (for an overview see Ariens et al., 2020) The challenge here is to compute an auto-regressive effect for each participant taking into account the multilevel structure of the data. To do so, we will use the ‘lmer()’ function from the lme4 package. Here are the steps to compute and extract the AR-effect of each participant:

Compute lagged PA and person-center it.

data = data %>%
    group_by(id) %>%
    mutate(PA_lag = lag(PA),
           PA_lagc = mean(PA_lag, na.rm=TRUE) - PA_lag)

Fit a multilevel AR(1) model allowing the predictor (‘PA_lagc’) to vary over participants.

library(lme4)
model = lmer(PA ~ 1 + PA_lagc + (1 + PA_lagc | id), data=data)

Extract the fixed and random effects of each participant. Compute the AR for each participant by making an addition between the fixed effects and the person-specific random effect. Then join the computed score to the main dataset.

fixeeffect = fixef(model)["PA_lagc"]
randeffect = ranef(model)$id
df_randeffect = data.frame(id = as.numeric(rownames(randeffect)), 
                           PA_ar = fixeeffect + randeffect[,"PA_lagc"])

data = data %>%
    left_join(df_randeffect, by=c("id"))

Emotion-network density (D)

The dynamic network analysis would compute the carry-over and cross-over effects between the positive and negative emotions over time. The higher the coefficients the more stable the system and the more resistant it is to change. The density score captures this resistance to change by computing the sum of all the coefficients of the VAR(1) model. This approach requires running a dynamic network model. Here, we use the ‘mlVAR()’ function from the mlVAR package. Here are the steps to compute and extract the density score for each participant:

sFit a dynamic network analysis based on a multilevel VAR(1) model.

library(mlVAR)
data$daycum = as.character(data$daycum)
model = mlVAR(data = data, 
              vars = c("PA1", "PA2", "PA3", "NA1", "NA2", "NA3"),
              idvar = "id",
              lags = 1, 
              dayvar = "daycum", 
              beepvar = "obsno",
              verbose=FALSE)

Extract the density values for each participant.

df_density = data.frame(id=unique(data$id), affect_dens = NA)
for (id in unique(df_density$id)){
    person_network = model$result$Gamma_Theta$subject[[id]] # TODO: check if it is the good variable that is extracted ! + Check that fixed + random effect has been done
    density = sum(abs(person_network))
    df_density[df_density$id==id, "affect_dens"] = density
}

Join the scores to the main dataset.

data = data %>% left_join(df_density, by=c("id"))

Within PA-NA correlation

The within-correlation computes a correlation between PA and NA for each participant. This score captures how much those scores are contemporaneously dependent or independent. In other words, how someone’s experience of PA and NA is dependent or independent from each other. Here is a way to compute it:

data = data %>%
    group_by(id) %>%
    mutate(affect_wt_cor = cor(PA, NA_, use="complete.obs"))