Codebook table esmtools

Packages: dplyr, esmtools, readxl


A codebook serves as a reference document that can help data analysts and researchers understand the structure and characteristics of the dataset quickly.

The ‘codebook_table()’ function creates a codebook table containing descriptions, statistics, and visualizations for each variable in a given dataframe. It can also merge a codebook prepared in spreadsheet software, such as Excel, with the codebook it generates. This makes it possible to reuse an existing codebook (for instance, one made for the data collection) or to write part of the codebook in a spreadsheet. The merged codebook table combines the pre-existing metadata with the newly generated statistics and visualizations, offering a comprehensive overview of the dataset.

To merge the original codebook, users need to import it and ensure that the variable names in the dataframe match those in the original codebook. Users also need to specify, through the ‘origin_vars’ argument, which column of the original codebook contains the variable names.
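Because the merge relies on matching variable names, it can help to compare the two sets of names before building the codebook. The following is a minimal sketch in base R, not part of esmtools: the toy `data` and `origin_cbook` objects are hypothetical stand-ins, and the `Variable` column name simply mirrors the one used in this section.

```r
# Hypothetical toy objects standing in for the real dataframe and codebook
data <- data.frame(id = 1:3,
                   age = c(25L, 31L, 40L),
                   PA1 = c(10L, 55L, 80L))
origin_cbook <- data.frame(
  Variable = c("id", "age", "pa1"),  # "pa1" deliberately mismatches "PA1"
  Description = c("Participant identification number",
                  "Age of the participant",
                  "A positive affect (PA) item"),
  stringsAsFactors = FALSE
)
origin_vars <- "Variable"  # codebook column that holds the variable names

# Variables in the dataframe that have no entry in the original codebook
not_in_cbook <- setdiff(names(data), origin_cbook[[origin_vars]])

# Codebook entries that do not correspond to any dataframe variable
not_in_data <- setdiff(origin_cbook[[origin_vars]], names(data))

print(not_in_cbook)
print(not_in_data)
```

Names listed in either result point to spelling or case differences that would need to be resolved for the corresponding rows to line up.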

The example below applies this to the simulated dataset:

Note

To enhance readability and ease sharing, the codebook can be written to a separate HTML file with the ‘html_output’ parameter (e.g., html_output = "report_examples/Codebook_table_example.html").

More specifications are available (see the documentation website).

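library(esmtools)
library(readxl)

# Import the original codebook
origin_cbook <- read_excel("data/cbook_original.xlsx")

# Create the codebook table.
# `data` is the simulated dataset, assumed to be loaded already;
# origin_vars names the codebook column that holds the variable names.
codebook_table(data,
               origin_cbook = origin_cbook,
               origin_vars = "Variable")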
Variable Description Standard Item_wording Item_origin Dim Rating_source Coding Comment R_type missing stats Values Freq Date_stats Hist Boxplot
[Hist and Boxplot columns: embedded images, not shown]
dyad Dyad identification number Number: 1-30 pre-study A grouping variable numeric 0 (0%) n_uni = 30.00; min = 1.00; q25 = 8.00; median = 15.50; q75 = 23.00; max = 30.00; mean = 15.50; sd = 8.66
role Role of a participant within a dyad number: 1, 2 What is your gender participant dummy variable A grouping variable integer 0 (0%) 1: 2100 (50%); 2: 2100 (50%)
obsno ESM questionnaire (beep) number of the observatio number: 0 - 50 app starting from 1 integer 0 (0%) n_uni = 70.00; min = 1.00; q25 = 18.00; median = 35.50; q75 = 53.00; max = 70.00; mean = 35.50; sd = 20.21
id Participant identification number Number: 1-60 pre-study starting from 1 numeric 0 (0%) n_uni = 60.00; min = 1.00; q25 = 15.75; median = 30.50; q75 = 45.25; max = 60.00; mean = 30.50; sd = 17.32
age Age of the participant number: 20 - 50 What is your age? participant integer 0 (0%) n_uni = 19.00; min = 25.00; q25 = 25.00; median = 35.50; q75 = 42.25; max = 65.00; mean = 35.10; sd = 9.79
cond_dyad Treatment condition to which the dyad was assigned 0=control group, 1=experimental group experimental condition character 0 (0%) condA: 2100 (50%); condB: 2100 (50%)
scheduled Timestamps (e.g., “2023/04/14 10:23:47”) of when the ESM questionnaire was scheduled Timestamp app POSIXct 2 (0%) min=2018-02-02 08:59:47; max=2019-06-10 11:59:46
sent Timestamps (e.g., “2023/04/14 10:23:47”) of when the ESM questionnaire was sent Timestamp app POSIXct 0 (0%) min=2018-02-02 08:59:51; max=2019-06-10 11:59:54
start Timestamps (e.g., “2023/04/14 10:23:47”) of when the ESM questionnaire was opened by the participant Timestamp app POSIXct 1254 (30%) min=2018-02-02 09:00:31; max=2019-06-10 12:00:15
end Timestamps (e.g., “2023/04/14 10:23:47”) of when the ESM questionnaire was ended by the participant Timestamp app POSIXct 1254 (30%) min=2018-02-02 09:03:07; max=2019-06-10 12:02:30
contact Partner's in contact since the last beep Number: 0 - 1 Have you been in contact with your partner since the last beep? Partner's contact participant Recode with numbers integer 1254 (30%) 0: 2584 (61.524%); 1: 362 (8.619%)
PA1 A positive affect (PA) item Sliders: 1-100 … relaxed Emotion scale PA participant integer 1254 (30%) n_uni = 95.00; min = 1.00; q25 = 4.00; median = 18.00; q75 = 32.00; max = 100.00; mean = 23.09; sd = 23.54
PA2 A positive affect (PA) item Sliders: 1-100 … happy Emotion scale PA participant integer 1254 (30%) n_uni = 94.00; min = 1.00; q25 = 3.00; median = 19.00; q75 = 33.00; max = 100.00; mean = 21.77; sd = 21.50
PA3 A positive affect (PA) item Sliders: 1-100 … angry Emotion scale NA. participant integer 1254 (30%) n_uni = 96.00; min = 1.00; q25 = 3.00; median = 16.00; q75 = 31.00; max = 100.00; mean = 23.32; sd = 25.91
NA1 A negative affect (NA) item Sliders: 1-100 … sad Emotion scale NA. participant integer 1254 (30%) n_uni = 99.00; min = 1.00; q25 = 1.00; median = 11.00; q75 = 31.00; max = 100.00; mean = 21.36; sd = 26.53
NA2 A negative affect (NA) item Sliders: 1-100 … sressed Emotion scale NA. participant revesed item integer 1254 (30%) n_uni = 67.00; min = 1.00; q25 = 1.00; median = 7.00; q75 = 15.00; max = 83.00; mean = 10.50; sd = 11.53
NA3 A negative affect (NA) item Sliders: 1-100 … joyful Emotion scale PA participant integer 1254 (30%) n_uni = 101.00; min = 1.00; q25 = 40.00; median = 72.00; q75 = 89.00; max = 100.00; mean = 63.72; sd = 28.66
location Participant's location since the last beep Number: 0 -4 What have you done since the last beep? Activity scale Activity participant Recode with numbers character 1254 (30%)
valid Valid (1) or invalid (0) ovservation Number: 0 - 1 preprocessing followed compliance definition numeric 0 (0%) 0: 1254 (29.857%); 1: 2946 (70.143%)
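To make the ‘missing’ and ‘stats’ columns easier to interpret, the sketch below computes the same kinds of summaries by hand for a single numeric vector. It is an illustration in base R under stated assumptions, not the implementation used by ‘codebook_table()’: the helper name `describe_numeric` is hypothetical, and the quantiles use R's default method, which may differ from what the package does.

```r
# Hypothetical helper: summarise one numeric variable in the style of the
# 'missing' and 'stats' columns (not the esmtools implementation).
describe_numeric <- function(x) {
  n_missing <- sum(is.na(x))   # number of NA values
  x_obs <- x[!is.na(x)]        # observed (non-missing) values only
  q <- quantile(x_obs, probs = c(0.25, 0.5, 0.75), names = FALSE)
  list(
    # Count and percentage of missing values, e.g. "1 (20%)"
    missing = sprintf("%d (%.0f%%)", n_missing, 100 * n_missing / length(x)),
    stats = c(
      n_uni  = length(unique(x_obs)),  # number of distinct observed values
      min    = min(x_obs),
      q25    = q[1],
      median = q[2],
      q75    = q[3],
      max    = max(x_obs),
      mean   = mean(x_obs),
      sd     = sd(x_obs)
    )
  )
}

# Toy input with one missing value
res <- describe_numeric(c(1, 2, 2, 4, NA))
print(res$missing)
print(round(res$stats, 2))
```

For categorical variables, the ‘Values’ and ‘Freq’ columns play the analogous role; counts of that kind can be obtained with `table()` and turned into proportions with `prop.table()`.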