{
"Modification 1": "reformating the timestamps variables to be in POSIXct format.",
"Modification 2": "create variable to report maximum value in the multi-response item 'perc_stress_child'.",
"Error 1": "from the descriptive output in the Importation check section, we can see that the missing values in perc_stress_child are coded as ''",
"Modification 3": "We recoded the missing values of the 'perc_stress_child' variable as NA",
"Error 2": "There are inconsistent missing values:",
"Modification 4": "set as missing the 'start' and 'end' variables of those 27 first inconsistent cases. For the 3 remaining observations, it will have no implications for the later analysis.",
"Modification 5": "extract time elements (day, year, etc.) as well as create observation number (obsno), day number (daycum), beep number in a day (beepno) and duration in days variables (duration).",
"Modification 6": "the valid observations are the ones in which there are no missing values in the variables of interest. It is 3 variables (from 'pos_aff', 'pos_neg', 'perc_stress_child' variables) and, in addition, 'perc_fun_child' and 'perc_fun_signaled' in function of the branching conditions (see above).\nWe created a function that can be reused later.",
"Error 3": "the sampling scheme plot aids in visualizing that: <br> \n<ol>\n<li>There are big intervals between the first day and the rest of the days for many participants (e.g., participants 1, 49, 52, 72).<\/li> \n<li>The first and sometimes the second days of participation must be removed. Indeed, those days have often less than 4 beeps sent and are testing days. In the end, participants should only have 10 days of participation, starting on a Friday and including 2 weekends. <\/li>\n<\/ol>",
"Modification 7": "remove test observations and recompute time variables (checking the new sampling scheme is in the supplementary part below). Test observations are all day 1 observations and: \n<ul>\n<li>day 4 observations for participants 79 and 73.<\/li> \n<li>day 17 observations for participants 1 and 72.<\/li> \n<li>day 10 observations for participants 49, 52.<\/li> \n<li>day 5 observations for participant 66.<\/li> \n<li>day 3 observations for participants 9 and 32. <\/li>\n<\/ul>",
"Error 4": "both within and between observations. There is an observation with start before sent with an an hour of difference. It is not an issue for later analysis.",
"Error 5": "negative time interval for an obs (issue already mentioned higher).",
"Error 6": "there are outliers, specifically belonging to participants 7, 15, 66, 73, and 77, that require further investigation. Additionally, Participant 66 exhibits low compliance.",
"Error 7": "there are time differences superior to 10 and to 20 minutes (max=28 mins). \nIt may be problematic for the analysis but differences follow the sampling scheme (i.e., delay to start the questionnaires).",
"Error 8": "the overall compliance is rather low. In particular, the participant 66 has a compliance close to 0, and the participants 1, 32, 72 and 79 have a compliance lower than .2.",
"Error 9": "when taking dyads' partner observations together, the dyads' compliance (defined as the proportion of beeps answer by both partners) are very low overall.",
"Modification 8": "person-mean center the 'pos_aff' and 'neg_aff' variables.",
"Modification 9": "remove irrelevant variables for later analysis"
}
Reporting tools
The esmtools package provides tools to help create preprocessing reports and data quality reports. In particular, we have developed two tools: text highlighting and a toggle button. The Session and Dataset info section is also part of this toolset. All tools are implemented in the advanced version of the preprocessing report (see the report example folder).
Highlighting text
In the report, we particularly encourage highlighting any data modifications and identified issues. Highlighting specific passages makes it easier to spot and communicate the changes made during the preprocessing phase. Often, you only need to copy and paste the required code (see the instructions in the template) and adapt the description part. We propose two methods:
- The ‘txt()’ function from the esmtools package, which can be used in two different ways:
  - Basic: keeps reporting simple.
  - Exporting: allows the comments to be exported.
- HTML tags and CSS code: no function is required.
txt(): basic
The txt() function generates custom text, with an optional count that is integrated into the title. Whenever you have spotted an issue in your dataset (e.g., duplication), you can use the inline code below (within ``). The arguments of the function are the following:
- id: a character string that links the text to a CSS style. Predefined styles are: ‘esm-issue’, ‘esm-mod’, ‘esm-inspect’. You can also create your own CSS style and refer to it here.
- title: the part that is highlighted.
- count: a logical value (TRUE by default) indicating whether to include a count in the title part.
For instance, the following inline code `r txt(id='esm-issue',title='Issue',count=TRUE)` The issue is that ...
gives:
Issue 1: The issue is that …
Note: for conciseness, you can omit the argument names (e.g., `txt('esm-issue','Issue',TRUE)`).
You can also highlight:
- Data modifications: using
`r txt(id='esm-mod','Modification',TRUE)` I changed ...
Modification 1: I changed …
- Data inspection: using
`r txt(id='esm-inspect','Inspection')` Here we can see that ...
Inspection: Here we can see that …
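To make the mechanics described above more concrete, here is a minimal sketch of a stand-in helper. It is not the esmtools source code: `highlight_txt()` and the `.counters` environment are hypothetical names, and we assume that the title is wrapped in an HTML span carrying the CSS class and that one running counter is kept per id.

```r
# Hypothetical helper, NOT the esmtools implementation of txt().
# It wraps a title in an HTML span carrying a CSS class and keeps
# one running counter per id (an assumption made for illustration).
.counters <- new.env()

highlight_txt <- function(id, title, count = TRUE) {
  if (count) {
    # Increment the counter stored under this id, starting at 1.
    n <- if (exists(id, envir = .counters, inherits = FALSE)) {
      get(id, envir = .counters) + 1L
    } else {
      1L
    }
    assign(id, n, envir = .counters)
    title <- paste(title, n)
  }
  sprintf('<span class="%s">%s:</span>', id, title)
}

highlight_txt("esm-issue", "Issue")                        # first issue
highlight_txt("esm-issue", "Issue")                        # second issue
highlight_txt("esm-inspect", "Inspection", count = FALSE)  # no number
```

Because the counter is stored per id in this sketch, issues and modifications are numbered independently of each other.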
The ‘esm-issue’, ‘esm-mod’, and ‘esm-inspect’ CSS class styles are imported whenever you load the esmtools package. You can override those styles to modify the fonts, colors, etc., as you like (an example is commented out below). In certain cases, you might need to use the ‘!important’ declaration to override the default style definitions.
<style>
.esm-issue{
/* font-family: Georgia; */
}</style>
txt(): exporting
The txt() function generates custom text, with an optional count that is integrated into the title. Whenever you have spotted an issue in your dataset (e.g., duplication), you can use the inline code below (within ``). In contrast with the basic use of the txt() function, the description needs to be given inside the function, in the ‘text’ argument. The arguments of the function are the following:
- id: a character string that links the text to a CSS style. Predefined styles are: ‘esm-issue’, ‘esm-mod’, ‘esm-inspect’. You can also create your own CSS style and refer to it here.
- title: the part that is highlighted.
- text: a description of the spotted issue.
- count: a logical value (TRUE by default) indicating whether to include a count in the title part.
For instance, the following inline code `r txt(id='esm-issue',title='Issue',text='The issue is that ...',count=TRUE)`
gives:
Issue 2: The issue is that …
Note: for conciseness, you can omit the argument names (e.g., `txt('esm-issue','Issue','The issue is that ...',TRUE)`).
You can also highlight:
- Data modifications: using
`r txt(id='esm-mod','Modification','I changed ...',TRUE)`
Modification 2: I changed …
- Data inspection: using
`r txt(id='esm-inspect','Inspection','Here we can see that ...')`
Inspection: Here we can see that …
Additionally, we can use HTML code to change the font, size, or layout of the text. For instance, we can create a bulleted list with the ul and li tags, such as `r txt(id='esm-issue',title='Issue',text='the plot aids in visualizing that: <br> <ul> <li>Firstly ...</li> <li>Secondly ...</li> </ul>')`
- Firstly …
- Secondly …
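When the list has many items, typing the tags by hand becomes error-prone. As a minimal sketch, the description string can be assembled in a code chunk first; `as_html_list()` is a hypothetical helper written for this example, not an esmtools function.

```r
# Hypothetical helper: turn a character vector into an HTML bulleted list.
as_html_list <- function(items) {
  paste0("<ul> ", paste0("<li>", items, "</li>", collapse = " "), " </ul>")
}

points <- c("Firstly ...", "Secondly ...")

# Build the description that would go into the 'text' argument.
issue_text <- paste("the plot aids in visualizing that: <br>", as_html_list(points))
issue_text
```

The resulting string can then be passed as the ‘text’ argument in the inline code, for example `r txt('esm-issue','Issue',issue_text)`.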
Exporting the highlighted elements: by using the txt() function, you can export selected or all highlighted elements into a JSON file. This not only provides a concise summary of these elements but also creates a machine-readable file that can be reused later. The output file is named after the R Markdown file, with ‘_list’ appended (e.g., ‘Preprocessing_report_list.json’). For instance, the advanced preprocessing report example gives the following output:
To create this export, you need to specify the ids of the textual elements you wish to include in the ‘json_esm’ variable within the ‘params’ section of the header. For example, if you want to export all spotted issues (with id=‘esm-issue’) and data modifications (with id=‘esm-mod’), simply specify these ids in the ‘json_esm’ parameter, and proceed as follows:
---
title: report
output:
  html_document
params:
  json_esm: esm-issue, esm-mod
---
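Once the report has been knitted, the exported file can be read back into R for further use. The following is a minimal sketch, assuming that the jsonlite package is installed and that the export is a flat JSON object of label and description pairs, as in the example output of the advanced report. The temporary file and its abridged entries stand in for a real ‘Preprocessing_report_list.json’.

```r
library(jsonlite)

# Stand-in for the exported file (entries abridged for illustration).
path <- tempfile(fileext = ".json")
writeLines('{
  "Modification 1": "reformatting the timestamp variables to be in POSIXct format.",
  "Error 1": "the missing values in perc_stress_child are coded as empty strings",
  "Modification 3": "We recoded the missing values of perc_stress_child as NA"
}', path)

# fromJSON() returns a named list: label -> description.
items  <- fromJSON(path)
labels <- names(items)

# Tabulate the entries and derive the entry type from the label.
report <- data.frame(
  label = labels,
  type  = sub(" [0-9]+$", "", labels),
  text  = unlist(items, use.names = FALSE),
  stringsAsFactors = FALSE
)

table(report$type)                # number of entries per type
report[report$type == "Error", ]  # only the reported errors
```

Splitting the label into a type and a number in this way relies on the titles used in the report (here ‘Error’ and ‘Modification’), so it should be adapted if other titles are chosen.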