Calendar plot
Packages: dplyr, lubridate, ggplot2, ggpubr, ggTimeSeries, esmtools
Creating a calendar plot gives a comprehensive overview of when the data were collected. It can reveal useful information about the data collection process. For instance, it allows you to identify specific periods (e.g., holidays vs. regular weeks) or to detect periods with an unusual number of observations sent or missing. We propose three methods for generating a calendar plot that displays the number of observations for each day. Each of them can be applied to either the entire dataset or a subset of participants.
By default, the lubridate package treats Sunday as the first day of the week, following the American convention. To use Monday as the first day of the week instead, set the following global option (see also the lubridate section):
options("lubridate.week.start" = 1)
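To check what this option changes, the short sketch below compares the weekday index that lubridate assigns to a known Sunday under both conventions. It passes the `week_start` argument of `wday()` explicitly rather than relying on the global option, so it behaves the same whatever your session settings are; the variable name `sunday` is only illustrative.

```r
library(lubridate)

# 7 January 2024 was a Sunday
sunday <- as.Date("2024-01-07")

# Week starting on Sunday (lubridate default): Sunday is day 1
idx_sunday_start <- wday(sunday, week_start = 7)

# Week starting on Monday: Sunday becomes day 7
idx_monday_start <- wday(sunday, week_start = 1)

print(c(idx_sunday_start, idx_monday_start))
```

Once the global option is set to 1, calling `wday(sunday)` without `week_start` should return the same value as the Monday-start call.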
Method 1: ‘ggplot_calendar_heatmap()’ function
This method uses the ggplot_calendar_heatmap() function from the ggTimeSeries package. Before passing the data to this function, you first have to extract the date and compute the number of observations for each date.
# Organize the dataset and compute the number of beeps sent for each date
df_date = data %>%
  mutate(date = as.Date(sent)) %>%
  group_by(date) %>%
  summarise(n_sent = n()) # Number of observations sent for each date
# Create the plot
ggplot_calendar_heatmap(df_date, 'date', 'n_sent') +
  scale_fill_continuous(low = 'green', high = 'red') + # Set the color scale
  theme(legend.position = "bottom") # Legend at the bottom
However, this plot may lack information (e.g., the precise number of beeps sent each day), and its layout could be brought closer to that of a conventional calendar.
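As noted in the introduction, the daily counts can also be computed for a subset of participants. The sketch below illustrates one way to do this on a small made-up dataset: the participant column `id`, the toy values, and the object names `toy` and `df_date_sub` are assumptions for illustration and are not part of the original data.

```r
library(dplyr)

# Toy data: one row per beep, with a hypothetical participant identifier `id`
toy <- tibble(
  id = c(1, 1, 2, 2, 2, 3),
  sent = as.POSIXct(
    c("2024-01-01 09:00:00", "2024-01-01 15:00:00", "2024-01-01 10:00:00",
      "2024-01-02 10:00:00", "2024-01-02 18:00:00", "2024-01-03 12:00:00"),
    tz = "UTC"
  )
)

# Keep only participants 1 and 2, then count the beeps sent per day
df_date_sub <- toy %>%
  filter(id %in% c(1, 2)) %>%
  mutate(date = as.Date(sent)) %>%
  group_by(date) %>%
  summarise(n_sent = n())

print(df_date_sub)
```

The resulting data frame has the same `date` and `n_sent` columns as `df_date`, so it can be passed to the plotting call of Method 1 in the same way.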
Method 2: ‘heatcalendar_plot()’ function
The following plot is a heatmap divided by year and month, created using the heatcalendar_plot() function from the esmtools package. Again, the color represents the number of beeps per day. The function has the following arguments:
- .data: A dataframe that contains the time variable
- timevar: a timestamp variable name
library(esmtools)
heatcalendar_plot(data, "sent")
Method 3: ‘calendar_plot()’ function
We use the ‘calendar_plot()’ function from the esmtools package to create a calendar plot that displays each month of each year covered by the timestamp variable. In addition to the color legend, the plot displays the number of beeps sent each day. The function has the following arguments:
- .data: A dataframe that contains the time variable
- timevar: a timestamp variable name
- interval: specifies the time interval over which to group and display data in the calendar plot, with options “halfyear” (default) or “year”.
library(esmtools)
calendar_plot(data, "sent")
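The `interval` argument described above can be set explicitly. The sketch below is a hedged illustration on simulated data: the data frame `toy` and its random timestamps are made up for the example, and the call assumes that `calendar_plot()` accepts `interval = "year"` exactly as listed in the arguments.

```r
library(esmtools)

# Simulated beeps spread over 2023 (illustrative data only)
set.seed(1)
toy <- data.frame(
  sent = as.POSIXct("2023-01-01 08:00:00", tz = "UTC") +
    sort(sample.int(364 * 24 * 3600, 500))
)

# Group and display the calendar by year instead of the default half-year
calendar_plot(toy, "sent", interval = "year")
```

With the default "halfyear" setting the same data would be grouped into half-year periods, which may be easier to read for long studies.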