John Burn-Murdoch of the Financial Times published a chart in March 2022 showing the development of confirmed Covid-19 cases and deaths from Covid-19 in Hong Kong and New Zealand. The chart indicates the effectiveness of vaccines against Covid-19.
The chart will be our example to create the typical Financial Times theme for graphs with {ggplot2} in R.
NEW: I’m not sure people appreciate quite how bad the Covid situation is in Hong Kong, nor what might be around the corner.
— John Burn-Murdoch (@jburnmurdoch) March 14, 2022
First, an astonishing chart.
After keeping Covid at bay for two years, Omicron has hit HK and New Zealand, but the outcomes could not be more different.
R packages used:
library(tidyverse)
library(lubridate)
library(ggtext)
library(grid)
library(glue)
library(zoo)
The underlying data for the FT chart is provided by Our World in Data. We use the same date range as in the original plot.
# Download OWID data
owid_new_deaths_per_million_url <- "https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/jhu/new_deaths_per_million.csv"
covid_deaths <- read_csv(owid_new_deaths_per_million_url)
owid_new_cases_per_million_url <- "https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/jhu/new_cases_per_million.csv"
covid_cases <- read_csv(owid_new_cases_per_million_url)
# reporting period in plot
start_date <- as_date("2022-02-01")
end_date <- as_date("2022-03-11")
We prepare the data and keep the data for New Zealand and Hong Kong. In order to highlight the relationship between Covid-19 cases and deaths, the dates of the cases are shifted forward by 14 days.
# Function to prepare the datasets for confirmed cases and deaths
prep_long_data <- function(df,
start = start_date, end = end_date,
metric = c("cases", "deaths")) {
df %>%
pivot_longer(cols = -date, names_to = "region", values_to = "new_per_million") %>%
filter(region %in% c("New Zealand", "Hong Kong")) %>%
mutate(new_per_million = replace_na(new_per_million, 0),
new_per_100k = new_per_million / 10) %>%
group_by(region) %>%
mutate(new_per_100k_7drollmean = rollmean(new_per_100k, 7, fill = NA, align = "right"),
new_per_million_7drollmean = rollmean(new_per_million, 7, fill = NA, align = "right"),
metric = metric) %>%
ungroup() %>%
arrange(region, date)
}
# Lag to shift cases
lag_cases_deaths <- duration("14 days")
# Prepare the datasets and
# shift cases by x days to account for lag between infections and deaths
covid_df_long <- prep_long_data(covid_cases, metric = "cases") %>%
bind_rows(prep_long_data(covid_deaths, metric = "deaths")) %>%
mutate(date2 = if_else(metric == "cases", date + lag_cases_deaths, date)) %>%
filter(date2 >= start_date & date2 <= end_date)
The FT uses Metric as the main sans-serif font. Metric is a commercial font by Klim Type Foundry. A test version for non-commercial internal use can be downloaded from https://klim.co.nz/test-fonts/.
We will use Outfit for our chart. Outfit is available from Google Fonts.
When creating a new theme, using a standard theme shipped with {ggplot2} might be a good starting point. Then you add more and more details by modifying the components of the theme with a theme
function.
Here, we start with theme_minimal
and add typical theme elements which the FT charts are known for:
plot.background
)We choose a pretty larger font size since the plot will saved in a larger format. We also set strip.text
to blank to remove the default facet titles in this particular plot. Furthermore, we define colors for the red and blue areas as well as a separate text color.
# Colors
text_color <- "#68625D"
color_cases <- "#71C8E4"
color_cases_text <- "#258BC3"
color_deaths <- "#CE3240"
# Financial Times theme
theme_ft <- function(...) {
theme_minimal(base_family = "Outfit Medium", base_size = 16) +
theme(
plot.background = element_rect(color = NA, fill = "#FFF1E5"),
panel.background = element_rect(color = NA, fill = NA),
panel.grid = element_blank(),
panel.grid.major.y = element_line(color = "#E3DACE", size = 0.3),
text = element_text(color = text_color, lineheight = 1.3),
plot.title = element_textbox(color = "#040000", family = "Outfit Medium",
face = "plain", size = 20, width = 1),
plot.title.position = "plot",
plot.subtitle = element_markdown(family = "Outfit Medium"),
plot.caption = element_markdown(
family = "Outfit", hjust = 0, size = 11.5, color = "#5E5751"),
plot.caption.position = "plot",
axis.title = element_blank(),
axis.text.x = element_text(hjust = 0, color = text_color, size = 14),
axis.text.y.left = element_markdown(family = "Outfit Medium"),
axis.text.y.right = element_markdown(family = "Outfit Medium"),
axis.ticks.x = element_line(size = 0.3),
axis.ticks.length.x = unit(1.8, "mm"),
plot.margin = margin(t = 12, b = 2, l = 1, r = 1),
strip.text = element_blank() # remove default facet titles
)
}
We create our first plot and apply our theme_ft
function.
covid_df_long %>%
ggplot(aes(date2)) +
geom_area(data = . %>% filter(metric == "cases"),
aes(y = new_per_100k_7drollmean),
fill = "#71C8E4") +
geom_area(data = . %>% filter(metric == "deaths"),
aes(y = -2 * new_per_million_7drollmean),
fill = "#CE3240") +
coord_cartesian(ylim = c(-60, 100), clip = "off") +
facet_wrap(vars(region)) +
theme_ft()
The original plot only shows two labels for February and March on the date axis. More importantly, we have to format the y-axis so that a) the labels show position values in both directions (up = cases, down = deaths) and b) that the labels are colored according to the color of the areas. We apply the different colors by passing the result of an anonymous function to the label
argument of scale_y_continuous
. The right axis is set up by passing the shorthand function dup_axis()
to the sec.axis
argument.
By setting ylim
to c(-60, 100)
and clip = "off"
, we allow drawing the red area outside the panel area.
p <- covid_df_long %>%
ggplot(aes(date2)) +
geom_area(data = . %>% filter(metric == "cases"),
aes(y = new_per_100k_7drollmean),
fill = "#71C8E4") +
geom_area(data = . %>% filter(metric == "deaths"),
aes(y = -2 * new_per_million_7drollmean),
fill = "#CE3240") +
scale_x_date(breaks = as_date(c("2022-02-01", "2022-03-01")),
date_labels = "%b") +
scale_y_continuous(
breaks = seq(-200, 200, 20),
labels = function(x) {
value <- ifelse(x > 0, x, -x)
color <- case_when(x > 0 ~ color_cases_text, x < 0 ~ color_deaths, x == 0 ~ text_color)
glue::glue("<span style='color: {color}'>{value}</span>")
},
# add the right axis
sec.axis = dup_axis()) +
coord_cartesian(ylim = c(-60, 100), clip = "off") +
facet_wrap(vars(region)) +
theme_ft()
p
John Burn-Murdoch is a proponent of using text annotions to guide the audience and enable further exploration (see his talk at rstudio::global 2021 conference). Thus, the original plot contains a descriptive title and annotations comparing the situation in both countries.
We define the titles and the annotations in separate objects and add them to the plot in a second step. (This approach might also come in handy if you have to create titles in more than one language. It can be more convenient to set the titles in the graph with list variables and change the values of the variables for each language.) We use HTML tags and CSS to format the annotation texts, which will be added to the plot using geom_richtext()
from the {ggtext} package.
plot_titles <- list(
title = "Cases are translating into deaths at much higher rates in
Hong Kong than in New Zealand, where elderly vaccination rates are much higher ",
subtitle = glue("Daily
<span style='color:{color_cases_text}; font-family: \"Outfit SemiBold\"'>cases</span>
per 100,000 people, and daily
<span style='color:{color_deaths}; font-family: \"Outfit SemiBold\"'>deaths</span>
per 2 million"),
caption = "Cases shifted forward (14 days) to account for lag between infection and death.<br>
Original plot by John Burn-Murdoch (Financial Times).
Data: Johns Hopkins University, Our World in Data")
country_annotations <- data.frame(
region = c("Hong Kong", "New Zealand"),
label = c(
glue("<span style='color: black; font-size: 14pt; font-family: \"Outfit SemiBold\"'>
Hong Kong</span><br>
66% of over-80s unvaccinated<br>when Omicron took off<br>
<span style='color: {color_deaths}; font-family: \"Outfit SemiBold\"'>Case fatality
<br>rate: 4.7%</span>"),
glue("<span style='color: black; font-size: 14pt; font-family: \"Outfit SemiBold\"'>
New Zealand</span><br>
2% unvaccinated<br>
<span style='color: {color_deaths}; font-family: \"Outfit SemiBold\"'>CFR: 0.1%</span>")))
Now we add the titles and annotations to the plot:
p2 <- p +
# country annotation
geom_richtext(data = country_annotations,
aes(x = as_date("2022-02-01"), y = 107.5, label = label),
size = 4, label.size = 0, fill = NA, family = "Outfit Medium",
color = "#68625D", hjust = 0, vjust = 1, lineheight = 1.2
) +
labs(
title = plot_titles$title,
subtitle = plot_titles$subtitle,
caption = plot_titles$caption
)
p2
p2
# Add black thin rectangle in the top left corner
grid.rect(
x = 0, y = 1, width = 0.165, height = 0.018,
gp = gpar(fill = "black")
)
To save the plot to a png file, wrap png()
or ragg::agg_png()
around the printed plot object and the grid.rect()
call:
ragg::agg_png("24-ft.png", res = 300, width = 2800, height = 1616, units = "px")
p2
# Add black thin rectangle in the top left corner
grid.rect(
x = 0, y = 1, width = 0.165, height = 0.018,
gp = gpar(fill = "black")
)
invisible(dev.off())