Visualize Temporal Changes of Multiple Groups Using Slope Graph in ggplot2
Slopegraphs are minimalist and efficient presentations of your data that can simultaneously convey the relative rankings, the actual numeric values and group labels, and the changes and directionality of the data over time.
The flagship function newggslopegraph() of the CGPfunctions package by Chuck Powell is a ggplot2 wrapper that creates such elegant slopegraphs. It takes a dataframe as input, plus three of its columns as the arguments Times (mapped to the x-axis), Measurement (the y-axis), and Grouping (the color), to draw the plot.
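As a minimal sketch of that call shape, the toy example below builds a small data frame and passes three of its columns to newggslopegraph(). The data and the column names (period, score, team) are made up for illustration and do not come from the package.

```r
library(CGPfunctions)

# Hypothetical toy data: three groups measured at two time points
toy <- data.frame(
  period = factor(rep(c("Before", "After"), each = 3),
                  levels = c("Before", "After"), ordered = TRUE),
  score  = c(10, 20, 30, 15, 18, 35),
  team   = rep(c("A", "B", "C"), times = 2)
)

# Times -> x-axis, Measurement -> y-axis, Grouping -> one line per group
p_toy <- newggslopegraph(toy,
                         Times = period,
                         Measurement = score,
                         Grouping = team)
p_toy
```

Because the title, subtitle, and caption are left at their defaults here, the plot carries placeholder text; the examples that follow set or remove them explicitly.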
Below we’ll create four plots to demonstrate this useful function.
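Visualize cancer survival rate
library(tidyverse)
library(CGPfunctions)
library(RColorBrewer)
as_tibble(newcancer)
Output:
# A tibble: 96 × 3
  Year    Type     Survival
  <ord>   <fct>       <dbl>
1 5 Year  Prostate       99
2 10 Year Prostate       95
3 15 Year Prostate       87
4 20 Year Prostate       81
5 5 Year  Thyroid        96
# ℹ 91 more rows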
p1 <- newggslopegraph(newcancer, # dataframe
                      Year,      # Times, the 'x'
                      Survival,  # Measurement, the 'y'
                      Type,      # Grouping, the 'color'
                      Title = "the Rate of Cancer Survival",
                      # remove the default subtitle and caption
                      SubTitle = NULL, Caption = NULL)
p1
Visualize the increase of life expectancy in Europe
In this graphic, we use the gapminder dataset from the gapminder package. Here we’ll demonstrate two more aspects of the newggslopegraph() function:
The variable mapped to the x-axis (the Times argument) should be an ordered factor, so that the time points are drawn in the intended order. (Factor level order is also the key to rearranging graphic elements in ggplot2, such as bars, facet panels, and legend keys.)
Because the output is a ggplot2 object, further ggplot2 functions can be added to it for continued plot customization.
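The ordered-factor conversion can be sketched in base R alone. The vectors below are made-up illustrations, not the gapminder data. The sketch shows that the factor levels are sorted ascending by default, and that each value maps to an integer position, which is what a discrete x-axis uses for placement.

```r
# Years stored as numbers, in arbitrary row order (illustrative values)
year <- c(1972, 1952, 1962, 1952, 1972)

# Convert to an ordered factor; levels are sorted ascending by default
year_ord <- factor(year, ordered = TRUE)

is.ordered(year_ord)   # TRUE
levels(year_ord)       # "1952" "1962" "1972"

# Integer position of each value among the levels
as.integer(year_ord)   # 3 1 2 1 3

# An explicit `levels` argument sets a custom order when the default
# alphabetical or numeric sort is not the one wanted
qtr <- factor(c("Q3", "Q1", "Q2"),
              levels = c("Q1", "Q2", "Q3"),
              ordered = TRUE)
levels(qtr)            # "Q1" "Q2" "Q3"
```

These integer positions are why annotations and vertical lines on a slopegraph can be placed with numeric x values such as 1, 2, or 5.3.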
library(gapminder)
g <- gapminder %>%
  filter(continent %in% "Europe") %>%
  filter(year %in% seq(1952, 2007, 10)) %>%
  # convert the x-axis variable (the 'Times' argument) to ordered factor
  mutate(year = factor(year, ordered = TRUE),
         lifeExp = round(lifeExp))
g
Output:
# A tibble: 180 × 6
  country continent year  lifeExp     pop gdpPercap
  <fct>   <fct>     <ord>   <dbl>   <int>     <dbl>
1 Albania Europe    1952       55 1282697     1601.
2 Albania Europe    1962       65 1728137     2313.
3 Albania Europe    1972       68 2263554     3313.
4 Albania Europe    1982       70 2780097     3631.
5 Albania Europe    1992       72 3326498     2497.
# ℹ 175 more rows
# create as many colors as the number of countries in Europe
myColors <- colorRampPalette(brewer.pal(11, "Spectral"))(n_distinct(g$country))
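The "Spectral" palette in RColorBrewer offers at most 11 colors, so colorRampPalette() is used to interpolate as many colors as there are groups. The sketch below illustrates the interpolation step with base R only; the three hex anchor colors are hand-picked for illustration and stand in for the brewer.pal() output.

```r
# Illustrative anchor colors (assumption: stand-ins for brewer.pal() output)
anchors <- c("#D53E4F", "#FFFFBF", "#3288BD")

# colorRampPalette() returns a function that interpolates n colors
make_palette <- colorRampPalette(anchors)

# Request one color per group, e.g. 30 groups
cols <- make_palette(30)
length(cols)  # 30

# The first and last colors coincide with the outer anchors
cols[c(1, 30)]
```

Passing a vector with one color per group to LineColor then gives every line its own shade along the ramp.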
p2 <- g %>%
  # create slope plot
  newggslopegraph(year, lifeExp, country,
                  DataTextSize = 3, # size of the text (life expectancy)
                  LineColor = myColors, WiderLabels = TRUE) +
  # add vertical lines to mark the years
  geom_vline(xintercept = 1:5, linetype = "dashed", color = "snow2") +
  # remove plot title, subtitle, and caption
  labs(title = NULL, subtitle = NULL, caption = NULL) +
  # annotate with plot title at graphic bottom right corner
  annotate(geom = "text", label = "Life Expectancy\nIn Europe",
           # x-axis registered as 1, 2, 3, etc. under the hood
           x = 5.3, y = 48, size = 8, fontface = "bold", color = "snow4")
p2
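Visualize Johnson & Johnson's quarterly earnings
# install.packages("TSstudio")
library(TSstudio)
J <- JohnsonJohnson %>%
  TSstudio::ts_reshape() %>%
  as_tibble() %>%
  pivot_longer(-quarter, names_to = "year", values_to = "sales") %>%
  mutate(quarter = paste0("Q", quarter) %>% factor(ordered = TRUE),
         sales = round(sales, 1))
J
# create as many colors as the number of years
myColors2 <- colorRampPalette(brewer.pal(9, "YlGn"))(n_distinct(J$year))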
p3 <- J %>%
  newggslopegraph(Times = quarter, Measurement = sales, Grouping = year,
                  DataTextSize = 3, LineColor = myColors2) +
  labs(title = "Johnson & Johnson Earnings ($) per share",
       subtitle = NULL,
       caption = "Shumway, R. H. and Stoffer, D. S. (2000) Time Series Analysis and its Applications. Second Edition. Springer")
p3
Visualize indometh pharmacokinetics
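This plot shows how quickly the drug is cleared from the blood after intravenous administration in six subjects.
as_tibble(Indometh)
p4 <- Indometh %>%
  mutate(time = factor(time, ordered = TRUE)) %>%
  newggslopegraph(time, conc, Subject) +
  labs(title = NULL, subtitle = NULL, caption = NULL) +
  theme_classic() +
  theme(legend.position = "none") +
  scale_x_discrete(name = "Time (h)") +
  scale_y_continuous(breaks = seq(0, 3, .5), name = "mcg/mL") +
  annotate(geom = "text", label = "Pharmacokinetics\nof indometacin",
           x = 7, y = 1.8, size = 8, color = "snow4", fontface = "bold")
p4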
Combine all plots together
library(patchwork)
# create a function adding a 20 pt margin on every side of a plot
f <- function(p) {
  p <- p + theme(plot.margin = margin(20, 20, 20, 20))
  return(p)
}
# combine the plots: `|` places plots side by side, `/` stacks rows
(f(p1) | f(p2)) / (f(p3) | f(p4))
# save the combined plots
ggsave(filename = "slopePlot_completed.png",
       path = "graphics",
       height = 10, width = 11)
# e.g. 1. Visualize cancer survival rate
library(tidyverse)
library(CGPfunctions)
library(RColorBrewer)

as_tibble(newcancer)

p1 <- newggslopegraph(newcancer, # dataframe
                      Year,      # Times, the 'x'
                      Survival,  # Measurement, the 'y'
                      Type,      # Grouping, the 'color'
                      Title = "the Rate of Cancer Survival",
                      # remove the default subtitle and caption
                      SubTitle = NULL, Caption = NULL)
p1

# ---------------------------------------------------------------
# e.g. 2. Visualize increase of life expectancy in Europe
library(gapminder)

g <- gapminder %>%
  filter(continent %in% "Europe") %>%
  filter(year %in% seq(1952, 2007, 10)) %>%
  # convert the x-axis variable (the 'Times' argument) to ordered factor
  mutate(year = factor(year, ordered = TRUE),
         lifeExp = round(lifeExp))
g

# create as many colors as the number of countries in Europe
myColors <- colorRampPalette(brewer.pal(11, "Spectral"))(n_distinct(g$country))

p2 <- g %>%
  # create slope plot
  newggslopegraph(year, lifeExp, country,
                  DataTextSize = 3, # size of the text (life expectancy)
                  LineColor = myColors, WiderLabels = TRUE) +
  # add vertical lines to mark the years
  geom_vline(xintercept = 1:5, linetype = "dashed", color = "snow2") +
  # remove plot title, subtitle, and caption
  labs(title = NULL, subtitle = NULL, caption = NULL) +
  # annotate with plot title at graphic bottom right corner
  annotate(geom = "text", label = "Life Expectancy\nIn Europe",
           # x-axis registered as 1, 2, 3, etc. under the hood
           x = 5.3, y = 48, size = 8, fontface = "bold", color = "snow4")
p2

# ---------------------------------------------------------------
# e.g. 3. Visualize JohnsonJohnson's quarterly sales
# install.packages("TSstudio")
library(TSstudio)

J <- JohnsonJohnson %>%
  TSstudio::ts_reshape() %>%
  as_tibble() %>%
  pivot_longer(-quarter, names_to = "year", values_to = "sales") %>%
  mutate(quarter = paste0("Q", quarter) %>% factor(ordered = TRUE),
         sales = round(sales, 1))
J

# create as many colors as the number of years
myColors2 <- colorRampPalette(brewer.pal(9, "YlGn"))(n_distinct(J$year))

p3 <- J %>%
  newggslopegraph(Times = quarter, Measurement = sales, Grouping = year,
                  DataTextSize = 3, LineColor = myColors2) +
  labs(title = "Johnson & Johnson Earnings ($) per share",
       subtitle = NULL,
       caption = "Shumway, R. H. and Stoffer, D. S. (2000) Time Series Analysis and its Applications. Second Edition. Springer")
p3

# ---------------------------------------------------------------
# e.g. 4. Visualize indometh pharmacokinetics
as_tibble(Indometh)

p4 <- Indometh %>%
  mutate(time = factor(time, ordered = TRUE)) %>%
  newggslopegraph(time, conc, Subject) +
  labs(title = NULL, subtitle = NULL, caption = NULL) +
  theme_classic() +
  theme(legend.position = "none") +
  scale_x_discrete(name = "Time (h)") +
  scale_y_continuous(breaks = seq(0, 3, .5), name = "mcg/mL") +
  annotate(geom = "text", label = "Pharmacokinetics\nof indometacin",
           x = 7, y = 1.8, size = 8, color = "snow4", fontface = "bold")
p4

# ---------------------------------------------------------------
# Combine all plots together.
library(patchwork)

# create a function adding a 20 pt margin on every side of a plot
f <- function(p) {
  p <- p + theme(plot.margin = margin(20, 20, 20, 20))
  return(p)
}

# combine the plots: `|` places plots side by side, `/` stacks rows
(f(p1) | f(p2)) / (f(p3) | f(p4))

# save the combined plots
ggsave(filename = "slopePlot_completed.pdf",
       path = "graphics",
       height = 10, width = 11)