Create Faceted Arrow Plots in ggplot2 to Visualize Women’s Seats Change in National Parliaments
This article visualizes the changes in the proportion of seats held by women in national parliaments from 2000 to 2020. This work features a high level of plot customization. Major technical highlights covered in this article include:
Draw arrows.
Aesthetics inheritance.
Create unique annotations in selected faceted panels (subplots).
Annotation alignment and plot margin control.
This work is a ggplot2 reproduction of the graphic by Datawrapper.
The dataset is sourced from the World Bank, and can be downloaded here.
library(ggplot2)
library(dplyr)
theme_set(theme_minimal())

d <- read.csv("/Users/boyuan/Desktop/R/gallery/DATASETS/women in parliament.csv")

d2 <- d %>%
  as_tibble() %>%
  # Is the percentage in 2020 higher than in 2000?
  # If true, the arrow is plotted in blue, otherwise in red
  mutate(increased = percent_2020 > percent_2000) %>%
  arrange(percent_2020)

# country as an ordered factor based on the percentage of seats held by women in 2020
d2$country <- factor(d2$country, levels = d2$country, ordered = T)

head(d2, n = 3)
Output:
# A tibble: 3 × 5
  country  percent_2000 percent_2020 region                increased
  <ord>           <dbl>        <dbl> <chr>                 <lgl>
1 Japan            7.29         9.89 East Asia & Pacific   TRUE
2 Malaysia        10.4         14.4  East Asia & Pacific   TRUE
3 Turkey           4.18        17.3  Europe & Central Asia TRUE
Visualization
1. Create arrows to indicate the changes in the percentage of seats held by women from 2000 (start of arrows) to 2020 (end of arrows).
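As a minimal, self-contained sketch of this step, the code below draws the arrows with geom_segment() and arrow(), using the same aesthetic mapping the article's complete script uses for p1. Because the original CSV file is not available here, the data are an assumption: a three-row toy tibble (`toy`) built from the values printed in the head(d2) output above. The names `toy` and `p_arrows` are illustrative and not part of the original code.

```r
library(ggplot2)
library(dplyr)

# Toy data (assumption): the three rows printed by head(d2), as rounded in that output
toy <- tibble(
  country      = c("Japan", "Malaysia", "Turkey"),
  percent_2000 = c(7.29, 10.4, 4.18),
  percent_2020 = c(9.89, 14.4, 17.3),
  region       = c("East Asia & Pacific", "East Asia & Pacific",
                   "Europe & Central Asia")
) %>%
  mutate(increased = percent_2020 > percent_2000) %>%
  arrange(percent_2020)

# Order countries by their 2020 percentage, as in the data preparation step
toy$country <- factor(toy$country, levels = toy$country, ordered = TRUE)

# Each arrow starts at the 2000 value (y) and ends at the 2020 value (yend)
p_arrows <- ggplot(toy, aes(x = country, xend = country,
                            y = percent_2000, yend = percent_2020,
                            color = increased)) +
  geom_segment(arrow = arrow(length = unit(8, "pt")), linewidth = 1)
p_arrows
```

Setting x and xend to the same variable keeps each arrow on its own country, so only the percentage changes along the arrow. The linewidth argument assumes ggplot2 3.4.0 or later.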
2. Facet the plots based on the region variable, and flip the axes. Here expand = 0 removes the default margins around the plot, and helps to align the subplot titles and axis labels (manually added at step 3) with the plot title (added at step 8). The plot margins will be added back later using the theme() function (at step 8) while retaining the alignment.
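The faceting and axis flip can be sketched as follows, again on an assumed three-row toy tibble taken from the head(d2) output rather than the full dataset. Two details differ from the article's script and are assumptions of this sketch: expand = FALSE is written out as the logical equivalent of expand = 0, and the colors are given as a named vector, because the toy data contain only increases and an unnamed vector would assign the first color (red) to TRUE. The names `toy` and `p_facets` are illustrative.

```r
library(ggplot2)
library(dplyr)

# Toy data (assumption): the three rows printed by head(d2)
toy <- tibble(
  country      = c("Japan", "Malaysia", "Turkey"),
  percent_2000 = c(7.29, 10.4, 4.18),
  percent_2020 = c(9.89, 14.4, 17.3),
  region       = c("East Asia & Pacific", "East Asia & Pacific",
                   "Europe & Central Asia")
) %>%
  mutate(increased = percent_2020 > percent_2000) %>%
  arrange(percent_2020)
toy$country <- factor(toy$country, levels = toy$country, ordered = TRUE)

p_facets <- ggplot(toy, aes(x = country, xend = country,
                            y = percent_2000, yend = percent_2020,
                            color = increased)) +
  geom_segment(arrow = arrow(length = unit(8, "pt")), linewidth = 1) +
  # one panel per region, stacked vertically; panel height follows the number of countries
  facet_grid(region ~ ., scales = "free_y", space = "free_y") +
  # named values keep blue tied to increases even when no decrease is present
  scale_color_manual(values = c("FALSE" = "firebrick4", "TRUE" = "steelblue4")) +
  # flip the axes, drop the default scale expansion, and allow drawing outside the panels
  coord_flip(expand = FALSE, clip = "off")
p_facets
```

clip = "off" matters for the later steps, where text is anchored at y = -20, outside the panel area.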
3. Add subplot titles and x-axis (left) labels manually. Here we use geom_text() to create text that is left-justified (hjust = 0) at the anchoring point y = -20. We’ll remove the default text later at step 4. In the first geom_text() layer (for subplot titles), we create a new input dataset using the tibble() function. The region variable is not only mapped to the label aesthetic, but also serves as the faceting variable that determines which subplot each title is added to. In addition, we use inherit.aes = F to prevent this geom from inheriting the aesthetics set in the ggplot() line.
p3 <- p2 +
  # add faceted panel (subplot) titles
  geom_text(
    data = tibble(region = c("East Asia & Pacific", "Europe & Central Asia")),
    aes(x = c(7, 10), y = -20, label = region),
    hjust = 0, fontface = "bold", size = 4,
    inherit.aes = F) +
  # add x-axis (country) labels
  geom_text(aes(y = -20, label = country), color = "black", hjust = 0) +
  # increase margin between the two subplots
  theme(panel.spacing.y = unit(30, "pt"))
p3
4. Remove the redundant default facet titles and x-axis (left) labels. Note that after the coordinate flip, the aesthetic mapping remains unchanged, and the country variable (left vertical axis) remains the x aesthetic (e.g., as used in labs()). In contrast, in the theme() syntax, the horizontal axis is always treated as the x-axis, and the vertical axis as the y-axis, regardless of whether the coordinates are flipped.
p4 <- p3 +
  theme(strip.text = element_blank(),    # remove default facet titles
        axis.text.y = element_blank()) + # remove default country labels
  labs(x = NULL) # remove the country title
p4
5. Revise the y-axis (bottom) labels, and remove its title. After coordinate flip, the “percent_2000” variable remains the y aesthetic (e.g., as used in scale_y_continuous()), but is treated as the x-axis in the theme() syntax (e.g., axis.text.x).
# breaks
p5 <- p4 +
  scale_y_continuous(
    breaks = seq(0, 50, 10),
    labels = function(x){paste0(x, "%")},
    minor_breaks = NULL, # remove the vertical minor grids
    name = NULL) +       # remove the title
  theme(
    # thinner horizontal grids
    panel.grid.major.y = element_line(linewidth = .2),
    # increase the top margin of the bottom-axis labels
    axis.text.x = element_text(margin = margin(t = 15)))
p5
6. Add labels showing the percent of seats held by women in 2020. Here we use ifelse() to determine the labels’ y aesthetic depending on the direction of the percent change from 2000 to 2020.
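The label placement can be sketched on its own, using the same ifelse() expression that the article's complete script uses for p6. The two rows below are hypothetical values (not World Bank data), chosen so that one country increases and one decreases. The names `toy`, `label_y`, `label`, and `p_labels` are illustrative.

```r
library(ggplot2)
library(dplyr)

# Hypothetical values (assumption, not from the dataset): one increase, one small decrease
toy <- tibble(
  country      = factor(c("A", "B")),
  percent_2000 = c(10, 30),
  percent_2020 = c(25, 28)
) %>%
  mutate(
    increased = percent_2020 > percent_2000,
    # increase: label sits 4 units right of the arrowhead (the 2020 value);
    # decrease: label sits 6 units left of the arrow start (the 2000 value)
    label_y = ifelse(increased, percent_2020 + 4, percent_2000 - 6),
    label   = round(percent_2020, 1) %>% paste0("%")
  )

p_labels <- ggplot(toy, aes(x = country, xend = country,
                            y = percent_2000, yend = percent_2020,
                            color = increased)) +
  geom_segment(arrow = arrow(length = unit(8, "pt")), linewidth = 1) +
  geom_text(aes(y = label_y, label = label)) +
  coord_flip()
p_labels
```

Note that for a decrease the offset is taken from percent_2000, so the label clears the arrowhead only when the drop is smaller than the 6-unit offset, as it is for country B here.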
7. Mark the start of the arrow as year 2000, and end of the arrow as 2020, using the first arrow for illustration. In the new input dataset, the region variable serves as the faceting variable, similar to step 3. As region is assigned here with only one level “East Asia & Pacific”, the text is displayed only in the first subplot.
p7 <- p6 +
  geom_text(
    data = tibble(region = "East Asia & Pacific",
                  x = 7, y = c(31, 40.8),
                  label = c("2000 |", "| 2020")),
    aes(x = x, y = y, label = label),
    inherit.aes = F, hjust = c(1, 0), fontface = "bold", size = 4,
    color = c("darkgreen", "orange3"))
p7
8. Final polish-up. Here we add the plot title, which is nicely aligned to the left with the manually added subplot titles and country names. Accurate alignment was made possible by removing the plot margin at the earlier step 2. After adding the plot title, we then manually add the margin back here while retaining the alignment.
p8 <- p7 +
  # add and customize plot title
  ggtitle("Most countries have a higher share of women\nin their national parliaments than twenty years ago") +
  theme(legend.position = "none") +
  theme(plot.title = element_text(face = "bold", size = 16,
                                  margin = margin(b = 30)),
        # add 20 units of margin to the four sides of the plot (top, right, bottom, left)
        plot.margin = margin(20, 20, 20, 20))
p8
library(ggplot2)
library(dplyr)
theme_set(theme_minimal())

d <- read.csv("/Users/boyuan/Desktop/R/gallery/DATASETS/women in parliament.csv")

d2 <- d %>%
  # Is the percentage in 2020 higher than in 2000?
  # If true, the arrow is plotted in blue, otherwise in red
  mutate(increased = percent_2020 > percent_2000) %>%
  arrange(percent_2020)

# country as an ordered factor based on the percentage of seats held by women in 2020
d2$country <- factor(d2$country, levels = d2$country, ordered = T)

head(d2, n = 3)

# Create arrows to indicate the changes in the percentage of seats from 2000 (start of arrows) to 2020 (end of arrows).
p1 <- d2 %>%
  ggplot(aes(x = country, xend = country,
             y = percent_2000, yend = percent_2020,
             color = increased)) +
  geom_segment(arrow = arrow(length = unit(8, "pt")), linewidth = 1)
p1

# Facet the plots based on `region`, and flip the axes.
p2 <- p1 +
  facet_grid(region ~ ., scales = "free_y", space = "free_y") +
  scale_color_manual(values = c("firebrick4", "steelblue4")) +
  coord_flip(expand = 0, clip = "off")
p2

# Add panel titles and x-axis labels manually
p3 <- p2 +
  # add faceted panel titles
  geom_text(
    data = tibble(region = c("East Asia & Pacific", "Europe & Central Asia")),
    aes(x = c(7, 10), y = -20, label = region),
    hjust = 0, fontface = "bold", size = 4,
    inherit.aes = F) +
  # add x-axis (country) labels
  geom_text(aes(y = -20, label = country), color = "black", hjust = 0) +
  # increase margin between the two subplots
  theme(panel.spacing.y = unit(30, "pt"))
p3

# Remove the duplicated old (default) facet titles and x-axis (country) labels.
p4 <- p3 +
  theme(strip.text = element_blank(),    # remove default facet titles
        axis.text.y = element_blank()) + # remove default country labels
  labs(x = NULL) # remove the country title
p4

# Revise the y-axis (percentage) labels, and remove its title.
p5 <- p4 +
  scale_y_continuous(
    breaks = seq(0, 50, 10),
    labels = function(x){paste0(x, "%")},
    minor_breaks = NULL, # remove the vertical minor grids
    name = NULL) +       # remove the title
  theme(
    # thinner horizontal grids
    panel.grid.major.y = element_line(linewidth = .2),
    # increase the top margin of the axis labels
    axis.text.x = element_text(margin = margin(t = 15)))
p5

# Display the percentage of seats held by women in 2020.
p6 <- p5 +
  geom_text(aes(y = ifelse(increased == T, percent_2020 + 4, percent_2000 - 6),
                label = round(percent_2020, 1) %>% paste0("%")))
p6

# Mark the start of the arrow as year 2000, and end of the arrow as 2020
p7 <- p6 +
  geom_text(
    data = tibble(region = "East Asia & Pacific",
                  x = 7, y = c(31, 40.8),
                  label = c("2000 |", "| 2020")),
    aes(x = x, y = y, label = label),
    inherit.aes = F, hjust = c(1, 0), fontface = "bold", size = 4,
    color = c("darkgreen", "orange3"))
p7

# Final polish-up.
p8 <- p7 +
  ggtitle("Most countries have a higher share of women\nin their national parliaments than twenty years ago") +
  theme(legend.position = "none") +
  theme(plot.title = element_text(face = "bold", size = 16,
                                  margin = margin(b = 30)),
        # add 20 units of margin to the four sides of the plot (top, right, bottom, left)
        plot.margin = margin(20, 20, 20, 20))
p8