library(tidyverse)
library(usdata)
library(ggbeeswarm)
gerrymander <- usdata::gerrymander # force df to appear in environment

AE 05: Gerrymandering + data exploration II
Suggested answers
These are suggested answers. This document should be used as a reference only; it is not designed to be an exhaustive key.
Getting started
Packages
We’ll use the tidyverse package for this analysis, along with ggbeeswarm for the beeswarm plot.
Data
The data are available in the usdata package.
glimpse(gerrymander)

Rows: 435
Columns: 12
$ district <chr> "AK-AL", "AL-01", "AL-02", "AL-03", "AL-04", "AL-05", "AL-0…
$ last_name <chr> "Young", "Byrne", "Roby", "Rogers", "Aderholt", "Brooks", "…
$ first_name <chr> "Don", "Bradley", "Martha", "Mike D.", "Rob", "Mo", "Gary",…
$ party16 <chr> "R", "R", "R", "R", "R", "R", "R", "D", "R", "R", "R", "R",…
$ clinton16 <dbl> 37.6, 34.1, 33.0, 32.3, 17.4, 31.3, 26.1, 69.8, 30.2, 41.7,…
$ trump16 <dbl> 52.8, 63.5, 64.9, 65.3, 80.4, 64.7, 70.8, 28.6, 65.0, 52.4,…
$ dem16 <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0,…
$ state <chr> "AK", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AR", "AR",…
$ party18 <chr> "R", "R", "R", "R", "R", "R", "R", "D", "R", "R", "R", "R",…
$ dem18 <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0,…
$ flip18 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ gerry <fct> mid, high, high, high, high, high, high, high, mid, mid, mi…
Congressional districts per state
Which state has the most congressional districts? How many congressional districts are there in this state?
California, with 53 congressional districts in the state.
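The answer above can be checked by counting rows per state, since each row of the data is one congressional district. This is a minimal sketch, assuming the `gerrymander` data frame from usdata as shown in the `glimpse()` output; the object name `districts_per_state` is introduced here only for illustration.

```r
library(dplyr)
library(usdata)

# Each row is one district, so counting rows per state gives the
# number of congressional districts; sort = TRUE puts the largest first.
districts_per_state <- usdata::gerrymander |>
  count(state, sort = TRUE)

# The first row is the state with the most districts.
slice_head(districts_per_state, n = 1)
```

The `state` column holds two-letter abbreviations, so California appears as `"CA"`.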
Gerrymandering and flipping
Is a Congressional District more likely to be flipped to a Democratic seat if it has high prevalence of gerrymandering or low prevalence of gerrymandering? Support your answer with a visualization and summary statistics.
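Based on the plot and summary table below, a Congressional District is more likely to be flipped to a Democratic seat if it has a low prevalence of gerrymandering: about 12.9% of low-gerrymandering districts flipped to a Democratic seat (flip18 = 1), compared with about 4.9% of high-gerrymandering districts.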
gerrymander |>
  mutate(flip18 = as_factor(flip18)) |>
  ggplot(aes(x = gerry, fill = flip18)) +
  geom_bar(position = "fill") +
  labs(title = "Level of gerrymandering by 'flip' status",
       x = "Level of gerrymandering preceding the 2018 House election",
       y = "Proportion of observations",
       fill = "'Flip' status") +
  theme_minimal()

# A tibble: 8 × 4
# Groups:   gerry [3]
  gerry flip18     n   prop
  <fct>  <dbl> <int>  <dbl>
1 low       -1     2 0.0323
2 low        0    52 0.839
3 low        1     8 0.129
4 mid       -1     3 0.0111
5 mid        0   242 0.896
6 mid        1    25 0.0926
7 high       0    98 0.951
8 high       1     5 0.0485
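The code that produced the summary table is not shown in this document. One way to compute the same kind of table is sketched below: count districts for each combination of `gerry` and `flip18`, then convert the counts to proportions within each gerrymandering level. The object name `flip_summary` is an assumption made for illustration, and this may not be the exact code the authors used.

```r
library(dplyr)
library(usdata)

# Count districts for each gerrymandering level and flip status,
# then compute each count's share within its gerrymandering level.
flip_summary <- usdata::gerrymander |>
  count(gerry, flip18) |>
  group_by(gerry) |>
  mutate(prop = n / sum(n))

flip_summary
```

Grouping by `gerry` before `mutate()` is what makes `prop` sum to 1 within each level, which matches the proportions displayed by `geom_bar(position = "fill")`.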
Aesthetic mappings
Recreate the following visualization, and then improve it.

library(scales) # provides label_percent(), used in the improved plot below
## Recreate
ggplot(gerrymander, aes(x = gerry, y = clinton16)) +
  geom_beeswarm(alpha = .5, color = "grey") +
  geom_boxplot(aes(color = gerry), alpha = .5) +
  theme_minimal()

## Improve
gerrymander |>
  ggplot(aes(x = gerry, y = clinton16)) +
  geom_beeswarm(alpha = .5, color = "grey") +
  geom_boxplot(aes(color = gerry), alpha = .5) +
  theme_minimal() +
  scale_y_continuous(labels = label_percent(scale = 1)) +
  labs(title = "Distribution of % Vote for Clinton in 2016",
       subtitle = "By Level of Gerrymandering",
       x = "Level of Gerrymandering",
       y = "% Vote for Clinton in 2016") +
  theme(legend.position = "none")