Gerrymandering-Graphs.Rmd

---
title: "Gerrymandering-Graphs"
author: "Max Pagan"
date: "7/26/2022"
output: html_document
---

# Gerrymandering Project

## Importing the dataset

```{r setup, include=TRUE}
library(tidyverse)
voting_data <- read.csv("~/Downloads/1976-2020-house.csv")
```

## Extracting the necessary the data. 

Which year do you want to look at?

```{r cleanup, include=TRUE}
year.input <- readline(prompt="Enter a year between 1976 and 2020: ")
voting_data_clean <- voting_data  %>% filter(year == year.input) %>% select(state_po, district, candidate, party, candidatevotes) 
```

## How many districts are there per state?

```{r districts, include=TRUE}
num_districts <- voting_data_clean %>%  group_by(state_po) %>% summarise(district_total = max(district)) %>% arrange(district_total)

num_districts <- num_districts%>% mutate_all(funs(replace(., .== 0, 1)))

voting_data_clean <- voting_data_clean %>% group_by(state_po) %>% mutate(num_districts = max(district)) %>% mutate_all(funs(replace(., .== 0, 1))) %>% ungroup()
```

```{r proportional info, include=TRUE}
voting_data_clean <- voting_data_clean %>% group_by(state_po) %>% mutate(totalvotes = sum(candidatevotes)) %>% ungroup()

voting_data_clean <- voting_data_clean %>% group_by(state_po, party) %>% mutate(partyvotes = sum(candidatevotes)) %>% mutate(proportion = partyvotes / totalvotes) %>% mutate(idealNumSeats = round((proportion * num_districts), digits = 0))

proportional_districts <- voting_data_clean %>% group_by(state_po, party) %>% select(state_po, party, partyvotes, idealNumSeats)

finalframe <- proportional_districts %>% distinct(state_po, party, partyvotes, idealNumSeats)

```

We'll get rid of the parties that should win 0 seats

```{r proportional info, include=TRUE}
finalframe2 <- finalframe %>% filter(idealNumSeats != 0) %>% replace(.=="", "NO PARTY LISTED")
```

Now that I have the ideal proportional division of house seats per party in each state, let's take a look at the real numbers.

## Real results

```{r proportional info, include=TRUE}
clean_data_2 <- voting_data_clean %>% select(state_po, district, candidate, party, candidatevotes) %>% group_by(state_po, district) %>% mutate(winner=ifelse(candidatevotes==max(candidatevotes, na.rm=TRUE),T,F))

clean_data_3 <- clean_data_2 %>% filter(winner == T)

num_seats_district <- clean_data_3 %>% select(state_po, party) %>% group_by(state_po, party) %>% mutate(numseats=n())
num_seats_state <- num_seats_district %>% select(state_po, party, numseats) %>% distinct(state_po, party, numseats)
```

Now I have two useful tables: one that gives us the number of seats each party currently has (num_seats_state), and one that gives us the number of seats each party "deserves" based on the number of votes it received (finalframe2).

## Creating useful graphs

```{r pt1, include=TRUE}
state.input <- readline(prompt="Enter 2-Letter State Postal Code: ")

this_state_ideal <- finalframe2 %>% arrange(match(party, c("DEMOCRAT", "REPUBLICAN"))) %>% mutate(party = factor(x = party, levels = party))   %>% filter(state_po == state.input)  
this_state_current <- num_seats_state %>% arrange(match(party, c("DEMOCRAT", "REPUBLICAN"))) %>% mutate(party = factor(x = party, levels = party))   %>% filter(state_po == state.input) 

this_state_ideal
this_state_current
hsize <- 2

plot1 <- ggplot(data = this_state_ideal, aes(x = hsize, y = partyvotes, fill = party)) + 
    geom_bar(stat = "identity", position = position_fill()) +
    geom_text(aes(label = partyvotes), color = "white", size = 2, position = position_fill(vjust = 0.5)) +
    coord_polar(theta = "y") + xlim(c(0.2, hsize + 0.5)) +
    facet_wrap(~ state_po, nrow = 1) +
  scale_fill_manual(values = c("#E9141D", "#0015BC", 'orange', 'green', 'purple'))
plot1 + theme(axis.text.x=element_blank(),
      axis.ticks.x=element_blank(),
      axis.text.y=element_blank(),
      axis.ticks.y=element_blank())

plot2 <- ggplot(data = this_state_ideal, aes(x = hsize, y = idealNumSeats, fill = party)) + 
    geom_bar(stat = "identity", position = position_fill()) +
    geom_text(aes(label = idealNumSeats),color = "white", position = position_fill(vjust = 0.5)) +
    coord_polar(theta = "y") + xlim(c(0.2, hsize + 0.5)) +
    facet_wrap(~ state_po, nrow = 1) +
  scale_fill_manual(values = c("#E9141D", "#0015BC", 'orange', 'green', 'purple'))
plot2 + theme(axis.text.x=element_blank(),
      axis.ticks.x=element_blank(),
      axis.text.y=element_blank(),
      axis.ticks.y=element_blank())

plot3 <- ggplot(data = this_state_current, aes(x = hsize, y = numseats, fill = party)) + 
    geom_bar(stat = "identity", position = position_fill()) +
    geom_text(aes(label = numseats),color = "white", position = position_fill(vjust = 0.5)) +
    coord_polar(theta = "y") + xlim(c(0.2, hsize + 0.5)) +
    facet_wrap(~ state_po, nrow = 1)  +
  scale_fill_manual(values = c("#E9141D", "#0015BC", 'orange', 'green', 'purple'))
plot3 + theme(axis.text.x=element_blank(),
      axis.ticks.x=element_blank(),
      axis.text.y=element_blank(),
      axis.ticks.y=element_blank())
```