README.Rmd

---
output:
  md_document:
    toc: false
    variant: gfm
---

# Simulating NFL Play-Calling Strategies


```{r setup, include=FALSE}
knitr::opts_chunk$set(fig.width=10, fig.height = 4,echo = TRUE)
options(width = 10000)

```


In this post I’ll use simulation to test out different play-calling strategies. EPA (expected points added) is very useful in telling you expected value of a play on average. For example, it will say that passing on first down gives you +.25 expected points added on average, whereas running gives you +.1 on average. I'm still trying to understand though if the variance of a play matters, for example, does it make sense to take a lower EPA run on first down so you end up with 2nd and 7 rather a higher variance pass that will give you a good chance of second and long. I ran drive simulations to test out these different strategies, and then I could see if the high EPA and/or YPA play is the best choice. I got a lot of this code from [this post](https://statsbylopez.netlify.com/post/resampling-nfl-drives/) which was very helpful, and I made changes like creating game states and some other things.


## Part 1: Methodology: Simulating Drives 

```{r include=F, eval=T , echo=F}
library(crosstalk)
library(knitr)
library(kableExtra)
library(DT)
library(webshot)
library(leaflet)

source("simulate-functions.R")

```

The data I am using is PBP data from 2011-2018, filtered to when the game was within 1-score, >=2 minutes before halftime and >=5 minutes before the end of the game. I define game state below based on down, yards-to-go, and yards-from-own-goal. Below are some of the states:


```{r include=T, eval=T , echo=F}
stateDF[1:10, ]#%>%
  # kable(row.names = F, digits = 3) %>%
  # kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

There are around 300 total states. Next, like in the post I linked above, I simulate a drive by sampling from the data, sampling the play result based on the state I am currently in. The drive stops when there’s a touchdown, turnover,turnover-on-downs, field goal,field goal miss, or safety. Below is an example of a simulated drive starting from the 25:

```{r include=T, eval=T , echo=F}

#simulation-run.R


strategyDF<-stateDF
yfog.start<-25
set.seed(0)

#initialize drive
drive.store <- list()
length(drive.store)<-40  #no drive is ever gonna be more than 40 plays
new.down <- 1
new.distance <- 10
new.yfog <- yfog.start
end.of.drive <- FALSE
play.num <- 1

#simulate until absorbing state
while (!end.of.drive){
  run.play <- sample.play(df.scrimmage=df.scrimmage,
                          stateDF=stateDF,
                          down = new.down, 
                          yards.to.go = new.distance, 
                          yards.from.own.goal = new.yfog,
                          strategyDF = strategyDF
                          
  ) #can specify play_type here
  run.play$play.num <- play.num
  drive.store[[play.num]] <- run.play #add each play
  new.down <- run.play$new.down
  new.distance <- run.play$new.distance
  new.yfog <- run.play$new.yfog
  end.of.drive <- run.play$end.drive
  play.num <- play.num + 1
  drive.store[1:play.num]
}
drive.store<-rbindlist(drive.store)
drive.store[, c("down.original", "yards.to.go", "yards.from.own.goal", "State.ID", "desc", "yards_gained", "new.down", "new.distance")]#%>%
  # kable(row.names = F) %>%
  # kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))%>%
  # column_spec(5, width = "30em")


```


I repeat this to get 10,000 simulated drives starting from the 25 yard line. I can now look at how these results line up to drives in the actual data:
```{r include=T, eval=T , echo=F}

#analyze-sim-results.R

all.drives.store<-fread("Data/alldrives.csv")
all.drives.store$is.safety[all.drives.store$desc=='(:59) (Punt formation) L.Cooke punts 47 yards to TEN 7, Center-M.Overton. C.Batson MUFFS catch, touched at TEN 7, and recovers at TEN 1. C.Batson tackled in End Zone for -1 yards, SAFETY (L.Jacobs).']<-F

df.scrimmage$is.fg.miss<-df.scrimmage$is.fg& !grepl("is GOOD|is Good", df.scrimmage$desc)
df.scrimmage$is.fg.make<-df.scrimmage$is.fg& grepl("is GOOD|is Good", df.scrimmage$desc)
all.drives.store$is.fg.miss<-all.drives.store$is.fg& !grepl("is GOOD|is Good", all.drives.store$desc)
all.drives.store$is.fg.make<-all.drives.store$is.fg& grepl("is GOOD|is Good", all.drives.store$desc)


df.scrimmage<-data.table(df.scrimmage)[, `:=`(start.drive.yfog=as.numeric(yfog[start.drive==T]),
                                              end.drive.play=as.character(ifelse(is.punt[end.drive==T], "punt",
                                                                                 ifelse(is.fg.make[end.drive==T], "fg.make",
                                                                                        ifelse(is.td.offense[end.drive==T], "td.offense",
                                                                                               ifelse(is.safety[end.drive==T], "safety",
                                                                                                      ifelse(is.turnover[end.drive==T]| is.fg.miss[end.drive==T]|is.turnover.downs[end.drive==T], "turnover/turnover.downs/fg.miss",
                                                                                                             NA))))))), by=c("drive", "game_id") ]
all.drives.store<-data.table(all.drives.store)[, `:=`(end.drive.play=as.character(ifelse(is.punt[end.drive==T], "punt",
                                                                                         ifelse(is.fg.make[end.drive==T], "fg.make",
                                                                                                ifelse(is.td.offense[end.drive==T], "td.offense",
                                                                                                       ifelse(is.safety[end.drive==T], "safety",
                                                                                                              ifelse(is.turnover[end.drive==T]| is.fg.miss[end.drive==T]|is.turnover.downs[end.drive==T], "turnover/turnover.downs/fg.miss",
                                                                                                                     NA))))))), by=c("sim.id") ]


####PLOTS: DRIVE-LEVEL RESULTS######

#bind and plot
actual.means<-df.scrimmage[!is.na(df.scrimmage$end.drive.play)& df.scrimmage$start.drive.yfog%in% 22:28& df.scrimmage$end.drive==T,
                           list(n=length(game_id)), by="end.drive.play"]
actual.means$percent<-actual.means$n/sum(actual.means$n)
actual.means$Type<-"Actual"
sim.means<-all.drives.store[all.drives.store$end.drive==T, list(n=length(sim.id)), by="end.drive.play"]
sim.means$percent<-sim.means$n/sum(sim.means$n)
sim.means$Type<-"Simulated"
all.means<-rbind(actual.means, sim.means)
# all.means
ggplot( data = all.means, aes(x = Type, y = percent, fill = end.drive.play, label=round(percent, 3))) +
  geom_bar(stat="identity") + 
  geom_text(size = 3, position = position_stack(vjust = 0.5))+
  labs(title="Drive Outcomes",y="Percent", x=NULL, caption  =paste0("Actual=Drives starting from 22-28 Yard Line, 1 possession game, at least 2 minutes before half ends (2011-2018), n=",
                                                                    sum(all.means$n[all.means$Type=="Actual"]) ,") 
       Simulated=10,000 simulated drives starting from 25 Yard Line"))


```

It lines up pretty well, the difference could be due to the fact that the actual data includes penalties and my simulation doesn't. Even if there are differences, I'm mainly focusing on how strategies will increase from my simulated baseline. Now that I have the baseline, I can define any custom strategy I want and see how it changes the results. For example, instead of the default state here:

```{r include=T, eval=T , echo=F}
stateDF[1:5, ]#%>%
  # kable(row.names = F, digits = 3) %>%
  # kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

I can increase passes +10% on 1st-3rd downs:

```{r include=T, eval=T , echo=F}

strategyDF<-stateDF
strategyDF[strategyDF$down%in% 1:3,]<-
  adjust.row(strategyDF[strategyDF$down%in%1:3 , ], increase.percent = .1, increase.var = "percent.pass")
strategyDF[1:5, ]#%>%
  # kable(row.names = F, digits = 3) %>%
  # kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```


For example in state 1, I select a pass 57% of the time instead of 47% of the time.. pretty simple. In R, I just do a weighted sample now.  I then rerun the 10K simulations for the new strategy. 


## Part 2: Methodology: Long and Short Passes
Before I test out different strategies, I wanted to account for one more thing. I wanted to separate short passes from long passes so that in a strategy I can specify long or short pass. I define short as <10 air-yard pass, and long as >=10 air-yard pass. There's evidence that deep passes are where the real value is at, for example [here](http://archive.advancedfootballanalytics.com/2010/09/deep-vs-short-passes.html) so I wanted to include this.

The main problem here though which was mentioned in that link is that sacks happen before a pass is attempted, so you don’t know whether it should be attributed to short or long. I do have some way to handle this though. First of all, if a state has 90% long passes and 10% short, then sacks in that state should probably mostly be attributed as long pass attempts. In addition, there is probably a different sack-rate for long and short attempts. To calculate this, I aggregate long-passes, short-passes, and sacks by state, then estimate how much a change in the number of long or short passes in a state will increase the sack rate:


```{r include=F, eval=T , echo=F}
source("simulate-with-short-passes-functions.R")
```

```{r include=T, eval=T , echo=F}

summary(lm(percent.sack~scale(percent.short.pass)+scale(percent.long.pass)+as.factor(down), data=stateDF)) 
```


Based on this above model, I’m going to assume long passes have a sack rate of 2x that of short passes. So for example, if a state had 30 short, 30 long and 12 sacks, I say 4 were short, 8 were long attempts. This is just a way for me to attribute sacks, I think it is somewhat reasonable. The 2x might be wrong but it is just a parameter that I can experiment with to see how it affects the results. After attributing the sacks I end up with this:


```{r include=T, eval=T , echo=F}
if("percent.sack"%in% colnames(stateDF)){
  total<-(stateDF$percent.short.pass+2*stateDF$percent.long.pass)
  stateDF$percent.short.pass<-stateDF$percent.short.pass+stateDF$percent.sack*stateDF$percent.short.pass/total
  stateDF$percent.long.pass<-stateDF$percent.long.pass+stateDF$percent.sack*2*stateDF$percent.long.pass/total
  head(stateDF, 10)
  stateDF$percent.sack<-NULL
  stateDF[is.na(stateDF)]<-0
}

stateDF[1:5, ]#%>%
  # kable(row.names = F, digits = 3) %>%
  # kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

```


I'm finally ready to test out several different strategies of increasing long and/or short passes. Below I show the drive results of different strategy parameters.


## Part 3: Analyzing the Results

```{r include=T, eval=T , echo=F}

files<-c("alldrives.csv",
         "alldrives_withshortpasses.csv",
         "alldrives_passx10_downs1-3.csv",
         
         "alldrives_shortpassx10_down1-2.csv", 
         "alldrives_shortpassx10_down1-3.csv",
         # "alldrives_shortpassx30_down1-2.csv",
         
         "alldrives_longpassx10_down1-2.csv",
         "alldrives_longpassx10_down1-3.csv"  
         # ,"alldrives_longpassx10_down1-2_sacksx4.csv",
         # "alldrives_longpassx30_down1-2.csv"
         )

readFile<-function(file){
  data<-fread(paste0("Data/", file))
  data$Params<-file
  data
}
all.drives.store<-rbindlist(lapply(files, readFile))

all.drives.store$is.fg.miss<-all.drives.store$is.fg& !grepl("is GOOD|is Good", all.drives.store$desc)
all.drives.store$is.fg.make<-all.drives.store$is.fg& grepl("is GOOD|is Good", all.drives.store$desc)
all.drives.store<-all.drives.store[, `:=`(end.drive.play=as.character(ifelse(is.punt[end.drive==T], "punt",
                                                                             ifelse(is.fg.make[end.drive==T], "fg.make",
                                                                                    ifelse(is.td.offense[end.drive==T], "td.offense",
                                                                                           ifelse(is.safety[end.drive==T], "safety",
                                                                                                  ifelse(is.turnover[end.drive==T]| is.fg.miss[end.drive==T]|is.turnover.downs[end.drive==T], "turnover/turnover.downs/fg.miss",
                                                                                                         NA))))))), by=c("Params", "sim.id")]
all.drives.store$Params<-gsub("alldrives_|[.]csv", "", all.drives.store$Params)
all.drives.store$Params[all.drives.store$Params=="alldrives"]<-"default_part1"
all.drives.store$Params[all.drives.store$Params=="withshortpasses"]<-"default_part2"
all.drives.store$Params<-factor(all.drives.store$Params, levels=unique(all.drives.store$Params))

meds<-all.drives.store[all.drives.store$end.drive==T, 
                       list(mean.yfog.on.punts=as.double(mean(new.yfog[is.punt], na.rm=T)),
                            mean.drive.play.length=mean(play.num),
                            percent.3.and.out=mean(play.num==4& is.punt), 
                            percent.td.offense=mean(is.td.offense),
                            mean.end.EP=mean(ep+epa)
                       ),  by=c("Params")]

group.means<-all.drives.store[all.drives.store$end.drive==T& all.drives.store$end.drive.play!="safety",
                              list(n=length(play_type), end.pos=mean(new.yfog), 
                                   end.EP=mean(ep+epa)
                              ),
                              by=c("Params", "end.drive.play")]
group.means$percent<-group.means$n/10000
ggplot(group.means, aes(x = Params,y= percent , fill= end.drive.play, label=round(percent, 3))) +
  geom_bar(stat="identity") + 
  theme(axis.text.y = element_text( hjust = 1),  
        axis.title.y = element_blank())+
  geom_text(size = 3, position = position_stack(vjust = 0.5))+
  scale_x_discrete(limits = rev(levels(group.means$Params)))+
  coord_flip()

```


Above I display the results from different parameters, ex: x10longpass_down1-2 means add 10% to long pass percentage on every 1-2nd down state. As you can see from the chart, the parameters that do the best are long-pass. Adding +10% to first and second down increases TD-rate and decreases punts. There’s countless strategies to try so I just showed a few to keep it simple. This isn’t an exact science, and I’ve noticed that the results vary even with 10K simulations. I’ll look into different metrics and quantifying the error. The main takeaway is that long passes definitely sem to increase TD’s and decrease punts from the baseline “default_part2”.

The last thing I’d like to look at is how these results are different to just using EPA and/or YPA. From my simulated data which includes the estimated sack-rate, I look at EPA and YPA below:


```{r include=T, eval=T , echo=F, fig.width=12}

# source("simulate-with-short-passes-functions.R")

all.drives.store<-fread("Data/alldrives_withshortpasses.csv")
all.drives.store$is.safety[all.drives.store$desc=='(:59) (Punt formation) L.Cooke punts 47 yards to TEN 7, Center-M.Overton. C.Batson MUFFS catch, touched at TEN 7, and recovers at TEN 1. C.Batson tackled in End Zone for -1 yards, SAFETY (L.Jacobs).']<-F

df.scrimmage$is.fg.miss<-df.scrimmage$is.fg& !grepl("is GOOD|is Good", df.scrimmage$desc)
df.scrimmage$is.fg.make<-df.scrimmage$is.fg& grepl("is GOOD|is Good", df.scrimmage$desc)

all.drives.store$is.fg.miss<-all.drives.store$is.fg& !grepl("is GOOD|is Good", all.drives.store$desc)
all.drives.store$is.fg.make<-all.drives.store$is.fg& grepl("is GOOD|is Good", all.drives.store$desc)

##CHECK DRIVES/LOOK AT END STATES OF SIMULATED AND ACTUAL DRIVES

df.scrimmage<-data.table(df.scrimmage)[, `:=`(start.drive.yfog=as.numeric(yfog[start.drive==T]),
                                              end.drive.play=as.character(ifelse(is.punt[end.drive==T], "punt",
                                                                                 ifelse(is.fg.make[end.drive==T], "fg.make",
                                                                                        ifelse(is.td.offense[end.drive==T], "td.offense",
                                                                                               ifelse(is.safety[end.drive==T], "safety",
                                                                                                      ifelse(is.turnover[end.drive==T]| is.fg.miss[end.drive==T]|is.turnover.downs[end.drive==T], "turnover/turnover.downs/fg.miss",
                                                                                                             NA))))))), by=c("drive", "game_id") ]
all.drives.store<-data.table(all.drives.store)[, `:=`(end.drive.play=as.character(ifelse(is.punt[end.drive==T], "punt",
                                                                                         ifelse(is.fg.make[end.drive==T], "fg.make",
                                                                                                ifelse(is.td.offense[end.drive==T], "td.offense",
                                                                                                       ifelse(is.safety[end.drive==T], "safety",
                                                                                                              ifelse(is.turnover[end.drive==T]| is.fg.miss[end.drive==T]|is.turnover.downs[end.drive==T], "turnover/turnover.downs/fg.miss",
                                                                                                                     NA))))))), by=c("sim.id") ]


group.means<-data.table(all.drives.store)[grepl("pass|run", all.drives.store$play_type),
                                          list(yards_gained.mean=mean(yards_gained),
                                               yards_gained.median=median(yards_gained), 
                                               sack.rate=mean(is.sack),
                                               yards_gained.sd=sd(yards_gained) ,
                                               epa.mean=mean(epa, na.rm=T),
                                               epa.median=median(epa, na.rm=T),
                                               epa.sd=sd(epa, na.rm=T)
                                          ), by="play_type" ]


##DRIVE STATS BY DOWN###
down.means<-data.table(all.drives.store)[grepl("pass|run", all.drives.store$play_type),
                                         list(yards_gained.mean=mean(yards_gained),
                                              yards_gained.median=median(yards_gained), 
                                              yards_gained.sd=sd(yards_gained),
                                              sack.rate=mean(is.sack),
                                              epa.mean=mean(epa, na.rm=T),
                                              epa.median=median(epa, na.rm=T),
                                              epa.sd=sd(epa, na.rm=T)
                                         ), by=c("play_type", "down.original") ]
group.means$down.original<-"TOTAL"

p1<-ggplot(rbindlist(list(down.means,group.means), fill=T), aes(x = as.factor(down.original),y= yards_gained.mean , fill= play_type)) +
  geom_bar(position=position_dodge(),stat="identity") +
  xlab("Down")+
theme(legend.position = "none")  + 
  ggtitle("YPA by Play Type & Down, sacks included")

p2<-ggplot(rbindlist(list(down.means,group.means), fill=T), aes(x = as.factor(down.original),y= epa.mean , fill= play_type)) +
  geom_bar(position=position_dodge(),stat="identity")  +
  xlab("Down")+
  ggtitle("EPA by Play Type & Down, sacks included")

p3<-ggplot(rbindlist(list(down.means,group.means), fill=T), aes(x = as.factor(down.original),y= sack.rate , fill= play_type)) +
  geom_bar(position=position_dodge(),stat="identity")  +
  xlab("Down")+
  ggtitle("sack.rate by Play Type & Down")

gridExtra::grid.arrange(p1, p2, ncol=2)


##DRIVE STATS BY Yards-to-go###
all.drives.store$yards.to.go.bin<-factor(cut(all.drives.store$yards.to.go, breaks=c(0, 2, 6,9 ,11, 100), include.lowest = F, 
                                             labels = c("1-2", "3-6", "7-9", "10-11", "12+")), 
                                         levels = c("1-2", "3-6", "7-9", "10-11", "12+"))

down.means<-data.table(all.drives.store)[grepl("pass|run", all.drives.store$play_type),
                                         list(yards_gained.mean=mean(yards_gained),
                                              yards_gained.median=median(yards_gained), 
                                              yards_gained.sd=sd(yards_gained),
                                              sack.rate=mean(is.sack),
                                              epa.mean=mean(epa, na.rm=T),
                                              epa.median=median(epa, na.rm=T),
                                              epa.sd=sd(epa, na.rm=T)
                                         ), by=c("play_type", "yards.to.go.bin") ]
group.means$yards.to.go.bin<-"TOTAL"

p1<-ggplot(rbindlist(list(down.means,group.means), fill=T), aes(x = as.factor(yards.to.go.bin),y= yards_gained.mean , fill= play_type)) +
  geom_bar(position=position_dodge(),stat="identity") +
  xlab("Yards.To.Go")+
  theme(legend.position = "none")  + 

  ggtitle("YPA by Play Type & yards.to.go.bin, sacks included")

p2<-ggplot(rbindlist(list(down.means,group.means), fill=T), aes(x = as.factor(yards.to.go.bin),y= epa.mean , fill= play_type)) +
  geom_bar(position=position_dodge(),stat="identity")  +
  xlab("Yards.To.Go")+
  ggtitle("EPA by Play Type & yards.to.go.bin, sacks included")

gridExtra::grid.arrange(p1, p2, ncol=2)

```

My results line up well with EPA chart shown on the top right. On 1st and second down you can see how long passes seem to have high expected value, which is supported in my simulations. On third down, the effect isn’t there as much, which I saw in my simulation as there was not change when I added long passes to 3rd down. While YPA would suggest long passes is best in all states, looking at EPA seems to provide better guidance.


## Part 4: Conclusion

In this post I showed my simulation method for testing different play calling strategies. Long passes, despite their higher variance, seem to improve the result of drives and lead to both more touchdowns and less punts. Overall, my results align with “Expected Points Added” values, which is cool because it confirms that EPA is a good guideline for play selection, and it also suggests that my system isn’t completely off. I was hoping to find a totally contrarian result here but it seems for now that EPA does a good job of improving a drive. There are definitely some shortcomings, for example, a team might be basing it’s play on what the defense is giving, and that a deep attempt occurs when things like coverage and pass protection are in the offense's favor. It's possible that rying to increase deep passing by 10% will result in lower quality deep pass attempts. I think having data on time to throw and receiver seperation may be useful in better understanding the EPA difference of deep and short passes. Finally, another shortcoming is that the binning I used may be misleading. Changing the binning to short, medium, and long could definitely make a difference it's entireley possible that the really short plays are the ones dragging things down for the short passes. I’d like to keep testing this system to try out different strategies or add medium passes. Thanks for reading.


[code](github.com/dlm1223/nfl-simulation)