-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME.Rmd
397 lines (285 loc) · 23.6 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
---
output:
md_document:
toc: false
variant: gfm
---
# Simulating NFL Play-Calling Strategies
```{r setup, include=FALSE}
knitr::opts_chunk$set(fig.width=10, fig.height = 4,echo = TRUE)
options(width = 10000)
```
In this post I’ll use simulation to test out different play-calling strategies. EPA (expected points added) is very useful in telling you expected value of a play on average. For example, it will say that passing on first down gives you +.25 expected points added on average, whereas running gives you +.1 on average. I'm still trying to understand though if the variance of a play matters, for example, does it make sense to take a lower EPA run on first down so you end up with 2nd and 7 rather a higher variance pass that will give you a good chance of second and long. I ran drive simulations to test out these different strategies, and then I could see if the high EPA and/or YPA play is the best choice. I got a lot of this code from [this post](https://statsbylopez.netlify.com/post/resampling-nfl-drives/) which was very helpful, and I made changes like creating game states and some other things.
## Part 1: Methodology: Simulating Drives
```{r include=F, eval=T , echo=F}
library(crosstalk)
library(knitr)
library(kableExtra)
library(DT)
library(webshot)
library(leaflet)
source("simulate-functions.R")
```
The data I am using is PBP data from 2011-2018, filtered to when the game was within 1-score, >=2 minutes before halftime and >=5 minutes before the end of the game. I define game state below based on down, yards-to-go, and yards-from-own-goal. Below are some of the states:
```{r include=T, eval=T , echo=F}
stateDF[1:10, ]#%>%
# kable(row.names = F, digits = 3) %>%
# kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```
There are around 300 total states. Next, like in the post I linked above, I simulate a drive by sampling from the data, sampling the play result based on the state I am currently in. The drive stops when there’s a touchdown, turnover,turnover-on-downs, field goal,field goal miss, or safety. Below is an example of a simulated drive starting from the 25:
```{r include=T, eval=T , echo=F}
#simulation-run.R
strategyDF<-stateDF
yfog.start<-25
set.seed(0)
#initialize drive
drive.store <- list()
length(drive.store)<-40 #no drive is ever gonna be more than 40 plays
new.down <- 1
new.distance <- 10
new.yfog <- yfog.start
end.of.drive <- FALSE
play.num <- 1
#simulate until absorbing state
while (!end.of.drive){
run.play <- sample.play(df.scrimmage=df.scrimmage,
stateDF=stateDF,
down = new.down,
yards.to.go = new.distance,
yards.from.own.goal = new.yfog,
strategyDF = strategyDF
) #can specify play_type here
run.play$play.num <- play.num
drive.store[[play.num]] <- run.play #add each play
new.down <- run.play$new.down
new.distance <- run.play$new.distance
new.yfog <- run.play$new.yfog
end.of.drive <- run.play$end.drive
play.num <- play.num + 1
drive.store[1:play.num]
}
drive.store<-rbindlist(drive.store)
drive.store[, c("down.original", "yards.to.go", "yards.from.own.goal", "State.ID", "desc", "yards_gained", "new.down", "new.distance")]#%>%
# kable(row.names = F) %>%
# kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))%>%
# column_spec(5, width = "30em")
```
I repeat this to get 10,000 simulated drives starting from the 25 yard line. I can now look at how these results line up to drives in the actual data:
```{r include=T, eval=T , echo=F}
#analyze-sim-results.R
all.drives.store<-fread("Data/alldrives.csv")
all.drives.store$is.safety[all.drives.store$desc=='(:59) (Punt formation) L.Cooke punts 47 yards to TEN 7, Center-M.Overton. C.Batson MUFFS catch, touched at TEN 7, and recovers at TEN 1. C.Batson tackled in End Zone for -1 yards, SAFETY (L.Jacobs).']<-F
df.scrimmage$is.fg.miss<-df.scrimmage$is.fg& !grepl("is GOOD|is Good", df.scrimmage$desc)
df.scrimmage$is.fg.make<-df.scrimmage$is.fg& grepl("is GOOD|is Good", df.scrimmage$desc)
all.drives.store$is.fg.miss<-all.drives.store$is.fg& !grepl("is GOOD|is Good", all.drives.store$desc)
all.drives.store$is.fg.make<-all.drives.store$is.fg& grepl("is GOOD|is Good", all.drives.store$desc)
df.scrimmage<-data.table(df.scrimmage)[, `:=`(start.drive.yfog=as.numeric(yfog[start.drive==T]),
end.drive.play=as.character(ifelse(is.punt[end.drive==T], "punt",
ifelse(is.fg.make[end.drive==T], "fg.make",
ifelse(is.td.offense[end.drive==T], "td.offense",
ifelse(is.safety[end.drive==T], "safety",
ifelse(is.turnover[end.drive==T]| is.fg.miss[end.drive==T]|is.turnover.downs[end.drive==T], "turnover/turnover.downs/fg.miss",
NA))))))), by=c("drive", "game_id") ]
all.drives.store<-data.table(all.drives.store)[, `:=`(end.drive.play=as.character(ifelse(is.punt[end.drive==T], "punt",
ifelse(is.fg.make[end.drive==T], "fg.make",
ifelse(is.td.offense[end.drive==T], "td.offense",
ifelse(is.safety[end.drive==T], "safety",
ifelse(is.turnover[end.drive==T]| is.fg.miss[end.drive==T]|is.turnover.downs[end.drive==T], "turnover/turnover.downs/fg.miss",
NA))))))), by=c("sim.id") ]
####PLOTS: DRIVE-LEVEL RESULTS######
#bind and plot
actual.means<-df.scrimmage[!is.na(df.scrimmage$end.drive.play)& df.scrimmage$start.drive.yfog%in% 22:28& df.scrimmage$end.drive==T,
list(n=length(game_id)), by="end.drive.play"]
actual.means$percent<-actual.means$n/sum(actual.means$n)
actual.means$Type<-"Actual"
sim.means<-all.drives.store[all.drives.store$end.drive==T, list(n=length(sim.id)), by="end.drive.play"]
sim.means$percent<-sim.means$n/sum(sim.means$n)
sim.means$Type<-"Simulated"
all.means<-rbind(actual.means, sim.means)
# all.means
ggplot( data = all.means, aes(x = Type, y = percent, fill = end.drive.play, label=round(percent, 3))) +
geom_bar(stat="identity") +
geom_text(size = 3, position = position_stack(vjust = 0.5))+
labs(title="Drive Outcomes",y="Percent", x=NULL, caption =paste0("Actual=Drives starting from 22-28 Yard Line, 1 possession game, at least 2 minutes before half ends (2011-2018), n=",
sum(all.means$n[all.means$Type=="Actual"]) ,")
Simulated=10,000 simulated drives starting from 25 Yard Line"))
```
It lines up pretty well, the difference could be due to the fact that the actual data includes penalties and my simulation doesn't. Even if there are differences, I'm mainly focusing on how strategies will increase from my simulated baseline. Now that I have the baseline, I can define any custom strategy I want and see how it changes the results. For example, instead of the default state here:
```{r include=T, eval=T , echo=F}
stateDF[1:5, ]#%>%
# kable(row.names = F, digits = 3) %>%
# kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```
I can increase passes +10% on 1st-3rd downs:
```{r include=T, eval=T , echo=F}
strategyDF<-stateDF
strategyDF[strategyDF$down%in% 1:3,]<-
adjust.row(strategyDF[strategyDF$down%in%1:3 , ], increase.percent = .1, increase.var = "percent.pass")
strategyDF[1:5, ]#%>%
# kable(row.names = F, digits = 3) %>%
# kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```
For example in state 1, I select a pass 57% of the time instead of 47% of the time.. pretty simple. In R, I just do a weighted sample now. I then rerun the 10K simulations for the new strategy.
## Part 2: Methodology: Long and Short Passes
Before I test out different strategies, I wanted to account for one more thing. I wanted to separate short passes from long passes so that in a strategy I can specify long or short pass. I define short as <10 air-yard pass, and long as >=10 air-yard pass. There's evidence that deep passes are where the real value is at, for example [here](http://archive.advancedfootballanalytics.com/2010/09/deep-vs-short-passes.html) so I wanted to include this.
The main problem here though which was mentioned in that link is that sacks happen before a pass is attempted, so you don’t know whether it should be attributed to short or long. I do have some way to handle this though. First of all, if a state has 90% long passes and 10% short, then sacks in that state should probably mostly be attributed as long pass attempts. In addition, there is probably a different sack-rate for long and short attempts. To calculate this, I aggregate long-passes, short-passes, and sacks by state, then estimate how much a change in the number of long or short passes in a state will increase the sack rate:
```{r include=F, eval=T , echo=F}
source("simulate-with-short-passes-functions.R")
```
```{r include=T, eval=T , echo=F}
summary(lm(percent.sack~scale(percent.short.pass)+scale(percent.long.pass)+as.factor(down), data=stateDF))
```
Based on this above model, I’m going to assume long passes have a sack rate of 2x that of short passes. So for example, if a state had 30 short, 30 long and 12 sacks, I say 4 were short, 8 were long attempts. This is just a way for me to attribute sacks, I think it is somewhat reasonable. The 2x might be wrong but it is just a parameter that I can experiment with to see how it affects the results. After attributing the sacks I end up with this:
```{r include=T, eval=T , echo=F}
if("percent.sack"%in% colnames(stateDF)){
total<-(stateDF$percent.short.pass+2*stateDF$percent.long.pass)
stateDF$percent.short.pass<-stateDF$percent.short.pass+stateDF$percent.sack*stateDF$percent.short.pass/total
stateDF$percent.long.pass<-stateDF$percent.long.pass+stateDF$percent.sack*2*stateDF$percent.long.pass/total
head(stateDF, 10)
stateDF$percent.sack<-NULL
stateDF[is.na(stateDF)]<-0
}
stateDF[1:5, ]#%>%
# kable(row.names = F, digits = 3) %>%
# kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```
I'm finally ready to test out several different strategies of increasing long and/or short passes. Below I show the drive results of different strategy parameters.
## Part 3: Analyzing the Results
```{r include=T, eval=T , echo=F}
files<-c("alldrives.csv",
"alldrives_withshortpasses.csv",
"alldrives_passx10_downs1-3.csv",
"alldrives_shortpassx10_down1-2.csv",
"alldrives_shortpassx10_down1-3.csv",
# "alldrives_shortpassx30_down1-2.csv",
"alldrives_longpassx10_down1-2.csv",
"alldrives_longpassx10_down1-3.csv"
# ,"alldrives_longpassx10_down1-2_sacksx4.csv",
# "alldrives_longpassx30_down1-2.csv"
)
readFile<-function(file){
data<-fread(paste0("Data/", file))
data$Params<-file
data
}
all.drives.store<-rbindlist(lapply(files, readFile))
all.drives.store$is.fg.miss<-all.drives.store$is.fg& !grepl("is GOOD|is Good", all.drives.store$desc)
all.drives.store$is.fg.make<-all.drives.store$is.fg& grepl("is GOOD|is Good", all.drives.store$desc)
all.drives.store<-all.drives.store[, `:=`(end.drive.play=as.character(ifelse(is.punt[end.drive==T], "punt",
ifelse(is.fg.make[end.drive==T], "fg.make",
ifelse(is.td.offense[end.drive==T], "td.offense",
ifelse(is.safety[end.drive==T], "safety",
ifelse(is.turnover[end.drive==T]| is.fg.miss[end.drive==T]|is.turnover.downs[end.drive==T], "turnover/turnover.downs/fg.miss",
NA))))))), by=c("Params", "sim.id")]
all.drives.store$Params<-gsub("alldrives_|[.]csv", "", all.drives.store$Params)
all.drives.store$Params[all.drives.store$Params=="alldrives"]<-"default_part1"
all.drives.store$Params[all.drives.store$Params=="withshortpasses"]<-"default_part2"
all.drives.store$Params<-factor(all.drives.store$Params, levels=unique(all.drives.store$Params))
meds<-all.drives.store[all.drives.store$end.drive==T,
list(mean.yfog.on.punts=as.double(mean(new.yfog[is.punt], na.rm=T)),
mean.drive.play.length=mean(play.num),
percent.3.and.out=mean(play.num==4& is.punt),
percent.td.offense=mean(is.td.offense),
mean.end.EP=mean(ep+epa)
), by=c("Params")]
group.means<-all.drives.store[all.drives.store$end.drive==T& all.drives.store$end.drive.play!="safety",
list(n=length(play_type), end.pos=mean(new.yfog),
end.EP=mean(ep+epa)
),
by=c("Params", "end.drive.play")]
group.means$percent<-group.means$n/10000
ggplot(group.means, aes(x = Params,y= percent , fill= end.drive.play, label=round(percent, 3))) +
geom_bar(stat="identity") +
theme(axis.text.y = element_text( hjust = 1),
axis.title.y = element_blank())+
geom_text(size = 3, position = position_stack(vjust = 0.5))+
scale_x_discrete(limits = rev(levels(group.means$Params)))+
coord_flip()
```
Above I display the results from different parameters, ex: x10longpass_down1-2 means add 10% to long pass percentage on every 1-2nd down state. As you can see from the chart, the parameters that do the best are long-pass. Adding +10% to first and second down increases TD-rate and decreases punts. There’s countless strategies to try so I just showed a few to keep it simple. This isn’t an exact science, and I’ve noticed that the results vary even with 10K simulations. I’ll look into different metrics and quantifying the error. The main takeaway is that long passes definitely sem to increase TD’s and decrease punts from the baseline “default_part2”.
The last thing I’d like to look at is how these results are different to just using EPA and/or YPA. From my simulated data which includes the estimated sack-rate, I look at EPA and YPA below:
```{r include=T, eval=T , echo=F, fig.width=12}
# source("simulate-with-short-passes-functions.R")
all.drives.store<-fread("Data/alldrives_withshortpasses.csv")
all.drives.store$is.safety[all.drives.store$desc=='(:59) (Punt formation) L.Cooke punts 47 yards to TEN 7, Center-M.Overton. C.Batson MUFFS catch, touched at TEN 7, and recovers at TEN 1. C.Batson tackled in End Zone for -1 yards, SAFETY (L.Jacobs).']<-F
df.scrimmage$is.fg.miss<-df.scrimmage$is.fg& !grepl("is GOOD|is Good", df.scrimmage$desc)
df.scrimmage$is.fg.make<-df.scrimmage$is.fg& grepl("is GOOD|is Good", df.scrimmage$desc)
all.drives.store$is.fg.miss<-all.drives.store$is.fg& !grepl("is GOOD|is Good", all.drives.store$desc)
all.drives.store$is.fg.make<-all.drives.store$is.fg& grepl("is GOOD|is Good", all.drives.store$desc)
##CHECK DRIVES/LOOK AT END STATES OF SIMULATED AND ACTUAL DRIVES
df.scrimmage<-data.table(df.scrimmage)[, `:=`(start.drive.yfog=as.numeric(yfog[start.drive==T]),
end.drive.play=as.character(ifelse(is.punt[end.drive==T], "punt",
ifelse(is.fg.make[end.drive==T], "fg.make",
ifelse(is.td.offense[end.drive==T], "td.offense",
ifelse(is.safety[end.drive==T], "safety",
ifelse(is.turnover[end.drive==T]| is.fg.miss[end.drive==T]|is.turnover.downs[end.drive==T], "turnover/turnover.downs/fg.miss",
NA))))))), by=c("drive", "game_id") ]
all.drives.store<-data.table(all.drives.store)[, `:=`(end.drive.play=as.character(ifelse(is.punt[end.drive==T], "punt",
ifelse(is.fg.make[end.drive==T], "fg.make",
ifelse(is.td.offense[end.drive==T], "td.offense",
ifelse(is.safety[end.drive==T], "safety",
ifelse(is.turnover[end.drive==T]| is.fg.miss[end.drive==T]|is.turnover.downs[end.drive==T], "turnover/turnover.downs/fg.miss",
NA))))))), by=c("sim.id") ]
group.means<-data.table(all.drives.store)[grepl("pass|run", all.drives.store$play_type),
list(yards_gained.mean=mean(yards_gained),
yards_gained.median=median(yards_gained),
sack.rate=mean(is.sack),
yards_gained.sd=sd(yards_gained) ,
epa.mean=mean(epa, na.rm=T),
epa.median=median(epa, na.rm=T),
epa.sd=sd(epa, na.rm=T)
), by="play_type" ]
##DRIVE STATS BY DOWN###
down.means<-data.table(all.drives.store)[grepl("pass|run", all.drives.store$play_type),
list(yards_gained.mean=mean(yards_gained),
yards_gained.median=median(yards_gained),
yards_gained.sd=sd(yards_gained),
sack.rate=mean(is.sack),
epa.mean=mean(epa, na.rm=T),
epa.median=median(epa, na.rm=T),
epa.sd=sd(epa, na.rm=T)
), by=c("play_type", "down.original") ]
group.means$down.original<-"TOTAL"
p1<-ggplot(rbindlist(list(down.means,group.means), fill=T), aes(x = as.factor(down.original),y= yards_gained.mean , fill= play_type)) +
geom_bar(position=position_dodge(),stat="identity") +
xlab("Down")+
theme(legend.position = "none") +
ggtitle("YPA by Play Type & Down, sacks included")
p2<-ggplot(rbindlist(list(down.means,group.means), fill=T), aes(x = as.factor(down.original),y= epa.mean , fill= play_type)) +
geom_bar(position=position_dodge(),stat="identity") +
xlab("Down")+
ggtitle("EPA by Play Type & Down, sacks included")
p3<-ggplot(rbindlist(list(down.means,group.means), fill=T), aes(x = as.factor(down.original),y= sack.rate , fill= play_type)) +
geom_bar(position=position_dodge(),stat="identity") +
xlab("Down")+
ggtitle("sack.rate by Play Type & Down")
gridExtra::grid.arrange(p1, p2, ncol=2)
##DRIVE STATS BY Yards-to-go###
all.drives.store$yards.to.go.bin<-factor(cut(all.drives.store$yards.to.go, breaks=c(0, 2, 6,9 ,11, 100), include.lowest = F,
labels = c("1-2", "3-6", "7-9", "10-11", "12+")),
levels = c("1-2", "3-6", "7-9", "10-11", "12+"))
down.means<-data.table(all.drives.store)[grepl("pass|run", all.drives.store$play_type),
list(yards_gained.mean=mean(yards_gained),
yards_gained.median=median(yards_gained),
yards_gained.sd=sd(yards_gained),
sack.rate=mean(is.sack),
epa.mean=mean(epa, na.rm=T),
epa.median=median(epa, na.rm=T),
epa.sd=sd(epa, na.rm=T)
), by=c("play_type", "yards.to.go.bin") ]
group.means$yards.to.go.bin<-"TOTAL"
p1<-ggplot(rbindlist(list(down.means,group.means), fill=T), aes(x = as.factor(yards.to.go.bin),y= yards_gained.mean , fill= play_type)) +
geom_bar(position=position_dodge(),stat="identity") +
xlab("Yards.To.Go")+
theme(legend.position = "none") +
ggtitle("YPA by Play Type & yards.to.go.bin, sacks included")
p2<-ggplot(rbindlist(list(down.means,group.means), fill=T), aes(x = as.factor(yards.to.go.bin),y= epa.mean , fill= play_type)) +
geom_bar(position=position_dodge(),stat="identity") +
xlab("Yards.To.Go")+
ggtitle("EPA by Play Type & yards.to.go.bin, sacks included")
gridExtra::grid.arrange(p1, p2, ncol=2)
```
My results line up well with EPA chart shown on the top right. On 1st and second down you can see how long passes seem to have high expected value, which is supported in my simulations. On third down, the effect isn’t there as much, which I saw in my simulation as there was not change when I added long passes to 3rd down. While YPA would suggest long passes is best in all states, looking at EPA seems to provide better guidance.
## Part 4: Conclusion
In this post I showed my simulation method for testing different play calling strategies. Long passes, despite their higher variance, seem to improve the result of drives and lead to both more touchdowns and less punts. Overall, my results align with “Expected Points Added” values, which is cool because it confirms that EPA is a good guideline for play selection, and it also suggests that my system isn’t completely off. I was hoping to find a totally contrarian result here but it seems for now that EPA does a good job of improving a drive. There are definitely some shortcomings, for example, a team might be basing it’s play on what the defense is giving, and that a deep attempt occurs when things like coverage and pass protection are in the offense's favor. It's possible that rying to increase deep passing by 10% will result in lower quality deep pass attempts. I think having data on time to throw and receiver seperation may be useful in better understanding the EPA difference of deep and short passes. Finally, another shortcoming is that the binning I used may be misleading. Changing the binning to short, medium, and long could definitely make a difference it's entireley possible that the really short plays are the ones dragging things down for the short passes. I’d like to keep testing this system to try out different strategies or add medium passes. Thanks for reading.
[code](github.com/dlm1223/nfl-simulation)