Created
September 17, 2016 14:47
Constructs a heat map of the probability of a hit or home run for a specific player from PITCHf/x data
heat_plot <- function(player, d, HR=FALSE){
  # inputs
  # player - name of player
  # d - pitchRx data frame with variables Batter, Event, and X, Z (location of pitch)
  # will output a ggplot2 object
  # need to use print function to display the plot
  require(dplyr)
  require(ggplot2)
  require(mgcv)
  # define the strike zone
  topKzone <- 3.5
  botKzone <- 1.6
  inKzone <- -0.95
  outKzone <- 0.95
  kZone <- data.frame(
    x=c(inKzone, inKzone, outKzone, outKzone, inKzone),
    y=c(botKzone, topKzone, topKzone, botKzone, botKzone)
  )
  # only consider events that are official at-bats
  # (the indices refer to the alphabetical order of the event names in d)
  TT <- table(d$Event)
  To_Remove_indices <- c(1, 5, 8, 16, 18, 21, 22, 23, 24, 30)
  AB <- names(TT)[-To_Remove_indices]
  d_AB <- filter(d, Event %in% AB)
  # define the 1/0 response variable
  if (HR == FALSE) {
    d_AB <- mutate(d_AB,
                   Hit=ifelse(Event %in% c("Single", "Double", "Triple", "Home Run"),
                              1, 0))
  } else {
    d_AB <- mutate(d_AB, Hit=ifelse(Event == "Home Run",
                                    1, 0))
  }
  # implement the GAM fit (logistic link)
  pdata <- filter(d_AB, Batter==player)
  fit <- gam(Hit ~ s(X, Z), family=binomial, data=pdata)
  # find predicted probabilities over a 50 x 50 grid
  x <- seq(-1.5, 1.5, length.out=50)
  y <- seq(0.5, 5, length.out=50)
  data.predict <- data.frame(X = c(outer(x, y * 0 + 1)),
                             Z = c(outer(x * 0 + 1, y)))
  # predictions are on the logit scale; convert them to probabilities
  lp <- predict(fit, data.predict)
  data.predict$Probability <- exp(lp) / (1 + exp(lp))
  # construct the plot
  type <- ifelse(HR==TRUE, "HR", "HIT")
  ggplot(kZone, aes(x, y)) +
    geom_tile(data=data.predict,
              aes(x=X, y=Z, fill= Probability)) +
    scale_fill_distiller(palette = "Spectral") +
    geom_path(lwd=1.5, col="black") +
    coord_fixed() +
    ggtitle(paste(player, type))
}
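The function selects at-bat events by position in `table(d$Event)`, which depends on which event names happen to appear in the data. One possible alternative, sketched below, is to exclude the non-at-bat events by name. The helper `keep_at_bats` and the event labels in `non_ab_events` are hypothetical and probably incomplete (the gist drops ten event types); check `unique(d$Event)` in your own data before relying on them. The toy data frame is made up for illustration only.

```r
# Hypothetical helper (not part of the gist): keep official at-bats by
# naming the events to exclude rather than indexing into table(d$Event).
# ASSUMPTION: these labels match the Event values in your data.
non_ab_events <- c("Walk", "Intent Walk", "Hit By Pitch",
                   "Sac Bunt", "Sac Fly", "Catcher Interference")

keep_at_bats <- function(d, non_ab = non_ab_events) {
  # drop rows whose Event is one of the named non-at-bat outcomes
  d[!(d$Event %in% non_ab), , drop = FALSE]
}

# Toy data, invented for illustration only
toy <- data.frame(
  Batter = "Player A",
  Event  = c("Single", "Walk", "Strikeout", "Sac Fly", "Home Run"),
  X      = c(0.1, -0.4, 0.8, 0.0, -0.2),
  Z      = c(2.5, 3.1, 1.9, 2.2, 2.8),
  stringsAsFactors = FALSE
)

toy_ab <- keep_at_bats(toy)
print(toy_ab$Event)
```

Filtering by name keeps working if a new event type shows up in a later season, whereas positional indices would silently shift.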
Hi
A generalized linear model assumes that the response, on the scale of the link function, is a linear function of the covariates. In contrast, a generalized additive model (GAM) lets the response depend on smooth, possibly nonlinear functions of the covariates. This works better when you have nonlinear relationships: for example, the distance a ball travels depends on the launch speed and launch angle in a nonlinear way.
Jim
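As a rough illustration of the difference, the sketch below fits both models to simulated pitch locations. Everything here is an assumption made for the example: the simulated data frame `sim`, the "true" hit probability `p_true` (highest in the middle of the zone), and the sample size. It is not the gist's data or a claim about any real player.

```r
library(mgcv)

set.seed(1)
n <- 2000
X <- runif(n, -1.5, 1.5)   # horizontal pitch location
Z <- runif(n, 0.5, 5)      # vertical pitch location

# ASSUMPTION: hit probability peaks at (0, 2.5) and falls off in every
# direction, so the log-odds are NOT linear in X and Z.
p_true <- plogis(0.5 - 2 * X^2 - (Z - 2.5)^2)
sim <- data.frame(X = X, Z = Z, Hit = rbinom(n, 1, p_true))

# GLM: log-odds linear in X and Z
fit_glm <- glm(Hit ~ X + Z, family = binomial, data = sim)
# GAM: log-odds a smooth surface over (X, Z), as in heat_plot
fit_gam <- gam(Hit ~ s(X, Z), family = binomial, data = sim)

# Predicted hit probability at the middle of the zone
center <- data.frame(X = 0, Z = 2.5)
p_glm <- as.numeric(predict(fit_glm, center, type = "response"))
p_gam <- as.numeric(predict(fit_gam, center, type = "response"))

# type = "response" applies the inverse logit to the linear predictor,
# the same conversion heat_plot does with exp(lp) / (1 + exp(lp))
lp <- as.numeric(predict(fit_gam, center))

print(c(truth = plogis(0.5), glm = p_glm, gam = p_gam))
print(AIC(fit_glm, fit_gam))
```

Because the simulated surface has a peak, a plane on the logit scale cannot follow it, while the smooth term `s(X, Z)` can; comparing the two predictions at the center and the AIC values shows the gap.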
On Jun 10, 2019, at 2:22 AM, hansr0518 wrote:
Thank you for sharing your work. I have a question about using GAM. Why did you use GAM function instead of using GLM? If you could share it, that will be so great. Thanks again!!
Thank you for the explanation!!