I have a really big post coming, but I’m driving to Cleveland tonight and there’s no chance I’ll be able to finish it. So here is a little digression from the middle of it all that pretty much stands on its own. I hope to post the rest sometime this weekend.
Chad wondered: can we look at luck on individual plays, or does it only make sense in aggregate? We discussed this over brunch for a while (our fiancés were thrilled), and I think the answer is yes, you can look at individual plays. In my original post, I didn’t specify what the odds were that the shortstop got to the ball, but in my head I was imagining a play that would be a hit maybe 90%-95% of the time. Now is that 5%-10% unlucky or just unlikely but inevitable? How about this play? That’s a home run 99.999% of the time, and I think we can all agree the guy who hit that was unlucky.
After a lot of thought, I realized that any cutoff is an arbitrary cutoff and in reality we should only be talking about degrees of luck rather than bucketing everything into categories. From a batter’s perspective, every hit is lucky and every out is unlucky. It’s a bit like win probability added insofar as every hit increases the chances of a win and every (nonproductive) out decreases the chances of a win.
A ball in play which is a hit 50% of the time that is turned into an out is unlucky, and that same ball when it is a hit is equally lucky. A ball which is a hit 99% of the time that actually results in a hit is lucky, but just a little tiny bit lucky. A ball which is a hit 99% of the time but is turned into an out is about as unlucky as you could be. Getting 6 hits on 10 balls in play with an expected hit total of 3.6 is lucky, but not as lucky as getting 7 or 8 or 9 or 10 hits on those same balls; 5 hits in that situation is still somewhat lucky, but relatively less lucky than what actually happened. And if you’ve got a 30% chance of a ball in play turning into a hit and it does, that’s equivalently lucky to getting two hits on balls that each have a 54.8% chance of turning into hits, or getting three hits on balls that each have a 66.9% chance (using the binomial probability that every one of the balls falls in: 30% ≈ 54.8% * 54.8% ≈ 66.9% * 66.9% * 66.9%).
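To make the last equivalence concrete, here is a minimal sketch of that arithmetic. It assumes the balls in play are independent of each other, and the function name `equivalent_per_ball_probability` is just an illustrative choice, not an established stat.

```python
def equivalent_per_ball_probability(single_hit_prob: float, n_hits: int) -> float:
    """Per-ball hit probability at which n_hits straight hits are exactly
    as likely as one hit on a ball with probability single_hit_prob."""
    if not 0.0 < single_hit_prob <= 1.0:
        raise ValueError("single_hit_prob must be in (0, 1]")
    if n_hits < 1:
        raise ValueError("n_hits must be at least 1")
    # Assuming independent balls: p_each ** n_hits == single_hit_prob,
    # so p_each is the n-th root of single_hit_prob.
    return single_hit_prob ** (1.0 / n_hits)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, round(equivalent_per_ball_probability(0.30, n), 3))
```

For a 30% ball, the two-hit and three-hit cases round to 0.548 and 0.669, which is where the 54.8% and 66.9% figures come from.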
it seems to me that you are on the verge of creating a new stat that actually measures luck in some way. if we could narrow down each play into a likelihood of going for a hit based on the batted ball data and the “true talent” of the fielders, you could establish a way of measuring both average luck and cumulative luck, it seems.
if a batted ball is 30% likely to fall for a hit (a perfectly average batted ball, i guess), and it does, you could argue that is a luck factor of +.7 (it should have been .3 of a hit…instead it was a full hit) and if the same ball turns into an out, it would be a luck factor of -.3 (same argument but in reverse).
A player with 10 balls in play, all with a 30% chance of falling, who gets 3 hits would have a cumulative luck factor of 0 and an average luck factor of 0. If all 10 went for hits, the cumulative luck factor would be 7 and the average would be .7.
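A rough sketch of that bookkeeping, assuming we somehow already have a hit probability for every ball in play (the hard part, and not something this code solves). The function names are made up for illustration.

```python
from typing import List, Sequence, Tuple


def luck_factors(hit_probs: Sequence[float], was_hit: Sequence[bool]) -> List[float]:
    """Per-ball luck: (1 - p) when the ball falls for a hit, (0 - p) when it is an out."""
    if len(hit_probs) != len(was_hit):
        raise ValueError("need one outcome per ball in play")
    return [(1.0 if hit else 0.0) - p for p, hit in zip(hit_probs, was_hit)]


def cumulative_and_average_luck(
    hit_probs: Sequence[float], was_hit: Sequence[bool]
) -> Tuple[float, float]:
    """Sum of the per-ball luck factors, and that sum divided by balls in play."""
    factors = luck_factors(hit_probs, was_hit)
    total = sum(factors)
    return total, total / len(factors)


if __name__ == "__main__":
    probs = [0.30] * 10
    # 3 hits on 10 balls at 30% each: luck should net out to roughly zero.
    print(cumulative_and_average_luck(probs, [True] * 3 + [False] * 7))
    # All 10 fall in: roughly 7 cumulative, 0.7 average.
    print(cumulative_and_average_luck(probs, [True] * 10))
```

Floating-point rounding means the first case may print a value very close to zero instead of exactly 0.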
it seems to me that this effectively gives the same information we EXPECT to get from BABIP, but corrects it for the ability of the hitter to hit the ball in certain places or certain ways. a batter who on average hits balls with a 40% chance of dropping and has a .360 BABIP should have an average luck factor of -.04. this number is essentially the difference between actual BABIP and a very precise measurement of expected BABIP.
i am not sure how easy it would be to create this expectation for every ball in play…but i think it would be pretty useful in terms of interpreting just how lucky a guy is, as opposed to how likely he is to repeat a very good (or very bad) BABIP.
I do agree there is something there!
I did some more thinking, and this is the right way to start things off. However, the value isn’t just 1 for a hit and 0 for an out – a batter can still be unlucky if a CF cuts a ball off in a gap and holds him to a single as opposed to extra bases. But overall, I think that’s basically the right math that you outlined.
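One way to sketch that extension is to replace the 1-or-0 value with a value per outcome and compare what the batter actually got against the expected value. The code below uses total bases as the value of each outcome purely as an assumption for illustration (a run-value or win-probability weight would slot in the same way), and the outcome probabilities in the example are invented, not measured.

```python
from typing import Mapping


def base_luck(outcome_probs: Mapping[int, float], actual_bases: int) -> float:
    """Luck on one ball in play, measured in bases: actual minus expected.

    outcome_probs maps bases gained (0 = out, 1 = single, ...) to the
    probability of that outcome for this particular batted ball.
    """
    if abs(sum(outcome_probs.values()) - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    expected_bases = sum(bases * p for bases, p in outcome_probs.items())
    return actual_bases - expected_bases


if __name__ == "__main__":
    # Hypothetical gap shot: usually a double, sometimes cut off for a single.
    gap_shot = {0: 0.10, 1: 0.30, 2: 0.50, 3: 0.10}
    # The CF cuts it off and holds the batter to a single.
    print(base_luck(gap_shot, actual_bases=1))
```

With these made-up numbers the expected value is about 1.6 bases, so the single comes out as roughly 0.6 bases unlucky even though it counts as a hit.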