Jon William Crain

Jon thunk about Baseball: Pitch Selection, Location and Hitting

Welcome to Jon thunk, where I, Jon Crain, think about things and write about them in a hopefully interesting way. My sources are Google and Wikipedia mostly and I won't be citing them except as sources of further interest for nerds like myself.

Baseball, specifically the part of baseball I'm going to thunk about today, pitching and hitting, is a delicate balance between the human abilities to predict and to react.

The catcher's job is to locate the ball somewhere the batter doesn't expect at a time the batter doesn't expect, in conjunction with the pitcher who must actually deliver the pitch. Pitching must be a particularly difficult job as it pays quite well and a lot of people would like to do it even though it makes your arm hurt.

The hitter's job is to predict the type and location of the pitch, react and hit the ball hard.

The battery (pitcher and catcher) can't control the hitter's reactions; they can only try to exploit the hitter's predictions. If they make a pitch the hitter hasn't predicted at all and the hitter just reacts perfectly and gets a home run, the battery is powerless.

If we made the pitcher pitch from 2nd base, the game wouldn't be very interesting. Hitters would just hit almost every ball, as there would be no need to predict, just react.

Conversely, if we let pitchers throw from half the distance to the plate, we'd have a comical game of predictions only and no reactions, where the hitter would probably be best off bunting every time and moving the bat rapidly in the zone as the pitcher tried to throw it past his wildly flailing bat. I would actually pay to watch that. I'd suggest a two-on-one game where the battery has to field the ball and race the runner to first if they don't catch it in the air. The catcher's job would be even less fun than it is today, facing 100 mph pitches from 30 feet away. Actually I think we can make this work.

So, how can the battery make it so the hitter will predict incorrectly most of the time and have to rely purely on reaction? The optimal theory that would give the hitter no predictive power from the battery's perspective (unexploitable by the hitter) is to randomly select one of their pitches, and a random location near the plate. Sometimes it will go over the plate, sometimes it won't, and that's a good thing because it's less predictable. This is incidentally the probabilistically optimal strategy in absence of information.

(https://en.wikipedia.org/wiki/Nash_equilibrium)

Imagine eight points, one on each corner of the strike zone, one on the middle of each edge of the strike zone, and then 8 points outside the strike zone corresponding to each of these points. That's where you want the ball to cross the plate as a pitcher.
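For the nerds, here's a minimal sketch of what that random battery might look like in Python. The spot names, the four-pitch repertoire, and the function name `call_pitch` are all my own placeholders, not anything official; the only thing taken from above is the layout of eight points on the zone plus eight matching points just off it.

```python
import random

# Eight points on the strike zone: four corners and four edge midpoints.
# (Names are illustrative placeholders.)
SPOTS = [
    "top-left", "top-middle", "top-right",
    "middle-left", "middle-right",
    "bottom-left", "bottom-middle", "bottom-right",
]

# Each spot has a twin just outside the zone, for 16 targets in total.
LOCATIONS = [(spot, depth) for spot in SPOTS
             for depth in ("on the zone", "just off the zone")]

# Assumed four-pitch repertoire; a real pitcher has three to five.
PITCH_TYPES = ["fastball", "curveball", "slider", "changeup"]


def call_pitch(rng=random):
    """Pick a pitch type and a target uniformly and independently."""
    return rng.choice(PITCH_TYPES), rng.choice(LOCATIONS)


if __name__ == "__main__":
    for _ in range(5):
        pitch, (spot, depth) = call_pitch()
        print(f"{pitch}: {spot}, {depth}")
```

Every call ignores the count and whatever was thrown before, which is the whole point: there is nothing for the hitter to learn from.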

This incidentally is my pitching strategy on MLB The Show(TM) [Not a sponsored article but whoever owns The Show or a competing product feel free to reach out].

Math section, feel free to skip.

Let's assign the hitter's predictive power a number based on the above system. We have 16 possible pitch locations times the number of pitch types the pitcher can throw (normally three to five), T. Predictive power P equals 1 in 16 times T. With T = 4, for example, that gives us a 1 in 64 chance of guessing the right location and pitch and getting a home run. That's a predictive power of about 1.6% per pitch.

But really, if the hitter can just predict inside, outside, high, or low and a pitch type, they'll probably crush it, so let's give it a more realistic number of 4 times 4, or 1 in 16, so a 6.25% chance to get really good contact per pitch. With an average of five pitches per at bat, you would expect a batter to get a solid hit or a home run about 30% of the time per at bat (5 times 6.25% is roughly 31%). So a .300 batting average, plus whatever scraps the batter can pick up on pure reaction or flubbed pitches. If the hitter can predict even just the pitch type (control or fastball) they probably have a much better chance of getting a hit. So we're not actually that far off from reality with our uber-simplistic model, if you consider the hitter may only have an inkling of the pitch type or location, almost never both at the same time. I'll definitely do a follow up formal model with every factor I can think of in 30 minutes from each side.
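If you'd rather check the arithmetic than trust me, here's a small Python sketch. The function names are made up for this post. It also shows a slightly stricter version of the per-at-bat number: multiplying 6.25% by five counts a hitter who guesses right twice as two hits, whereas the chance of guessing right at least once in five independent pitches comes out a bit lower.

```python
def guess_chance(locations, pitch_types):
    """Chance of guessing one pitch exactly, if every combination is equally likely."""
    return 1 / (locations * pitch_types)


def chance_in_at_bat(per_pitch, pitches):
    """Chance of at least one correct guess over independent pitches."""
    return 1 - (1 - per_pitch) ** pitches


fine = guess_chance(16, 4)    # 16 locations, 4 pitch types
coarse = guess_chance(4, 4)   # inside/outside/high/low, 4 pitch types

print(f"exact spot and pitch: {fine:.2%} per pitch")
print(f"rough area and pitch: {coarse:.2%} per pitch")
print(f"simple estimate over 5 pitches: {coarse * 5:.1%}")
print(f"at least one right guess in 5:  {chance_in_at_bat(coarse, 5):.1%}")
```

Either way you land in the neighborhood of a .300 hitter, which is all the uber-simplistic model was ever going to give us.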

End Math Section

Note that whether the hitter wants a ball or a strike based on the count is irrelevant if we truly want to be unpredictable. If we always throw a strike on a 3-1 (three balls, one strike) pitch, that's bad, as the hitter's predictive power goes up. If we always throw a ball on an 0-2 pitch, that's bad for the same reason.

Immediately, theory runs into the harsh wall that is reality.

Problem one, humans are bad at generating random sequences. A catcher's brain, no matter how hard they try, can't generate truly random choices. Our brains are so wired to make connections that we try to make random things too random. For example, if the pitcher has four pitch types and throws a fastball, the catcher should still call a fastball for the next pitch 25% of the time. Otherwise the batter, knowing they never throw two fastballs in a row, can rule out the fastball and predict the pitch type 33% of the time instead of 25%.
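Here's a rough simulation of that 25% versus 33% gap, as a sketch under my own assumptions: four equally likely pitch types, and a hitter who always bets that the last pitch won't be repeated. The function names are invented for illustration.

```python
import random

PITCH_TYPES = ["fastball", "curveball", "slider", "changeup"]  # assumed repertoire


def iid_caller(previous, rng):
    """Truly random: every pitch type is always in play, repeats included."""
    return rng.choice(PITCH_TYPES)


def no_repeat_caller(previous, rng):
    """'Too random': never calls the same pitch type twice in a row."""
    return rng.choice([p for p in PITCH_TYPES if p != previous])


def hitter_guess_rate(caller, n=100_000, seed=0):
    """Share of pitches guessed right by a hitter who bets against repeats."""
    rng = random.Random(seed)
    previous = rng.choice(PITCH_TYPES)
    correct = 0
    for _ in range(n):
        # The hitter rules out the previous pitch and guesses among the rest.
        guess = rng.choice([p for p in PITCH_TYPES if p != previous])
        pitch = caller(previous, rng)
        correct += guess == pitch
        previous = pitch
    return correct / n


if __name__ == "__main__":
    print("vs truly random caller:", hitter_guess_rate(iid_caller))
    print("vs no-repeat caller:   ", hitter_guess_rate(no_repeat_caller))
```

Against the truly random caller the hitter should come out near one in four no matter what he does; against the no-repeat caller the same hitter should land near one in three.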

(https://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables)

If any catcher actually wants to try this, feel free to consult me and I can give some tips for generating good random sequences.

Problem two, the pitcher may not be able to always use all their pitches as they get tired. Pitchers tend to lose control first, so later in the game they have to add in more fastballs in general. This bias makes them more predictable to the batter as the game goes later.

Problem three, the pitcher cannot always hit his target. He could be aiming for the middle-right with a fastball and miss. This really just makes him more unpredictable, which is good in theory, as long as it doesn't go over the middle of the plate and have Schwarber crush it 450 feet.

There are certainly more problems with trying to play theory-optimally, but I think I've made my point. The main reason not to play theory-optimally is that the batter is a human, and prone to human weaknesses and bias. The battery is almost certainly better off trying to deceive the hitter by exploiting human psychology. While the above strategy cannot be predicted by the hitter, it's very possible that the correct sequence of pitches could make the hitter guess wrong much more often than if the catcher picked a random pitch and the hitter predicted a random pitch. Using the history of pitches thrown, the catcher can exploit the human tendency to see patterns that aren't there.

This sets up a natural terrier and mouse situation.

Let's make this as simple as possible (which as you will see isn't simple at all) and just think about the decision to throw a strike or a ball as the catcher, and whether to swing or not as the hitter.

Starting from the catcher's perspective at level one, the battery thinks about what the hitter is expecting. With no count, the hitter is probably expecting a strike, as the rules say three strikes and you're out and you can't get there, and won't have your job for very long, without throwing the first strike.

So, the pitcher should throw a ball on the first pitch of the game, which the batter should swing at and miss most of the time, statistically, because they expect a strike on the first pitch.

The second pitch now gets more interesting as there's now history, which is the actual art of pitching. Incidentally, there's probably already history unless it's the pitcher's first pitch of their career, and normally you can go back to minors, then college to look for patterns if the information is there. Also the batter has history. Let's put that aside and think about the most simple situation with no history, first pitch of their career in the majors, first pitch of the game, against a batter hitting for the first time (what a coincidence!).

Now let's consider the hitter's perspective, which we'll call level two. The hitter knows the pitcher is going to throw a ball as the level one battery thinks the hitter is expecting a strike. So, don't swing at the first pitch, unless it's slow enough that you can just crush it reactively. This also makes sense realistically. The pitcher may be off today, physically or mentally, so they might not even be able to throw a strike. If they are a new pitcher, they haven't even established they can throw a strike fast enough that you can't just react and hit the ball.

Back to the catcher, level three. The battery knows the hitter expects a ball and therefore won't swing. So, the battery should throw a strike, because the level two hitter is going to watch it go by.

I think we can see where this is heading (madness).
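To see the madness laid out, here's a toy Python sketch of the levels. It's my own simplification: each side has exactly two moves, each level just picks the best answer to the level below it, and the naive "level zero" hitter swings because he expects a strike. All the names are invented.

```python
def battery_response(hitter_action):
    """Throw a ball past a swinger, a strike past a taker."""
    return "ball" if hitter_action == "swing" else "strike"


def hitter_response(battery_action):
    """Swing at an expected strike, take an expected ball."""
    return "swing" if battery_action == "strike" else "take"


def level_k(levels):
    """List (level, who, action), alternating battery and hitter from level 1."""
    plan = []
    hitter_action = "swing"  # level 0: naive hitter expecting a strike
    battery_action = None
    for level in range(1, levels + 1):
        if level % 2 == 1:
            battery_action = battery_response(hitter_action)
            plan.append((level, "battery", battery_action))
        else:
            hitter_action = hitter_response(battery_action)
            plan.append((level, "hitter", hitter_action))
    return plan


if __name__ == "__main__":
    for level, who, action in level_k(8):
        print(f"level {level}: {who} -> {action}")
```

Run it out a few levels and the answers just cycle: ball, take, strike, swing, and back to ball at level five. Thinking one level deeper never settles anything.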

These levels can go on forever, but really, for this decision set, it boils down to this: if the battery predicts the hitter will always swing, they should throw a ball, and if they think the hitter will never swing, they should throw a strike. If the batter predicts the pitcher will always throw a ball he shouldn't swing, and the opposite for a strike. The real world decision process is more complex, as the battery should really be asking whether the hitter is more likely to swing or not, and the batter should be asking whether the pitcher is more likely to throw a ball or a strike. It's a continuum, not a binary system as I modeled in the example.
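The continuum version can be sketched as a simple expected-value comparison. Big caveat: the payoff numbers below are placeholders I made up (every good outcome is worth +1, every bad one costs 1), so with these defaults each rule just collapses to "go with whatever is more likely than not". Real payoffs depend on the count, the hitter, and plenty else.

```python
def hitter_choice(p_strike, win=1.0, lose=1.0):
    """Swing or take, given the hitter's estimate that a strike is coming.

    Assumed payoffs: swinging at a strike or taking a ball gains `win`;
    swinging at a ball or taking a strike costs `lose`.
    """
    ev_swing = p_strike * win - (1 - p_strike) * lose
    ev_take = (1 - p_strike) * win - p_strike * lose
    return "swing" if ev_swing > ev_take else "take"


def battery_choice(p_swing, win=1.0, lose=1.0):
    """Ball or strike, given the battery's estimate that the hitter swings.

    Assumed payoffs: a ball to a swinger or a strike to a taker gains `win`;
    a strike to a swinger or a ball to a taker costs `lose`.
    """
    ev_ball = p_swing * win - (1 - p_swing) * lose
    ev_strike = (1 - p_swing) * win - p_swing * lose
    return "ball" if ev_ball > ev_strike else "strike"


if __name__ == "__main__":
    for p in (0.3, 0.5, 0.7):
        print(f"p = {p}: hitter {hitter_choice(p)}, battery {battery_choice(p)}")
```

Change `win` and `lose` and the tipping point moves away from 50%, which is one way to think about why the count matters so much in practice.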

Not only that but the battery really needs to decide, should we throw a ball or a strike? Where should we place the ball? What pitch type should we throw?

The hitter has to decide, should I swing or not? If I swing, where will the ball likely be placed? And finally, what pitch type will he throw?

Anyway, in reality, I imagine (I don't know as I've never played baseball at a professional level) we run into other human factors, which are limited intelligence to consider all factors, capacity to remember, and limited time to make a decision. Woe is the pitch clock. As a catcher, if you're considering where to aim you probably want to consider the umpire history as well and a myriad of other things.

Batters probably just learn to rely on their instincts, as actually calculating predictions clearly isn't tractable for mere mortals. Catchers, in turn, try to guess what the batter's instinct will be for a given count and game situation, and call the pitch accordingly.

So we've analyzed a theory-optimal unexploitable approach to pitch selection and a decision model for pitch selection. In a follow up article I'll go over how I think batteries should actually think about decisions and batters should actually think about decisions.

Let's make like a scientist and actually look at some evidence. Pitchers throw a strike (in general, disregarding batter handedness, batter average, history, etc.) on the first pitch of every at bat about 57% of the time. So the takeaway is if you're a batter, swing away every 0-0 count because it's most likely a strike. The Phillies were right all along.

Enjoyed this article? Tip me on GoFundMe.