Tangotiger Blog

Sunday, April 16, 2023

Statcast Lab: CS accuracy model

The Catcher Caught Stealing Leaderboards will be upon us any day now. As part of that, you will see a breakdown for Catcher Accuracy. The baseline uses this CS data:

The top row is how many feet the ball is from the 2B bag, along the baseline.  The first column is the height of the throw, if the throw reaches the baseline.  Otherwise, we look at the second column, which is the number of feet the ball hits the ground in front of the baseline.

As you can imagine, the best throw is a couple of feet toward the incoming runner, and a couple of feet above the ground.  You can see the general radius, as 2 to 4 feet into the 2B bag, and 1 to 3 feet high. And the more the ball is away from the target radius, the lower the CS percentage is.

If we assume the average CS is 27%, then a throw in the target radius, where the CS rate is 62%, is worth +.35 CS in accuracy. Naturally, a throw far away from the 2B bag and way high off the ground will have an estimated CS% of 0%, and so the accuracy value will be -.27 CS.
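
As a minimal sketch of that arithmetic (the function and constant names here are illustrative, not anything from Savant), the accuracy value of a throw is just the estimated CS rate at its location minus the average CS rate:

```python
LEAGUE_AVG_CS = 0.27  # average CS rate assumed in the post


def accuracy_value(expected_cs_at_location, league_avg=LEAGUE_AVG_CS):
    """CS gained (or lost) by throw location, relative to an average throw."""
    return expected_cs_at_location - league_avg


# A throw in the target radius (62% CS) and a throw far off target (0% CS).
print(round(accuracy_value(0.62), 2))  # 0.35
print(round(accuracy_value(0.00), 2))  # -0.27
```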


(2) Comments • 2023/04/16 • Baserunning

Sunday, April 09, 2023

Speed and Pickoffs in 2023

We can create a simple metric that is SB minus 2 times CS. The breakeven point is somewhat higher than that. It's closer to 3 to 1 on some occasions. But if you use the stolen base in small ball situations, it is closer to 70%. So, 2:1 is a simple enough point to show.

And for SB I will include balks, while for CS I will include pickoffs.
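
Here is a rough sketch of that metric in Python. The function names are made up for illustration, and the counts in the usage lines are hypothetical, not Statcast data. The last helper shows why a 2:1 weight corresponds to a breakeven success rate of about 67%, and 3:1 to 75%.

```python
def net_sb(sb, cs, balks=0, pickoffs=0, cs_weight=2.0):
    """(SB + balks) minus cs_weight times (CS + pickoffs)."""
    return (sb + balks) - cs_weight * (cs + pickoffs)


def net_sb_per_100(sb, cs, pitches, balks=0, pickoffs=0, cs_weight=2.0):
    """Net SB scaled to a per-100-pitches rate."""
    return 100.0 * net_sb(sb, cs, balks, pickoffs, cs_weight) / pitches


def breakeven_rate(cs_weight):
    """Success rate at which SB - cs_weight * CS nets out to zero."""
    return cs_weight / (cs_weight + 1.0)


# Hypothetical counts: 5 SB, 1 balk, 1 CS, 0 pickoffs, over 200 pitches.
print(net_sb(5, 1, balks=1))                       # 4.0
print(net_sb_per_100(5, 1, pitches=200, balks=1))  # 2.0
print(round(breakeven_rate(2), 3), breakeven_rate(3))
```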

For players with 30+ feet/sec of seasonal Sprint Speed, and no prior pickoff attempts, they have 6 SB and 1 CS, on 134 pitches.  That's a net of +4 SB on 134 pitches, or +2.6 per 100 pitches.

And this follows the pattern you'd expect.  For players at 29-29.99 ft/s, they are a net +0.8 SB per 100.  At 28-28.99, they are +0.5 per 100.  Below 28, and they are a net negative.  

That was with no prior pickoff attempts.  Once a pickoff attempt happens, things change radically.  Players at 26-26.99 ft/s jump up to +0.7 net SB per 100.  This puts them squarely at the equivalent of 29 ft/sec with no prior pickoff attempts.  In other words, it's like adding 2.5 ft/s of speed, simply by having a prior pickoff attempt.

And it gets worse.  Players at a speed of 27-28.99 ft/s have a net SB of +2.9 per 100 pitches (with 1 prior pickoff attempt).  That is similar to the speedsters of 30+ ft/sec with no pickoff attempts.  Again, it's like adding 2 or 2.5 ft/s of speed.  That's what the prior pickoff attempt does.

The solution? Do not attempt a pickoff on a runner. Ever. Of course, once word gets out that a pitcher is not going to pick off a runner, ever, the runners will take longer leads. So, there's going to be a heavy amount of game theory going on, as the season progresses.

This is a lot like the 3-0 count. Once you learn that a batter will NEVER swing at a 3-0 count, the pitcher will groove more pitches in. Of course, a batter will only resist for so long before he swings at 3-0. And so, there's a balance that's going to be achieved. Same thing applies with the running game.


Tuesday, April 04, 2023

Rewatching Gibson/Eck, with the Pitch Timer

Watch it with me, and fact-check me...

https://www.youtube.com/watch?...

Play 1: It's not clear how the pinch hitter works here, the machinations required to make it official, and when Gibson has 30 seconds to get in the box. At the 0:55 mark, you see the umpire still walking. At the 1:17 mark, the umpire points to Eck, meaning the ball is now live. Don't forget, there's a runner on base, so the ball is live only when the umpire says so.

Play 2: It's at the 1:20 mark, so clear compliance. It's a foul ball, so the batter will get some leeway here. Runner has to get back to the base at a quicker-than-Panda pace.

Play 3: At the 1:41 mark, the umpire again signals a live ball. But even if the ball was live back at the 1:30 mark, Gibson is alert by 1:42. Pickoff attempt at 1:45. We're in compliance.

Note: Eck will have A LOT of pickoffs. So, eventually he will be out of compliance here. But since Davis will end up stealing anyway, then stolen base or forced balk, you get the same result!

Play 4: At the 1:49 mark, we'll presume Eck gets the ball back. Gibson has stepped out and will step out after EVERY play, whether a foul ball or pickoff or stolen base. Every time. Every single time. Gibson is alert at 2:03. That's 14 seconds, rather than 12. Given a non-existent timer, I think we can probably say Gibson would have been in compliance had he known that 35 years later, the internet would exist and be doing this. Eck delivers in compliance, and we get a foul ball at 2:07.

Play 5: Eck gets the ball at 2:13, but we need the runner to get back to retouch the bag. Let's say that's at 2:20. Gibson is alert at 2:28. Eck another pickoff at 2:33.

Play 6: Eck gets the ball at 2:37. Like I said, Gibson is always stepping out. There will be no exception. So just remember that. He's alert at 2:50. That's 13 seconds, but let's assume that's compliance again. Another pickoff at 2:55.

Play 7: Eck gets the ball at 3:00. Gibson alert at 3:09. Just rewatch that. He had stepped out. He slowly comes in. He digs his foot in, because he is injured and he needs to set himself perfectly. He does all that in 9 seconds. Eck delivers in time. This is the little foul ball down the line that Gibson had to start running.

Play 8: The umpire gets a new set of balls at the 3:42 mark, so obviously, play is not live yet. No timer is running. Gibson is alert and in the box before the umpire puts on his mask. A catcher pickoff now happens.

Play 9: Eck gets the ball at 4:02. It's at 4:19 now when we come back from a replay and Gibson is already alert in the box. Unclear when he became alert. Let's say he was at, or close to, compliance. Eck in compliance, Davis tries to steal, and we have a foul ball. We need time for Davis to get back to first.

Play 10: Davis is back at 4:41. Gibson is alert at 4:46. Eck in compliance.

Play 11: Eck gets the ball at 4:55. Gibson is alert at around 5:08. Hard to tell. But close enough to compliance. Eck another pickoff.

Play 12: Eck gets the ball at 5:16. Gibson alert at 5:29. Again, 13 seconds, close enough for what we're doing here. Gibson, I should note, not only walked out of the box, but also tapped his shoes with his bat to get rid of dirt. Gibson does a heckuva lot in the 12 or so seconds he didn't know we are allotting him 35 years later. Eck in compliance, Davis steals. Finally, all those Eck pickoffs come to an end.

Play 13: Eck gets the ball at 5:39. Runner has dirt on his pants, not sure how much time he gets to clean himself up. Doesn't matter because Hassey calls for a timeout for a mound visit. At 6:14 umpire puts his mask back on. Gibson is alert at 6:19. Doesn't matter, because now it's Gibson's turn to call for time. Batters can call one timeout in 2023, and this is it. Even though it's 1988. Gibson is ahead of his time.

Play 14: At 6:37, the umpire signals for the pitch, making the ball now live. Everyone in compliance. Spoiler alert.... spoiler alert.... Home run.

And for 7 glorious minutes, we have what is for many people the greatest baseball event ever. And those 14 plays in 7 minutes were in full compliance with the non-existent pitch timer.
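
For anyone fact-checking along, the elapsed times quoted in Plays 4, 6, 7 and 12 can be recomputed from the video marks with a few lines of Python. The 12-second allotment is the figure used above; the 2-second grace is a hypothetical allowance standing in for "close enough", not an actual rule.

```python
ALLOTMENT = 12  # seconds the batter gets to be alert, per the post
GRACE = 2       # hypothetical leeway for a timer nobody could see in 1988


def to_seconds(mark):
    """Convert an 'm:ss' video mark to total seconds."""
    minutes, seconds = mark.split(":")
    return int(minutes) * 60 + int(seconds)


def elapsed(start, end):
    """Seconds between two 'm:ss' marks."""
    return to_seconds(end) - to_seconds(start)


# (pitcher gets the ball, batter alert) marks as quoted in the post.
plays = {
    4: ("1:49", "2:03"),
    6: ("2:37", "2:50"),
    7: ("3:00", "3:09"),
    12: ("5:16", "5:29"),
}

for play, (got_ball, alert) in plays.items():
    secs = elapsed(got_ball, alert)
    print(f"Play {play}: {secs}s, within allowance: {secs <= ALLOTMENT + GRACE}")
```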


(1) Comments • 2023/04/05 • History

Sunday, April 02, 2023

Strikeouts v other outs

The run value of the out is similar to the run value of the strikeout. Therefore, to evaluate the AVERAGE PLAY, there's not much difference (overall). For specific plays, naturally a strikeout with a runner on 3B and less than 2 outs is highly preferred to the defense. At the same time, a ground out with a runner on 1B and less than 2 outs is highly preferred to the defense.

To evaluate the PLAYER, then things are very different. For a pitcher, the strikeout is HIGHLY indicative of quality talent. For a batter, the strikeout in isolation can't be used. It's all part of their profile. A HR hitter will naturally have more strikeouts. So, there's a balance there.


(3) Comments • 2023/04/06 • Linear_Weights

Monday, March 27, 2023

NHL Draft, using the Gold Points

There is a rather clever standings model proposed by Adam Gold (@winunlimited). Let's apply a small variation: a team, starting at the all-star break, declares whether they will forgo a run at the Stanley Cup, and instead make a run for the top draft pick (Connor Bedard, let's say).

At the all-star break, courtesy of data provided by the kind and generous @domluszczyszyn, eleven teams had almost no shot at the Stanley Cup.  Those teams would all declare they are giving up on the Cup and are all-in on the Draft (aka, the Bedard sweepstakes).  Every game now counts toward the Draft.

The wonderful thing about the Gold points (besides the amazingly great name so we're lucky that Adam is named Gold) is that every win counts toward something.  Tanking is a thing of the past.

Two games after the all-star break, the Sabres would also likely declare themselves for the Draft. The Sabres went on a decent run there for a while, winning 5 of 6, and so were able to get into the Gold Standings pretty well, even though they will end up having a couple fewer games than the rest. At some point in early to mid March, the Capitals would also have declared.

If things went as above, this is how the Gold Standings look:

Gold TEAM
30 Vancouver Canucks
29 Arizona Coyotes
24 Ottawa Senators
23 St. Louis Blues
21 Detroit Red Wings
20 Buffalo Sabres
20 Chicago Blackhawks
20 Montréal Canadiens
19 Anaheim Ducks
19 Columbus Blue Jackets
17 Philadelphia Flyers
12 San Jose Sharks
7 Washington Capitals

As you can see, the Canucks and the Coyotes would be fighting for Bedard (or whoever they want).  And every game becomes important.  And this is true whether you go for the Cup or not.

The Predators and Flames would be the next teams to try to figure out when they'd declare for the draft.

One note: while I said "all-star break", we'd probably have to make it something like "after 50 games". At the all-star break, games played ranged from 48 to 54, so naturally, you couldn't use the date, since the number of future games would not be the same when everyone has the chance to declare. Just a matter of selecting the game number. You don't want it too early. Probably 41 games (halfway point) is the fewest games before you declare for the Draft.
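
A bare-bones sketch of the bookkeeping, under two assumptions that are not spelled out above: Gold points use the usual NHL standings points (2 for a win, 1 for an overtime or shootout loss), and only games played after the declaration game count. The results list is hypothetical.

```python
# Assumed standard NHL standings points; the post does not spell these out.
POINTS = {"W": 2, "OTL": 1, "L": 0}


def gold_points(results, declared_after_game):
    """Sum standings points from games played after the declaration game.

    results: a team's game outcomes in schedule order ("W", "OTL" or "L").
    declared_after_game: number of games played when the team declared.
    """
    return sum(POINTS[r] for r in results[declared_after_game:])


# Hypothetical team that declares after its 2nd game of this stretch.
results = ["W", "L", "OTL", "W", "W"]
print(gold_points(results, declared_after_game=2))  # 5
```

Keying the declaration to a game number rather than a date, as suggested above, is what keeps the number of remaining games comparable across teams.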


Friday, March 24, 2023

Improving WAR - Implicit Regression Toward the Mean

Yesterday, I showed you a method for finding the replacement level.

Perhaps the biggest source of confusion when it comes to sports data is the amount of Random Variation contained in the observations.  In the above thread, I focused on players who had at least 5 Individualized Games (iGames, or iG), which represents half a season, for the most recent season, and at least 15 iG over the previous three seasons.  Why did I do that?  Because I needed a substantial amount of data in order for the signal to suppress the noise.  And with an average of about 20 iGames (the equivalent of two full seasons), that was enough.

Now, I will show you how NOT to find the replacement level. I will focus on players with at least 5 iG in the previous season, with no checking on how they did in prior seasons.  With 1424 players, there are 124 who have an Indis win% of under .200.  Their average win% was .119.

Does this represent their TRUE talent? No, not at all. It represents SOME of their True Talent, but also SOME Random Variation. These aren't necessarily below replacement level players. They aren't even necessarily players who "played" at a below replacement level. For all we know, these are .350 players who, through bad luck, ended up recording stats at a level of .119. How can we tell?

We can tell by looking for an unbiased estimator.  And the best place to look for that is from seasons that are NOT part of the performance observations you select from.  And the easiest place to find that is the NEXT season.  And in the next season, these players averaged 3.6 iGames at a win% of .382.  And that becomes our estimate as to their true talent level.

So, what does this mean?  Is .119 the replacement level?  Or is it .382?  They both have their problems, but the first one has a much bigger problem than the second.  The next season, the one with the .382 win%, that's limited to only players who actually played in the next season.  This is the survivorship bias.  Players who were hurt the previous year might have gotten better the next season.  Or players who truly were bad did not get a chance to show how bad they were because they were dropped.

What therefore can we do with that .119? This is where it gets really tricky. The .119 win% is an observation that has more bad luck than good luck. But we want to compare those players to the true talent level of .300 win%, which has an equal amount of good and bad luck. Is it necessarily fair to have negative WAR for those players at .119?

The median Indis win% in the next season was .275. That is probably our best estimate as to the true talent level of the group. That still leaves us what-to-do with the .119. Can we represent them as very below replacement level? After all, we probably think their true talent level as a group is .275. How then can we look at them as .119?

This is why we don't want to get too stuck on single-season observations. By expanding our sample size, we can get a much truer representation of what the replacement level is. It will be close to .300.

But we are stuck with the idea that true-talent .275 players who happened to put up numbers at a .119 level will get evaluated as .119 players against the .300 level.

The alternative, other than an explicit Regression Toward the Mean, is a floating replacement level.  So, the .119 players get compared to say a .150 replacement level.  And the .225 players might get compared to say a .240 replacement level.  And so on, until you get to .300.  This is an implicit Regression Toward the Mean.
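
To make the floating baseline concrete, here is one hypothetical way to code it. The three anchor points are the "say" values from the paragraph above; connecting them with straight lines, and holding the baseline flat outside that range, are assumptions made purely for illustration.

```python
# (observed win%, floating replacement level) pairs taken from the
# illustrative values in the post; the interpolation between them is assumed.
ANCHORS = [(0.119, 0.150), (0.225, 0.240), (0.300, 0.300)]


def floating_replacement(observed):
    """Replacement baseline that floats with the observed win%."""
    if observed <= ANCHORS[0][0]:
        return ANCHORS[0][1]
    if observed >= ANCHORS[-1][0]:
        return ANCHORS[-1][1]
    for (x0, y0), (x1, y1) in zip(ANCHORS, ANCHORS[1:]):
        if x0 <= observed <= x1:
            t = (observed - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)


for obs in (0.119, 0.225, 0.300, 0.450):
    gap = obs - floating_replacement(obs)
    print(f"observed {obs:.3f} -> baseline {floating_replacement(obs):.3f}, gap {gap:+.3f}")
```

The observations stay unaltered; only the comparison point moves, so the gap for a .119 player shrinks compared to measuring him against a fixed .300.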

And this may be what Bill James is talking about. He may actually be proposing a Regression Toward the Mean solution, but instead of it being explicit (meaning adjusting the observations to give us a posterior number to work with), he instead sticks with the observations being unaltered, and floats the comparison baseline. If this is what he is talking about, then my proposal here may be just the way to get both sides on the same page.


() Comments • • WAR

Thursday, March 23, 2023

Improving WAR - Finding the replacement level

Bill James responded to a discussion we had:

But if we can agree or more-or-less agree about the wins and losses, then we have isolated the problem of finding the replacement level. Once we reach THAT point, where players have won-lost records like 11-19 (Win Shares) or 3.7-6.3 (WAR), then I BELIEVE that everybody will be able to see that players with .200-.300 winning percentages in a season mostly do NOT lose their jobs, and sometimes continue to play at that level for years--thus, that the replacement level has not been properly assessed. Regardless of what the truth is on that level, the point is that we will be able to see. Every discipline is a matter of solving sequential problems. When we resolve one, we move on to the next.

Well, Bill, glad you asked.

Step 1: Gerardo Parra

After the 2018 season was over, Gerardo Parra was granted free agency. In those last three seasons, 2016-2018, he had a total WAR of -1.4. We can represent his performance as an Individualized Won-Loss record (or what I affectionately call The Indis). Parra had a 4-15 record (which is 19 Individualized Games, or iGames, or iG). That's 4 Individualized Wins, or iW, and 15 Individualized Losses, or iL. A 4-15 record is a win percentage of between .200 and .250.

A full season is about 10 iGames, and an average player would have a 5-5 record. Parra has 19 iG in his last three seasons, so the equivalent of 2 full seasons played over three years. He played in 359 games with 1249 plate appearances in 2016-2018. That's about 2 full seasons, which is how he gets 19 iGames.
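
The conversions in this step are simple enough to sketch in Python (helper names are illustrative): a win percentage from the iW-iL record, and full-season equivalents at about 10 iGames per season.

```python
IGAMES_PER_SEASON = 10  # "A full season is about 10 iGames"


def indis_win_pct(iw, il):
    """Win percentage of an Individualized Won-Loss record."""
    return iw / (iw + il)


def seasons_equivalent(iw, il):
    """How many full seasons the record's iGames represent."""
    return (iw + il) / IGAMES_PER_SEASON


# Parra's rounded 2016-2018 record of 4-15.
print(round(indis_win_pct(4, 15), 3))  # 0.211
print(seasons_equivalent(4, 15))       # 1.9
```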

Let's continue. Parra signed with the Giants before the 2019 season. Since the Rockies were still on the hook for his guaranteed deal, they just had to sign him for the absolute minimum. The Giants released him after playing a little bit, and the Nationals signed him on a free agent deal for the rest of the 2019 season. He did not play in 2020. He played for the Nationals in 2021, and that's the last time he played in MLB. In total, from 2019-2021, he earned close to the league minimum as a free agent. His iW-iL record was 2-4 (6 iGames), which is a win percentage of .333.

Step 2: Prologue

When we talk about replacement level, this is pretty much what we are talking about. You find a player who had a win percentage around .250 to .300 over the previous three seasons, on the idea that most of that was real, and some of it was bad luck. And we expect that player to put up a .300 win percentage the following season (aging notwithstanding). And, this is the important part: we expect such players to be on their last leg, maybe 1 or 2 more seasons left.

And in this illustration Gerardo Parra is representative of this idea. Now, is this idea REAL? Or just a cherry-picked example?

Step 3: The worst players

I looked at all players between 1982 and 2018, who were in their walk year: at the end of that season, they would enter free agency. It's not technically a walk year for all players, since some players will just be outright released, and granted free agency on that basis. There's going to be a selection bias to consider.

In any case, we have 1256 players (non-pitchers) in our study. Of those, 14 had an Indis win% of under .200. Who are these 14 players? Let me pick out a few with a personal connection. We start with Doug Flynn entering free agency for the 1985 season. In the three previous seasons (1982-84, almost all with my Expos), he had a 2-19 record. When your Indis win% is under .100, you are really really on your last leg. Which was the case for Doug Flynn: 1985 was his last season, with 61 total plate appearances with two teams. Based only on Flynn, we suspect that the win% has to be well over .100 for a player to be able to last more than a season or two.

My next player is Mike Laser Lansing, also a former Expos player. He had a 3-13 record, an Indis win% of just under .200. How did Laser do? Well, he never played in MLB again. So, a .200 win% is too low for a replacement player, if we focus only on Lansing.

How about the anti-saber player Joe Carter, hero to all Canadians? In each of 1995, 1996, 1997, he was a below 0 WAR player, even though those last two seasons, he had 100+ RBI in each. His Indis record is 5-24 for an Indis win% under .200. What happened after he hit free agency? Well, 1998 was his last season, he played for two teams totalling just over 400 PA.

Let's look at a 4th and final player, before we do the summary of all 14 players, my saber-nemesis, Ryan Howard. Those who follow my blog know how much I've written about Howard. In his 2014-16 seasons, his Indis record was 3-20. He never played MLB afterwards, even though a couple of teams signed him to minor league free agent deals.

Of these 14 players, the most successful was Billy Hatcher. He had a 4-18 record, but pulled a rabbit out of his hat in 1993, where he was an average player in 136 games. Alas, 1994-95 had him come to bat a total of 340 plate appearances.

So, we can conclude that under a .200 win% is a clear signal that your career is just about over. These 14 players averaged 2 iGames in their first season in free agency, which is 20% of a season. Their total career after entering free agency was an Indis record of 8-38. A .174 win% is not what a team has any tolerance for.

Step 4: The .200 to .250 players

Now, how about the next group of players, those with a .200 to .250 win%? We have 17 players in this group, including Parra. Remember, we are looking at a total group of 1256 players, so these 14+17 players represent less than 3% of all players. They are the really worst players of this time period. These players had an Indis win% of .226. How did their careers unwind?

In their first free agent year, none of them were full-time players. The most successful player was probably Mike Matheny, as he followed up his 3-12 record before free agency with a 16-30 record from 1999 onward. Matheny has the distinction of being one of the worst batters ever, but as a catcher, that can be forgiven with Gold Gloves, of which Matheny claimed 4 during this time period. Matheny is probably the exception that proves the rule.

Another player among these 17 is Willie Bloomquist, who is the very face of replacement level. If you play worse than Willie, you are not going to play in MLB. Willie really has had a remarkably long career, but always for a very low salary each year. I believe one time he was able to sign a two-year deal. That's another staple of replacement level players: they only sign one year deals.

Anyway, 9 of the 17 played less than one full season's worth of games. Another 3 played about exactly one full season (Bloomquist, Chris James, Tony Pena), another 4 played 2 or 3. And then Matheny.

Step 5: Interlude

Now, if you want to argue that the replacement level is somewhere between .200 and .250, you could make that argument, based on these 17 players. These 17 players averaged 12 iGames, which is just a shade over 1 full season.

Step 6: The .250 to .300 players

But before we make that proclamation, let's look at the next group, those with a .250 to .300 Indis win%. How did THEY do? Well, we have 43 players. So, now we are up to 74 total players with an Indis record under .300, which is 6% of all the players. These 43 players with the .250-.300 win% actually had a similar rest-of-career as the 17 players with the .200-.250 win%. They averaged 11 iGames, which is just a shade over a full season. They averaged a 4-7 record during free agency.

Step 7: The .300 to .400 players

Indeed, even the NEXT group were very similar. We have 78 players with a .300 to .350 Indis win%. We are now at the bottom 12% of all the players. They averaged 12.5 iGames for the rest of their career, with a .360 win%.

Even the NEXT group after THAT also had a similar rest-of-career. We have 121 players with a .350 to .400 Indis win%. They averaged 13.3 iGames for the rest of their career, for a .361 win%.

As you can see, teams are not really giving much playing time to players at this low a performance level.

Step 8: Epilogue

Remember, earlier I said this:
Now, if you want to argue that the replacement level is somewhere between .200 and .250, you could make that argument, based on these 17 players. These 17 players averaged 12 iGames, which is just a shade over 1 full season.

I will now amend that to say this:
Now, if you want to argue that the replacement level is somewhere between .200 and .400, you could make that argument, based on these 259 players (20% of all players). These 259 players averaged 12.6 iGames, which is just a shade over 1 full season.

Including the 14 really bad players, we have 273 players out of 1256, or 22% of all players. And in the three years prior to entering free agency, these players averaged a .327 Indis win%.

And that's why we use .300 as our baseline level to represent the zero-point of replacement level.
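
The group sizes quoted in Steps 3 through 8 can be tallied in a few lines of Python, which makes it easy to check the running totals and the share of the 1256 players at each cutoff:

```python
TOTAL_PLAYERS = 1256

# (Indis win% band, number of players) as quoted in the steps above.
GROUPS = [
    ("under .200", 14),
    (".200-.250", 17),
    (".250-.300", 43),
    (".300-.350", 78),
    (".350-.400", 121),
]


def cumulative_shares(groups, total):
    """Running player count and share of all players after each band."""
    running = 0
    rows = []
    for label, count in groups:
        running += count
        rows.append((label, count, running, running / total))
    return rows


for label, count, running, share in cumulative_shares(GROUPS, TOTAL_PLAYERS):
    print(f"{label:>10}  {count:>4}  {running:>4}  {share:.1%}")
```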


(2) Comments • 2023/03/24

Monday, March 20, 2023

Measuring Finger Pressure on a Pitch

One of the things that we don't measure is the finger pressure on a pitch.  Alex Fast is looking to change that, and this recent research looks fascinating.

Start with the first chart.  The difference between the first line and the second line is simply the effect of the hardware.  So, we don't learn anything there, other than the impact of the hardware.  Which as we can see, is pretty substantial: 7mph loss of speed and 8% lower spin rate.  In any case, our baseline is the 2nd line, and the third line is what happens when we increase finger pressure: pitch speed goes up by 4mph and spin rate goes up by 7%.  The amount of movement goes up by 14%.

Is it possible that just applying more finger pressure than usual can do that? Well, I think it's very possible, and it could very well explain why pitchers throw harder as relievers than those same pitchers do as starters. They do in fact throw harder, by about 3 to 4 mph, which would explain the much better performance by those pitchers as relievers, compared to themselves as starters (aka Rule of 17). I've never looked at it beyond that (I don't know why, or maybe I have, and have forgotten), but higher spin rate and more movement would seem to be likely results as well.

The study does seem to be focused on one part of the finger to measure pressure.  Eventually, we'd likely measure every part of every finger and thumb at every point in the pitch release.  Exciting days and years ahead for biomechanical researchers and saberists alike.


What is a baserunner?

Last week, I saw a Twitter poll asking if a player who hit a homerun was a baserunner.  From there, I generated my own polls, all starting with the same premise: you sent 26 batters to the plate, all 26 were retired, then the 27th batter did something unusual.  And so the question at the end was: how many baserunners did this team have for this game?

If you have 27 ground outs, I think we can all agree that we have no baserunners.  Even though the batter-runner is running, he's in the runner's lane, and the defense got the batter-runner out.  But if he was safe, he's now a baserunner for the next batter.  If he got to second base on a double, he is no longer a batter-runner, but a runner going from 1B to 2B (no different than any other runner starting at 1B).  If he got thrown out trying to stretch it into a double, he was a runner thrown out at 2B.(*)

(*) Though you can also argue that the runner has to still be on the bases for the next batter. So your typical homerun has no baserunning. Being thrown out at second base on your own turn at bat has no baserunning.

So, it would seem that once the batter SAFELY reaches first base, that establishes that we now have a baserunner. 

However.

Let me provide two examples, and you tell me how you see it.  You have a fence-clearing hit, where the excited batter skips over first base.  The defense will appeal the play at 1B, and the batter is out.  The batter gets no HR, no single, no basehit of any kind.  There's no walk or hit batter or error or defensive interference.  There is nothing positive that happens here in the record book.  We have a 27-up, 27-down perfect game.  And so, no baserunners.  The batter never safely reached first base, never claimed it.

Now, how about a 4-ball walk, with an errant pitch.  The batter is awarded first base by the umpire, and in his excitement to try to get to 2nd base on the errant pitch (the ball is live after all), he skips over first base.  The defense appeals the play at first base, and the player is out.  The exact same thing as happened with the fence-clearing out.  

Except.

Well, this is officially a walk, in the scoring rules.  In both cases, the walk and the fence-clearing hit, the batter has the right to go to first base without chance of being put out.  The batter however has the obligation to touch first base.  Once he skips over that obligation, his right to first base no longer exists. 

This now becomes a discussion of scoring rules. There's nothing to credit the batter for the fence-clearing hit. You can't give him a single, because he never touched first base. We can give him a walk for the 4-balls, because the umpire awards those on the spot, regardless of whether the batter does anything with it.

And so, when it comes to deciding what is a baserunner, are we necessarily tied to the idea that it must be based on the scoring rules on the batter? Or, can we say that in either of these extreme cases, the team did not have a baserunner at all? They both did the exact same thing (skipped over first base, out on appeal to first base). They both had the right to go to first base without being put out. The only difference is we have a scoring rule category for one, and not the other.



Thursday, March 16, 2023

Are clubs becoming smarter at stealing bases?

Yes.

In the soon-to-come Catcher Caught Stealing leaderboard on Savant, I am able to model each stolen base attempt to determine its expected caught stealing rate, based on where the runner is at the time the ball enters the catcher's glove.  As we know, the breakeven point for a stolen base attempt is for the runner to be at least 75% successful.  If the runner thinks they can be safe at least 75% of the time, they should go for it.  That is of course a combination of reading the pitcher and beating the catcher.

Similarly, if a runner thinks they will be safe at most 60% of the time, they should never attempt to steal. In between, there are some other variables to consider.
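
Those two cutoffs amount to a three-way classification of each steal attempt. A minimal sketch in Python follows; the "DEPENDS" label for the in-between zone is my own shorthand, and the list of expected success rates is hypothetical, not the Savant model's output.

```python
GO_THRESHOLD = 0.75     # at or above this expected SB%, a clear GO
NO_GO_THRESHOLD = 0.60  # at or below this expected SB%, a clear NO GO


def classify_attempt(expected_sb_pct):
    """Label a steal attempt by its expected success rate."""
    if expected_sb_pct >= GO_THRESHOLD:
        return "GO"
    if expected_sb_pct <= NO_GO_THRESHOLD:
        return "NO GO"
    return "DEPENDS"


def attempt_shares(expected_sb_pcts):
    """Share of attempts falling in each category."""
    labels = [classify_attempt(p) for p in expected_sb_pcts]
    return {k: labels.count(k) / len(labels) for k in ("GO", "DEPENDS", "NO GO")}


# Hypothetical expected SB% for five attempts.
print(attempt_shares([0.82, 0.78, 0.70, 0.55, 0.91]))
```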

In 2016, runners had an expected SB% of 75%+ almost half the time, while they had an expected SB% under 60% over one-quarter of the time.  Runners were far too aggressive.  But over the years, runners are getting smarter.  And in 2022, they were at their smartest.

In the above chart, we see that in 2022, 59% of their steal attempts were in the clear "GO" category.  They had a 75%+ chance of stealing, and so, those are clear GO situations.  Only 16% of their steal attempts were in the "NO GO".  It's still far too much, given they had a less than 60% chance of being successful.

Some of that, they couldn't know.  If a pitch ends up in the dirt, that could change a no-go into a go.  In the above chart, we are giving the benefit of the doubt to the runner here.  At the same time, a pitchout is usually a no-go, and we are blaming the runner for not knowing that.

More importantly overall is the trend.  And the trend is in the right direction: runners are learning.

2023 will throw all that up in the air because of the new rules.  It will be interesting to see how their change in aggressiveness comes into play.


(4) Comments • 2023/03/20 • Baserunning

Friday, March 10, 2023

Predictiveness of the Tools of Pitching

Fangraphs just released a couple of new pitching metrics in the wild, metrics that are based on specific pitch characteristics (pitch speed, movement, and so on).

I've been a bit skeptical of all these metrics because I had believed that part of the value of a pitch is the arsenal of the pitcher. In other words, a fastball thrown by a two-pitch reliever must look different, from the batter's perspective, than an identical fastball from a five-pitch starting pitcher. After all, part of the value of a pitch is in guessing the pitch to begin with. If all a batter has to do is guess between fastball/slider, that's a lot easier than guessing what Ohtani is throwing them.

I've also tried to stay away from doing too much analysis because I'm not technically independent.  Before my MLBAM days, I quite enjoyed my time as a saber watchdog.  But these days, I kind of have to step aside on that.  Except in this case, I'm jumping in because the results were pretty impressive, for at least one of the two new metrics.

I did not spend much time here looking at it, so I am hopeful that the #AspiringSaberists out there take up the slack here.

One of the requirements to evaluate a metric is to compare the results to an unbiased estimator.  What is that exactly?  Well, it's our best estimate of a pitcher's talent, that in no way is based on the underlying data.  For example, if I want to know which of FIP and ERA in 2021 better represents a pitcher's talent, I need to compare each to ERA in 2022.  Why?  Because 2022 is outside the sample.  And ERA is ultimately what we want to measure (well, it's RA/9, but ERA is ubiquitous and we get similar results).  Isn't ERA filled with random variation?  Yes, but that's irrelevant in terms of keeping the estimate unbiased.  Random Variation doesn't affect the bias, only the uncertainty of the estimate.

Well, technically, we'd rather control for the park as well.  So, ERA in 2022 is not exactly unbiased because of the park, as well as the year-to-year similarity of fielders.  But, we have to start somewhere, and we wait for others to flex their data muscles to take it to the next step.

Ok, enough words, now for the numbers. I limited the sample to pitchers with at least 40 IP in each of 2021 and 2022 for whom the new Fangraphs metrics provide data. I have 264 pitchers. Hopefully someone can recreate (and/or correct) this. Year to year correlation of ERA to itself is r=0.21. This provides our baseline. FIP to next-year ERA is r=0.29. That's really what everyone is trying to do: beat FIP. PitchingBot does come very close to matching it, at r=0.28. Considering its inputs, it is impressive it could reach that point.

For Stuff, I had to first convert it to an ERA scale.  Well, I didn't have to, I could have let the correlation take care of it.  But, anyway, it's: ERA = 7 - Stuff/30.  I wanted to at least make sure I didn't have to square it or something.  The correlation is a very impressive r=0.37.
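For anyone trying to recreate this, here is a minimal sketch of the correlation step, assuming you have each pitcher's 2021 Stuff value and 2022 ERA. The five data points are invented for illustration, and `pearson_r` and `stuff_to_era` are names I made up; only the conversion ERA = 7 - Stuff/30 comes from the post. It also shows why the conversion was optional: a linear rescaling flips the sign of r but not its size.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def stuff_to_era(stuff):
    """The post's conversion of Stuff to an ERA scale: ERA = 7 - Stuff/30."""
    return 7 - stuff / 30


# Made-up numbers for five hypothetical pitchers, for illustration only.
stuff_2021 = [85, 95, 100, 110, 120]
era_2022 = [4.9, 4.1, 4.3, 3.6, 3.2]

r_raw = pearson_r(stuff_2021, era_2022)
r_scaled = pearson_r([stuff_to_era(s) for s in stuff_2021], era_2022)

# The linear rescaling flips the sign (higher Stuff means lower ERA)
# but leaves the strength of the relationship unchanged.
print(round(r_raw, 3), round(r_scaled, 3))
```

The same `pearson_r` call, with 2021 ERA, FIP, or PitchingBot in place of Stuff, would give the other year-to-year comparisons.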

I don't know what it's doing, but it is allaying my skepticism somewhat.  Looking forward to others digging in here.


(5) Comments • 2023/03/11

Thursday, March 09, 2023

Get ready for stolen bases in 2023

While most presentations of the new rules are strictly about the rules themselves, the consequences of those rules are not discussed very much as they pertain to stolen bases.  The intended consequence of the new rules, notably the limited number of pickoffs, is that it frees up the running game.  And given what we've seen in the minors, and what's happened in spring training so far, we will get that impact in 2023.

In Spring Training 2007, the number of stolen base attempts of 2B was at 1.68, which was down from 2006 at 1.74. Not coincidentally, the SB success rate went up from 66% to 70%. This pattern repeated itself in 2008: SB attempts went back up to 1.75 (similar to 2006), while SB success rate went down to 67% (also similar to 2006). And in most seasons, this yo-yo effect continued.

Why would this happen? Well, when you are aggressive, the number of SB attempts goes up. But being overly aggressive also means you are taking extra chances, running in scenarios where the likely success rate is fairly low.

You can especially see the pattern from 2017 to 2022, as every season the SB attempts went down from the previous year, from 1.66 attempts per game in 2017, down to 1.31 in 2022. And we had an almost perfect matching of the SB success rate going up almost every season, from 67% in 2017 to 73% in 2021-22. So, the more you pick your spots, the fewer the attempts, but the more successful you will be when you finally do run.  There's a tradeoff that happens.

2023 changes all of that. Big time. Because of the new pickoff rule, SB attempts have skyrocketed to 2.01 per game, more than 50% higher than in 2022. Are the runners being insanely aggressive? No. They are taking advantage of the new Pitch Timer rule, which limits the number of disengagements (read: pickoff attempts) the pitcher has. Knowing the pitcher has limitations, the runners are attempting more steals, likely with somewhat larger leads. And the result is an enormous jump in SB success rate: Whereas we were at 73% in 2021-22, we are now over 80% in 2023.

Let me show it to you in chart form, so you can really appreciate the difference (click to embiggen).  The blue line is the number of SB attempts per game (both teams), while the dotted orange line is the SB success rate.

In addition to the SB, regular balks are also way up. And while pickoffs are up slightly, most of that gain for the pitcher is given back with the new forced balks (balks awarded due to the violation of the new rule, what I call Stolen Balks). In the next chart, I plot all the positive baserunning events (steals, balks, errant pitches, defensive indifference) on the x-axis, and the negative events (caught stealing, pickoffs) on the y-axis.

Focus on 2018 through 2022: as the number of positive events goes down, so too does the number of negative events. 2023 throws all that out the window. We have as many negative events in 2023 as we did in 2021-22, but the number of positive events is up by 1 per game. In other words: a free base at no cost, once per game (or 0.5 for each team per game).

As players adjust to the new environment, expect the success rate to continue to go down while the steal attempts continue to go up. We should be reaching a breakeven point at the league level of 75% success rate. Essentially, SB should be attempted whenever the breakeven point is at least 70%, and therefore, the average of the steals above this line should likely come in at around 75%. Whether it takes under a season or multiple seasons for players to adjust, we will soon learn.
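The breakeven arithmetic can be sketched as follows, assuming a caught stealing costs some fixed multiple of what a successful steal gains. The function name and the ratio parameter are mine; the 2-to-1 and 3-to-1 ratios are the ones from the "SB minus 2 times CS" discussion in the Speed and Pickoffs post.

```python
def breakeven_rate(cs_cost_ratio):
    """Success rate at which steal attempts net out to zero.

    cs_cost_ratio: how many successful steals one caught stealing wipes out
    (an assumed input, e.g. 2 for a simple "SB minus 2 times CS" metric).
    """
    # p * 1 - (1 - p) * ratio = 0  =>  p = ratio / (1 + ratio)
    return cs_cost_ratio / (1 + cs_cost_ratio)


for ratio in (2, 3):
    print(ratio, round(breakeven_rate(ratio), 3))
```

Under this simplification a 2-to-1 cost gives a breakeven near 67% and a 3-to-1 cost gives 75%, which brackets the 70% line used above.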


(2) Comments • 2023/03/09

Wednesday, March 01, 2023

Improving WAR - Synchronicity of Scoring Runs

On Sept 3, 2022, Dylan Cease faced 29 batters for a shutout, allowing one hit and 2 walks, while striking out 7. The White Sox scored 13 runs.

Did the Sox win because of Cease? Or the batters? Even if the batters would have had one of their worst outings, the Sox would have won. Similarly, with 13 runs of support, the pitching would have to have been an enormous disaster to lose that game. For the sake of discussion, let's say that both Cease and the batters contributed equally to the win. Let's give Cease 0.5 wins and 0 losses, and we'll do the same for the batters. And because I don't like to carry decimals, I'll just multiply everything by 100:

50 Wins, 0 Losses: Cease (+0.25 wins above average)

50 Wins, 0 Losses: Batters (+0.25 WAA)

Game #2

Now, suppose the batters provided the league average 4 or 5 runs of support, then what? Well, in that case, if the batters are providing league average support, then they are probably contributing 0.25 wins and 0.25 losses. Cease and his sensational game is providing the rest. And since everything has to add up to 1 win and 0 losses, it looks like this:

75 Wins, -25 Losses: Cease (+0.50 WAA)

25 Wins, 25 Losses: Batters (+0.00 WAA)

Now, remember, it's the same Cease performance in either game. He didn't do anything different. But we're assigning a different value to Cease because for this particular example game, winning the game 4-0 or 5-0 puts the spotlight on the pitcher a great deal. In the actual game he had won, the 13-0 game, the spotlight was really shared.

Game #3

Let's go the other way. Let's suppose the batters scored 23 runs instead of 13. Cease still had the great game. Now what? Maybe it looks something like this:

20 Wins, 0 Losses: Cease (+0.10 WAA)

80 Wins, 0 Losses: Batters (+0.40 WAA)

In all these games, it's always the same Cease performance. But with one win available, we have to make some choices as to which players earned their share of the win. Everything has to add up.
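Here is a small sketch of the bookkeeping in the three games above. It assumes WAA = (W - L) / 2, which reproduces every figure in those examples but is not a formula the post states outright; the function and variable names are placeholders.

```python
def wins_above_average(wins, losses):
    """WAA from win/loss credits expressed in hundredths of a game.

    Assumes WAA = (W - L) / 2, which matches the three example games.
    """
    return (wins - losses) / 2 / 100


# (wins, losses) credits per game, in hundredths, as listed in the post.
games = {
    "13-0 (actual)": {"Cease": (50, 0), "Batters": (50, 0)},
    "4-0 or 5-0": {"Cease": (75, -25), "Batters": (25, 25)},
    "23-0": {"Cease": (20, 0), "Batters": (80, 0)},
}

for label, credits in games.items():
    total_w = sum(w for w, _ in credits.values())
    total_l = sum(l for _, l in credits.values())
    # Every game must reconcile to exactly 1 win and 0 losses.
    assert (total_w, total_l) == (100, 0)
    for who, (w, l) in credits.items():
        print(label, who, wins_above_average(w, l))
```

In each game the two WAA values sum to +0.50, which is what one win is worth relative to an average (0.5 win) expectation.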

Proposal

Now, let me offer an alternative for the actual 13-0 game.

100 Wins, 0 Losses: Cease (+0.50 WAA)

98 Wins, 2 Losses: Batters (+0.48 WAA)

-98 Wins, -2 Losses: Poor Synchronicity (-0.48 WAA)

So, what did I do here? Well, I'm evaluating Cease independent of his batters. A shutout will always get you a win 100% of the time, so that's what he gets here: 100% wins, 0% losses.

The batters scored 13 runs, which we evaluate independent of the 0 runs allowed by Cease. How often do teams that score 13 runs win? That's 98% of the time. We give the batters 98% wins and 2% losses.

Explanation

Giving Cease 100% of a win and giving the batters 98% of a win, that's 198% wins. But we only have 1 actual win. Since we want to freeze Cease at 100% and freeze the batters at 98% (keeping each independent of the other), we have only one choice left: create a Synergy or Synchronicity bucket, assigning minus 98% wins and minus 2% losses. In other words, the sum of the parts is greater than the whole. And we need to do a reconciliation in order to ensure the sum equals the whole. To do that, we create a Synergy bucket, one that reflects the fact that Cease and the batters are not in Synchronicity for this game.

And this gives us the best evaluation of the contributions of the players independent of their teammates, while also being able to understand the contributions of the players in totality of the actual game.
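The reconciliation can be sketched like this, assuming each side of the team is credited with its own win probability in wins and the remainder in losses. The function name and signature are mine, not part of any existing WAR implementation.

```python
def synchronicity_bucket(pitcher_win_pct, batter_win_pct, team_won):
    """Reconcile independent pitcher and batter credits with the actual result.

    Each unit is credited on its own: win_pct wins and (1 - win_pct) losses.
    The bucket absorbs whatever is left so the game sums to 1 decision.
    Returns (bucket_wins, bucket_losses).
    """
    actual_wins = 1.0 if team_won else 0.0
    actual_losses = 1.0 - actual_wins
    credited_wins = pitcher_win_pct + batter_win_pct
    credited_losses = (1 - pitcher_win_pct) + (1 - batter_win_pct)
    return actual_wins - credited_wins, actual_losses - credited_losses


# Cease's shutout (100%) plus 13 runs of support (98%) in a game that was won.
bucket_w, bucket_l = synchronicity_bucket(1.00, 0.98, team_won=True)
print(round(bucket_w, 2), round(bucket_l, 2))
```

For the 13-0 game this returns minus 0.98 wins and minus 0.02 losses, the same bucket as in the proposal. Summing the bucket over a season is how you would check how far from zero a team lands.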

Over the course of 162 games, we expect that for most teams the Synergy bucket will total around 0 Wins and 0 Losses, give or take a few wins. This is how we can ensure we can properly evaluate the players, without needing to worry about linking their contributions directly to wins and losses.

This process ONLY works if we know how to evaluate a player's contributions in the form of runs.  Which we do.


(6) Comments • 2023/03/06 • WAR

Monday, February 27, 2023

Time shifting Front Offices

Someone on Reddit posted this delicious thought:

The year is 1960, and the spirit of Tom Tango visits a random MLB manager in his dreams, and tells him everything he needs to know about sabermetrics. The manager learns about FIP, OPS+, BABIP, leverage, etc. (Tom doesn't tell him anything about the future, just about sabermetric concepts.) The manager gets his GM and team on board and they start applying these principles. How much does the team improve over the next few years compared its non-sabermetric competition?

We are always talking about transporting PLAYERS across time. Well, how about we transport FRONT OFFICES?

The first step is just like what Brad Pitt was doing: we'd see a bevy of trades.  Maybe if someone wants to have some fun with this, imagine you are the Chicago Cubs.  From 1947 to 1966, they were in the bottom half of the NL in each and every one of these twenty seasons.  So, let's take the midpoint: the 1956 season has just finished.  You transport the Epstein/Hoyer front office to the fall of 1956.  What happens?


(9) Comments • 2023/03/02 • History

Sunday, February 26, 2023

Improving WAR - Determining extent to link Game by Game Wins to Player Performance

Bill James created Win Shares on the idea that we need to ensure that the Player Performances, when added up, match Team Wins (at least at the Seasonal Level).  Of course, if you do it at the Seasonal Level, you should also do it at the Game Level.  Especially if you know all the performances of the players are tracked at the game level.  Which, in this day and age, we do know.  So we agree: let's break down each game, and assign each win and each loss to the players based on their contributions in those specific games.

Once you take that step however, well... you know, I was going to write about this, but then someone else wrote the argument against doing this better than I could have written it.  In other words, the argument against a Win Shares approach, at the game-level.  And who made this argument, by reading my mind and kindly attributing it to me, even though I could not have articulated it as well?  None other than Bill James:

...Tom's argument that there is no need to justify WAR with actual wins, but I PRESUME that what he is saying is that won-lost outcomes are somewhat random, thus not appropriate to adjust skill measurements to match them. Not just won-lost outcomes that are random, but interim outcomes. You put walks, hits, doubles, homers, stolen bases and errors into a pot, you get a somewhat predictable but somewhat random number of runs scored. Therefore, there is no logical requirement to match the outcome, because the outcome itself is a somewhat unreliable measurement. I ASSUME that is his argument.

I should point out that Bill is not necessarily agreeing with me (he might be, but it's irrelevant if he is).  The important part is that Bill articulated better than I could the argument against trying to get things to add up at the game-level.

Now, I will also say that I WILL create this metric.  I will make sure everything adds up at the game-level.  And once you see those results, you will likely determine that we shouldn't be doing this.  That there is so much random variation game to game that to try to assign a whole win to one team and a whole loss to another team will require some unusual choices.  But I will do it, because otherwise someone will say "why don't you do it".  And the best way to answer that question is to actually do it, so everyone can see that we really shouldn't be doing this.

And my larger point to Bill was that if it doesn't work at the game-level, then it won't work at the seasonal-level.  And the only reason it LOOKS like it works at the seasonal level is because 162 games allows us to wash away so much of that random variation.  If you had one game where a team wins 20-1 and then loses three other games 2-1, then after 4 games, that's 23 runs scored and 7 runs allowed.  With one win and three losses.  Four games won't cut it, and maybe forty might.  And by 162 games, it'll work out most of the time.  Until it doesn't. Like I said, only Random Variation saves you.  And if we rely on Random Variation, then why bother?
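To put numbers on the four-game example, here is a quick sketch that tallies the runs and compares the actual record to a runs-based expectation. The Pythagorean formula with an exponent of 2 is my choice of yardstick for illustration, not something the argument above depends on.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2):
    """Classic Pythagorean expectation; exponent=2 is an assumed setting."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# (runs scored, runs allowed) for the four games in the example.
games = [(20, 1), (1, 2), (1, 2), (1, 2)]

runs_scored = sum(rs for rs, _ in games)
runs_allowed = sum(ra for _, ra in games)
wins = sum(1 for rs, ra in games if rs > ra)
losses = len(games) - wins

print(runs_scored, runs_allowed, wins, losses)
print(round(pythagorean_win_pct(runs_scored, runs_allowed), 3))
print(wins / len(games))
```

The runs say this team should win the large majority of its games, while the actual record is one win in four. That gap is the random variation that only a long season washes out.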


(6) Comments • 2023/02/27 • WAR