Tangotiger Blog

Saturday, June 15, 2013

Bias in Hockey Reference Point Shares

By Tangotiger 11:41 PM

The good news is that the top three in goalie point shares were the three finalists for the Vezina (top goalie). The bad news is that the top 8 goalies in point shares were also the top 8 PLAYERS in point shares. The good news is that the top non-goalie in point shares just won the MVP.

This points to an issue of bias, similar to the way that Bill James’ Win Shares is biased against pitchers (*). In the Hockey Reference case, the bias is in the goalies’ favor.

(*) Bill gives around 35% or 36% of Win Shares to pitchers, while I give out 43%. Fangraphs also gives out 43%, while Baseball Reference gives out 41%.


#1    aweb 2013/06/16 (Sun) @ 12:15

Why wouldn’t goalies dominate a list like this? Do you see the replacement level being set too low?

Crosby was on pace to top the list, which is about what I would expect - it takes a great offensive season to be as helpful to your team as a very good goaltending season.


#2    Tangotiger 2013/06/16 (Sun) @ 21:05

Then why aren’t the top 8 players being paid all goalies?


#3    Ryan JL 2013/06/16 (Sun) @ 22:25

“Then why aren’t the top 8 players being paid all goalies?”

Too much uncertainty? I have likened goaltenders to MLB relievers in the sense that it’s difficult to keep track of who is supposed to be good and who isn’t, and the projections seem much less reliable than for skaters.  The top 8 players for this year might be goalies, without the top 8 projected for next year being goalies, if that makes sense.


#4    aweb 2013/06/17 (Mon) @ 07:40

There’s a salary ceiling, so sort of like the NBA, top players aren’t getting what they are worth. But goalies have never been the top paid players, so that’s at most a small part of it.

I’d say Ryan pretty much has it. The top 8:
Lundqvist - predictable, paid top rates.
Bobrovsky - completely unpredictable, and further evidence that goalie issues in Philadelphia are the fault of team strategy.
Niemi - has been above average, had best year. Signed 2 years ago for 3.8mil/year.
Howard - reasonably predictable, played a ton to accumulate point shares. Starts a 6 year deal @5.3mil/year next year. Took possible lower salary for long term security. I wouldn’t guess he’ll be a starter 6 seasons from now.
Miller - predictable, paid 6.25/year.
Dubnyk - surprised me, on his RFA contract @3.5/year.
Rask - Not shocking, low salary, in his RFA years.
Holtby - surprising, in RFA years so low pay.

I would guess a key to leverage performance/cost out of hockey goalies is to let them play young, and don’t pay for past performance when they become free agents. I would also be surprised if teams are spending money in anything approaching a rational manner in the NHL. The top “cap hit” page (http://capgeek.com/leaders/?type=CAP_HIT) is a frightening look at the future. Top skaters are signed to ages 40, 34 (Malkin), 37, 36, 35, 36, 41, 37, 44, 41, 41, 38…yeesh, that is not going to end well.


#5    Tangotiger 2013/06/17 (Mon) @ 09:11

The uncertainty thing would be one component… if true.

In order to test your hypothesis, run a year-to-year correlation on Point Shares for all players with at least 30 games played, and report back the results.

If there’s little difference between forwards, defensemen, and goalies, then the uncertainty levels would likely be similar.  Which would then bring us back to bias.

I’d love for someone to run with this…
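
A minimal sketch of the test proposed above, assuming each player is a row of (position, games and Point Shares in year T, games and Point Shares in year T+1). The rows are hypothetical placeholders, not real Point Shares, and requiring 30 games in both seasons is one reading of the cutoff.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def year_to_year_r(rows, min_games=30):
    """rows: (position, games_T, ps_T, games_T1, ps_T1), one per player.
    Returns {position: r} using players with min_games in BOTH seasons."""
    by_pos = {}
    for pos, g0, ps0, g1, ps1 in rows:
        if g0 >= min_games and g1 >= min_games:
            xs, ys = by_pos.setdefault(pos, ([], []))
            xs.append(ps0)
            ys.append(ps1)
    # Skip positions with too few players to say anything.
    return {pos: pearson(xs, ys) for pos, (xs, ys) in by_pos.items() if len(xs) >= 3}

# HYPOTHETICAL rows, for illustration only.
sample = [
    ("G", 60, 12.0, 58, 9.0), ("G", 45, 8.0, 50, 10.0), ("G", 35, 5.0, 40, 4.0),
    ("F", 80, 9.0, 82, 8.5), ("F", 70, 5.0, 75, 5.5), ("F", 82, 2.0, 78, 2.5),
    ("F", 10, 0.5, 60, 3.0),  # dropped: under 30 games in year T
]
print(year_to_year_r(sample))
```

If the correlations for forwards, defensemen, and goalies came out similar on real data, the uncertainty explanation would lose most of its force.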


#6    Tangotiger 2013/06/17 (Mon) @ 10:02

A good way to see if there’s a lot of uncertainty in seasonal records is simply to look at career totals.  A lot of that uncertainty will wash away.

Looking at all players with at least 100 point shares since expansion to 12 teams:

84 skaters
34 goalies

Roloson is between Serge Savard and Eric Lindros. Rollie became a starting goalie at age 32.

So, no, I don’t think you can make a good case for Point Shares as currently constructed. 

Really, when you go this far away from the conventional wisdom, it’s not up to ME to make the case that CW is right, but rather it’s for HR.com to make its case that HR.com is right.


#7    Tangotiger 2013/06/17 (Mon) @ 10:22

And if you look at pitchers, they have A LOT of uncertainty.  Not only do we have the BABIP to contend with, but we also have performance with men on base and bases empty.  So, pitchers have one huge level of uncertainty piled on top of another.

And yet, if you look at seasonal totals based on runs allowed, or look at career totals, you are not going to see a ton of pitchers lead in seasonal WAR, and then suddenly disappear from career WAR leaderboards.

Pretty much, you’ll see close to one-third of seasonal AND career leaderboards being pitchers (something like that).  I think.  I never actually checked, but, at the same time, it never stood out as an issue.

In any case, this is just to point out that the uncertainty level can’t possibly contribute as much as is being hypothesized.


#8    Tangotiger 2013/06/17 (Mon) @ 11:49

This I think is the ideal goalie in terms of replacement level:

http://www.hockey-reference.com/players/b/billicr01.html

Craig Billington.  I figured he was never #1 goalie on any team, but, he was.  The year his team won 11 and lost 41 times with him in nets.  Apparently, he was not last in save %, which scares me to find out who was.

Somehow, he has 33 point shares for his career.  Including getting 4 point shares in the above season where he had a .859 save percentage.

We can do this all day.  Let’s just admit there’s a bias in the goalie calcs.


#9    Tangotiger 2013/06/17 (Mon) @ 13:05

Continuing my all-day assault: if we go with the hypothesis that it’s easy because of the uncertainty to have a high number of point shares for goalies, then it should be just as easy to get negative point shares as well. 

There were only 2 goalies last year with negative point shares, bottoming out at -0.2 point shares.  Neither of them played more than 3 games.

In contrast, 125 (!) skaters had negative point shares, and a good portion (38) of them played more than half the season.

Just waiting for someone to stop me.


#10    Tangotiger 2013/06/17 (Mon) @ 13:25

In the 2013 season (48 games), goalies had 257 point shares, while skaters had 1309 point shares.

That’s one-sixth of all point shares going to goalies, which is where the problem is.  One-sixth is the share of players on the ice who are goalies, but that doesn’t mean we need to use that any more than we’d reason that pitchers get one-ninth of the value.

Skaters totalled 31.5 point shares below 0 and 1340 above zero, for a ratio of .024 negative point shares per positive point shares.

Goalies as noted totalled only 0.3 point shares below 0 and 257 above zero, for a ratio of .001.

I think in baseball, the ratio was something like 0.1 negative WAR per positive WAR.

So, I’d think that the skater replacement level is too low and the goalie replacement level is way way way too low.
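
The arithmetic in this comment can be checked in a few lines. The totals are the ones quoted above; the variable names are just for illustration.

```python
# 2013 Point Shares totals as quoted in the comment above.
goalie_pos, goalie_neg = 257.0, 0.3     # above zero, below zero
skater_pos, skater_neg = 1340.0, 31.5

goalie_net = goalie_pos - goalie_neg
skater_net = skater_pos - skater_neg

# Share of all point shares going to goalies (about one-sixth).
goalie_share = goalie_net / (goalie_net + skater_net)

# Negative point shares per positive point share, by group.
skater_ratio = skater_neg / skater_pos
goalie_ratio = goalie_neg / goalie_pos

print(f"goalie share: {goalie_share:.3f}")
print(f"skater neg/pos: {skater_ratio:.3f}")
print(f"goalie neg/pos: {goalie_ratio:.3f}")
```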


#11    Tangotiger 2013/06/17 (Mon) @ 13:38

It looks like replacement level for goalies is set at about 160% of league average.  Which means that for the average goalie giving up 2.54 goals per 60 minutes and facing an average of 28.4 shots per 60 minutes, the replacement level is 4.01 goals per 60 minutes.  Or, in save% talk:
.912 = average
.861 = replacement level

Given that the top goalie was either at .932 or .941, depending where you put the cutoff, this means that there’s more gap between average and replacement, than between average and best.  And that’s NOT how talent is distributed.

I’d probably set replacement level at somewhere near .890.  That is, giving up goals at around 120% to 125% of the league average.
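
As a rough check of the conversion above, here is a small sketch that turns a "percent of league-average goals allowed" baseline into a save percentage. The function name is made up for illustration; the inputs are the figures quoted in the comment.

```python
def replacement_save_pct(league_sv, multiplier):
    """Save% of a goalie who allows `multiplier` times the league goals-per-shot rate."""
    return 1.0 - (1.0 - league_sv) * multiplier

league_sv = 0.912

# Implied by the quoted 4.01 vs 2.54 goals per 60 minutes (a bit under 160%).
current = replacement_save_pct(league_sv, 4.01 / 2.54)

# The proposed 120% to 125% range.
proposed_low = replacement_save_pct(league_sv, 1.25)
proposed_high = replacement_save_pct(league_sv, 1.20)

print(f"current baseline:  {current:.3f}")
print(f"proposed baseline: {proposed_low:.3f} to {proposed_high:.3f}")
print(f"average minus current baseline: {league_sv - current:.3f}")
```

With the current baseline, the gap from average down to replacement (about .051) is larger than the gap from average up to the best goalie (.020 to .029), which is the oddity being pointed out.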

 


#12    Steve C 2013/06/17 (Mon) @ 13:47

Shouldn’t there also be a position adjustment for the skaters.  Center is harder to play than winger.  Defense is rather different.


#13    Tangotiger 2013/06/17 (Mon) @ 13:48

Doing a quick revamped point shares, with my method, you get 3 goalies and 21 skaters with at least six point shares.  That has a nice balance.


#14    Tangotiger 2013/06/17 (Mon) @ 13:51

Again, let salaries help guide you here.  If half the top skaters are centres, then make sure that your point system makes it that half your top skaters are centres.

It doesn’t have to be a solid rule, but it’s a good guideline.

This is how you can balance out QB and RB and LB and P.  If you want to deviate from what the market suggests, then it’s on YOU to make the case.


#15    aweb 2013/06/17 (Mon) @ 13:51

I thought I left a comment earlier but apparently did not. The short version - goalies in the 1980s sucked. 0.861 might be a reasonable replacement save percentage in the middle of that decade (only two starters finished over 90% in 85-86, and hardly anyone played 60 games), although it’s likely low even for that sieve-like era.

Goalies are way, way better now, due to equipment, style, and athleticism (and now the good ones play more often too). I don’t think a constant defensive share for the goalies is ever going to work.

Once I looked at what point shares were doing - yeah, it’s a decent starting point, but it’s clearly flawed.


#16    Tangotiger 2013/06/17 (Mon) @ 14:05

I wouldn’t look at sv% or GAA as an indication of talent.  This is no different than MLB, where the run environment was DIFFERENT (not better, not worse, but different), pre-1993 and post-1994.

NHL was high-flying in the 1980s.  It was DIFFERENT.

Anyway, setting replacement level to about 120% to 125% of league average should work in any post-expansion era.



#18    aweb 2013/06/18 (Tue) @ 07:17

That’s kind of my point in the comment that I lost - a fixed save % replacement level that might work for the highest scoring era ever will not work in other eras.

One issue save percentage has is that it is close to the boundary - 100%. It’s not an intuitive range of percentages, since in most contexts, 93% (all-star) and 91% (average) aren’t all that different (and the “event” is really in the 100%-sv% number).  I’d much rather see it expressed as shots/goal (so leaders end up around 15-17, and modern trailers end up around 10). 93% - 14.3 shots/goal. 91% - 11.1 shots/goal. Back in the 80s, with an average save% at 88% - 8.3 shots/goal.

Of course, hockey has a few things like this. Using PP% instead of min/PP goal makes no sense; the latter gives a much clearer picture of what to expect. This made some inroads on NBC years ago, but I haven’t seen it since (on broadcasts, analysis uses this sort of thing by default).
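
The save% to shots-per-goal conversion used above is just the reciprocal of (1 - sv%). A short sketch, with a function name chosen for illustration:

```python
def shots_per_goal(save_pct):
    """Average number of shots faced per goal allowed."""
    return 1.0 / (1.0 - save_pct)

for label, sv in [("all-star", 0.93), ("average", 0.91), ("1980s average", 0.88)]:
    print(f"{label}: sv% {sv:.2f} -> {shots_per_goal(sv):.1f} shots/goal")
```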

 


#19    Tangotiger 2013/06/18 (Tue) @ 07:30

They don’t use a fixed number for sv%.  They use almost 160% of the (1-sv%) figure.  So, if the league average is .900 sv%, then goals allowed is .100 per shot, and the baseline level is .160, or a sv% of .840.

It is of course ridiculously low.  Really, a goalie gets value simply by being on the ice and not being a monumental scr-w up.


#20    Michael Cheyne 2013/06/18 (Tue) @ 14:50

Maybe this is too simplistic (or maybe just wrong), but I was just trying to further corroborate whether the .861 SV% passed the smell test and get an idea of what might be a more realistic replacement level for goaltenders. I looked at the past 10 seasons, sorted all the goalies by minutes played, and aggregated the save percentage for all goalies not in the top 60 in minutes played (the assumption being with 30 NHL teams and most teams carrying 2 goalies, all below that could be considered proxy replacement level players). The average SV% over the entire period for this contingent of players was .896 (on 35,190 SA), and was .900 in 2013 (on 1,600 SA). This is a lot closer to Tango’s number and seems to further suggest .861 is way too low. Again, there are probably reasons this was too simplistic, but it seemed fairly reasonable to get an approximate replacement value for SV%.

What I did find interesting was the sample varied pretty substantially over the last 10 years. By my method, teams only gave 3,419 minutes to “replacement” goaltenders in the 2012-2013 season, which averages out to about 2 games per team. 2006 was the peak, with 10,362 minutes or nearly 6 games per team.

Yr, Min, SA, SV%
2003, 8,579, 3,966, .900
2004, 9,416, 4,349, .896
2006, 10,362, 5,149, .885
2007, 7,181, 3,473, .891
2008, 4,906, 2,310, .893
2009, 8,361, 4,173, .890
2010, 5,317, 2,598, .907
2011, 7,267, 3,657, .906
2012, 7,692, 3,915, .905
2013, 3,419, 1,600, .900
Total, 72,500, 35,190, .896
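
The totals in the table can be reproduced from the yearly rows. This is only a sketch: the pooled save percentage is approximate because the yearly SV% values are already rounded to three places, and 60 minutes per game and 30 teams are assumed.

```python
# (year, minutes, shots_against, save_pct) copied from the table above.
rows = [
    (2003, 8579, 3966, 0.900), (2004, 9416, 4349, 0.896),
    (2006, 10362, 5149, 0.885), (2007, 7181, 3473, 0.891),
    (2008, 4906, 2310, 0.893), (2009, 8361, 4173, 0.890),
    (2010, 5317, 2598, 0.907), (2011, 7267, 3657, 0.906),
    (2012, 7692, 3915, 0.905), (2013, 3419, 1600, 0.900),
]

total_min = sum(minutes for _, minutes, _, _ in rows)
total_sa = sum(sa for _, _, sa, _ in rows)

# Pool by shots, not by season: each season is weighted by shots faced.
pooled_sv = sum(sa * sv for _, _, sa, sv in rows) / total_sa

def games_per_team(minutes, teams=30, minutes_per_game=60):
    """Replacement-goalie minutes expressed as full games per team."""
    return minutes / teams / minutes_per_game

print(total_min, total_sa, round(pooled_sv, 3))
print(round(games_per_team(3419), 1), round(games_per_team(10362), 1))
```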


#21    Tangotiger 2013/06/18 (Tue) @ 15:11

Well, the lockout explains the reason for this past season.

But yes, your approach is one I have done in the past to get a quick sniff test.  You can do that with starting pitchers as well, etc.

A BETTER way is to take that contingent of goalies, and see what they did in year T+1.  The reason is because by selecting based on minutes played in year T, you are biasing the sample.  In effect, what you are showing, the .896 save%, is the WORST-case scenario.

Hence, replacement level would be a bit higher than that.


#22    Tangotiger 2013/06/18 (Tue) @ 15:11

By the way, good job and thanks for rolling up your sleeves.


#23    Michael Cheyne 2013/06/18 (Tue) @ 15:23

Right. Lockout. Completely slipped my mind.

Since my sample is selectively biased and therefore should be a worst-case scenario for replacement, I’m wondering why your suggestion for the replacement level SV% (.890) was actually lower than mine.


#24    Tangotiger 2013/06/18 (Tue) @ 15:28

Probably a bad guess on my part.  I was figuring about 20 points worse than average I guess. 

Since the average was 8.8 goals per 100 shots, then 125% of that was 11, hence the .890 save%.  To have it at say .900 would mean only 114% of league average.  It seemed too low a number.

But, I could definitely be wrong.  I just never really looked into it too much.


#25    Michael Cheyne 2013/06/18 (Tue) @ 16:19

So I wanted to implement the T+1 framework, but I’m worried that doing so will cause additional bias I’m not sure I can control for. Since I am requiring they appear in the T+1 season, that means they are subject to both regression to the mean and again to selective sampling. I noticed that the average age was 25 for the goalies who were 61st and below in total minutes in year T but also appeared in year T+1. The average age of all goalies however is 28. So in the T+1 analysis I’m probably getting a lot of guys who were decent/good goalie prospects for their team getting a tryout in year T (and having somewhat poor results in limited time) and then getting brought up for more time (total minutes of the sample increased by 230% from T to T+1) in the next year and having better results due to regression (since year T was likely a worst-case scenario).

Anyway, if I implement the T+1 method with no controls other than they needed to appear in the T+1 season, I get a sample of 136 player/seasons. The SV% for these goalies goes from .901 (year T) to .908 (T+1). Since the average SV% for ALL goalies over the entire period was .909 (.912 in 2013), this study would suggest setting the replacement level for goalies at nearly average. This doesn’t seem to pass the smell test either.

So originally I gave a value of .896 for replacement level. If this is too low, maybe adjusting upward that value in a similar fashion to the improvement I observed between T and T+1 (.901 to .908) would give a better estimate. So maybe a replacement level of around .905. I don’t know…


#26    Tangotiger 2013/06/18 (Tue) @ 16:35

Excellent!

Yes, we do have survivorship bias.  But, what a fantastic way to handle that.

For 3rd stringers who did in fact play in both T and T+1, they went up 7 points.  You then simply applied that to all the goalies of T+1.  That puts it at .903 as the replacement level, compared to league average of .909.

Pretty high replacement level, but, if that’s what it is, that’s what it is.

+.006 goals per shot above replacement implies that with about 70,000 shots per season and 6 goals per win, then WAR for all goalies is 70.

Since I had WAR for all players at 700, then having 10% go to goalies is believable.  They make up 10% of the roster.  They probably make up 10% of the payroll (guessing, didn’t check).

Well done!  I love it when it all comes together so nice.
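
The chain of arithmetic in this comment, written out as a sketch. All figures are the round numbers quoted above; the variable names are illustrative.

```python
# Step 1: bump the "worst case" .896 by the T -> T+1 improvement (.901 -> .908).
worst_case_sv = 0.896
t_sv, t1_sv = 0.901, 0.908
replacement_sv = worst_case_sv + (t1_sv - t_sv)

# Step 2: league-wide goalie WAR from the gap between average and replacement.
gap_per_shot = 0.006        # goals per shot, average minus replacement
shots_per_season = 70_000
goals_per_win = 6
goalie_war = shots_per_season * gap_per_shot / goals_per_win

# Step 3: goalie share of the 700 total WAR.
total_war = 700
goalie_share = goalie_war / total_war

print(round(replacement_sv, 3), round(goalie_war, 1), round(goalie_share, 2))
```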


#27    Tangotiger 2013/06/18 (Tue) @ 16:43

Using Michael’s “average - .006” as the baseline, here are the leaders/trailers in WAR for goalies:

WAR
4.7 Sergei Bobrovsky
4.0 Henrik Lundqvist
3.9 Craig Anderson
3.7 Tuukka Rask
3.6 Antti Niemi
3.2 Jimmy Howard
...
(0.9) Jose Theodore
(1.0) Richard Bachman
(1.1) Ilya Bryzgalov
(1.2) Justin Peters
(1.2) Chris Mason
(1.7) Johan Hedberg
(2.2) Scott Clemmensen
(2.6) Miikka Kiprusoff

Total positive WAR = 63
Total negative WAR = -21

Ooof… that’s a 1:3 ratio, which would be… well, tough to accept.

I’d have to see this at the career level.


#28    Michael Cheyne 2013/06/18 (Tue) @ 16:51

Yes, .903, not .905 as my post states incorrectly. I tried to check on the salaries, but couldn’t quickly find a source that made it easy. But with the WAR% suggested by this new replacement level and Roster% lining up so nicely, we’re probably on to something. Now HR just needs to take notice.


#29    Tangotiger 2013/06/18 (Tue) @ 16:56

If you make it league - .007, you get 12% WAR going to goalies. Ratio is -19 to +68.

At league - .008: 13.6% WAR, -17 to +73.

At league - .010: 17% WAR, -14 to +84.

We already saw that it doesn’t work at one-sixth, since this is how the thread started.

So, we’re really left going to about league -.007 really.  That’s probably the most reasonable one.


#30    Tangotiger 2013/06/18 (Tue) @ 16:58

I’m in contact with Neil at HR.  Let’s see what comes of it…


#31    Michael Cheyne 2013/06/18 (Tue) @ 17:06

2013 seems to be an outlier. I first confirmed I was getting the same results as you for 2013, and I did indeed get +63 and -21 WAR. So we’re good there. If I look at the past 10 seasons, this is what I get:

Total positive WAR = 805
Total negative WAR = -107

So 0.133 negative point shares for every positive point share.


#32    Michael Cheyne 2013/06/18 (Tue) @ 17:11

And just for fun, over the last 10 seasons the top 10 in WAR using “lg. avg. - 0.006”:

WAR
52.7 Roberto Luongo
45.2 Tomas Vokoun
40.8 Henrik Lundqvist
35.0 Tim Thomas
30.4 Martin Brodeur
28.1 Ryan Miller
25.5 Miikka Kiprusoff
22.4 Jean-Sebastien Giguere
21.3 Kari Lehtonen
21.2 Niklas Backstrom


#33    Tangotiger 2013/06/18 (Tue) @ 17:17

Excellent, thanks for running all that.

And while I’m using lg - .006 in my illustration, it’s best to use (1-sv%) x 1.055 as the repl level.

Though at these low levels, it probably doesn’t change things much.


#34    Tangotiger 2013/06/18 (Tue) @ 17:18

Can you also show the bottom 5?


#35    Michael Cheyne 2013/06/18 (Tue) @ 19:21

Recognizing that the better method is (1 - SV%) * 1.055, I stayed with the “lg. avg. - 0.006” method to maintain consistency with what I reported above. Bottom 10 for WAR over the last 10 seasons:

WAR
(2.5) Hannu Toivonen
(2.8) Fred Brathwaite
(2.8) Brian Boucher
(3.1) Johan Hedberg
(3.4) Steve Shields
(3.6) Arturs Irbe
(3.7) Jonas Gustavsson
(3.7) Johan Holmqvist
(3.7) Byron Dafoe
(4.2) Sebastien Caron

And since you mentioned Craig Billington earlier, for the only season he appeared in the sample (2003) he had a WAR of -1.3 by this method. The “face” of replacement goaltenders in the sample would likely be Peter Budaj, having a total WAR of 0.2 over 8 seasons. 6 of those seasons were between -1 and +1 WAR, with a peak of 1.3 and a low of -2.5.

Top 10 goaltender seasons in sample:

Yr, Name, WAR
2004, Roberto Luongo, 10.5
2011, Tim Thomas, 9.4
2006, Miikka Kiprusoff, 8.9
2009, Tim Thomas, 8.5
2007, Martin Brodeur, 8.1
2010, Ryan Miller, 8.1
2006, Tomas Vokoun, 8.0
2006, Roberto Luongo, 7.9
2007, Roberto Luongo, 7.9
2012, Mike Smith, 7.8

Probably not important, but a slight correction to something posted above. The Total Negative WAR should have been reported as -111, not -107.


#36    Davor 2013/06/19 (Wed) @ 08:43

Tango,
if new method gives 0.133 negative WS for every positive WS for goalies, and for skaters that ratio is 0.024, isn’t new replacement level too strict for goalies?

As for pay difference: best forwards are often stars around 20, goalies need more time to develop. So, goalies are rarely in position that their 2nd (RFA) contract is big one.
As for Roloson, yes, he became starter at 32, but he was starter for 10 years (and lost a year to lockout when he was at his best). Would you point him out if he was starting between 26 and 36?

Interesting question, for someone who has time and interest: what is average sv% of players who were on the roster at the beginning of the season, and have played less than 25 games (prorated for strike seasons)? Those are real replacement level players, every offseason there are plenty of NHL backups on the market.


#37    Michael Cheyne 2013/06/19 (Wed) @ 15:09

So I thought Tango’s goaltender salary % of total spending was a good avenue to pursue, and with the help of the invaluable capgeek.com I was able to compile the results for 2012-2013. In the future I would like to look at it over multiple years, but I’m betting there aren’t massive fluctuations year-to-year in the percentage (though maybe that’s a bad assumption).

I used the total cap hit $ value for each player that logged time as a goalie in 2012-2013 and divided it by the total league spend for the year.

Goaltender Cap Hit: $176.6m
Total Spend: $1.82b
Goaltender % of spend: 9.7%

I recognize that certain players may have logged NHL time, but may not have contributed dollars against the cap (though I’m not exactly sure how it works). For example, I know that players can be brought up for a 10 game tryout without losing their rookie eligibility. I thought maybe these players’ cap hit values wouldn’t actually count against the league total spend, but I wasn’t sure. Anyway, if you do remove these players whose first logged NHL time was in 2012-2013 but had fewer than 10 games played, the total goaltender spend was $170.4m (9.4% of total spend).

Either way, we’re talking about goaltender salaries equating to about 9-10% of overall cap spend, which aligns nicely with their roster representation, and which seems to help further validate the “lg. avg - 0.006” (or .007) method for determining the replacement level.


#38    Tangotiger 2013/06/19 (Wed) @ 15:09

Davor: because of the uncertainty level in the stats being discussed, it would be better to look at it at the career level.

***

I’m emailing with Neil, and I had proposed a 60/30/10 split in WAR for F/D/G, based simply on the fact that gameday rosters are split at 12/6/2.

Interestingly, he reports cap hit numbers of: 59/32/9.

It seems to me therefore that WAR should be split somewhere along the lines of 60/30/10.

Uninterestingly, he shows that the 59% breaks down as 26/16.5/16.5 between C/LW/RW.  It’s well-known that the best players are disproportionately centres, so it’s nice to see payrolls match that.

While we don’t need to force the WAR split among the forwards in that manner, we should expect to see the results match closely to it.  If you don’t, then it WOULD be a call for some additional positional adjustment.
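
A small sketch comparing the roster-based split with the cap-hit split, using only the numbers quoted in this thread (12/6/2 gameday roster, 59/32/9 cap hit, and the $176.6m of $1.82b goaltender figure from #37).

```python
roster = {"F": 12, "D": 6, "G": 2}          # gameday roster spots
cap_hit_pct = {"F": 59, "D": 32, "G": 9}    # percent of cap hit, as reported by Neil

total_spots = sum(roster.values())
roster_pct = {pos: 100 * n / total_spots for pos, n in roster.items()}

# Difference between what the market pays and what roster size alone suggests.
diff = {pos: cap_hit_pct[pos] - roster_pct[pos] for pos in roster}

# Goaltender share from the capgeek totals in #37 (in $ millions).
goalie_cap_share = 100 * 176.6 / 1820

print(roster_pct)
print(diff)
print(round(goalie_cap_share, 1))
```

No position is off by more than 2 percentage points, which is why the round 60/30/10 split looks defensible.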


#39    Mike Rogers 2013/06/19 (Wed) @ 20:16

This thread is fascinating.


#40    Davor 2013/06/20 (Thu) @ 03:55

#37:
Cap hit is based on players on the active (23 men) roster. Every day spent on the roster is salary/n for the cap hit (cap-adjusted salary, not salary in that year). So, 10-day trialists cost 10 days of their entry-level contract. Players who are sent down count only for the amount of time they were on NHL roster, even if they are on one-way contracts.

I checked starting age of NHL leaders in games played, age when they had their first big contribution to the NHL club (skaters - less than 10 games in minors, 70+ games played for modern players, 60+, 50+ for earlier ones (seasons of 70 and 60 games); goalies - more than 40 games played).
Forwards: top 25 started at the average age of 19.32;
Defensemen: top 25 started at the average age of 20.08;
Goalies: top 25 started at average age of 22.56.
(Brodeur, leading goalie with 1220 games played would have been 28th among defensemen and 60th among forwards.)

Only three of the top 25 goalies had their first season as starters as teenagers, 11 had at 23 or older. No skater had his first full season older than 22.
Because of the way salary is structured under cap, goalies are likely to enter their best years on 2nd cheap contract, or even on cheap FA contract. That means they are likely to get less for equal talent level than skaters during their big FA contract - less track record, not so linear development, possibility of a fluke season…
How much would that impact goalies? 10%? 20%? More? Less?
60/30/10 is nice, round split. Even if goalies do deserve 1-2 percentage points more, until there is some evidence (like too low ratings for goalies), better to go with round numbers.


#41    Tangotiger 2013/06/20 (Thu) @ 07:07

I agree that using the round numbers shows a level of uncertainty.  If we start using say 58/31/11, it presents a false sense of certainty.

This is why I make the fielding spectrum in steps of 0.25 wins, rather than steps of 0.10 or 0.05 wins.  It’s why my split between non-pitchers and pitchers is a ratio of 4:3, rather than use something more specific.

I’ll let other people bang their heads against the wall trying to get to a level of precision that can’t possibly be supported anyway, while I just walk ahead.


#42    Tangotiger 2013/06/20 (Thu) @ 15:32

Neil sent me some data to help in making the WAR for goalies.

Since 1984, here are the career leaders:
WAR
95 Roy
86 Hasek
64 Brodeur
63 Luongo
53 Beezer
50 Cujo
50 Belfour
48 Vokoun
37 Hrudey
37 Se Burke
37 King Henry
...
-6 Waite
-6 Whitmore
-6 Sidorkiewicz
-7 Eliot
-8 Chevrier
-15 Billington

I’m not happy that the WAR are that much negative.

If I increase it for goalies, I’d have to increase it for skaters, and suddenly, instead of around .285 WAR per game, I’ll be at .400 WAR per game.  And I think that’s bad too.

The other option is to “regress” the results.  We’re relying on save percentage entirely.  But, if it’s an imperfect metric, then we shouldn’t rely on it so much.

If I add say 200 games of league average to every goalie, that’ll halve those guys on the low side (since they averaged about 200 games each), while reducing the top guys by about 20% (since they played around 800 games each).

Of course, no one wants to talk about regression, even though that may simply be the answer.


#43    Tangotiger 2013/06/20 (Thu) @ 15:32

Sorry, not 200 games of league average, but 200 games of replacement level.
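
One way to read the "add 200 games of replacement level" idea: padding a career with games worth exactly zero WAR pulls the per-game rate toward replacement, and applying that diluted rate to the games actually played shrinks WAR by games / (games + 200). The sketch below assumes that reading; the function name is illustrative.

```python
def regressed_war(war, games, pad_games=200):
    """Shrink WAR as if pad_games of replacement-level (zero-WAR) play were mixed in."""
    return war * games / (games + pad_games)

# A low-end goalie with about 200 career games: WAR is cut in half.
print(regressed_war(-10.0, 200))   # -5.0

# A top goalie with about 800 career games: WAR drops by 20%.
print(regressed_war(90.0, 800))    # 72.0
```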


#44    Michael Cheyne 2013/06/21 (Fri) @ 12:35

I do think regression is the best answer, for a variety of reasons, one of which being when I re-ran the same study using the last 30 years of data it came out that replacement level would actually be “lg. avg. - 0.010”. At that level, goaltender WAR would end up representing about 15% of total WAR, which is nearly where we started and already decided we didn’t find palatable.

Anyway, if you add about 350 shots of replacement level per season*, you get the results you were after. Using “lg. avg. - 0.006” to remain consistent with what Neil was doing, this is what I get when I regressed:

WAR, WAR_RGR, Name
100, 82, Patrick Roy
86, 70, Dominik Hasek
64, 53, Martin Brodeur
62, 52, Roberto Luongo
57, 46, John Vanbiesbrouck
53, 45, Curtis Joseph
51, 42, Ed Belfour
49, 41, Tomas Vokoun
41, 33, Kelly Hrudey
41, 34, Henrik Lundqvist
...
-5, -3, Frank Caprice
-5, -3, Eddie Mio
-6, -3, Jean-Claude Bergeron
-6, -3, Kay Whitmore
-6, -3, Jimmy Waite
-6, -5, Richard Brodeur
-7, -6, Peter Sidorkiewicz
-8, -5, Darren Eliot
-10, -7, Alain Chevrier
-17, -11, Craig Billington

You’ll likely notice my numbers and Neil’s numbers are at times slightly different, but for the most part they were either the same or off by +/- 1 WAR.

In aggregate:

Total Positive WAR: 2,073
Total Positive WAR_RGR: 1,687

Total Negative WAR: 305
Total Negative WAR_RGR: 155

So regressing in this manner lowers the positive WAR by 19% and halves the negative WAR, which ends up being nearly exactly what Tango was after.


*I did this for simplicity of implementation, though I know this wasn’t your suggestion. It works out to about 200 games for most of the top guys, but obviously doesn’t correct for the truly bad players enough since they didn’t play very long.


#45    Tangotiger 2013/06/21 (Fri) @ 12:45

Right, in my case, what I would do is add in 3600 shots per career, and then, spread that out on a seasonal basis.

If the typical top-end goalie has say 12 seasons, then that means about 300 shots per season, which is kinda where you landed.

Where it gets more complicated is at the low-end, when you have guys only playing 3 or 4 seasons, and so, in my case, I’d add in 900 or 1200 shots of replacement-level.

We’re basically saying “the more seasons you play the less you need to regress per season”.  Which makes perfect sense.
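
The per-season padding described above is a fixed career amount spread over however many seasons the goalie played. A sketch, with the 3600-shot career figure taken from the comment and the function name chosen for illustration:

```python
def padding_per_season(seasons, career_pad_shots=3600):
    """Replacement-level shots added to each season so the career total is fixed."""
    return career_pad_shots / seasons

for seasons in (12, 4, 3):
    print(seasons, "seasons ->", padding_per_season(seasons), "shots per season")
```

A 12-season goalie gets 300 shots of padding per season, close to the flat 350 used in #44, while a 3-season goalie gets 1200.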

***

Yes, and .006 comes out pretty close to a standard with my process.  Neil provided me with the raw dataset, and I applied my methodology onto it.  This is what I get for replacement level and goals per win:

year_id GPW replBelowAvg WbelowAvg
2012 5.79   0.0055   0.00095
2011 5.86   0.0055   0.00093
2010 5.91   0.0055   0.00094
2009 5.98   0.0056   0.00094
2008 5.86   0.0057   0.00098
2007 6.02   0.0058   0.00096
2006 6.18   0.0059   0.00095
2004 5.71   0.0058   0.00101
2003 5.80   0.0058   0.00100
2002 5.76   0.0059   0.00103
2001 5.90   0.0060   0.00102
2000 5.89   0.0060   0.00101
1999 5.79   0.0061   0.00105
1998 5.78   0.0060   0.00104
1997 6.05   0.0058   0.00095
1996 6.29   0.0059   0.00094
1995 6.14   0.0060   0.00097
1994 6.39   0.0060   0.00094
1993 6.78   0.0062   0.00092
1992 6.62   0.0062   0.00094
1991 6.60   0.0063   0.00096
1990 6.81   0.0064   0.00094
1989 6.88   0.0064   0.00094
1988 6.87   0.0064   0.00093
1987 6.81   0.0064   0.00095
1986 7.11   0.0065   0.00091
1985 7.03   0.0065   0.00093
1984 7.08   0.0066   0.00093

That fourth column converts the goals per shot below average into WINS per shot below average by dividing the third column by the second column.

As you can see, it’s pretty much at .001 wins per shot below average.

Which is kinda the point, since we have around 28.5 x 82 x 30 shots per season: 70110 shots, which times .001 gives us 70 wins.

Which is the standard I’m adopting (70 of 700 wins goes to goalies).
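
A quick check of the fourth column and of the 70-win total, using three rows copied from the table above. The .001 wins per shot is the rounded figure from the comment.

```python
# (year, goals_per_win, repl_below_avg) copied from the table above.
rows = [(2012, 5.79, 0.0055), (2006, 6.18, 0.0059), (1984, 7.08, 0.0066)]

# Wins per shot below average = goals per shot below average / goals per win.
wins_per_shot = {year: repl / gpw for year, gpw, repl in rows}
for year, value in wins_per_shot.items():
    print(year, round(value, 5))

# Shots per season: 28.5 per team-game x 82 games x 30 teams.
shots_per_season = 28.5 * 82 * 30
goalie_wins = shots_per_season * 0.001

print(shots_per_season, round(goalie_wins))
```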


#46    Tangotiger 2013/06/21 (Fri) @ 12:47

Where it looks weird, on a seasonal basis: you can have one bad goalie with an .880 sv% and one great goalie having a bad year with an .880 sv% (say Kipper), but the regression would move the bad goalie closer to 0 WAR (for that season) than the great goalie!

That’s because the bad goalie has more regression being added in.

Whatever we do at the seasonal level, it’s going to look weird.  It’ll work at the career level, but look weird at the seasonal level.
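
To make the oddity concrete, here is a sketch with two hypothetical goalies who post the same .880 on the same 1500 shots. The shot count, the .903 baseline, 6 goals per win, and the career lengths (12 and 3 seasons) are assumptions for illustration; the padding follows the 3600-shots-per-career idea from #45.

```python
def season_war(save_pct, shots, career_seasons,
               repl_sv=0.903, goals_per_win=6.0, career_pad_shots=3600):
    """Seasonal WAR after padding with replacement-level shots.

    Padding per season is career_pad_shots / career_seasons, so short
    careers get more padding and are pulled harder toward zero.
    """
    raw = (save_pct - repl_sv) * shots / goals_per_win
    pad = career_pad_shots / career_seasons
    return raw * shots / (shots + pad)

# Same season line, different career lengths (both HYPOTHETICAL).
great_goalie_bad_year = season_war(0.880, 1500, career_seasons=12)
cup_of_coffee_goalie = season_war(0.880, 1500, career_seasons=3)

print(round(great_goalie_bad_year, 2))
print(round(cup_of_coffee_goalie, 2))
```

Both come out negative, but the short-career goalie lands closer to zero, so he ranks ahead of the great goalie for that season despite identical numbers.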


#47    Tangotiger 2013/06/21 (Fri) @ 13:04

In order to ensure that Kipper’s .880 sv% stays above the cup-of-coffee goalie of .880 sv%, while ALSO giving Kipper a smaller amount of regression, then the regression point for all goalies has to be to a much lower point, say .850 or even .800.

Which is weird.  You have to treat the population of goalies not as NHL goalies, but of pro goalies (including minors, maybe even juniors).

Anyway, trying to get the sum of the regressed-seasonals to match the regressed-career is going to be a major issue.


#48    Michael Cheyne 2013/06/21 (Fri) @ 14:02

Ok, glad you ran your own process to validate the .006 value.

The fact that the seasonal view will look weird at times coupled with I think a general aversion people have to regressing observed results when calculating value could make it a tough sell.



