Author Archives: @acthomasca

Annotated Glossary

Note (Nov 27, 2015): As a blanket disclaimer, we didn’t define most of the terms or notations used on this site. These are all products of the community, both as individuals and as collectives. Our innovations are specifically labelled here; the rest should not be taken as ours, especially if a reference is provided.

What follows is as complete a list as we have of the terms used on the site (or links to the relevant pages).

WAR/GAR: The whole series is linked here.

Time on ice:

  • TOI is the time in minutes when a player is on the ice.
  • TOIoff is the time when a player is off the ice, but in a game in which they played.
  • TOI% is the percentage of time they spend on the ice.
  • TOI60 is the number of minutes out of 60 that the player was on the ice.
  • TOI/Gm is the amount of minutes spent on the ice per game.

General Shot-Based Event Counts: The bread and butter of modern hockey analysis.

Goal Events (G): Shots that are not saved and cross the goal line. The ceremony consists of a bright red light of shame, a stripe-fashioned league employee exaggeratedly drawing attention to the goaltender’s failure to do their job, and a small party at center ice to begin the process anew.

  • G: Goals scored by the individual/team.
  • A: All assists. A1: Primary assists; A2: Secondary assists.
  • G60: Goals scored by the individual/team, per 60 minutes. A60 and P60 for all assists and points.
  • GF: Goals scored by a player or their teammates when the focal player is on the ice. GFoff: goals scored by a player’s teammates when the player is off the ice.
  • GA: Goals scored against a team when the focal player is on the ice.
  • G+/-: Goal differential (GF – GA). Similar to plus-minus but should always include the specific game scenario when identified (such as 5v5).
  • GF60: GF / TOI * 60 minutes.
  • GA60: GA / TOI * 60 minutes.
  • GF%: The share of goals scored by the focal player’s team compared to all goals scored when the player is on the ice — GF/(GF + GA).
  • GF%off: The share of goals scored by the focal player’s team compared to all goals scored when the player is off the ice, i.e. GFoff/(GFoff + GAoff).
  • GF%Rel: The on-ice share of goals for a player’s team minus the off-ice share of goals.
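
As a rough illustration of how the rate and share quantities above fit together, here is a minimal Python sketch. The function names and the season totals are made up for the example, and the 0-100 percentage scale is an assumption.

```python
def per60(count, toi_minutes):
    """Rate per 60 minutes: count / TOI * 60."""
    return count / toi_minutes * 60.0


def share(for_count, against_count):
    """Share of events for, as a percentage: for / (for + against)."""
    total = for_count + against_count
    return 100.0 * for_count / total if total else float("nan")


# Hypothetical on-ice and off-ice totals for one skater (made-up numbers).
gf, ga, toi = 45, 30, 1200.0      # on-ice goals for, goals against, minutes
gf_off, ga_off = 100, 110         # same team, with the player off the ice

gf60 = per60(gf, toi)                        # GF60
gf_pct = share(gf, ga)                       # GF%
gf_pct_rel = gf_pct - share(gf_off, ga_off)  # GF%Rel = on-ice minus off-ice

print(round(gf60, 2), round(gf_pct, 1), round(gf_pct_rel, 1))
```

The same two helpers apply unchanged to the shot-based families further down; only the event being counted changes.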

Shots On Goal (S): Shot attempts that are somehow the goaltender’s responsibility. Includes goals, as above, and saved shots, which comprise the bulk of a goaltender’s productive output.

Take the above quantities and replace “G” with “S” like so:

  • SF: Shots on goal produced by a player or their teammates when the focal player is on the ice. SFoff: Shots on goal produced by a player’s teammates when the player is off the ice.

New ones:

  • iSF: Individual shots-on-goal for.
  • SH: Individual saved shots.
  • PSh%: Personal shooting percentage: a player’s goals (G) divided by their individual shots on goal (iSF).
  • OSh%: On-ice shooting percentage: a player’s on-ice goals for (GF) divided by their on-ice shots on goal (SF).
  • OSv%: On-ice save percentage: one minus the ratio of a player’s on-ice goals against (GA) to their on-ice shots on goal against (SA), i.e. 1 – GA/SA.
  • PDO: The sum of on-ice save and shooting percentages (OSh% + OSv%). League average is 100 by construction. Origins: Irreverent Oilers, with a nice write-up here.
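
A small worked example may help with the percentages above. This is only a sketch with made-up counts; it assumes a 0-100 scale (so that a league-average PDO is 100) and computes save percentage as the complement of goals against over shots against.

```python
def pct(numerator, denominator):
    """Ratio expressed on a 0-100 scale."""
    return 100.0 * numerator / denominator


# Hypothetical totals (made-up numbers).
g, isf = 20, 160            # individual goals and individual shots on goal
gf, sf = 50, 500            # on-ice goals for and shots on goal for
ga, sa = 40, 500            # on-ice goals against and shots on goal against

psh = pct(g, isf)           # PSh%: personal shooting percentage
osh = pct(gf, sf)           # OSh%: on-ice shooting percentage
osv = 100.0 - pct(ga, sa)   # OSv%: on-ice save percentage
pdo = osh + osv             # PDO: on-ice shooting plus on-ice save percentage

print(psh, osh, osv, pdo)
```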

Fenwick (F)/Unblocked Shot Attempts (USAT): Take shots on goal and add shots that miss the net entirely. Named for its inventor, Battle of Alberta author Matt Fenwick, not a fictional Duchy that changed the course of the fictional world.

Take the above quantities and replace “G” with “F” like so:

  • FF: Unblocked shots produced by a player or their teammates when the focal player is on the ice. FFoff: Unblocked shots produced by a player’s teammates when the player is off the ice.

New ones:

  • iFF: Individual unblocked shots for.
  • MS: Individual shots that missed the net.
  • PFenSh%: Personal Fenwick shooting percentage: a player’s goals (G) divided by their individual unblocked shots (iFF).
  • OFenSh%: On-ice shooting percentage: a player’s on-ice goals for (GF) divided by their on-ice unblocked shots (FF).
  • OFenSv%: On-ice Fenwick save percentage: one minus the ratio of a player’s on-ice goals against (GA) to their on-ice unblocked shots against (FA), i.e. 1 – GA/FA.
  • FenPDO: The sum of on-ice Fenwick save and shooting percentages (OFenSh% + OFenSv%). League average is 100 by construction. Not on the site at present.
  • FP60: Fenwick Pace (per 60 minutes), equal to FF60 + FA60.
  • OFOn%: On-ice Fenwick on-goal percentage: a player’s on-ice shots on net (SF) divided by their on-ice unblocked shots (FF).
  • OFAOn%: On-ice Fenwick on-goal percentage against: a player’s on-ice shots on net against (SA) divided by their on-ice unblocked shots allowed (FA).

Corsi (C)/Shot Attempts (SAT): Corsi events consist of all shot attempts: blocked, missed, saves, goals. Should probably just be called “shots”, so we’ll do that here. Named for goaltending coach and mustache champion Jim Corsi, coined and minted by Tim Barnes (aka Vic Ferrari).

Take the above quantities and replace “G” with “C” like so:

  • CF: Shots produced by a player or their teammates when the focal player is on the ice. CFoff: Shots produced by a player’s teammates when the player is off the ice.

New ones:

  • iCF: Individual shots.
  • BK: Shots that an individual attempted that were blocked.
  • AB: Attempts Blocked, shots that the individual themselves blocked.
  • CP60: Corsi Pace (per 60 minutes), equal to CF60 + CA60.
  • OCOn%: On-ice Corsi on-goal percentage: a player’s on-ice shots on net (SF) divided by their on-ice shots (CF).
  • OCAOn%: On-ice Corsi on-goal percentage against: a player’s on-ice shots on net against (SA) divided by their on-ice shots allowed (CA).
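
The three shot families nest inside one another: shots on goal are goals plus saved shots, Fenwick adds missed shots, and Corsi adds blocked attempts. The Python sketch below shows that nesting for individual counts, plus the pace and on-goal quantities; the function names and all numbers are hypothetical.

```python
def individual_attempts(goals, saved, missed, blocked):
    """Build iSF, iFF and iCF from the four shot outcomes (G, SH, MS, BK)."""
    isf = goals + saved      # shots on goal
    iff = isf + missed       # unblocked attempts (Fenwick)
    icf = iff + blocked      # all attempts (Corsi)
    return {"iSF": isf, "iFF": iff, "iCF": icf}


def pace60(events_for, events_against, toi_minutes):
    """Pace per 60 minutes: the for rate plus the against rate."""
    return (events_for + events_against) / toi_minutes * 60.0


def on_goal_pct(shots_on_goal, attempts):
    """Share of attempts that reached the net, on a 0-100 scale."""
    return 100.0 * shots_on_goal / attempts


print(individual_attempts(goals=2, saved=18, missed=10, blocked=15))
print(pace60(1000, 900, 1000.0))   # CP60 from on-ice CF, CA and TOI
print(on_goal_pct(520, 1000))      # OCOn% from on-ice SF and CF
```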

Location Data:

The NHL has (x,y) location data available for all shots-on-goal, as well as hits and penalties. ESPN and Sportsnet both have this data available for missed shot locations and the location where shots were blocked (not from where they were taken, though both are useful).

This location data has systematic bias from rink to rink as well as random measurement error. We can’t do much for the random error, but to correct the bias in shot location data, we use the basic method proposed by Schuckers and Curro (Appendix A):

  • Get the distances for each shot from the net, conditioned on type (slap/not) and whether they were at home or away for the team.
  • Calculate the cumulative distribution functions for the distances of these shots at home and away for each team (which assumes that the shot distance distribution is truly the same, both home and away). Assume that all distances differ around the same league average and that there is no net league bias (which for standardization is fine).
  • The adjusted distance for a shot is then calculated by quantile: what fraction of shots in this building of this type were at this distance? (Say, 25% of non slap shots were within 17 feet of the net.) Take that quantile and get the number for that team on the road. (Say, 25% of non slap shots were within 19 feet of the net on the road).
  • Project the shot on a line from the center of the goal line (which is the reference point for distance) going through the shot; move the shot to a position on that line with the correct distance.
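
The quantile-matching steps above can be sketched in a few lines of Python. This is not the site's implementation: the function names are invented, the net position of (89, 0) feet is an assumed coordinate convention, and a real version would build separate distance samples for each team and shot type (slap/not).

```python
import bisect
import math


def ecdf(value, sorted_sample):
    """Fraction of the sample at or below value (empirical CDF)."""
    return bisect.bisect_right(sorted_sample, value) / len(sorted_sample)


def quantile(q, sorted_sample):
    """Smallest sample value whose empirical CDF is at least q."""
    idx = max(0, math.ceil(q * len(sorted_sample)) - 1)
    return sorted_sample[idx]


def adjust_shot(x, y, home_dists, road_dists, net=(89.0, 0.0)):
    """Move a shot along the line from the net so that its distance matches
    the road distance at the same quantile as its home distance."""
    home, road = sorted(home_dists), sorted(road_dists)
    dx, dy = x - net[0], y - net[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return x, y                      # shot recorded at the net itself
    target = quantile(ecdf(dist, home), road)
    scale = target / dist
    return net[0] + dx * scale, net[1] + dy * scale


# Made-up distance samples: this rink records shots closer than the road does.
home_sample = [10, 17, 20, 30]
road_sample = [12, 19, 24, 33]
print(adjust_shot(72.0, 0.0, home_sample, road_sample))
```

In this toy example a shot recorded 17 feet out sits at the 50% quantile of the home sample, so it is pushed out to 19 feet, the 50% quantile of the road sample, while keeping its angle to the net.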

Having a de-biased measure for shot location is essential for any measures that are going to compare from building to building. Speaking of:

Danger Zones And Types: 

There are all sorts of mechanisms for judging the relative worth of a shot given its (x,y) coordinates and other information. Schuckers has a comprehensive method for evaluating expected goals called DIGR; for our purposes, we simplified the available data into three main features:

  1. Shot location, by block. There are any number of ways to dissect the impact of location, but the most straightforward is by grouping into location blocks rather than smoothing a continuous function over a surface (as in DIGR). There were a few inspirations for this scheme:
  2. Shot features, gleaned from the play by play: rebounds are classified as shots taken within 3 seconds of another shot attempt, and rush shots are those taken within 4 seconds of an event in another zone (a definition derived from David Johnson’s work).
  3. Blocked shots pose an extra problem: they’re shots that have been recorded at the point at which they’ve been blocked, and are also more likely to be shots of less quality and speed by nature of their blocking.

A shot’s Danger is then defined by this method:

  1. Start with the zone in which the shot attempt was recorded, 1 through 3.
  2. Add 1 if it was a rebound or a rush shot.
  3. Subtract 1 if it was a blocked shot.
  4. Increase to 1 if it was equal to 0.
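
The four steps translate directly into a short function. A minimal Python sketch follows; the function and argument names are ours for illustration, and it assumes the zone is already known (shots with unknown location are not handled).

```python
def shot_danger(zone, is_rebound=False, is_rush=False, is_blocked=False):
    """Danger score for a shot attempt, following the four steps above."""
    if zone not in (1, 2, 3):
        raise ValueError("zone must be 1, 2 or 3")
    danger = zone                    # 1. start with the recorded zone
    if is_rebound or is_rush:
        danger += 1                  # 2. rebounds and rush shots add one
    if is_blocked:
        danger -= 1                  # 3. blocked shots subtract one
    return max(danger, 1)            # 4. floor the result at 1


print(shot_danger(3, is_rebound=True))   # high-danger zone, rebound
print(shot_danger(1, is_blocked=True))   # low-danger zone, blocked
```

Note that a shot that is both a rebound and a rush shot still gains only one point, since step 2 is a single "or" condition.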

For goaltenders, we then have

  • G.U, S.U: Goals and Saves with unknown danger.
  • G.L, S.L: Goals and Saves with low (1) danger.
  • G.M, S.M: Goals and Saves with medium (2) danger.
  • G.H, S.H: Goals and Saves with high (3+) danger.

Scoring Chances (SC):

All shot attempts that have danger 2 or greater. As originally described here.

Take the above quantities and replace “G” with “SC” like so:

  • SCF: Scoring chances produced by a player or their teammates when the focal player is on the ice. SCFoff: Scoring chances produced by a player’s teammates when the player is off the ice.
  • iSC: Individual scoring chances.
  • SCP60: Scoring Chance Pace (per 60 minutes), equal to SCF60 + SCA60.

 

High-Danger Scoring Chances (HSC):

All shot attempts that have danger 3 or greater. Take the above quantities and replace “G” with “HSC” like so:

  • HSCF: High-danger scoring chances produced by a player or their teammates when the focal player is on the ice. HSCFoff: High-danger scoring chances produced by a player’s teammates when the player is off the ice.
  • iHSC: Individual high-danger scoring chances.
  • HSCP60: High-Danger Scoring Chance Pace (per 60 minutes), equal to HSCF60 + HSCA60.

 

Adjusted Save Percentage (AdSv%):

Defined here, it is the weighting of a goaltender’s save percentage in each danger level by the fraction of shots that would be expected from the league-wide distribution.

Score Situations:

Score Effects are the acknowledged differences in team performance based on the difference in score. Several popular methods are used to account for the score in whatever results are presented; we host two. The first, Score Close, was pioneered by Tore Purdy (aka JLikens) and simply includes situations where teams are within 1 goal of each other in Periods 1 and 2, and tied afterwards.
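
The Score Close rule is simple enough to write as a one-line filter. Here is a Python sketch; the function name is hypothetical, and it assumes overtime is numbered as period 4 and up, so that "afterwards" covers both the third period and overtime.

```python
def is_score_close(period, score_diff):
    """True if an event falls in a Score Close situation.

    Periods 1 and 2: teams within one goal of each other.
    Period 3 and later: score tied.
    """
    if period <= 2:
        return abs(score_diff) <= 1
    return score_diff == 0


print(is_score_close(period=2, score_diff=-1))   # True
print(is_score_close(period=3, score_diff=1))    # False
```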

The second, manual score adjustment, has a few different predecessors:

Not surprisingly, we went with our adjustments, and implement a full Poisson model for score, period and rink effects for each shot type by each danger zone. Adding the rink bias correction to our score and period correction was inspired by Schuckers and Macdonald.

Charts:

Bubble charts: The main look and design of the bubble charts, with four variables displayed simultaneously, comes from Rob Vollman’s Player Usage Charts, including the starting variables: x-axis for zone starts, y-axis for quality of competition, color for Relative Corsi. Our expansions and additions include every variable we have at our disposal including different game and score states.

Hextally: Directly inspired by Kirk Goldsberry’s NBA Shot Charts for Grantland and 538 (but perhaps the lack of a player with an appropriate name in the NBA made it difficult.) Expanded for both shot success probabilities (standard for basketball) and the rate of shots taken from each area of the ice (not standard for basketball, even if it were played on the ice.)

Shift Charts: We started with the original NHL shift charts before adding our own features. Both ShiftChart.com and timeonice.com (now defunct) hosted their own versions, inspired by the same NHL.com template.

Shot Attempt Timelines: ExtraSkater (defunct) made them popular, but Behind The Net had the first ones we could find online.

Pulling the Goalie: Original post is here.

Raw Teammate/Competition Statistics:

For each of the teammate and competition statistics, we calculate relative numbers on a game-by-game basis by taking an exponentially weighted prediction of the next game’s numbers.

  • TOIT60, TOIC60: The average time on ice per 60 minutes for teammates and competition in previous games, weighted by mutual time on ice.
  • CorT%, CorC%: The share of Corsi events for the teammates and competition in previous games, weighted by mutual time on ice.
  • tCF60, cCF60: The rate of Corsi events recorded on-ice for the teammates and competition in previous games, weighted by mutual time on ice.
  • tCA60, cCA60: The rate of Corsi events recorded on-ice against the teammates and competition in previous games, weighted by mutual time on ice.
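
The "weighted by mutual time on ice" part of these definitions is an ordinary weighted average. The Python sketch below shows only that step, with made-up numbers and an invented function name; the exponentially weighted prediction across games is not shown, since its decay parameter is not given here.

```python
def toi_weighted_average(values, shared_toi):
    """Average of each teammate's (or opponent's) statistic, weighted by the
    minutes they spent on the ice at the same time as the focal player."""
    total = sum(shared_toi)
    if total == 0:
        return float("nan")
    return sum(v * w for v, w in zip(values, shared_toi)) / total


# Hypothetical teammates: their Corsi share and minutes shared with the player.
teammate_cf_pct = [55.0, 50.0, 45.0]
minutes_together = [600.0, 300.0, 100.0]

print(toi_weighted_average(teammate_cf_pct, minutes_together))   # CorT%
```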

Other:

  • Penalties: PN are non-coincidental penalties taken by a player; PN- are non-coincidental penalties drawn by a player. PenD is the difference, PN- minus PN; PenD60 is the net rate of penalties drawn every 60 minutes.
  • Faceoffs: FO_W are faceoff wins. FO_L are faceoff losses. FO%^ is a shrunken faceoff win percentage, to avoid extreme results: identical to FO% if more than 20 faceoffs were taken, a combination of this and a 40% success ratio if fewer.
  • Zone starts: ZSO, ZSN and ZSD are the number of faceoffs taken in the offensive, neutral and defensive zones for which the player was present; ZSOoff, ZSNoff and ZSDoff are the number of faceoffs taken in the offensive, neutral and defensive zones for which the player was absent. ZSO% is the share of offensive starts divided by the offensive plus defensive. ZSO%Off is the share of offensive starts for when that player is absent; ZSO%Rel is ZSO% minus ZSO%Off.
  • GV are giveaways, TK are takeaways. HIT are hits delivered, HIT- are hits absorbed. None of these are recorded reliably in NHL buildings.
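
For the penalty and zone-start quantities, a short worked example in Python may be useful. The names and counts are invented for illustration, and percentages are assumed to be on a 0-100 scale. The shrunken faceoff percentage is left out because the exact combination rule is not spelled out above.

```python
def zso_pct(offensive_starts, defensive_starts):
    """ZSO%: offensive starts over offensive plus defensive starts."""
    return 100.0 * offensive_starts / (offensive_starts + defensive_starts)


def pend60(drawn, taken, toi_minutes):
    """PenD60: net penalties drawn (PN- minus PN) per 60 minutes."""
    return (drawn - taken) / toi_minutes * 60.0


# Hypothetical season totals (made-up numbers).
on_ice = zso_pct(300, 200)        # ZSO% with the player present
off_ice = zso_pct(400, 600)       # ZSO%Off with the player absent
relative = on_ice - off_ice       # ZSO%Rel

print(on_ice, off_ice, relative)
print(pend60(drawn=26, taken=20, toi_minutes=1200.0))
```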

nhlscrapr updates now on GitHub

We will no longer be making updates to nhlscrapr on CRAN; instead you can get the most recent version from the war-on-ice repository on GitHub.

To use this, you’ll need to install the library devtools first, then once that’s loaded, use the command

install_github("war-on-ice/nhlscrapr")

This is updated for the 15-16 season and includes all our other most recent upgrades.

A Quick Note on Adjusted Save Percentage

Different goaltenders face different distributions of shots from across the ice due to the offenses they face and the defenses in front of them. We adjust save percentage by re-weighting the components according to the league-wide distribution of shots, so that the value better translates between different goaltenders. This is similar to stratified sampling in survey methodology, and also goes by the name benchmarking.

With our danger breakdown, standard save percentage of Saves/(Saves + Goals) is expressed as

Sv% = (Saves_low + Saves_med + Saves_high)/(Saves_low + Goals_low + Saves_med + Goals_med + Saves_high + Goals_high)

The adjustment is to re-weight every danger-based save percentage by the league-wide distribution of shots on goal in each zone:

AdSv% = (S_l/(S_l + G_l) * AllShots_l + S_m/(S_m + G_m) * AllShots_m + S_h/(S_h + G_h) * AllShots_h ) / (AllShots_l + AllShots_m + AllShots_h)

If the shots faced by the goalie have the same ratio as the league average, then their unadjusted and adjusted save percentages will be equal.
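
Here is a minimal Python sketch of the two formulas, using made-up numbers for one goaltender and for the league. The function names and the dictionary layout are ours, and the sketch assumes the goaltender has faced at least one shot at every danger level.

```python
def save_pct(saves, goals):
    """Plain save percentage: saves / (saves + goals)."""
    return saves / (saves + goals)


def adjusted_save_pct(goalie, league_shots):
    """Weight each danger-level save percentage by the league-wide shot count.

    goalie maps danger level -> (saves, goals); league_shots maps danger
    level -> shots on goal.
    """
    weighted = sum(save_pct(*goalie[level]) * shots
                   for level, shots in league_shots.items())
    return weighted / sum(league_shots.values())


# Hypothetical goaltender who sees fewer dangerous shots than the league does.
goalie = {"low": (485, 15), "med": (184, 16), "high": (85, 15)}
league = {"low": 500, "med": 300, "high": 200}

raw = save_pct(sum(s for s, _ in goalie.values()),
               sum(g for _, g in goalie.values()))
print(round(raw, 4), round(adjusted_save_pct(goalie, league), 4))
```

Passing the goaltender's own shot counts (500, 200, 100 here) as the weights returns the unadjusted figure, which is the equality described above.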

 

 

Sharing is Caring

Back in March, I met Darryl Metcalf for the first time at the Sloan conference. We were talking about how we had advertised that we would be open with our data and infrastructure, and he said something to the effect of “when?” And as I recall, my answer was “real soon”.

We’ve always taken requests for data and shared privately, but since the news of our comrade’s hiring, and our desire to stimulate further research, we’ve begun the process to share everything we’ve put together — and we mean everything — with the hopes that given our existing database format, it will be even easier for community members to make their own tools and run their own analyses.

Here’s what we’re posting publicly in the first round:

  • The raw output from nhlscrapr, including our integration from other sources and corrections for rink bias.
  • The full underlying database for all player, team and goaltender statistics.
  • By-game and by-season calculations for Goals Above Replacement.

The full file list is available here and will be updated as needed.

Here is a description for all variables in nhlscrapr and in the derived WAR files.

UPDATE: Here is the processed contracts table.

 

Playoffs Prediction Contest Scoreboard

Entry names and final scores listed below.

Entry.Name Score
Perfect 0
Don’t Toews Me Bro! 6.8716238602
The Brass Bonanza 7.0910481644
microdino 7.2444384897
Corsi Hockey League 7.2457407713
Sean Dooley 7.272956604
Adam Odenwelder 7.3923966229
derek8 7.5417095449
Clown Predictions, Bro 7.6645163552
JohnScottScoringMachine 7.7063411263
Corsi Calamity 7.7112029266
HawksNumbers 7.7604368224
Ben Lutz 7.8195240411
BentleyNathan1 7.8491414946
Sebastian Mankowski 7.8646730078
danbowie 7.9074140647
Carolyn Tries Her Hand 8.1370624749
dvgmacdonald 8.1405948215
thehaze 8.204558261
trevor 8.2293711818
mikael johansson 8.2495068414
Jay32600 8.2540820936
Flesh and Bone 8.3085470955
Fancy Fenwicks 8.3136010606
whichocho 8.3401821184
Losing Entry 8.3442167105
YT 8.4366901828
sad bruins fan :( 8.4866993775
Tom.Andreu 8.5119277864
Andrei D 8.5239516775
Holmgren 8.5284431538
Danton Danielson 8.6351995803
Daniel Sandler 8.650645999
Smittens 8.6721579007
Ken Peterson 8.6911295819
Corey 8.7173200392
Ovi is God 8.7205833069
Eric Single 8.7439823599
Getzlaf’s Forehead 8.7872070859
QuickkNess 8.8246743919
altrockposeur 8.8278775422
Nilesh Shah 8.8403094971
StatsbyLopez 8.8705267611
Steven David 8.8833133041
Andrew Wisneski 8.8868548107
John Barr 8.9228851524
Adam R 8.9424973307
quack attack 8.9674498118
seanbailly 9.0954598132
Flashes of Quincy 9.1015593194
Kurtis Wells 9.1390202842
Midnight Ramblers 9.1484016026
Félix Magny 9.1547807581
The Math of Khan 9.1563240565
kncpt 9.1598845942
Sabres Win 9.1612333752
Matt Cane 9.163722536
evohnave 9.1648887848
Hero Squad 9.2113579453
Mcurcio94 9.2398682861
clib542 9.2730859542
Sapp Macintosh 9.2840653428
RangerSmurf 9.2915912403
gut feels 9.3410879975
Jason Richland 9.4032997817
MBwinz 9.4134528979
Iron Ringer 9.468006452
Legs Feed The Wolf 9.5184845615
Lugnut Ninety-Two 9.5650159312
Nick 9.5677243517
Andrew Pritchard 9.5986201249
Jenni 9.5990676441
Sean 9.6218580301
Mysonzdad 9.654913344
Josh Norman 9.6743593226
Pat Holden 9.7033732654
Adrian F 9.717788625
Neel 9.7563674943
Wolfram Ott 9.7777860246
PDOwned 9.7863596557
Nikm8 9.787107307
Peace On Liquid Water 9.7934054506
RyanPrice 9.8405501909
Jon Stolte 9.8851916059
ScuttlePuck 9.9107184334
Winterhawk11 9.9367867374
aasiaat 9.9585003239
Jon Garcia 9.9856430718
PhilKessel 10.1325615448
artigascruz 10.1368246784
The Philosopher 10.1498649026
Regression to the mean streak 10.1720356159
CBJ 2016 10.3383148238
Zone Entries 10.3457779324
Wrathman 10.3840926146
clarendonbandit 10.3850245284
Kerfuffle on Frozen Water 10.4104886866
Andrew Rasmussen 10.4440463984
bigMac 10.6044809869
UponFurtherReview 10.6437886061
amcassells 10.663771638
BlueMoons68 10.6866708283
Karan 10.7103734232
BFitz 10.7969523003
Not Gonna Win 10.839300026
Stefan 10.9414853807
ZachMacDonald 11.043075811
Chris Kang 11.1936846403
kd5mdk 11.3652086796
kadri’s drunks 11.6127684377
Ohno 11.7344710644
Micah Blake McCurdy 11.9241412719
Stanley’s Bandwagon 12.0942643644
the Flamingos 12.1588773778
JH 12.7173764514
fbourassa 13.4674255004
Paddyboy 14.963914817
Aerofan79 15.0526656878
A J G 16.1596523009
Sean Burke 16.3475285916
finlayj 20.0711222646

Recapped, geek

Summary: go to http://war-on-ice.com/cap/ to see what we’ve put together. Read below for what we have to share.

Back in January when CapGeek went offline, we grabbed a series of contract pieces from USA Today and the NHLPA website to help tide the community over. We didn’t have any plans to take the reins on a new site, partly because it would be a lot of work, partly because we didn’t have a clue where to get the data from an original source, but mainly because we hoped that the permanence of the shutdown was overstated.

What changed for us? Once the initial salary data went up, we were approached by those with the original, genuine contract data, from the same basic sources that Matthew had access to, who wanted to see the work continue. We then set to work converting it over into a new database, recruited volunteers to help us crawl through it (particularly the incomparable Alexandra Mandrycky), spent way too many hours bothering our resident cap expert (Mike Colligan) with questions, partnered up with other efforts (including the ninja himself, Greg Sinclair) and posted an initial set of contracts for all the players we could find who were active this past season at the end of February, along with a buyout and cap recapture tool. The reaction broke the site temporarily, so we knew we’d have to build a better infrastructure before we could go big.

Well, we’re ready to go big today. The “beta” version of our contracts database is now ready for consumption, reliably dating back to the 2009-10 season, and including everything we could find on contract structure, signing and performance bonuses (particularly the achievable A and B bonuses). Other features:

  • You can find any player contract or statistical breakdown from the homepage at war-on-ice.com.
  • We have a link under every contract to see what the buyout terms would be for any year after the first. (We’ve only found one example where a contract was bought out without a game being played — Tim Kennedy with the Sabres — and that was a result of an arbitration decision.)
  • We’ve made it clearer which contracts have slid, which have been bought out, and what years have been “retired” out.
  • We have a quick summary of active contracts, new signings, and performance bonuses achieved at the cap home, war-on-ice.com/cap.
  • We’ll be posting quick team summaries, including team-level obligations like buyouts, performance bonus overages and retained salaries, as soon as they’re ready.

We and our (friendly) competitors surely have a ways to go before the functionality of CapGeek is once again matched; Matthew Wuest put over five years of work into it, after all. And if our experience after ExtraSkater’s shuttering has taught us anything, it’s that a gap in the market draws fresher and better alternatives from those who would not have acted while one site dominated, so we’re expecting greater work from ourselves and others in following Matthew’s lead.

But of all the roles that CapGeek played — to have the most trusted database, the quickest reaction time to contract announcements, and the user-friendliest interface — we’re most confident that we can serve the community best in the first role, and so that’s where the bulk of our energy has gone. Which is why we’re opening all our data — contracts, game statistics, and (soon) transactions — to anyone who wants to use it (but not sell it), as long as we’re cited as the original source.

Let’s go forward together. With everyone jumping on board this train, this should be a fun off-season.

 

Site Terms of Use

We at war-on-ice.com built this site together for two reasons: to disseminate our own research ideas and purposes, and to have tools that everyone can use for their own research. To clarify our goals and our means, and in light of other sites having similarly stated policies, we state our positions on all of these matters.

1. The use of this site comes with absolutely no warranty.

2. We are not responsible for the consequences of what you do with the data — legally or morally.

3. Our site’s prime purpose is for research: you can use what you like for plain fandom, articles, personal exploration and knowledge, tweets, blog posts, and so forth. Our work to build the site started as the result of our needs for academic work and that’s still what we do with it first. In that spirit, we ask that you cite our work and, in particular, the specific page you obtained the data from so that others can obtain it as well. If you would like to show extra appreciation, the Donate button is on the site’s front page.

4. Automatically scraping our pages for data is not only strictly prohibited, it’s not worth your time to do it; it’s an unnecessary strain on our servers, and that’s why we have Download buttons. If you want data that’s more complicated than our current queries, you can ask us by email at waronice.com@gmail.com or on Twitter at @war_on_ice.

5. We reserve the right to offer our services in consulting arrangements to help you get the most from transforming and aggregating the data in meaningful ways. We offer what we can for free because we treasure the community and value openness, but we’re not able to take every request for free simply because we all have full time jobs.

6. You may not sell any raw data you obtain from the site. This is not just because you can get it from here for free and your customers would be fleeced; it’s because this comes from the league and their partners and it’s not our place to sell what they allow us to use. This is not to say you can’t use it in articles that are posted behind a paywall, or use it in anything that’s at all transformative. You know what we mean here; if you have any doubt, ask us.

7. Our underlying software is licensed under the GPLv2. We will be sharing what we can on GitHub for your convenience.

Back-to-backs and goalie performance

I’ve been curious for a while about the impact of rest and travel on goaltending, especially after reading the work of Gus Katsaros and Eric Tulsky, so I re-ran the numbers on save percentages including back-to-back games, away versus home, and with danger zones included. We know that when teams play back-to-back games their shot rates for go down and their shot rates against go up, especially for high-danger shots; this is enough to make me wonder whether our enhanced database will tell us more about goaltending than we previously knew.

Eric Tulsky previously found that save percentage dropped by a full percentage point in the second half of back-to-backs, from .912 to .901, using data from the 2011-12 and 2012-13 seasons. Since we now have quality goaltending data at war-on-ice.com from 2005-06 until this season (2014-15), it’s worth a fresh look. Here’s the effective difference by season.

[Figure: effective difference in save percentage in back-to-back games, by season]

The reputation for tired goalies has apparently been made based on the two worst years in our record; in fact, in three other seasons the effective change in save percentage is positive.

Given the additional tools we have at our disposal, let’s break them out and see if they tell us anything new about this. Let’s do it in this sequence using good old logistic regression:

  1. Start with the home advantage, the indicator for the second half of a back to back, and the interaction between the two.
  2. Add in danger zones, since we know this has played a role.
  3. Add score difference, since teams with the lead have higher shooting percentages.
  4. Add in the game state (5v5, PP, SH, 4v4, etc)
  5. Finally, we add in terms for each goaltender in case there are selection effects for which coaches are willing to lean on their number ones.

The negative changes in save “percentage” in thousandths for each factor:

Model | Away Goalie | Back-To-Back (Home) | Back-To-Back (Away)
1 | 3.3 | 1.3 | 3.1
2 | 1.2 | 2.1 | 3.4
3 | 0.6 | 1.9 | 3.4
4 | 0.1 | 1.8 | 2.9
5 | -0.03 | 2.3 | 3.7

The home advantage on save percentage disappears the more factors we add, and the difference in “tired” performance persists, but only at about three and a half points below their usual performance, not 11. I was personally expecting the differences to be bigger, and I was also expecting shot danger to play a bigger role than effectively none. Still, while we don’t have a good idea whether there’s a greater risk of injury, or other unknown factors at play, we can be confident that coaches aren’t completely nuts if they send their Number One out back to back.

Replication materials:

The Road To WAR, Part 11: Shot Rates For And Against, or that quality that we deliberately avoid calling “possession”

This is the big one that drives most of what we see in the game, but is also the most difficult to calculate directly: how would the shot rates for and against a team behave if we swapped out a player with their equivalent replacement?

First, here’s the progression in methods that we’ve seen so far:

  1. Good old plus-minus (+/-), which no one seems to think is good but everyone agrees is old. It was the number that was used for the longest time to capture supposed relative defensive ability, but among its flaws are that it’s too dependent on goaltenders, too dependent on linemates, and the sample sizes are too small to produce a strong signal. Relative plus-minus doesn’t have the first problem, if the only job is to compare against one’s own teammates, but can still suffer with too much common time with other players.
  2. Corsi/Fenwick/Bowman numbers take away the impact of the goaltender and of shooting skill, in favor of at least a tenfold increase in sample size. They add in contributions from usage like zone starts which can now be detected statistically and still have the linemate and competition problem.
  3. Regression-adjusted statistics for shot differential; see our comprehensive historical list here, then add in Stephen Burtch’s dCorsi and Domenic Galamini’s Usage-Adjusted Corsi. Essentially, make adjustments to the macro-level stats depending on whom they played with and against.

You could hypothetically drop in any of the above pieces and spin them into a measure of goals; the conversion can then be slotted alongside the other contributions to get a total value. But we have a few other needs:

  1. We want to adjust for teammates and competition simultaneously, including replacement level players.
  2. We need to separate offensive and defensive contributions.
  3. We adjust for usage, including whether a faceoff was won or lost, and score situation.
  4. We model separately for each shot danger, because we know that forwards and defensemen contribute differently between and within these groups.
  5. We also want to distinguish between performance (what happened) and talent (what would be most likely in future).


The Road to WAR Series (Index)

All the articles in the Road to WAR series.

  1. The Single Number Dream
  2. All Rate Now
  3. Shot Quality Assurance, plus A Bonus on Travel Fatigue
  4. You can’t spell “An Incremental Improvement” without two “team”s
  5. Getting Goals Above Baseline
  6. Rate-Based Event Adjustments For Score Effects, Home Advantage and Event Count Bias
  7. What do we mean by “replacement”? A case study with faceoffs
  8. Penalties Taken And Drawn
  9. Historical Shooting and Goaltending
  10. Modern Goaltending and Shooting
  11. Shot Rates For And Against