WAR On Ice: The Blog | @stat_sam | http://blog.war-on-ice.com | Your site for modern hockey analytics

GUEST POST: Hockey and Euclid — Introduction to Bombay Ratings
http://blog.war-on-ice.com/bombay-ratings/ | Mon, 20 Apr 2015

Note:  This is part two of a series of guest posts written by @MannyElk.  In the first installment of Hockey and Euclid, Manny outlined the player similarity calculation used in the Similarity Calculator.  The explanations to follow will assume knowledge from that article, so we urge anybody who wishes to understand the derivation of the Bombay function in detail to get caught up.

We at www.war-on-ice.com are happy to host Manny’s newest Bombay Ratings App!  We continue to encourage others in the hockey research community to follow Manny’s lead and develop public applications that will further the frontiers of research in hockey analytics.

While working on the Similarity Calculator, I stumbled upon a study in which Euclidean distance was used to compare NBA players to Michael Jordan.  The author used the distances to generate a list of the most similar players to the man most would agree is the best ever.  From this idea, Bombay ratings were just a conceptual hop, skip and jump away.  Instead of choosing my own Michael Jordan from a list of historical players, I invented one.

Gordon Bombay (no relation to the legendary coach) played two seasons in the NHL.  In his first season, as a forward, he led the league in every single statistical category and eventually his team to the Stanley Cup.  Seeking a challenge, Bombay converted to defence in his second season.  Undaunted, he repeated his rookie season success, once again unmatched by his peers in every single facet of the game.  Bombay promptly retired, and no player since 2005 has been able to surpass his accomplishments.

Bombay’s stats at either position are equal to the best recorded values among regular skaters at that position since the 2005-2006 season.  Thus, he possesses the best stats we can imagine without stepping outside the boundaries of what real players have been able to accomplish.  If you don’t wish to entertain hypotheticals, consider an alternative explanation:  The similarity calculation evaluates “distance” between players, each occupying a position in imaginary space.  This space has as many dimensions as there are categories by which you choose to compare players, and the limits of each dimension are set by the maximum and minimum recorded values since 2005-2006.  Gordon Bombay is simply a marker we’ve decided to place at the positive-most position in space — the position where the positive extrema of each dimension meet. In a three-dimensional plot, this is simply a corner.  The Bombay Rating is the similarity between a player and Gordon Bombay.
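The construction above can be sketched in a few lines of code.  This is a minimal illustration, not the app’s implementation: the stat names, weights, and values below are made up, whereas the real extrema come from regular skaters since 2005-06.

```python
import math

def bombay_rating(player, rows, weights):
    """Similarity (0-100) between `player` and the Gordon Bombay marker,
    which sits at the best recorded value of every chosen measure."""
    hi = {m: max(r[m] for r in rows) for m in weights}  # Bombay's stat line
    lo = {m: min(r[m] for r in rows) for m in weights}  # the opposite corner
    dist = math.sqrt(sum(w * (hi[m] - player[m]) ** 2 for m, w in weights.items()))
    max_dist = math.sqrt(sum(w * (hi[m] - lo[m]) ** 2 for m, w in weights.items()))
    return 100 * (1 - dist / max_dist)

# Hypothetical player-seasons with made-up measures (illustrative values only)
rows = [
    {"p60": 3.5, "relcf": 15.0},
    {"p60": 0.4, "relcf": -15.0},
    {"p60": 2.1, "relcf": 4.0},
]
rating = bombay_rating(rows[2], rows, weights={"p60": 1.0, "relcf": 1.0})
```

A player matching the best recorded value in every measure would rate 100; one matching the worst recorded value in every measure would rate 0.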

Hence, we’ve laid the foundation for a method of easily evaluating how “good” a player’s stats are, one that is surprisingly flexible and reasonably effective at producing intuitively pleasing results.  The Bombay function essentially does what we all do when we pull up a player’s statistics.  The advantage is that it’s more precise, quicker, and returns a single number.  Recall that the similarity calculation is a function of the chosen dimensions and corresponding weights.  It follows that the Bombay rating is a function of those same variables.  While this permits fluidity in what can be accomplished by the method, it also makes it entirely dependent on the quality of the measures used.

The Bombay app I developed uses a variety of 5v5 stats to assign ratings to skaters based on the selected weights and generate charts comparing players to Gordon Bombay in each of the chosen categories.

[Image: screenshot of the Bombay app’s comparison chart (click to enlarge)]

The outer edge of the chart represents a 100% similarity to Bombay in that measure.  This is only achieved if a selected player-season possesses the best recorded value in that metric among regular skaters since 2005.  The dashed grey polygon represents another fictional player – one whose stats are all equal to the league average for regular skaters at that position.  Note that league average does not signify a 50% similarity.  At the default weights, this hypothetical average forward has a Bombay rating near 46 and the defenceman, 45.

I should confess that the default weights are largely arbitrary.  I believe the correct weights to use are case-dependent, and I certainly encourage users to assign their own.  I’ve found that using the “Defence” preset weights as a starting point to evaluate bottom-six or defensively-oriented forwards often produces more agreeable results.  Individual season rankings can be viewed by toggling the “Table” tab and further filtered using the inputs at the bottom of each column.  Using preset weights, the names atop the Forward rankings (Ovechkin, Sedin, Jagr, Crosby, Zetterberg, Malkin, Sakic) are who you’d expect; Defencemen, to a much lesser extent (Visnovsky, Karlsson, Giordano, Byfuglien, Campbell, Niskanen, Weber).  It’s no secret that the evaluation of defencemen, by analytical and traditional methods alike, leaves something to be desired at times.  With better measures of defensive ability will come better results by this method.

Bombay ratings can easily be computed using aggregate player stats.  You can view career Bombays here.  While I wouldn’t necessarily trust default Bombays over metrics like WAR and GvT as a single number indicative of player quality, I believe the method has very interesting potential and flexibility.  For one, it can easily be expanded as new stats become available.  Secondly, the same method can be applied in other leagues, namely Canadian Major Junior and college leagues.

Stanley Cup Playoff Prediction Contest
http://blog.war-on-ice.com/stanley-cup-prediction-contest/ | Sun, 12 Apr 2015

Enter Here:  Submit your entry for the 2015 WAR On Ice Stanley Cup Playoff Prediction Contest (Round 1) here.  Rules are below.

Entries:  Only one entry permitted per person.  Violators of this rule will be disqualified (and publicly shamed on Twitter).  Sam, Andrew, and Alexandra will be participating, but their entries will be ineligible to win the prize.

Prize:  First place receives a $50 Amazon.com gift card.  All other entries will be awarded with nothing, because you don’t play to lose in the Stanley Cup Playoffs.

Donations:  Entry is free, as this is a “for fun” contest only.  Although not required, we suggest making a small donation through the link on our home page.

Half of all donations received between now and the end of the Stanley Cup Playoffs will be forwarded to Colon Cancer Canada in memory of capgeek.com founder Matthew Wuest.  We also encourage people to donate directly to Colon Cancer Canada if they prefer.

Official Scores:  All statistics will be taken from www.war-on-ice.com.  Any stat changes that occur after noon on the day following the completion of the Stanley Cup Playoffs will not be taken into account.

Scoring:  Answers to each question will be standardized using the scale() function in R.  The Euclidean distance between each entry and the actual results will be calculated using the dist() function in R.  The entry with the lowest Euclidean distance from the actual results will be the winner.
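The scoring procedure can be illustrated as follows.  This is a Python sketch of what the R code does (scale() standardizes each question, dist() computes Euclidean distances), not the contest’s actual code; the entries are invented, and we assume the actual results are standardized together with the entries.

```python
from statistics import mean, stdev

def score_entries(entries, actual):
    """Standardize each question's answers (sample standard deviation, as in
    R's scale()), then return each entry's Euclidean distance from the
    standardized actual results.  The smallest distance wins."""
    cols = list(zip(*(entries + [actual])))  # one tuple of answers per question
    zcols = [[(x - mean(c)) / stdev(c) for x in c] for c in cols]
    rows = list(zip(*zcols))                 # back to one row per entry
    target = rows[-1]                        # standardized actual results
    return [sum((a - b) ** 2 for a, b in zip(r, target)) ** 0.5
            for r in rows[:-1]]

# Invented answers to two numeric questions (e.g. total goals, games played)
entries = [[16.0, 5.0], [12.0, 3.0], [20.0, 7.0]]
actual = [15.0, 4.0]
dists = score_entries(entries, actual)
winner = dists.index(min(dists))  # the entry closest to what actually happened
```

Standardizing first means a question answered in goals (tens) can’t swamp a question answered in games (single digits) when the distances are computed.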

Subsequent Rounds:  Additional questions will be sent to participants for rounds 2, 3, and 4.  These will be sent shortly after the conclusion of the previous round.  Participants who fail to submit entries for rounds 2, 3, and 4 will be assigned a random response from one of the other contestants for each question.  Note that there will be limited time between rounds to complete these questions, so please plan accordingly.

Submission Times:  Any submission received after the first puck drops each round will be disqualified (for Round 1) or assigned random responses, as described in “Subsequent Rounds” above (for Rounds 2, 3, and 4).

Disclaimer:  We reserve the right to disqualify any entry for any reason at our discretion.

GUEST POST: Hockey And Euclid — Calculating Statistical Similarity Between Players
http://blog.war-on-ice.com/similarity-scores/ | Sun, 29 Mar 2015

Editor’s note:  This is a guest post written by Emmanuel Perry.  Manny recently created a Shiny app for calculating statistical similarities between NHL players using data from www.war-on-ice.com.  The app can be found here.  You can reach out to Manny on Twitter, @MannyElk.

We encourage others interested in the analysis of hockey data to follow Manny’s lead and create interesting apps for www.war-on-ice.com.

The wheels of this project were set in motion when I began toying around with a number of methods for visualizing hockey players’ stats.  One idea that made the cut involved plotting all regular skaters since the 2005-2006 season and separating forwards and defensemen by two measures (typically Rel CF% and P/60 at 5v5).  I could then show the position of a particular skater on the graph, and more interestingly, generate a list of the skaters closest to that position.  These would be the player’s closest statistical comparables according to the two dimensions chosen.  Here’s an example of what that looked like:

[Image: scatterplot of regular skaters by Rel CF% and P/60 at 5v5, with one player’s closest comparables highlighted (click to enlarge)]

The method I used to identify the points closest to a given player’s position was simply to take the shortest distances as calculated by the Pythagorean theorem.  This method worked fine for two variables, but the real fun begins when you expand to four or more.

In order to generalize the player similarity calculation for n-dimensional space, we need to work in the Euclidean realm.  Euclidean space is an abstraction of the physical space we’re familiar with, and is defined by a set of rules.  Abiding by these rules can allow us to derive a function for “distance,” which is analogous to the one used above.  In simple terms, we’re calculating the distance between two points in imaginary space, where the n dimensions are given by the measures by which we’ve chosen to compare players.  With help from @xtos__ and @IneffectiveMath, I came up with the following distance function:

distance(p, q) = √( w₁(p₁ − q₁)² + w₂(p₂ − q₂)² + ⋯ + wₙ(pₙ − qₙ)² )

And the Similarity calculation:

Similarity(p, q) = 1 − distance(p, q) / √( w₁(max₁ − min₁)² + ⋯ + wₙ(maxₙ − minₙ)² )

In decimal form, Similarity is the distance between the two points in Euclidean n-space divided by the maximum allowable distance for that function, subtracted from one.  The expression in the denominator of the Similarity formula is derived from assuming the distance between both points is equal to the difference between the maximum and minimum recorded values for each measure used.  The nature of the Similarity equation means that a 98% similarity between players indicates the “distance” between them is 2% of what the maximum allowable distance is.

To understand how large the maximum distance is, imagine two hypothetical player-seasons.  The highest recorded values since 2005 for each measure used belong to the first player-season; the lowest recorded values all belong to the second.  The distance between these two players is the maximum allowable distance.
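In code, the similarity between two player-seasons can be sketched as follows.  This is a hand-rolled illustration of the formula, not the Calculator’s code; the measure names, weights, and extrema are placeholder values rather than the real recorded ones.

```python
import math

def similarity(p, q, maxima, minima, weights):
    """Percent similarity: one minus the weighted Euclidean distance between
    p and q, divided by the maximum allowable distance (one player at every
    measure's recorded maximum, the other at every minimum)."""
    dist = math.sqrt(sum(w * (p[m] - q[m]) ** 2 for m, w in weights.items()))
    max_dist = math.sqrt(sum(w * (maxima[m] - minima[m]) ** 2
                             for m, w in weights.items()))
    return 100 * (1 - dist / max_dist)

# Placeholder extrema and weights (the real ones come from the data)
maxima = {"relcf": 15.0, "p60": 3.5}
minima = {"relcf": -15.0, "p60": 0.0}
weights = {"relcf": 1.0, "p60": 1.0}

a = {"relcf": 5.0, "p60": 2.0}
b = {"relcf": 4.4, "p60": 2.0}
s = similarity(a, b, maxima, minima, weights)  # two very close player-seasons
```

Identical players score exactly 100, and the two hypothetical extreme player-seasons described above score exactly 0 against each other.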

Stylistic similarities between players are not directly taken into account, but can be implicit in the players’ statistics.  Contextual factors such as strength of team/teammates and other usage indicators can be included in the similarity calculation, but are given zero weight in the default calculation.  In addition, the role played by luck is ignored.

The Statistical Similarity Calculator uses this calculation to return a list of the closest comparables to a given player-season, given some weights assigned to a set of statistical measures.  It should be noted that the app will never return a player-season belonging to the chosen player, except of course in the top row, which is included for comparison’s sake.

[Image: the app’s table of closest comparables (click to enlarge)]

Under “Summary,” you will find a second table displaying the chosen player’s stats, the average stats for the n closest comparables, and the difference between them.

[Image: the app’s Summary table (click to enlarge)]

This tool can be used to compare the deployment and usage between players who achieved similar production, or the difference between a player’s possession stats and those of others who played in similar situations.  You may also find use in evaluating the average salary earned by players who statistically resemble another.  I’ll continue to look for new ways to use this tool, and I hope you will as well.

** Many thanks to Andrew, Sam, and Alexandra of WAR On Ice for their help, their data, and their willingness to host the app on their site. **

Sam’s Zone Transition Time Paper
http://blog.war-on-ice.com/ztt/ | Mon, 23 Mar 2015

In November, I introduced a preliminary version of my work on “Zone Transition Times” (ZTTs) at the Pittsburgh Hockey Analytics Workshop.  The slides for and video of my presentation can be found here.

In December, I submitted a paper on ZTTs to the Sloan Sports Analytics Conference; the paper was not selected as a finalist in the research paper competition.  A slightly modified version of this paper can be found here.  The results and text are unchanged from the December submission, except for minor typos.

Since then, I’ve identified some flaws with this work that I didn’t (have time to) explore in November/December:

  • While there are modest in-season correlations between different team ZTTs and future positive outcomes (e.g. Corsi%, Goals-For%, GoalsFor/60), these correlations are lower than those of more established metrics like Corsi, Fenwick, and scoring chances.
  • Team ZTTs are only moderately repeatable across seasons.  The correlations between previous-season-ZTTs and next-season-ZTTs are positive but usually less than 0.5.  This varies substantially depending on which ZTTs are being examined.  For example, fast transitions out of the team’s defensive zone (“good puck moving”) are typically more repeatable than are slow transitions out of the offensive zone (“good forecheck”).  The repeatability of ZTTs is lower than that of more established metrics for team evaluation.
  • While we can calculate ZTTs for players, there just isn’t enough data to come to any reasonable conclusions about which players have better/worse ZTTs.  That is, the standard errors on player ZTTs are very high, so that differences in players’ ZTTs are almost never statistically significant within a given season.

Thanks to everyone who gave feedback on earlier versions of this work.  Please feel free to share your own feedback with me on Twitter.  I hope to revisit this in the summer, or when player tracking data allows us greater precision to evaluate players and teams with ZTTs.


Finally, I’d like to conclude this post with a comment on null results in quantitative research.  For those unfamiliar with scientific jargon, the term “null result” is typically associated with completed scientific studies that are “unsuccessful” in proving a hypothesized claim (statistically, studies where there is not enough evidence to reject the null hypothesis).

I’m not sure I would call my findings with ZTT a “null result,” but the results I did find were not as grandiose or game-changing as I had originally hoped they would be; they could best be described as “weak.”  I’m sure that others have experienced similar results when trying to further research into hockey analytics and other fields.  For those people, I encourage you to publish your null/weak results!  They are interesting on their own.

For example, if I found that a team’s ability to quickly transition the puck out of their defensive zone had absolutely no effect on that team’s ability to suppress goals or shots in the future, would you think that was interesting?  If I found that a team’s ability to keep the puck in the offensive zone for longer periods of time had no effect on a team’s ability to score in the future, would you think that was interesting?  I would.  And if I was someone interested in exploring this topic in the future, I’d want to know what previous researchers have found.

Publish all of your results, regardless of how “strong” or “weak” they are.  It can only serve to benefit the research community by putting this information out there.

NHL Salary Cap FAQ — Mike Colligan
http://blog.war-on-ice.com/salary-cap-faq/ | Fri, 27 Feb 2015

Our running list of Frequently Asked Questions on the NHL Salary Cap, provided by site partner Mike Colligan.

For more from Mike Colligan, visit Colligan Hockey.

Sam on 93.7 The Fan on Tuesday, 2/10
http://blog.war-on-ice.com/sam-on-93-7-the-fan-on-tuesday-210/ | Thu, 12 Feb 2015

Thanks to Dan Kingerski from 93.7 The Fan for promoting the site and having Sam on the show.

Listen to Sam’s interview here.

 

[WAR Off Ice] Updates to Bergeron and M. Staal Contract Info
http://blog.war-on-ice.com/updates-to-bergeron-and-m-staal-contract-info/ | Tue, 10 Feb 2015

Over the next few weeks, we will be releasing salary cap charts and information for NHL teams under the unofficial name “WAR Off Ice”.  Leading up to the full release, we’ll post important contract news to the WAR On Ice blog.

We’re pleased to release early information on two player contracts today.  First, the widely reported contract terms for Patrice Bergeron are incorrect, according to two verified, high level sources.  Additionally, we have what were (to our knowledge) previously unreleased details on the structure of Marc Staal’s contract.

In Bergeron’s case, the average annual value (AAV) of his contract ($6.875M) is higher than what has been publicly reported to-date ($6.5M).  Staal’s AAV is $5.7M, same as previously reported.

Bergeron

Before today, it was publicly believed that Bergeron and the Bruins agreed to an 8 year, $52M contract ($6.5M AAV).  However, our sources have confirmed that they actually agreed to an 8 year, $55M contract ($6.875M AAV), structured as follows:

  • Years 1-4:  $8.750M salary, $0.0M bonus
  • Year 5:  $0.875M salary, $6.0M bonus
  • Year 6:  $0.875M salary, $3.5M bonus
  • Years 7-8:  $3.375M salary, $1.0M bonus

Bergeron will be 36 years old when the contract expires after the 2021-22 season.  We do not know at this time what implications this has for the Bruins’ salary cap situation for the 2014-15 season.

Staal

Below are details on the structure of Marc Staal’s contract with the Rangers, which goes into effect next season and expires after the 2020-21 season: UPDATE: this was reversed originally. Below is now correct.

  • Year 1 (2015-16):  $4.0M salary, $3.0M bonus  — $7.0 M total
  • Years 2-4 (2016-17 — 2018-19):  $5.0M salary, $1.0M bonus — $6.0 M total/y
  • Year 5 (2019-20):  $4.0M salary, $1.0M bonus — $5.0 M total
  • Year 6 (2020-21):  $4.0M salary, $3.0M bonus — $4.2 M total

Again, this results in a $5.7M AAV for the Rangers.

All of this information will be available soon on our NHL team salary cap charts site.

Predictability Differences for Forwards and Defensemen
http://blog.war-on-ice.com/predictability-differences-for-forwards-and-defensemen/ | Fri, 09 Jan 2015

Note:  If you haven’t yet read our post on how SCF% better predicts future GF% than does CF%, we recommend reading that first.  The definitions of metrics, data used, and methodology used in this post are the same as what is written there, so we refer interested readers there for more info.

Summary:  This is an update to our previous post on the best metrics for predicting player performance.  Here, we split the analysis out by position (forwards vs. defensemen).

In this analysis, using all data going back to the 2005-06 NHL season:

  1. For forwards, SCF% is the best predictor of future GF% among the metrics we tested.
  2. For forwards, CF% is a better predictor of future GF% than is FF%, but for defensemen, the opposite is true.
  3. For defensemen, FF% is the best predictor of future GF% among the metrics we tested (with SCF% finishing a close second).
  4. SCF% is a much better predictor of future GF% for forwards than it is for defensemen.
  5. In general, future GF% is more accurately predicted for forwards than it is for defensemen.

Glossary:  See here.

 

Data:  See here, and add FF% = Fenwick For% = percentage of unblocked shot attempts directed at the opposing goal when a player is on the ice.

 

Methods:  See here, and add FF% to the list of metrics evaluated.  (Note:  In our original analysis, we also evaluated FF%, but it was found to be worse than CF% and SCF%, so we did not include it in our post.)

 

Results — Past-vs-Future Correlations:

For forwards, SCF% had the highest past-vs-future correlation with future GF% across all seasons in this analysis.  Here are the results (in order of correlation magnitude):

  • cor(past SCF%, future GF%) = 0.348
  • cor(past CF%, future GF%) = 0.332
  • cor(past GF%, future GF%) = 0.331
  • cor(past FF%, future GF%) = 0.316

 

For defensemen, FF% had the highest past-vs-future correlation with future GF% across all seasons (with SCF% finishing a close second) in this analysis.  Here are the results (in order of correlation magnitude):

  • cor(past FF%, future GF%) = 0.285
  • cor(past SCF%, future GF%) = 0.282
  • cor(past CF%, future GF%) = 0.277
  • cor(past GF%, future GF%) = 0.198

 

Results — Future | Past Regression Models (all seasons):

For forwards, we used past SCF% and past CF% as explanatory variables in a multiple linear regression model of future GF% across all seasons.  The results of this model are summarized here:

  • Past CF%:  Coefficient = 0.2063, p-value = 0.0223.  Note, this coefficient is similar to what was found in the all-positions regression from our previous post (0.2227).
  • Past SCF%:  Coefficient = 0.4856, p-value < 0.0000001.  Note, this coefficient is increased from what was found in the all-positions regression from our previous post (0.4067).
  • Both SCF% and CF% have statistically significant associations with future GF%.
  • Interpretation:  For every one-percentage-point increase in SCF%, future GF% is expected to rise by about 0.49 percentage points, holding all other variables constant.
  • Interpretation:  For every one-percentage-point increase in CF%, future GF% is expected to rise by about 0.21 percentage points, holding all other variables constant.

Interestingly, the magnitude of the SCF% coefficient increased for forwards, indicating that SCF% is a better predictor of future GF% for forwards than it is for defensemen in this analysis.

Note that we also repeated this analysis using FF% instead of CF%, but FF% was found to have an insignificant effect on future GF% when accounting for SCF% in the model (results not shown).  This is very interesting:  CF% was found to be significant even after accounting for SCF%, while FF% was not.  This may indicate that the additional information included in CF% — blocked shots, for and against — is driving some of the metric’s predictability of future GF%.  Our hypothesis is that blocked shots against (i.e. shot attempts taken by the opposition at the player’s goal) are driving the effect here, since forwards do a lot of shot-blocking at the points in the defensive zone.

 

 

For defensemen, we considered using past SCF%, past CF%, and past FF% together as explanatory variables in a multiple linear regression model of future GF% across all seasons.  But since CF% and FF% are highly collinear, we opted to do two separate two-explanatory-variable regressions and examine the results:  (1), future GF% given past SCF% and past CF%, and (2), future GF% given past SCF% and past FF%.

The results of the first model, which models future GF% given past SCF% and past CF%, are summarized here:

  • Past CF%:  Coefficient = 0.2230, p-value = 0.02705.  Note, this coefficient is almost identical to what was found in the all-positions regression from our previous post (0.2227).
  • Past SCF%:  Coefficient = 0.3114, p-value = 0.00282.  Note, this coefficient is decreased from what was found in the all-positions regression from our previous post (0.4067).
  • Both SCF% and CF% have statistically significant associations with future GF%.
  • Interpretation:  For every one-percentage-point increase in SCF%, future GF% is expected to rise by about 0.31 percentage points, holding all other variables constant.
  • Interpretation:  For every one-percentage-point increase in CF%, future GF% is expected to rise by about 0.22 percentage points, holding all other variables constant.

In other words, SCF% has a more substantial effect* on future GF% than does CF% in this analysis.

The results of the second regression model, which models future GF% given past SCF% and past FF%, are summarized here:

  • Past FF%:  Coefficient = 0.2993, p-value = 0.00417
  • Past SCF%:  Coefficient = 0.2495, p-value = 0.01713.  Note, this coefficient is decreased from what was found in the all-positions regression from our previous post (0.4067).
  • Both SCF% and FF% have statistically significant associations with future GF%.
  • Interpretation:  For every one-percentage-point increase in SCF%, future GF% is expected to rise by about 0.25 percentage points, holding all other variables constant.
  • Interpretation:  For every one-percentage-point increase in FF%, future GF% is expected to rise by about 0.30 percentage points, holding all other variables constant.

Interestingly, SCF% is less predictive of future GF% for defensemen than it is for forwards, and FF% is the superior predictor of future GF% for defensemen by a small margin in this analysis.

 

Season-to-Season Past-vs-Future Correlation Plot, Forwards:

[Image: season-to-season past-vs-future correlations with future GF%, forwards]

 

Season-to-season, the metric with the highest past-vs-future correlation for forwards varies.  Recall that across all seasons, SCF% has the highest past-vs-future correlation.  This seems to be backed up by the graph, where SCF% is highest by a relatively large margin in 4 of 9 seasons.

 

Season-to-Season Past-vs-Future Correlation Plot, Defensemen:

[Image: season-to-season past-vs-future correlations with future GF%, defensemen]

Season-to-season, the metric with the highest past-vs-future correlation for defensemen varies quite a bit.  Recall, though, that across all seasons, FF% has the highest past-vs-future correlation.  This seems to be backed up by the graph, where FF% appears to be a bit more consistent from season to season than other metrics.

 

Notes:

  • *Since SCF%, CF%, and FF% can all be approximated with a Normal(mean = 50%, standard deviation = 10%) distribution — that is, they are on the same scale — we can directly compare the magnitude of their regression coefficients.
  • Similar to our last post, one thing we found interesting is that in these multiple linear regression models across all seasons, both SCF% and CF% / FF% (whichever was used) had significant coefficients.  In these analyses, it’s common for only one independent variable to explain most of the variance in the dependent variable (due to collinearity).  The fact that both are significant indicates that SCF% and CF% / FF% are accounting for (at least slightly) different parts of the variance in future GF%.  Predictive models of player performance would do well to include both metrics, regardless of player position.
  • Is future GF% the metric we should be using to evaluate forwards?  Defensemen?  If not, what should we use?  We used future GF% since that seems to be the standard in predicting future player performance.  That said, we’re open to suggestions here, and we’ll happily update our analyses depending on what the community thinks.
  • We removed Brent Burns and Dustin Byfuglien from these analyses, since their positions changed from season to season.  If there are other players who we should remove, please let us know!
Better Than Corsi: Scoring Chances More Accurately Predict Future Goals For Players
http://blog.war-on-ice.com/scoring-chances-better-than-corsi/ | Tue, 06 Jan 2015

Summary:

Since the 2005-06 NHL season, the percentage of Scoring Chances For (SCF%) is a better predictor of future Goals For (GF%) than Corsi For (CF%) is for individual players.  (Under specific conditions, of course, but it’s promising either way.)

Combining data across all seasons since 2005-06, the season-to-season correlations are:

  1. cor(Past SCF%, Future GF%) = 0.322
  2. cor(Past CF%, Future GF%) = 0.311
  3. cor(Past GF%, Future GF%) = 0.287

Using multiple linear regression and combining across all seasons, we find that:

  • For every one-percentage-point increase in SCF%, future GF% is expected to rise by about 0.41 percentage points, holding all other variables constant.
  • For every one-percentage-point increase in CF%, future GF% is expected to rise by about 0.22 percentage points, holding all other variables constant.

Glossary:

  • SCF%:  The percent of all on-ice “scoring chances” that were for a player’s team.
  • Scoring chances:  See our post defining this.
  • GF%:  The percentage of all on-ice goals that were scored by the player’s team.
  • CF%:  The percentage of all on-ice shot attempts (on goal, missed, or blocked) that were taken by the player’s team.
  • SA:  “Score-adjusted” — these statistics adjusted for score situation, home/away advantage, and rink scorer bias.

Our definition of what constitutes a “scoring chance” came about through discussions with people in the community first, because it’s a fairly subjective term.  It’s clear that the community wants something with more definition than distance-based measures like our three danger zones, something that captures more about both scoring probabilities and in-game opportunities.  Here we show that this definition has predictive advantages over other commonly used measures, even in their score-adjusted states.

 

Data:

  • This war-on-ice.com table, which has all the data we need.
  • Why start in 2005-06?  Because before that, data on shot location and missed/blocked shots were not collected, rendering our definition of scoring chances (see above) useless.
  • Why use score-adjusted measures?  Because these were found to increase out-of-sample predictive accuracy (results not shown here).  Score-adjusted measures are used for SCF%, GF%, and CF%, ensuring a fair comparison of these metrics.
  • Why “divide data by season”?  Because the goal of this post is to determine which metric is best at predicting future outcomes.  In other words, we’ll use players’ SCF%, GF%, and CF% data from one season to predict their GF% in the following season.  Note:  We also plan to do “in-season” predictions as well, similar to what Micah Blake McCurdy did here.
  • Why require at least 500 minutes of time-on-ice per player?  Because players who don’t play much in one season are more likely to have skewed metrics that aren’t representative of their true ability.  In other words, because small samples.  Note:  A better approach for this would be to use regularization in a formal statistical model, but in the interest of laziness, using a min-TOI will do.
  • Why use both home and away data?  Because any home advantages and rink count biases are taken into account using our implementation of score-adjusted measures.
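The season-pairing and min-TOI filtering described above can be sketched in a few lines. This is a minimal Python/pandas sketch, not the original R code; the column names and the second-season rows are assumptions made up for illustration (the first-season values are taken from the sample matrix in the Methods section below).

```python
import pandas as pd

# Hypothetical per-player, per-season table; column names are assumptions,
# and the 2014-15 CF%/SCF% values are made up for illustration.
df = pd.DataFrame({
    "name":    ["Jake.Muzzin", "Jake.Muzzin", "Drew.Doughty", "Drew.Doughty"],
    "season":  [20132014, 20142015, 20132014, 20142015],
    "toi":     [1100, 1200, 1500, 1400],   # minutes of time on ice
    "cf_pct":  [61.8, 58.0, 59.3, 57.5],
    "scf_pct": [60.5, 59.0, 57.0, 56.0],
    "gf_pct":  [57.7, 51.2, 58.5, 54.4],
})

# Keep only player-seasons with at least 500 minutes of TOI.
df = df[df["toi"] >= 500]

# Pair each season with the same player's following season:
# shifting the season key back by one year (20142015 -> 20132014)
# lets a self-merge attach next season's GF% as "future GF%".
nxt = df.assign(season=df["season"] - 10001)
paired = df.merge(nxt[["name", "season", "gf_pct"]],
                  on=["name", "season"], suffixes=("", "_future"))
print(paired[["name", "season", "scf_pct", "gf_pct", "gf_pct_future"]])
```

Each row of `paired` then holds one season's metrics alongside the following season's GF%, which is exactly the shape of matrix shown below.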

 

Methods:

Using the data described above, we iterated through each season, finding players who played at least 500 minutes in that season and the following season.  This gave us a matrix that looks something like this (except, for all players, not just the handful listed here):

Name                past CF%  past GF%  past SCF%  future GF%
Jake.Muzzin         61.8      57.7      60.5       51.2
Marc-Edouard.Vlasic 59.6      60.0      62.5       57.8
Drew.Doughty        59.3      58.5      57.0       54.4
Justin.Williams     61.4      58.1      60.2       55.6
Brent.Seabrook      58.2      56.2      58.2       56.4
Duncan.Keith        57.9      56.7      58.2       64.7

Then, we did two things:

  1. Past-vs-Future Correlations:  For each season, we found the correlation between the following season’s GF% and each of the current season’s SCF%, CF%, and GF%.  (Actually, we did it for Fenwick% and a bunch of other metrics too, but these were not as good at predicting future GF%.)
  2. Future | Past Regression Models:  We used the current season’s SCF%, CF%, and GF% as explanatory/predictor variables and next season’s GF% as the response variable.  Note:  Even though 0 < GF% < 1, we used linear regression, since the GF%s are roughly distributed as Normal(mean = 50, standard deviation = 10).  In a more formal analysis, we might instead use something like Beta regression.
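As a concrete sketch of both steps, here is a Python version using only the six players from the sample matrix above (the real analysis used every qualifying player and was done in R, so with six rows these particular numbers are meaningless — the point is the mechanics):

```python
import numpy as np

# The six sample rows from the matrix above.
past_cf  = np.array([61.8, 59.6, 59.3, 61.4, 58.2, 57.9])
past_scf = np.array([60.5, 62.5, 57.0, 60.2, 58.2, 58.2])
fut_gf   = np.array([51.2, 57.8, 54.4, 55.6, 56.4, 64.7])

# 1. Past-vs-future correlation (here: past SCF% with future GF%).
r_scf = np.corrcoef(past_scf, fut_gf)[0, 1]

# 2. Multiple linear regression: future GF% ~ past SCF% + past CF%,
#    fit by ordinary least squares with an intercept column.
X = np.column_stack([np.ones_like(past_scf), past_scf, past_cf])
coefs, *_ = np.linalg.lstsq(X, fut_gf, rcond=None)
intercept, b_scf, b_cf = coefs
print(r_scf, b_scf, b_cf)
```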

 

Results — Past-vs-Future Correlations:

[Figure: past-vs-future correlations with future GF% for SCF%, CF%, and GF%, by season]

Above is a graph of the past-vs-future correlations of each metric with GF% over time.  From this, we make the following observations:

  • In 6 of 9 seasons, SCF% (blue) has a higher past-vs-future correlation with GF% than does CF% (red).
  • In 2 of 9 seasons, CF% has a higher past-vs-future correlation with GF% than does SCF%.
  • In 1 of 9 seasons, the two are roughly equal.
  • Immediately after the lockout, past GF% was a great predictor of future GF%.  We’re not sure why this is the case, but we’re open to others’ explanations!
  • The 2010-11 to 2011-12 season transition was strange.  The predictive accuracy of all three metrics is reduced, most substantially that of SCF%.  We’re not sure what happened here, but we’re again open to others’ explanations.

As mentioned in the intro, these results hold when combining data across all seasons since 2005-06, where the season-to-season correlations are:

  1. cor(Past SCF%, Future GF%) = 0.322
  2. cor(Past CF%, Future GF%) = 0.311
  3. cor(Past GF%, Future GF%) = 0.287

In other words, combining across all seasons, SCF% is more highly correlated with future GF% than is CF%.

 

Results — Future | Past Regression Models:

First, we used past SCF% and past CF% as explanatory variables in separate, univariate linear regression models of future GF% (one model for each season and explanatory variable).  Not surprisingly, these results were nearly identical to the past-vs-future correlations:

  • In 6 of 9 seasons, SCF% has a lower p-value than does CF% (both coefficients are positive).
  • In 2 of 9 seasons, CF% has a lower p-value than does SCF% (both coefficients are positive).

Second, we used both past SCF% and past CF% as explanatory variables in a multiple linear regression model of future GF% across all seasons.  The results of this regression are summarized here:

  • Past CF%:  Coefficient = 0.22271, p-value = 0.00101
  • Past SCF%:  Coefficient = 0.40665, p-value < 0.00000001
  • This means that both SCF% and CF% have statistically significant associations with future GF%.
  • Interpretation:  For every one-percentage-point increase in SCF%, future GF% is expected to rise by about 0.41 percentage points, holding all other variables constant.
  • Interpretation:  For every one-percentage-point increase in CF%, future GF% is expected to rise by about 0.22 percentage points, holding all other variables constant.

In other words, SCF% has a more substantial effect* on future GF% than does CF%.

*Note that since SCF% and CF% can both be approximated with a Normal(mean = 50%, standard deviation = 10%) distribution — that is, they are on the same scale — we can directly compare the magnitude of their regression coefficients.
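To make that comparison concrete, here is a back-of-the-envelope use of the reported coefficients for a hypothetical player who improves by three percentage points in both metrics (this ignores the intercept and all uncertainty in the estimates):

```python
# Coefficients from the across-seasons regression reported above.
b_scf, b_cf = 0.40665, 0.22271

# Hypothetical player: +3 percentage points in both SCF% and CF%.
delta_scf = delta_cf = 3.0
delta_gf = b_scf * delta_scf + b_cf * delta_cf
print(round(delta_gf, 2))  # -> 1.89
```

Most of that predicted gain of roughly 1.9 points of future GF% comes through the SCF% term.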

Author’s note:  One thing we found interesting is that in the across-season regression, both CF% and SCF% had significant coefficients.  In these analyses, it’s common for only one independent variable to explain most of the variance in the dependent variable (due to collinearity).  The fact that both are significant indicates that CF% and SCF% are accounting for (at least slightly) different parts of the variance in future GF%.  Predictive models of player performance would do well to include both metrics.

 

Future Work:

  1. Cross-validation
  2. Within-season predictions
  3. Repeat for teams

 

Appendix:

Below is the R script used for this analysis.  Feel free to try it out and add your own analyses.

[Download: nhl-past-vs-future (R script)]

Density Plots for Modern Hockey Statistics (Warning: There’s Math, But It’s Useful Math) http://blog.war-on-ice.com/density-plots-for-modern-hockey-statistics/ http://blog.war-on-ice.com/density-plots-for-modern-hockey-statistics/#comments Wed, 12 Nov 2014 22:08:53 +0000 http://blog.war-on-ice.com/?p=196 At #PGHAnalytics on Saturday, there was a short discussion about uncertainty in metrics such as Corsi% and Fenwick%.  How can we quantify this uncertainty / variability?  The simplest way to do this would be to include standard errors with each player rating such as Corsi% or Fenwick%, which is a good start.  What else can we do?

Suppose we told you that you could choose between two hypothetical players, and the only pieces of information we gave you about them were their respective 5-on-5 Close Corsi%s from the first 10 games of the season:

Player A:  90%, 70%, 30%, 33%, 50%, 75%, 25%, 80%, 90%, 22%

Player B:  55%, 60%, 44%, 55%, 58%, 63%, 55%, 66%, 45%, 66%

Which would you choose?  Why?

After the jump, we introduce a graphical approach to comparing pairs of players, looking at the distribution of their single-game Corsi%s, Fenwick%s, and much more.

Player A is the more up-and-down player, turning in some great performances and some poor ones.  Player B is the more consistent player, turning in performances that are usually above-average, and at worst very close to average.  (Both players are fictitious, but similar types can easily exist.)

We’re not going to tell you which player you should prefer, since there are clearly pros and cons to both.  Instead, we provide a tool to help you compare players: a smoothed density plot of game-by-game outcomes to give a sense of what you can expect from them in future games.  For example, we can show that Patrice Bergeron is highly likely to give a better 5-on-5 Corsi% than Tanner Glass:

[Figure: density plots of single-game 5-on-5 Corsi% for Patrice Bergeron vs. Tanner Glass]

What The Heck Is A Density Plot?

For those who haven’t seen these before, a density plot is essentially a histogram drawn on a continuous scale:  It shows the distribution of values that a given variable (e.g. Corsi%) takes, along with how likely/common each particular value is.

The x-axis shows the range of possible values of the given variable (in the above graph, Corsi% at 5-on-5 since 10/1/2013).  The y-axis shows the “density” of the distribution of observed values of that variable.  Higher densities (y-axis) mean that the corresponding value of the given variable (x-axis) is more likely; lower densities mean that the corresponding value of the given variable is less likely.

In the above plot, Patrice Bergeron’s Corsi% distribution (black curve) has a large mode around 60% (in other words, really good), meaning it’s common for him to have Corsi%s around 60% in a single game.  Bergeron’s density curve drops off around 40% and 80%, meaning it’s fairly uncommon for him to have Corsi%s below 40% or above 80%.

Tanner Glass’s Corsi% distribution (red curve) has a mode around 35% (in other words, really bad).  His distribution is much wider, indicating that his Corsi% is much more variable than Bergeron’s.  As @senstats pointed out on Twitter, this is likely because Glass typically has a low 5-on-5 TOI in any particular game, which will make these numbers more variable, so we have to note these things when picking our comparable players.

OK, But What Does The Density Curve Actually Mean, and How Did You Get It?  

The curve represents what’s called a “kernel density estimate”.  We use it to estimate what the data would look like if we had much more of it available, but also — and most importantly for our purposes here — to smooth out the data to make it easier to visualize.  A histogram does the same thing — it smooths the data by counting it in blocks/bins — but it can be hard to compare two histograms at once, and nearly impossible to compare three or more.

A player’s density curve for a given statistic shows an estimate of the probability density function for that player/stat.  The main use of this is to estimate a probability that something will happen in a future game if little changes.  For example, the probability that Patrice Bergeron will have a Corsi% greater than 50 is the total area under the curve where that’s the case:

P(Bergeron’s Single-Game Corsi% > 50) = [area under Bergeron's curve where the x-axis > 50]
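As an illustration of that area calculation, here is a sketch using the two hypothetical players from the introduction. This uses scipy’s Gaussian KDE rather than R’s density function, so the bandwidth (and therefore the exact probabilities) will differ slightly from what our site computes.

```python
from scipy.stats import gaussian_kde

# Single-game Corsi%s for hypothetical Players A and B from the intro.
player_a = [90, 70, 30, 33, 50, 75, 25, 80, 90, 22]
player_b = [55, 60, 44, 55, 58, 63, 55, 66, 45, 66]

kde_a = gaussian_kde(player_a)
kde_b = gaussian_kde(player_b)

# P(single-game Corsi% > 50) = area under the density curve from 50 to 100.
# (The smoothed curve leaks slightly past 100, which a percentage cannot;
# this is where fixing the density to [0, 100], as we do on the site, helps.)
p_a = kde_a.integrate_box_1d(50, 100)
p_b = kde_b.integrate_box_1d(50, 100)
print(round(p_a, 2), round(p_b, 2))
```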

Some additional details:

  • See R’s help file for the density function for more information.
  • In the above link, go to “Details” for more on kernel density estimation.
  • We fix the density to be between 0 and 100 for percentages like Corsi% and Fenwick%.
  • Otherwise, we used the default settings for all players/stats.
  • We intended these to be used for continuous-valued stats only (e.g. Corsi%, OnIceSh%, etc.), but we’ll update this later on with similar plots for discrete-valued stats (e.g. # of shots, +/-).  The primary purpose is to make the full distribution of the data easier to see, not to reassemble the raw data.

In the statistical programming language R, which we used to build WAR On Ice, we used the density function to calculate the kernel density estimate for each player/stat.

How Can I Make My Own Density Plots?

On war-on-ice.com, you can create these graphs for yourself for any variable, any situation, and any pair of players:

  1. Under the Players drop-down menu, select Skater History and click the By Game tab.
  2. Select your home/away situation, score situation, man-strength situation, etc.
  3. Enter the first player’s name in the “Filter Player” textbox at the top of the page.
  4. Scroll down below the table and enter the second player’s name in the “Filter Players” textbox.
  5. Choose which variable you want to use to compare your players.

Two graphs should show up:  First, the moving average graph of the chosen variable over time for both players.  Second, the density plot comparing the distribution of the chosen variable for both players.

Additional Examples

[Figure: density plots of QoC-TOI for Sidney Crosby vs. Craig Adams]

Above:  QoC-TOI since 10/1/2013 at 5-on-5 for Sidney Crosby and Craig Adams.  As expected, Crosby consistently faces much tougher competition, since there isn’t much overlap in these two distributions.

[Figure: density plots of Offensive Zone Start% for Evgeni Malkin vs. Brandon Sutter]

Above:  Offensive Zone Start% since 10/1/2013 at 5-on-5 for Evgeni Malkin vs. Brandon Sutter.  Dan Bylsma typically used Malkin heavily in the offensive zone, as evidenced by the mode in his density estimate around 70%.  Brandon Sutter is typically used in a more defensive role, since a large mass of his distribution is below 50%.

[Figure: density plots of Corsi% Rel for Kadri vs. Bozak]

Above:  Finally, the above graph shows Kadri vs. Bozak in Corsi% Rel at 5-on-5 since 10/1/2013.  What’s interesting here is Bozak’s seemingly bimodal distribution:  One mode is above zero and one is below zero, indicating that he has some good games and some bad ones.  Another interpretation is that he is sometimes used in more defensive roles and sometimes in more offensive ones (since usage can impact Corsi% Rel).

Make some player comparisons with density plots of your own, and let us know what you think!
