The Great Data Debate of Oct 22

Because this is what Hockey Twitter is for: finding little discrepancies. Following up on a @Mirtle quote about Dion Phaneuf’s Corsi For Percent, we noticed a discrepancy in raw counts between our site and Hockey Analysis, which only got weirder when we compared against HockeyStats.ca and NaturalStatTrick as well: no two sites seemed to agree completely.

Here’s what I found after going through the numbers with tweezers, particularly for last night’s game between TOR and NYI, and in further conversation:

1) Hockey Analysis uses the shift charts to determine who was on the ice; the rest of us take it from the on-ice players listed in the PL play-by-play file. The two sources sometimes disagree, for reasons I wish I understood better, and there’s no good way to know which to trust first.

2) Hockey Analysis and our site treat all events with an empty net as separate from standard 5v5; the others do not. For example, we have Dion Phaneuf at 4/4 SF/SA for the TOR/NYI game, whereas HockeyStats.ca and NaturalStatTrick have him at 4/5. The difference is a pre-penalty shot at 15:39 of the 3rd period.
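
To make the second point concrete, here is a minimal Python sketch of how one empty-net rule can move a player's count from 4/5 to 4/4. This is not any site's actual code: the `ShotEvent` record, its `empty_net` flag, and the `shot_counts` helper are hypothetical names. The event times and shooting teams are invented for illustration, except the 15:39 third-period shot and the resulting totals, which come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShotEvent:
    """One 5v5 shot taken while the player of interest was on the ice (hypothetical schema)."""
    period: int
    clock: str          # time elapsed in the period, "MM:SS"
    shooting_team: str
    empty_net: bool     # True if either net was empty at the time of the shot


def shot_counts(events, team, exclude_empty_net):
    """Return (shots_for, shots_against) from the point of view of `team`."""
    shots_for = shots_against = 0
    for ev in events:
        # Sites that treat empty-net play as its own state skip these events.
        if exclude_empty_net and ev.empty_net:
            continue
        if ev.shooting_team == team:
            shots_for += 1
        else:
            shots_against += 1
    return shots_for, shots_against


# Toy data: every row is invented except the final one, whose time (15:39 of
# the 3rd period) is the shot mentioned in the text.
events = [
    ShotEvent(1, "03:10", "TOR", False),
    ShotEvent(1, "11:45", "NYI", False),
    ShotEvent(2, "02:20", "TOR", False),
    ShotEvent(2, "08:05", "NYI", False),
    ShotEvent(2, "17:30", "NYI", False),
    ShotEvent(3, "04:50", "TOR", False),
    ShotEvent(3, "09:15", "TOR", False),
    ShotEvent(3, "12:00", "NYI", False),
    ShotEvent(3, "15:39", "NYI", True),
]

strict = shot_counts(events, "TOR", exclude_empty_net=True)
loose = shot_counts(events, "TOR", exclude_empty_net=False)
print("empty-net events excluded:", strict)
print("empty-net events included:", loose)
```

With the flag set, the tally matches the 4/4 we and Hockey Analysis report; without it, the 4/5 the other two sites show. The sketch only covers the state-definition rule from point 2. The shift-chart versus play-by-play question in point 1 is a separate source of disagreement that it does not model.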

So, one minor mystery solved. Many more to go!