Wednesday, July 22, 2009

Statistical Analysis in the NHL

By Glen Miller

During the first season after the lockout, which was also the first under the new CBA, I had occasion to take the inaugural NHL Scouting and Management course offered through Sports Management Worldwide (SMWW). It was definitely interesting and well worth the price of admission. We had weekly chats moderated by ESPN’s E.J. Hradek that also included Barry Madigan, who runs a highly respected Goaltenders School and has worked in hockey as both a player agent and in the front offices of several professional teams.

It was during one of these live chats that E.J. asked the group about our thoughts on the potential of statistical analysis in hockey. I thought long and hard about it and realized that I hadn’t seen much out there at the time. I like to think of myself as a bright guy and I am pretty good at math so every so often I would find myself jotting down stats and formulas and trying to think of something that would revolutionize statistical analysis in the NHL. Of course, as smart as I think I am, I’m evidently not smart enough to formulate an equation that can properly assess the true value and potential of an NHL player.

For those of you who may not be familiar with the principles of statistical analysis, I’ll try to give you a crash course. I suppose that as Sir Isaac Newton would be considered the father of modern physics, Bill James would be considered the father of statistical analysis in sports. James has written over two dozen books dedicated to baseball history and statistics. He coined the term “sabermetrics” in reference to the Society for American Baseball Research (SABR), of which he is a member. SABR’s stated goal is to “foster the research and dissemination of the history and record of baseball, while generating interest in the game”. Some of the more popular members engage in number crunching and data analysis. Think tanks filled with “sabermetricians” have created complicated formulas used to determine things like an individual player’s “VORP” (Value Over Replacement-Level Player). Bill James himself has created a formula to determine a player’s “Win Shares”, or how many of his team’s wins that player is responsible for.

Statistical analysis became a staple of Oakland A’s management when the team hired Sandy Alderson, a Dartmouth graduate, former Marine and Harvard Law alumnus, to be GM in 1983. Alderson had no tangible background in player personnel, so he began using statistical analysis as a tool to evaluate players. Alderson mentored current A’s GM Billy Beane, who has continued to advance the statistical analysis movement by graduating several of his front office employees to GM jobs throughout baseball. Beane and the way that the A’s manage the player personnel side of the business even inspired a book by Michael Lewis called “Moneyball”. There was even talk earlier this year of a “Moneyball” movie, rumored to star Brad Pitt.

Baseball sabermetricians tend to value certain statistics like On Base Percentage and Slugging Percentage over other, more traditional baseball statistics like Batting Average or RBIs. They spend countless hours researching the math behind on-field strategies and try to determine why teams win. For example, it has been determined that batting the pitcher in the 8th spot in the lineup instead of the more traditional 9th spot results in that team scoring more runs over the course of the season than it would have if the pitcher batted 9th, and thus in more wins.

If some enterprising and bright person could develop statistics that better evaluate players and their actual value to a team, it would be a huge step for NHL managers. Of course, baseball has a 30-year head start on hockey in the use of statistical analysis. I decided to take a quick look in cyberspace to see what, if anything, hockey “sabermetricians” were working on. Believe it or not, there are people dedicated to this activity, and more of them than I expected.

One difficulty facing hockey sabermetricians is the set of fundamental differences between the two sports. The biggest difference between hockey and baseball is that hockey is much more a team game than an individual sport. Even individual statistics are greatly affected by things that your teammates do. A prime example of a stat that can be directly affected by the actions or inactions of a teammate is the +/- rating. Let’s say that Player A (Josh Bailey) is just coming onto the ice to replace Player B (Doug Weight). Player A (Bailey) is on the ice but is still several feet away from the play when Player C (we’ll say Bruno Gervais for argument’s sake) coughs the puck up to, say, Marian Gaborik, who then fakes Rick DiPietro out of his skates for an easy goal. If this scenario had occurred at even strength, then Bailey would get hung with a minus. That isn’t exactly fair, since it was Gervais who coughed the puck up and DiPietro who looked foolish trying to stop Gaborik. Bailey wasn’t even on the ice long enough to affect the play in any way, and yet he has a negative statistic next to his name in the box score.
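To make the bookkeeping in that scenario concrete, here is a minimal Python sketch of how an even-strength goal moves the +/- column. The function name and the short player lists are my own illustrative assumptions, not anything taken from an official scoring system, and only the skaters named in the scenario above are included.

```python
from collections import Counter

def apply_even_strength_goal(ratings, scoring_skaters, conceding_skaters):
    """Update +/- for one even-strength goal (hypothetical helper).

    Every skater on the ice for the scoring team gets +1 and every
    skater on the ice for the conceding team gets -1, regardless of
    who actually caused the goal.
    """
    for name in scoring_skaters:
        ratings[name] += 1
    for name in conceding_skaters:
        ratings[name] -= 1
    return ratings

# Scenario from the text: Bailey has just replaced Weight when Gervais
# turns the puck over and Gaborik scores.
ratings = Counter()
apply_even_strength_goal(
    ratings,
    scoring_skaters=["Gaborik"],
    conceding_skaters=["Bailey", "Gervais"],
)

for name in ["Bailey", "Gervais", "Weight", "Gaborik"]:
    print(name, ratings[name])
```

Bailey and Gervais end up with the same minus even though only one of them touched the puck, while Weight, who had already left the ice, is unaffected. That is exactly the unfairness described above.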

Baseball, on the other hand, is a game full of one-on-one match-ups between the pitcher and the batter. Especially on offense, a baseball player’s stats are largely earned by the individual player with little or no help from his teammates. In hockey it is rare for a player to record any stat without a teammate’s help. That’s why players earn points for assists. Hockey sabermetricians already start at a bit of a disadvantage because of this.

However, if you do a little research online, you can find some sites where smart people are actually trying to come up with the right stats. One site is Behind the Net.com. It lists several “Advanced Statistics” such as “On/Off-Ice +/-” and “Strength of Opponents” to help quantify how effective an individual player is. The On/Off-Ice +/- compares a specific player’s +/- rating with that of the entire team. For example, if a player is a -10 (he has been on the ice at even strength for 10 more goals against than goals for) and the entire team is a combined -15, then the player’s On/Off-Ice +/- rating is +5. This helps remove the quality, or lack thereof, of the team from the individual player’s +/- rating.
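The arithmetic in that example is simple enough to sketch in a few lines of Python. This follows only the simplified description given here (player rating minus team rating); the function name is my own, and the site’s actual calculation may well differ, for instance by adjusting for ice time.

```python
def on_off_plus_minus(player_plus_minus: int, team_plus_minus: int) -> int:
    """Simplified On/Off-Ice +/- as described in this post.

    Assumption: the rating is just the player's +/- minus the team's
    combined +/-. This is an illustration, not the site's own formula.
    """
    return player_plus_minus - team_plus_minus

# Example from the text: a -10 player on a team that is a combined -15.
rating = on_off_plus_minus(-10, -15)
print(f"{rating:+d}")
```

A positive result means the player’s +/- is better than the team’s combined number, which is the point of the stat: a minus player on a bad team can still grade out well.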

Alan Ryder has a website called HockeyAnalytics.com and has copyrighted a method he calls “Player Contribution” (PC). According to the website, “PC is a method for allocating credit for a team’s performance to the individual contributor’s on a hockey team. More precisely, it is a way of allocating a team’s wins to individual players.” Ryder cites the work of Bill James and James’ “Win Shares” statistic for baseball as the “inspiration for my work and his methodology has been adapted here.” Ryder attempts to quantify the performances of offense, defense and goaltending into one statistic that can be used evenly across the board.

Puck Prospectus is a spin-off of the popular Baseball Prospectus. Baseball Prospectus is basically a think tank of baseball “sabermetricians” and Puck Prospectus aspires to be the same for hockey. In fact, on July 20th Puck Prospectus introduced a statistical projection system referred to by the acronym “VUKOTA”. Like Baseball Prospectus’ statistical projection system, PECOTA, VUKOTA is also named after a former player in the sport. I’m sure I don’t have to explain to Islanders fans who Mick Vukota was.

As defined by Puck Prospectus, VUKOTA utilizes a combination of fundamental and advanced statistics to compare current NHL players with similar players throughout history to project each player’s stat line next season in terms of goals, assists and GVT (Goals Versus Threshold). It will be interesting to see how the first VUKOTA projections compare to the players’ actual stats in the upcoming season.

GVT is a stat that blends a wide array of offensive and defensive statistics to determine a player’s value (in terms of goals) versus the value of a “marginal” (minor league) player. This stat is based on the same principle as Baseball Prospectus’ VORP (Value Over Replacement-Level Player).

While statistical analysis in hockey is not nearly as advanced as it is in baseball, hockey guys are making headway creating our own stats to help quantify the value of players. This appears to be a growing movement, and the popularity of statistical analysis in baseball will likely inspire more smart mathematicians to try their hands at hockey stats. Hopefully, NHL organizations are looking into statistical analysis as a viable source of player information, and further down the road authors will be competing to write the story of “Moneypuck”.

8 comments:

Mess11 said...

Wow, I never thought about it that way. What a great write-up Glen! I never really read anything of this sort, so it was cool to think of hockey from a more statistical angle. Great read!

DevsFan said...

I have to agree with Mess! This was an awesome write-up Glen, definitely a whole new perspective. We are constantly talking about player movement and game recaps, so much that we forget about things like stats that matter. Great write-up!

Glen Miller said...

Truthfully I hadn't heard of a lot of this either. I do remember just before free agency that EJ Hradek and a guy from Puck Prospectus did a write-up on what every team needed in FA and the perfect solution player-wise for that team. That's where I first heard of GVT but I still have no idea what the calculation is. There is a lot of potential with statistical analysis though. If I ever were an NHL GM you can bet I'd be researching some of this and utilizing it.

Rob M. said...

Glen,

what an interesting read. I have never read something like this before, but this is great. Keep up the great work!

-Rob M

Andy Strickland said...

Great read, Glen. You are a very intelligent writer, and this is an awesome piece of work. I too believe that statistics in the NHL need to be tweaked, and it can really help us out greatly in augmenting the knowledge of our players and our game. Take care!

-Andy

Glen Miller said...

Is this the same Andy Strickland that writes for HockeyBuzz?

Glen Miller said...

Regardless, I appreciate the compliment. Usually I can pound out a pretty good piece in a couple of hours but this one took me over a week and I had Justin critique it in the beginning too. I wish I was better in math so I could crunch some numbers and figure out how to get Sather to not spend too much money in UFA!

Nav said...

Nice writeup. There's going to have to be so much more advancement in the publicly available data before anyone can make many further advancements than they have already. Of course, hockey is like baseball in that goal creation and goal prevention are what win games in hockey, much like run creation and run prevention do in baseball.

In baseball, it's easy to quantify how to create runs (hit) and prevent them (defend and pitch). In hockey, you have to listen to people talk about 'grit', 'character', 'makeup', 'desire', etc. that are in the eye of the commentator.

When I think about hockey, there are so many goals that are scored as a result of luck. For example, think about an offensive zone faceoff win that leads to a point shot with a lot of traffic in front of the net. The goalie sees nothing, the puck hits a stick and two skate blades, and goes in. Is this a repeatable skill? Surely, shooting the puck is, but after the shot has been fired, all kinds of things happen that are not within the control of the shooter. Even the last offensive player whose skate blade the puck hits will be credited with a goal, and all he did was jockey for position in front of the net with a defenseman. He may have had no idea that a shot was coming.

So, even goals are not as easy to work with as one may think. The theme of sabermetrics is to isolate the performance of the individual, because only then will we be able to correctly evaluate him. This is what led to statistics such as FIP, tRA, UZR, and such.

Anyway, I'm interested to see where this ends up.