The Legend of the Most Dangerous Hockey Lead: Truth or Fiction? Web Scraping with Pentaho Data Integration

Tuesday, July 7, 2015 - 03:30

Now that the Stanley Cup is tightly grasped by its rightful victors, the Chicago Blackhawks, fans can look back on the 2014-2015 season and reflect. My reflection stems from an old hockey adage: the two-goal lead is the most dangerous lead. The two-goal lead has always sent chills down the spines of fans, announcers and players alike. With the leading team prone to complacency and the game moving at unmatched speed, a two-goal deficit surely must be the easiest to overcome, right?

But let's step away from sports superstition and conventional hockey wisdom and look at the facts. How can it be easier to triumph when you are two goals in the hole instead of just one? After analyzing the scores of NHL games since the 2012 lockout, I found that the numbers challenge this piece of the hockey bible.

Calculating a Lead Is Not as Easy as It Appears

First, I'd like to define some key terms and assumptions used in this analysis. A lead is the difference between the two teams' scores at any point in the game. A lead change is any point at which the lead equals 0, meaning the score is tied.

The maximum lead is the greatest difference between the two teams' scores between lead changes. A positive maximum lead indicates that the home team holds the lead; a negative value indicates that the away team holds it. Let's take a look at an example:

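| Home Team | Home Score | Away Team | Away Score | Max Lead | Lead Status |
|-----------|------------|-----------|------------|----------|-------------|
| Team A    | 0          | Team B    | 0          | NA       | NA          |
| Team A    | 1          | Team B    | 0          | 1        | NA          |
| Team A    | 2          | Team B    | 0          | 2        | NA          |
| Team A    | 2          | Team B    | 1          | 2        | NA          |
| Team A    | 2          | Team B    | 2          | NA       | Lead Lost   |
| Team A    | 3          | Team B    | 2          | 1        | NA          |
| Team A    | 4          | Team B    | 2          | 2        | NA          |
| Final     | 4          |           | 2          | 2        | Lead Held   |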

In this illustration, Team A scores the first two goals, which puts the home team up 2-0 and the max lead at 2. When Team B scores its first goal, the max lead stays at 2 because, while Team A has conceded a goal, it has yet to lose its lead. Once Team B scores its second goal, however, the score is tied at 2-2, signifying a lost lead: Team A's two-goal lead has been completely nullified. When Team A then scores two more goals and wins the game 4-2, it has amassed a max lead of 2 again, and this time it has held the lead.
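The bookkeeping in this example can be sketched in a few lines of Python. This is only an illustration of the definitions above, not the Pentaho transformation used for the analysis; the function name `summarize_leads` and the input format (a list of running home/away scores after each goal) are assumptions made for the sketch.

```python
from typing import List, Tuple


def summarize_leads(scores: List[Tuple[int, int]]) -> List[Tuple[int, str]]:
    """Return (max_lead, status) for each lead in one game.

    `scores` is the running (home, away) score after each goal, starting
    at (0, 0). A positive max lead belongs to the home team and a negative
    one to the away team, as defined in the text.
    """
    leads = []
    max_lead = 0  # 0 means the game is currently tied
    for home, away in scores:
        diff = home - away
        if diff == 0:
            # Score is tied: any lead that existed has been lost.
            if max_lead != 0:
                leads.append((max_lead, "Lead Lost"))
            max_lead = 0
        elif abs(diff) > abs(max_lead):
            # The lead grew beyond its previous maximum.
            max_lead = diff
    if max_lead != 0:
        # The game ended with a team still ahead: that lead was held.
        leads.append((max_lead, "Lead Held"))
    return leads


# The Team A vs. Team B game from the table above.
example_game = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (3, 2), (4, 2)]
print(summarize_leads(example_game))
```

For the example game this yields two leads, each with a max lead of 2: the first one lost and the second one held.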

Scraping and Shaping Detailed Hockey Data

Using Pentaho Data Integration, I was able to scrape data from a popular hockey website and shape it into an analytical data set. Many thanks to the folks at Hockey-Reference.com, who provided the raw data for this analysis.

The data chosen for this study includes 2,924 games. Within those 2,924 games there are 8,425 lead changes. For simplicity of data collection, games determined by shootout were omitted.

I collected the data, stored it in a MySQL Database, and was able to aggregate and query the data based on the team and the nature of the lead.
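To give a feel for the kind of aggregation involved, here is a minimal Python sketch that computes the percentage of leads held for each lead size. The actual analysis was done with queries against MySQL; the function name `percent_held`, the record layout, and the sample records below are all made up for illustration and are not the real data set.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def percent_held(leads: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Percentage of leads held, keyed by lead size (largest first).

    Each record is (max_lead, was_held). The sign of max_lead marks the
    home (+) or away (-) team, so the absolute value is the lead size.
    """
    totals = defaultdict(int)
    held = defaultdict(int)
    for max_lead, was_held in leads:
        size = abs(max_lead)
        totals[size] += 1
        if was_held:
            held[size] += 1
    return {
        size: 100.0 * held[size] / totals[size]
        for size in sorted(totals, reverse=True)
    }


# Hypothetical records, NOT taken from the real data.
sample_leads = [(2, False), (2, True), (-1, False), (1, False), (-1, True), (3, True)]
for size, pct in percent_held(sample_leads).items():
    print(f"{size}-goal lead held {pct:.3f}% of the time")
```

Filtering the records by the sign of the max lead, or by team, before calling the function would give the home/away and per-team breakdowns in the same way.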

Is the Two Goal Lead More Likely To Result in a Loss?

After digging through the data, I found that a one-goal lead is far more difficult to maintain than a two-goal lead, as the table below shows.

| Max Lead | Overall % Held |
|----------|----------------|
| 8        | 100.000%       |
| 7        | 100.000%       |
| 6        | 100.000%       |
| 5        | 100.000%       |
| 4        | 98.901%        |
| 3        | 92.805%        |
| 2        | 64.962%        |
| 1        | 28.964%        |

As expected, large leads of four or more goals are almost always maintained by the leading team. How many times have you watched a team overcome a five-goal deficit, after all? The results seem indisputable: a two-goal lead is maintained about 65% of the time, whereas a one-goal lead is maintained a mere 29% of the time.

Does Home Ice Advantage Matter?

After crunching the numbers a little further, it also appears that it is slightly easier to maintain a lead as the home team than as the away team. However, the difference is quite small, which supports another piece of hockey wisdom: home ice advantage does not exist.

| Lead | Home % Held | Away % Held |
|------|-------------|-------------|
| 8    | 100.000%    | 100.000%    |
| 7    | 100.000%    | 100.000%    |
| 6    | 100.000%    | 100.000%    |
| 5    | 100.000%    | 100.000%    |
| 4    | 98.784%     | 99.078%     |
| 3    | 95.007%     | 89.641%     |
| 2    | 65.308%     | 64.543%     |
| 1    | 30.373%     | 27.485%     |

While the lead-holding percentages are generally slightly higher for the home team, the difference is not large enough to conclude that there is a significant home ice advantage. Still, the roughly 3 percentage point difference in one-goal lead success could prove to be the marginal difference in tight games over the course of a season.
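The home/away gap can be worked out directly from the percentages reported above. The short Python sketch below simply subtracts the away figure from the home figure for each lead size; the variable names are illustrative, and the numbers are copied from the home and away results in this section.

```python
# Percentage of leads held, by lead size, as reported in this section.
home_pct = {8: 100.000, 7: 100.000, 6: 100.000, 5: 100.000,
            4: 98.784, 3: 95.007, 2: 65.308, 1: 30.373}
away_pct = {8: 100.000, 7: 100.000, 6: 100.000, 5: 100.000,
            4: 99.078, 3: 89.641, 2: 64.543, 1: 27.485}

# Home minus away, in percentage points, rounded to three decimals.
gaps = {lead: round(home_pct[lead] - away_pct[lead], 3) for lead in home_pct}

for lead, gap in gaps.items():
    print(f"{lead}-goal lead: home minus away = {gap:+.3f} points")
```

On these figures the home team's edge is about 2.9 points for one-goal leads and under 1 point for two-goal leads. It is largest for three-goal leads, at about 5.4 points, and slightly negative for four-goal leads.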

Are Some Teams Better Than Others With Holding Their Lead?

Lastly, for the sake of rivalry, I looked at the numbers team by team. Which teams take a lead and proceed to dominate the opposition? And which teams choke most frequently?

The teams with the best and worst percentages of one-goal leads held at home are:

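| Best At Holding 1 Goal Lead at Home | % Held | Worst At Holding 1 Goal Lead at Home | % Held |
|-------------------------------------|--------|--------------------------------------|--------|
| Minnesota Wild                      | 44.59% | Winnipeg Jets                        | 17.92% |
| New York Rangers                    | 43.21% | Dallas Stars                         | 16.67% |
| Pittsburgh Penguins                 | 41.43% | New York Islanders                   | 13.73% |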

The teams with the best and worst percentages of two-goal leads held at home are:

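| Best At Holding 2 Goal Lead at Home | % Held | Worst At Holding 2 Goal Lead at Home | % Held |
|-------------------------------------|--------|--------------------------------------|--------|
| Philadelphia Flyers                 | 88.89% | Toronto Maple Leafs                  | 41.67% |
| Anaheim Ducks                       | 88.24% | New York Islanders                   | 41.46% |
| Los Angeles Kings                   | 86.67% | Arizona Coyotes                      | 25.00% |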

The teams with the best and worst percentages of one-goal leads held while away are:

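| Best At Holding 1 Goal Lead while Away | % Held | Worst At Holding 1 Goal Lead while Away | % Held |
|----------------------------------------|--------|-----------------------------------------|--------|
| New York Islanders                     | 42.11% | Buffalo Sabres                          | 17.74% |
| Anaheim Ducks                          | 37.63% | Arizona Coyotes                         | 13.64% |
| New York Rangers                       | 36.84% | Edmonton Oilers                         | 12.00% |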

The teams with the best and worst percentages of two-goal leads held while away are:

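| Best At Holding 2 Goal Lead while Away | % Held | Worst At Holding 2 Goal Lead while Away | % Held |
|----------------------------------------|--------|-----------------------------------------|--------|
| Carolina Hurricanes                    | 81.48% | New York Islanders                      | 48.39% |
| San Jose Sharks                        | 78.95% | Colorado Avalanche                      | 47.37% |
| St. Louis Blues                        | 78.95% | Boston Bruins                           | 47.22% |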

It is important to note that all of these percentages pertain only to leads. A team with a low one-goal lead holding percentage may still win plenty of games by more than one goal. The percentages offer only a view of a team's performance under the different psychological conditions created by the score at a given time.

A One Goal Lead Is More Dangerous Than A Two Goal Lead

It is possible, however unlikely, that at some point in hockey history there was some truth to the threat of the two-goal lead. But in today's game, the numbers starkly argue the contrary: there is much more to fear if you lead by one than if you lead by two.

 

