# Introducing TWC’s Player Offensive Evaluation Tool 2.0

Evaluating NHL talent takes many forms and is an ever-evolving field for teams and analysts alike. Here at TWC, we’ve built a model for comparing the 5v5 offensive contributions of all NHL players.

## Introduction

Heading into the new season, the TWC Player Offensive Evaluation Tool (POET) got a bit of a rework and this is its next evolution. Our first model was put together last year entirely on Google Sheets, which was familiar but lacked the necessary computing speed to make the model truly useful. This year, we redid the whole thing using R.

The model uses publicly available stats courtesy of Natural Stat Trick to compare offensive contributions from NHL-level players. It covers three years of play, so this version includes 5v5 data from the 2017-18 season through the 2019-20 season.

Let’s see the POET output for a carefully* selected Calgary Flame.

*Most elite at making friends

You can see that Matthew Tkachuk is more than one standard deviation above average for all 5v5 offensive contributions according to POET. So how does the model work in computing these outputs? Here’s the full breakdown of the model, from data acquisition all the way to final POET chart outputs.

## The data

Two sets of data are retrieved per year. For each season, players with fewer than 100 minutes of 5v5 ice-time are filtered out. Players are labeled as forwards or defencemen, and are only compared to their positional peers.
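To make the filtering step concrete, here is a minimal Python sketch (the model itself was built in R, so this is an illustration rather than our actual code). The field names `player`, `position_group`, and `toi_5v5` are hypothetical, as are the sample rows.

```python
from collections import defaultdict

MIN_TOI_MINUTES = 100  # threshold from the text: fewer than 100 minutes is filtered out


def eligible_by_position(season_rows):
    """Drop low-minute skaters, then split the rest into positional peer groups."""
    groups = defaultdict(list)
    for row in season_rows:
        if row["toi_5v5"] < MIN_TOI_MINUTES:
            continue  # not enough 5v5 ice time this season
        groups[row["position_group"]].append(row["player"])
    return dict(groups)


# Hypothetical rows; the names and minutes are made up for illustration.
sample = [
    {"player": "Forward A", "position_group": "F", "toi_5v5": 1020.5},
    {"player": "Forward B", "position_group": "F", "toi_5v5": 64.0},
    {"player": "Defenceman C", "position_group": "D", "toi_5v5": 100.0},
]
eligible = eligible_by_position(sample)
print(eligible)
```

A skater at exactly 100 minutes stays in, since only players with fewer than 100 minutes are removed.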

### On-Ice percentages

First, 5v5 score-and-venue adjusted on-ice percentages are obtained for each of the following stats:

- Low-danger Corsi for (LDCF%)
- Medium-danger Corsi for (MDCF%)
- High-danger Corsi for (HDCF%)
- Expected goals for (xGF%)
- PDO
- Offensive zone starts (OZS%)

### Individual Per 60 Rates

Similarly, individual per 60 rates are also grabbed, but these stats are not adjusted for score and venue. Some of these stats directly complement the on-ice percentages.

- Individual low-danger Corsi for (iLDCF)
- Individual medium-danger Corsi for (iMDCF)
- Individual high-danger Corsi for (iHDCF)
- Goals, primary assists, and secondary assists (G, A1, A2)
- Penalties taken and penalties drawn (PENT, PEND)

### Data Preparation

Lastly, one more stat that is neither a percentage nor per 60 stat is obtained:

- Time on ice per game played (TOIGP)

One initial calculation is included, which converts penalties taken and drawn per 60 into a player’s penalty differential per 60 (PEN). In total, that makes 14 metrics that are weighted within the model.
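As a rough Python sketch of that calculation: we assume here that the differential is penalties drawn minus penalties taken, so that a positive number is good for the player. The sign convention and the sample rates are assumptions for illustration only.

```python
def penalty_differential_per_60(pend_per_60, pent_per_60):
    """PEN, assumed here to be penalties drawn minus penalties taken (both per 60)."""
    return pend_per_60 - pent_per_60


# Hypothetical rates: 1.2 penalties drawn and 0.8 taken per 60 minutes.
pen = penalty_differential_per_60(1.2, 0.8)
print(round(pen, 2))
```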

Every player who skated within the past three seasons is sorted into one of seven groups, based on the years in which he played at least 100 minutes:

- Playing all three years (one group)
- Playing in two of the three years (three groups)
- Playing in one of the three years (three groups)
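The seven groups are simply every non-empty combination of the three seasons. A small Python sketch of the grouping follows; the function name `season_group` and the set-based input are our own illustrative choices.

```python
from itertools import product

SEASONS = ("2017-18", "2018-19", "2019-20")  # oldest to newest


def season_group(qualified_seasons):
    """Return the ordered tuple of seasons in which a player cleared the threshold."""
    return tuple(s for s in SEASONS if s in qualified_seasons)


# Enumerate every non-empty combination of the three seasons.
ALL_GROUPS = [
    tuple(s for s, keep in zip(SEASONS, mask) if keep)
    for mask in product((False, True), repeat=len(SEASONS))
    if any(mask)
]

print(len(ALL_GROUPS))
print(season_group({"2019-20", "2018-19"}))
```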

This gives us a basis to weight offensive contributions by year, giving more value to the most recent seasons, and less as we reach further back in time.

For simplicity, the years are linearly weighted in a 3:2:1 ratio: for a player who skated in all three years, the most recent year carries triple the weight of the oldest year, and the second most recent year carries double.

Stats from players who played in two seasons are similarly weighted to give double the weight to the more recent year, and players who only appeared in one season have all the weight in that year. This is an important piece of context to keep in mind when looking at the outputs of players who have just started their NHL careers.

As an example, here are the formulas for xGF% for Matthew Tkachuk, who played all three seasons; versus Andrew Mangiapane, who did not reach the threshold in 2017-18; versus Juuso Valimaki, who only reached the threshold in 2018-19.

Player | Weighted xGF% |
---|---|
Tkachuk | (1 / 6) * xGF_18 + (2 / 6) * xGF_19 + (3 / 6) * xGF_20 |
Mangiapane | (1 / 3) * xGF_19 + (2 / 3) * xGF_20 |
Valimaki | xGF_19 |

The same method is used for the rest of the metrics. At this point, all stats used in the model are weighted and calculated based on each player’s seasonal appearances.
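The season weighting can be sketched in a few lines of Python. This is an illustration under our reading of the formulas above: weights of 1, 2, 3 are assigned from the oldest qualifying season to the newest and then normalized. The xGF% values in the example are hypothetical, not any player's real numbers.

```python
SEASONS = ("2017-18", "2018-19", "2019-20")  # oldest to newest


def weighted_metric(values_by_season):
    """Linearly weight a metric across the seasons a player qualified in."""
    played = [s for s in SEASONS if s in values_by_season]
    if not played:
        raise ValueError("player has no qualifying seasons")
    weights = range(1, len(played) + 1)  # 1 for the oldest season played, up to 3
    total_weight = sum(weights)
    return sum(w * values_by_season[s] for w, s in zip(weights, played)) / total_weight


# Hypothetical xGF% values for a three-season, two-season, and one-season player.
three = weighted_metric({"2017-18": 50.0, "2018-19": 53.0, "2019-20": 56.0})
two = weighted_metric({"2018-19": 51.0, "2019-20": 54.0})
one = weighted_metric({"2018-19": 49.5})
print(three, two, one)
```

The three-season case works out to (1 * 50 + 2 * 53 + 3 * 56) / 6, which is the same shape as the Tkachuk row in the table.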

## The Model

Next, the weighting of each contributing metric needs to be determined. This portion involved a lot of trial and error, testing different players to see if the values made sense based on both their stats and the eye test. A bit of subjectivity is present, but we tried to be as methodical with the weightings as possible, and to assign weights that accurately convey each player’s offensive performance.

### Weightings

Goals per 60 is the main metric that all other metrics are compared against in this model. To make the math a little easier, the weighting for goals per 60 is set at 100. The rest of the coefficients are set as follows:

Metric | Weighting | Reasoning |
---|---|---|
G | 100.00 | The main metric to compare other coefficients to |
A1 | 85.00 | Primary assists are viewed as nearly as valuable as a goal |
A2 | 15.00 | Secondary assists are viewed as much less valuable |
LDCF% | 5.95 | The calculated shooting percentage of all low-danger chances (past three seasons) |
MDCF% | 15.65 | The calculated shooting percentage of all medium-danger chances (past three seasons) |
HDCF% | 22.78 | The calculated shooting percentage of all high-danger chances (past three seasons) |
xGF% | 50.00 | Giving an expected goal half the weight of an actual goal |
PDO | 15.00 | Some luck is included, and weighted the same as an A2, but ranked in reverse |
iLDCF | 5.95 | Same as LDCF% |
iMDCF | 15.65 | Same as MDCF% |
iHDCF | 22.78 | Same as HDCF% |
TOI | 20.00 | Rewarding players with higher ice time |
PEN | 15.00 | Rewarding positive penalty impacts |
OZS | 10.00 | Rewarding players with harder deployment, but ranked in reverse |

### Normal distributions

For every metric in the model, a normal distribution is used to place each skater relative to the entire population of either forwards or defencemen. To calculate probabilities from these distributions, the mean and standard deviation of every metric are computed for each positional group.

Here is an example for the forward group using the mean and standard deviation of the goals per 60 metric:

Goals per 60 statistic | Value |
---|---|
Mean | 0.6011 |
Standard deviation | 0.2875 |
Matthew Tkachuk’s three-year weighted goals per 60 | 0.8233 |

So the question to ask is then:

*What percentage of forwards have a goals per 60 value of less than 0.8233 if the mean is 0.6011 and the standard deviation is 0.2875?*

Computing the value in this example gives Tkachuk a score of 0.7802, meaning his goals per 60 is in the 78th percentile for forwards.
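For anyone who wants to reproduce the number, the question above is a normal cumulative distribution lookup. Here is a short Python sketch using the standard library's `statistics.NormalDist` (our model was built in R, so this is only an equivalent illustration); the variable names are our own.

```python
from statistics import NormalDist

# Forward-group mean and standard deviation for goals per 60, from the table above.
forwards_g60 = NormalDist(mu=0.6011, sigma=0.2875)

# Share of the fitted distribution that falls below Tkachuk's weighted goals per 60.
g_score = forwards_g60.cdf(0.8233)
print(round(g_score, 4))
```

The printed value should land very close to the 0.7802 quoted above.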

All metrics are computed this way to give each player a set of scores. However, for PDO and OZS, the comparison is reversed so that players with lower PDO values and lower OZS ratios are rewarded.

Using Tkachuk again, an example looking at his OZS:

OZS% statistic | Value |
---|---|
Mean | 53.95 |
Standard deviation | 10.81 |
Matthew Tkachuk’s three-year weighted OZS% | 52.20 |

The question is reframed to:

*What percentage of forwards have an OZS value of greater than 52.20 if the mean is 53.95 and the standard deviation is 10.81?*

The answer gives Tkachuk a score of 0.5643, meaning he has fewer offensive zone starts than 56.43% of all forwards. By reframing the question, he is rewarded instead of penalized for having tougher deployments.
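In code, the reversed ranking is just the complement of the usual lookup. A minimal Python sketch, with `reversed_score` as a hypothetical helper name:

```python
from statistics import NormalDist


def reversed_score(value, mean, std_dev):
    """Share of the fitted distribution that sits above the value (lower is better)."""
    return 1.0 - NormalDist(mu=mean, sigma=std_dev).cdf(value)


# Forward-group OZS% mean and standard deviation, and Tkachuk's weighted OZS%.
ozs_score = reversed_score(52.20, mean=53.95, std_dev=10.81)
print(round(ozs_score, 4))
```

This should come out very close to the 0.5643 quoted above. The same helper would apply to PDO, which is also ranked in reverse.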

## TWCScore and offensive contributions

Now that everything is computed, we can combine everything into an aggregate metric, which we coined TWCScore for skaters. This takes the weightings and the computed normal distribution scores for every player and adds it all up, giving us a novel metric to compare the offensive contributions of all forwards against each other and all defencemen against each other.

We also combined subsets of the metrics to create three additional components: possession, individual shot generation, and scoring.

TWCScore includes all 14 statistics, possession includes the three on-ice Corsi for percentages at the three danger levels, individual shot generation includes the individual Corsi for rates at the three danger levels, and scoring simply sums goals, primary assists, and secondary assists.
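Putting the pieces together, here is a hedged Python sketch of the aggregation. It assumes that "adding it all up" means summing weight times score for each metric, and that the three components use the same weighted sum over their subset of metrics. Both are our simplifications for illustration, and the player scores below are hypothetical.

```python
# Weightings from the table in the Weightings section.
WEIGHTS = {
    "G": 100.00, "A1": 85.00, "A2": 15.00,
    "LDCF%": 5.95, "MDCF%": 15.65, "HDCF%": 22.78,
    "xGF%": 50.00, "PDO": 15.00,
    "iLDCF": 5.95, "iMDCF": 15.65, "iHDCF": 22.78,
    "TOI": 20.00, "PEN": 15.00, "OZS": 10.00,
}

COMPONENTS = {
    "possession": ("LDCF%", "MDCF%", "HDCF%"),
    "shot_generation": ("iLDCF", "iMDCF", "iHDCF"),
    "scoring": ("G", "A1", "A2"),
}


def weighted_sum(scores, metrics):
    """Sum weight * normal-distribution score over the given metrics (assumed form)."""
    return sum(WEIGHTS[m] * scores[m] for m in metrics)


def twc_score(scores):
    """Aggregate all 14 metrics into one number."""
    return weighted_sum(scores, WEIGHTS)


# Hypothetical player who sits exactly at the 50th percentile in every metric.
average_player = {metric: 0.5 for metric in WEIGHTS}
total = twc_score(average_player)
parts = {name: weighted_sum(average_player, ms) for name, ms in COMPONENTS.items()}
print(round(total, 2), {k: round(v, 2) for k, v in parts.items()})
```

Because PDO and OZS scores are already reversed before this step, every term in the sum rewards the player for a higher score.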

So this gives us the entire basis of the POET and how its metrics are computed. For a sanity check, we can look at the list of top players by TWCScore to see if the names that show up make sense in terms of players that are known to have dominant 5v5 offence.