Introducing NHL Contract Projections Methodology (2020 Free Agency)
There are few dates more exciting in the hockey world than July 1st, at least in a normal year. The drama and intrigue surrounding NHL free agency have spawned an entire industry dedicated to contract rumours, predictions and analysis.
We’ve decided to throw our hats into that same ring. We bring different skill sets: one of us (Idriss, @thehockeycode) is a CBA wizard with a finance and negotiation background, and the other (Sam, @samforstner) is an experienced hockey data analyst (previously at Sportlogiq) with a background in statistics. United by our love for and knowledge of the game and business of hockey, we decided to combine forces to analyze, predict, and better understand NHL contract decision making.
For the 2020 offseason, we’ve partnered with PuckPedia to project player contracts, and we will be outlining our contract projections methodology in this article. The full projections are available here.
*While most 2020 RFA and UFA free agent contracts are included, a subset will not be shared due to confidentiality reasons.*
NHL Contract Projections Methodology – Background and Prior Work
We are far from the first people to do this kind of work. In recent years, Matt Cane published his projections in 2017 and 2018, and Evolving Wild took up the mantle in 2019 and continues to publish predictions today.
We lean extensively on this past work, benchmarking against the results of both models while developing our own logic and employing variations of some of the methods outlined in Evolving Wild’s excellent writeup. We hope you find this contract projections methodology insightful.
Data
Our dataset comprises 1,455 player contracts signed from the 2013 offseason to the NHL shutdown in March of this year. While we could have gone back further, 2013 was the first offseason of the current CBA, so we chose to forgo a larger sample in favor of apples-to-apples training data.
All of the contract data was sourced from puckpedia.com, and while it may not represent a full accounting of all contracts signed over this period, we favoured depth of information (more variables about each individual contract) over breadth (a larger number of contracts).
In addition to existing publicly sourced contract data, we create additional contract variables using Idriss’s proprietary business logic and cap management best practices that he has developed over years of work for his contract consulting service The Hockey Code.
On top of the contract information itself and CBA compliance elements, we layer traditional player statistics (goals, assists, etc.), “advanced” statistics (shot attempts, etc.), draft status (round, pick), and biographical information (height, weight), all of which were sourced from Hockey Reference.
NHL Contract Projections Methodology – Methods
Dataset Construction and Manipulation
Our first step after acquiring all the aforementioned data was to consolidate it into one dataset for modeling. For a variety of the model features, no additional data manipulation was required. However, this was not the case for a player’s year-to-year statistics.
When an organization and a player negotiate a contract, clearly more than the prior season’s performance is taken into consideration. Perhaps the player’s most recent performance was an outlier one way or another, or he was injured, or he prioritized stability by requesting that no-move clauses be added to his deal. Regardless, greater context is needed.
So, in line with past work, we include statistics from the past three seasons, with more recent performance (the platform year in particular) receiving more weight. To arrive at optimal weights, we iterated over a large number of potential values, similar to the approach taken by Evolving Wild.
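The weighting idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our production code: the function names are hypothetical, the requirement that a more recent season never receives less weight than an older one is an assumption, and the toy scoring (absolute error against made-up targets) stands in for the validation performance of a full model.

```python
from itertools import product


def weighted_stat(values, weights):
    """Blend per-season values (most recent season first) with the given weights."""
    return sum(v * w for v, w in zip(values, weights))


def candidate_weights(step=0.1):
    """Yield (w1, w2, w3) on a grid: weights sum to 1 and never increase with age.

    The ordering constraint (recent >= older) is an assumption for this sketch.
    """
    n = round(1 / step)
    for i, j in product(range(n + 1), repeat=2):
        k = n - i - j
        if k >= 0 and i >= j >= k:
            yield (i / n, j / n, k / n)


def best_weights(history, targets, step=0.1):
    """Pick the grid point with the lowest total absolute error.

    `history` holds one (recent, middle, oldest) tuple per player; `targets`
    is a toy stand-in for whatever score a real validation run would produce.
    """
    def error(weights):
        return sum(abs(weighted_stat(h, weights) - t)
                   for h, t in zip(history, targets))

    return min(candidate_weights(step), key=error)


# Made-up example: goals over the last three seasons for two players.
history = [(30, 20, 10), (10, 10, 40)]
targets = [25, 13]
print(best_weights(history, targets))
```

In a real search, the error function would refit and score the model for each candidate set of weights rather than compare against fixed targets.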
Multiple Models
There are two primary components to a player’s contract that we are interested in predicting: length of the contract (term) and value (expressed as a percentage of the salary cap in order to standardize across years).
One contribution we make beyond that of prior work is the addition of a third predicted component, or target variable: no-movement or no-trade clauses. The inclusion of clauses as a feature not only provides a performance lift when projecting a player’s contract value, but also gives an additional layer of interpretable detail that we thought was valuable. Each negotiation is unique, and we operate under the expectation that each party (player and club) will look to prioritize specific needs to maximize short and long-term outcomes. As such, we can now better assess the impact of a no-movement clause on contract value.
We project each of the three components sequentially in the following order:
Term
Predicting the length of a player’s contract amounts to a multi-class classification problem, with eight possible outcomes (eight years being the maximum length permitted under the CBA). We produced predictions for each possible contract length, and have accounted for cap management best practices when selecting the final predicted contract scenario.
In evaluating the term model, we prioritized F1 score, which represents a balance of precision (what % of positive predictions were correct?) and recall (what % of all positive results were predicted?). For this multi-class problem, we compute a weighted version of the score to account for class imbalance (far more contracts are one or two years in length than seven or eight).
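For readers unfamiliar with the weighted F1 score, here is a minimal from-scratch sketch in Python. The function name and the example terms are made up for illustration. Each class’s F1 is weighted by how often that class appears among the true labels, which is the standard definition of a support-weighted F1.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted F1: per-class F1 averaged by each class's true frequency."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        score += f1 * count / total
    return score


# Made-up contract terms (in years): true vs. predicted.
true_terms = [1, 1, 1, 2, 2, 8]
pred_terms = [1, 1, 2, 2, 2, 1]
print(round(weighted_f1(true_terms, pred_terms), 3))
```

Because one- and two-year deals dominate the sample, the weighting keeps rare classes such as eight-year contracts from swinging the overall score disproportionately.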
Clause
We approach predicting the inclusion of a clause as a binary classification problem with only two outcomes: clause (1) or no clause (0). We make no distinction between no-move or no-trade clauses, nor between full and partial clauses. Then, if the predicted probability of a clause is greater than 50%, we predict the player’s contract will have a clause attached.
Note: For many players, the predicted probability will be very close to 0, as players who are under the age of 27 and have not accrued 7 professional seasons are not eligible (NHL CBA).
We evaluated the clause model using a standard F1 score, while also monitoring overall accuracy.
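The 50% decision rule and the two evaluation metrics can be sketched as follows. The function names and the probabilities are hypothetical; the probabilities stand in for the output of a fitted classifier.

```python
def predict_clause(probability, threshold=0.5):
    """Return 1 (clause) when the predicted probability exceeds the threshold."""
    return 1 if probability > threshold else 0


def binary_f1(y_true, y_pred):
    """F1 score for the positive class (contracts that include a clause)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Share of contracts where the clause prediction matches reality."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Made-up probabilities and outcomes for five contracts.
probabilities = [0.9, 0.6, 0.4, 0.02, 0.55]
actual = [1, 0, 1, 0, 1]
predicted = [predict_clause(p) for p in probabilities]
print(predicted, binary_f1(actual, predicted), accuracy(actual, predicted))
```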
Salary Cap Percentage
Predicting the cap percentage of a deal is a regression problem, where we aim to predict the target variable of contract value as a percentage of the salary cap ceiling.
When evaluating the salary cap percentage model, we focused our attention on mean absolute error (how far, on average, were our predictions from the true value?).
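As a small worked example, the cap percentage target and the mean absolute error can be computed like this. The function names and the sample contracts are made up; the $81.5 million ceiling is the flat cap figure discussed later in this article.

```python
CAP_CEILING = 81_500_000  # flat salary cap ceiling referenced in this article


def cap_pct(aav, cap=CAP_CEILING):
    """Express a contract's average annual value as a percentage of the cap."""
    return aav / cap * 100


def cap_pct_to_aav(pct, cap=CAP_CEILING):
    """Convert a predicted cap percentage back into dollars."""
    return pct / 100 * cap


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predictions and true values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up predictions vs. actual cap percentages for three contracts.
predicted = [10.0, 5.0, 2.0]
actual = [9.0, 5.5, 2.5]
print(round(mean_absolute_error(predicted, actual), 3))
print(round(cap_pct(8_150_000), 2))
```

Working in cap percentage rather than dollars is what lets contracts signed under different cap ceilings share one training set.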
Model Specification
After experimenting with other methods including Random Forest, we landed on XGBoost (eXtreme Gradient Boosting) trees for all three models.
XGBoost is widely considered one of the best choices, if not the best, for making predictions with structured, tabular data (organized in rows and columns). In addition, the tree-based ensemble nature of the model limits the need for feature selection (though we only include features that we intuitively believe should impact a player’s contract).
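To illustrate how three models could be chained in the order described earlier (term, then clause, then cap percentage), here is a rough Python sketch. Everything in it is a stand-in: the three model arguments are plain callables rather than fitted XGBoost models, the feature names are hypothetical, and passing the predicted term into the clause model is an assumption made for this sketch.

```python
def project_contract(features, term_model, clause_model, cap_pct_model):
    """Chain three predictors, feeding each prediction into the next model.

    Each model is any callable taking a feature dict; in practice these
    would wrap fitted models.
    """
    term = term_model(features)

    # Assumption: the predicted term is passed along as an extra feature.
    with_term = {**features, "term": term}
    clause = 1 if clause_model(with_term) > 0.5 else 0

    # The clause prediction is used as a feature for contract value.
    with_clause = {**with_term, "clause": clause}
    cap_pct = cap_pct_model(with_clause)

    return {"term": term, "clause": clause, "cap_pct": cap_pct}


# Toy stand-in models, purely for demonstration.
def toy_term(f):
    return 6 if f["age"] < 28 else 2


def toy_clause(f):
    return 0.8 if f["term"] >= 5 else 0.1


def toy_cap(f):
    return 7.0 + 0.5 * f["clause"]


print(project_contract({"age": 27}, toy_term, toy_clause, toy_cap))
```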
Adjusting recommendations for real-world applicability
While many of our projections reflect the raw output of our models, we make some adjustments to term length in certain edge cases based on domain knowledge. For example, it is common practice for teams to avoid “walking” Restricted Free Agents (RFAs) exactly to Unrestricted Free Agency (UFA). Typically, it’s best practice to consider buying at least one UFA year to retain leverage ahead of subsequent negotiations (e.g. Matthews, Aho, Nylander). On the more conservative end of the spectrum, another common strategy is to shorten the term in order to ensure that the next round of negotiations will happen while the player is still an RFA with arbitration rights (e.g. Tkachuk, Point, DeBrincat).
Additionally, the negotiation track records of specific teams and their roster depth, player agencies, player signing status and other external factors make some model projections unrealistic.
We recognize this element of subjectivity may give some people pause, but we believe that leveraging both quantitative techniques and qualitative industry knowledge in tandem yields more realistic and useful results than either approach would on its own. We believe that carefully reviewing results generated from the model and ensuring practicability is a great way to drive value for the CBA-governed, ultra-competitive, and dynamic NHL marketplace.
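The RFA term adjustment described above can be sketched as a simple rule. This is a simplified illustration, not the actual logic we apply: the function name, the `years_to_ufa` input (the number of contract years that would take the player exactly to UFA eligibility), and the one-year step in either direction are all assumptions.

```python
MAX_TERM = 8  # maximum contract length under the CBA, as noted in this article
MIN_TERM = 1


def adjust_rfa_term(predicted_term, years_to_ufa, buy_ufa_year=True):
    """Nudge a predicted term that would walk an RFA exactly to UFA.

    buy_ufa_year=True  -> add a year so the deal covers one UFA season.
    buy_ufa_year=False -> drop a year so the next deal is signed as an RFA.
    Terms that do not land exactly on UFA eligibility are left unchanged.
    """
    if years_to_ufa < 1 or predicted_term != years_to_ufa:
        return predicted_term
    if buy_ufa_year:
        return min(predicted_term + 1, MAX_TERM)
    return max(predicted_term - 1, MIN_TERM)


print(adjust_rfa_term(4, 4))                      # buys one UFA year
print(adjust_rfa_term(4, 4, buy_ufa_year=False))  # keeps next deal in RFA
print(adjust_rfa_term(3, 4))                      # unchanged
```

Which direction to adjust in a given case remains a judgment call informed by the team, the player, and the negotiation context.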
Important Variables
Across all three models, some of the most important inputs include:
- Goals and assists
- Time on Ice Percentage (% of team’s total TOI in games the player played)
- Age
- Signing Status (RFA/UFA) & Arbitration Eligibility
- Draft status
- Special teams usage + performance
- Length and value of past contracts
- Even Strength Shot Attempts (While on Ice)
Additionally, we saw suggestive evidence of other factors influencing player contracts, including but not limited to:
- Height & Weight
- Career Achievement
- Team Performance
- Agent/Agency Representation
Conclusion and Discussion
Predicting player contracts is a complex process, and there is a real element of subjectivity in the way these deals are negotiated between clubs and players. Our objective is to provide an additional approach that improves on prior work, so that both hockey executives and fans can benefit.
Our NHL contract projections methodology of course has numerous limitations, a few of which we’d like to highlight here. Like the game of hockey itself, contract negotiations are undertaken by people, and we can’t hope to completely abstract away that human element. Organizations will put value on intangible factors such as leadership and impact on team culture (e.g. GM Bergevin’s comments on Price and Weber’s legacy as Canadiens), intimidation and grit (Maroon, Reaves, Gudas), and confidence, along with the ability to instill confidence in others and to communicate with and motivate teammates (e.g. Stars goaltender Khudobin self-assessing his character and team-first attitude).
We were also not able to directly include a team’s ability to spend money, whether limited by the league salary cap or by internal financial constraints, in the model. It would be a tremendous undertaking to find a team’s cap space on the day each individual contract was signed, and even this would not allow us to know the unobservable constraints affecting each organization’s decision making.
Additionally, we believe the continuation of a flat $81.5 million salary cap for the next several seasons poses a significant obstacle to making accurate predictions based on historical data. Though we do account for changes in the salary cap by using percentage of the cap as our target variable when predicting contract value, teams presumably made contract decisions expecting the cap to rise prior to the pandemic, and this unexpected change could very well affect negotiations in previously unseen ways. This is a great example of the challenges that decision-makers throughout the league must face when weighing short- and long-term considerations.
Things move quickly in the business of the NHL—there is a lot of information available, and making sense of it all can be a daunting task. If this NHL contract projections methodology helps at all in that regard, then we’ll have done our job.
Thank you/Contact
Thank you so much for reading! For questions, comments and additional work, you can find us on Twitter at @thehockeycode (Idriss) and @SamForstner.