Using machine learning to find the best and worst value contracts


As soon as free agency opened in July of 2016, the Lakers signed Timofey Mozgov to a 4 year, $64 million deal. At the time, this seemed like an enormous overpay. Today, it still seems like one. In the 2015-16 season, Mozgov averaged 6.3 PPG and 4.4 TRB in 17.4 MPG for the Cavs. The following season – his first as a Laker – his stats improved marginally. He averaged 7.4 PPG and 4.9 TRB in 20.4 MPG.

The following summer, the Lakers dumped Mozgov’s salary. They traded him along with D’Angelo Russell for Brook Lopez and the 27th pick in the draft. They used this pick to select Kyle Kuzma.

Mozgov’s contract was bad from the beginning. But, this is not always the case. Often, misfortune and other unforeseen circumstances make contracts bad. Key examples include injuries and accelerated aging. So, at the time, these contracts seem fine. But they become poison fast. In turn, the players earn much more than expected given their performance.

To find the best and worst value contracts, we’ll create 4 models to predict a player’s salary. This is not a predictive metric of what a player will earn in their next contract. This evaluates expected salary relative to real salary to see who’s overpaid and who’s underpaid.

History and understanding the data

In the 1984-1985 season, the NBA instituted the salary cap. This was a bare-bones salary cap; many of the rules that influence today’s cap weren’t in place. Unrestricted free agency, rookie contracts, max contracts, etc. only came into play later.

With the salary cap, each team had to decide how to allocate their money. Early on, many teams opted for a smooth approach where they would pay lots of middling players. In the 1990-1991 season, only 9 players earned over 20% of the total salary cap. The highest-paid player was Patrick Ewing, who earned 35.8% of the salary cap. At the time, the max contract did not yet exist. This year is somewhat of an outlier. The following season, the salary distribution changed to resemble what we see today.

Larry Bird earned a staggering 56% of the Celtics’ salary cap in the 1991-1992 season. Several players started to earn more of their team’s cap, as the league shifted more towards star power. The CBA only included the max contract about a decade later. Without max contracts, Michael Jordan earned 120% of the cap for two years in his second 3-peat.

Today, due to exceptions and restrictions, teams structure their cap room to get stars. Contending teams often have a few max players, a couple of guys in the $10 million range, and lots of minimums. So, if we’re going to predict salary, we must first understand its distribution.

We see that, historically, most players earned a small percentage of the cap. This is what we would expect. With the prevalence of tanking and salary dumps, most teams will have at least one high salary guy. For a contender, this is their main star. For a tanking team, this could be someone they took on along with a draft pick (such as Melo on the Hawks).

Next, we’ll look at how different factors correlate with salary. Theoretically, better players should always earn higher salaries. So, the correlation between something like points and salary should be positive. This is because better scorers are often better players, and, in turn, should earn more.

The CBA complicates this simple relationship with the rookie scale and minimum contracts. If we’re looking at performance relative to salary, rookie contracts are by far the best value in the league. Luka Doncic is playing at an incredible level, but only earns about $7.5 million. Minimum contracts create a similar effect of underpaid players. Many contenders have no cap room. So, if a player wants to join a contender, he must take a pay cut: either a minimum contract or the mid-level exception (like Cousins on the Warriors). Relative to his performance, this is a steal.

The three graphs below show the relationship between different stats and salary.

As expected, points and win shares correlate with salary. Furthermore, the age plot shows the effect of rookie contracts. No player under 20 years old earned over 20% of the cap.

Now that we understand the data, we can discuss the methods for the analysis.


First, we collected all player data for every season since the 1990-1991 season. We go back to 1990-1991 for two reasons. First, the CBA only added unrestricted free agency in 1988. This changed the whole process of free agency – and in turn, contracts. So, we wouldn’t go back further than 1988 anyway. Second, our source for historical player salary data is this Kaggle data set, which only goes back to 1990.

We combined the salary data with each player’s counting and advanced stats for that season. Note that we’re taking stats for the given season. So, this is not a predictive metric. This evaluates whether the player is overpaid or underpaid given their expected salary. As such, players like Gordon Hayward will appear overvalued. At the time, the contract was fair. Due to a devastating injury later, Hayward spent a full year rehabbing and still had to shake off rust. Last season, Hayward did not play like a max player. So, he was overpaid for that season.

We used the following factors to predict a player’s salary:

  1. Age
  2. Points per game
  3. Rebounds per game
  4. Assists per game
  5. Steals per game
  6. Blocks per game
  7. True shooting %
  8. Win shares

These factors generally paint a picture of a player’s performance and situation. So, they can predict a player’s salary.

We included age as a feature to adjust for rookie contracts. Though they’re great value, it’s not a result of any negotiation or offer by the GM. It’s based on the player’s draft slot. So, even a superstar 18-year-old can’t get paid a large part of the cap. Adding age nullifies the effect of rookie contracts. If we did not have age, we would likely have to remove rookie contracts, as they would add noise to our data set. This transforms the problem from salary given performance to salary given expected salary. Our definition of expected salary mixes performance and age.

We used win shares instead of VORP or BPM because win shares is a cumulative stat. So, it depends on games played, which is a key factor in how much value a player provides to his team.

To understand the relationship between these features, we created a correlation plot.

With our 10,821 samples, we randomly split the data. We used 75% to train the models, and 25% to test them. We created four models:

  1. K-nearest neighbors regressor (KNN)
  2. Random forest regressor (RF)
  3. Gradient boosting regressor (GBR)
  4. Extreme gradient boosting regressor (XGB)
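As a rough illustration of the 75/25 split described above, here is a minimal sketch in plain Python. The function name, the seed, and the use of bare indices as stand-ins for player-seasons are our assumptions for the example; it is not the code used for the post.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in for the 10,821 player-seasons: each "row" is just an index here.
samples = list(range(10821))
train, test = train_test_split(samples)
print(len(train), len(test))
```

Every sample lands in exactly one of the two lists, so the test set stays unseen while the models train.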

Note that we’re predicting the percentage of the cap a player deserves, instead of their raw salary. The salary cap rose and inflation occurred, so players today earn more than players 20 years ago. But, the percentage of the cap stayed consistent. Because we’re predicting percentage of cap, this is a regression problem.

We predicted salary for the 2018-19 season using 2018-19 stats (last season’s numbers). So, contracts won’t exactly match up with what they are today.

We also only considered players who played in the 2018-19 season. This affects some teams; for example, the Heat paid Chris Bosh $26.8 million last season, but he did not play. Because Bosh didn’t play, he’s excluded from the data. Also, Bosh doesn’t count against the Heat’s cap because he retired for medical reasons.

One final note about the data is that there are small inconsistencies due to player movement. We collected contract data from Basketball Reference’s contract page from April 1, 2019. This is after the trade deadline. So, traded players count against the team they played for at the end of the season (e.g. Marc Gasol is on the Raptors, not the Grizzlies). Some players had multiple entries on the page, like Carmelo Anthony. He earned $25 million from the Hawks (salary dump from the Thunder). Then, after the Hawks waived him, Melo signed with the Rockets for $2 million. The Rockets then traded him to the Bulls. So, Melo has a $25 million Hawks entry and a $2 million Bulls entry. We removed duplicates and kept the first-indexed one (the higher salary one).
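The duplicate handling described above can be sketched as follows. The row layout (dictionaries with `player`, `team`, and `salary` keys) and the function name are assumptions for the example; the salaries are the rounded figures from the text plus Marc Gasol’s listed salary.

```python
def keep_first_entry(rows):
    """Keep only the first row seen for each player name."""
    seen = set()
    unique = []
    for row in rows:
        if row["player"] not in seen:
            seen.add(row["player"])
            unique.append(row)
    return unique

# Rows are assumed to be ordered as on the contract page (higher salary first).
contracts = [
    {"player": "Carmelo Anthony", "team": "Hawks", "salary": 25_000_000},
    {"player": "Carmelo Anthony", "team": "Bulls", "salary": 2_000_000},
    {"player": "Marc Gasol", "team": "Raptors", "salary": 24_119_025},
]
deduped = keep_first_entry(contracts)
print([(row["player"], row["team"]) for row in deduped])
```

Because the page lists the larger entry first, keeping the first row keeps Melo’s Hawks salary and drops the Bulls one.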

Neither of these inconsistencies affects the analysis much. Traded players still have the same contract, so on a player-by-player basis for value, they’re the same. Trades only affect the team’s sum of expected and observed salary difference. As for players like Melo with two entries, there are so few of these cases that they don’t matter much.

Regression analysis

In this section, we’ll check how our models perform.

Basic goodness of fit

For regression models, we have two basic metrics of performance. First, we have r-squared. This measures the proportion of the variance in the dependent variable (percent of cap) explained by the independent variables (features). It’s usually between 0 and 1, with 1 being the best possible value. On held-out data, it can drop below 0 if a model does worse than always predicting the mean.

Second, we have mean squared error. This measures the average squared difference between the predicted and observed values. Unlike r-squared, lower MSE is better, with a best possible value of 0. We can interpret mean squared error, as it tells us how close our predictions are to the real value on average.
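To make the two metrics concrete, here is a small sketch that computes both from their definitions. The percent-of-cap values are hypothetical, not taken from our data.

```python
def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def r_squared(observed, predicted):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Hypothetical percent-of-cap values for four players.
observed = [0.05, 0.10, 0.20, 0.30]
predicted = [0.07, 0.09, 0.18, 0.33]
mse = mean_squared_error(observed, predicted)
r2 = r_squared(observed, predicted)
print(round(mse, 5), round(r2, 3))
```

A perfect model would give an MSE of 0 and an r-squared of 1.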

The table below shows the r-squared and mean squared error for the four models.


All the models have a very low mean squared error of 0.003. We can take the square root of this (root mean squared error) to interpret the result in the same units as percent of cap. This differs from mean absolute error, which is the average of the absolute differences. RMSE is better here because it penalizes large errors more, so it’s more valuable for this problem.

Because the mean squared error for each model is about 0.003, the RMSE is about 0.055. This means that the models’ predictions are typically off by about 5.5 percentage points of the cap.
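The arithmetic is quick to check. The sketch below also compares RMSE and mean absolute error on a hypothetical set of errors with one large miss, to show why RMSE reacts more strongly to big errors.

```python
import math

# RMSE implied by the reported mean squared error of about 0.003.
mse = 0.003
rmse = math.sqrt(mse)
print(round(rmse, 3))

# Hypothetical errors: three small misses and one large one.
errors = [0.01, 0.01, 0.01, 0.15]
mae = sum(abs(e) for e in errors) / len(errors)
rmse_errors = math.sqrt(sum(e ** 2 for e in errors) / len(errors))
print(round(mae, 4), round(rmse_errors, 4))
```

The single large miss pulls the RMSE well above the mean absolute error.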

In our classification problems, we create dummy classifiers to represent improvement over random. It’s harder to create a random model here. However, we can still compare our models to simple regressions. We saw before that the correlation coefficient (r) for a points and salary regression is 0.6. So, the r-squared is 0.36. Our models all have r-squared above 0.5, so they outperform simple methods to predict salary. So, the models predict salary well.


In machine learning, we want to avoid overfitting. This occurs when the models learn the given data too well. So, they’re accurate on the given data but aren’t predictive on new data. To check for overfitting, we’ll perform cross-validation.

First, as in previous posts, we performed grid search on our hyperparameters. This means we tested lots of possible combinations for factors that determine how our models fit the data. Then, we selected the combination that resulted in the lowest MSE on different splits.
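Here is a minimal sketch of the grid search idea for one hyperparameter: the number of neighbors in a KNN regressor. To stay self-contained it uses a hand-written one-feature KNN and made-up data, so the grid, the data, and the function names are all assumptions rather than our actual setup.

```python
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mean_mse_over_splits(data, k, n_splits=4):
    """Average held-out MSE of a k-neighbor model over n_splits splits."""
    split_mse = []
    for i in range(n_splits):
        valid = data[i::n_splits]
        train = [pair for j, pair in enumerate(data) if j % n_splits != i]
        errors = [(y - knn_predict(train, x, k)) ** 2 for x, y in valid]
        split_mse.append(sum(errors) / len(errors))
    return sum(split_mse) / n_splits

# Hypothetical (points per game, percent of cap) pairs with some noise.
rng = random.Random(0)
data = [(ppg, 0.01 * ppg + rng.uniform(-0.03, 0.03)) for ppg in range(1, 41)]
rng.shuffle(data)

# Try every value in the grid and keep the one with the lowest average MSE.
grid = [1, 3, 5, 9, 15]
scores = {k: mean_mse_over_splits(data, k) for k in grid}
best_k = min(scores, key=scores.get)
print(best_k, round(scores[best_k], 5))
```

With several hyperparameters, the same loop runs over every combination of values instead of a single list.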

Additionally, we performed k-fold cross-validation. In k-fold cross-validation, we randomly split the data into k bins. The models receive k – 1 bins as training data, and then predict the one excluded bin. We repeat this process k times, so that each bin is excluded exactly once. Then, we average the performance across the bins. This gives us an estimate of how our models perform on different splits of the data. A cross-validated score close to our initial score indicates the model performs almost the same on the different splits. So, if the two scores are close, it’s unlikely the models are overfitting.
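The procedure can be sketched in a few lines. The stand-in model below just predicts the training mean, and the target values are hypothetical; both are assumptions that keep the example short.

```python
import random
import statistics

def k_fold_indices(n_samples, k, seed=0):
    """Randomly assign sample indices to k bins of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(xs, ys, k, fit, score):
    """Train on k - 1 bins, score on the excluded bin, once per bin."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model, [xs[i] for i in held_out],
                            [ys[i] for i in held_out]))
    return scores

def fit_mean(xs, ys):
    """Stand-in model: always predict the mean of the training targets."""
    return sum(ys) / len(ys)

def mse_score(model, xs, ys):
    return sum((y - model) ** 2 for y in ys) / len(ys)

# Hypothetical percent-of-cap targets; xs are unused by the stand-in model.
ys = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.25, 0.30, 0.35]
xs = list(range(len(ys)))
fold_scores = cross_validate(xs, ys, k=5, fit=fit_mean, score=mse_score)
mean_score = statistics.mean(fold_scores)
spread = 2 * statistics.stdev(fold_scores)
print(round(mean_score, 4), round(spread, 4))
```

The mean of the per-bin scores is the cross-validated score, and two standard deviations of those scores give a rough interval around it.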

The table below shows the cross-validation scores for r-squared and MSE, along with their 95% confidence intervals (2 standard deviations away from the mean).

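| Model | r-squared | 95% CI | MSE | 95% CI |
| --- | --- | --- | --- | --- |
| KNN | 0.507 | +/- 0.071 | 0.004 | +/- 0.001 |
| RF | 0.469 | +/- 0.150 | 0.004 | +/- 0.001 |
| GBR | 0.421 | +/- 0.281 | 0.004 | +/- 0.002 |
| XGB | 0.444 | +/- 0.199 | 0.004 | +/- 0.001 |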

Though the CV r-squared scores are lower, the actual r-squared scores are within the 95% confidence interval. Furthermore, the CV MSE is very close to the initial MSE. So, it’s unlikely the models are overfitting.

Standardized residuals

A big part of regression analysis depends on analyzing the residuals. A residual is the difference between the predicted and observed value at a point.

Residuals in strong models have two important characteristics. First, they follow the normal distribution. Second, they have no autocorrelation or trend. Both these characteristics show that the model isn’t repeating the same mistake.

First, we’ll look at the standardized residuals test. Ideally, 95% of a model’s standardized residuals fall within 2 standard deviations of the mean. Furthermore, the standardized residuals should have no noticeable trend.

The graph below shows the standardized residuals of the four models.

We see that only the KNN has 95% of its residuals within 2 standard deviations of the mean. The others are close to 95% (they’re all over 94.6%). This 95% is not a hard boundary; it exists because, in a normal distribution, 95% of data is within 2 standard deviations of the mean. So, the fact that close to 95% of the data is within 2 standard deviations is good.
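As a sketch of this check, the code below standardizes a set of residuals and measures the share that falls within 2 standard deviations. The observed and predicted values are hypothetical, and the residual is taken as observed minus predicted (the sign doesn’t change the result).

```python
import statistics

def standardized_residuals(observed, predicted):
    """Residuals rescaled to have mean 0 and standard deviation 1."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean = statistics.mean(residuals)
    sd = statistics.stdev(residuals)
    return [(r - mean) / sd for r in residuals]

def share_within(std_resid, limit=2.0):
    """Fraction of standardized residuals with absolute value <= limit."""
    return sum(abs(z) <= limit for z in std_resid) / len(std_resid)

# Hypothetical example: 19 small misses and one large miss.
misses = [0.01, -0.01] * 9 + [0.01, 0.2]
predicted = [0.1] * len(misses)
observed = [p + m for p, m in zip(predicted, misses)]

z_scores = standardized_residuals(observed, predicted)
share = share_within(z_scores)
print(share)
```

Only the one large miss lands outside the band, so 19 of the 20 residuals are within 2 standard deviations.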

The distributions of the standardized residuals differ from a normal distribution. We see that the residuals peak close to 0 far above the expected density for a normal distribution. Furthermore, there are some residuals far away from most of the data. So, our residuals are probably heavy-tailed and not normal.

To analyze this assumption, we’ll perform a Shapiro-Wilk test for normality. The test returns a p-value and a w-value (not important here). If the p-value is less than 0.05, we can reject the null hypothesis, which is that the data is normally distributed. So, if p < 0.05, the data is not normal. The table below shows the p-values of the Shapiro-Wilk test.

| Model | p-value |
| --- | --- |
| KNN | < 0.001 |
| RF | < 0.001 |
| GBR | < 0.001 |
| XGB | < 0.001 |

The p-value is small for all four models, so we can reject the null hypothesis. So, our standardized residuals are not normally distributed. This low p-value may be a result of a large sample size. To confirm this isn’t due to sample size, we’ll also look at a quantile-quantile (QQ) plot.

The QQ plot compares the theoretical quantiles of one distribution with the ordered values of another. If the two distributions are the same, their QQ plot will be a straight line. So, we will plot the residuals against a normal distribution (shown in red). The closer our points are to the red line, the better. The graph below shows each model’s QQ plot.

We see that the residuals stay close to the line for most of the middle values. At the upper and lower ends, the residuals differ a lot. This indicates the residuals have heavy tails like we thought before. So, we can confidently say the model’s residuals are not normal.
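For readers who want to see how the points of a QQ plot are built, here is a minimal sketch using only the standard library. The plotting positions `(i + 0.5) / n` are one common convention and an assumption here, and the residuals are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def qq_points(values):
    """Pair each ordered standardized value with its normal quantile."""
    m, s = mean(values), stdev(values)
    ordered = sorted((v - m) / s for v in values)
    n = len(ordered)
    normal = NormalDist()
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Hypothetical residuals: mostly small, with one large miss (a heavy tail).
residuals = [0.01, -0.01] * 9 + [0.01, 0.2]
points = qq_points(residuals)

# The largest residual sits far above where a normal distribution puts it.
top_theoretical, top_observed = points[-1]
print(round(top_theoretical, 2), round(top_observed, 2))
```

On a plot, each pair is one point. Points that bend away from the straight line at the ends are the sign of heavy tails.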

Now, we will test for autocorrelation. To do this, we’ll perform a Durbin-Watson test. The test returns a Durbin-Watson statistic between 0 and 4. Values close to 2 indicate no autocorrelation. Values close to 0 indicate positive autocorrelation. Values close to 4 indicate negative autocorrelation. The table below shows the results of the DW test.

| Model | DW statistic |
| --- | --- |

The DW statistic for each model is close to 2. So, there’s no autocorrelation in the residuals, which is promising.
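The Durbin-Watson statistic is simple to compute by hand: it’s the sum of squared differences between consecutive residuals, divided by the sum of squared residuals. The sketch below uses made-up residual sequences to show the two extremes.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(r ** 2 for r in residuals)
    return numerator / denominator

# Residuals that flip sign every step: negative autocorrelation (near 4).
alternating = [1, -1] * 5
# Residuals that stay on one side for long runs: positive autocorrelation (near 0).
persistent = [1, 1, 1, 1, -1, -1, -1, -1]

print(durbin_watson(alternating), durbin_watson(persistent))
```

Residuals with no pattern from one point to the next land near 2, between these two cases.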

Though our residuals are not normal, which is a problem, the models are still useful. The residuals have no autocorrelation – meaning the model is not repeating the same mistake – and the models have low error. Now that we’ve evaluated the models, we can see what they predict.


As mentioned earlier, rookie contracts are bargains. This is because their contracts depend on their draft position. Furthermore, they’re non-negotiable, as the rookie scale continues for 4 years.

Before diving into results, we’ll look at an example of how our models treat rookies. Last year, Luka Doncic put up 21.2 PPG, 7.8 RPG, and 6 APG on good efficiency at 19 years old. So, even though he’s a great player, his expected salary is low because he’s on a rookie contract.

Our models predicted Luka Doncic to earn 12% of the cap. This is higher than his actual percent of cap of about 6.5%. So, he’s still a bargain given his expected salary. However, a player putting up Doncic’s stats would earn far more than 12% of the cap on the open market. This shows that our models identify the effect of age.

Now, we’ll let the models predict what Luka Doncic would earn at 27 years old if he put up the same stats. Given Doncic was 19 his rookie year, he will start his rookie maximum contract when he’s 23. That contract will take up 25% of the cap with 8% annual raises.

The models predict Doncic will be worth 23% of the cap if he put up his rookie stats (which he’s already improved on this year) at 27 years old. This is close to what he’ll make depending on how fast the cap rises. Because of the large difference in salary depending on age, we know the models capture the effect of age. So, they won’t identify all rookie contracts as bargains, as they know the expectation for rookies.

Now that we understand this, we can examine the results.

We’ll look at both player-by-player and team-by-team differences between expected and actual percent of cap. Higher values mean the player is great value, as he’s earning below his expected salary. We can sum these across teams to see which teams overpay players and which teams get good value.

The graph below shows the top 10 best value contracts as decided by the average of our four models.

This list has some players we’d expect. Last season, Kemba earned only $12 million, a great value. Several of the players here signed a minimum contract or an exception like DeMarcus Cousins and Brook Lopez. Despite our earlier example of how the models filter out rookie contracts, there are still two rookie contract players here. But, they’re both older than typical rookie contract players. Last season, both Buddy Hield and Malcolm Brogdon were 26 years old. This is about as old as a player can get on their rookie contract unless they’re an international player. So, their age prevents their expected value from regressing to that of a rookie contract.

Now, let’s look at the worst value contracts.

This contains the usual suspects for the worst contracts. Before his improvement this year, Andrew Wiggins had one of the worst contracts in the league. Furthermore, Hayward’s injury recovery made him perform far below what you’d expect for a max player. We also have some less recent poison contracts, like Otto Porter, Chandler Parsons, and Ryan Anderson.

Let’s look at which teams handed out the best contracts last year. We do this by summing the difference in expected and actual salary for every player on the roster.
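As a sketch of the calculation, the code below averages the four model predictions, subtracts the actual percent of cap, and sums by team. The LeBron James and James Harden numbers come from the full results table; the third player is hypothetical and only there to show a team sum.

```python
from collections import defaultdict

def value_difference(model_predictions, actual_share):
    """Average predicted percent of cap minus actual percent of cap."""
    expected = sum(model_predictions) / len(model_predictions)
    return expected - actual_share

players = [
    # (player, team, four model predictions, actual percent of cap)
    ("LeBron James", "Lakers", [0.32, 0.34, 0.33, 0.37], 0.36),
    ("James Harden", "Rockets", [0.31, 0.27, 0.30, 0.28], 0.31),
    # Hypothetical teammate on a cheap deal, for illustration only.
    ("Hypothetical role player", "Lakers", [0.05, 0.06, 0.05, 0.04], 0.02),
]

team_totals = defaultdict(float)
for name, team, predictions, actual in players:
    difference = value_difference(predictions, actual)
    team_totals[team] += difference
    print(name, round(difference, 2))

print({team: round(total, 2) for team, total in team_totals.items()})
```

A positive team total means the roster as a whole earns less than the models expect.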

All the teams on this list have lots of good role players or young talent. Though the Lakers, Pelicans, and Bucks all had a max player (LeBron, Davis, Giannis), a max is great value for those players. Furthermore, the rest of the team had great value contracts. We see that the teams here made big moves over the summer, often resulting in improvement. The Lakers added Davis. The Pelicans added young talent. The Nets signed Kyrie and KD. The 76ers signed Horford. The Clippers signed Kawhi and traded for Paul George.

Now, let’s look at the teams with the worst value contracts.

OKC had the worst cumulative value difference by a large margin. Westbrook’s max and Steven Adams’ large contract contribute to this. We see that a lot of the teams here ended up tanking or in limbo. For example, the Thunder traded Westbrook and George. The struggling Pistons held onto Griffin and Drummond and now look even worse.

Though it’s bad to be a team in the above graph, for tanking teams, it can be good. Tanking teams often receive salary dumps, where they take on a bad contract in exchange for picks. So, they’ll have a negative difference, but it’s worth it because of the attached assets.

Let’s look at the distribution of our predicted salary relative to the actual salary distribution.

We see that the predicted distribution is a right-shift of the observed distribution. Minimum contracts and exceptions contribute to this, as good players often take discounts to play for contenders.

Full individual results

The table below shows the predicted salary for each player and the difference between expected and actual salary.

| Player | Salary | Percent of cap | Model predictions | Average prediction | Expected - actual |
| --- | --- | --- | --- | --- | --- |
| Stephen Curry | $37,457,154 | 0.38 | 0.32 |
| Chris Paul | $35,654,150 | 0.36 |
| Russell Westbrook | $35,654,150 | 0.36 |
| LeBron James | $35,654,150 | 0.36 | 0.32, 0.34, 0.33, 0.37 | 0.34 | -0.02 |
| Blake Griffin | $32,088,932 | 0.32 |
| Gordon Hayward | $31,214,295 | 0.32 |
| Kyle Lowry | $31,200,000 | 0.31 |
| Paul George | $30,560,700 | 0.31 |
| Mike Conley | $30,521,116 | 0.31 |
| James Harden | $30,431,854 | 0.31 | 0.31, 0.27, 0.30, 0.28 | 0.29 | -0.02 |
| Kevin Durant | $30,000,000 | 0.30 |
| Paul Millsap | $29,730,769 | 0.30 |
| Al Horford | $28,928,710 |
| Damian Lillard | $27,977,689 |
| DeMar DeRozan | $27,739,975 |
| Otto Porter | $26,011,913 |
| Jrue Holiday | $25,976,111 |
| CJ McCollum | $25,759,766 |
| Carmelo Anthony | $25,534,253 |
| Andrew Wiggins | $25,467,250 |
| Joel Embiid | $25,467,250 |
| Bradley Beal | $25,434,263 |
| Anthony Davis | $25,434,263 |
| Andre Drummond | $25,434,263 |
| Hassan Whiteside | $25,434,263 |
| Nikola Jokic | $24,605,181 |
| Steven Adams | $24,157,304 |
| Giannis Antetokounmpo | $24,157,304 | 0.24 | 0.27, 0.33, 0.28, 0.30 | 0.30 | 0.05 |
| Kevin Love | $24,119,025 |
| Marc Gasol | $24,119,025 |
| Chandler Parsons | $24,107,258 |
| Harrison Barnes | $24,107,258 |
| Nicolas Batum | $24,000,000 |
| Rudy Gobert | $23,241,573 |
| Kawhi Leonard | $23,114,067 |
| DeAndre Jordan | $22,900,000 |
| LaMarcus Aldridge | $22,347,015 | 0.23 | 0.30, 0.28, 0.33, 0.31 | 0.30 | 0.08 |
| Serge Ibaka | $21,666,667 |
| Aaron Gordon | $21,590,909 |
| Danilo Gallinari | $21,587,579 |
| Victor Oladipo | $21,000,000 |
| Jimmy Butler | $20,445,779 |
| Ryan Anderson | $20,421,546 |
| Kyrie Irving | $20,099,189 |
| Jabari Parker | $20,000,000 |
| Zach LaVine | $19,500,000 |
| Tyler Johnson | $19,245,370 |
| John Wall | $19,169,800 |
| George Hill | $19,000,000 |
| Jeff Teague | $19,000,000 |
| Klay Thompson | $18,988,725 |
| Dwight Howard | $18,919,725 |
| Enes Kanter | $18,622,514 |
| Wesley Matthews | $18,622,514 |
| Joakim Noah | $18,530,000 |
| Allen Crabbe | $18,500,000 |
| Goran Dragic | $18,109,175 |
| Kent Bazemore | $18,089,887 |
| Evan Turner | $17,868,852 |
| Draymond Green | $17,469,565 |
| Tristan Thompson | $17,469,565 |
| Tim Hardaway | $17,325,000 |
| Reggie Jackson | $17,043,478 |
| Evan Fournier | $17,000,000 |
| Bismack Biyombo | $17,000,000 |
| Derrick Favors | $16,900,000 |
| Jonas Valanciunas | $16,539,326 |
| Gary Harris | $16,517,857 |
| Andre Iguodala | $16,000,000 |
| Ian Mahinmi | $15,944,154 |
| Dennis Schroder | $15,500,000 |
| DeMarre Carroll | $15,400,000 |
| Clint Capela | $15,293,104 |
| Gorgui Dieng | $15,170,787 |
| Pau Gasol | $15,100,000 |
| Eric Bledsoe | $15,000,000 |
| Trevor Ariza | $15,000,000 |
| Ricky Rubio | $14,975,000 |
| Tobias Harris | $14,800,000 |
| J.R. Smith | $14,720,000 |
| James Johnson | $14,651,700 |
| Brandon Knight | $14,631,250 |
| Robin Lopez | $14,357,750 |
| Luol Deng | $14,354,067 |
| Marvin Williams | $14,087,500 |
| Taj Gibson | $14,000,000 |
| Jeremy Lin | $13,768,421 |
| Thaddeus Young | $13,764,045 |
| Kenneth Faried | $13,764,045 |
| Tyson Chandler | $13,585,000 |
| Marcin Gortat | $13,565,218 |
| Cody Zeller | $13,528,090 |
| Eric Gordon | $13,500,375 |
| Joe Ingles | $13,045,455 |
| Khris Middleton | $13,000,000 |
| Michael Kidd-Gilchrist | $13,000,000 |
| Mason Plumlee | $12,917,808 |
| Wilson Chandler | $12,800,562 |
| Nikola Vucevic | $12,750,000 |
| Jordan Clarkson | $12,500,000 |
| Miles Plumlee | $12,500,000 |
| Nikola Mirotic | $12,500,000 |
| Tyreke Evans | $12,400,000 |
| Courtney Lee | $12,253,780 |
| Solomon Hill | $12,252,928 |
| J.J. Redick | $12,250,000 |
| Kelly Olynyk | $12,137,527 |
| Austin Rivers | $12,000,000 |
| Avery Bradley | $12,000,000 |
| Kentavious Caldwell-Pope | $12,000,000 |
| Kemba Walker | $12,000,000 |
| Will Barton | $11,830,358 |
| T.J. Warren | $11,750,000 |
| Marcus Smart | $11,660,716 |
| Patty Mills | $11,571,429 |
| Dion Waiters | $11,550,000 |
| Alec Burks | $11,536,515 |
| John Henson | $11,327,466 |
| Jusuf Nurkic | $11,111,111 |
| Iman Shumpert | $11,011,234 |
| Maurice Harkless | $10,837,079 |
| Tony Snell | $10,607,143 |
| Meyers Leonard | $10,595,506 |
| Terrence Ross | $10,500,000 |
| Bojan Bogdanovic | $10,500,000 |
| Robert Covington | $10,464,092 |
| Rudy Gay | $10,087,200 |
| Jon Leuer | $10,002,681 |
| Danny Green | $10,000,000 |
| Darren Collison | $10,000,000 |
| Dwight Powell | $9,631,250 |
| Matthew Dellavedova | $9,607,500 |
| Dante Exum | $9,600,000 |
| Jared Dudley | $9,530,000 |
| Norman Powell | $9,367,200 |
| Josh Richardson | $9,367,200 |
| Bogdan Bogdanovic | $9,000,000 |
| Rajon Rondo | $9,000,000 |
| E'Twaun Moore | $8,808,685 |
| Kosta Koufos | $8,739,500 |
| Jerami Grant | $8,653,847 |
| Fred VanVleet | $8,653,847 |
| Kyle Anderson | $8,641,000 |
| Julius Randle | $8,641,000 |
| Markieff Morris | $8,600,000 |
| Jerryd Bayless | $8,575,916 |
| Cristiano Felicio | $8,470,980 |
| Markelle Fultz | $8,339,880 |
| Joe Harris | $8,333,333 |
| C.J. Miles | $8,333,333 |
| Shaun Livingston | $8,307,692 |
| Deandre Ayton | $8,165,160 |
| Lou Williams | $8,000,000 |
| Garrett Temple | $8,000,000 |
| P.J. Tucker | $7,969,537 |
| Cory Joseph | $7,945,000 |
| JaMychal Green | $7,866,667 |
| Karl-Anthony Towns | $7,839,435 |
| Kyle Korver | $7,560,000 |
| Jeremy Lamb | $7,488,372 |
| Lonzo Ball | $7,461,960 |
| Doug McDermott | $7,333,334 |
| Jae Crowder | $7,305,825 |
| Marvin Bagley | $7,305,600 |
| D.J. Augustin | $7,250,000 |
| Dewayne Dedmon | $7,200,000 |
| Lance Thomas | $7,119,650 |
| D'Angelo Russell | $7,019,698 |
| Langston Galloway | $7,000,000 |
| Davis Bertans | $7,000,000 |
| Ersan Ilyasova | $7,000,000 |
| Boban Marjanovic | $7,000,000 |
| Al-Farouq Aminu | $6,957,105 |
| Jayson Tatum | $6,700,800 |
| Luka Doncic | $6,560,640 |
| Nemanja Bjelica | $6,500,000 |
| Mario Hezonja | $6,500,000 |
| Ben Simmons | $6,434,520 |
| Wayne Ellington | $6,270,000 |
| Marco Belinelli | $6,153,846 |
| Wesley Johnson | $6,134,520 |
| Josh Jackson | $6,041,520 |
| Montrezl Harrell | $6,000,000 |
| Jonathon Simmons | $6,000,000 |
| Ish Smith | $6,000,000 |
| Jaren Jackson | $5,915,040 |
| Brandon Ingram | $5,757,120 |
| Anthony Tolliver | $5,750,000 |
| De'Aaron Fox | $5,470,920 |
| Ben McLemore | $5,460,000 |
| Alex Abrines | $5,455,236 |
| Patrick Patterson | $5,451,600 |
| Jason Smith | $5,450,000 |
| Marcus Morris | $5,375,000 |
| Trae Young | $5,356,440 |
| DeMarcus Cousins | $5,337,000 |
| Thabo Sefolosha | $5,250,000 |
| Aron Baynes | $5,193,600 |
| Jaylen Brown | $5,169,960 |
| Patrick Beverley | $5,027,028 |
| Tony Parker | $5,000,000 |
| Mike Muscala | $5,000,000 |
| Dirk Nowitzki | $5,000,000 |
| Jonathan Isaac | $4,969,080 |
| Mohamed Bamba | $4,865,040 |
| Willie Cauley-Stein | $4,696,875 |
| Dragan Bender | $4,661,280 |
| Ron Baker | $4,544,400 |
| Lauri Markkanen | $4,536,120 |
| Lance Stephenson | $4,449,000 |
| Kyle O'Quinn | $4,449,000 |
| Ed Davis | $4,449,000 |
| Wendell Carter | $4,441,200 |
| Justin Holiday | $4,384,616 |
| Alex Len | $4,350,000 |
| Mike Scott | $4,320,500 |
| Luc Mbah a Moute | $4,320,500 |
| Emmanuel Mudiay | $4,294,480 |
| Kris Dunn | $4,221,000 |
| Frank Ntilikina | $4,155,720 |
| Glenn Robinson | $4,075,000 |
| Collin Sexton | $4,068,600 |
| Stanley Johnson | $3,940,402 |
| Buddy Hield | $3,833,760 |
| Dennis Smith | $3,819,960 |
| Kevin Knox | $3,739,920 |
| J.J. Barea | $3,710,850 |
| Zach Collins | $3,628,920 |
| Frank Kaminsky | $3,627,842 |
| Mikal Bridges | $3,552,960 |
| Michael Beasley | $3,500,000 |
| Jamal Murray | $3,499,800 |
| Rodney Hood | $3,472,887 |
| Jodie Meeks | $3,454,500 |
| Justise Winslow | $3,448,926 |
| Malik Monk | $3,447,480 |
| Myles Turner | $3,410,284 |
| Allonzo Trier | $3,382,000 |
| Brook Lopez | $3,382,000 |
| Shai Gilgeous-Alexander | $3,375,360 |
| Trey Lyles | $3,364,249 |
| Ekpe Udoh | $3,360,000 |
| Devin Booker | $3,314,365 |
| Luke Kennard | $3,275,280 |
| Cameron Payne | $3,263,295 |
| Troy Daniels | $3,258,539 |
| Kelly Oubre | $3,208,630 |
| Miles Bridges | $3,206,640 |
| Marquese Chriss | $3,206,160 |
| Tomas Satoransky | $3,129,187 |
| Bryn Forbes | $3,125,000 |
| Donovan Mitchell | $3,111,480 |
| Terry Rozier | $3,050,390 |
| Jerome Robinson | $3,046,200 |
| Yogi Ferrell | $3,000,000 |
| Elfrid Payton | $3,000,000 |
| Bam Adebayo | $2,955,840 |
| Justin Jackson | $2,807,880 |
| Thon Maker | $2,799,720 |
| Seth Curry | $2,795,000 |
| Cedi Osman | $2,775,000 |
| Sam Dekker | $2,760,095 |
| Troy Brown | $2,749,080 |
| Guerschon Yabusele | $2,667,600 |
| Justin Patton | $2,667,600 |
| Domantas Sabonis | $2,659,800 |
| Jerian Grant | $2,639,314 |
| Zhaire Smith | $2,611,800 |
| Delon Wright | $2,536,898 |
| D.J. Wilson | $2,534,280 |
| Dario Saric | $2,526,840 |
| Taurean Waller-Prince | $2,526,840 |
| Justin Anderson | $2,516,048 |
| Dante Cunningham | $2,500,000 |
| Reggie Bullock | $2,500,000 |
| Bobby Portis | $2,494,346 |
| Donte DiVincenzo | $2,481,000 |
| Rondae Hollis-Jefferson | $2,470,357 |
| Tyus Jones | $2,444,053 |
| Jarell Martin | $2,416,222 |
| T.J. Leaf | $2,407,560 |
| Jeff Green | $2,393,887 |
| Amir Johnson | $2,393,887 |
| Raymond Felton | $2,393,887 |
| Dwyane Wade | $2,393,887 |
| Udonis Haslem | $2,393,887 |
| JaVale McGee | $2,393,887 |
| Gerald Green | $2,393,887 |
| Andrew Bogut | $2,393,887 |
| Zaza Pachulia | $2,393,887 |
| Devin Harris | $2,393,887 |
| Channing Frye | $2,393,887 |
| Vince Carter | $2,393,887 |
| Jamal Crawford | $2,393,887 |
| Lonnie Walker | $2,357,160 |
| John Collins | $2,299,080 |
| Larry Nance | $2,272,391 |
| Kevin Huerter | $2,250,960 |
| Harry Giles | $2,207,040 |
| Darius Miller | $2,205,000 |
| Derrick Rose | $2,176,260 |
| Omri Casspi | $2,176,260 |
| Quincy Pondexter | $2,165,481 |
| Jonas Jerebko | $2,165,481 |
| Greg Monroe | $2,165,481 |
| Josh Okogie | $2,160,720 |
| Raul Neto | $2,150,000 |
| Terrance Ferguson | $2,118,840 |
| Grayson Allen | $2,074,320 |
| Jarrett Allen | $2,034,120 |
| Isaiah Thomas | $2,029,463 |
| Shelvin Mack | $2,029,463 |
| Torrey Craig | $2,000,000 |
| Corey Brewer | $2,000,000 |
| Chandler Hutchison | $1,991,520 |
| OG Anunoby | $1,952,760 |
| Ante Zizic | $1,952,760 |
| Shabazz Napier | $1,942,422 |
| Aaron Holiday | $1,911,960 |
| Tyler Lydon | $1,874,640 |
| Henry Ellenson | $1,857,480 |
| Anfernee Simons | $1,835,520 |
| Trey Burke | $1,795,015 |
| Malik Beasley | $1,773,840 |
| Moritz Wagner | $1,762,080 |
| Nerlens Noel | $1,757,429 |
| Ian Clark | $1,757,429 |
| Caleb Swanigan | $1,740,000 |
| Furkan Korkmaz | $1,740,000 |
| Landry Shamet | $1,703,640 |
| Caris LeVert | $1,702,800 |
| Jonah Bolden | $1,690,000 |
| Kyle Kuzma | $1,689,840 |
| Tony Bradley | $1,679,520 |
| Derrick White | $1,667,160 |
| Spencer Dinwiddie | $1,656,092 |
| MarShon Brooks | $1,656,092 |
| Josh Hart | $1,655,160 |
| Robert Williams | $1,654,440 |
| Jacob Evans | $1,644,240 |
| Pat Connaughton | $1,641,000 |
| DeAndre' Bembry | $1,634,640 |
| James Ennis | $1,621,415 |
| Noah Vonleh | $1,621,415 |
| Tim Frazier | $1,621,415 |
| Nik Stauskas | $1,621,415 |
| Omari Spellman | $1,620,480 |
| Luke Kornet | $1,619,000 |
| Rodions Kurucs | $1,618,320 |
| Richaun Holmes | $1,600,520 |
| T.J. McConnell | $1,600,520 |
| Malachi Richardson | $1,569,360 |
| Jahlil Okafor | $1,567,007 |
| Kevon Looney | $1,567,007 |
| Salah Mejri | $1,567,007 |
| Pascal Siakam | $1,544,951 |
| Damian Jones | $1,544,951 |
| Deyonta Davis | $1,544,951 |
| Jake Layman | $1,544,951 |
| Cheick Diallo | $1,544,951 |
| Malcolm Brogdon | $1,544,951 |
| Rodney McGruder | $1,544,951 |
| Ivica Zubac | $1,544,951 |
| Wade Baldwin | $1,544,951 |
| Quinn Cook | $1,544,951 |
| Dorian Finney-Smith | $1,544,951 |
| Wayne Selden | $1,544,951 |
| Georges Niang | $1,512,601 |
| Christian Wood | $1,512,601 |
| Derrick Jones | $1,512,601 |
| David Nwaba | $1,512,601 |
| Treveon Graham | $1,512,601 |
| Sviatoslav Mykhailiuk | $1,487,694 |
| Mitchell Robinson | $1,485,440 |
| Jordan Bell | $1,378,252 |
| Frank Jackson | $1,378,242 |
| Thomas Bryant | $1,378,242 |
| Royce O'Neale | $1,378,242 |
| Frank Mason | $1,378,242 |
| Davon Reed | $1,378,242 |
| Wesley Iwundu | $1,378,242 |
| Khem Birch | $1,378,242 |
| Abdel Nader | $1,378,242 |
| Damyean Dotson | $1,378,242 |
| Sterling Brown | $1,378,242 |
| Dillon Brooks | $1,378,242 |
| Tyler Dorsey | $1,378,242 |
| Ivan Rabb | $1,378,242 |
| Jawun Evans | $1,378,242 |
| Sindarius Thornwell | $1,378,242 |
| Ike Anigbogu | $1,378,242 |
| Maxi Kleber | $1,378,242 |
| Dwayne Bacon | $1,378,242 |
| Semi Ojeleye | $1,378,242 |
| Daniel Theis | $1,378,242 |
| Monte Morris | $1,349,383 |
| Antonio Blakeney | $1,349,383 |
| Tyrone Wallace | $1,349,383 |
| Alfonzo McKinnie | $1,349,383 |
| Ryan Arcidiacono | $1,349,383 |
| Daniel Hamilton | $1,349,383 |
| Shaquille Harrison | $1,311,265 |
| Elie Okobo | $1,238,464 |
| Jalen Brunson | $1,230,000 |
| Michael Carter-Williams | $1,200,000 |
| Melvin Frazier | $1,050,000 |
| Isaac Bonga | $1,000,000 |
| Devonte' Graham | $988,464 |
| De'Anthony Melton | $949,000 |
| Chasson Randle | $869,094 |
| Gary Trent | $838,464 |
| Hamidou Diallo | $838,464 |
| Keita Bates-Diop | $838,464 |
| Jevon Carter | $838,464 |
| Bruce Brown | $838,464 |
| Jarred Vanderbilt | $838,464 |
| Khyri Thomas | $838,464 |
| Isaiah Briscoe | $838,464 |
| Chimezie Metu | $838,464 |
| Kenrich Williams | $838,464 |
| Alize Johnson | $838,464 |
| Isaiah Hartenstein | $838,464 |
| Ray Spalding | $838,464 |
| Ryan Broekhoff | $838,464 |
| Brad Wanamaker | $838,464 |
| Lorenzo Brown | $800,000 |
| Patrick McCaw | $786,000 |
| James Nunnally | $655,632 |
| Zhou Qi | $506,134 |
| Bruno Caboclo | $487,000 |
| Chris Boucher | $457,418 |
| Malcolm Miller | $457,418 |
| Isaiah Canaan | $456,733 |
| Edmond Sumner | $449,794 |
| Nick Young | $311,070 |
| Okaro White | $264,919 |
| Eric Moreland | $239,035 |
| Jaylen Adams | $236,854 |
| Quincy Acy | $213,949 |
| Andrew Harrison | $200,000 |
| Jimmer Fredette | $198,580 |
| Terrence Jones | $198,580 |
| Dairis Bertans | $194,220 |
| Troy Williams | $122,741 |
| Cameron Reynolds | $108,953 |
| Tyler Zeller | $106,974 |
| John Jenkins | $99,290 |
| Mitch Creek | $94,742 |
| B.J. Johnson | $94,742 |
| Demetrius Jackson | $92,857 |
| Chris Chiozza | $90,005 |
| Gary Payton | $85,458 |
| Andre Ingram | $76,236 |
| Kobi Simmons | $76,236 |
| Emanuel Terry | $47,371 |
| Dusty Hannahs | $47,371 |
| Scott Machado | $47,371 |
| Tahjere McCall | $47,371 |
| Jordan Sibert | $47,371 |

The table below shows the sum of salary difference for each team.

| Team | Expected - actual salary |
| --- | --- |

Interactive results

To help visualize the results, we created an R Shiny app. This lets you interactively compare players and teams. The link is:

Why do the models predict what they do?

Now that we’ve seen the performance and results of the models, we’ll look at what influences their predictions. To do this, we’ll use Shapley values. A Shapley value estimates a feature’s marginal contribution to a prediction, averaged over all possible combinations of the other features. This allows us to see what features are most important, and what values for these features create the most impact. For example, we know that a low value for age will result in a low predicted salary because of rookie contracts. But, is age a big predictor of salary in all cases? Or does it only matter for young players, after which other factors matter more?

Shapley values let us answer these questions. The four graphs below show the Shapley values for each feature in our four models. The y-axis sorts features by their importance (most important is on top). The x-axis shows the Shapley value, or the impact on model output. The color of each point shows the actual feature value. So, for example, older players have red “age” points, because their age is higher.
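To show what a Shapley value measures, here is a small sketch that computes exact Shapley values by brute force for a toy model with three features. The toy model, its coefficients, and the baseline player are all hypothetical, and "removing" a feature by swapping in a baseline value is one common convention that we assume here. It isn’t how the plots for our four models were produced; the player stats are Doncic’s rookie numbers from earlier.

```python
from itertools import permutations

def toy_model(age, points, rebounds):
    """Hypothetical salary model, not one of the four models in this post."""
    share = 0.02 + 0.01 * points + 0.004 * rebounds
    # Crude stand-in for the rookie scale: young players are capped.
    return min(share, 0.08) if age < 23 else share

def shapley_values(model, instance, baseline):
    """Average each feature's marginal contribution over all orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(**current)
        for name in order:
            current[name] = instance[name]  # switch this feature on
            value = model(**current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

instance = {"age": 19, "points": 21.2, "rebounds": 7.8}
baseline = {"age": 26, "points": 8.0, "rebounds": 4.0}  # hypothetical player
values = shapley_values(toy_model, instance, baseline)
print({name: round(v, 3) for name, v in values.items()})
```

The three values add up to the gap between the prediction for this player and the prediction for the baseline player. In this toy setup, age gets a negative value because the cap on young players wipes out most of what the scoring would otherwise add.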

All four models have points, age, and rebounds as their top 3 features in that order. Notice that the feature importance for the RF, GBR, and XGB follows the same order. Because all three of these models are tree-based models, they fit the data in similar ways. So, the same feature sets affect the models to a similar extent.

We see that rebounds don’t have much of an effect on model output unless the player recorded a lot of rebounds. This makes sense, as most top bigs will rack up rebounds. Meanwhile, guards and wings won’t record lots of rebounds, but that shouldn’t affect them.

The plots give more proof of the effect of rookie contracts and age. In each model, low values of age have the largest negative impact on model output. But, there’s an interesting trend with high values of age. We’d expect high values of age not to raise output, given that older players often earn less because they’re expected to decline. However, high values of age positively impact model outputs. This explains some of the odd results of the models, such as why LaMarcus Aldridge has a higher predicted salary than Giannis.


When offering players contracts, GMs try to give better players more money. By giving models basic indicators of player performance, we can almost pinpoint what a player should earn.

We could expand this to predict a player’s salary in year n given their performance in year n – 1. This allows us to predict how much a player should earn before they start their contract or pick their team. However, this analysis would be much less accurate than what we did here. It’s hard to predict how players will progress between seasons or how different schemes affect their production. Over one summer, so much changes. In the course of a single season, player performance is much easier to predict. So, this works because it’s a retrospective look at value, instead of a future prediction.
