Redrafts are hard. Unlike a simple Rookie of the Year ranking, a redraft makes you weigh several kinds of information. How good was the player at the time of the draft? How good have they been this season? And, perhaps toughest of all, how does one balance those two things? One way to answer that is to look at it objectively: plug in data about the pre-draft credentials and rookie performance of players from the past, then use a technique known as multiple regression to see which factors most affected future performance. I thought it would work pretty simply. However, a bunch of little things snowballed into the very strange redraft you will see later. Here’s what went wrong, and why.
_____________
The Process
To create the redraft, I first had to set up a bunch of example cases for the computer to reference. I ended up deciding to use draft slot as a measure of where players stood entering their rookie season. Then, I used Basketball Index's LEBRON statistic to get an idea of players' rookie performance. Finally, I took these players' LEBRON again, but this time for their sophomore campaigns through their fourth season, when their rookie contract ended. The rookie contract is often a point of reference for deciding draft value in the NFL because after that point, players should theoretically be compensated at the level of their talent, and therefore the value surplus is gone. In the NBA, maximum contracts make that concept shakier, but it still works as a solid reference point.
Using publicly available data from Basketball-Reference and Basketball Index, I set up each reference case with the three pieces of data mentioned earlier. The computer could then calculate the relationship between the two independent variables (draft pick and rookie LEBRON) and the dependent one (LEBRON years 2-4). It used this to make a formula that could estimate rookie contract* LEBRON from rookie LEBRON and draft pick. Plugging in those two values for each 2020 rookie, the computer could then attempt a redraft by estimating the rookie contract LEBRON of each player.
*I'll be using rookie contract here to represent years two through four. That excludes the rookie season itself even though it's a part of the player's rookie contract.
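For readers who want to see the mechanics, here is a minimal sketch of that kind of fit in Python with NumPy. It is not the tool or the settings used for this article, and every number in it is made up for illustration. It assumes years 2-4 are summarized as a single LEBRON figure per player, and the names history, fit_redraft_model, predict_contract_lebron and the "Player A/B/C" rookies are all hypothetical.

```python
import numpy as np

# Hypothetical example cases (NOT real data). Columns:
# draft pick, rookie-season LEBRON, LEBRON for years 2-4 (one combined figure).
history = np.array([
    [1.0,   0.5,  2.0],
    [3.0,  -1.0,  0.8],
    [10.0, -0.5,  0.1],
    [20.0, -1.5, -1.0],
    [30.0,  0.2,  0.3],
    [45.0, -2.0, -1.8],
])


def fit_redraft_model(rows):
    """Ordinary least squares fit of
    contract_lebron ~ b0 + b1 * pick + b2 * rookie_lebron.
    Returns the coefficients as [b0, b1, b2]."""
    rows = np.asarray(rows, dtype=float)
    # Design matrix: a column of ones for the intercept, then the two predictors.
    X = np.column_stack([np.ones(len(rows)), rows[:, 0], rows[:, 1]])
    y = rows[:, 2]
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs


def predict_contract_lebron(coefs, pick, rookie_lebron):
    """Apply the fitted formula to one player."""
    b0, b1, b2 = coefs
    return b0 + b1 * pick + b2 * rookie_lebron


coefs = fit_redraft_model(history)

# Hypothetical current rookies: name -> (draft pick, rookie LEBRON).
rookies = {
    "Player A": (2, -0.8),
    "Player B": (21, 0.6),
    "Player C": (12, -1.9),
}

# The "redraft" is just the rookies sorted by projected years 2-4 LEBRON.
redraft = sorted(
    rookies,
    key=lambda name: predict_contract_lebron(coefs, *rookies[name]),
    reverse=True,
)

for slot, name in enumerate(redraft, start=1):
    projected = predict_contract_lebron(coefs, *rookies[name])
    print(f"{slot}. {name}: projected LEBRON {projected:.2f}")
```

The sketch uses raw draft pick as a predictor, which treats the gap between picks 1 and 2 the same as the gap between picks 41 and 42. Whether that is reasonable is exactly the sort of choice that can quietly change the results.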
This is where I started to have no idea what I was doing. While my research told me that multiple regression was the easiest and best way to let the computer parse out this relationship, I have no experience with multiple regression. Luckily, though, the Internet does. I ended up using a site called Statistics Kingdom that let me upload my raw data directly from Excel. I left everything on the default settings (which may have been a mistake, but I wouldn't be able to tell), and plugged in my data. After a couple of iterations that involved adjusting for the non-linearity of draft talent and testing different variables like age and minutes played, I had, well, this: