The NFL Draft is coming up next month, and with it come a wide variety of questions from fans:
- Will my team get the long snapper it so desperately needs?
- Do I have enough beer in the fridge to make the NFL Draft interesting to watch?
- How can I pretend that the glazed look in my friends' eyes is genuine interest when I tell them about my Fantasy Football draft?
However, for the truest of NFL fans, other more in-depth questions often arise. Over the next several weeks we are going to dive into some more complex subjects.
Today's topic deals with NFL Draft Location Bias. In other words, do teams prefer to draft players close to or far away from home?
To begin answering this question, a data set was required that could show which colleges players were drafted from, as well as where the various college and pro teams are located. To build it, Google Maps was used to generate longitudes and latitudes for all pro and 1-A college teams, and draft data from the past ten years (2007-2016) was acquired from the web. The resulting data sets looked like this:
Once the raw data was acquired, a master distance table was calculated, giving the distance from every NFL team to every college team. This was important so that expected distances for each draft could be calculated. Luckily, a very smart guy named Josef de Mendoza y Rios created a very cool table to allow for these very calculations. Even better, in the late '80s there was a woman named Jenny Excel who created a software package that now makes using Joe's table a breeze. (Jenny may be made up, but the software does indeed exist!)
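For readers who would rather skip the spreadsheet, here is a minimal Python sketch of how such a distance table could be built. It assumes the "table" in question is the haversine formula, which is the one usually associated with Mendoza y Rios; the post itself does not name it. The coordinates below are rough, illustrative values rather than the ones used in the analysis, and `haversine_miles` and `build_distance_table` are hypothetical helper names.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


# Approximate, illustrative coordinates (not the post's actual data set).
NFL_TEAMS = {"Cleveland Browns": (41.51, -81.70)}
COLLEGES = {"Hawai'i": (21.30, -157.82), "Ohio State": (40.00, -83.02)}


def build_distance_table(teams, colleges):
    """Return {(team, college): miles} for every team/college pair."""
    return {
        (team, college): haversine_miles(*team_pos, *college_pos)
        for team, team_pos in teams.items()
        for college, college_pos in colleges.items()
    }


DISTANCE_TABLE = build_distance_table(NFL_TEAMS, COLLEGES)
```

With real coordinates for all 32 teams and every 1-A school, the same function would produce the full master table in one pass.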
After compiling the distance table, an expected distance table was created. For each team, this was done by summing up the distances from that team to the college of each player drafted and dividing by the total number of players. This process was repeated for each year's draft.
After completing the expected distance table, the actual draft distances for each team by year were calculated and the data set was loaded into Tableau.
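The expected and actual distances described above can be sketched in a few lines of Python. This is one reading of the method, not the author's actual workbook: it assumes "expected" means the average distance from a team to the college of every player drafted that year by any team, and "actual" means the average distance to that team's own picks. The team names, colleges, mileages, and picks are all made up for illustration, and the sketch assumes every team makes at least one pick in each year it is evaluated.

```python
from statistics import mean

# Hypothetical distance table: miles from each team to each college.
DISTANCE = {
    ("TEAM_A", "COLLEGE_X"): 100, ("TEAM_A", "COLLEGE_Y"): 500, ("TEAM_A", "COLLEGE_Z"): 4500,
    ("TEAM_B", "COLLEGE_X"): 500, ("TEAM_B", "COLLEGE_Y"): 100, ("TEAM_B", "COLLEGE_Z"): 5000,
}

# Hypothetical draft picks: (year, drafting team, player's college).
PICKS = [
    (2015, "TEAM_A", "COLLEGE_Y"),
    (2015, "TEAM_B", "COLLEGE_X"),
    (2016, "TEAM_A", "COLLEGE_Z"),
    (2016, "TEAM_A", "COLLEGE_X"),
    (2016, "TEAM_B", "COLLEGE_Y"),
    (2016, "TEAM_B", "COLLEGE_X"),
]


def expected_distance(team, year, picks, distance):
    """Mean distance from `team` to the college of every player drafted that year."""
    return mean(distance[(team, college)] for (y, _, college) in picks if y == year)


def actual_distance(team, year, picks, distance):
    """Mean distance from `team` to the colleges of its own picks that year."""
    return mean(
        distance[(team, college)]
        for (y, t, college) in picks
        if y == year and t == team
    )


def average_deviation(team, picks, distance):
    """Average of (actual - expected) over the years in which `team` drafted."""
    years = sorted({y for (y, t, _) in picks if t == team})
    return mean(
        actual_distance(team, y, picks, distance)
        - expected_distance(team, y, picks, distance)
        for y in years
    )
```

In this toy data, TEAM_A ends up well above zero because of its single pick from the far-away COLLEGE_Z, which is the same kind of pull a long-distance pick exerts on a real team's average.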
To utilize the data, the graph below was created to show the average deviation from expected distance by team over the ten year period.
The data seemed to be fairly random, but one team in particular stood out above all the rest: the Cleveland Browns, whose deviation was almost double that of any other NFL team (~320 miles farther than expected; the next most extreme team was the New York Jets, at ~175 miles closer than expected).
As an Ohio native and Cleveland Browns fan, I found this data quite startling. After digging a little deeper, I noticed that several Browns players were drafted from Hawai'i. Could those outliers be the cause of the trend? To find out, I removed the Hawai'i players from the data set, re-calculated the expected distances, and re-created the chart.
The removal of Hawai'i helps the Browns come closer to the mean, but they are clearly still an outlier. Why? To dig deeper, I created a map of where the Browns players were drafted from (without Hawai'i).
Per the map above, the Browns drafted very few Midwest players and many from the South and West (not quite neighbors of Cleveland...)
Contrast this with the other end of the spectrum, the New York Jets.
The Jets have a large quantity of their draftees coming from the Midwest, the Mason-Dixon line region, and Mid-Atlantic region, all fairly close to New York City. Not to mention zero players from Florida, Washington, or Oregon, and only a few from California.
Now let's take a look at a team that is exceptionally close to its expected distance (~14 miles farther than expected), the Tennessee Titans.
As you can see, the Titans have a smattering of players all across the US, and do not seem to favor any specific region.
To put a bow on this discussion, let's take a look at some statistics. The graph below shows the 95% confidence interval for the median of the population.
As we can see, many teams fall outside of the confidence interval, with the Browns and Jets leading the way. But is this a statistically significant relationship between teams and distance? To find out, I plotted each individual data point and looked at confidence intervals for the average values, which the Tableau Analytics pane makes very easy!
As you can see, the majority of the teams fall within the confidence bands of each other, meaning there is not a statistically significant difference in the averages (this is in effect an ANOVA statistical test, just done at a rudimentary level). However, the Cleveland Browns do fall outside of the typical range. But when we look closely, there are quite a few outliers. Might a median test work better? Let's try!
The median comparison tells a bit of a different story. The Browns have come back to the pack (although still on the high end), while other teams have moved up substantially because most players are not located near them (four of the top five medians belong to West Coast teams, with the Browns being the fifth). In fact, when we look at the five West Coast teams (the Rams were not included, since a large percentage of their data set comes from their days in St. Louis), we see that all five median values are above the expected distance (0).
So where does this all leave us? Without using some more rigorous statistical tests not available in Tableau (Mood's Median Comparison, ANOVA) it is difficult to quantify exact relationships, but our Tableau analysis can lead us to a few definitive answers:
- 30 of 32 NFL Teams' average deviation from expected distance can't be statistically shown to be greater than or less than zero with 95% confidence (Browns and Ravens being the exceptions).
- 17 of 32 NFL Teams' median deviation from expected distance can't be statistically shown to be greater than or less than zero with 95% confidence. However, a noticeable geographic effect is evident, as all West Coast Teams were skewed to one side, and only 8 teams had a median above 0, 5 of which were West Coast-based.
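The "can it be shown to differ from zero with 95% confidence?" check behind the two bullets above can be approximated outside Tableau. The sketch below swaps in a percentile bootstrap, which is not the method Tableau's Analytics pane uses, and the deviation values are invented for illustration. `bootstrap_ci` and `differs_from_zero` are hypothetical helper names.

```python
import random
from statistics import mean, median


def bootstrap_ci(values, stat=mean, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile-bootstrap confidence interval for `stat` of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    stats = sorted(stat(rng.choices(values, k=n)) for _ in range(n_resamples))
    alpha = (1 - confidence) / 2
    low = stats[int(alpha * n_resamples)]
    high = stats[int((1 - alpha) * n_resamples) - 1]
    return low, high


def differs_from_zero(values, stat=mean):
    """True if the bootstrap interval for `stat` lies entirely above or below 0."""
    low, high = bootstrap_ci(values, stat=stat)
    return low > 0 or high < 0


# Made-up per-pick deviations (miles from expected) for two hypothetical teams.
far_team = [300, 320, 310, 290, 305, 315, 295, 325]
typical_team = [-100, 100, -50, 50, -20, 20, 0, 10, -10]
```

Calling `differs_from_zero(far_team)` and `differs_from_zero(typical_team, stat=median)` mirrors the average and median checks, respectively. For the more rigorous tests named above, SciPy provides `scipy.stats.median_test` (Mood's median test) and `scipy.stats.f_oneway` (one-way ANOVA).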
To wrap it all up, there are three major conclusions to take from this data.
- There does not appear to be a correlation for most NFL teams between average expected distance and actual distance from their draft picks' schools to the NFL teams themselves.
- There may be a relationship for median distances, but the lack of players in the middle of the country creates bimodal distributions, adversely affecting West Coast teams.
- Never underestimate the ability of the Cleveland Browns to do their own thing, and to do it stupidly.
That's all for now. We'll be back with another NFL Draft question next week!
Author: Chris Bick