Saturday, February 16, 2019

What makes athletes popular? A sentiment regression analysis

By Michael Patterson and Matt Goldman
(Standard disclaimer: The analyses contained here were done on personal time, and do not reflect the views of our employers.)


The 2018 Cleveland Cavs made the NBA finals while generating some of the hottest memes of 2018 ("We got an [expletive] squad now," "He boomed me."). Following the Cavs on Reddit, I (Mike) noticed something odd. Turkish rookie Cedi Osman was a particular fan favourite. Cedi played limited minutes with energy, and everyone joked that Cedi was the "GOAT" (Greatest Of All Time) carrying LeBron. In contrast, Tristan Thompson, a hero of the 2016 season, had an off year, and was the center of a meme about being traded ("Shump, TT, and the Nets pick"). For the Cavs at least, it seemed commenters gave the white players an easier time. And being a data scientist, I thought, "I could measure that!"
In this project, we used sentiment analysis and regression models to measure how performance, demographics, and race systematically predict sentiment towards players and coaches on the r/NBA and r/NFL subreddits. In doing so, we get a useful window into opinion formation in the online communities that increasingly dominate social and political discourse. Movements as varied as Black Lives Matter, #metoo and r/the_donald have leveraged online communities to connect disparate supporters around a common set of values and ideas. Social commentators have expressed concerns that herd mentalities and out-group biases can lead to the formation of distorted opinions in these settings, but it is hard to study such bias due to confounding factors. Studying professional athletes offers key advantages: we can measure sentiment associated with a large sample of athletes of varying race; and there are objective performance metrics.
This blog post includes a brief overview of our methodology and results. For details of the analysis, we have written a series of three Jupyter notebooks (linked below). We find that:
  • NBA Players
    • Reddit commenters like players who perform well; in the NBA, scoring 1 more PPG is worth approximately 0.02 standard deviations of sentiment
    • Commenters particularly like both young players (each year below the mean age of 26.7 is worth ~2.5 PPG), and old players (1.5 PPG for each year above 26.7)
    • The coefficient for race was overall not statistically significant (t <= 1.76)
    • In the NBA, scoring points for white players is worth ~3x as much as for black players
    • Commenters from “blue” cities (cities that supported Hillary Clinton in the 2016 general election) had higher sentiment towards NBA players, but this effect was smaller than 1 PPG.
  • NBA coaches
    • The overall sentiment towards coaches was lower than that towards players (mean and median of 0.1 vs 0.13 for players)
    • One win above expectation was worth 0.06 standard deviations
    • There is a significant bias against black coaches in the NBA, worth ~10 wins
  • NFL
    • We measured performance in the NFL using Football Outsiders' DVOA statistic; one point of DVOA was worth 0.02 standard deviations of sentiment
    • We did not detect any effect of race in the NFL, either for players or coaches

Sentiment Modeling



To quantify how redditors felt towards players, we used a natural language processing (NLP) technique called sentiment analysis. “Sentiment” is just a jargon-y way of saying whether someone is liked or disliked. The sentiment analysis technique we used (VADER) sums the positive and negative sentiment of the words in a sentence, and then normalizes them for an overall score. To tie these sentiment scores to players, we used a technique called named entity recognition. For details of the NLP, see this notebook.
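To make the "sum, then normalize" idea concrete, here is a minimal sketch of a lexicon-based scorer. It is not the VADER library itself: the word scores in `TOY_LEXICON` are made up for illustration, and the sketch ignores negation, intensifiers, and punctuation, which VADER handles. The squashing function follows the form we understand VADER to use for its compound score (with `alpha=15`), so treat that detail as an assumption.

```python
import math

# Hypothetical mini-lexicon with made-up valences; the real VADER lexicon
# contains thousands of human-rated words.
TOY_LEXICON = {
    "great": 3.0, "love": 3.0, "good": 2.0,
    "bad": -2.5, "dirty": -1.5, "awful": -2.0,
}

def normalize(raw_score, alpha=15.0):
    """Squash an unbounded sum of word valences into the open interval (-1, 1)."""
    return raw_score / math.sqrt(raw_score * raw_score + alpha)

def toy_compound(sentence):
    """Sum the valence of every known word, then normalize the total."""
    words = [w.strip(".,!?\"'").lower() for w in sentence.split()]
    raw = sum(TOY_LEXICON.get(w, 0.0) for w in words)
    return normalize(raw)

positive = toy_compound("Cedi is great")           # above 0
negative = toy_compound("That was a dirty play.")  # below 0
neutral = toy_compound("He played tonight")        # exactly 0.0, no known words
```

Because of the normalization, stacking many positive words pushes the score towards 1 without ever reaching it, which is why scores stay within the -1 to 1 range.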


Using these techniques, we analyzed over 2.5 million reddit comments about the NBA and NFL from the 2013-2018 seasons (see here for how to scrape reddit). Since sentiment towards players can change over time, we calculated sentiment on a yearly basis. To get the sentiment towards a player in a year, we first calculated the average sentiment towards each player from each commenter, then averaged over all commenters (a mean-of-means).
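The mean-of-means step can be sketched in a few lines of plain Python. The tuple layout and the function name `mean_of_means` are assumptions for illustration, and the rows are made up; the point is that a prolific commenter counts once per player-year rather than once per comment.

```python
from collections import defaultdict
from statistics import mean

def mean_of_means(rows):
    """rows: iterable of (player, year, commenter, sentiment), one per comment."""
    # Step 1: average each commenter's comments about a player in a year.
    per_commenter = defaultdict(list)
    for player, year, commenter, score in rows:
        per_commenter[(player, year, commenter)].append(score)

    # Step 2: average those per-commenter means across commenters.
    per_player_year = defaultdict(list)
    for (player, year, _commenter), scores in per_commenter.items():
        per_player_year[(player, year)].append(mean(scores))

    return {key: mean(vals) for key, vals in per_player_year.items()}

# Made-up comments: user1 posts three times, user2 once.
rows = [
    ("Player A", 2017, "user1", 0.5),
    ("Player A", 2017, "user1", 0.1),
    ("Player A", 2017, "user1", 0.3),
    ("Player A", 2017, "user2", -0.1),
]
result = mean_of_means(rows)
```

In this toy data the plain per-comment average is about 0.2, while the mean-of-means is about 0.1, because user1's three comments collapse into a single vote.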


Using this sentiment model, scores generally range from -1 to 1, with 0 being neutral. The mean sentiment across all players was slightly positive, 0.13. We can check the results of our sentiment analysis by looking at the highest and lowest sentiment player-years:


| Lowest sentiment: Player | Year | Avg Sentiment | Highest sentiment: Player | Year | Avg Sentiment |
|---|---|---|---|---|---|
| Mike Dunleavy | 2016 | -0.11 | Brandon Ingram | 2015 | 0.27 |
| Kelly Olynyk | 2016 | -0.08 | Karl-Anthony Towns | 2015 | 0.26 |
| Steve Blake | 2015 | -0.07 | Marc Gasol | 2014 | 0.25 |
| Zaza Pachulia | 2017 | -0.07 | Gordon Hayward | 2014 | 0.24 |


In general, these sentiment values pass the sniff test. Players with a reputation for dirty play, like Kelly Olynyk and Zaza Pachulia, each received their low score in the season immediately following an incident where they injured a high-profile player; Brandon Ingram and Karl-Anthony Towns in 2015 were young players with potential, and Marc Gasol is the franchise player of Memphis. For a full table of player sentiment, see this .tsv, where the column ‘compound_mean_mean’ represents player sentiment.

Fig 1 (left panel) plots a histogram of this mean sentiment score across white and black players. Overall, the distributions are similar (unpaired t-test, p=0.07), with a standard deviation of 0.053 (calculated on players having at least 200 commenters). However, this need not be the whole story. White and black players differ on many other characteristics that may also determine sentiment. For example, Fig. 1 (right panel) plots player age versus average sentiment score. Here we can see that both young and old players are more liked than NBA middle-age players. In order to make useful statements about the role of race in determining sentiment, we need to consider how other player characteristics can confound and mediate such a relationship.
Fig. 1: Graphs exploring sentiment distributions. Left panel: Histogram of sentiment towards white and black players. Overall the distributions are similar. Right panel: Average sentiment towards players for each age. Young and older players are more popular than average.
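For readers who want to reproduce the group comparison, here is a minimal sketch of an unpaired t-statistic. It uses Welch's form, which does not assume equal variances; the post does not say which variant was used, so that choice is an assumption, and the sentiment values below are made up.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Unpaired t-statistic for two samples, allowing unequal variances."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

# Made-up per-player mean sentiment scores for two groups.
group_one = [0.15, 0.12, 0.18, 0.10, 0.14]
group_two = [0.13, 0.11, 0.16, 0.09, 0.12]
t_stat = welch_t(group_one, group_two)
```

Turning the statistic into a p-value requires the t distribution, which a statistics package would supply.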

Regression Analysis: NBA Players



The above graphs are informative, but not conclusive, since many factors can be correlated with each other, and we can’t make causal inferences. For example, young players might be popular because they are full of potential, or they might be popular because they are underpaid. To disentangle these effects, we can use multi-variate regression analysis, where we consider all of these factors simultaneously. Rather than analyze data at the player-year level, we can analyze it at the player-user-year level to gain more samples.


In our regression analysis, we set our target variable to be the average player sentiment from a commenter in a year. We start our analysis with simple models, and gradually add more and more covariates (features). Starting with a simple regression using PPG as a covariate, we find that the PPG coefficient is significant, and 1 PPG is worth about 0.01 standard deviations of sentiment (0.0007 compared to the standard deviation of 0.053). In this regression, we also included the covariate of minutes played, which was not significant.



| Coefficient (t-statistic) by specification | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| Intercept | 0.07 | 0.067 | 0.054 | 0.0853 | 0.08 |
| PPG | 0.0007 (1.985) | 0.0014 (3.5) | 0.0012 (2.3) | 0.0007 (1.91) | 0.001 (2.2) |
| Rookie | | 0.020 (5.4) | 0.021 (4.7) | | 0.022 (4.8) |
| Youth | | 0.0026 (3.5) | 0.0028 (3.6) | | 0.0026 (3.2) |
| Oldness | | 0.0014 (2.04) | 0.0014 (1.9) | | 0.0012 (1.7) |
| White Player (race) | | | 0.0004 (0.08) | 0.0095 (1.76) | 0.008 (1.6) |
| White Player X PPG | | | | 0.0022 (2.6) | 0.0016 (2.6) |
| Commenter City “Blueness” | | | | | 0.0096 (3.2) |
| Blue Commenter X White Player | | | | | -0.005 (-0.67) |
All stats were downloaded from basketball-reference.com. Youth (oldness) defined as years below (above) the average NBA age (26.7 years). Regression was done at commenter-player-year level, weighted with square root of comment count, and with clustered errors at player level. For details, see future notebook.
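The weighting described in the note above can be illustrated with a one-covariate weighted least squares fit. This is a simplified sketch, not the regression we ran: it uses a single covariate, made-up data, and a hypothetical function name, and it leaves out the player-level clustered standard errors, which need a statistics package.

```python
import math

def weighted_simple_regression(x, y, comment_counts):
    """Fit y = intercept + slope * x, weighting each row by sqrt(comment count)."""
    w = [math.sqrt(n) for n in comment_counts]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up commenter-player-year rows: PPG, mean sentiment, number of comments.
ppg = [5.0, 10.0, 20.0, 25.0]
sentiment = [0.075, 0.080, 0.090, 0.095]
counts = [1, 4, 9, 16]
intercept, slope = weighted_simple_regression(ppg, sentiment, counts)
```

Using the square root of the comment count gives frequent commenters more influence than one-off commenters, but less than weighting by the raw count would.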


After this initial regression, we start to increase complexity. For the full list of specifications that we used, please see this Google spreadsheet. The next bit of complexity we added was more performance variables and simple demographics (specification (2)). Here we found, surprisingly, that no other performance variable was significant for sentiment. For age, we found that commenters preferred both young and old players, and rookies most of all. Being a rookie was worth ~14 PPG, one year of youth was worth ~2-2.5 PPG (0.0026 vs the PPG coefficient of 0.0014 in this specification), and one year of oldness was worth ~1 PPG (0.0014 for each). This might be explained by the potential of youth, and by survivorship bias among players who get older.
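The "worth N PPG" conversions are just ratios of coefficients. A small sketch, using the specification (2) coefficients from the regression table and the 0.053 standard deviation quoted earlier (the helper names are ours, for illustration only):

```python
# Coefficients copied from specification (2) of the player regression table.
COEF = {"ppg": 0.0014, "rookie": 0.020, "youth": 0.0026, "oldness": 0.0014}
SENTIMENT_SD = 0.053  # standard deviation of player sentiment

def ppg_equivalent(coef, ppg_coef=COEF["ppg"]):
    """Express a coefficient as the number of PPG with the same effect."""
    return coef / ppg_coef

def in_sd_units(coef, sd=SENTIMENT_SD):
    """Express a coefficient in standard deviations of sentiment."""
    return coef / sd

rookie_ppg = ppg_equivalent(COEF["rookie"])    # roughly 14
youth_ppg = ppg_equivalent(COEF["youth"])      # a little under 2
oldness_ppg = ppg_equivalent(COEF["oldness"])  # 1
ppg_sd = in_sd_units(COEF["ppg"])              # roughly 0.026
```

The exact ratios shift from one specification to the next because the PPG coefficient itself moves, so these equivalences are best read as rough orders of magnitude.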


In spec (3), we add race as a covariate, whose coefficient was not significant. However, the confidence interval on this effect is fairly wide, and we can only conclude that race is less important than 3 PPG. In specs (4) and (5), however, we see that white players received 2-3x the benefit from scoring that black players did. In fact, in specification (4) the PPG coefficient for black players alone dips below statistical significance.


All the previous coefficients were measured at the player level, but there may also be bias at the commenter level. To measure this, we took each user’s flair, and assigned it to a city (on Reddit, users can express affiliation with a team, e.g. “[CLE] Cedi Osman”). We found that when a user has flair for a city where Clinton's vote share was higher, they had a higher sentiment towards players. (To check whether we could detect changes in politicization over time, we interacted year with Clinton vote share, but did not find significant coefficients.) Overall, these results are in line with research that shows Democrats have higher favorability towards the NBA and NFL than Republicans do. Interpreting this coefficient, however, is difficult, as it is correlated with many other characteristics of a city.
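The flair-to-city step can be sketched as a regular expression plus a lookup table. This assumes flair text always starts with a bracketed team abbreviation, as in the “[CLE] Cedi Osman” example; the lookup below is a hypothetical three-team excerpt, and joining each city to its Clinton vote share is left out.

```python
import re

# Hypothetical excerpt; a full version would cover every team abbreviation.
TEAM_TO_CITY = {"CLE": "Cleveland", "BOS": "Boston", "MIA": "Miami"}

# Assumed flair format: a bracketed 2-3 letter team code at the start.
FLAIR_PATTERN = re.compile(r"^\[([A-Z]{2,3})\]")

def flair_to_city(flair):
    """Return the city for a flair string, or None if it cannot be resolved."""
    match = FLAIR_PATTERN.match(flair or "")
    if match is None:
        return None
    return TEAM_TO_CITY.get(match.group(1))

city = flair_to_city("[CLE] Cedi Osman")  # "Cleveland"
```

Commenters with no flair, or with flair that does not map to a team, come back as `None` and would simply drop out of the commenter-level analysis.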

Regression analysis: NBA coaches



Using the same techniques, we can quantify sentiment towards coaches. Here are the most liked and disliked coach-seasons since 2013. This list looks reasonable, as popular coaches like Brad Stevens and Steve Kerr are at the top. Overall, the mean and median sentiment towards coaches was 0.1. For the full table, see this .tsv.


| Highest sentiment: Coach | Year | Avg Sentiment | Lowest sentiment: Coach | Year | Avg Sentiment |
|---|---|---|---|---|---|
| Brad Stevens (BOS) | 2014-2015 | 0.3 | George Karl (SAC) | 2015-2016 | -0.15 |
| Erik Spoelstra (MIA) | 2014-2015 | 0.26 | Earl Watson (PHX) | 2017-2018 | -0.1 |
| Brad Stevens (BOS) | 2017-2018 | 0.24 | Kurt Rambis (NYK) | 2015-2016 | -0.1 |
| Steve Kerr (GSW) | 2014-2015 | 0.23 | Fred Hoiberg (CHI) | 2015-2016 | -0.06 |


As before, we can use regression analysis to understand what factors influence coach sentiment. However, the coach analysis has less power than the player analysis for a few reasons: 1.) There is only one coach per team, limiting the sample size, and increasing the influence of outlier coaches. 2.) People talk less about coaches, making estimates of coach sentiment less reliable. 3.) Coaches have fewer covariates compared to players, increasing the chance of omitted variable bias; for example, coaches may be well liked for interviews, which we don’t quantify here.


We can start with the simplest regression, predicting sentiment using variables based on wins: the raw win percentage in a season; career win percentage for the coach; and winning percentage compared to the over-under for wins in a season as a proxy for over- or under-achievement. Surprisingly, the coefficient for wins alone is negative, but the coefficient for wins above expectation was highly positive (specification (1)). There is also a positive coefficient for career win percentage, albeit half the size of the in-season coefficient. The next covariates we can add are time-related, like tenure with team, or age (specification (2)). Both of these coefficients are significant: one year of age is worth approximately 0.5 wins; the effect of tenure is twice as big as age. The opposing signs of these coefficients would allow young coaches to retain their popularity as they stay with the same team. We also tested a covariate for former players, which was not significant.



| Coefficient (t-statistic) by specification | (1) | (2) | (3) |
|---|---|---|---|
| Intercept | -0.01 | 0.14 | 0.14 |
| Season Win % | -0.14 (-2.2) | -0.13 (-1.94) | -0.08 (-1.5) |
| Win% - pre-season over/under | 0.51 (6.8) | 0.48 (6.6) | 0.39 (4.5) |
| Career Win % | 0.23 (3) | 0.25 (3.3) | 0.17 (2.3) |
| Age (years) | | -0.0034 (-2.4) | -0.0033 (-2.7) |
| Tenure with team (years) | | 0.005 (2.1) | 0.0048 (2.2) |
| Race (White) | | | 0.047 (2.8) |

Finally, we can add a covariate for race (specification (3)). Here, we find the coefficient is significant and large, worth approximately 10 wins (0.047, vs the coefficient for one win of 0.0047). This coefficient was surprisingly large, and different enough from our player analysis, that we wanted to double check it. First, we plotted the residuals of sentiment (measured sentiment - predicted sentiment) using a model that ignored race, splitting the data for white and black coaches (Fig. 2, left panel). Here we can see that a segment of black coaches have negative residuals, meaning that their measured sentiment was less than predicted. This is consistent with the regression, where the coefficient for white coaches is positive (equivalently, negative for black coaches).


Another reason for concern is that there are relatively few coaches, which means some outlier coaches could be influencing our results. To verify this was not the case, we can perform a bootstrapped regression where we re-sample our data at the coach level (namely, we take a sample where half the coaches are missing, and fit a regression; Fig. 2, right panel). If we do this, we find that the distribution of coefficients for race is clearly different from zero.
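The coach-level resampling can be sketched as follows. To keep the example self-contained, it replaces the full regression with a simple stand-in statistic (the gap in mean residual between white and black coaches), and the rows are made up; `race_gap` and `resample_by_coach` are hypothetical names. The key point is that whole coaches, not individual coach-seasons, are kept or dropped together.

```python
import random
from statistics import mean

def race_gap(rows):
    """Stand-in for the race coefficient: mean residual, white minus black."""
    white = [resid for _coach, is_white, resid in rows if is_white]
    black = [resid for _coach, is_white, resid in rows if not is_white]
    return mean(white) - mean(black)

def resample_by_coach(rows, n_draws=200, seed=0):
    """Repeatedly keep a random half of the coaches and re-estimate the gap."""
    rng = random.Random(seed)
    coaches = sorted({coach for coach, _w, _r in rows})
    estimates = []
    for _ in range(n_draws):
        kept = set(rng.sample(coaches, len(coaches) // 2))
        subset = [row for row in rows if row[0] in kept]
        has_white = any(w for _c, w, _r in subset)
        has_black = any(not w for _c, w, _r in subset)
        if has_white and has_black:  # skip draws where the gap is undefined
            estimates.append(race_gap(subset))
    return estimates

# Made-up coach-season rows: (coach, is_white, residual from a race-blind model).
rows = [
    ("coach_a", True, 0.03), ("coach_a", True, 0.01),
    ("coach_b", True, 0.02), ("coach_c", True, 0.04),
    ("coach_d", False, -0.02), ("coach_d", False, -0.03),
    ("coach_e", False, -0.01), ("coach_f", False, -0.04),
]
estimates = resample_by_coach(rows)
```

If a handful of outlier coaches were driving the result, the estimates from draws that exclude them would cluster near zero; a distribution that stays on one side of zero suggests the effect is not down to a few individuals.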


Fig. 2: Left: Histogram of residuals of sentiment using a model that did not include race. Right panel: distribution of coefficients for race, using bootstrapped samples.

Conclusion



We found that factors like age, performance, and race were related to sentiment towards players. The overall sentiment on reddit was positive, probably due to moderation policies that remove abusive language. This moderation policy limits our ability to measure overt racism, as those posts are removed.

We only found one piece of evidence of racial bias for NBA players, namely that high-scoring white players are well liked. In general, this fits with our personal observations, as players like Luka Doncic and Gordon Hayward (both legitimately great players) receive lots of attention and favoritism on reddit. We do not know the cause of this bias: it could be due to the relative scarcity of white players yielding a novelty factor; or it could reflect unconscious bias by a subset of reddit users. On the flip side, some of the least popular players on reddit are low-skill white players known for a bruising style and dangerous play.

We also found evidence that commenters from cities that supported Clinton had slightly higher sentiment towards the NBA. This could align with recent research that sports are becoming politicized. For example, conservative sentiment towards the NFL has dropped since the national anthem protests. We did not, however, find any change in time for this effect.

We found a significant, large bias against black NBA coaches. Anecdotally, we can think of two successful, young black coaches, Tyronn Lue and Dwane Casey, who have received consistent criticism. In contrast, probably the three most popular coaches, Steve Kerr, Brad Stevens, and Gregg Popovich, are all white. It is possible that black coaches systematically differ from white coaches in ways we have not quantified, although we did not detect a significant coefficient for ex-players.

This analysis could be improved by using more refined sentiment analysis and named entity recognition, by incorporating other social media sources, and by taking a finer-grained approach to time. For sentiment analysis and NER, we used quick methods like VADER and filtering comments to those about a single player. These analyses could both be improved simultaneously by using a combined sentiment-entity extraction; this would require training a model on a significant amount of labeled data. For other social media sources, we could also try to use Twitter, or perhaps go down to the team subreddit level to gather more data; this would allow us to test the robustness of these results. For time, it would be interesting to analyze sentiment on a game-by-game basis; for example, white players might receive more praise for a high scoring game than black players.

So far, we have only presented results from the NBA, but we also performed a similar analysis for the NFL. We’ll be putting those out shortly.
