Spurious relationships are a serious problem in social scientific research. But… they’re also fun! For example, did you know that ice cream causes crime? Also, global temperature is inversely related to the number of pirates.
Seth Masket provides an excellent example of a spurious relationship by asking: Are Democrats perverts? As he shows, there is a strong relationship between porn use and Obama’s vote share in 2012! But as Seth aptly explains:
For one thing, there’s a big potential ecological inference problem here. We’re making assumptions about individual level behavior by examining data aggregated at the state level… Second, chances are that even if there is an individual relationship here, it’s not a direct one. Porn usage may correlate with something else that also correlates with partisan voting patterns.
Seth notes that porn pageviews explain 16% of the variation in Obama’s vote share. In an attempt to outdo him in the “spurious-relationship-a-thon,” I looked at the predictive power of state-level Google search traffic for: (1) the rock band “Nickelback” and (2) “cure for herpes.” Here are the results.*
According to the first scatterplot, searches for “cure for herpes” have a negative relationship with Obama’s vote share in 2012, supposedly indicating that herpes caused people to vote for Mitt Romney. However, the R² is just 0.08, which indicates that herpes has little predictive power; in fact, the relationship is not statistically significant (p = .12; n = 31). Nickelback searches also have a negative relationship with Obama’s vote share, supposedly indicating that fans of the Canadian rock band were more likely (!) to vote for Romney in 2012. Notably, in the second scatterplot, the R² is 0.25, indicating that fully a quarter of the variation in the 2012 election outcome is explained by state-level variation in Nickelback fans.
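For readers who want to check the arithmetic, here is a rough sketch of how a p-value follows from an R² and a sample size in a one-predictor regression. It assumes a simple bivariate fit and a two-sided t-test with n - 2 degrees of freedom, and it uses the rounded figures reported for the herpes plot (R² = 0.08, n = 31). The function names are my own, and the t-distribution tail is approximated by numerical integration, so treat the output as approximate.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=10_000):
    """Two-sided p-value: 1 minus the area between -|t| and |t|.

    The area from 0 to |t| is approximated with Simpson's rule
    (steps must be even), then doubled by symmetry.
    """
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 1 - 2 * area

def p_from_r2(r2, n):
    """t statistic and p-value for the slope in a one-predictor regression."""
    df = n - 2
    t = math.sqrt(r2 * df / (1 - r2))
    return t, two_sided_p(t, df)

# Rounded values reported for the "cure for herpes" scatterplot.
t_stat, p_value = p_from_r2(0.08, 31)
print(f"t = {t_stat:.2f}, p = {p_value:.2f}")
```

Under those assumptions the result should land close to the reported p = .12; small differences would come from rounding in the published R². The same function cannot be applied to the Nickelback plot without knowing its n, which is not stated above.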
While it’s difficult to take the above “findings” seriously, social scientists dedicate their lives to distinguishing between correlation and causation. In practice, it’s much harder than most people appreciate. What the world gives us is correlation, so it’s easy to draw faulty conclusions from observational data. In an undergraduate research methods class of mine, we cover countless examples of correlation being mistaken for causation. For example, congressional candidates who spend large amounts of their personal wealth on their campaigns actually receive fewer votes. Also, humanitarian aid is associated with various negative consequences such as higher rates of infant mortality. Are these causal relationships? Of course not. Always remember: correlation is not causation.
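To make the point concrete, here is a small simulation sketch of how a lurking third variable can manufacture a correlation. Everything in it is synthetic and hypothetical: I assume a made-up "temperature" variable drives both "ice cream sales" and "crime," with no link between the two outcomes themselves, and I use the standard first-order partial correlation formula to control for temperature.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Synthetic data: temperature is the confounder. Ice cream and crime
# each depend on temperature plus independent noise, and nothing else.
rng = random.Random(0)
temperature = [rng.gauss(0, 1) for _ in range(5000)]
ice_cream = [t + rng.gauss(0, 1) for t in temperature]
crime = [t + rng.gauss(0, 1) for t in temperature]

raw = pearson(ice_cream, crime)
adjusted = partial_corr(ice_cream, crime, temperature)
print(f"raw correlation: {raw:.2f}")
print(f"controlling for temperature: {adjusted:.2f}")
```

By construction, the raw correlation should come out around 0.5 even though ice cream has no effect on crime in this toy world, while the correlation after controlling for temperature should sit near zero. Real data are rarely this tidy, since the confounder is usually unmeasured or unknown.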
* Don’t tell my department chair I spent 30 minutes working on this…