Hot Hands, a Paradox, and one reason why it’s bad to combine within-subject data

The ‘Hot Hand’ phenomenon is a popular belief (applicable to many domains from sports to gambling) that players who were successful in their most recent attempt(s) have increased odds of being successful in their next attempt — they are on a so-called ‘hot streak’ or have a ‘hot hand’. The statistical validity of this belief can be investigated using actual data. Indeed, it has been. For example, Tversky and Gilovich (1989) investigated the hot hand belief in basketball.

Here, however, we are not scrutinizing the hot hand belief itself. Instead, we are using this framework and some free throw data to reveal an instance of Simpson’s paradox. The definition of the paradox will emerge from the following example…

In the 1996/97 NBA jam seasons Michael Jordan shot a pair of free throws on 338 occasions. MJ made both 251 times, missed both 5 times, made only the first 34 times, and made only the second 48 times. These data are presented in the table above, as are the same data for Dennis Rodman, and also their combined numbers.

Let Phit and Pmiss denote the proportion of first shot hits followed by a hit, and the proportion of first shot misses followed by a hit, respectively. These proportions for Jordan and Rodman, along with their combined numbers are:

Jordan:
Phit = 251 / 285 = .881
Pmiss = 48 / 53 = .906

Rodman:
Phit = 54 / 91 = .593
Pmiss = 49 / 80 = .612

Combined:
Phit = 305 / 376 = .811
Pmiss = 97 / 133 = .729

Notice that, contrary to the hot hand belief, MJ actually shot worse on his second free throw after making the first (.881) than after missing it (.906). The same goes for Rodman (.593 vs. .612).

So each player, taken individually, is actually worse on his second free throw on attempts when he made the first.

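When MJ’s and Rodman’s free throw data are combined, the opposite is true: the pooled second-shot success rate is higher after a first-shot hit (.811) than after a first-shot miss (.729). This reversal is Simpson’s paradox: a trend that holds within each subject flips when the subjects’ data are pooled. It happens here because Jordan, by far the better shooter, accounts for most of the first-shot hits (285 of 376), while Rodman accounts for most of the first-shot misses (80 of 133).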
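If you want to check the arithmetic yourself, here is a minimal Python sketch. The counts are the ones listed above; the dictionary keys and the helper name second_shot_rates are just names I made up for illustration.

```python
# Second-shot outcomes, split by what happened on the first shot.
# Counts are taken from the proportions listed in the post.
counts = {
    "Jordan": {"hit_then_hit": 251, "first_hits": 285,
               "miss_then_hit": 48, "first_misses": 53},
    "Rodman": {"hit_then_hit": 54, "first_hits": 91,
               "miss_then_hit": 49, "first_misses": 80},
}


def second_shot_rates(c):
    """Return (Phit, Pmiss) for one set of counts."""
    return (c["hit_then_hit"] / c["first_hits"],
            c["miss_then_hit"] / c["first_misses"])


# Pool the two players by summing each count.
keys = list(counts["Jordan"].keys())
combined = {k: sum(player[k] for player in counts.values()) for k in keys}

rates = {name: second_shot_rates(c) for name, c in counts.items()}
rates["Combined"] = second_shot_rates(combined)

for name, (p_hit, p_miss) in rates.items():
    print(f"{name:8s}  Phit = {p_hit:.3f}  Pmiss = {p_miss:.3f}")

# Jordan's share of each pooled denominator shows why the trend flips.
jordan_share_hits = counts["Jordan"]["first_hits"] / combined["first_hits"]
jordan_share_misses = counts["Jordan"]["first_misses"] / combined["first_misses"]
print(f"Jordan's share of first-shot hits:   {jordan_share_hits:.1%}")
print(f"Jordan's share of first-shot misses: {jordan_share_misses:.1%}")
```

The last two lines are the point: Jordan supplies roughly three quarters of the pooled first-shot hits but only about 40% of the pooled first-shot misses, so the combined Phit is dominated by the better shooter and the combined Pmiss by the worse one.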

The Donald

A vain man childishly fishing for the adoration of strangers. Being rich isn’t enough; he wants to be universally loved for his imperfections, too. Slurping at the popularity spigot, The Donald’s need for ego gratification is double the size of any ordinary adult. Never in history has such a supercilious dope been more adored. It is beyond reason. Perhaps aware of this, and sensing defection, Trump whips out his phone… 140 characters is all it takes. Tweet.

He glances out the window of his private jet at the destruction that has beset a once-beautiful US island; it disgusts him. He retreats to his phone to check emails, but years of operant conditioning and muscle memory take over. He taps on the bird icon. “30k likes! Only.” Trump works best knowing his base is in a frothy outrage. “Better do one more…”

That’s the one! Mmmmm. That one had everything.

Favorite News Sources Across the Political Aisle

What are the most liberal-leaning and conservative-leaning news outlets?

Sometime last presidential election season I had this very thought. All kinds of dirt was being thrown around about both candidates; however, lots of it was coming from news sources I had never heard of. I probably still wouldn’t have if Twitter and Reddit didn’t exist, providing these outlandish stories a platform for mass exposure (and mass outrage).

So I could never really tell if what I was reading was from a legit source, completely spun, or flat-out fake. For example I would see a headline like:

FBI Arrests Hillary on Corruption Charges

It would link to a news outlet calling itself “The Discovery Examiner Guardian”, or something. I thought, well… if the DEG is like the NYT, Hil-dawg is probably in deep shit. If the DEG is like Breitbart, then I’m 99% sure the opposite is true.

So who the F are these guys? Like, in general.

So I googled: What are the most liberal-leaning and conservative-leaning news outlets? To my dismay I found nothing satisfying. No ranking lists curated by experts, no data-driven politico-meter, nothing really. Just a bunch of anecdotes from internet people complaining that so-and-so news is like totally biased.

I guess it makes sense given that any corporation attempting to appear as “the news” is trying to woo as many people as possible into believing they are the most credible straight-shootin’, just-the-facts-you-decide, fair-and-balanced, no-underlying-agenda organization around. So, as nice as it would be, of course Fox News isn’t going to post on their homepage something like, “We are an 8/10 on the left/right political spectrum”.

So I had an idea… Reddit created this mess; let’s see if they can help fix it.

Using the Reddit API (via PRAW), I wrote a Python script to identify the favorite news sources of two subreddit communities on opposite ends of the political aisle.

This bot scraped the URL from each of the top daily submissions to the main pro-Trump and anti-Trump subreddit communities, essentially determining these subreddits’ favorite news outlets. Nota bene: the validity of these data as a litmus test for liberal-leaning and conservative-leaning news rests on the assumption that people generally prefer to post and upvote stories that align with and support their personal world view.
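For the curious, the idea boils down to something like the sketch below. To be clear, this is not my original script, just a minimal reconstruction of the approach. The function names, the limit of 100, the placeholder credentials, and the example.com URLs are all made up for illustration; the PRAW part needs `pip install praw` plus your own Reddit API credentials.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url):
    """Return the host of a URL, lowercased and without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def count_domains(urls):
    """Tally how often each domain appears in a list of submission URLs."""
    domains = (domain_of(u) for u in urls)
    return Counter(d for d in domains if d)


def fetch_top_urls(subreddit_name, limit=100):
    """Collect link URLs from a subreddit's top daily submissions via PRAW.

    The credential strings are placeholders (assumptions), not real values.
    """
    import praw  # imported lazily so the tallying helpers work without it

    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",
        client_secret="YOUR_CLIENT_SECRET",
        user_agent="news-source-tally (hypothetical script)",
    )
    top = reddit.subreddit(subreddit_name).top(time_filter="day", limit=limit)
    # Skip self (text-only) posts, which do not link to an outside source.
    return [s.url for s in top if not s.is_self]


# Offline check with made-up URLs, so the tallying logic runs without Reddit.
sample_urls = [
    "https://www.example.com/politics/story-1",
    "https://example.com/politics/story-2",
    "http://news.example.org/article",
]
sample_counts = count_domains(sample_urls)
print(sample_counts.most_common())
```

With credentials in place, something like `count_domains(fetch_top_urls("The_Donald")).most_common(20)` would give one day's tally for one subreddit; running it daily and summing the counters gets you the kind of ranking shown below.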

Without further ado…

UPPER PANEL: pro-Trump subreddit The_Donald
LOWER PANEL: anti-Trump subreddit EnoughTrumpSpam

I cross-posted this project on Reddit’s Data is Beautiful, where a Googler, Felipe Hoffa, saw my post and took it to the next level. Using Data Studio, he expanded my original idea to all of Reddit and made it interactive. It’s really worth playing around with for a few minutes. So go check it out!

You can grab the code I used from this gist (you really don’t want it though, it’s awful).