In my last post on the topic, I answered the timeless, by which I mean hyper-timely, question of “where are Winter Olympians from?” with this map:

But that didn’t answer the question of *why*, specifically, these countries produce all those Olympians. And while “why” is always tricky, thanks to the magic of multivariate regression, we can at least look at correlations and make some inferences. Spoiler alert – if you want to win the Winter Olympics, you should be cold, populous, and rich.

So first I just regressed the number of Winter Olympians on the log of GDP-per-capita, the log of population, and the absolute latitude of the population centroid (i.e., the angle from the equator of where a country’s people live). I got this:

For those of you unversed in the black magic of Stata, this shows very statistically significant correlations between all those variables and the number of Olympians a country produces. When a country’s GDP-per-capita doubles, it can be expected to send an additional 10 Winter Olympians; when its population doubles, it can be expected to send an extra five Winter Olympians; and it can be expected to send an extra Olympian for every additional two degrees of latitudinal distance from the equator. If you are satisfied with that, you can conclude the post here; if you like watching statistics receive substantial abuse, keep reading.
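For anyone who would rather poke at this outside Stata, here is a minimal Python sketch of the same kind of specification, fit by ordinary least squares with NumPy. Everything in it is an assumption made for illustration: the data are synthetic placeholders generated from made-up coefficients (not the dataset behind this post), the names `fit_ols` and `effect_of_doubling` are hypothetical, and the logs are taken to be natural logs. The one real point is that, for a regressor entered in logs, the "per doubling" effect is the coefficient times ln 2.

```python
import math
import numpy as np

def fit_ols(y, *regressors):
    """Ordinary least squares with an intercept.

    Returns the coefficients as [intercept, b1, b2, ...].
    """
    X = np.column_stack([np.ones(len(y)), *regressors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

def effect_of_doubling(log_coef):
    """Change in the predicted outcome when a natural-log regressor doubles."""
    return log_coef * math.log(2)

# Synthetic placeholder data: NOT real countries, NOT the post's dataset.
rng = np.random.default_rng(0)
n = 80
gdp_pc = rng.uniform(1_000, 60_000, n)
population = rng.uniform(1e5, 3e8, n)
abs_latitude = rng.uniform(0, 65, n)

# Made-up coefficients, used only to generate a fake outcome to fit against.
olympians = (-150
             + 10 * np.log(gdp_pc)
             + 5 * np.log(population)
             + 0.5 * abs_latitude
             + rng.normal(0, 5, n))

const, b_gdp, b_pop, b_lat = fit_ols(
    olympians, np.log(gdp_pc), np.log(population), abs_latitude
)

print(f"per doubling of GDP per capita: {effect_of_doubling(b_gdp):+.1f}")
print(f"per doubling of population:     {effect_of_doubling(b_pop):+.1f}")
print(f"per degree of absolute latitude: {b_lat:+.2f}")
```

Latitude enters in levels rather than logs, so its coefficient reads directly as Olympians per degree.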

First, I thought, “job well done.” Then I thought, “why not try a deeply arbitrary blend of p-hacking and r-squared-fishing?” So I did!

My next step was to toss in a dummy for “former Soviet Union,” seeing as how that seemed to have some explanatory power.

That didn’t improve the model, though, which made me sad. So I added an interaction term for former Soviet Union * log population:

There we go! That makes more sense, and has at least some intuitive grounding.
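To spell out what the interaction term buys you, here is a small Python sketch. The coefficient values are hypothetical placeholders (not the Stata estimates), as are the function and key names. The idea it illustrates is that with a former-Soviet dummy times log population in the model, the population slope is no longer one number: it is the base slope for everyone else, and the base slope plus the interaction coefficient for former Soviet countries.

```python
def predicted_olympians(b, log_gdp_pc, log_pop, abs_lat, fsu):
    """Linear prediction with an FSU dummy and an FSU x log(population) term.

    `b` maps term names to coefficients; `fsu` is 1 for former Soviet
    countries and 0 otherwise.
    """
    return (b["const"]
            + b["log_gdp_pc"] * log_gdp_pc
            + b["log_pop"] * log_pop
            + b["abs_lat"] * abs_lat
            + b["fsu"] * fsu
            + b["fsu_x_log_pop"] * fsu * log_pop)

def population_slope(b, fsu):
    """Extra Olympians per one-unit rise in log population, by FSU status."""
    return b["log_pop"] + b["fsu_x_log_pop"] * fsu

# Hypothetical coefficients, chosen only to make the arithmetic visible.
b = {
    "const": -150.0,
    "log_gdp_pc": 10.0,
    "log_pop": 5.0,
    "abs_lat": 0.5,
    "fsu": -60.0,
    "fsu_x_log_pop": 4.0,
}

print(population_slope(b, fsu=0))  # slope for non-FSU countries
print(population_slope(b, fsu=1))  # slope for FSU countries
```

A negative dummy paired with a positive interaction, as in these made-up numbers, would mean small former Soviet countries send fewer athletes than otherwise expected while large ones send more.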

And then I thought, “Interaction terms? We can have fun with those! Let’s generate all kinds of interaction terms, run tons of models, and put the prettiest-looking one on the blog!” So here you have it:

I may not have run four million regressions, but I still think I did a pretty good job of hunting out the one that had nothing but beautiful zeroes in the significance tests and about as high an r-squared as I could muster without resorting to polynomials. And there is some logic to this model – we should expect that the interaction of climate and wealth could lead to increasing ability and inclination to produce Winter Olympians beyond what we would expect from independent linear trends, and the same for the interaction between wealth and population (though, you know, Monaco).
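For a sense of how quickly this kind of hunt gets out of hand, here is a rough Python sketch that just counts the specifications on offer. The variable names are hypothetical labels for the four regressors used above, and the false-positive calculation assumes independent tests at the 5% level, which nested regressions on the same data are not, so treat it as a back-of-the-envelope illustration rather than a correction.

```python
from itertools import combinations

# Hypothetical labels for the base regressors discussed in this post.
base_terms = ["log_gdp_pc", "log_pop", "abs_lat", "fsu"]

# Every pairwise interaction between base regressors.
interactions = [f"{a}*{b}" for a, b in combinations(base_terms, 2)]

def candidate_models(base, extras):
    """Yield every specification: all base terms plus any subset of extras."""
    for k in range(len(extras) + 1):
        for subset in combinations(extras, k):
            yield list(base) + list(subset)

def chance_of_false_positive(n_tests, alpha=0.05):
    """P(at least one spurious 'significant' result) across independent tests."""
    return 1 - (1 - alpha) ** n_tests

models = list(candidate_models(base_terms, interactions))
print(len(interactions), "pairwise interactions")
print(len(models), "candidate models")
print(f"{chance_of_false_positive(len(interactions)):.0%} chance of a fluke "
      "among the interaction terms, if they were independent tests")
```

Even with only four base variables, picking the prettiest model means choosing among dozens, which is why the pretty zeroes in the significance column deserve some suspicion.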

Also interesting, though, are the residuals – no model predicts that the top few countries would produce quite so many athletes, and the models also predict that some countries should produce many more than they do. A good example is Turkey, which is predicted to send 44 athletes to the Winter Olympics despite sending only six! There is obviously some cultural winter-sport heritage that no econometric model can account for.
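For the record, a residual here is just actual minus predicted, so a negative number means a country sends fewer athletes than the model expects. A tiny Python sketch using only the Turkey figures quoted above (the function name is a hypothetical helper):

```python
def residual(actual, predicted):
    """Observed value minus the model's prediction."""
    return actual - predicted

# Turkey, using the figures quoted in this post: predicted 44, actually sent 6.
turkey_residual = residual(actual=6, predicted=44)
print(turkey_residual)  # -38
```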

Anyway, here’s my data. Offers of honorary PhDs and/or gold medals should be sent to squarelyrootedblog@gmail.com.

Also, t-tests and r-squared are both lousy. FWIW.
