Noah Smith had a great post yesterday about becoming a Bayesian Superhero. Because I am an inveterate nitpicker and a routine abandoner of my commitment to Spreadsheets Anonymous, I want to dig into the math behind his example. Here it actually matters quite a bit, because the math mutes the power of the example somewhat:
But nevertheless, every moment contains some probability of death for a non-superman. So every moment that passes, evidence piles up in support of the proposition that you are a Bayesian superman. The pile will probably never amount to very much, but it will always grow, until you die.
The thing is that ‘the pile will probably never amount to very much.’ Here are the Social Security Administration’s life tables.
I am a 27-year-old male, so my probability of dying in the next year (without adding in any other life-expectancy-modifying factors) is just 0.001362; as odds, that’s 1 in 734. That means Bayes’ Rule is not going to make very much of my not dying as evidence. Just to put it as starkly as possible, if I believed right now that there was a 48% prior probability of my being an invincible superhero, living to 40 (ceteris paribus) would still be insufficient to push the posterior over 50%.
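To make the arithmetic concrete, here is a rough Python sketch of the update. It assumes a superhero survives with probability 1, and it holds the 0.001362 annual death probability constant from 27 to 40, which is my simplification: the real life tables have a rate that creeps up with age, so the exact figure would differ a little. The function and variable names are mine, purely for illustration.

```python
def posterior_superhero(prior, survival_prob_if_mortal):
    """Bayes' Rule: P(superhero | survived)."""
    # A superhero survives with probability 1 (assumption);
    # a mortal survives with probability survival_prob_if_mortal.
    return prior / (prior + (1.0 - prior) * survival_prob_if_mortal)


ANNUAL_DEATH_PROB = 0.001362  # age-27 male figure quoted above
YEARS = 13                    # from age 27 to age 40

# Simplifying assumption: the annual death probability stays constant.
survival_to_40 = (1.0 - ANNUAL_DEATH_PROB) ** YEARS

one_year = posterior_superhero(0.48, 1.0 - ANNUAL_DEATH_PROB)
to_forty = posterior_superhero(0.48, survival_to_40)

print(f"posterior after one year: {one_year:.4f}")
print(f"posterior at age 40:      {to_forty:.4f}")
```

Under these assumptions both posteriors stay below 50%; thirteen years of not dying moves a 48% prior by well under a percentage point.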
What does that mean? To ever believe you are a superhero through Bayesian inference, at least one of two things has to be the case:
1) You have to have a very large prior – essentially, superderp.
2) You have to survive things that drastically increase your odds of dying.
The first thing, I think, is what Noah was getting at with teenagers; the latter thing is basically the plot of Unbreakable. If you want to generate real evidence for the proposition that you are a superhero, you need to survive some deadly encounters. And even then, you could still just be Boris.
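The trade-off between those two routes can be sketched in a few lines of Python. The helper below is hypothetical (the name, the 50% target, and the example numbers are my own), and it again assumes a superhero always survives. It counts how many independent brushes with death you would need to survive before the posterior crosses the target.

```python
def encounters_needed(prior, survival_prob_if_mortal, target=0.5,
                      max_encounters=10_000):
    """Smallest number of survived encounters that pushes
    P(superhero | survived all of them) above target, or None."""
    for n in range(1, max_encounters + 1):
        # Chance that a mere mortal survives n independent encounters.
        mortal_survives_all = survival_prob_if_mortal ** n
        posterior = prior / (prior + (1.0 - prior) * mortal_survives_all)
        if posterior > target:
            return n
    return None


# Hypothetical numbers: a skeptical 1% prior, and encounters that
# a mortal survives only half the time.
print(encounters_needed(0.01, 0.5))  # prints 7
```

With a coin-flip chance of dying each time, even a 1% prior gets past 50% after seven survivals, which is why deadly encounters are so much stronger evidence than ordinary years of not dying.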
Actually, though, the real meat of Noah’s post is in this aside:
But this gets into a philosophical thing that I’ve never quite understood about statistical inference, Bayesian or otherwise, which is the question of how to choose the set of hypotheses, when the set of possible hypotheses seems infinite.
To which I actually have a good answer! When selecting hypotheses from the infinite, simply go with the existing consensus and try to generate evidence that supports or undermines it at the Bayesian margin. This should actually be the right strategy whether you’re operating under a Popperian or a Kuhnian framework.