You are currently browsing the tag archive for the ‘statistics’ tag.

Tyler Cowen posted this the other day:

A typical cow in the European Union [in 2002] receives a government subsidy of $2.20 a day.

And if your next thought isn’t “of course he’s going to reconstruct this figure” then you haven’t read this blog before, have you?

Turns out, though, that doing so was surprisingly challenging, in part because it meant recreating all the assumptions behind that figure and in part because European agricultural policy is an epic rabbit hole (cow hole?). After some slogging and some tweaking, however, I was able to recreate the methodology that produced the original $2.20/day (and subsequent $2.60/day) figure – and it came with a wallop of a surprise:

[Chart: the cow jumped over the moon but didn't exactly stick the landing]

I didn’t adjust the currency because that’s its own field of cow holes, but yes, those 2002 and 2003 numbers basically check out. But look what’s happened since then!

Before you get too excited about the EU attempting a radical break with subsidizing agriculture, it turns out that this is a consequence of ‘decoupling’ – the shift in EU ag policy from subsidizing specific commodities to subsidizing practices and benchmarks that cut across commodities – which has led the OECD to basically throw up its hands at deriving consistent commodity-level series. Don’t worry, though: I used a rough historical average of milk’s share of total ag subsidies to impute the recent numbers:

[Chart: ma he's imputing cows!]
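(For the curious, here is a back-of-envelope sketch of the arithmetic involved. Every number below is a placeholder I invented for illustration – not actual OECD or Eurostat data – and the imputation rule is just the rough share-of-total-support average described above.)

```python
# Back-of-envelope sketch: subsidy per dairy cow per day, plus the rough
# imputation used for years where no milk-specific series exists.
# All figures are hypothetical placeholders, not actual OECD data.

milk_support_usd = 20e9   # hypothetical total support to EU milk producers in a year
dairy_cows = 25e6         # hypothetical EU dairy herd size

per_cow_per_day = milk_support_usd / dairy_cows / 365
print(f"subsidy per cow per day: ${per_cow_per_day:.2f}")

# Imputation for post-decoupling years: assume milk keeps its historical
# average share of total agricultural support.
historical_milk_share = 0.15   # hypothetical average share of milk in total support
total_ag_support_usd = 100e9   # hypothetical total support in a recent year

imputed_per_cow_per_day = historical_milk_share * total_ag_support_usd / dairy_cows / 365
print(f"imputed subsidy per cow per day: ${imputed_per_cow_per_day:.2f}")
```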

And given the wealth of data and anecdotal evidence that dairy farmers in Europe are still getting plenty of government (don’t say cheese don’t say cheese)…tofu?…anyway, there’s no reason to think European cows are getting too shafted, even though it does seem like 2002-03 was perhaps peak season for cow bucks.

But this doesn’t get into the larger issue with the number – the vast majority of the total subsidy to dairy as computed by the OECD doesn’t come in direct government-to-producer payments but in implicit support, for example through tariffs, whose impact is much harder to quantify: the OECD imputes it by trying to measure the consumer surplus lost to higher prices. As Jacques Berthelot wrote at the time, those methods are far from bulletproof. Either way, the figure gives the at least somewhat-misleading impression that the $2.20/cow/day arrives entirely as cash transfers – misleading, at any rate, to the extent that you draw a distinction between explicit payments and implicit support.
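(To make the ‘implicit support’ point concrete, here is a stylized sketch of the market price support idea – this is not the OECD’s actual formula, and every number in it is invented.)

```python
# Stylized sketch of market price support (MPS): the value transferred to
# producers by policies (tariffs, quotas) that keep the domestic price above
# the world price. The OECD's real methodology is more involved; all numbers
# here are hypothetical placeholders.

domestic_price = 0.40   # hypothetical internal EU milk price, $ per kg
world_price = 0.25      # hypothetical border/world reference price, $ per kg
production = 120e9      # hypothetical EU milk production, kg per year

market_price_support = (domestic_price - world_price) * production
print(f"imputed transfer to producers: ${market_price_support / 1e9:.0f} billion")

# None of this is a cheque from a government to a farmer -- it is an imputed
# transfer paid by consumers through higher prices, which is why lumping it
# in with direct payments can mislead.
```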

Anyway, in conclusion – we no longer know how much subsidy European cows get, and maybe we didn’t really know at the time.

A final thought – a dairy cow in Great Britain will set you back, on average, roughly $2,500 (just under £1,500, or just over €1,850). So you could always try this yourself.

Wonkblog’s Matt O’Brien had a great reminder last week that Eurozone policymakers’ obsession with low inflation is fueling a monetary policy that is extremely damaging to the broader European economy and the lives of millions of Europeans. A recent paper, though, suggests the problem may be even worse than we thought.

Jessie Handbury, Tsutomu Watanabe, and David E. Weinstein recently published a paper that purports to have assembled “the largest price and quantity dataset ever employed in economics” to assess how well the official Japanese CPI measures inflation. The answer is ‘not so good’ – and the reason for that answer is scary. To wit:

We show that when the Japanese CPI measures inflation as low (below 2.4 percent in our baseline estimates) there is little relation between measured inflation and actual inflation. Outside of this range, measured inflation understates actual inflation changes. In other words, one can infer inflation changes from CPI changes when the CPI is high, but not when the CPI is close to zero.

What does that mean? They draw two clear conclusions. Firstly, that national CPIs routinely overstate inflation – here is their (better) measure stacked against the official measure:

[Chart: the decline and fall of the nippon yenmpire]

Since 1993, the official Japanese statistics show a net decline in prices of just a few percent, whereas the authors’ numbers show a decline close to 15%.

The other conclusion is that, even though over the long term the CPI overstates inflation, when inflation is low any particular CPI reading is basically no better than a random guess.

[Chart: find the pattern (hint: you can't)]

This means that while, on average, the CPI inflation rate is biased upwards by 0.6 percentage points per year, one can only say with 95 percent confidence that this bias lies between -1.5 and 2.8 percentage points. In other words, if the official inflation rate is one percent per year and aggregate CPI errors are the same as those for grocery items, one can only infer that the true inflation rate is between -1.5 and 2.8 percentage points. Thus, a one percent measured inflation rate would not be sufficient information for a central bank to know if the economy is in inflation or deflation.

[Chart: you say toe-mae-tos (are more expensive), i say toe-mah-tos (aren't), let's call the whole thing off]
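(To spell out the arithmetic – my gloss, not the authors’, and it assumes ‘bias’ means measured inflation minus true inflation, the upward-bias convention used in the quote above.)

```python
# Minimal sketch of the 'flying blind' logic: true inflation = measured
# inflation minus measurement bias. The bias interval is the one quoted
# above; the rest is just arithmetic under that sign convention.

measured_inflation = 1.0          # percent per year
bias_low, bias_high = -1.5, 2.8   # 95% interval for the bias, percentage points

true_high = measured_inflation - bias_low    # bias at its most negative
true_low = measured_inflation - bias_high    # bias at its most positive

print(f"true inflation could be anywhere from {true_low:.1f}% to {true_high:.1f}%")
# A measured 1% is consistent with meaningful inflation or outright deflation.
```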

In case it wasn’t clear enough, Europe is definitely in the ‘flying blind’ zone:

[Chart: #winning]

As is more and more of the developed world in general:

[Chart: if all the other countries were blowing up their economies to satisfy a bizarre price stability fetish, would you do it too?]

This is, errr, kind of terrifying. Because what it all adds up to is the conclusion that monetary policymakers are throttling growth because they’re relying on data that is both inaccurate and imprecise. The inflation fears that have crippled Western recoveries for half a decade and running are based purely on phantoms.

A weekend thought: my father is the kind of guy who likes to come up with big monocausal theories to explain every little thing; he missed his calling as a columnist for a major newspaper. Anyway, last week we were chatting and he expounded on one of these theories, in this case a coherent and compelling narrative for the dramatic increase in dog ownership in recent years. The theory is unimportant (it had to do with a decline in aggregate nachas) but afterwards I decided for the heck of it to fact-check his theory. And what do you know? According to the AVMA’s pet census, dog ownership rates have declined, very slightly, from 2007 to 2012.

Now, I know why my dad thought otherwise – over the past few years, dogs have become fantastically more visible in the environments he inhabits, mainly urban and near-suburban NYC. I am certain that, compared to 5-10 years ago, many more dogs can be seen in public, more dog parks have emerged, and there are many more stores offering pet-related goods and services. But these changes are intertwined with substantial cultural and demographic shifts, and they are demonstrably not driven by a change in the absolute number of dogs or the dog-ownership rate.

It’s hard to prove things with data, even if you have a lot of really good data. There will always be multiple valid interpretations of the data, and even advanced statistical methods can be problematic and disputable, and hard to use to truly, conclusively prove a single interpretation. As Russ Roberts is fond of pointing out, it’s hard to name a single empirical econometric work that has conclusively resolved a dispute in the field of economics.

But what data can do is disprove things, often quite easily. Scott Winship will argue to the death that Piketty’s market-income data is not the best kind of data for understanding changes in income inequality, but what you can’t do is proclaim or expound a theory explaining a decrease in market-income inequality. This goes for a whole host of things – now that data is plentiful, accessible, and manipulable to a degree vastly greater than ever before in human history, it has become that much harder to promote ideas contrary to the data. This is the big hidden benefit of bigger, freer, better data – it may not conclusively prove things, but it can most certainly disprove them, and thereby help hone and focus our understanding of the world.

Of course, I’m well over halfway into writing my Big Important Thinkpiece about Capital in the 21st Century and the FT decides to throw a grenade. Smarter and more knowledgeable people than I have gone back and forth on the specific issues, and my sense aligns with the general consensus: there are genuine problems with some of the data, but the FT’s criticisms were at least somewhat overblown and are not nearly enough to overturn the central empirical conclusions of Piketty’s work.

What strikes me most about this episode is just how unbelievably hard true data and methodological transparency is. The spreadsheet-versus-statistical-programming-platform debate seems to me to be a red herring – at least as the paradigm stands, each has its uses, limitations, and common pitfalls, and for the kind of work Piketty was doing, which relied not on complex statistical methods but mostly on careful data aggregation and cleaning, a spreadsheet is probably as fine a tool as any.

The bigger issue is that current standards for data transparency, while certainly advanced by the power of the internet to make raw data freely available, are still sorely lacking. The real problem is that published data and code, while useful, are still just the tip of a much larger methodological iceberg whose base, like a pyramid’s (because I mix metaphors like The Avalanches mix phat beats), extends much deeper and wider than the final work. If the published paper is the apex, the final dataset is still just a relatively thin layer beneath it, when what we care about is the base.

To operationalize this a little, let me pick an example that is both a very good one and one I happen to be quite familiar with, as I had to replicate and extend the paper for my Econometrics course. In 2008, Daron Acemoglu, Simon Johnson, James A. Robinson, and Pierre Yared published a paper entitled “Income and Democracy” in the American Economic Review, in which they claimed to have demonstrated empirically that there is no detectable causal relationship between levels of national income and democratic political development.

The paper is linked; the data, which are available at the AER’s website, are also attached to this post. I encourage you to download them and take a look for yourself, even if you’re far from an expert or are afraid of numbers altogether. You’ll notice, first and foremost, that it’s a spreadsheet. An Excel spreadsheet. It’s full of numbers. Additionally, the sheets have some text boxes. Those text boxes contain Stata code. If you copy and paste all the numbers into Stata, then copy and paste the corresponding code and run it, it will produce a bunch of results. Those results match the results published in the corresponding table in the paper. Congratulations! You, like me, have replicated a published work of complex empirical macroeconomics!

Except, of course, you haven’t done very much at all. You just replicated a series of purely algorithmic functions – you’re a Chinese room of sorts (as much as I loathe that metaphor). Most importantly, you didn’t replicate the process that led to the production of this spreadsheet full of numbers. In this instance, there are 16 different variables, each drawn from a different source. To truly “replicate” the work done by AJR&Y you would have to go to each of those sources and cross-check each of the datapoints – of which there are many, because the unit of analysis is the country-year; their central panel alone, the 5-Year Panel, has 36,603 datapoints over 2,321 different country-years. Many of these datapoints come from other papers – do you replicate those? And many of them required some kind of transformation between their source and their final form in the paper – that also has to be replicated. Additionally, two of those variables are wholly novel – the trade-weighted GDP index and its secondary sibling, the trade-weighted democracy index. To produce those datapoints requires not merely transcription but computation. If, in the end, you were to superhumanly do all this, what would you do if you found some discrepancies? Is it author error? Author manipulation? Or your error? How would you know?
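(To give a flavor of what replicating the data layer would actually involve, here is a minimal sketch of the cross-checking step. The file names, column names, and tolerance are all invented for illustration – they are not the actual replication files.)

```python
# Sketch of cross-checking a published dataset against one of its claimed
# sources. File and column names are hypothetical placeholders.
import pandas as pd

published = pd.read_excel("ajry_5year_panel.xlsx")        # the released spreadsheet
source = pd.read_csv("penn_world_table_extract.csv")      # one of the 16 underlying sources

merged = published.merge(source, on=["country", "year"], suffixes=("_pub", "_src"))

# Flag country-years where the published value strays from the source value
# by more than a small tolerance (the tolerance itself is a judgment call).
tolerance = 1e-3
mismatch = (merged["log_gdp_pc_pub"] - merged["log_gdp_pc_src"]).abs() > tolerance

print(merged.loc[mismatch, ["country", "year", "log_gdp_pc_pub", "log_gdp_pc_src"]])
# And even then: is a mismatch author error, author manipulation, a revision
# to the source since publication, or my own error? The diff can't tell you.
```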

And none of this speaks to differences of methodological opinion – in assembling even seemingly simple data, judgment calls about how variables will be computed and represented must be made. There are also higher-level judgment calls – what is a country? Which should be included and excluded? In my own extension of their work, I added a new variable to their dataset, and much the same questions apply – were I to simply hand you my augmented data, you would have no way of knowing precisely how or why I computed that variable. And we haven’t even reached the most meaningful questions – most centrally, are these data or these statistical methods the right tools to answer the questions the authors raise? In this particular case, while there is much to admire about their work, I have my doubts – but even moving on to address those doubts involves some throwing up of hands in the face of the sheer size of their dataset. We are essentially forced to say “assume data methodology correct.”
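(As an illustration of the ‘not merely transcription but computation’ point above, here is one plausible way a trade-weighted index of partners’ democracy scores could be built. This is my own toy construction, not AJR&Y’s actual formula, and every choice in it – which trade flows, which years, how to normalize – is exactly the kind of judgment call I mean.)

```python
# Hypothetical sketch of a trade-weighted index: each country's score is the
# average of its trading partners' democracy scores, weighted by bilateral
# trade shares. Not the authors' actual construction; all values invented.
import pandas as pd

trade = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "partner": ["B", "C", "A", "C"],
    "trade_value": [100.0, 50.0, 100.0, 25.0],   # hypothetical bilateral trade
})
democracy = {"A": 0.9, "B": 0.4, "C": 0.7}       # hypothetical democracy scores

# Normalize each country's trade with its partners into weights that sum to one.
trade["weight"] = trade.groupby("country")["trade_value"].transform(lambda v: v / v.sum())
trade["weighted_score"] = trade["weight"] * trade["partner"].map(democracy)

trade_weighted_democracy = trade.groupby("country")["weighted_score"].sum()
print(trade_weighted_democracy)   # A: 0.50, B: 0.86
```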

Piketty’s data, in their own way, go well beyond a spreadsheet full of numbers – there were nested workbooks, with the final data actually being formulae that referred back to preceding sources of rawer data that were transformed into the variables of Piketty’s interest. Piketty also included other raw data sources in his repository even when they were not programmatically linked to the spreadsheets. This is extremely transparent, but it still leaves key questions unanswered – some “what” and “how” questions, but also “why” questions: why did you do it this way rather than that way? Why did you use this expression to transform this data into that variable? Why did you make this exception to that rule? Why did you prioritize different data points in different years? A dataset as large and complex as Piketty’s is going to have hundreds, even thousands, of individual instances where these questions can be raised, with no automatic way of providing answers other than having the author address them manually as they come up.

This is, of course, woefully inefficient, and it also creates some perverse incentives. If Piketty had provided no transparency at all, well, that would have been what every author of every book did going back centuries until very, very recently. In today’s context it may have seemed odd, but it is what it is. If he had been less transparent, say by releasing simpler spreadsheets with inert results rather than transparent formulae calling on a broader set of data, it would have been harder, not easier, for the FT to interrogate his methods and choices – that “why did he add 2 to that variable” thing, for example, would have been invisible. The FT had the privilege of being able to do at least some deconstruction of Piketty’s data, as opposed to reconstruction, the latter of which can leave the reasons for discrepancies substantially more ambiguous than the former. As it currently stands, a high level of attention on your research has the nasty side-effect of drawing attention to transparent data but opaque methods – methods that, while in all likelihood at least as defensible as any other choice, are extremely hard under the status quo to present and defend systematically against aggressive inquisition.

The kicker, of course, is that Piketty’s data is coming under exceptional, extraordinary, above-and-beyond scrutiny – how many works that are merely “important” but not “seminal” never undergo even the most basic attempts at replication? How many papers are published in which nobody even plugs in the data and the code and cross-checks the tables – forget about checking the methodology undergirding the underlying data! And these are problems that relate, at least somewhat, to publicly available and verifiable datasets, like national accounts and demographics. What about data on more obscure subjects with only a single, difficult-to-verify source? Or data produced directly by the researchers?

In discussing this on Twitter, I advocated for the creation of a unified data platform that would not only let users merge the functions of, or toggle between, spreadsheet and statistical-programming GUIs, but also create a running, annotatable log of a user’s choices, not merely static input and output. Such a platform could produce a user-friendly log that could either be read in a common format (html, pdf, doc, epub, mobi) or uploaded by a user in a packaged file with the data and code, so that anyone could replicate, from the very beginning, how a researcher took raw input and created a dataset, as well as how they analyzed that dataset to draw conclusions. I’m afraid that without such a system, or some other way of making not only data but start-to-finish methodologies transparent, accessible, and replicable, increased transparency may end up paradoxically eroding trust in social science (not to mention the hard sciences) rather than buttressing it.
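(For what it’s worth, the kernel of that idea doesn’t require a whole platform to demonstrate. Here is a toy sketch of what a running, annotatable log of data-construction choices could look like – the names and structure are entirely my own invention, not an existing tool.)

```python
# Toy sketch of an annotated provenance log: every transformation applied to
# the raw data is recorded alongside the researcher's stated reason, so a
# reader can replay not just the final dataset but the choices behind it.
import json
from datetime import datetime, timezone

log = []

def step(description, reason, func, data):
    """Apply a transformation and record what was done and why."""
    result = func(data)
    log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "description": description,
        "reason": reason,
    })
    return result

raw = [1200.0, None, 980.0, 15000.0]   # hypothetical raw series

cleaned = step("drop missing values",
               "the source agency reports gaps as blanks, not zeros",
               lambda xs: [x for x in xs if x is not None], raw)

trimmed = step("drop observations above 10,000",
               "judgment call: values above this look like unit errors in the source",
               lambda xs: [x for x in xs if x <= 10000], cleaned)

print(json.dumps(log, indent=2))   # the 'why' travels with the data
```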

Attachments: Income and Democracy Data; AER adjustment_DATA_SET_AER.v98.n3June2008.p808 (1); AER Readme File

Now that it’s been a few months, we can all calm down and stop arguing over power calculations in the Oregon Medicaid Study and acknowledge that the most important finding in the study was this one:

[Chart: p had them confidence intervals / values with the statistically significant to which they refer / the whole relevant academic community was looking at her / p hit the floor / next thing you know / p values got low, low, low, low, low, low, low, low]

For all the talk about the state of American health, and about whether Medicaid provides quality healthcare, people really neglected to discuss that health insurance is insurance – a fundamentally financial product in which customers exchange regular payments for a promise of protection from the consequences of low-probability but high-cost events. It’s certainly an interesting question whether car insurance or homeowner’s insurance affects the rates of collisions or fires, but more importantly, it is completely clear that these products keep people who have suffered those events from also being bankrupted. Similarly, health insurance is a way to keep people who contract cancer from also contracting six-figure debts. Despite my very strong discomfort with conventional methods of statistical significance, it is obvious from the above results that even the relatively minimal insurance afforded by Medicaid succeeds somewhere between “substantially” and “wildly” in reducing the financial risk of illness.
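(A stylized sketch of that point – insurance is about tail risk, not averages. The numbers below are invented, but the logic is general.)

```python
# Stylized sketch: insurance barely changes the *average* amount you spend,
# but it collapses the worst-case outcome. All numbers are hypothetical.

p_catastrophe = 0.02        # chance of a six-figure medical event in a year
catastrophe_cost = 150_000  # cost of that event if uninsured, in dollars
premium = 3_600             # annual cost of coverage, in dollars

expected_cost_uninsured = p_catastrophe * catastrophe_cost   # $3,000 on average
expected_cost_insured = premium                              # $3,600 on average

worst_case_uninsured = catastrophe_cost                      # $150,000
worst_case_insured = premium                                 # $3,600

print(f"average cost:    ${expected_cost_uninsured:,.0f} uninsured vs ${expected_cost_insured:,.0f} insured")
print(f"worst-case cost: ${worst_case_uninsured:,.0f} uninsured vs ${worst_case_insured:,.0f} insured")
# The value of the product is almost entirely in the second line.
```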

Now, the health care market is a funny one, because of the weird ways we think about and conceptualize it, the expense of it, and the large hand the taxpayer has in it. But while we should definitely strive to increase the efficiency of health care – by encouraging good behavior, incentivizing preventative care, reducing wasteful care, introducing more cost-reduction pressure, cutting administrative costs, and eliminating infections in hospitals – all of that is separate from following through on the commitment our society has already made: to guarantee at least some forms of medical care to those in need, and to put a ceiling on the financial risk individuals incur when they elect not to simply wander off and die when catastrophic illness strikes.

Yesterday a friend of mine tweeted an invitation via a new service called Feastly. The invitation was to come to her home and eat a delicious, home-cooked gourmet meal in exchange for money. The service is set up to do exactly that – while it is still in private beta (and therefore cannot be fully explored until one is invited in), it clearly aggregates offerings of that sort, sortable by dietary restrictions, price, attire, pet-friendliness, and other criteria. It’s a great idea, and one I wish I’d thought of.

On a societal scale, I think that as we see more services like this directly connecting buyers and sellers – think eBay, Etsy, ebook self-publishing – it will throw further into question whether statistics like GDP/GNI are useful metrics, not just of broader concepts like "standard of living," but of what they purport to measure. Every meal eaten via Feastly rather than at a formal restaurant still involves an exchange of goods and services for money, yet most of them will likely not be counted by current methods of measuring GDP. This issue predates the internet, of course, but the internet’s amazing power to match small-scale producers with buyers will accelerate the trend, as will the advent of 3-D printing.
