I wish economists made better plots


I'm seeing lots of traffic on a big-time economics article that failed to reproduce, so here are my quick thoughts. You can read a pretty good summary here by Mike Konczal.

Quick background: Carmen Reinhart and Kenneth Rogoff wrote an influential paper that was used by many to justify the need for austerity measures taken by governments to reduce debts relative to GDP. Yesterday, Thomas Herndon, Michael Ash, and Robert Pollin (HAP) released a paper where they reproduced the Reinhart-Rogoff (RR) analysis and noted a few irregularities or errors. In their abstract, HAP claim that they "find that coding errors, selective exclusion of available data, and unconventional weighting of summary statistics [in the RR analysis] lead to serious errors that inaccurately represent the relationship between public debt and GDP growth among 20 advanced economies in the post-war period."

It appears there were three points made by HAP: (1) RR excluded some important data from their final analysis; (2) RR weighted countries in a manner that was not proportional to the number of years they contributed to the dataset (RR used equal weighting of countries); and (3) there was an error in RR's Excel formula which resulted in them inadvertently leaving out five countries from their final analysis.
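To see why point (2) can matter so much, here is a minimal sketch in Python. The numbers are made up for illustration and are not taken from RR or HAP, and the function names are my own. It compares averaging each country first (equal weighting of countries) with pooling all country-years (weighting each country by the number of years it contributes).

```python
# Hypothetical growth rates (percent) for country-years in one debt category.
# These values are invented for illustration; they are NOT the RR or HAP data.
growth_by_country = {
    "A": [2.0, 3.0, 2.5, 2.5],  # four years in the category
    "B": [-7.5],                # a single bad year in the category
}


def equal_country_weighted_mean(data):
    """Average each country's years first, then average the country means.

    Every country counts the same, no matter how many years it contributes.
    """
    country_means = [sum(years) / len(years) for years in data.values()]
    return sum(country_means) / len(country_means)


def country_year_weighted_mean(data):
    """Pool all country-years and take one overall mean.

    Each country counts in proportion to the number of years it contributes.
    """
    all_years = [g for years in data.values() for g in years]
    return sum(all_years) / len(all_years)


if __name__ == "__main__":
    print("Equal weighting of countries:", equal_country_weighted_mean(growth_by_country))
    print("Weighting by country-years:  ", country_year_weighted_mean(growth_by_country))
```

In this toy example the single bad year for country B counts as much as all four years for country A under equal weighting, so the two summaries even differ in sign. Which scheme is more appropriate is a judgment call; the sketch only shows that the choice can move the answer a lot.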

The bottom line is shown in HAP's Figure 1, which I reproduce below (on the basis of fair use):

HAP Analysis

From the plot you can see that HAP's adjusted analysis (circles) more or less coincides with RR's analysis (diamonds) except for the last category of countries, those with debt/GDP ratios over 90%. In that category RR's analysis shows a large drop in growth whereas HAP's analysis shows a more or less smooth decline (but still positive growth).

To me, it seems that the incorrect Excel formula is a real error, but easily fixed. It also seemed to have the least impact on the final analysis. The other two problems, which had far bigger impacts, might have some explanation that I'm not aware of. I am not an economist so I await others to weigh in. RR apparently do not comment on the exclusion of certain data points or on the weighting scheme so it's difficult to say what the thinking was, whether it was inadvertent or purposeful.

In summary, so what? Here's what I think:

  1. Is there some fishiness? Sure, but this is not the Potti-Nevins scandal a la economics. I suppose it's possible RR manipulated the analysis to get the answer austerity hawks were looking for, but we don't have the evidence yet and this just doesn't feel like that kind of thing.
  2. What's the counterfactual? Or, what would have happened if the analysis had been done the way HAP propose? Would the world have embraced pro-growth policies by taking on a greater debt burden? My guess is no. Austerity hawks would have found some other study that supported their claims (and in fact there was at least one other).
  3. RR's original analysis did not contain a plot like Figure 1 in HAP's analysis, which I personally find very illuminating. From HAP's figure, you can see that there's quite a bit of variation across countries and perhaps an overall downward trend. I'm not sure I would have dramatically changed my conclusion if I had done the HAP analysis instead of the RR analysis. My point is that plots like this, which show the variability, are very important.
  4. People see what they want to see. I would not be surprised to see some claim that HAP's analysis supports the austerity conclusion because growth under high debt loads is much lower (almost 50%!) than under low debt loads.
  5. If RR's analysis had been correct, should they have even made the conclusions they made? RR indicated that there was a "threshold" at 90% debt/GDP. My experience is that statements about thresholds are generally very hard to make, even with good data. I wonder what other more knowledgeable people think of the original conclusions.
  6. If the data had been made available sooner, this problem would have been fixed sooner. But in my opinion, that's all that would have happened.

The vibe on the Internets seems to be that if only this problem had been identified sooner, the world would be a better place. But my cynical mind says, uh, no. You can toss this incident in the very large bucket of papers with some technical errors that are easily fixed. Thankfully, someone found these errors and fixed them, and that's a good thing. Science moves on.

UPDATE: Reinhart-Rogoff respond.

UPDATE 2: Reinhart-Rogoff more detailed response.

  • http://yihui.name/ Yihui Xie

    My additional complaint is on categorizing an otherwise continuous variable (debt/GDP ratio). That is a horrible crime in statistics -- murdering information.

    A very natural choice would have been a scatterplot with a LOESS curve, instead of cutting the x-axis into arbitrary categories and plotting the averages. They are recommended to take Jeff's Coursera class :)

    • Roger Peng

      In the re-analysis, HAP do a plot with continuous debt/GDP. That's interesting too but I thought I'd better not copy their entire paper :)

  • Thomas Lumley

    Roger, I'm less convinced than usual by the graph. Remember that a 1% difference in economic growth is a huge amount. For example, the sequester cuts added up to quite a bit less than 1% of US GDP. The graph makes 1% look small and unimpressive.

    • Roger Peng

      That's a good point. I think RR also make that point in their response, in that even in the corrected analysis, there's still a relatively big difference in GDP in the 90% category.

  • Ken

    One more good reason for not using Excel. Changing models can be extremely difficult, as it may require modifying a large number of cells, but only after checking their contents.

    Two major problems with their analysis are that they ignore private debt and that, for both public and private debt, it is the rate of change of debt that seems most important. Current economic orthodoxy is that private debt doesn't matter, which is why every government for the last 30 years has been trying to expand it, rather successfully. Of course it all ended badly. Minsky http://en.wikipedia.org/wiki/Hyman_Minsky claimed that debt mattered, and it seems like he was correct.

  • http://twitter.com/Malarky67 Stephen Henderson

    I think you are being very generous to them... almost everything is wrong with it. It's really difficult to pin down all the wrong-headedness as it's just so dumb. The most interesting question (to me) is why bins of 30% (leaving aside the question why bins?). Presumably they could have made it 20% bins, poggled the data selection, and made the magic threshold 80% debt....

    Plus it's in ... Excel... hah ha ha hah...do they do their presentations in Comic Sans too?

  • http://twitter.com/PolSciReplicate PolSci Replication

    Policy makers might not change their view of austerity measures after this, but they will have to cite another paper as justification and answer criticism more carefully.

    Among researchers and journals the value and possible impact of replication will hopefully be recognized more than before. Let’s not forget, a recent study found that out of 120 political science journals only 18 had a data sharing/replication policy… (http://wp.me/p315fp-aC).

    Without replication, economists, political scientists, and policy makers might base their decisions and work on wrong results – even if it’s just because of an Excel error.

  • mladefer

    As for point 4, from their detailed response: "Put differently, growth at high debt levels is a little more than half of the growth rate at the lowest levels of debt."


    So, let's sum it up. Excel errors, missing data, dubious averaging. Arbitrary 90% thresholds. Then we have conflating correlation and causation in interpretations and policy proposals.

    To add to all of this, let's also not forget conflating completely different monetary regimes in the same bucket. From prof. Wray's response:

    "More importantly, they have no idea what sovereign debt is. They add together government debts issued by states on gold standards, fixed exchange rates and floating rates. They aggregated across governments that issue debt in their own currency and states that issue debt denominated in foreign currency."


    These are completely different beasts and should never be put together in the same aggregate. For further reading to understand why, I suggest prof. Fullwiler's analysis on debt and interest rates as a possible starting point:


  • Marian B Westley

    Thanks for posting this, Dr Peng! I read Paul Krugman this morning and came straight here.

  • buggyfunbunny

    You have to remember the Prime Directive in economics: "Thou shalt defend your political allies". They made no error, just following orders.

  • Chris Stehlik

    As you mentioned, what stands out is not the Excel error (though that's bad) but the amount of variability. A good number of countries at the 90% level had economic growth over even the corrected 2.2% amount (much less the -0.9%). The 90% debt level is not the death knell it was being made out to be.

    (Got to your blog from Prof Leek's earlier Coursera course.)