Wednesday, May 28, 2014

Excel is EVIL

In the news, more reasons why no one, especially not academic economists, should use Excel formulas as a way to calculate anything even marginally important.

Last year we had the Reinhart and Rogoff case of spreadsheet errors and now we have the Piketty case.

Chris Giles of FT writes

In the wealth inequality tables for the top 1 per cent and top 10 per cent in Capital in the 21st Century, there were 142 data points and I found problems with 114 of them. As a problem proportion, it comes in at a rather high 80 per cent.

At the end of the same blog posts he unfortunately has to add a mea culpa about his previous post:

Finally, I am grateful to @kenivers1 for spotting an error in the FT’s spreadsheet I posted online. The constructed European average for the top 10% wealth share in 1810 contained a formula error. The spreadsheet has now been amended. Our mistaken result was 83.4%, while the corrected one is 81.3%.

Making mistakes is easy. My Stata code almost never works the first time I try to run it.  But once it does run, there is code with (some) comments and hopefully it possible to figure out why I did what I did and whether it makes sense or not.   On the other hand excel buries the formulas. Even when you look at them they are written in terms of A1 and D5 (wasn't that supposed to be D1?).  In other words when do data work in Excel you are more or less ensuring that you will make mistakes that will be very hard to find unless someone is looking hard. And that's a bad thing.

Here's Justin Wolfers on how to avoid/identify mistakes.