Saturday, October 29, 2016

Reading Other People's Code is the Worst...But Also the Best

I would say that one of the best recent developments in the world of research is that more and more journals are making it official policy to post data and code associated with published papers. All of the AEA journals have this policy and so does the Review of Economics and Statistics (ReStat). I think making it easier for researchers to replicate papers makes for better published papers to start with and also results in more and better research going forward.

My official advice for graduate students starting new projects: Check the AEA journals and ReStat for papers using your data and download the authors' code. Look through it carefully. Learn from it. I bet you'll see lots of clever tricks for coding things that appear difficult. You'll also see directly how people describe what they do in the published papers. Yes, I know that reading other people's code is extremely painful, but trust me, it's often worth the trouble. Here is ReStat's data archive. You can download data and code for the AEA journals from the AEA website.

And now I'll end with a plea for you to write good code. From our friends at www.xkcd.com:

[xkcd comic: "Code Quality 2"]

Monday, October 17, 2016

This Happens.



The question is what to do when it does. What do I suggest? Step one: Check and recheck your code for mistakes. Step two: Take some time away from the project. During that time, think about alternative specifications, why the data are not cooperating, etc. Maybe you were just thinking about the problem incorrectly. Maybe you were not looking at the correct sample, etc. Step three: When all else fails, move on to a different project.

Notice that spending months and months (or years?!?) torturing the data is never the way to go.

Sunday, October 9, 2016

Formulas for Writing Introductions and Conclusions

It's that time of year again! That special time when job market candidates should be perfecting (i.e., writing and rewriting and rewriting) the introductions of their job market papers. Writing is hard. Even harder is thinking carefully about what exactly people learn from your analysis and why it's important. Good papers are important for many reasons, and it's tricky to guess which to emphasize. Sometimes you have to write multiple versions of introductions before you can determine what works best.

But what about that very first draft? How do you even start? I just discovered a nifty little formula for writing introductions. Does it work all the time for every paper? Who knows...but it does work often. Definitely a great place to start. And here's a brand new formula for writing conclusions! Use these formulas, but also come up with your own. Pay attention to the format of introductions and conclusions when reading your favorite papers. Then use what works best for your particular papers.

Monday, October 3, 2016

Trouble with the Curve

Oh yes, another post about RD design. I've actually been meaning to blog about this for a while, but I just haven't gotten to it. So, we all know the basic idea behind regression discontinuity. We want to control for a smooth function of the forcing variable, and then check whether the outcome jumps at the cutoff. The tricky thing: how to control for the forcing variable given that we don't really know the true relationship. There are basically two potential techniques. It turns out, however, that one is way better than the other. For intuition on the two techniques along with an explanation of why one is way better than the other, click here. For the most recent draft of the paper, click here. For the NBER version, here.
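For anyone who wants to see the basic idea in code, here is a minimal sketch on simulated data. It is not taken from the linked paper. It just shows one common way to control for the forcing variable: fit a separate line on each side of the cutoff, using only observations within some bandwidth, and read off the jump. The function name rd_local_linear, the bandwidth of 0.5, and the simulated data (with a true jump of 2) are all assumptions made up for illustration.

```python
import numpy as np

def rd_local_linear(x, y, cutoff=0.0, bandwidth=1.0):
    """Estimate the jump in y at the cutoff using a linear fit on each side.

    Hypothetical helper for illustration only: uniform weights, no
    standard errors, and a bandwidth chosen by hand.
    """
    xc = x - cutoff                      # center the forcing variable at the cutoff
    keep = np.abs(xc) <= bandwidth       # keep observations near the cutoff
    xc, yk = xc[keep], y[keep]
    treated = (xc >= 0).astype(float)    # indicator for being at or above the cutoff
    # Columns: intercept, treatment indicator, slope, change in slope above cutoff.
    X = np.column_stack([np.ones_like(xc), treated, xc, treated * xc])
    coef, *_ = np.linalg.lstsq(X, yk, rcond=None)
    return coef[1]                       # coefficient on the indicator = estimated jump

# Simulated data (an assumption, not real data): linear trend plus a jump at 0.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 5000)
true_jump = 2.0
y = 1.0 + 0.5 * x + true_jump * (x >= 0) + rng.normal(0, 0.1, 5000)

estimate = rd_local_linear(x, y, cutoff=0.0, bandwidth=0.5)
print(f"estimated jump at the cutoff: {estimate:.3f}")
```

In real applications the hard part is exactly the choice this sketch glosses over: how flexibly to model the forcing variable and how wide a window to use around the cutoff.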

And how about a really cool example of a paper using spatial regression discontinuity? Click here.