Sunday, July 30, 2017

"Relax. Nothing is Under Control."--Adi Da

That quote seems particularly fitting for my life right now, but since this is a blog about doing applied micro, let me write again about control variables (hehe, I know, corny). Marc Bellman has a new blog post about the sensitivity of results to the use of particular controls. On the one hand, we should expect results to be sensitive. That's why we include those controls! On the other hand, if researchers are playing with different specifications, trying many different combinations of controls and only reporting those that generate significant results, ...well, you know.

Marc's post discusses a recent working paper by Lenz and San showing that in about 40% of the observational studies they analyzed in the American Journal of Political Science, researchers obtain statistical significance for their estimate of interest by tinkering with the covariates included.

Yikes! Would you expect similar numbers in an economics journal? 

In past blog entries, I've written about how economics papers have gotten longer and longer over the years and how referees often help write the paper instead of just 'refereeing'. But now that I see that 40% figure, I think maybe these are not such horrible developments. If you only have to report one specification, it's not so hard to tinker your way to statistical significance. If you also have to report the many specifications suggested by referees, it's not impossible, but it's certainly a lot harder.
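For anyone who wants to see the worry behind that 40% figure in action, here is a rough simulation sketch. Everything in it is my own assumption and not taken from the Lenz and San paper: the sample size, the five pure-noise candidate controls, the function names, and the 1.96 cutoff (a normal approximation). The true effect of d on y is zero, and the code compares how often a single pre-specified regression comes out "significant" with how often at least one of the 32 possible control sets does.

```python
import itertools
import numpy as np

def t_stat_on_d(y, d, controls):
    """t-statistic on d from OLS of y on a constant, d, and the given controls."""
    X = np.column_stack([np.ones(len(y)), d, *controls])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

def false_positive_rates(n_sims=300, n=30, k=5, crit=1.96, seed=0):
    """Share of simulations flagged significant: one spec vs. best of 2**k specs."""
    rng = np.random.default_rng(seed)
    single = searched = 0
    for _ in range(n_sims):
        d = rng.normal(size=n)
        y = rng.normal(size=n)        # true effect of d on y is zero
        Z = rng.normal(size=(n, k))   # candidate controls, pure noise (assumption)
        t_all = [
            abs(t_stat_on_d(y, d, [Z[:, j] for j in subset]))
            for r in range(k + 1)
            for subset in itertools.combinations(range(k), r)
        ]
        single += t_all[0] > crit      # pre-specified spec: no controls
        searched += max(t_all) > crit  # report whichever spec "works"
    return single / n_sims, searched / n_sims

single_rate, searched_rate = false_positive_rates()
print(f"one pre-specified spec: {single_rate:.3f}")
print(f"best of all specs:      {searched_rate:.3f}")
```

The searched rate can never be lower than the single-spec rate, because the pre-specified regression is one of the 32 candidates. How much higher it is depends on the made-up design here, so treat the numbers as an illustration of the mechanism and nothing more.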

Economists have been worried about the issue of control variables recently. I really like Marc's description of two recent papers:

(1) y = a + bX + cD + e
"The issue of what goes on the RHS of equation (1) is getting a lot of attention in the applied literature. Two prominent examples are Emily Oster’s forthcoming JBES article “Unobserved Selection and Coefficient Stability: Theory and Evidence” and Pei, Pischke, and Schwandt’s (2017) NBER working paper titled “Poorly Measured Confounders are More Useful on the Left than on the Right.”

Oster provides a method to assess just how much coefficient (as in coefficient c in equation 1) stability tells us about selection on unobservables. Pei et al. develop a test of identifying assumptions that treats putative additional controls as dependent variables in equation (1).
I expect both methods to become part of the applied econometrician’s toolkit over the next five to 10 years. At the very least, I expect a bare-bone regression of y on D alone to become something that has to be included in a paper, along with a discussion of why the controls that were included on the RHS of equation (1) were retained for analysis."
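To make equation (1) and the two ideas in that quote a bit more concrete, here is a minimal simulated sketch. The data-generating process, the coefficient values, and the helper name `ols` are all my own assumptions for illustration. It is not Oster's procedure or the actual Pei, Pischke, and Schwandt test, just the three regressions the quote refers to: y on D alone, y on X and D as in equation (1), and the candidate control X moved to the left-hand side and regressed on D.

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients for y on a constant plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 10_000

# Assumed data-generating process: X confounds the effect of D on y.
X = rng.normal(size=n)                             # observed control
D = 0.8 * X + rng.normal(size=n)                   # treatment, correlated with X
y = 1.0 + 0.5 * X + 2.0 * D + rng.normal(size=n)   # true c is 2.0

c_short = ols(y, D)[1]     # bare-bones regression of y on D alone
c_long = ols(y, X, D)[2]   # equation (1): y = a + bX + cD + e
balance = ols(X, D)[1]     # control on the left: regress X on D

print(f"c from y on D alone:   {c_short:.3f}")
print(f"c from equation (1):   {c_long:.3f}")
print(f"coefficient of X on D: {balance:.3f}")
```

In this made-up setup the coefficient on D moves when X is added, and the regression of X on D returns a clearly nonzero slope, which is the kind of signal that should make you nervous about leaving X out. With real data you would also want standard errors on all three numbers.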



Sunday, July 23, 2017

How to Choose a Title

Confession: I often don't really think about titles until it's time to submit the paper to a conference. I have been known to quickly come up with, say, four potential titles and then ask my friends to vote, but that's about it. This blog entry makes the excellent point that paper titles are really, really important. Not only that, but Patrick Dunleavy goes through both how to write a good title and how to write a terrible one. Both lists are so helpful! I really like the suggestion to do a Google search of the potential title to see what else comes up. I'm also a fan of the full narrative title idea, but I can imagine that might be really tricky to do well.

My addition to all of this: Pay attention to titles when you read papers. Which titles do you like? Which do you hate? Which are completely uninformative? I think this process alone will help you come up with better titles for your own papers. 

Inspiration for this blog post: this David Evans blog entry

Wednesday, July 12, 2017

Cool Data Alert: Health Surveys

We all know about the National Health Interview Survey (NHIS). We also (hopefully) know that the harmonized version is easy to download from our friends at IPUMS. What you may not know is that sample sizes are large enough to do good research on the U.S. immigrant population. Hooray! But I have more good news. This past weekend at the iHEA World Congress in Boston, I learned about the California Health Interview Survey (CHIS). OK, it is only for California, but it's a large data set and there are lots of questions specifically pertaining to immigrants, including parents' country of birth and visa status! UConn grad students, if you're interested in looking into this for your dissertation, come see me.

For even more data sets related to health, see here