Friday, July 31, 2015

Nothing screams “GRAD STUDENT!!!” louder than...

Here is a bit more on the LPM vs. nonlinear models issue.

My favorite quote: "Indeed, nothing screams “GRAD STUDENT!!!” louder than an obsession with fancy estimators — usually of the maximum likelihood variety, so probit, logit, tobit, etc., sometimes of the Bayesian variety — instead of with whether one has reasonably identified one’s parameter of interest (via a research design that relies on a plausibly exogenous source of variation), or with whether one’s findings have some reasonable claim at being externally valid (via the use of a representative sample)."

Also liked this:

There is an unspoken ontological order of importance to things in applied work, which unfortunately goes unspoken in most econometrics classes. That order is roughly as follows:
  1. Internal validity: Is your parameter of interest credibly identified? In other words, are you estimating a causal relationship, or are you merely dealing with a correlation? If the latter, how close can you get to estimating a causal relationship with the best available data and methods?
  2. External validity: Are your findings applicable to observations outside of your sample? Why or why not?
  3. Precision: Are your standard errors right? Have you accounted for things like heteroskedasticity? Did you cluster your standard errors at the right level?
  4. Data-generating process: Did you properly model the DGP? For example, does your estimation procedure account for the fact that, say, your dependent variable is a positive integer, which would require a Poisson or negative binomial regression?




Saturday, July 25, 2015

Should We Just Stop Teaching Probit/Logit Models?

Instead of teaching these models and then teaching students to just use linear models (see Mostly Harmless), could we just skip teaching them? Read this.

Usually, the argument is that it doesn't make a difference. Fine. But sometimes it does. Then what? And how different do estimates need to be in order to even think about this....
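
One way to get a feel for "how different" is to put the two estimates on the same scale: the LPM slope against the average marginal effect (AME) from a logit. The following is only a minimal sketch on simulated data, not anything taken from the linked post. The data-generating values (intercept -0.5, slope 1.0, a standard normal regressor) and all function names are assumptions made for illustration.

```python
import math
import random


def simulate(n=2000, seed=0):
    """Simulate a binary outcome from a logit model with one regressor.

    The intercept (-0.5) and slope (1.0) are made-up illustration values.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(-0.5 + 1.0 * x)))
        xs.append(x)
        ys.append(1.0 if rng.random() < p else 0.0)
    return xs, ys


def lpm_slope(xs, ys):
    """OLS slope of y on x with an intercept (the linear probability model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def logit_fit(xs, ys, iters=25):
    """Maximum likelihood logit (intercept a, slope b) via Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p          # score with respect to the intercept
            g1 += (y - p) * x    # score with respect to the slope
            h00 += w             # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^(-1) times the score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def logit_ame(xs, a, b):
    """Average marginal effect of x: the sample mean of b * p * (1 - p)."""
    total = 0.0
    for x in xs:
        p = 1.0 / (1.0 + math.exp(-(a + b * x)))
        total += b * p * (1.0 - p)
    return total / len(xs)


xs, ys = simulate()
a_hat, b_hat = logit_fit(xs, ys)
print("LPM slope:", round(lpm_slope(xs, ys), 4))
print("Logit AME:", round(logit_ame(xs, a_hat, b_hat), 4))
```

In a well-behaved setup like this one the two numbers should land close together. The interesting cases are the ones where they do not, for example when predicted probabilities pile up near 0 or 1, and changing the simulated intercept is a quick way to explore that.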

Thursday, July 16, 2015

RCT, RD, IV, DiD..Whatever! :)

"O Data, Data! Wherefore Art Thou Missing?"

Here is a nice summary of all of the ways to address the missing data problem. All is great if it turns out that all of these techniques suggest that the missing data are missing at random. But which specification should you use as your baseline: the missing-dummy trick, or just dropping the missing observations? I usually just drop missing observations because it's simpler. Regardless of what you choose, it is important to discuss your results from the techniques discussed in the article (even if they suggest that the missings are not missing at random).
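
The two baseline options differ only in how the estimation sample is built, which a short Python sketch can make concrete. This is an illustration under assumptions, not code from the linked article: the four toy observations, the variable names `y` and `x`, and the fill value of 0 are all made up.

```python
def listwise_delete(rows, key="x"):
    """Drop every observation where the regressor is missing."""
    return [r for r in rows if r[key] is not None]


def add_missing_dummy(rows, key="x", fill=0.0):
    """Missing-dummy trick: fill missing values with a constant and add
    an indicator equal to 1 for the filled-in observations."""
    out = []
    for r in rows:
        missing = r[key] is None
        new = dict(r)  # copy, so the original data are left untouched
        new[key] = fill if missing else r[key]
        new[key + "_missing"] = 1 if missing else 0
        out.append(new)
    return out


# Toy data (hypothetical): None marks a missing value of x.
rows = [
    {"y": 1.0, "x": 2.0},
    {"y": 0.5, "x": None},
    {"y": 2.0, "x": 3.5},
    {"y": 1.5, "x": None},
]

dropped = listwise_delete(rows)
flagged = add_missing_dummy(rows)
print(len(dropped), "observations after dropping missings")
print(len(flagged), "observations with the missing dummy")
```

With the dummy approach, the regression would include both `x` and `x_missing` on the right-hand side, so the full sample is kept. Dropping keeps the specification simpler at the cost of a smaller sample.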

Tuesday, July 7, 2015

Stata Command for Using RD

I don't have a single RD paper, but hopefully I will in the future. And when that time comes, I want to be prepared with the appropriate funky Stata command. Here it is.