Monday, June 29, 2020

How to Properly Cite Data

I am a big, big fan of writing papers so that readers can exactly replicate results. One important part of this is telling readers exactly how to get your original data sources. Properly citing data sources certainly helps in this regard, but I think it's also a nice way to reward authors and other organizations for making the data available to researchers. 

One issue in particular I have come across lately is how to cite data that authors have very generously made available on their personal websites or journal websites (I'm thinking of you, David Dorn!). Worry no longer because you can find detailed guidance on how to cite data right here. Note specifically how to cite replication data. Super useful! 

Is the Paper Ready to be Submitted to a Journal?

You may have noticed I haven't posted in a while. The reason: I've been trying to put those last finishing touches on a paper before submitting. It's taking a lot longer than I would have liked. For those of you out there who may be in similar situations, see this Twitter thread by Brian Knight for some insights.

Thursday, June 11, 2020

Stata Tip: Multiple Commands Available for Multiple Hypothesis Testing

Yesterday, I attended a (Zoom) seminar where multiple hypothesis testing was taken very seriously (you can watch it here if you're interested). I decided I needed to think more about this and, luckily for me, David McKenzie had already thought very carefully about it and blogged about it recently. He evaluates many different Stata commands, including the one I heard about in the talk yesterday.

But the most eye-opening thing was this simple calculation that I admit (shamefully) I had never thought about before. Let's say you examine the impact of four different treatments, and you look at five different outcomes, for 20 hypothesis tests in all. Not crazy numbers, right? David writes,

Suppose that none of the treatments have any effect on any outcome (all null hypotheses are true), and that the outcomes are independent. Then if we just test the hypotheses one by one, then the probability that of one or more false rejections when using a critical value of 0.05 is 1-0.95^20 = 64%. 

64%! That's quite a high probability, right?! Wouldn't it be crazy to then write an entire paper on that one result? Multiple hypothesis testing adjustments correct for the fact that we are testing many hypotheses at once.
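
If you want to play with the arithmetic yourself, here is a minimal Python sketch. It is only an illustration, not one of the Stata commands David reviews. The function names are made up for this example, and it keeps the assumptions from the quote: every null hypothesis is true and the tests are independent. The Bonferroni threshold is included as one simple, conservative adjustment; it is not necessarily the method used in the talk or the one David recommends.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false rejection across independent tests
    when every null hypothesis is true (hypothetical helper)."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test critical value under a Bonferroni correction (hypothetical helper)."""
    return alpha / num_tests


if __name__ == "__main__":
    # 20 tests = 4 treatments x 5 outcomes, matching the 0.95^20 in the quote.
    for m in (1, 5, 10, 20):
        unadjusted = familywise_error_rate(m)
        adjusted = familywise_error_rate(m, bonferroni_threshold(m))
        print(f"{m:>2} tests: unadjusted {unadjusted:.1%}, Bonferroni {adjusted:.1%}")
```

With 20 tests, the unadjusted probability comes out at about 64%, while testing each hypothesis at 0.05/20 = 0.0025 keeps the chance of any false rejection just under 5%.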

Thank you to all of the authors of the Stata commands. You have made it easier for all of us to make the right adjustments, and of course, thank you to David for helping us think about these issues. 



Tuesday, June 2, 2020

Cool Data Alert: COVID-19 Data

I have blogged before about whether you should drop everything and write that COVID-19 paper. If you do decide that, for you, the answer is yes, then check out Stata's resources on how to download and import COVID-19 data to your computer. What an amazing time to be a researcher! 


Updated (6/3/20): Check out how existing longitudinal data sets are incorporating people's experiences with the pandemic. You can sign up for a webinar here.