Saturday, March 23, 2019

And More on Those Pesky P Values

I never really thought about statistics/econometrics being such a political subject, but I guess it is. A recent article in Nature comes with a list of over 800 signatures rising up against statistical significance!

There is much to like about the article. Many things to think about. I do often see students in my office really excited about a placebo regression with statistically insignificant estimates--despite the fact that the point estimates are just as large as (or larger than!) those in a baseline regression. That's not exactly what I want to see in placebo regressions. I've also seen people really excited when they get stars even though the point estimates are just too big to be believable!

I think ignoring estimate magnitudes can be a mistake when trying to write good papers. Paying too much attention to those little stars also makes for bad science. One of my favorite quotes from the article:

"Statistically significant estimates are biased upwards in magnitude and potentially to a large degree, whereas statistically non-significant estimates are biased downwards in magnitude. Consequently, any discussion that focuses on estimates chosen for their significance will be biased. On top of this, the rigid focus on statistical significance encourages researchers to choose data and methods that yield statistical significance for some desired (or simply publishable) result, or that yield statistical non-significance for an undesired result, such as potential side effects of drugs — thereby invalidating conclusions."
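The bias described in that quote is easy to see in a quick simulation. This is only a rough sketch under assumptions I made up for illustration: a true effect of 0.2, a standard error of 0.1, normally distributed estimates, and the usual 1.96 cutoff. None of these numbers come from the article.

```python
import random
from statistics import mean


def simulate_significance_filter(true_effect=0.2, std_error=0.1,
                                 n_studies=20000, seed=0):
    """Draw many hypothetical study estimates and split them by significance.

    Returns the average magnitude of the significant estimates and of the
    non-significant estimates. All parameter values are illustrative.
    """
    rng = random.Random(seed)
    critical_value = 1.96  # two-sided 5% cutoff under a normal approximation
    significant, not_significant = [], []
    for _ in range(n_studies):
        # Each "study" yields an unbiased but noisy estimate of the true effect.
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate / std_error) > critical_value:
            significant.append(abs(estimate))
        else:
            not_significant.append(abs(estimate))
    return mean(significant), mean(not_significant)


sig_mean, nonsig_mean = simulate_significance_filter()
print(f"true effect:                    {0.2:.3f}")
print(f"mean |estimate|, significant:     {sig_mean:.3f}")
print(f"mean |estimate|, non-significant: {nonsig_mean:.3f}")
```

Every individual estimate here is unbiased. The bias only shows up once we sort estimates by whether they earned stars: the starred ones average above the true effect and the unstarred ones average below it.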

So what does the article recommend?

"...we recommend that authors describe the practical implications of all values inside the interval, especially the observed effect (or point estimate) and the limits."

I definitely think that's a great idea. Think about what the estimates mean in the specific context of your paper. For some questions, a wide interval of potential estimates is still interesting. For other questions, maybe not.

Am I ready to abandon discussion of statistical significance altogether? Maybe not yet. Those stars are a nice and easy way to determine how confident we should be that there is enough data/variation in the data to be able to learn something about the world. Sure, thresholds may not be ideal for many reasons, but they do provide a quick way to make comparisons across studies.

So, how about this? Let's keep the stars but maybe report p values instead of standard errors? Would that be so crazy? And I'm all for pictures of estimates with confidence intervals around them.
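For what it's worth, going from an estimate and its standard error to a p value and a confidence interval takes only a few lines. Here is a minimal sketch, assuming a normal approximation (no degrees-of-freedom correction). The function name `summarize_estimate` and the example numbers are hypothetical, not taken from any real regression.

```python
from statistics import NormalDist


def summarize_estimate(estimate, std_error, level=0.95):
    """Return the two-sided p value and confidence interval for an estimate.

    Uses a standard normal approximation, which is an assumption; small
    samples would call for a t distribution instead.
    """
    standard_normal = NormalDist()
    z_stat = estimate / std_error
    # Two-sided p value for the null hypothesis that the true effect is zero.
    p_value = 2 * (1 - standard_normal.cdf(abs(z_stat)))
    # Critical value for the requested confidence level (about 1.96 for 95%).
    z_crit = standard_normal.inv_cdf(1 - (1 - level) / 2)
    lower = estimate - z_crit * std_error
    upper = estimate + z_crit * std_error
    return p_value, (lower, upper)


# Hypothetical coefficient and standard error, purely for illustration.
p_value, (lower, upper) = summarize_estimate(0.5, 0.25)
print(f"p = {p_value:.4f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

In this made-up example the estimate would get a star at the 5% level, yet the interval runs from just above zero to nearly double the point estimate. That is exactly the kind of range the article asks us to discuss.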

The authors of the article hope that abandoning statistical significance will get people to spend less time with statistical software and more time thinking. I'm up for that!

P.S.
I have had this song in my head the entire time writing this post: https://www.youtube.com/watch?v=79ZLtr-QYNA. Enjoy!

2 comments:

  1. See twitter comments on the article: https://twitter.com/NatureNews/status/1108711120691494913

  2. More: https://twitter.com/DavidAJaeger/status/1109074900260737024
