
"Don’t base your conclusions solely on whether an association or effect was found to be “statistically significant” (i.e., the p-value passed some arbitrary threshold such as p < 0.05).

Don’t believe that an association or effect exists just because it was statistically significant.

Don’t believe that an association or effect is absent just because it was not statistically significant.

Don’t believe that your p-value gives the probability that chance alone produced the observed association or effect or the probability that your test hypothesis is true.

Don’t conclude anything about scientific or practical importance based on statistical significance (or lack thereof)."
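
A minimal sketch of why the first two points matter (my own illustration, not part of the quoted statement; it assumes NumPy and SciPy, and the numbers are arbitrary): when the null is exactly true, about 5% of experiments still come out "significant" at p < 0.05, so a lone significant result is weak evidence on its own that an effect exists.

    # Simulate many experiments where there is no real effect and count how
    # often a two-sample t-test nevertheless reports p < 0.05.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_experiments, n_per_group = 10_000, 30

    false_positives = 0
    for _ in range(n_experiments):
        a = rng.normal(0.0, 1.0, n_per_group)  # both groups drawn from the
        b = rng.normal(0.0, 1.0, n_per_group)  # same distribution: no effect
        if stats.ttest_ind(a, b).pvalue < 0.05:
            false_positives += 1

    print(f"'significant' with no true effect: {false_positives / n_experiments:.1%}")
    # prints roughly 5%, by construction of the 0.05 threshold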

Hopefully this can help address the replication crisis[0] in (social) science.

[0]: https://en.wikipedia.org/wiki/Replication_crisis

Edit: Formatting (sorry formatting is hopeless).




I think NHST is kind of overstated as a cause of the replication crisis.

People do routinely misuse and misinterpret p-values (the worst of it I've seen is actually in the biomedical and biological sciences, but I'm not sure that matters). Attending to their appropriate use, as well as to alternatives, is warranted.

However, even if everyone started focusing on, say, Bayesian credible intervals, I don't think it would change much. People would still adopt some criterion, some decision threshold for how to interpret a result, and it would end up looking like p-values. People would abuse it in the same ways.
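
To make that concrete (a minimal sketch of my own, assuming a normal model with a flat prior and NumPy/SciPy; the data are toy numbers): under those assumptions the 95% credible interval for a mean coincides with the classical confidence interval, so the decision rule "does the interval exclude zero?" reproduces the "is p < 0.05?" rule.

    # Flat-prior normal model: the posterior for the mean is a t distribution,
    # so the 95% credible interval equals the classical 95% CI, and
    # "interval excludes 0" is the same decision as "p < 0.05".
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    x = rng.normal(0.3, 1.0, 40)              # toy data with a small true effect

    mean = x.mean()
    se = x.std(ddof=1) / np.sqrt(len(x))
    t_crit = stats.t.ppf(0.975, len(x) - 1)
    lo, hi = mean - t_crit * se, mean + t_crit * se
    p = stats.ttest_1samp(x, 0.0).pvalue

    print(f"95% interval: ({lo:.3f}, {hi:.3f}), p = {p:.3f}")
    print("excludes zero:", lo > 0 or hi < 0, "| p < 0.05:", p < 0.05)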

Although this paper is well-intentioned and offers reasonable, actionable advice, it suffers from some of the same problems I think are typical of this area. It tends to assume your data is fixed and that the question is how to interpret your modeling and results. But in the broader scientific context, the data isn't ideally a fixed quantity: it's collected by someone, and there's a broader question of "why this N, why this design", and so forth. So yes, p-values are arbitrary, but they're not necessarily arbitrary relative to your study design, in the sense that if p < 0.05 is the standard the field has adopted and you have p = 0.053, the onus is on you to increase your N or choose a more powerful or more convincing design to demonstrate something at whatever threshold the field has settled on.
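
As a sketch of the "increase your N" point (again my own illustration, not from the paper; the effect size and sample sizes are invented), a quick simulation shows how power climbs with N at a fixed threshold, which is the design question the p-value by itself never answers:

    # Estimate by simulation how often a two-sample t-test reaches p < 0.05
    # (its power) for a given standardized effect size and per-group N.
    import numpy as np
    from scipy import stats

    def estimated_power(n_per_group, effect_size=0.3, n_sims=2_000, seed=0):
        rng = np.random.default_rng(seed)
        hits = 0
        for _ in range(n_sims):
            a = rng.normal(0.0, 1.0, n_per_group)
            b = rng.normal(effect_size, 1.0, n_per_group)
            if stats.ttest_ind(a, b).pvalue < 0.05:
                hits += 1
        return hits / n_sims

    for n in (50, 100, 175, 250):
        print(n, round(estimated_power(n), 2))
    # for d = 0.3, power only reaches ~0.8 around n ≈ 175 per group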

I'm not necessarily trying to argue for p-values per se; science is much more than p-values or even statistics, and I think the broader problem lies with vocational incentives and things like that. But I do think that at some level people will often, if not usually, want some categorical decision criterion to decide "this is a real effect, not equal to the null", and any such decision criterion will produce questionable behavior around it.

It's uncommon in science in general to be in a situation where the question of interest is genuinely to estimate a parameter with precision per se. There are cases of this, in physics for example, but in most other fields I think that's not the case. Many (most?) fields just don't have the predictive precision of the physical sciences, to the point where differences of a parameter value from some nonzero theoretical one matter. Usually the hypothesis is of a nonzero effect, or of some difference from an alternative; moreover, even when there is interest in estimating a parameter value, there's often (as in physics) an implicit desire to test whether or not the value deviates significantly from a theoretical one, so you're back to a categorical decision threshold.
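
A small sketch of that last point (my example, not the parent's; the "theoretical value" of 0.5 is invented): even when the estimate itself is what's interesting, what typically gets reported is still a threshold test against the theoretical value.

    # Estimate a parameter, then test whether it deviates "significantly"
    # from a theoretical prediction (0.5 here), i.e. a categorical decision.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    measurements = rng.normal(0.52, 0.1, 60)   # toy data; theory predicts 0.5

    res = stats.ttest_1samp(measurements, popmean=0.5)
    print(f"estimate = {measurements.mean():.3f}, p vs. theory = {res.pvalue:.3f}")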


> Hopefully this can help address the replication crisis[0] in (social) science.

I think it isn't just p-hacking.

I've participated in a bunch of psychology studies (questionnaires) for university, and I've frequently had situations where my answer to some question didn't fit any of the possible answer choices at all. So I'd sometimes just choose whatever seemed the least wrong answer, out of frustration.

It often felt like the study author's own beliefs and biases strongly influence how studies are designed, and that might be the bigger issue. It made me feel pretty disillusioned with the field; I frankly find it weird that they call it a science. Although that is of course just based on the few studies I've seen.


> the study author's own beliefs and biases strongly influence how studies are designed

While studies should try to be as "objective" as possible, it isn't clear how this can be avoided. How can the design of a study not depend on the author's beliefs? After all, the study is usually designed to test some hypothesis (that the author has based on their prior knowledge) or measure some effect (that the author thinks exists).


There is a difference between a belief and an idea. I might have an idea about what causes some bug in my code, but it isn't a belief. I'm not trying to defend it, but to research it. Though I have met people who do hold beliefs about why code is broken: they refuse to consider the larger body of evidence and will cherry-pick what we know about an incident to back their own conclusions.

Can we recognize the beliefs we have that bias our work and then take action to eliminate those biases? I think that is possible when we aren't studying humans, but beliefs we have about humans are on a much deeper level and psychology largely doesn't have the rigor to account for them.


If you get an answer outside of what you expected, reevaluate your approach, fix your study and redo it all, probably with a new set of participants.

If you can't do science, don't call it science.


Which is a great idea if we ignore all the other issues in academia, e.g. the pressure to publish. I fear that taking such a hard-line stance will just result in much less science being done.


> much less science being done

This isn't obviously a bad thing, in the context of a belief that most results are misleading or wrong.


Let's do less science then, but do it rigorously and thoroughly. Or find more funding.

But surely let's have a "hard-line stance" on not drowning in BS?


And where will the money come from for this second study? What about a third? Fourth?

We live in a money-dependent world. We cannot go without it.


Psychology is IMO in the state alchemy was before chemistry. And there's no guarantee it will evolve beyond that. Not unless we can fully simulate the mind.



