Even in the most advanced discussions of modern patient management, I have not once met a colleague who understood p-values. Not that I do.
In last Friday’s session we ran the classic experiment (described e.g. in this article) of offering the audience several candidate interpretations of a p-value, none of which is actually correct. Which is funny. Then we discussed the core definition. Which is simple. And we worked through the examples in the Wikipedia entry (by the way: Wikipedia provides excellent articles on statistics!), such as computing p for simple random variables (e.g. the number of heads in n tosses of a coin). Which is hard.
Just for completeness: the p-value is the probability of obtaining a test statistic at least as extreme as the one computed from the data, assuming the null hypothesis to be true.
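To make the definition concrete, here is a minimal sketch of the coin-toss case in Python. The function names and the 14-heads-in-20-tosses numbers are my own choices for illustration, and "more extreme" in the two-sided version is taken to mean "no more probable than what was observed", which is one common convention among several.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n tosses if P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_value_one_sided(heads, n, p=0.5):
    """P(at least `heads` heads in n tosses), assuming the null P(heads) = p."""
    return sum(binom_pmf(k, n, p) for k in range(heads, n + 1))


def p_value_two_sided(heads, n, p=0.5):
    """Total probability of all outcomes no more probable than the observed one."""
    observed = binom_pmf(heads, n, p)
    # The small tolerance guards against floating-point ties.
    return sum(
        pmf
        for k in range(n + 1)
        if (pmf := binom_pmf(k, n, p)) <= observed * (1 + 1e-9)
    )


# Hypothetical data: 14 heads in 20 tosses, null hypothesis "the coin is fair".
print(round(p_value_one_sided(14, 20), 4))  # 0.0577
print(round(p_value_two_sided(14, 20), 4))  # 0.1153
```

Note that the null hypothesis enters only as the fixed p = 0.5 used to compute the probabilities; nothing in the calculation says how likely it is that the coin really is fair.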
Unfortunately we did not get far enough to discuss the Bayesian alternatives to p-values and their decision-theoretic applications.
Here are my take-home messages:
- p-values are a statistical property of the data.
- To compute them you need a statistical model and must assume the null hypothesis to be true.
- They say nothing about the truth of the null hypothesis or (even less so!) any alternative hypothesis.
- They tend to shrink as the effect size grows, all else being equal, yet a small effect can have the tiniest p (given a large enough sample) and a large effect can have a big one.
- They cannot be compared directly among studies, because they depend on sample size and design as well as on the effect.
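
The points about effect size and about comparing studies can be sketched with two made-up coin-toss "studies": a small one with a large deviation from fairness and a large one with a small deviation. The function name and all numbers below are hypothetical, chosen only for illustration.

```python
from math import comb


def p_one_sided_fair_coin(heads, n):
    """P(at least `heads` heads in n tosses of a fair coin).

    Integer arithmetic is used for the tail sum so that large n
    does not underflow floating point.
    """
    tail = sum(comb(n, k) for k in range(heads, n + 1))
    return tail / 2**n


# Hypothetical small study: 7 heads in 10 tosses (70% heads, a large effect).
small_study = p_one_sided_fair_coin(7, 10)

# Hypothetical large study: 5200 heads in 10000 tosses (52% heads, a small effect).
large_study = p_one_sided_fair_coin(5200, 10000)

print(small_study)                # 0.171875
print(large_study < small_study)  # True
```

The large study yields a far smaller p although its effect is much weaker, so the two p-values say little about which coin is "more unfair" unless the sample sizes are taken into account.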