Even in the most advanced discussions of modern patient management, I have not once met a colleague who understood p-values. Not that I do.

In last Friday’s session we ran the classic experiment (described e.g. in this article) of offering the audience several explanations of how to interpret p-values, none of which was actually correct. Which is funny. Then we discussed the core definition. Which is simple. And worked through the examples in the Wikipedia entry (by the way: Wikipedia provides excellent articles on statistics!), such as computing p for simple random variables (for instance the number of heads in n tosses of a coin). Which is hard.

Just for completeness: the p-value is the probability of obtaining a test statistic at least as extreme as the one computed from the data, assuming the null hypothesis to be true.
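
To make the definition concrete, here is a minimal Python sketch for the coin example. The function names are my own, and the two-sided convention used here (sum the probabilities of all outcomes no more likely than the observed one) is only one of several in use, so treat it as an illustration and not as the one true recipe.

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n tosses if P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def p_value_one_sided(heads, n, p=0.5):
    """P(X >= heads), assuming the null hypothesis P(heads) = p is true."""
    return sum(binomial_pmf(k, n, p) for k in range(heads, n + 1))

def p_value_two_sided(heads, n, p=0.5):
    """Sum the probabilities of all outcomes that are no more likely
    than the observed one (assumed convention, see text)."""
    observed = binomial_pmf(heads, n, p)
    tolerance = 1 + 1e-9  # guard against floating-point ties
    return sum(
        pr
        for pr in (binomial_pmf(k, n, p) for k in range(n + 1))
        if pr <= observed * tolerance
    )

if __name__ == "__main__":
    # 14 heads in 20 tosses of a supposedly fair coin
    print(round(p_value_one_sided(14, 20), 4))  # 0.0577
    print(round(p_value_two_sided(14, 20), 4))  # 0.1153
```

Note what the numbers are: probabilities of data (14 or more heads), computed under the assumption that the coin is fair. Nowhere does the code compute the probability that the coin is fair.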

Unfortunately we did not get far enough to discuss the Bayesian alternatives to p-values and their decision-theoretic applications.

Here are my take-home messages:

  • p-values are a statistical property of the data.
  • You require a statistical model and have to assume the null hypothesis to be true to be able to compute them.
  • They say nothing about the truth of the null hypothesis or (even less so!) any alternative hypothesis.
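  • For a fixed sample size they tend to shrink as the effect size grows, yet a small effect can come with the tiniest p (given a large sample) and a large effect with an unimpressive one (given a small sample).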
  • They cannot be compared among studies.
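
The point about effect size and sample size can be illustrated with the coin again. The sketch below is my own (hypothetical function name, made-up toss counts) and uses exact integer arithmetic so that large n does not underflow; it compares a large effect in a small sample with a small effect in a large one.

```python
def p_value_at_least(heads, n):
    """Exact one-sided p-value P(X >= heads) for n tosses of a fair coin.

    Counts the outcomes with at least `heads` heads using integers only,
    then divides by the 2**n equally likely outcomes.
    """
    from math import comb

    term = comb(n, heads)  # number of sequences with exactly `heads` heads
    tail = 0
    for k in range(heads, n + 1):
        tail += term
        # comb(n, k + 1) = comb(n, k) * (n - k) / (k + 1), exact in integers
        term = term * (n - k) // (k + 1)
    return tail / 2**n

if __name__ == "__main__":
    # Large effect (70% heads), small sample: not "significant"
    print(p_value_at_least(7, 10))
    # Small effect (52% heads), large sample: very small p
    print(p_value_at_least(5200, 10000))
```

The first call gives a p of about 0.17, the second one far below 0.001, although 52% heads is a much smaller deviation from a fair coin than 70%. The p-value mixes effect size and sample size, which is one reason it makes little sense to compare p-values among studies.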


How to read the publication of a randomized controlled trial

Traditional evidence-based medicine à la Sackett has it that you should consider a couple of aspects of a randomized controlled trial before you believe it. Using the chapter on RCTs in the User’s Guide and the SPARCL trial in the NEJM, we discovered quite a lot of flaws in this not quite so new trial, which spawned a lot of controversy about the bleeding complications of high-dose statins, a topic of its own.

My hidden agenda was also to sharpen minds about the disadvantages of large-scale RCTs. While I would not go as far as James Penston in his book or one of his articles and condemn them totally (hey, they are still the best we’ve got in terms of research, after n=1 trials), it is worthwhile to understand why phase 3 RCTs are more an economic than a medical undertaking. So go on and read one of his articles, say this.

Quite curiously, we did not go into this topic as far as I would have wished and instead focused on p-values, confidence intervals and frequentist statistics – a topic we have to deal with again in the future (after I have read up on Fisher vs. Neyman-Pearson).