The myth of objective statistics - von Ravenzwaaij - 2017 - Article


What is the difference between Bayesian hypothesis testing and traditional statistics?

Bayesian hypothesis testing is different from traditional null-hypothesis testing. The four problems with this traditional method are:

  1. Evidence in favour of the null-hypothesis cannot be quantified.

  2. When the null-hypothesis is true, it is often over-rejected in cases where the data are unlikely under both the null and the alternative hypothesis.

  3. Interpreting p-values is very difficult.

  4. Sequential testing is not allowed with the use of p-values.

In Bayesian hypothesis testing, none of these problems arise because prior knowledge is combined with the likelihood of the observed data. This yields a posterior belief: what we believe about the world after we have seen the data. For example:

  • Prior: there is a 45% chance of rain.

  • Likelihood: the chance I walk out with an umbrella given that it rains is about 80% (sometimes I forget), and the chance I walk out with an umbrella given that it is dry is 10%.

  • Posterior: the joint probability that I walk out with an umbrella and it rains is .45*.8 = .36, and the joint probability that I walk out with an umbrella and it is dry is .55*.1 = .055. Given that I did walk out with an umbrella, our posterior belief about each hypothesis is:

    • H0 (it is dry): .055/(.055+.36) = .13

    • H1 (it is raining): .36/(.055+.36) = .87
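
The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch of Bayes' rule for two hypotheses; the function name `posterior_rain` and its parameter names are illustrative choices, not something taken from the article.

```python
def posterior_rain(prior_rain, p_umbrella_given_rain, p_umbrella_given_dry):
    """Posterior probability of rain after seeing an umbrella (Bayes' rule)."""
    # Joint probability of "umbrella and rain" and of "umbrella and dry".
    joint_rain = prior_rain * p_umbrella_given_rain
    joint_dry = (1 - prior_rain) * p_umbrella_given_dry
    # Normalise by the total probability of seeing an umbrella.
    return joint_rain / (joint_rain + joint_dry)


# Numbers from the example: 45% prior, 80% and 10% likelihoods.
p_rain = posterior_rain(0.45, 0.80, 0.10)
print(round(p_rain, 2), round(1 - p_rain, 2))  # prints: 0.87 0.13
```

The two printed values correspond to H1 (it is raining) and H0 (it is dry).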

What are the advantages of Bayesian hypothesis testing?

Based on the umbrella and the weather forecast, you can conclude that there is an 87% chance it is raining, without a p-value involved. By combining our prior belief with what is in the data, we get a more refined belief about the world.

  • Evidence in favour of the null-hypothesis is quantified: there is a 13% chance that it is dry.

  • There is no bias against the null-hypothesis in the face of unlikely data.

  • The posteriors are easy to interpret.

  • New data can be collected, and the new posteriors can be calculated along the way.
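
The last point, updating as new data arrive, can be illustrated with the umbrella numbers. The sketch below assumes a hypothetical second, independent umbrella sighting with the same likelihoods (80% and 10%); that second observation is not part of the article's example. It uses the posterior after the first sighting as the prior for the second, and compares the result with processing both sightings at once.

```python
def update(prior_h1, lik_h1, lik_h0):
    """One Bayesian update: posterior for H1 given one new observation."""
    joint_h1 = prior_h1 * lik_h1
    joint_h0 = (1 - prior_h1) * lik_h0
    return joint_h1 / (joint_h1 + joint_h0)


# Sequential: yesterday's posterior becomes today's prior.
belief = 0.45  # prior chance of rain from the example
for _ in range(2):  # two umbrella sightings (the second is an assumption)
    belief = update(belief, 0.80, 0.10)

# Batch: both sightings at once, multiplying the independent likelihoods.
batch = update(0.45, 0.80 * 0.80, 0.10 * 0.10)

print(round(belief, 2), round(batch, 2))  # prints: 0.98 0.98
```

Under the independence assumption the sequential and batch results agree, which is why looking at the data along the way does not distort the posterior.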

What are the implications for research?

In determining the efficacy of new medicines, the FDA sticks to the use of p-values. This can cause errors when medications are endorsed based on a combination of p-value results. According to the author, strict adherence to the FDA’s policy of endorsing new medications only after two statistically significant trials leads to many ineffective medicines being sold. The policy assumes that statistics can be applied without thinking; the previous examples in the article show that this is clearly not the case.

Statistics is not a finished field; on the contrary, it is a vibrant area of science that includes some cutting-edge subjects about which some scientists strongly disagree. The author's most important lesson is not to throw statistics away, but to think critically about the assumptions behind conclusions based on p-values: “we need to continue to think about smart ways to generalize conclusions drawn from a small population to the entire population and get the information we need”.
