Key takeaways
- A p-value measures how surprising your data would be if the null hypothesis were true—not the probability that your theory is correct.
- Statistical significance at p < .05 is a convention, not a universal truth standard.
- Always pair p-values with effect sizes, confidence intervals, and study context.
Statistical significance appears in nearly every quantitative dissertation, yet p-values remain one of the most misunderstood concepts in academic research. Students often treat p = .047 as proof that their hypothesis is correct and p = .051 as failure; neither interpretation is statistically valid. Understanding what a p-value actually measures, what statistical significance really means, and how to communicate findings responsibly is essential for thesis writing, journal submission, and viva voce defence.
What is a p-value in plain language?
The p-value is a probability between 0 and 1. It answers a specific question: if the null hypothesis (no effect, no difference, no relationship) were true in the population, how likely is it that you would observe a sample result as extreme as—or more extreme than—what you obtained? A small p-value means such a result would be rare under the null; a large p-value means it would be common.
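One way to make this definition concrete is a permutation test, which builds the "if the null were true" world directly by shuffling group labels. The sketch below is a minimal illustration, not a recommended analysis: the scores, the function name permutation_p_value, and the choice of 10,000 shuffles are all assumptions made for the example.

```python
import random

def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_resamples):
        # Under H0 the group labels are arbitrary, so reshuffle them.
        rng.shuffle(pooled)
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        # Count shuffles at least as extreme as the real difference.
        if abs(mean_a - mean_b) >= observed - 1e-12:
            extreme += 1
    # Adding 1 to both counts keeps the estimate from being exactly 0.
    return (extreme + 1) / (n_resamples + 1)

# Hypothetical test scores for two teaching methods (invented data).
control = [12, 15, 14, 10, 13, 16, 11, 14]
treated = [18, 17, 20, 16, 19, 15, 21, 18]

p_value = permutation_p_value(control, treated)
print(f"p = {p_value:.4f}")
```

The returned value is simply the share of label shuffles that produced a difference at least as large as the observed one, which is the question a p-value answers.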
What statistical significance means
Statistical significance is a decision rule, not a discovery. You set an alpha level, typically α = .05, before analysis; alpha is the share of false positives you are prepared to accept in the long run when the null hypothesis is true. If p < α, you reject the null hypothesis and call the result statistically significant. That means you have evidence against the null at your chosen error tolerance. It does not mean the effect is large, important, or practically meaningful.
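To see why alpha works as an error tolerance, you can simulate many studies in which the null hypothesis really is true and count how often the rule "reject if p < α" fires. This is a rough sketch under simplifying assumptions: each simulated study draws 30 values from a normal population with mean 0 and a known standard deviation of 1, and uses a z-test. The function names and the number of simulated studies are illustrative.

```python
import random
from statistics import NormalDist

def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-tailed z-test p-value, assuming the population SD is known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(alpha=0.05, n_studies=4000, n_per_study=30, seed=1):
    """Share of simulated studies that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_studies):
        # Every sample comes from a population where H0 (mean = 0) holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_per_study)]
        if z_test_p_value(sample) < alpha:
            rejections += 1
    return rejections / n_studies

rate = false_positive_rate()
print(f"Share of 'significant' results when H0 is true: {rate:.3f}")
```

The printed share should land close to .05: with α = .05, roughly one study in twenty reaches significance when there is no effect at all.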
The hypothesis testing framework behind p-values
1. Formulate null (H0) and alternative (H1) hypotheses from your research question.
2. Choose a test appropriate to your design and variable types.
3. Set alpha and collect data according to your approved protocol.
4. Calculate a test statistic and derive the p-value.
5. Compare p to alpha and make a cautious inferential statement.
6. Report exact p-values, effect sizes, and confidence intervals.
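The six steps can be traced in a few lines of code. The sketch below assumes a deliberately simple design: 20 participants each choose between two options, and an exact two-sided binomial test checks whether the preference differs from 50/50. The data (15 of 20) and the helper name binomial_two_sided_p are invented for illustration, and the confidence interval from step 6 is left out to keep the example short.

```python
from math import comb

def binomial_two_sided_p(successes, n, p0=0.5):
    """Exact two-sided binomial p-value: total probability, under H0,
    of every outcome no more likely than the observed one."""
    def pmf(k):
        return comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
    observed = pmf(successes)
    return sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))

# Step 1: H0: the proportion choosing option A is .50; H1: it is not .50.
# Step 2: binary outcome, one sample, so an exact binomial test is used.
# Step 3: set alpha before looking at the data.
ALPHA = 0.05
successes, n = 15, 20  # hypothetical data: 15 of 20 chose option A

# Step 4: derive the p-value.
p_value = binomial_two_sided_p(successes, n)

# Step 5: compare p to alpha and word the conclusion cautiously.
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"

# Step 6: report the exact p-value with an estimate of the effect.
proportion = successes / n
print(f"Observed proportion = {proportion:.2f}, p = {p_value:.3f} ({decision})")
```

With these invented numbers the p-value is about .041, so H0 is rejected at α = .05, but the statement that follows should still be cautious: the data are unlikely under a 50/50 split, nothing more.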
Common significance thresholds and how to read them
- p < .001: very strong evidence against H0 at conventional alpha levels.
- p < .01: strong evidence against H0.
- p < .05: conventional threshold for statistical significance in most social sciences.
- p between .05 and .10: not significant at α = .05; may warrant discussion but not rejection of H0.
- p > .10: little or no evidence against H0; fail to reject.
What p-values do not tell you
- They do not prove your alternative hypothesis is true.
- They do not measure effect size or practical importance.
- They do not indicate whether your model fits well overall.
- They do not guarantee replicability in future studies.
- They are not the probability that H0 is true—that requires Bayesian inference.
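The last point is easiest to see with a short Bayes' rule calculation. The sketch below estimates the probability that H0 is true given a significant result, and it needs two inputs a p-value never supplies: a prior probability that H0 is true and the power of the study. All input values are assumptions chosen for illustration, and the function name is hypothetical.

```python
def prob_null_given_significant(prior_null, alpha, power):
    """P(H0 true | significant result) by Bayes' rule.

    prior_null: assumed prior probability that H0 is true
    alpha:      P(significant | H0 true)
    power:      P(significant | H0 false)
    """
    false_positives = alpha * prior_null
    true_positives = power * (1 - prior_null)
    return false_positives / (false_positives + true_positives)

# Same alpha and power, two different assumed priors.
even_odds = prob_null_given_significant(prior_null=0.5, alpha=0.05, power=0.8)
long_shot = prob_null_given_significant(prior_null=0.9, alpha=0.05, power=0.8)

print(f"Prior P(H0) = .50 -> P(H0 | significant) = {even_odds:.3f}")
print(f"Prior P(H0) = .90 -> P(H0 | significant) = {long_shot:.3f}")
```

Under these assumptions the answer moves from about .06 to .36 while alpha stays at .05, which is why a p-value on its own cannot be read as the probability that the null is true.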
Statistical significance vs practical significance
A result can be statistically significant but trivial in real-world terms, because large samples can detect tiny differences. Conversely, a non-significant result in a small sample may mask a meaningful effect due to low statistical power. Report Cohen's d, η², r, or R² alongside p-values so examiners can assess importance, not just significance.
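The contrast between the two kinds of significance shows up clearly when the same calculation is run on a very large and a very small study. The sketch below works from summary statistics and uses a normal (z) approximation with a common, known standard deviation, which is a simplification; a dissertation would normally use a t-test on the raw data. The numbers and the function name summarise_difference are invented for illustration.

```python
from statistics import NormalDist

def summarise_difference(mean_diff, sd, n_per_group):
    """Cohen's d, two-tailed p-value and 95% CI for a two-group mean
    difference, using a z approximation with equal group sizes."""
    d = mean_diff / sd                       # standardised effect size
    se = sd * (2 / n_per_group) ** 0.5       # standard error of the difference
    z = mean_diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    margin = NormalDist().inv_cdf(0.975) * se
    return d, p, (mean_diff - margin, mean_diff + margin)

# Huge sample, tiny effect (hypothetical numbers).
d_big, p_big, ci_big = summarise_difference(mean_diff=0.02, sd=1.0, n_per_group=100_000)
# Small sample, medium effect (hypothetical numbers).
d_small, p_small, ci_small = summarise_difference(mean_diff=0.5, sd=1.0, n_per_group=20)

print(f"n = 100000 per group: d = {d_big:.2f}, p = {p_big:.6f}")
print(f"n = 20 per group:     d = {d_small:.2f}, p = {p_small:.3f}")
```

In this sketch the tiny effect (d = 0.02) is highly significant while the medium effect (d = 0.50) is not, and the wide confidence interval in the small study signals low precision rather than the absence of an effect.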
One-tailed and two-tailed tests
Two-tailed tests evaluate effects in both directions and are the default in most dissertation work. One-tailed tests allocate all alpha to a predicted direction and require pre-specified justification. Never switch between one- and two-tailed testing after seeing results—that inflates Type I error.
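A short calculation shows why switching tails after the fact is tempting and why it is a problem. The sketch assumes a test statistic that follows a standard normal distribution and an invented value of z = 1.80; the function name is illustrative.

```python
from statistics import NormalDist

def p_values_from_z(z):
    """Return (one-tailed upper p, two-tailed p) for a z statistic."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    # The one-tailed value applies only if the effect was predicted in
    # this direction before the data were seen.
    return upper_tail, 2 * upper_tail

one_tailed, two_tailed = p_values_from_z(1.80)
print(f"one-tailed p = {one_tailed:.3f}, two-tailed p = {two_tailed:.3f}")
```

The same data give a one-tailed p of about .036 and a two-tailed p of about .072, so choosing the tail after seeing the result would turn a non-significant finding into a significant one.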
Multiple comparisons and p-hacking risks
Running many tests increases the chance of at least one false positive. If you conduct multiple ANOVAs, post-hoc tests, or subgroup analyses, apply a correction (Bonferroni, or false discovery rate control such as Benjamini-Hochberg) or pre-register a primary analysis. Selectively reporting only significant p-values is a serious integrity concern that examiners and reviewers readily recognise.
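The inflation and the two corrections named above can be sketched in plain Python. The familywise calculation assumes independent tests, the five p-values are invented, and the function names are illustrative; the FDR procedure shown is the Benjamini-Hochberg step-up method.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni(p_values, alpha=0.05):
    """Reject only where p is below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for FDR control."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject

fwer = familywise_error_rate(alpha=0.05, n_tests=20)
p_values = [0.001, 0.012, 0.020, 0.041, 0.300]  # hypothetical results

print(f"Chance of >= 1 false positive in 20 tests: {fwer:.2f}")
print("Uncorrected:       ", [p < 0.05 for p in p_values])
print("Bonferroni:        ", bonferroni(p_values))
print("Benjamini-Hochberg:", benjamini_hochberg(p_values))
```

With these invented values, four results are significant without correction, three survive Benjamini-Hochberg, and one survives Bonferroni, which illustrates how much the conclusion can depend on a correction chosen in advance.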
Reporting p-values in dissertations and papers
- Report exact values: p = .023, not p < .05 alone.
- Use p < .001 for very small values; never write p = .000.
- Include test statistic and degrees of freedom: t(98) = 2.45, p = .016.
- State alpha level in the methodology chapter.
- Discuss non-significant results honestly—absence of evidence is not evidence of absence.
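The formatting rules in this list are easy to automate so that they are applied consistently across a results chapter. The helpers below are a minimal sketch; the names format_p and report_t are hypothetical, and your department's style guide takes precedence over the three-decimal convention assumed here.

```python
def format_p(p):
    """Format a p-value with no leading zero and never as p = .000."""
    if not 0 <= p <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.001:
        return "p < .001"
    # Three decimals, with the leading zero stripped (0.023 -> .023).
    return "p = " + f"{p:.3f}".lstrip("0")

def report_t(t_value, df, p):
    """Build a report string such as 't(98) = 2.45, p = .016'."""
    return f"t({df}) = {t_value:.2f}, {format_p(p)}"

print(report_t(2.45, 98, 0.016))
print(format_p(0.00004))
```

Because format_p sends anything below .001 to "p < .001", software output such as ".000" never reaches the written report.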
How examiners evaluate your significance claims
Examiners check whether your significance language matches your design, whether assumptions were tested, and whether you overclaim from p-values alone. In a viva voce, be ready to explain what your p-value means (how probable a result at least as extreme as yours would be if the null hypothesis were true) and why your effect size matters for your field.
Moving beyond p-values in modern research
Leading journals increasingly require confidence intervals, effect sizes, and open data alongside p-values. Some researchers advocate abandoning binary significance thresholds entirely. For your dissertation, follow your department's guidelines, but understand that significance is one piece of evidence, not the whole argument.