p-Values
Definition
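Informally, the p-value answers the question: "If this variable actually did nothing, how likely would I be to see a result at least this extreme?" It is not the probability that the null hypothesis is true, nor the probability that I am wrong.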
A p-value is the probability of observing a test statistic as extreme or more extreme than the one obtained, assuming the null hypothesis is true.
It helps us measure the strength of evidence against the null hypothesis ($H_0$).
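
As a rough illustration of the definition, the sketch below computes a two-sided p-value for a test statistic that is assumed to follow a standard normal distribution under the null hypothesis (a z-test). The function name `two_sided_p_value` and the example statistic values are illustrative assumptions, not part of the original notes.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a statistic z that is standard normal under H0.

    For Z ~ N(0, 1), P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical test statistics, from weak to strong evidence against H0.
    for z in (0.5, 1.96, 3.0):
        print(f"z = {z:4.2f} -> p = {two_sided_p_value(z):.4f}")
```

A statistic of 1.96 gives a p-value of about 0.05, which is where the conventional cutoff for a two-sided z-test comes from. Larger absolute statistics give smaller p-values.
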
Purpose
To decide whether the observed relationship or effect in a regression (or any statistical test) is statistically significant, that is, unlikely to be explained by random chance alone.
Interpretation
- A small p-value (typically $p < 0.05$) → strong evidence against $H_0$.
- A large p-value (typically $p \geq 0.05$) → weak evidence against $H_0$ (fail to reject).
In regression:
- The null hypothesis (the hypothesis that this variable does not affect our results) for a coefficient $\beta_j$ is: $H_0: \beta_j = 0$
- A low p-value suggests that predictor $X_j$ significantly contributes to explaining variation in $Y$.
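
For reference, the p-value for a single coefficient is commonly derived from a t-statistic when testing $H_0: \beta_j = 0$. The sketch below assumes ordinary least squares with $n$ observations, $k$ predictors plus an intercept, and the usual normal-error assumptions. The symbols $\hat{\beta}_j$, $\mathrm{SE}$, $n$, and $k$ are notation introduced here for illustration.

$$
t_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)}, \qquad p_j = 2\,P\left(T_{n-k-1} \ge |t_j|\right)
$$

Here $T_{n-k-1}$ follows a Student t distribution with $n-k-1$ degrees of freedom, so a coefficient that is large relative to its standard error yields a small p-value.
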
Common Thresholds
| p-value | Interpretation | Decision |
|---|---|---|
| $p < 0.05$ | Strong evidence against $H_0$ | Reject $H_0$ |
| $p \geq 0.05$ | Weak evidence against $H_0$ | Fail to reject $H_0$ |
Example
Suppose a regression output gives:
| Variable | Coefficient ($\beta$) | p-value |
|---|---|---|
| $X_1$ (Hours Studied) | 3.2 | 0.01 |
| $X_2$ (Tutoring) | 1.1 | 0.45 |
Interpretation:
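- Hours Studied is statistically significant (p = 0.01 < 0.05) → more study hours are associated with higher grades.
- Tutoring is not significant (p = 0.45) → the data show no detectable effect of tutoring after controlling for other factors. This is not proof that tutoring has no effect.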
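
The decision rule applied to this example can be sketched in a few lines. The p-values are copied from the example table, while the significance level of 0.05 and the helper name `decide` are assumptions made for illustration.

```python
ALPHA = 0.05  # conventional significance level (an assumption; choose per study)

# p-values copied from the example regression output
p_values = {"Hours Studied": 0.01, "Tutoring": 0.45}


def decide(p: float, alpha: float = ALPHA) -> str:
    """Return the decision about H0 for a single p-value."""
    return "reject H0" if p < alpha else "fail to reject H0"


decisions = {name: decide(p) for name, p in p_values.items()}

for name, decision in decisions.items():
    print(f"{name}: p = {p_values[name]:.2f} -> {decision}")
```

Note that "fail to reject" is deliberately not phrased as "accept": a large p-value means the data are compatible with $H_0$, not that $H_0$ has been shown to be true.
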
Key Idea
The p-value does not measure the size or importance of an effect. It only measures how likely a result at least as extreme as the observed one would be if $H_0$ were true.
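
To make this concrete, the sketch below holds a small observed effect fixed and varies only the sample size. It assumes a one-sample z-test with a known standard deviation. The function name `z_test_p_value` and all numbers are hypothetical and chosen for illustration.

```python
import math


def z_test_p_value(effect: float, sigma: float, n: int) -> float:
    """Two-sided p-value for a sample mean equal to `effect`, testing H0: mean = 0.

    Assumes a known population standard deviation `sigma` (one-sample z-test).
    """
    z = effect / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))


effect, sigma = 0.1, 1.0  # hypothetical: a small observed effect
for n in (10, 100, 1_000, 10_000):
    print(f"n = {n:>6} -> p = {z_test_p_value(effect, sigma, n):.4g}")
```

The effect is identical in every row, yet the p-value shrinks as the sample grows. A tiny p-value can therefore accompany a practically negligible effect, which is why the coefficient itself should be reported alongside it.
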