How do non-parametric tests such as the sign test and Wilcoxon tests work when we cannot assume a normal distribution?
Apply non-parametric tests including the sign test and the Wilcoxon signed-rank test, and know when they are appropriate
A focused answer to the H2 Further Mathematics outcome on non-parametric tests. When to use distribution-free methods, the sign test for a median, the Wilcoxon signed-rank test, the test statistics and how to reach a conclusion.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
SEAB wants you to apply non-parametric (distribution-free) tests, namely the sign test and the Wilcoxon signed-rank test, to know when they are appropriate (when the normality assumption of a $z$- or $t$-test cannot be made), to set up the hypotheses about a median or a difference, to compute the test statistic, and to reach a conclusion.
The answer
When non-parametric tests are used
Parametric tests (the $z$-test and $t$-test) assume the data come from a normal distribution (or that the sample is large enough for the Central Limit Theorem). When this cannot be assumed, because the distribution is clearly non-normal, the sample is small, or the data are only ordinal (ranks), a non-parametric test makes far weaker assumptions and is preferred. These tests are about the median rather than the mean.
The sign test
The sign test tests a hypothesis about the median (or that paired differences have median zero). For paired data, record the sign of each difference (positive or negative), discarding any zero differences. Under $H_0$ a positive and a negative sign are equally likely, so the number of one sign follows
$$X \sim B\left(n, \tfrac{1}{2}\right),$$
where $n$ is the number of non-zero differences. The test is then a binomial tail probability, exactly as for a proportion of $\tfrac{1}{2}$.
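The binomial tail described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `sign_test_p_value`, its `alternative` parameter and the sample differences are all hypothetical, chosen only to show the steps (drop zeros, count positive signs, sum the tail).

```python
from math import comb


def sign_test_p_value(differences, alternative="greater"):
    """Sign-test p-value for paired differences (illustrative sketch).

    Zero differences are discarded. Under H0 the number of positive
    signs X follows B(n, 1/2), where n counts the non-zero differences.
    """
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    n_plus = sum(1 for d in nonzero if d > 0)

    def pmf(k):
        # P(X = k) for X ~ B(n, 1/2)
        return comb(n, k) / 2 ** n

    upper = sum(pmf(k) for k in range(n_plus, n + 1))  # P(X >= n_plus)
    lower = sum(pmf(k) for k in range(0, n_plus + 1))  # P(X <= n_plus)

    if alternative == "greater":
        return upper
    if alternative == "less":
        return lower
    if alternative == "two-sided":
        return min(1.0, 2 * min(upper, lower))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Hypothetical after-minus-before differences, made up for illustration.
diffs = [2, -1, 3, 0, 4, 1, 5, 2, 6]
p = sign_test_p_value(diffs)  # 8 non-zero differences, 7 of them positive
print(round(p, 4))  # 0.0352, i.e. 9/256
```

With these made-up data the one-tailed $p$-value is $9/256$, so the result would be significant at the $5\%$ level but not at the $1\%$ level.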
The Wilcoxon signed-rank test
The Wilcoxon signed-rank test also uses the differences but keeps more information: it ranks the absolute differences, then sums the ranks of the positive (or negative) differences to form the test statistic $T$. Because it uses the magnitudes as well as the signs, it is more powerful than the sign test when the symmetry assumption it requires holds. The statistic is compared with critical values from Wilcoxon tables (or a normal approximation for large $n$).
Reaching a conclusion
As with any test: state $H_0$ and $H_1$, compute the statistic, compare with the critical value (or find the $p$-value), and conclude in context. For the sign test the comparison is a binomial tail; for Wilcoxon it is against the tabulated critical value, where a small $T$ gives significance.
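To see why a small rank sum is significant, the exact null distribution can be computed directly for small samples. The sketch below assumes no zero and no tied differences, so that under $H_0$ each of the ranks $1, \dots, n$ belongs to a positive difference with probability $\tfrac{1}{2}$ independently. The function name `signed_rank_lower_tail` is hypothetical; in an exam the tabulated critical values are used instead.

```python
def signed_rank_lower_tail(n, t):
    """P(T <= t) under H0 for n non-zero, untied differences
    (illustrative sketch).

    All 2**n ways of attaching signs to the ranks 1..n are equally
    likely under H0, so we count the subsets of ranks with sum <= t.
    """
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)  # counts[s] = number of subsets with sum s
    counts[0] = 1                 # the empty subset
    for rank in range(1, n + 1):
        # Go downwards so each rank is used at most once per subset.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return sum(counts[: int(t) + 1]) / 2 ** n


# Lower-tail probabilities for n = 8 differences.
print(signed_rank_lower_tail(8, 3))  # 0.01953125, i.e. 5/256
print(signed_rank_lower_tail(8, 5))  # 0.0390625, i.e. 10/256
print(signed_rank_lower_tail(8, 6))  # 0.0546875, i.e. 14/256
```

For $n = 8$ the lower-tail probability first exceeds $0.05$ at $T = 6$, so a one-tailed test at the $5\%$ level rejects $H_0$ only when $T \le 5$.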
Examples in context
Example 1. Before-and-after studies. A small trial measuring each subject before and after an intervention, with no reason to assume normal differences, is the classic setting for the sign test or Wilcoxon test, which is why these appear throughout psychology and medical pilot studies.
Example 2. Ordinal survey data. When respondents rank preferences on a scale that is not truly numerical, only a non-parametric test is valid, because the differences between ranks are not meaningful as measured quantities, a common situation in market research.
Try this
Q1. When is a non-parametric test preferred over a $t$-test? [2 marks]
- Cue. When normality cannot be assumed: non-normal data, a small sample, or ordinal (rank) data.
Q2. Under $H_0$, what distribution does the sign-test count follow? [1 mark]
- Cue. $X \sim B\left(n, \tfrac{1}{2}\right)$, where $n$ is the number of non-zero differences.
Q3. What is the main disadvantage of the sign test compared with the Wilcoxon signed-rank test? [1 mark]
- Cue. It ignores the magnitudes of the differences, so it has lower power.
Exam-style practice questions
Practice questions written in the style of SEAB exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Original. 6 marks. Ten people rate a product before and after a redesign. Eight rate it higher afterwards and two rate it lower. Use a sign test at the $5\%$ level to test whether the redesign improved ratings.
Let $p$ be the probability that an individual rates the product higher after the redesign. Under $H_0$ there is no systematic change, so $H_0: p = \tfrac{1}{2}$; the alternative (improvement) is $H_1: p > \tfrac{1}{2}$.
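Under $H_0$, the number of "higher" responses $X \sim B\left(10, \tfrac{1}{2}\right)$. We observed $8$ higher out of $10$. The one-tailed $p$-value is
$$P(X \ge 8) = P(X = 8) + P(X = 9) + P(X = 10).$$
Using $P(X = k) = \binom{10}{k}\left(\tfrac{1}{2}\right)^{10}$ with $2^{10} = 1024$: $P(X = 8) = \tfrac{45}{1024}$, $P(X = 9) = \tfrac{10}{1024}$, $P(X = 10) = \tfrac{1}{1024}$, total $\tfrac{56}{1024} \approx 0.0547$.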
Since $0.0547 > 0.05$, we do not reject $H_0$ at the $5\%$ level: there is insufficient evidence of an improvement.
Markers reward the hypotheses on the median/sign with $p = \tfrac{1}{2}$, the binomial tail $P(X \ge 8) \approx 0.0547$, comparison with $0.05$, and the conclusion not to reject $H_0$.
Original. 6 marks. Explain when a non-parametric test should be preferred over a $z$-test or $t$-test, and state one advantage and one disadvantage of the sign test.
A non-parametric (distribution-free) test should be preferred when the assumptions of the parametric test are not met: in particular when the population cannot be assumed normal, the sample is small so the Central Limit Theorem does not rescue normality, or the data are ordinal (ranks) rather than measured on an interval scale.
The sign test only uses the sign of each difference (whether each value is above or below the hypothesised median), so it makes very weak assumptions: an advantage is robustness, since it works for any continuous distribution and resists outliers. A disadvantage is that, by discarding the magnitudes of the differences, it uses little of the information in the data and so has lower power than tests that use more (such as the Wilcoxon signed-rank test or a $t$-test when valid).
Markers reward the condition (non-normal, small sample or ordinal data), the robustness advantage, and the low-power disadvantage from ignoring magnitudes.