What happens if you use the wrong tail of the Wilcoxon statistic?

For the Wilcoxon signed-rank test a small $T$ indicates significance, so comparing in the wrong direction (rejecting $H_0$ for a large $T$) reverses the conclusion.

Singapore Further Maths: Syllabus dot point

How do non-parametric tests such as the sign test and Wilcoxon tests work when we cannot assume a normal distribution?

Apply non-parametric tests including the sign test and the Wilcoxon signed-rank test, and know when they are appropriate

A focused answer to the H2 Further Mathematics outcome on non-parametric tests. It covers when to use distribution-free methods, the sign test for a median, the Wilcoxon signed-rank test, the test statistics, and how to reach a conclusion.

Generated by Claude Opus 4.8. 11 min answer. Updated 2026-06-06.

Reviewed by: AI editorial process; not yet individually human-reviewed



What this dot point is asking

SEAB wants you to apply non-parametric (distribution-free) tests, namely the sign test and the Wilcoxon signed-rank test. You need to know when they are appropriate (when the normality assumption of a $t$- or $z$-test cannot be made), set up hypotheses about a median or a difference, compute the test statistic, and reach a conclusion.

The answer

When non-parametric tests are used

Parametric tests (the $z$-test and $t$-test) assume the data come from a normal distribution, or that the sample is large enough for the Central Limit Theorem to apply. When neither can be assumed, because the distribution is clearly non-normal, the sample is too small to rely on the Central Limit Theorem, or the data are only ordinal (ranks), a non-parametric test makes far weaker assumptions and is preferred. These tests are about the median rather than the mean.

The sign test

The sign test tests a hypothesis about the median (or that paired differences have median zero). For paired data, record the sign of each difference (positive or negative), discarding any zero differences. Under $H_0$ a positive and a negative sign are equally likely, so the number of positive signs $X$ follows

$$X \sim \mathrm{B}(n, 0.5),$$

where $n$ is the number of non-zero differences. The test is then a binomial tail probability, exactly as for a proportion of $0.5$.
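The binomial tail can be computed directly. The following is a minimal sketch, assuming a one-tailed alternative in which positive differences are more likely; the function name `sign_test_upper_tail` and the counts in the usage lines are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import comb


def sign_test_upper_tail(n_positive: int, n_negative: int) -> Fraction:
    """Exact P(X >= n_positive) for X ~ B(n, 0.5).

    n is the number of non-zero differences; zero differences are
    assumed to have been discarded before counting.
    """
    n = n_positive + n_negative
    # Under H0 each of the 2**n sign patterns is equally likely,
    # so the tail probability is a count of patterns over 2**n.
    favourable = sum(comb(n, r) for r in range(n_positive, n + 1))
    return Fraction(favourable, 2 ** n)


# Hypothetical counts: 7 positive differences and 1 negative difference.
p_value = sign_test_upper_tail(7, 1)
print(p_value, float(p_value))
```

For these hypothetical counts the tail is $9/256 \approx 0.035$. Using `Fraction` keeps the result exact, which makes it easy to compare with a hand calculation.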

The Wilcoxon signed-rank test

The Wilcoxon signed-rank test also uses the differences but keeps more information: it ranks the absolute differences, then sums the ranks of the positive (or negative) differences to form the test statistic $T$ (conventionally the smaller of the two rank sums in a two-tailed test). Because it uses the magnitudes as well as the signs, it is more powerful than the sign test when the symmetry assumption it requires holds. The statistic is compared with critical values from Wilcoxon tables (or a normal approximation for large $n$).
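A minimal sketch of how $T$ is formed from paired differences. The function name `signed_rank_sums`, the data, and the choice to give tied absolute differences their average rank are assumptions made for illustration; the text above does not specify how ties are handled.

```python
def signed_rank_sums(differences):
    """Return (w_plus, w_minus): rank sums of the positive and negative differences.

    Zero differences are dropped. Tied absolute values share the average
    of the ranks they occupy (an assumed, commonly used convention).
    """
    nonzero = [d for d in differences if d != 0]
    ordered = sorted(nonzero, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of equal absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j + 1) / 2  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical after-minus-before differences for eight subjects.
diffs = [5, -2, 8, 3, -1, 6, 4, 7]
w_plus, w_minus = signed_rank_sums(diffs)
T = min(w_plus, w_minus)  # two-tailed convention: take the smaller rank sum
print(w_plus, w_minus, T)
```

Here the two negative differences have the smallest magnitudes (ranks 1 and 2), so the negative rank sum is 3 and the positive rank sum is 33. The two sums always add to $n(n+1)/2$, which is a quick check on the ranking.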

Reaching a conclusion

As with any test: state $H_0$ and $H_1$, compute the statistic, compare with the critical value (or find the $p$-value), and conclude in context. For the sign test the comparison is a binomial tail; for Wilcoxon it is against the tabulated $T$ critical value, where a small $T$ gives significance (reject $H_0$ when $T$ is at or below the critical value).
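To see why a small $T$ is the significant direction, the exact null distribution of a rank sum can be enumerated. This is a sketch under stated assumptions, not a method the syllabus requires: it assumes $n$ non-zero differences with no ties, treats $T$ as a single rank sum in a one-tailed test, and uses the hypothetical function name `lower_tail_probability`.

```python
def lower_tail_probability(n: int, t: int) -> float:
    """Exact P(T <= t) under H0 for a signed-rank sum with n untied differences."""
    if t < 0:
        return 0.0
    max_sum = n * (n + 1) // 2
    # counts[s] = number of sign patterns whose chosen ranks sum to s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Iterate downwards so each rank is used at most once.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    # Under H0 all 2**n sign patterns are equally likely.
    return sum(counts[: t + 1]) / 2 ** n


for t in (10, 11):
    print(t, lower_tail_probability(10, t))
```

For $n = 10$ the lower tail is about $0.042$ at $T = 10$ and about $0.053$ at $T = 11$, so under these assumptions a one-tailed test at the $5\%$ level rejects $H_0$ only when $T \leq 10$. Rejecting for large $T$ instead would use the opposite tail.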

Worked example

Twelve students sit a test before and after extra tuition. Nine score higher afterwards, three lower (no ties). Use a sign test at the $5\%$ level to test whether tuition improves scores.

Step 1: State the hypotheses

Let $p$ be the probability a student scores higher after tuition. $H_0: p = 0.5$ (no systematic improvement) against $H_1: p > 0.5$ (improvement), a one-tailed test.

Step 2: Identify the distribution under H0

There are $n = 12$ non-zero differences, so under $H_0$ the number scoring higher is $X \sim \mathrm{B}(12, 0.5)$. We observed $X = 9$.

Step 3: Compute the one-tailed p-value

$$\mathrm{P}(X \geq 9 \mid p = 0.5) = \mathrm{P}(9) + \mathrm{P}(10) + \mathrm{P}(11) + \mathrm{P}(12).$$

From $\mathrm{B}(12, 0.5)$ these sum to $\dfrac{220 + 66 + 12 + 1}{4096} = \dfrac{299}{4096} \approx 0.073$.
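Each term in the numerator is a binomial coefficient, since with $p = 0.5$ every outcome has the same weight $\left(\tfrac{1}{2}\right)^{12} = \tfrac{1}{4096}$. Written out as a check:

$$\mathrm{P}(X = r) = \binom{12}{r}\left(\tfrac{1}{2}\right)^{12} = \frac{1}{4096}\binom{12}{r}, \qquad \binom{12}{9} = 220,\quad \binom{12}{10} = 66,\quad \binom{12}{11} = 12,\quad \binom{12}{12} = 1.$$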

Step 4: Compare with the significance level

$0.073 > 0.05$, so the result is not significant at the $5\%$ level.

Step 5: Conclude in context

There is insufficient evidence at the $5\%$ level to conclude that the tuition improves scores; the sign test does not detect a significant effect from $9$ improvements out of $12$ .

Examples in context

Example 1. Before-and-after studies. A small trial measuring each subject before and after an intervention, with no reason to assume normal differences, is the classic setting for the sign test or Wilcoxon test, which is why these appear throughout psychology and medical pilot studies.

Example 2. Ordinal survey data. When respondents rank preferences on a scale that is not truly numerical, only a non-parametric test is valid, because the differences between ranks are not meaningful as measured quantities, a common situation in market research.

Try this

Q1. When is a non-parametric test preferred over a $t$-test? [2 marks]

Cue. When normality cannot be assumed: non-normal data, a small sample, or ordinal (rank) data.

Q2. Under $H_0$, what distribution does the sign-test count follow? [1 mark]

Cue. $\mathrm{B}(n, 0.5)$, where $n$ is the number of non-zero differences.

Q3. What is the main disadvantage of the sign test compared with the Wilcoxon signed-rank test? [1 mark]

Cue. It ignores the magnitudes of the differences, so it has lower power.

Exam-style practice questions

Practice questions written in the style of SEAB exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

Original. Ten people rate a product before and after a redesign. Eight rate it higher afterwards and two rate it lower. Use a sign test at the $5\%$ level to test whether the redesign improved ratings. [6 marks]

Worked answer

Let $p$ be the probability that an individual rates the product higher after the redesign. Under $H_0$ there is no systematic change, so $H_0: p = 0.5$; the alternative (improvement) is $H_1: p > 0.5$.

Under $H_0$, the number of "higher" responses $X \sim \mathrm{B}(10, 0.5)$. We observed $X = 8$ higher out of $10$. The one-tailed $p$-value is

$$\mathrm{P}(X \geq 8 \mid p = 0.5) = \mathrm{P}(8) + \mathrm{P}(9) + \mathrm{P}(10).$$

Using $\mathrm{B}(10, 0.5)$: $\mathrm{P}(8) = 45/1024$, $\mathrm{P}(9) = 10/1024$ and $\mathrm{P}(10) = 1/1024$, giving a total of $56/1024 \approx 0.0547$.

Since $0.0547 > 0.05$, we do not reject $H_0$ at the $5\%$ level: there is insufficient evidence of an improvement.

Markers reward the hypotheses on the median/sign with $p = 0.5$, the binomial tail $\mathrm{P}(X \geq 8) \approx 0.0547$, comparison with $0.05$, and the conclusion not to reject $H_0$.

Original. Explain when a non-parametric test should be preferred over a $t$-test or $z$-test, and state one advantage and one disadvantage of the sign test. [6 marks]

Worked answer

A non-parametric (distribution-free) test should be preferred when the assumptions of the parametric test are not met: in particular when the population cannot be assumed normal, the sample is small so the Central Limit Theorem does not rescue normality, or the data are ordinal (ranks) rather than measured on an interval scale.

The sign test only uses the sign of each difference (whether each value is above or below the hypothesised median), so it makes very weak assumptions: an advantage is robustness, since it works for any continuous distribution and resists outliers. A disadvantage is that, by discarding the magnitudes of the differences, it uses little of the information in the data and so has lower power than tests that use more (such as the Wilcoxon signed-rank test or a $t$-test when valid).

Markers reward the condition (non-normal, small sample or ordinal data), the robustness advantage, and the low-power disadvantage from ignoring magnitudes.
