How do you summarise a set of field measurements, and how do you describe how spread out they are?

Calculate and interpret measures of central tendency (mean, median, mode) and dispersion (range, interquartile range, standard deviation) for geographical data

A focused answer to the H2 Geography skill of summarising data. It covers the mean, median and mode and when each is appropriate; the range, interquartile range and standard deviation as measures of spread; the effect of anomalies and skew; and how to interpret dispersion geographically.

Generated by Claude Opus 4.8 | 10 min answer | Updated 2026-06-06

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

SEAB wants you to calculate and interpret the common measures of central tendency (mean, median, mode) and dispersion (range, interquartile range, standard deviation) for geographical data. The central insight is that a single average can hide as much as it reveals; two places can share the same mean yet behave completely differently, so describing data properly means reporting both a representative average and a measure of how spread out the values are.

The answer

Measures of central tendency

These summarise the "typical" value of a data set:

Mean: the sum of all values divided by the number of values, $\bar{x} = \dfrac{\sum x}{n}$. It uses every value, but is pulled toward anomalies (outliers).
Median: the middle value when the data are ordered. It is resistant to outliers, so it is the better average for skewed data.
Mode: the most frequently occurring value. It is the only average that works for categorical data and shows the most common case.

Choosing between them depends on the data: use the mean for roughly symmetrical data with no big outliers, the median when there are anomalies or skew, and the mode for categories or to report the most common value.
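As a minimal sketch of that choice, the Python below uses the standard-library `statistics` module on two small data sets. Both data sets (`infiltration_mm_per_hr` and `land_use`) are hypothetical, invented purely for illustration, and are not drawn from any real survey.

```python
from statistics import mean, median, multimode

# Hypothetical field data, invented purely for illustration.
infiltration_mm_per_hr = [9, 10, 11, 12, 12, 14, 16]   # numerical, no large outlier
land_use = ["residential", "commercial", "residential",
            "open space", "residential"]                # categorical

# Numerical data: the mean and median are both meaningful.
mean_rate = mean(infiltration_mm_per_hr)
median_rate = median(infiltration_mm_per_hr)

# Categorical data: only the mode makes sense.
# multimode returns a list so that ties are reported rather than hidden.
most_common_land_use = multimode(land_use)

print("mean:", mean_rate, "median:", median_rate)
print("modal land use:", most_common_land_use)
```

For the roughly symmetrical numerical data the mean and median agree, so either is a fair summary; for the categories, a mean or median cannot be calculated at all.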

The effect of anomalies and skew

An outlier, a value far from the rest, drags the mean toward it but barely moves the median. So if a beach pebble sample reads 12, 15, 18, 18, 21, 24, 90 mm, the mean (about 28 mm) sits above six of the seven readings, while the median (18 mm) genuinely represents the typical pebble. Recognising this is a common exam discriminator: when data are skewed, the median is the honest average.
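The pull of the outlier can be checked with a short Python sketch that recalculates both averages with and without the 90 mm reading. The variable names are illustrative only.

```python
from statistics import mean, median

pebbles_mm = [12, 15, 18, 18, 21, 24, 90]

# Same sample with the single extreme reading removed.
without_outlier = [x for x in pebbles_mm if x != 90]

print("with outlier:    mean", round(mean(pebbles_mm), 1),
      "median", median(pebbles_mm))
print("without outlier: mean", mean(without_outlier),
      "median", median(without_outlier))
```

Removing one reading drops the mean from about 28.3 mm to 18 mm, while the median stays at 18 mm in both cases.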

Measures of dispersion

Dispersion describes how spread out the values are around the centre:

Range: the largest value minus the smallest. Quick to find, but determined entirely by the two extremes, so it is sensitive to outliers.
Interquartile range (IQR): the range of the middle 50 percent of values, $\text{IQR} = Q_3 - Q_1$ (upper quartile minus lower quartile). It ignores the extreme quarters, so it is robust to outliers and well suited to skewed data.
Standard deviation: a measure of the typical distance of values from the mean, calculated as the square root of the mean of the squared deviations. A small standard deviation means values cluster tightly (consistent); a large one means they are widely spread (variable). It uses every value, which makes it powerful but also sensitive to outliers.
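As a worked illustration of the standard deviation, take the made-up data set 2, 4, 4, 4, 5, 5, 7, 9 (chosen only because the arithmetic is clean, not from any real survey). This sketch assumes the population form of the formula, dividing by $n$; some texts divide by $n - 1$ for a sample, so check which version your course uses.

$$\bar{x} = \dfrac{2+4+4+4+5+5+7+9}{8} = \dfrac{40}{8} = 5$$

$$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}} = \sqrt{\dfrac{9+1+1+1+0+0+4+16}{8}} = \sqrt{\dfrac{32}{8}} = 2$$

So a typical value in this set lies about 2 units from the mean of 5.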

Why dispersion matters geographically

Two data sets can share a mean yet differ in spread, and the difference is often the geographically interesting part. Two weather stations might both average a similar monthly rainfall, but if one has a low standard deviation (even, reliable rainfall all year) and the other a high standard deviation (intensely seasonal, monsoonal), they have very different climates and human implications. Reporting the spread, not just the average, is what captures that.
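A minimal Python sketch of this comparison follows. The monthly rainfall figures for `station_a` and `station_b` are hypothetical, invented so that both stations share the same mean; they are not real climate records. `pstdev` is the population standard deviation from the standard library.

```python
from statistics import mean, pstdev

# Hypothetical monthly rainfall totals in mm (January to December).
station_a = [190, 210, 200, 195, 205, 200, 190, 210, 200, 205, 195, 200]  # even all year
station_b = [20, 10, 30, 60, 200, 450, 600, 500, 350, 130, 40, 10]        # strongly seasonal

for name, rain in [("A", station_a), ("B", station_b)]:
    # Same mean for both stations, very different spread.
    print(name, "mean:", mean(rain), "mm, sd:", round(pstdev(rain), 1), "mm")
```

Both invented stations average 200 mm a month, yet the standard deviation of the seasonal station is many times larger, which is exactly the contrast the mean alone would hide.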

Worked example

Question: a geographer records daily pedestrian counts at a site over seven days: 220, 240, 250, 250, 260, 280, 900. Summarise the data and choose appropriate measures. [8 marks]

Step 1: Calculate the mean

Mean $= \dfrac{220+240+250+250+260+280+900}{7} = \dfrac{2400}{7} \approx 343$. Note this exceeds six of the seven counts, a sign of an outlier.

Step 2: Find the median and mode

Ordering the values, the middle (fourth) value is $250$, so the median is $250$; the mode is also $250$. Both sit among the bulk of the data, unlike the mean.

Step 3: Identify and justify the better average

Argue that the value $900$ (perhaps an event day) is an outlier that inflates the mean, so the median ($250$) is the more representative typical count for an ordinary day.

Step 4: Describe the spread

Note the range ($900 - 220 = 680$) is dominated by the outlier, so the interquartile range better describes normal variability; mention that you would report the median with the IQR. The calculations, a justified choice of average and a robust measure of spread together earn the marks.
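The four steps can be checked with a short Python sketch using the standard-library `statistics` module. The quartile convention used here (`method="exclusive"`) is an assumption; other conventions, including some taught for hand calculation, can give slightly different quartiles for small samples.

```python
from statistics import mean, median, mode, quantiles

counts = [220, 240, 250, 250, 260, 280, 900]

# Steps 1 and 2: the three averages.
mean_count = mean(counts)
median_count = median(counts)
mode_count = mode(counts)

# Step 4: two measures of spread.
data_range = max(counts) - min(counts)
# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, _, q3 = quantiles(counts, n=4, method="exclusive")
iqr = q3 - q1

print("mean:", round(mean_count), "median:", median_count, "mode:", mode_count)
print("range:", data_range, "Q1:", q1, "Q3:", q3, "IQR:", iqr)
```

With this convention the quartiles are 240 and 280, giving an IQR of 40, far smaller than the range of 680 because the event-day count of 900 falls outside the middle half of the data.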

Examples in context

Example 1. Rainfall reliability in Singapore versus a monsoon climate. Equatorial Singapore receives rain throughout the year, so its monthly rainfall has a relatively low standard deviation, meaning reliable, evenly distributed rainfall. A station with a pronounced monsoon may share a similar annual mean but has a high standard deviation, with wet and dry seasons. Comparing the spreads, not the means, reveals the contrasting reliability and seasonality that matter for agriculture and water supply.

Example 2. House-price distribution in a city district. Average house prices reported as a mean can be badly skewed by a few very expensive properties, so analysts often quote the median price instead, which better reflects what a typical household pays. Reporting the interquartile range alongside it shows how unequal the market is. This is a clear case of the median and IQR giving an honest summary where the mean and range mislead.

Try this

Q1. For the data set 4, 6, 7, 7, 51, state the mean and the median and say which better represents the data. [3 marks]

Cue. Mean $= (4+6+7+7+51)/5 = 75/5 = 15$; median (middle of the ordered values) $= 7$. The median better represents the data because the outlier $51$ inflates the mean well above four of the five values.

Q2. Explain why the interquartile range is often preferred to the range as a measure of spread. [2 marks]

Cue. The range depends only on the two most extreme values, so a single outlier distorts it; the interquartile range covers the middle 50 percent of the data and ignores the extreme quarters, so it is robust to outliers and better describes typical variability.

Q3. Two weather stations have the same mean monthly rainfall but very different standard deviations. Explain what this tells you. [3 marks]

Cue. The station with the low standard deviation has rainfall clustered close to the mean each month, so it is reliable and evenly spread; the station with the high standard deviation has widely varying monthly totals, indicating a strongly seasonal regime, even though their averages are identical.

Exam-style practice questions

Practice questions written in the style of SEAB exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

Original (8 marks). A geographer measures pebble sizes (mm) at a beach site and records: 12, 15, 18, 18, 21, 24, 90. Explain which measure of central tendency best summarises these data and what the measures of dispersion add.

Worked answer

Argument: because the data contain a large outlier, the median is the most representative average, while a measure of spread reveals how variable the pebbles are.

Identify the issue: the value 90 is an anomaly far above the others, which inflates the mean. The mean is $\dfrac{12+15+18+18+21+24+90}{7} = \dfrac{198}{7} \approx 28.3\text{ mm}$, higher than all but one reading, so it misrepresents the typical pebble.

Choose the median: ordering the seven values, the middle (fourth) value is $18\text{ mm}$, which sits among the bulk of the data and is unaffected by the outlier, so it is the better average here; the mode is also $18\text{ mm}$.

Add dispersion: the range ($90 - 12 = 78\text{ mm}$) is large but driven by the outlier; the interquartile range, covering the middle 50 percent, is more robust and shows the typical spread; standard deviation would also be inflated by the outlier.
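The claim about each measure's sensitivity can be tested with a minimal Python sketch. The helper `spread_summary` is a hypothetical name introduced here for illustration, and two choices in it are assumptions: the `"exclusive"` quartile convention and the population form of the standard deviation (`pstdev`).

```python
from statistics import pstdev, quantiles

def spread_summary(values):
    """Return range, IQR and population standard deviation for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": pstdev(values),
    }

pebbles_mm = [12, 15, 18, 18, 21, 24, 90]

# Compare the spread with and without the 90 mm anomaly.
with_outlier = spread_summary(pebbles_mm)
without_outlier = spread_summary([x for x in pebbles_mm if x != 90])

print("with outlier:   ", with_outlier)
print("without outlier:", without_outlier)
```

Under these assumptions, dropping the 90 mm reading cuts the range from 78 to 12 and the standard deviation from about 25 to about 4, while the IQR only moves from 9 to 7.5. That is the sense in which the IQR is the robust measure.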

Markers reward calculating the mean, recognising the outlier's effect, justifying the median, and explaining that dispersion (especially the robust IQR) describes variability.

Original (6 marks). Explain what standard deviation measures and why comparing the standard deviation of two data sets can be geographically useful.

Worked answer

Argument: standard deviation measures how much values typically deviate from the mean, so comparing it between two places reveals which is more variable.

Explain the measure: standard deviation summarises the typical distance of values from the mean (formally, the square root of the mean of the squared deviations); a small value means data cluster tightly around the mean (consistent), a large value means they are widely spread (variable). It uses every value, unlike the range.

Explain the geographical use: comparing standard deviations lets a geographer say, for instance, that monthly rainfall at an equatorial station has a low standard deviation (reliable, even rainfall) while a monsoon station has a high standard deviation (highly seasonal, variable), even if their means are similar. This distinguishes reliability and seasonality that a mean alone hides.

Markers reward defining standard deviation as spread about the mean, contrasting small and large values, and a geographical comparison where two means are similar but spreads differ.