How do you choose what to sample and how to collect it, so the data is representative and the conclusions are sound?
Explain random, systematic and stratified sampling and how to select appropriate primary and secondary data-collection methods
A focused answer to the H2 Geography skill of sampling and data collection. It covers why we sample; random, systematic and stratified strategies (applied as point, line and area sampling); sample size and bias; and choosing primary versus secondary and quantitative versus qualitative methods.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
SEAB wants you to explain the main sampling strategies (random, systematic and stratified) and to choose appropriate methods for collecting primary and secondary data. The central insight is that we almost never measure a whole population, so we sample part of it. Whether the conclusions are trustworthy depends on whether that sample is representative, which in turn depends on the sampling strategy, the sample size and the control of bias.
The answer
Why we sample
It is rarely possible to measure every pebble in a river or survey every resident of a town, so geographers take a sample: a manageable subset chosen to represent the whole population. A good sample lets us draw valid conclusions about the population; a poor one misleads. The art is making the sample representative and unbiased.
The three sampling strategies
- Random sampling: every member of the population has an equal chance of selection, usually via random-number coordinates. It removes selection bias, but by chance it can cluster points or miss a gradient.
- Systematic sampling: observations are taken at regular intervals (every 10 metres along a transect, every fifth house). It is simple, even in coverage, and ideal for capturing change along a gradient, but it can miss or align with a periodic pattern.
- Stratified sampling: the population is divided into sub-groups (strata) and each is sampled in proportion to its size. For example, if a town is 60 percent older housing and 40 percent new estates, the sample mirrors that split. It guarantees each group is represented.
These can be combined (for example, a stratified-systematic design) to get the strengths of more than one.
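The three strategies can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the list of 100 house numbers, the function names and the seed are all assumptions made for the example, and the 60/40 split simply reuses the older-housing and new-estate proportions mentioned above.

```python
import random

def random_sample(population, n, seed=None):
    """Simple random sample: every unit has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, interval, start=0):
    """Systematic sample: every `interval`-th unit, beginning at index `start`."""
    return population[start::interval]

def stratified_sample(strata, n, seed=None):
    """Proportional stratified sample: each stratum contributes in
    proportion to its share of the whole population.
    Note: simple rounding is used, so for awkward proportions the
    stratum counts may not add up to exactly n."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        k = round(n * len(units) / total)
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical town of 100 houses: 60 older houses, 40 on new estates.
houses = list(range(1, 101))
strata = {"older": houses[:60], "new": houses[60:]}

random_houses = random_sample(houses, 10, seed=1)    # 10 houses, any location
every_fifth = systematic_sample(houses, 5, start=4)  # houses 5, 10, 15, ...
by_stratum = stratified_sample(strata, 10, seed=1)   # 6 older, 4 new

print(random_houses)
print(every_fifth)
print({name: len(units) for name, units in by_stratum.items()})
```

Running it shows the trade-offs in miniature: the random sample may bunch house numbers together, the systematic sample is evenly spread, and the stratified sample always mirrors the 60/40 split.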
Point, line and area sampling
The same strategies apply across different spatial frames:
- Point sampling: data taken at specific points (a weather reading at a site).
- Line sampling: data along a line or transect (vegetation along a dune profile).
- Area sampling: data within areas, often using quadrats (percentage cover in a square).
Line sampling along a transect is the natural choice when a variable changes along a clear gradient.
Sample size and bias
Two things make a sample trustworthy:
- Sample size: larger samples reduce the influence of anomalies and random variation, narrowing uncertainty. Three readings per site are unreliable; many more give a stable estimate. The trade-off is time and cost.
- Avoiding bias: bias is systematic error that skews the sample, such as surveying only weekday shoppers, or placing quadrats where vegetation looks lush. Random or systematic placement and varied times reduce it.
Together, adequate size and low bias give reliability (consistent results) and validity (the data really measures what was intended).
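The effect of sample size can be seen in a small simulation. The sketch below assumes an invented population of 5,000 pebble lengths (mean about 60 mm) and compares how much the sample mean varies when only 3 pebbles are measured with when 50 are measured; all the numbers and names are hypothetical and chosen only to illustrate the principle.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 5,000 pebble long-axis lengths in mm
# (values invented for illustration, not field data).
pebbles = [rng.gauss(60, 15) for _ in range(5000)]

def spread_of_sample_means(population, n, repeats=500):
    """Take `repeats` random samples of size n and return the standard
    deviation of their means: a measure of how unstable the estimate is."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(pebbles, 3)   # three pebbles per sample
spread_large = spread_of_sample_means(pebbles, 50)  # fifty pebbles per sample

print(f"Spread of means with n=3:  {spread_small:.1f} mm")
print(f"Spread of means with n=50: {spread_large:.1f} mm")
```

The means from the larger samples cluster much more tightly around the true value, which is what "narrowing uncertainty" means in practice. Note that a larger sample does nothing to correct bias: if the pebbles were all picked from one side of the channel, fifty of them would simply give a stable but wrong answer.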
Choosing data-collection methods
Match the method to the data needed:
- Primary data is collected first-hand in the field (measurements, counts, questionnaires, field sketches). It is current and tailored, but time-consuming.
- Secondary data is collected by others (census data, maps, satellite imagery, official statistics). It gives breadth, history and context, but may not fit the question exactly.
- Quantitative data is numerical (temperatures, counts) and supports statistical testing; qualitative data is descriptive (perceptions, photographs) and adds depth and meaning.
The best investigations combine primary and secondary, quantitative and qualitative, to triangulate findings.
Examples in context
Example 1. A vegetation transect on a Singapore mangrove boardwalk. To study how species change from the seaward edge inland at a site such as Sungei Buloh, a geographer uses systematic line sampling, placing quadrats at fixed intervals along a transect, capturing the zonation from pioneer mangroves to landward species in order. Random quadrat placement within each interval reduces bias, illustrating a combined design matched to a clear environmental gradient.
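The combined design in Example 1 (fixed intervals along the transect, with a random position inside each interval) might be planned as follows. The 100 m transect length, the 10 m interval and the function name are assumptions for the sketch; a real survey would set them from the site.

```python
import random

def transect_positions(length_m, interval_m, seed=None):
    """Return one quadrat position (in metres from the seaward end)
    for each full interval along the transect. The interval is fixed
    (systematic), but the position inside it is random."""
    rng = random.Random(seed)
    positions = []
    start = 0
    while start + interval_m <= length_m:
        offset = rng.uniform(0, interval_m)  # random point within this interval
        positions.append(start + offset)
        start += interval_m
    return positions

# Hypothetical 100 m transect divided into 10 m intervals.
positions = transect_positions(100, 10, seed=7)
for i, p in enumerate(positions):
    print(f"Interval {i * 10:>3}-{(i + 1) * 10:<3} m: quadrat at {p:.1f} m")
```

Every 10 m stretch is guaranteed one quadrat, so the zonation is recorded in order, while the random offset stops the surveyor from choosing where in each stretch to place it.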
Example 2. A stratified household survey of service use. Investigating how use of a town's services varies by neighbourhood, a geographer divides the town into strata (older terraces, suburban estates, high-rise blocks) and samples households in proportion to each stratum's size. This guarantees every neighbourhood type is represented, avoiding the bias of surveying only the most accessible area, and shows stratified sampling capturing social structure.
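The proportional allocation in Example 2 is simple arithmetic: each stratum's share of the sample equals its share of the households. The sketch below uses invented household counts and a target of 60 questionnaires, and it hands out any households left over after rounding down to the strata with the largest fractional remainders; that tie-handling rule is one reasonable choice, not something the example itself specifies.

```python
def proportional_allocation(stratum_sizes, n):
    """Split a sample of n across strata in proportion to their size,
    making sure the parts add up to exactly n."""
    total = sum(stratum_sizes.values())
    exact = {name: n * size / total for name, size in stratum_sizes.items()}
    # Start from the whole-number part of each exact share.
    allocation = {name: int(share) for name, share in exact.items()}
    leftover = n - sum(allocation.values())
    # Give the remaining places to the largest fractional remainders.
    by_remainder = sorted(
        exact, key=lambda name: exact[name] - allocation[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation

# Hypothetical household counts for the three neighbourhood types.
households = {
    "older terraces": 1200,
    "suburban estates": 2500,
    "high-rise blocks": 800,
}
allocation = proportional_allocation(households, 60)
print(allocation)
# {'older terraces': 16, 'suburban estates': 33, 'high-rise blocks': 11}
```

Within each stratum the households would still be chosen randomly or systematically, so the stratification controls who is represented and the within-stratum selection controls bias.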
Try this
Q1. Explain the difference between systematic and random sampling. [2 marks]
- Cue. Systematic sampling takes observations at regular, fixed intervals (every nth unit or every set distance), giving even coverage; random sampling selects units so each has an equal chance, removing selection bias but risking uneven or clustered coverage.
Q2. Give one advantage and one disadvantage of using secondary data. [2 marks]
- Cue. Advantage: it provides breadth, historical depth or context (for example census or satellite data) quickly and cheaply. Disadvantage: it was collected for another purpose, so it may not fit the question exactly or be at the right scale or date.
Q3. Explain how a geographer can reduce bias when collecting questionnaire data in a town centre. [3 marks]
- Cue. Survey at several locations and at varied times and days to capture different groups (workers, shoppers, residents), select respondents systematically or randomly rather than choosing approachable people, and use a consistent set of questions, so the sample better represents the whole population.
Exam-style practice questions
Practice questions written in the style of SEAB exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Original. A geographer is studying how plant species change across a sand dune system from the beach inland. Explain which sampling strategy would be most appropriate and justify your choice. [8 marks]
Argument: because the variable changes systematically with distance across a clear environmental gradient, systematic line sampling along a transect is most appropriate.
Explain the options: random sampling avoids bias but may miss the gradient and clump points unevenly; systematic sampling places points at regular intervals, capturing change along a transect; stratified sampling divides the area into sub-groups (zones) sampled in proportion.
Justify systematic line sampling: a transect from the strand line inland with quadrats at fixed intervals (for example every five metres) captures the succession from pioneer to climax vegetation in order, which is exactly the pattern of interest; it is also practical and repeatable.
Add nuance: a stratified element could ensure each dune zone (embryo, fore, yellow, grey dune) is represented if zones differ in width, and the quadrat position at each sampling point can be randomised to reduce bias.
Markers reward defining the strategies, matching systematic transect sampling to a gradient, and a justification linked to the aim plus a sensible refinement.
Original. Explain why sample size and the avoidance of bias are important in fieldwork, using an example. [6 marks]
Argument: a large, unbiased sample is more likely to represent the population, so it makes conclusions reliable and valid; a small or biased sample can mislead.
Explain sample size: more observations reduce the influence of anomalies and random variation, narrowing uncertainty; too few readings (say three pebbles per site) make a result unreliable, while a larger number (say fifty) gives a more stable estimate.
Explain bias: bias is systematic error that skews the sample away from the population, such as only surveying shoppers on a weekday afternoon (missing workers) or placing quadrats where vegetation looks lush. Random or systematic placement and varied survey times reduce it.
Use an example: in a pedestrian-count study, counting at one location at one time over-represents that moment; sampling several sites at several times gives a representative picture. Markers reward the link from size and bias to reliability and validity, and a concrete example.