How do we organise tables to remove redundancy and avoid update anomalies?
Normalise a relational design to first, second and third normal form, explaining the anomalies each form removes
A focused answer to the H2 Computing outcome on normalisation. First, second and third normal form, the insertion, update and deletion anomalies they remove, and how to decompose a table step by step.
Reviewed by: AI editorial process; not yet individually human-reviewed
Have a quick question? Jump to the Q&A page
Jump to a section
What this dot point is asking
SEAB wants you to normalise a relational design to first, second and third normal form (1NF, 2NF, 3NF), and to explain the insertion, update and deletion anomalies each step removes. The core idea is that redundancy - storing the same fact in many rows - is the root of inconsistency, and normalisation systematically eliminates it by splitting tables so each fact lives in exactly one place.
The answer
Why normalise: anomalies
When data is duplicated across rows, three problems arise:
- Update anomaly - changing a repeated fact means editing every copy; miss one and the data is inconsistent.
- Insertion anomaly - you cannot record a fact because other, unrelated data is missing (you cannot add a course with no enrolled student if course details only live in the enrolment row).
- Deletion anomaly - removing a row accidentally erases an unrelated fact (deleting the last student deletes the course).
First normal form (1NF)
A table is in 1NF if every cell holds a single atomic value (no repeating groups or lists in one column) and each row is unique. Split a column like PhoneNumbers = "1234, 5678" into separate rows or a related table so each value stands alone.
Second normal form (2NF)
A table is in 2NF if it is in 1NF and every non-key attribute depends on the whole primary key, not just part of it. This only bites when the key is composite. If (OrderID, ProductID) is the key but ProductName depends on ProductID alone, that is a partial dependency - move ProductName to a Product table.
Third normal form (3NF)
A table is in 3NF if it is in 2NF and has no transitive dependency - no non-key attribute depends on another non-key attribute. If StudentID -> ClassID -> ClassName, then ClassName depends on the key only through ClassID; move the class attributes into a Class table.
Before (not 3NF):
Student(StudentID PK, Name, ClassID, ClassName, TeacherName)
After (3NF):
Student(StudentID PK, Name, ClassID FK)
Class(ClassID PK, ClassName, TeacherName)
The informal summary of 3NF: every non-key attribute depends on the key, the whole key, and nothing but the key.
Examples in context
Example 1. A supplier catalogue. A flat product table that repeats each supplier's address on every product they supply suffers update anomalies. Normalising to a separate Supplier table, referenced by SupplierID, stores each address once, so a supplier's move is a single-row edit.
Example 2. Course enrolment. Storing course title and lecturer alongside each enrolment risks losing the course entirely when the last student withdraws (a deletion anomaly). Splitting into Course, Student and an Enrolment link table keeps course details independent of who is enrolled.
Try this
Q1. State the condition a table must meet to be in first normal form. [1 mark]
- Cue. Every cell holds a single atomic value (no repeating groups or lists) and each row is unique.
Q2. A table with key (StudentID, CourseID) stores CourseName, which depends only on CourseID. Which normal form does this violate and why? [2 marks]
- Cue. 2NF -
CourseNameis partially dependent on part of the composite key (CourseID), not the whole key.
Q3. Give one anomaly that third normal form removes by eliminating transitive dependencies. [1 mark]
- Cue. An update anomaly - a transitively dependent fact (like a class teacher) repeated for every student no longer has to be updated in many rows.
Exam-style practice questions
Practice questions written in the style of SEAB exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
Original6 marksA single table stores: StudentID, StudentName, ClassID, ClassName, and (TeacherName) where ClassName and TeacherName depend only on ClassID. (a) Explain why this table is not in third normal form. (b) Decompose it into 3NF tables, stating the keys.Show worked answer →
(a) The table has a transitive dependency: StudentID determines ClassID, and ClassID determines ClassName and TeacherName. So non-key attributes (ClassName, TeacherName) depend on the key only through another non-key attribute (ClassID), not directly. This breaks third normal form and causes redundancy: the class name and teacher repeat for every student in the class, risking update anomalies.
(b) Decompose by moving the class details into their own table:
Student(StudentID PK, StudentName, ClassID FK -> Class)
Class(ClassID PK, ClassName, TeacherName)
Now every non-key attribute depends only on its table's primary key. The class name and teacher are stored once per class.
Markers reward identifying the transitive dependency, the resulting redundancy/anomaly, and a correct decomposition with ClassID as a foreign key linking the two tables.
Original4 marksDefine an update anomaly and a deletion anomaly, and give an example of each that normalisation removes.Show worked answer →
An update anomaly occurs when a fact is stored in many rows, so changing it requires updating every copy; missing one leaves the data inconsistent. Example: if a supplier's phone number is repeated on every product they supply, changing the number means editing many rows, and any missed row now disagrees.
A deletion anomaly occurs when deleting a row unintentionally removes other facts. Example: if a course's details are stored only alongside the students enrolled, deleting the last student deletes the only record of the course.
Normalisation removes both by storing each fact once in the table where it belongs (supplier details in a Supplier table; course details in a Course table), linked by keys.
Markers reward correct definitions and a concrete example of each, with the point that storing each fact once removes them.
Related dot points
- Describe the relational model in terms of tables, rows, attributes, primary keys and foreign keys, and explain referential integrity
A focused answer to the H2 Computing outcome on the relational model. Tables, rows and attributes, primary and foreign keys, relationships between tables, and how referential integrity keeps data consistent.
- Write SQL queries using SELECT with WHERE, ORDER BY, JOIN, GROUP BY and aggregate functions, and modify data with INSERT, UPDATE and DELETE
A focused answer to the H2 Computing outcome on SQL queries. SELECT with WHERE and ORDER BY, joining tables, grouping with aggregate functions, and modifying rows with INSERT, UPDATE and DELETE.
- Define database schemas with CREATE TABLE, choosing data types and enforcing PRIMARY KEY, FOREIGN KEY, NOT NULL and UNIQUE constraints
A focused answer to the H2 Computing outcome on defining schemas. CREATE TABLE, choosing appropriate data types, and enforcing primary key, foreign key, NOT NULL and UNIQUE constraints to protect data integrity.
- Model a problem domain with an entity-relationship diagram, identifying entities, attributes, relationships and cardinality, and map it to tables
A focused answer to the H2 Computing outcome on ER modelling. Entities, attributes and relationships, one-to-one, one-to-many and many-to-many cardinality, resolving many-to-many with a link table, and mapping an ER model to relational tables.
- Convert whole numbers between binary, denary and hexadecimal, and perform binary addition, explaining the role of place value and overflow
A focused answer to the H2 Computing outcome on number bases. Place value in binary and hexadecimal, conversion methods between binary, denary and hexadecimal, binary addition, and the meaning of overflow.