Created by: Brendan Coulson

Normalisation

Normalisation is the process of ensuring that the database is designed in the best possible way, so that:

·  there is a minimum of data duplicated in different tables

·  inconsistencies between data items are eliminated (for example, a person’s surname is not spelt Humphreys in one place and Humphries somewhere else in the database)

·  it is as easy as possible to extract information from the database

There are several stages in normalisation, the first three of which are enough for nearly all databases, including any that you will come across or design on this course. These stages are known as first, second and third normal form.

In this unit you are required to normalise only to first normal form, but in Unit 6 you have to be able to normalise to third normal form so all three stages are covered here.

Normalisation is sometimes perceived as baffling and difficult, but is really a question of common sense – avoid repeating attributes, put all the attributes in the table where they obviously belong and you will end up with correctly normalised tables.

First Normal form

Definition: A database in First Normal Form must not contain any repeating groups.

Looking at the sample data in figure one write down which attributes are repeating groups below:


Now to resolve this problem we must separate the attributes that can hold more than one value. In figure one’s example Orders may contain many Sandwiches; therefore the attributes that are related to Sandwiches (Sand. Code, Sand Desc., Sand. Price and Sand. Qty.) can occur more than one once within an order.

Order Table

·  Order Number (Primary Key)

·  Date of Order

·  Company ID

·  Company Name

·  Company Address

·  Company Postcode

·  Company Employee Contact

·  Company Contact Number

·  Total Price

·  Order Paid

·  Store ID

·  Store Location

Sandwich Table

·  Sandwich Code (Primary key)

·  Order Number (Foreign Key)

·  Sandwich Description

·  Sandwich Price

·  Sandwich Quantity

At this stage we need to have a primary key for each table to uniquely identify each record. We also need to relate Sandwiches to Orders therefore Order Number is needed as a foreign key in the Sandwich Table.

Sand. Code / Order Num. / Sand. Desc. / Sand. Price / Sand. Qty
111 / 1 / Egg Salad / £2 / 1
112 / 1 / Cheese & Ham / £3 / 1
115 / 1 / Ham / £4 / 1
112 / 43 / Cheese & Ham / £4 / 2

By looking at the table above what problem can you see with the Sand. Code primary key values?


Due to the attributes within the Sandwich Table, Sand. Code is not unique i.e. Sand. Code 112 is repeated twice. To resolve this we use Sand. Code and Order Num. together as the primary key. Using more than one key to make a record unique is called a Composite key.

Order Table

·  Order Number (Primary Key)

·  Date of Order

·  Company ID

·  Company Name

·  Company Address

·  Company Postcode

·  Company Employee Contact

·  Company Contact Number

·  Total Price

·  Order Paid

·  Store ID

·  Store Location

Sandwich Table

·  Sandwich Code

·  Order Number

·  Sandwich Description

·  Sandwich Price

·  Sandwich Quantity

These tables are now in First Normal Form.

Second Normal Form

Definition: A table is in Second Normal Form if it is in First Normal Form and no column that is not part of the primary key is dependent on only a portion of the primary key.

2NF is only concerned with attributes containing more than 1 primary key – therefore the Order attributes are already in 2NF.

The Sandwich table is not in second normal form because Sandwich Description is only dependant on Sandwich Code and not on the Order Number. Therefore is it only dependent on a portion of the key (Sandwich Code).


To put the tables in Second Normal Form we need to introduce a third table (SandwichOrder) that acts as a link between the entities Order and Sandwich.

This is illustrated in the ERD diagram below:

Before the extra table:

Many Sandwiches can be included in many Orders and many Orders will include many Sandwiches.

After the extra table:

Order Table

·  Order Number (Primary Key)

·  Date of Order

·  Company ID

·  Company Name

·  Company Address

·  Company Postcode

·  Company Employee Contact

·  Company Contact Number

·  Total Price

·  Order Paid

·  Store ID

·  Store Location

Sandwich Table

·  Sandwich Code (Primary Key)

·  Sandwich Description

·  Sandwich Price

SandwichOrder

·  Sandwich Code

·  Order Number

·  Sandwich Qty

The tables are now in Second Normal Form.

All non key attributes require all parts of the key to uniquely identify it.

Third Normal Form

Definition: A table in Third Normal Form contains no non-key dependencies.

All non key attributes must depend only on the key and not on each other.

Order Table

·  Order Number (Primary Key)

·  Date of Order

·  Company ID

·  Company Name

·  Company Address

·  Company Postcode

·  Company Employee Contact

·  Company Contact Number

·  Total Price

·  Order Paid

·  Store ID

·  Store Location

Looking at the Order table above the Company Name is dependent on the Company ID and the Store Location is dependant on the Store ID. These attributes are not dependant on the Order Number. Therefore they need to be separated off into their own tables.

However a relationship needs to be created between Order and Company. Therefore the Company ID will be needed as a foreign key in the Order Table. This is also the case between Order and Store, the Store ID is also needed as foreign key in the Order Table.

Order Table

·  Order Number (Primary Key)

·  Company ID (Foreign key)

·  Store ID (Foreign key)

·  Date of Order

·  Total Price

·  Order Paid

Company Table

·  Company ID (Primary Key)

·  Company Name

·  Company Address

·  Company Postcode

·  Company Employee Contact

·  Company Contact Number

Store Table

·  Store ID (Primary Key)

·  Store Location

Sandwich Table

·  Sandwich Code (Primary Key)

·  Sandwich Description

·  Sandwich Price

SandwichOrder

·  Sandwich Code (Composite Key)

·  Order Number (Composite Key)

·  Sandwich Qty

The tables are now in Third Normal Form.

All non key attributes depend only on the key and not on each other

Normalisation Page 2 of 7

Created by: Brendan Coulson

Figure 1 - Sample Data

Order Num. / Date of Order / Total Price / Order Paid / Company ID / Company Name / Company Address / Company Postcode / Company Contact Number / Sand. Code / Sand. Desc. / Sand. Price / Sand.
Qty. / Store ID / Store Lo.
1 / 29/02/09 / £9 / Y / 101 / Smiths / 12 Eastwood Ave / NG12 8YT / 0115 2727272 / 111 / Egg Salad / £2 / 1 / 1 / East-wood
1 / 29/02/09 / £9 / Y / 101 / Smiths / 12 Eastwood Ave / NG12 8YT / 0115 2727272 / 112 / Cheese & Ham / £3 / 1 / 1 / East-wood
1 / 29/02/09 / £9 / Y / 101 / Smiths / 12 Eastwood Ave / NG12 8YT / 0115 2727272 / 115 / Ham / £4 / 1 / 1 / East-wood
43 / 24/02/09 / £10 / N / 1144 / Kinch / 15 Eve Rd / LE12 7RY / 01162 2323223 / 112 / Cheese & Ham / £4 / 2 / 2 / Leics.
43 / 24/02/09 / £10 / N / 1144 / Kinch / 15 Eve Rd / LE12 7RY / 01162 2323223 / 111 / Egg Salad / £2 / 1 / 2 / Leics.

Normalisation Page 2 of 7