The relational database

A singleflat-filetable is useful for recording a limited amount of data. But a large flat-file database can be inefficient as it takes up more space andmemorythan a relational database. It also requires new data to be added every time you enter a new record, whereas a relational database does not. Finally,data redundancy– where data is partially duplicated across records – can occur in flat-file tables, and can more easily be avoided in relational databases.

Therefore, if you have a large set of data about many differententities, it is more efficient to create separate tables and connect them with relationships.

Relationaldatabasesallow data to be stored in a clear, organised manner across multiple tables. Links, known asrelationships, are formed to allow the data to be shared across the tables.

For instance, a retail company might have different tables for the following information:

  • customer details
  • customer orders
  • product details
  • stock levels
  • stock locations
  • staff details

Product detailscould be complicated, eg if the company sold books there may be several categories within books, including author name, title, genre, physical size and many other details. Storing all this information in one flat-file table would create a very large table.

Connecting entities

The main characteristics of arelational databaseare:

  • it is built from a set of unique tables (also called relations)
  • a table contains data about just oneentity
  • tables must have aprimary key
  • tables are linked by primary andforeign keys

When working with relational databases, users need to try to keepinformation about different entities in separate tables. Each entity has a primary key to provide a unique reference to an entity, which means that an entity can be referenced in another table without having to call up all the details about that entity.

Entity relationship diagrams

Entities can relate to each other in three different ways:one to one, one to many and many to many.

You can represent these relationships using an entity relationship diagram (ERD).

One to one

For example,oneperson hasoneaddress.

One to many

For example,onecinema hasmanycustomers.

Many to many

For example,manysubjects can be taken bymanystudents.

Relationship example

The following example shows how tables can be connected usingprimaryandforeign keys.

The tables are in adatabasefor an online shop. There are three tables:

  • customer
  • product
  • orders

Each table has a primary key field and each record has a primary key with a unique number.

Customer table

The customer table gives customers a unique Customer ID (the primary key for this table) and shows customer details, ie name, address and phone number.

Customer ID / First name / Surname / Address / Phone number
02942 / Rebecca / Johnson / 49 Drew Road / 029 381834
02943 / Mushtaq / Aqbar / 28 Lyttleton Lane / 028 282738

Product table

The product table gives details about the products. The Product ID is the primary key for this table.

Product ID / Product type / Colour / Size / Cost
284758 / Jeans / Blue / 28 / £14.99
384957 / Shoes / Brown / 6 / £12.99
483927 / Jumper / Red / M / £29.99
489320 / Shirt / Blue / M / £33.99
839258 / Socks / White / 6 / £10.00

Orders table

In the orders table, each order has a unique Order number (the primary key for this table). The table also includes customer ID (the primary key of the customer table) and product ID (the primary key of the product table) asforeign keys, but does not need to include all details about customers and products as these are stored in the Customer and Product tables.

Order number / Customer ID / Product ID / Quantity / Total cost
59876 / 02942 / 284758 / 2 / £29.98
59877 / 02942 / 384957 / 3 / £38.97
59878 / 02942 / 483927 / 1 / £29.99
59879 / 02943 / 489320 / 3 / £99.97
59880 / 02943 / 839258 / 2 / £20.00

What are absolute and relative references and why should I care?

An absolute reference means that the reference to the column or row it’s in front of will never change no matter where you place the formula.

For example, if I have a formula reference set to =$A$2 then no matter where I copy the formula to it will always be equaled to the value in A2. Now if I had =$A2 and I copied the formula to cell D6 then the formula would change to =$A6. Why does this happen? It’s because the absolute reference is for the column only and not the row. So it will always reference column A, but the row will change to the row of wherever the formula is being copied.

Four types of cell references:

=$A$2 – Absolute reference, nothing will change when the formula is moved, it will always be equaled to the value in cell A2

=A$2 – Absolute row reference, only the column will change when the formula is moved

=$A2 – Absolute column reference, only the row will change when the formula is moved

=A2 – Relative reference, both the column and row will change when the formula is moved

So, why should I care? Suppose you use a standard vlookup formula, something like =VLOOKUP(A2,B2:B10,2). This is stating that you’re looking up the value in A2 in the range B2:B10 and returning the value if it’s an exact match. Now let’s consider you want to lookup the values in A2:A10 and see the matching values in B2:B10. If you drag the formula down then you’ll have some unexpected results, because the reference is relative and not absolute (no dollar signs). Once you drag the formula down it becomes =VLOOKUP(A3,B3:B11,2), then if you continue dragging down it will become =VLOOKUP(A4,B4:B12,2), etc. You can quickly see that the range you’re trying to look in (B2:B10) is no longer the range you actually are looking in.This can become a big problem.

Suppose you are looking up financial information on which customers accounts are paid in full. If you don’t setup the formula properly with the correct reference types, you could be sending overdue notices to customers that have already paid.

Vlookup

Vlookup (short for 'vertical' lookup) is a built-in Excel function that is designed to work with data that is organized into columns. For a specified value, the function finds (or 'looks up') the value in one column of data, and returns the corresponding value from another column.

Vlookup Syntax

The syntax of the Excel Vlookup function is:

VLOOKUP(lookup_value,table_array,col_index_num)

where the function arguments are:

lookup_value / - / The value that you want to search for.
table_array / - / The array of data that is to be searched for thelookup_value.
Note: The Vlookup function searches in the left-most column of this array.
col_index_num / - / An integer, specifying the column number of the suppliedtable_array, that you want to return a value from.

To refresh your memory on how VLOOKUP works please watch this tutorial: