CSC931 Computing Science I: Autumn 2008

Databases – Using Related Tables in MS AccessWeek Beginning 24th November 2008

CSC931: Databases

Using Related Tables in MS Access

Learning Outcomes

By the end of this practical students should be able to:

Understand the idea of multi-entity database.

Understand relationships between tables.

Execute queries using data from several tables.

Remember:

Attendance: Go to the groups (V: drive) folder for CSC931 and click on the register icon, then register your attendance.

Checkpoints: Checkpoints contribute 25% to your final grade. Today is your last chance to confirm your checkpoints.

Module Evaluation Questionnaire: tell us what you think of the module. The MEQ is available on WebCT.

»Go to Groups on Wide and copy the folder Databases into your CSC931 folder.

»Open your Databases folder and double click on movies. This starts up Microsoft Access.

The Movies database has already been set up with data for you to use. Let’s tour the facilities.

Movies has four tables: Actors, Directors, Movies and Performances.

»Double click on Actors.

This brings up the data view of the table. You can see that the table contains several rows, with information about various actors. The headings at the top of the columns are the attribute names: ActorID, Name, Born, Died, Birthplace. At the bottom of the window it tells you how many records there are in the table (23).

»Switch to Design View by clicking the View button .

This brings up the design view of the table. This allows us to inspect and alter the schema of the table, i.e. the attribute names and their types. Setting the type means we can use Access to control the sort of data people put in the database. For example, the Born and Died fields should be numbers.

The ActorID field has type AutoNumber. This means that Access makes up a unique number for the field for every entry in the table. Also notice that this field has a little key to the left of it. This means it is the primary key of the table.

»Close this window and open up Directors. Switch to Design View to look at the schema (notice that DirectorID is the primary key here).

»Switch back to data view using the View button (notice that it’s changed icon, depending on what view you’re currently in) and add the following row to your Directors table. You don’t need to type the DirectorID field; Access makes that up for you (it should be number 8).

Name / Born / Died / Birthplace
Raoul Walsh / 1887 / 1980 / USA

»Save the Directors table.

»Open the Movies table.

Notice that each row has information about Movies, including Directors. We could have added director’s names here, but instead, we hold that information in the Directors table and in Movies simply refer to the DirectorID (the unique primary key of the Directors table). Why do you think that is?

»Add the records below to the Movies table (they should be number 11 and 12).

Title / DirectorID / Writer / Year / Runtime
White Heat / 8 / Virginia Kellogg / 1949 / 114
The Magnificent Ambersons / 6 / Orson Welles / 1942 / 88

»Save the Movies table.

Linking two tables together requires more than just using an appropriate number for DirectorID. We also need to tell Access that there is a relationship between the tables. Let’s look at that now.

Relationships

»Click the Relationships button in the main Access window toolbar.

This shows the four tables in our database, and the relationships between them (the lines joining the table schemas). Between Directors and Movies there is a one to many relationship (you can tell because one side of the line has a 1 and the other has an infinity symbol ∞). This means that each director may have directed many movies, but each movie has only one director.

Between Actors and Movies there is a many to many relationship, which we model by adding an additional table, Performances. There is a one to many relationship from Actors to Performances (each actor appears in many performances, but each performance has only one actor). There is a one to many relationship from Movies to Performances (each movie contains many performances, but each performance is in only one movie).

The primary key of a table is always used to make the relationship (because it’s a unique identifer for rows in the table).

Performances has two fields for a primary key. You could check that out by looking at its schema too. This is because the only way to uniquely identify a performance is by movie and by actor. Why can’t we use just MovieID on its own, or ActorID on its own to uniquely identify performances?

Adding some more data

To allow for some more interesting queries later, we’ll add a few more records to the database. It was a well-known feature of Alfred Hitchcock’s movies that he always appeared, usually very briefly, on screen so we’ll credit him as an actor in each of the movies he directed.

»Add a new record to the Actors table (it should be number 24).

Name / Born / Died / Birthplace
Alfred Hitchcock / 1899 / 1980 / UK

Hitchcock’s DirectorID in the Directors table is 1 and the MovieIDs of the movies that he directed are 6,7 and 10.

»Add four new records to the Performances table. The first three record Hitchcock’s appearances and the fourth is for Joseph Cotten’s role in The Magnificent Ambersons

MovieID / ActorID / Character
6 / 24 / Not credited
7 / 24 / Not credited
10 / 24 / Not credited
12 / 5 / Eugene

It’s possible that your numbers (MovieID and ActorID) may be different from mine if you’ve done the steps in a different order, or redone steps. Just make sure the numbers you enter above are the correct ones for your database.

»Close all table windows.

Single-table queries

It’s all very well adding data to the database, but we have to get it back out. Queries allow you to make almost any imaginable selections from your database. In addition, queries can be saved for future use. Each time you run a query it will operate on the most up-to-date data in the database.

In our Movies database, suppose you want to list just the Title and Runtime of all movies made between 1946 and 1951 inclusive. We’ll create a query!

»In the Database window, click on Queries under the Objects heading.

»Double-click on Create Query in Design View from the available options.

»In the Show Table dialog choose the table you want to use in your query (Movies).

»Press Add and then Close to remove the dialog box.

What you see is the QBE (Query By Example) window. The top part of the screen displays the schema of the table you’re using in the query. The bottom half is where we will choose what to display and which rows to select out of our table.

»First, select the attributes in which you’re interested (Title, Runtime and Year) by clicking in the Field box and choosing from the drop-down list that appears.

»To view the results of a query (even one that is not complete), press the View button in the main Access window to switch from Design View to Datasheet View.

At the moment, the query should just contain the three columns from all of the rows in the table.

»Press the View button again to switch back to Design View.

»We only want to see films made between 1946 and 1951. To achieve this, click in the Criteria box under the Year column in the grid. Type in:

>= 1946 And <= 1951

This expression tells Access only to display rows in which the Year column value is bothgreater-than-or-equal-to 1946 and less-than-or-equal-to 1951. You must enter the expression exactly as above otherwise it will not work. View the query results again. Has Access ignored the years out of the range of our query?

»Notice that we can arrange for the results of the query to be sorted if we like. Back in Design View, click in the Sort box under the Year heading and choose Ascending. Check that the sort has been performed by viewing the results again.

»Finally, for this query, we said that we only wanted to see the Title and Runtime fields displayed, so uncheck the Show box under the Year heading. View the results. Success?

»Save the query. Press the Save button. You’ll be given the chance to choose a name for the query. Access suggests the name Query1. Call it Title+Runtime+Year46to51.

At any time in the future, you can run this query again simply by going to the Queries section in the Database window and double-clicking the Title+Runtime+Year46to51 item. You can open a query again in Datasheet View, to see the results, or in Design View, to edit or modify it.

There are a number of useful criteria operators that you can use in queries. For example:

And, Or, Not / The Boolean operators you’ve seen before
Like / Containing the specified item (see later)
Between … And … / Alternative to the example above (>= And <=)
=,>,>,<,>=,<= / Equals, not equals, greater than, less than, greater than or equal, less than or equal.
*,/,+,- / Multiply, divide, add, subtract

We’ll use some more of these below.

Multi-table queries

Most interesting queries will need to use more of the tables in the database, and to make use of the relationships between them. For example:

  1. Who directed Strangers on a Train?
  2. Which movies were directed by a director from the UK?
  3. Which movies featured Humphrey Bogart and what character did he play?
  4. Which movies features Joseph Cotten and who directed him?
Two-table query: Who directed Strangers on a Train?

You could find out the answer to this query yourself by looking at the tables (because they’re small). The Movie table tells you that “Strangers on a Train” was directed by number 1, and the Director table tells you that director number 1 is “Alfred Hitchcock”. You need both tables to tell you the answer. But what would be the point of adding all this information to the database if you weren’t going to make use of all of the things it can do? Also, with more data, doing queries manually gets tedious and error prone. Let’s do this with a query.

»In the Database window, click on Queries under the Objects heading.

»Double-click on Create Query in Design View from the available options.

»In the Show Table dialog box, add the tables you want to use in your query (Movies and Directors).

»Press Close to remove the dialog box.

You should see the Query By Example window again, which should contain the icons for the chosen tables and the relationship link between them in the top half of the window and the filter/sort grid in the bottom half of the window.

»Choose Movies.Title from the Field box in the leftmost column.

(This means the Title field in the Movies table. Sometimes we use the same field name in more than one table, for example Birthplace is in both Directorsand Actors. We can tell the difference between them because they are in different tables. Using the full name like this tells Access precisely which one we mean.)

»In the next column, choose Directors.Name.

In the Movies.Title column, uncheck the Show box and for Criteria, type in Strangers on a Train.

»Switch to Datasheet View or press the Runbutton to see the results of the query. Hopefully, there should be no nasty surprises.

»Save the query as DirectorOfStrangersOnATrain.

Access has used the relationship between the tables to find only the combinations of movie titles and director names for records in which the DirectorID Field is the same in both tables. If the relationship did not exist, Access would have combined every movie with every director and the result would have been nonsensical. This is an example of a JOIN (but Access hides the details of this from you).

Two-table query: Which movies were directed by a director from the UK?

»Create a new query and add the tablesMovies and Directors.

»Set up the query fields to be Movies.Title and Directors.Birthplace.

»In the Criteria box of the Directors.Birthplace field, typeUK. Uncheck the show box for this field.

»In the Sort box of the Movies.Title field, choose Ascending.

»Run the query.

»Save the query as UKDirectorsMovies.

Three-table query: Which movies featured Humphrey Bogart and what character did he play?

»Create a new query and add the tablesMovies, Performances and Actors.

»Set up the query fields to be Movies.Title, Actors.Name, and Performances.Character.

»In the Criteria box of the Actors.Name field, type Humphrey Bogart. Uncheck the show box for this field.

»Run the query (You should get Sam Spade and Philip Marlowe).

»Save the query as CharactersByHumph.

Rewind

»Open the design view of the query CharactersByHumph.

Sometimes we’re too lazy to type a whole name, and sometimes we just don’t remember it all. Access can help!

»In the Criteria box of the Actors.Name field, type

Like *bog*

The asterisks mean that we are looking for the string somewhere within the field so there may be characters before or after the search string. For example, if the search string was *bog then a match would be found only if the field ended with the string bog. Similarly bog* would produce a match only if the field started with bog.

»View the query results. This time we will elect to view the Actors.Name field just in case there are any other actors who might match the query string.

»Run the query (You should still get Sam Spade and Philip Marlowe).

»Save the query as CharactersByBogart.

Four-table query:Which movies features Joseph Cotten and who directed him?

Find the names of directors of movies featuring Joseph Cotten and show the movie titles and the names of the characters he played.

»Start a new query.

»Add all four tables.

»Add the fields Directors.Name, Movies.Title, Performances.Character and Actors.Name to the query.

»Set the Criteria box in the Actors.Name field to be Joseph Cotten (note the spelling).

»Run the Query. Go back to Design View and uncheck the Show box in the Actors.Name field. We don’t need to see it.

»Save the query as Directors, Movies and Characters for Joseph Cotten.

There’s more over the page, so don’t stop yet.

Try Some More Yourself

Use the Movies database to answer (at least one of) the following four queries.

  1. Who directed really short movies (with runtime of less than 100 minutes)?
  2. Which movies featured Ingrid Bergman?
  3. Who directed movies featuring Orson Welles?
  4. What were the movies, directors and character names for the actors born before 1900?

Something to think about: There are movies in which one actor plays more than one character. Indeed there are movies in which the same character is played by more than one actor. We won’t concern ourselves with these cases but, if our table did contain such entries, it would mean that the combination of MovieID and ActorID would be insufficient to guarantee a unique entry in the table. How do you think we could make our database deal with these possibilities?

Checkpoint

Demonstrate your queries to the tutor. (Movies from 1946-1951, roles of Humphrey Bogart, UK directors, Movies with Joseph Cotton, and at least one of the queries above, from the “Try it yourself” list.)

Reports

The output for our queries above was useful, but not that readable. Access lets us format the results of queries in a number of ways.

»Click the Reports button in the Database window and press the New button.

»In the New Report Dialog, select the Movies table in the drop-down list and choose AutoReport:Tabular from the available options.

»Click OK. You should get an automatically generated report based on the table contents. Nice!

We could have selected any of the tables or queries to base our report on, if we wanted.

»Close the present report. Don’t bother saving it. Now create a report based on the query Title+Runtime+Year46to51.

»I’m sure you get the idea! This is all very well but we can do much better using the Report Wizard. Try this out in your own time.

Department of Computing Science & MathematicsPage 1 of 9