Databases exercises[Type text]Paolo Coletti
Databases exercises
Dr. Paolo Coletti – Free University of Bolzano Bozen
04 June 2013
Databases
Draw the schema of this database, using at least (more if necessary): 3 tables, a junction table, 15 fields in the entire database. Try to make the database as complete as possible, keeping it simple and not contorted. You must indicate very clearly field names, MySQL field types, primary keys and other options, at least two constraints, relations with their “1” and “many” sides and the fields involved in the relations. For all the fields whose name is not obvious, you must also include a small comment that lets everybody understand what the field should contain. You must also justify non-standard choices.
Moreover, suggest two new queries (in English not in SQL): one which involves at least two tables and another one which needs a summary query to be implemented.
Travel Agency
A travel agency needs a database to keep track of all the available hotel/resorts and of the customers. For each customer the database must include the personal information, the destinations (which hotel) and the dates of departure and arrivals of his trips. The travel agency is also interested in keeping record of which employee was involved in reserving a specific trip for a given customer. The database must be able to answer the question: “What is the average number of stars of mountain resorts visited by customer John Smith?”
Research Institute
This temporal database is designed to keep record of all the research projects conducted by a research institute with the involved researchers. Each researcher belongs to a research unit, which has a specific goal and a location in the buildings, and is involved in several project for a different amount of months. Each project is characterized by a research area, duration, funding. The database should be able to answer the question “How many researchers of the Theoretical Physics unit have been involved in projects funded by the European Union?”
Airline company
This static database handles information about flights and airplanes of an airline company. The goal is to keep track of the specific airplane used in a specific flight (a flight has a day of the week, times, destination, arrival and it is repeated in the same way with the same airplane every week). Each airplane is identified by a code and has a purchase date, a model and manufacturer. The company owns several airplanes of different models and for each model the technical details (number of seat, maximum flight length, maximum speed) has to be stored. The database should be able to answer the question "What is the average number of seats of airplanes which depart from Rome?".
Medical clinic
This temporal database handles information about customers of a private medical clinic. Each client is identified by an ID, name, surname, birth date, address and phone number. For each client the clinic managers want to keep track of the specialist visits with time, date and personal details of the involved doctor (we assume that for each visit there is only one doctor). Each visit is identified by a code and has a description, a cost and is offered by a specific unit of the clinic (i.e. cardiology, surgery, etc.). The database should be able for each client to retrieve doctor, time, date and type of each visit..
Recruitment company
This database handles the data of a recruitment company. For each worker, all his working periods should be recorded and each one with all the acquired skills he/she has learnt in that period and their level of expertise. The database must be able to create a report of the career of each worker and to retrieve all the workers whose profile satisfy a given list of skills and expertise.
Politicians
this temporal database handles information on politicians, parties and governments they have been involved in. Each politician has belonged to different parties during his career and may have had government roles for given periods of time. The database must be able to produce a report with the political career of a given politician.
Recipes
this static database handles data on recipes and ingredients. Each recipe has a short and long description, time, difficulty, cost and many ingredients in different quantities. Ingredient unfortunately use several units of measure (litre, Kg, gram, ...). The database must be able to produce a report with the recipe’s description and the list of ingredients with their quantities.
Bookshop’s inventory
this static database handles data on books, editions with their ISBN, editors and authors. Each book is printed in several editions by different editors with different features and the bookshop has some copies in its storehouse. The database should be able to answer to the question “how many copies of Romeo and Juliet’s editions of less than 100 pages do we have?”.
Zoo
this static database handles data on each single animal, keeper and cage. There is more than one animal per cage and keepers make 8-hours turns for groups of cages. The database should be able to answer to the question “who are the responsible ones for the horses during the night?”.
Animal shop
this temporal database handles data on each single animal, its daily food and the buyer. Each animal eats more than one type of food and can have only one buyer. The database should be able to answer to the question “how much corn per day do animals currently present in shop eat?”.
Students association
this temporal database handles data on each member, its degree, the association’s roles and all the fee payments. Each member may have only one role but in time s/he may have had different roles, each member is enrolled or was enrolled to only one degree course. The yearly fee is the same for all students, but varies from year to year and we want to know exactly who has paid what and when. The database should be able to answer to the question “how many president we had enrolled to EM between 2004 and 2009?”.
Olympic games
this static database handles data on each team and participant, each competition and the results. Clearly each athlete belongs only to one team but can participate to different games. Some games have a team participation instead of single athletes participation. The database should be able to answer to the question “how many silver medals has Finland won?”.
Proposed solutions
Warning: these are databases as I would build them. Any other implementation which is correct and satisfies the requites is fully accepted. There might be cases where requirements are so open that a completely different schema and idea is correct.
Warning: reading these solutions before doing the database is a wonderful way to spoil yourself of the possibility to do exercises. These are meant to be exercises and not examples: I remind that you learn much more by trying to do them and then checking rather than by looking at solution.
Travel Agency
Resorts:
· ID: integer, auto_increment, primary key
· Phone, email: varchar(30)
· ContactPerson: varchar(50)
· ResortName, Town, Country: varchar(50), not null, index
· Address: varchar(100)
· ZIP: varchar(10)
· Type: ENUM(“mountain”, “seaside”, “country”, “lakeside”, “island”), index
· Stars: integer, not null, index
· Constraint: Stars between 1 and 5
Bookings:
· ID: integer, auto_increment, primary key
· Customer, Resort, Madeby: foreign keys, same type as corresponding primary key, index
· From: Date, index, not null
· To: Date, index
· DateMade: datetime, not null, index, default {CURDATE()}
· NumberPeople: integer, not null, default {2}
· FoodIncluded, TravelIncluded: ENUN(“yes”,”no”), not null, default{“no”}
· Constraint: NumberPeople between 1 and 999;
· Constraints: DateMade <= CURDATE()
Customers:
· ID: integer, auto_increment, primary key
· LastName: varchar(30), not null, index
· PlaceBirth, FirstName, Phone: varchar(30)
· DateBirth: Date
· Phone: varchar(30)
· Address: varchar(100)
· Constraints: DateBirth < CURDATE()
Employees:
· TaxCode: char(16), primary key
· LastName: varchar(30), not null, index
· PlaceBirth, FirstName: varchar(30), not null
· DateBirth: Date
· Nationality: varchar(30), not null, index
· Sex: ENUM(“M”,”F”), not null, index
· Address: varchar(100)
ZIP is text because some countries have letters and because it does not have any mathematical meaning. Phone are text because they may contain symbols and they do not have any mathematical meaning. Star is number, therefore it must be a number between 1 and 5 (no 0 stars or unrated resorts are possible).
ZIP can be null because some very small countries may not use it.
Query using more than one table: in which resort will John Smith be on the 27th May 2011?
Summary query: how many people has our travel agency in hotel Villa Bianca on the 27th May 2011?
Research Institute
Researchers:
· TaxCode: char(16), primary key
· LastName: varchar(30), not null, index
· PlaceBirth, FirstName: varchar(30), not null
· DateBirth: Date
· Nationality: varchar(30), not null, index
· Sex: ENUM(“M”,”F”), not null, index
· Unit: foreign key, same type as corresponding primary key, index
Units:
· ID: integer, auto_increment, primary key
· UnitName: varchar(60), not null, index
· Floor: varchar(10)
Projects:
· ID: integer, auto_increment, primary key
· ProjectName, ResearchArea: varchar(100), not null, index
· StartDate: Date, not null, index
· EndDate: Date, index
· Totalfunding: decimal(11,2)
· FundingOrganiz: varchar(100), index
· Constraint: Totalfunding IS NULL or TotalFunding>=0
· Constraint: EndDate IS NULL or EndDate>StartDate
Participations:
· ID: integer, auto_increment, primary key
· Researcher, Projects: foreign keys, same type as corresponding primary key, index
· Start: Date, not null, index
· End: Date, index
· ParticipationPercentage: decimal(4,3), not null
· Role: varchar(50), index
· Constraint: ParticipationPercentage between 0 and 1
· Constraint: End IS NULL or End>Start
End fields are not required since projects and participation may be without end. Total Funding is not required since it may be “to be determined”. Role is not required since the participant may be without role. ParticipationPercentage is indicated in decimal form (0,750 for 75%) and not with percentage symbol.
According to this database only a single funding organizations is possible for each project. Moreover, several Units may exist with the same name but different locations (since UnitName does not have “unique” option).
Query involving more than one table: what are the projects in which John Smith is participating?
Summary query: what is the total funding for projects to which John Smith is currently participating?
Airline company
Warning: field names are for Access (but you can easily determine the corresponding MySQL type), “R” means NOT NULL, “autonumber” means “integer, AUTO_INCREMENT”, constraints, indices and options UNIQUE, DEFAULT are to be added where appropriate.
Query: which airplane models do Rome-Milan on Monday?
Summary query: what is the total number of seats (theoretical number of people which can arrive) of all flights landing on Monday in Milan?
Medical clinic
Warning: field names are for Access (but you can easily determine the corresponding MySQL type), “R” means NOT NULL, “autonumber” means “integer, AUTO_INCREMENT”, constraints, indices and options UNIQUE, DEFAULT are to be added where appropriate.
Major specialization can be null since the medic may have no specialization. Visit’s times can be null since the visit may last the entire day. Visit’s description can be null since sometimes the name may be enough.
Query: who has done a visit on the 9th May 2011?
Summary query: show, medic by medic, the average number of visits per day.
Recruitment company
Warning: field names are for Access (but you can easily determine the corresponding MySQL type), “R” means NOT NULL, “autonumber” means “integer, AUTO_INCREMENT”, constraints, indices and options UNIQUE, DEFAULT are to be added where appropriate.
In this database every acquired skill must be acquired during a certain work period.
Skills Enddate is not null because this recruitment agency is interested only in past works
Some companies data can be null because the worker may not know them, however a phone number is always not null to ask to confirmation.
Vatcode is text because, even though is numeric, it does not have any mathematical meaning.
Expertiselevel is a number from 1 to 10.
Query1: show all the workers who can use a computer at level >= 7
Query2: How many people have worked for company XYZ?
Politicians
Warning: field names are for Access (but you can easily determine the corresponding MySQL type), “R” means NOT NULL, “autonumber” means “integer, AUTO_INCREMENT”, constraints, indices and options UNIQUE, DEFAULT are to be added where appropriate.
End dates in the two junction tables can be null since they are blank in case the person still belongs to that party or still is in that role.
Suspension is a yes/no field which indicated whether that person has been suspended by its party.
MaxPeople in table Roles is the maximum number of people who can be in that role at the same time (for example, Italian deputies are 640 or first minister is 1).
Mandatory in table Roles indicates whether that role must absolutely be filled in or whether it is optional like some Minister titles.
Resigning indicates whether that person has resigned (but resigning still has to be accepted).
A possible query is: Show all the politicians belonging to party ABC now
A possible summery query is: Show all the parties with the number of deputies they have now
Recipes
Warning: field names are for Access (but you can easily determine the corresponding MySQL type), “R” means NOT NULL, “autonumber” means “integer, AUTO_INCREMENT”, constraints, indices and options UNIQUE, DEFAULT are to be added where appropriate.
EasilyAVailable is an indication whether the product is easily available in local stores.
UnitOfMeasure is the ingredient’s unit of measure, which can be for example Kg, grams, litres, ml, units, cm.
Quantity can be null since for some small ingredients, such as salt and pepper, the quantity is “as needed”
DifficultyRating and CostRating are ratings from 1 to 5.
Recipe field in recipes table is the full instructions on how to prepare ingredients and on how to cook.
ShortDescription is a short description of how the dish looks like at the end
DifficultyRating, TimeMinutes, CostRating and Author are fields which may be unknown or still to be decided or tested.
A possible query which involves at least two tables is: what are the ingredients of recipe Spaghetti Alla Carbonara?
A possible summary query is: how many ingredients are needed on average for Difficulty 5 recipes?
Bookshop’s inventory
Warning: field names are for Access (but you can easily determine the corresponding MySQL type), “R” means NOT NULL, “autonumber” means “integer, AUTO_INCREMENT”, constraints, indices and options UNIQUE, DEFAULT are to be added where appropriate.
TaxCode, VATnumber, ISBN are text to give the possibility to start with 0.
All author’s data can be null since they may be unknown, for example for very old books or for authors using a pseudonym.