CS610/710 Review exercise 2 example solution (answers are in italic)
Decomposition exercises:
1.Consider a database of ship voyages with the following attributes: S (Ship name), T (type of ship), V (voyage identifier), C (cargo carried by one ship on one voyage), P (port), and D (day). We assume that a voyage consists of a sequence of events where one ship picks up a single cargo, and delivers it to a sequence of ports. A ship can visit only one port in a single day. Thus, the following functional dependencies may be assumed: S T, V SC, and SD PV.
- Find a decomposition into BCNF.
- Find a dependency-preserving decomposition into 3NF.
- Explain why there is no lossless-join, dependency-preserving BCNF decomposition for this database.
Answer:
- One possible BCNF decomposition is (VSC, VT, VPD)
- A 3NF decomposition is (ST, VSC, SDPV)
- Due to V -> S and SD -> V, BCNF cannot preserve the FD SD -> V . This is very similar to the situation described in example 3.42 pp. 151 (3.32 pp. 114) in the text.
Formally, it is not simple to determine whether any relation schema with a set of FDs would have no lossless-join, dependency-preserving BCNF. If the set of attributes in the relation schema is large and the set of FDs is large, The formal computation of closures for FDs for the decomposed schemas could be exponential (although the text has discussed some short cut to avoid computing all the possible FDs' for a decomposed schema by recognizing FDs that are trivial and not related to the new schema). Once that is done, then the check for dependency preservation can be done in polynomial time because we just need to check that each FD in the original set is entailed by the union of FDs local to the new schemas. For schemas that constitute a lot of attributes and FDs, one would not do this by hand
However, for problems that are encountered in exercises (and test), they are not that complicated and usually there are only a few possible decompositions and each could be checked by hand easily. So, to list the decomposition and to analyze them is not too difficult.
Some insight of the BCNF definition will help to spot FDs in exercises that could not be preserved by BCNF. Recall the difference of BCNF and 3NF. BCNF basically strengthens 3NF so that prime attributes cannot be dependent on non-keys. So, if a relation schema has 2 (or more) composite keys (i.e. keys that has more than 1 attributes) that share at least one attribute, and the prime attributes that are not common among the composite keys are dependent on each other, then you are going to have the situation that there is no lossless-join, dependency-preserving BCNF. The example in the text and the review exercise question is exactly that kind of problem. Take the exercise question, for example, the possible keys to the relation STVCPD are SD and VD. However, there are the FDs V->S and SD->V. To turn the relation into a BCNF decomposition, V->S must be used for decomposition and formed VS. But, then the remaining attributes do not contain S and hence the other new relations will not have SDV together in any form (as evidence in the example solution), henceSD->V is not presesved (i.e cannot be checked locally in any of the new relation schemas). The only way to check for that FD when insert new data is to join the new relations together to determine if the insertion violates the FD.
- Suppose we have a database for a bank, consisting of the following attributes: K (banker), N (branch name), L (city branch is located in), C (customer name), A (account number), and B (balance), with the following functional dependencies: N L, A BCN, K N, and AN K.
- Find the key(s) for the relation schema R = KNLCAB. Justify your answer.
Answer: The only key is A. AABCNABCKLN
- Find a decomposition of R into collections of relations that are in 3NF.
Answer: start with NL, decompose into NL and ABCKN, then use KN to decompose ABCKN into KN and ABCK. The relations NL, KN, and ABCK are in 3NF.
- Find a decomposition of R into collections of relations that are in BCNF.
Answer: the decomposition in c is already in BCNF.
SQL exercises:
- Based on the suppliers-parts-projects database of the review exercise, create the relations for the database in Oracle using SQLPLUS. The data for the relations are provided in a text file, assignment2_data.txt, which is posted on the course web page.
-- relation supplier:
CREATE TABLE supplier (
snum CHAR(3),
sname VARCHAR(15),
status INT,
city VARCHAR(15)
);
insert into supplier values ('S1', 'Smith', 20, 'London');
insert into supplier values ('S2', 'Jones', 10, 'Paris');
insert into supplier values ('S3', 'Blake', 30, 'Paris');
insert into supplier values ('S4', 'Clark', 20, 'London');
insert into supplier values ('S5', 'Adams', 30, 'Athens');
-- relation spj (for supplier, part, job relationship):
CREATE TABLE spj (
snum CHAR(3),
pnum CHAR(3),
jnum CHAR(3),
qty INT
);
insert into spj values ('S1', 'P1', 'J1', 200);
insert into spj values ('S1', 'P1', 'J4', 700);
insert into spj values ('S2', 'P3', 'J1', 400);
insert into spj values ('S2', 'P3', 'J2', 200);
insert into spj values ('S2', 'P3', 'J3', 200);
insert into spj values ('S2', 'P3', 'J4', 500);
insert into spj values ('S2', 'P3', 'J5', 600);
insert into spj values ('S2', 'P3', 'J6', 400);
insert into spj values ('S2', 'P3', 'J7', 800);
insert into spj values ('S2', 'P5', 'J2', 100);
insert into spj values ('S3', 'P3', 'J1', 200);
insert into spj values ('S3', 'P4', 'J2', 500);
insert into spj values ('S4', 'P6', 'J3', 300);
insert into spj values ('S4', 'P6', 'J7', 300);
insert into spj values ('S5', 'P2', 'J2', 200);
insert into spj values ('S5', 'P2', 'J4', 100);
insert into spj values ('S5', 'P5', 'J5', 500);
insert into spj values ('S5', 'P5', 'J7', 100);
insert into spj values ('S5', 'P6', 'J2', 200);
insert into spj values ('S5', 'P1', 'J4', 100);
insert into spj values ('S5', 'P3', 'J4', 200);
insert into spj values ('S5', 'P4', 'J4', 800);
insert into spj values ('S5', 'P5', 'J4', 400);
insert into spj values ('S5', 'P6', 'J4', 500);
-- relation part:
CREATE TABLE part (
pnum CHAR(3),
pname VARCHAR(15),
color CHAR(10),
weight INT,
city VARCHAR(15)
);
insert into part values ('P1', 'Nut', 'Red', 12, 'London');
insert into part values ('P2', 'Bolt', 'Green', 17, 'Paris');
insert into part values ('P3', 'Screw', 'Blue', 17, 'Rome');
insert into part values ('P4', 'Screw', 'Red', 14, 'London');
insert into part values ('P5', 'Cam', 'Blue', 12, 'Paris');
insert into part values ('P6', 'Cog', 'Red', 19, 'London');
-- relation job:
CREATE TABLE job (
jnum CHAR(3),
jname VARCHAR(15),
city VARCHAR(15)
);
insert into job values ('J1', 'Sorter', 'Paris');
insert into job values ('J2', 'Display', 'Rome');
insert into job values ('J3', 'OCR', 'Athens');
insert into job values ('J4', 'Console', 'Athens');
insert into job values ('J5', 'RAID', 'London');
insert into job values ('J6', 'EDS', 'Oslo');
insert into job values ('J7', 'Tape', 'London');
-- The relations S = SUPPLIER, P = PART, J = JOB, and SPJ is SPJ
-- Q2: a.Print the projects that are supplied by at least one supplier not in the same city.
SELECT DISTINCT SPJ.jnum
FROM SPJ, (SELECT snum, jnum
FROM SUPPLIER, JOB
WHERE NOT (SUPPLIER.CITY = JOB.CITY)) R
WHERE spj.jnum = r.jnum;
Result:
JNUM
---
J1
J2
J3
J4
J5
J6
J7
--Q2b: b.Print the projects that are not supplied with any red part by any London supplier.
SELECT DISTINCT SPJ.jnum
FROM SPJ
WHERE SPJ.jnum NOT IN
(SELECT DISTINCT jnum
FROM SPJ,
(SELECT * FROM SUPPLIER WHERE (SUPPLIER.city = 'London')) R1,
(SELECT * FROM PART WHERE (PART.color = 'Red')) R2
WHERE
R2.city = R1.city and SPJ.snum = R1.snum and SPJ.pnum = R2.pnum) ;
or
SELECT DISTINCT SPJ.jnum
FROM SPJ
MINUS
SELECT DISTINCT jnum
FROM SPJ,
(SELECT * FROM SUPPLIER WHERE (SUPPLIER.city = 'London')) R1,
(SELECT * FROM PART WHERE (PART.color = 'Red')) R2
WHERE
R2.city = R1.city and SPJ.snum = R1.snum and SPJ.pnum = R2.pnum ;
Result:
JNUM
---
J2
J5
J6
-- Q2c Print the parts that are supplied to all projects in London
SELECT pnum-- set 1, all shipments
FROM SPJ
MINUS
SELECT DISTINCT T1.pnum -- set 2, parts not supplied to London
FROM
(SELECT DISTINCT SPJ.pnum, JOB.jnum -- set 3, parts used by London projects
FROM SPJ, JOB
WHERE JOB.city = 'London'
MINUS
SELECT DISTINCT pnum, jnum FROM SPJ) T1;-- set 4, all parts and projects -- that get shipment
Result:
PNUM
---
P3
P5
-- Q2d. d.For each part being supplied to a project, print the part number, the project number, and the corresponding total quantity.
SELECT pnum, jnum, SUM(qty)
FROM SPJ
GROUP BY pnum, jnum;
Result:
PNUMJNUM SUM(QTY)
------
P1 J1 200
P1 J4 800
P2 J2 200
P2 J4 100
P3 J1 600
P3 J2 200
P3 J3 200
P3 J4 700
P3 J5 600
P3 J6 400
P3 J7 800
PNU JNU SUM(QTY)
------
P4 J2 500
P4 J4 800
P5 J2 100
P5 J4 400
P5 J5 500
P5 J7 100
P6 J2 200
P6 J3 300
P6 J4 500
P6 J7 300
21 rows selected.
-- Q2e. For each supplier, print the total number of parts being supplied to all projects.
SELECT snum, SUM(qty) FROM SPJ GROUP BY snum;
Result:
SNU SUM(QTY)
------
S1 900
S2 3200
S3 700
S4 600
S5 3100
-- Q2f. Print the supplier numbers for suppliers supplying some project with part P1 in a quantity greater than the average shipment quantity of part P1 for that project.
SELECT SPJ.snum
FROM SPJ, (SELECT jnum, avg(qty) projectAvg
FROM SPJ
WHERE pnum = 'P1'
GROUP BY jnum) R1
WHERE SPJ.pnum = 'P1' and SPJ.qty > R1.projectAvg AND SPJ.jnum = R1.jnum;
-- or
SELECT R1.snum
FROM(SELECT * from SPJ WHERE pnum = 'P1') R1 ,
(SELECT jnum, avg(qty) projectAvg
FROM SPJ
WHERE pnum = 'P1'
GROUP BY jnum) R2
WHERE R1.qty > R2.projectAvg AND R1.jnum = R2.jnum;
Results
SNU
---
S1
-- Q3a. Change the city of all blue parts to Athens.
UPDATE PART SET city = 'Athens' WHERE color = 'Blue';
-- Q3b b.Delete all parts for which the total number of shipments is less than 1000.
DELETE FROM PART
WHERE PART.pnum IN
(SELECT pnum
FROM SPJ
GROUP BY pnum
HAVING sum(qty) < 1000);
-- or
DELETE FROM PART
WHERE PART.pnum = ANY
SELECT pnum
FROM SPJ
GROUP BY pnum
HAVING sum(qty) < 1000);
-- or (less efficient since a join is involved) (Join happy people would like this one!):
DELETE FROM PART
WHERE pnum IN
(SELECT PART.pnum
FROM PART, SPJ
WHERE PART.pnum = SPJ.pnum
GROUP BY SPJ.pnum, PART.pnum
HAVING sum(SPJ.qty) < 1000);
-- Q3c. Insert a new project (J8) into relation J. The name and city are Multimedia and San Francisco, respectively.
INSERT INTO JOB values ('J8', 'Multimedia', 'San Franncisco');
-- if your orginial name and/or city attribute size is not large enough for this new data, then you will need to modify the column width using the ALTER TABLE tableName MODIFY (columnName newDataDeclaration); For example, to change city's size to 20:
ALTER TABLE JOB MODIFY (city varchar(20));