CS610/710 Review exercise 2 example solution (answers are in italic)

Decomposition exercises:

1.Consider a database of ship voyages with the following attributes: S (Ship name), T (type of ship), V (voyage identifier), C (cargo carried by one ship on one voyage), P (port), and D (day). We assume that a voyage consists of a sequence of events where one ship picks up a single cargo, and delivers it to a sequence of ports. A ship can visit only one port in a single day. Thus, the following functional dependencies may be assumed: S  T, V  SC, and SD  PV.

  1. Find a decomposition into BCNF.
  2. Find a dependency-preserving decomposition into 3NF.
  3. Explain why there is no lossless-join, dependency-preserving BCNF decomposition for this database.

Answer:

  1. One possible BCNF decomposition is (VSC, VT, VPD)
  2. A 3NF decomposition is (ST, VSC, SDPV)
  3. Due to V -> S and SD -> V, BCNF cannot preserve the FD SD -> V . This is very similar to the situation described in example 3.42 pp. 151 (3.32 pp. 114) in the text.

Formally, it is not simple to determine whether any relation schema with a set of FDs would have no lossless-join, dependency-preserving BCNF. If the set of attributes in the relation schema is large and the set of FDs is large, The formal computation of closures for FDs for the decomposed schemas could be exponential (although the text has discussed some short cut to avoid computing all the possible FDs' for a decomposed schema by recognizing FDs that are trivial and not related to the new schema). Once that is done, then the check for dependency preservation can be done in polynomial time because we just need to check that each FD in the original set is entailed by the union of FDs local to the new schemas. For schemas that constitute a lot of attributes and FDs, one would not do this by hand

However, for problems that are encountered in exercises (and test), they are not that complicated and usually there are only a few possible decompositions and each could be checked by hand easily. So, to list the decomposition and to analyze them is not too difficult.

Some insight of the BCNF definition will help to spot FDs in exercises that could not be preserved by BCNF. Recall the difference of BCNF and 3NF. BCNF basically strengthens 3NF so that prime attributes cannot be dependent on non-keys. So, if a relation schema has 2 (or more) composite keys (i.e. keys that has more than 1 attributes) that share at least one attribute, and the prime attributes that are not common among the composite keys are dependent on each other, then you are going to have the situation that there is no lossless-join, dependency-preserving BCNF. The example in the text and the review exercise question is exactly that kind of problem. Take the exercise question, for example, the possible keys to the relation STVCPD are SD and VD. However, there are the FDs V->S and SD->V. To turn the relation into a BCNF decomposition, V->S must be used for decomposition and formed VS. But, then the remaining attributes do not contain S and hence the other new relations will not have SDV together in any form (as evidence in the example solution), henceSD->V is not presesved (i.e cannot be checked locally in any of the new relation schemas). The only way to check for that FD when insert new data is to join the new relations together to determine if the insertion violates the FD.

  1. Suppose we have a database for a bank, consisting of the following attributes: K (banker), N (branch name), L (city branch is located in), C (customer name), A (account number), and B (balance), with the following functional dependencies: N  L, A  BCN, K  N, and AN  K.
  1. Find the key(s) for the relation schema R = KNLCAB. Justify your answer.

Answer: The only key is A. AABCNABCKLN

  1. Find a decomposition of R into collections of relations that are in 3NF.

Answer: start with NL, decompose into NL and ABCKN, then use KN to decompose ABCKN into KN and ABCK. The relations NL, KN, and ABCK are in 3NF.

  1. Find a decomposition of R into collections of relations that are in BCNF.

Answer: the decomposition in c is already in BCNF.

SQL exercises:

  1. Based on the suppliers-parts-projects database of the review exercise, create the relations for the database in Oracle using SQLPLUS. The data for the relations are provided in a text file, assignment2_data.txt, which is posted on the course web page.

-- relation supplier:

CREATE TABLE supplier (

snum CHAR(3),

sname VARCHAR(15),

status INT,

city VARCHAR(15)

);

insert into supplier values ('S1', 'Smith', 20, 'London');

insert into supplier values ('S2', 'Jones', 10, 'Paris');

insert into supplier values ('S3', 'Blake', 30, 'Paris');

insert into supplier values ('S4', 'Clark', 20, 'London');

insert into supplier values ('S5', 'Adams', 30, 'Athens');

-- relation spj (for supplier, part, job relationship):

CREATE TABLE spj (

snum CHAR(3),

pnum CHAR(3),

jnum CHAR(3),

qty INT

);

insert into spj values ('S1', 'P1', 'J1', 200);

insert into spj values ('S1', 'P1', 'J4', 700);

insert into spj values ('S2', 'P3', 'J1', 400);

insert into spj values ('S2', 'P3', 'J2', 200);

insert into spj values ('S2', 'P3', 'J3', 200);

insert into spj values ('S2', 'P3', 'J4', 500);

insert into spj values ('S2', 'P3', 'J5', 600);

insert into spj values ('S2', 'P3', 'J6', 400);

insert into spj values ('S2', 'P3', 'J7', 800);

insert into spj values ('S2', 'P5', 'J2', 100);

insert into spj values ('S3', 'P3', 'J1', 200);

insert into spj values ('S3', 'P4', 'J2', 500);

insert into spj values ('S4', 'P6', 'J3', 300);

insert into spj values ('S4', 'P6', 'J7', 300);

insert into spj values ('S5', 'P2', 'J2', 200);

insert into spj values ('S5', 'P2', 'J4', 100);

insert into spj values ('S5', 'P5', 'J5', 500);

insert into spj values ('S5', 'P5', 'J7', 100);

insert into spj values ('S5', 'P6', 'J2', 200);

insert into spj values ('S5', 'P1', 'J4', 100);

insert into spj values ('S5', 'P3', 'J4', 200);

insert into spj values ('S5', 'P4', 'J4', 800);

insert into spj values ('S5', 'P5', 'J4', 400);

insert into spj values ('S5', 'P6', 'J4', 500);

-- relation part:

CREATE TABLE part (

pnum CHAR(3),

pname VARCHAR(15),

color CHAR(10),

weight INT,

city VARCHAR(15)

);

insert into part values ('P1', 'Nut', 'Red', 12, 'London');

insert into part values ('P2', 'Bolt', 'Green', 17, 'Paris');

insert into part values ('P3', 'Screw', 'Blue', 17, 'Rome');

insert into part values ('P4', 'Screw', 'Red', 14, 'London');

insert into part values ('P5', 'Cam', 'Blue', 12, 'Paris');

insert into part values ('P6', 'Cog', 'Red', 19, 'London');

-- relation job:

CREATE TABLE job (

jnum CHAR(3),

jname VARCHAR(15),

city VARCHAR(15)

);

insert into job values ('J1', 'Sorter', 'Paris');

insert into job values ('J2', 'Display', 'Rome');

insert into job values ('J3', 'OCR', 'Athens');

insert into job values ('J4', 'Console', 'Athens');

insert into job values ('J5', 'RAID', 'London');

insert into job values ('J6', 'EDS', 'Oslo');

insert into job values ('J7', 'Tape', 'London');

-- The relations S = SUPPLIER, P = PART, J = JOB, and SPJ is SPJ

-- Q2: a.Print the projects that are supplied by at least one supplier not in the same city.

SELECT DISTINCT SPJ.jnum

FROM SPJ, (SELECT snum, jnum

FROM SUPPLIER, JOB

WHERE NOT (SUPPLIER.CITY = JOB.CITY)) R

WHERE spj.jnum = r.jnum;

Result:

JNUM

---

J1

J2

J3

J4

J5

J6

J7

--Q2b: b.Print the projects that are not supplied with any red part by any London supplier.

SELECT DISTINCT SPJ.jnum

FROM SPJ

WHERE SPJ.jnum NOT IN

(SELECT DISTINCT jnum

FROM SPJ,

(SELECT * FROM SUPPLIER WHERE (SUPPLIER.city = 'London')) R1,

(SELECT * FROM PART WHERE (PART.color = 'Red')) R2

WHERE

R2.city = R1.city and SPJ.snum = R1.snum and SPJ.pnum = R2.pnum) ;

or

SELECT DISTINCT SPJ.jnum

FROM SPJ

MINUS

SELECT DISTINCT jnum

FROM SPJ,

(SELECT * FROM SUPPLIER WHERE (SUPPLIER.city = 'London')) R1,

(SELECT * FROM PART WHERE (PART.color = 'Red')) R2

WHERE

R2.city = R1.city and SPJ.snum = R1.snum and SPJ.pnum = R2.pnum ;

Result:

JNUM

---

J2

J5

J6

-- Q2c Print the parts that are supplied to all projects in London

SELECT pnum-- set 1, all shipments

FROM SPJ

MINUS

SELECT DISTINCT T1.pnum -- set 2, parts not supplied to London

FROM

(SELECT DISTINCT SPJ.pnum, JOB.jnum -- set 3, parts used by London projects

FROM SPJ, JOB

WHERE JOB.city = 'London'

MINUS

SELECT DISTINCT pnum, jnum FROM SPJ) T1;-- set 4, all parts and projects -- that get shipment

Result:

PNUM

---

P3

P5

-- Q2d. d.For each part being supplied to a project, print the part number, the project number, and the corresponding total quantity.

SELECT pnum, jnum, SUM(qty)

FROM SPJ

GROUP BY pnum, jnum;

Result:

PNUMJNUM SUM(QTY)

------

P1 J1 200

P1 J4 800

P2 J2 200

P2 J4 100

P3 J1 600

P3 J2 200

P3 J3 200

P3 J4 700

P3 J5 600

P3 J6 400

P3 J7 800

PNU JNU SUM(QTY)

------

P4 J2 500

P4 J4 800

P5 J2 100

P5 J4 400

P5 J5 500

P5 J7 100

P6 J2 200

P6 J3 300

P6 J4 500

P6 J7 300

21 rows selected.

-- Q2e. For each supplier, print the total number of parts being supplied to all projects.

SELECT snum, SUM(qty) FROM SPJ GROUP BY snum;

Result:

SNU SUM(QTY)

------

S1 900

S2 3200

S3 700

S4 600

S5 3100

-- Q2f. Print the supplier numbers for suppliers supplying some project with part P1 in a quantity greater than the average shipment quantity of part P1 for that project.

SELECT SPJ.snum

FROM SPJ, (SELECT jnum, avg(qty) projectAvg

FROM SPJ

WHERE pnum = 'P1'

GROUP BY jnum) R1

WHERE SPJ.pnum = 'P1' and SPJ.qty > R1.projectAvg AND SPJ.jnum = R1.jnum;

-- or

SELECT R1.snum

FROM(SELECT * from SPJ WHERE pnum = 'P1') R1 ,

(SELECT jnum, avg(qty) projectAvg

FROM SPJ

WHERE pnum = 'P1'

GROUP BY jnum) R2

WHERE R1.qty > R2.projectAvg AND R1.jnum = R2.jnum;

Results

SNU

---

S1

-- Q3a. Change the city of all blue parts to Athens.

UPDATE PART SET city = 'Athens' WHERE color = 'Blue';

-- Q3b b.Delete all parts for which the total number of shipments is less than 1000.

DELETE FROM PART

WHERE PART.pnum IN

(SELECT pnum

FROM SPJ

GROUP BY pnum

HAVING sum(qty) < 1000);

-- or

DELETE FROM PART

WHERE PART.pnum = ANY

SELECT pnum

FROM SPJ

GROUP BY pnum

HAVING sum(qty) < 1000);

-- or (less efficient since a join is involved) (Join happy people would like this one!):

DELETE FROM PART

WHERE pnum IN

(SELECT PART.pnum

FROM PART, SPJ

WHERE PART.pnum = SPJ.pnum

GROUP BY SPJ.pnum, PART.pnum

HAVING sum(SPJ.qty) < 1000);

-- Q3c. Insert a new project (J8) into relation J. The name and city are Multimedia and San Francisco, respectively.

INSERT INTO JOB values ('J8', 'Multimedia', 'San Franncisco');

-- if your orginial name and/or city attribute size is not large enough for this new data, then you will need to modify the column width using the ALTER TABLE tableName MODIFY (columnName newDataDeclaration); For example, to change city's size to 20:

ALTER TABLE JOB MODIFY (city varchar(20));