Name:

Final Exam -- SAMPLE

Name:

Major:

SSN (last 4 digits):

Answer the following questions. Be brief and precise, please.

You have 2 hours to finish the exam.

1. 15 points

Consider a university database for the scheduling of classrooms for final exams. This database could be modeled as the single entity set EXAM, with attributes course, section, room, and time. Alternatively, one or more additional entity sets could be defined, along with relationship sets to replace some of the attributes of the EXAM entity set as follows:

  • COURSE with name, department, c-number
  • SECTION with s-number, enrollment, as a dependent weak entity set on COURSE.
  • ROOM with building, r-number, and capacity.

For these two design alternatives:

  • (10)Show ER diagram illustrating the use of all three additional entity sets.

Very similar design to the University ER for the homework assignments, except the room is not an attribute of the Section entity set but a separate entity set with attributes building, r-number, and capacity. There is a relation hip between Section and Room, e.g., location, with attribute term.

  • (5) Explain what application characteristics would influence a decision to include or not to include one or more of the additional entity sets.
  • Type of usage, e.g., frequency and type of queries
  • Performance requirements
  • Size of the database
  • Importance of preserving constraints/eliminating redundancies
  • Etc.

2. 10 points

  • (5) Let R=(A,B,C,D) and functional dependencies (1) AC, (2) ABD. What is the closureof {A,B}?

{AB}+ = {ABCD}

  • (5) Explain the connection between closure of attributes and keys of a relation.

If the closure of an attribute (set) includes all the attributes of the relation, than the attribute (set) is a key for the relation.

3. 20 points

Consider the following schemas.

R1=(A, B,C)

R2=(A,E,F)

R3-(E,D)

The underlined attributes are the keys. Write both relational algebra and SQL expressions for the following queries:

Query 1:

List the Sum of D’s for each A, where the value of B is 25.

(5) Relational algebra:

Not applicable

(5) SQL:

SELECT R1.A, SUM(D)

FROM R1, R2, R3

WHERE R1.A = R2.A and R2.E = R3.E and B=25

GROUP BY A;

Query 2:

List A and C where D is larger than B.

(5) Relational algebra:

Not applicable

(5) SQL:

SELECT R1.A, C

FROM R1, R2, R3

WHERE R1.A = R2.A and R2.E = R3.E and D > B;

4.15 points

Consider the following two transactions:

T1: r1[B] (if B >=100 then B:=B-100) w1[B]

T2: r2[B] (if B >=100 then B:=B-100) w2[B]

The consistency requirement is that (B>=0). Let B = 100 be the initial value.

  • (5) Show that every serial execution involving T1 and T2 preserves consistency of the database.

T1 completes first, followed by T2 (or T2 completes first followed by T1).

IF T1 completes first, then

Initially B=100

After T1 completesB=0

T2 cannot change the value of B because “if B >= 100” will be false.

  • (5) Show a concurrent execution of T1 and T2 that produces a nonserializable schedule.

T1T2

Initially B=100

T1 reads BB=100

T2 reads BB=100

T1’s if statement it trueB=B-100=0

T2’s if statement is trueB=B-100=0

Final B valueB=0

Both transactions followed integrity but the final database state is incorrect, Since 100+100 = 200 was withdrawn from the initial 100 value of B. B should be -100.

  • (5) Is the final result of your nonserializable schedule satisfies the consistency requirement? Is it correct? Why or why not?

Incorrect result as explained above.

5. 15 points

Given relation schema R(A,B,C,D,E,F,G) and functional dependencies (1) EACB, (2) ACD, and (3) CDF (4) D G

You need to decompose the relation such that the decomposition

  • Reduces redundancies
  • Lossless join and
  • Dependency preserving

Show your decomposition and establish its correctness based on the concepts of BCNF, 3NF, lossless join, and dependency preserving. You may not be able to find a decomposition that satisfies all requirements. Justify your preferences.

3NF guarantees dependency preserving and lossless decomp. I chose that.

FD1 does not violate 3NF. Others do.

Minimal basis for the FDs:

E  A

E  C

E  B

E D

E  F

E  G

A  C

A  D

A  F

A  G

CD  F

D  G

Relations: R1(E,A), R2(E,B), R3(A,C), R4(A,D), R5(C,D,F), R6(D, G)

6. 15points

  • (5) Why are indexesimportant for databases? What are the difficulties of managing indexes during data modification?

Enhances the efficiency of retrieving data items. Depending on the type of index used, database modifications require the modification of index structures. (see chapter 8.3 for additional details.)

  • (5) What are the disadvantages of index-sequential file organization at the presence of insertions and deletions?

The sequential file must be updated according to the index value. This may require extensive reorganization in the data storage.

  • Explain how B+-treesovercome the disadvantage of sequential files?

Easier update of the index structure due to the structure of the B+-tree. Some updates will not require any structural changes.

  • (5) Consider the following B+-tree with n=3. Show the insertion of search-key 12.

7. 10 points

What are the three security objectives. Give an example of each objective in the context of databases.

Integrity: prevent/deter/detect unauthorized modification of resources. E.g., student grades can only be modified by the corresponding professor.

Confidentiality: prevent/deter/detect unauthorized access to resources. E.g., a student’s grade should be accessible by that student and the professor but not the other students.

Availability: prevent/deter/detect denial of access to the resources for the authorized users. E.g., students should be able to view their own grades and transcripts.

Bonus Question

5 points

You are asked to design a database system for a health club. The database would contain data about customers, their training, contact numbers, etc. Show the main steps you would perform for designing and implementing the database. (Don’t show the actual schemas!)

1