Final Exam

Advanced Database System

Semester 1, Academic Year 2007

Due date: October5, 2007

Part 1: Provide brief explanation, description, or discussion. State any assumptions you might have to avoid lengthy answers.(20 points, each worth 2.5)

1.1) Explain the difference between conceptual, logical, and physical database design. Why might these tasks be carried out by different people?

1.2) The consistency and reliability aspects of transactions are due to the ‘ACIDity’ properties of transactions. Discuss each of these properties and how they relate to the concurrency control and recovery mechanisms. Give examples to illustrate your answer

1.3) Some problems such as lost update, uncommitted dependency, and inconsistent analysis may arise when concurrent access to the database is allowed in a multi-user environment. Please discuss a mechanism for concurrency control that can be used to ensure such problems cannot occur. Also show how the mechanism can prevent the problems from occurring.

1.4) What is a timestamp? How do timestamp-based protocols for concurrency control differ from locking based protocols?

1.5) Discuss the difference between pessimistic and optimistic concurrency control?

1.6) Briefly discuss how the log file (or journal) is a fundamental feature in any recovery mechanism. Explain what is meant by forward and backward recovery and describe how the log file is used in forward and backward recovery. What is the significance of the write-ahead log protocol? How do checkpoints affect the recovery protocol?

1.7) Compare and contrast a DDBMS with distributed processing. Under what circumstances would you choose a DDBMS over distributed processing?

1.8) What are the strategic objectives for the definition and allocation of fragments?

Part 2).Provide graph, tree, etc. representation.(20 points)

2.1) Produce a wait-for-graph (WFG) for the following transaction scenario in a centralized database system and determine whether deadlock exists. (3 points)

Transaction / Data items locked by transaction / Data items that transaction is waiting for
T1 / x2 / x1 , x3
T2 / x3 , x10 / x7 , x8
T3 / x8 / x4 , x5
T4 / x7 / x1
T5 / x1 , x5 / x3
T6 / x4 , x9 / x6
T7 / x6 / x5

2.2) Consider five transactions T1, T2, T3, T4, and T5 with:

T1 initiated at site S1 and spawning an agent at site S2,

T2 initiated at site S3 and spawning an agent at site S1,

T3 initiated at site S1 and spawning an agent at site S3,

T4 initiated at site S2 and spawning an agent at site S3,

T5 initiated at site S3.

The locking information for these transactions is shown in following table. (5 points)

(a) Produce the local wait-for-graphs (LWFGs) for each of the sites (similar to what is shown in example 23.1). What can you conclude from the local WFGs?

(b) Obermarck’s method for distributed deadlock detection is said to be more robust than the above method and it works as follows:

- adding an external node Text to the LWFGs to indicates remote agent,

- if an LWFG contains a cycle that does not involve Text, then the site and DDBMS are in deadlock,

- global deadlock may exist if LWFG contains a cycle involving Text and to determine if there is deadlock, the graphs have to be merged.

From the above transactions, demonstrate how Obermarck’s method works. What can we conclude from the global WFG?

2.3) (A simplified version of the exercise on page 732)

A multinational engineering company has decided to distribute its project management information at the regional level in mainland Britain. The current centralized relational schema is as follows:

Employee(NIN, fName, lName, address, DOB, sex, salary, taxCode, deptNo)

Department(deptNo, deptName, managerNIN, businessAreaNo, regionNo)

Project(projNo, projName, contractPrice, projectManagerNIN, deptNo)

WorksOn(NIN, projNo, hoursWorked)

Business(businessAreaNo, businessAreaName)

Region(regionNo, regionName)

Suppose the sub-regions have been laid out according to the following predicates:

{

regionNo = 1 (‘Scotland’) and businessAreaNo = 1 (‘SE’),

regionNo = 1 (‘Scotland’) and businessAreaNo = 2 (‘ME’),

regionNo = 2 (‘Wales’) and businessAreaNo = 2 (‘ME’),

regionNo = 3 (‘England’) and businessAreaNo = 1 (‘SE’),

regionNo = 3 (‘England’) and businessAreaNo = 2 (‘ME’),

regionNo = 3 (‘England’) and businessAreaNo = 3 (‘EL’)

}(6 points)

(a) Use primary horizontal fragmentation for relationDepartment, write relational algebra for each fragment and show the relational algebra of how the original relation to be reconstructed.

(b) Use vertical fragmentation for relationEmployee, write relational algebra for each fragment and show the relational algebra of how the original relation to be reconstructed.

(c) Use derived fragmentation for relationProjects based onsub-regions given in (a), write relational algebra for each fragment and show the relational algebra of how the original relation to be reconstructed.

2.4) From the DreamHome relational schema, two of its relations are shown below:

Staff(staffNo, fName, lName, position, sex, DOB, salary, branchNo)

Branch(branchNo, address, street, city, zipcode)

Assuming the relations are fragmented as followed:

B1: branchNo=‘B003’ (Branch)

B2: branchNo!=‘B003’ (Branch)

S1:staffNo, position, sex, DOB, salary(Staff)

S2:staffNo, fName, lName, branchNo(Staff)

S21:branchNo=‘B003’ (S2)

S22:branchNo!=‘B003’ (S2)

Suppose a manager at the headquarter would like to list the staff names who work at Branch B003 along with the city name. The SQL is written as:

SELECT fName, lName, city

FROM Branch B, Staff S

WHERE B.branchNo = S.branchNo AND B.branchNo = ‘B003’;

Show the generic relational algebra tree for this query and give the reduced tree. (6 points)

(Hint: see examples 23.2, 23.3, and 23.4)