Principles of Databases

INF3703/202/01/2011

PRINCIPLES OF DATABASES

Tutorial letter 202/

Studiebrief 202

CONTENTS / INHOUD:

Solution to assignment 2 / Oplossing vir werkopdrag 2

SCHOOL OF COMPUTING

SKOOL VIR REKENAARKUNDE

INF3703/202/01/2011

1.Answers to Assignment 1

Question 1

1.1What is normalization?

Answer to 1.1:

Normalization is the process for assigning attributes to entities. Properly executed, the normalization process eliminates uncontrolled data redundancies, thus eliminating the data anomalies and the data integrity problems that are produced by such redundancies.

Normalization does not eliminate data redundancy; instead, it produces the carefully controlled redundancy that lets us properly link database tables.

1.2When is a table in 1NF?

Answer to 1.2:

A table is in 1NF when all the key attributes are defined (no repeating groups in the table) and when all remaining attributes are dependent on the primary key. However, a table in 1NF still may contain partial dependencies, i.e., dependencies based on only part of the primary key and/or transitive dependencies that are based on a non-key attribute.

1.3When is a table in 2NF?

Answer to 1.3:

A table is in 2NF when it is in 1NF and it includes no partial dependencies. However, a table in 2NF may still have transitive dependencies, i.e., dependencies based on attributes that are not part of the primary key.

1.4When is a table in 3NF?

Answer to 1.4:

A table is in 3NF when it is in 2NF and it contains no transitive dependencies.

1.5When is a table in BCNF?

Answer to 1.5:

A table is in Boyce-Codd Normal Form (BCNF) when it is in 3NF and every determinant in the table is a candidate key. For example, if the table is in 3NF and it contains a nonprime attribute that determines a prime attribute, the BCNF requirements are not met.This description clearly yields the following conclusions:

If a table is in 3NF and it contains only one candidate key, 3NF and BCNF are equivalent.
BCNF can be violated only if the table contains more than one candidate key. Putting it another way, there is no way that the BCNF requirement can be violated if there is only one candidate key.

Question 2

2.1Using the INVOICE table structure shown in table 1, write the relational schema, draw its dependency diagram and identify all dependencies (including all partial and transitive dependencies). You can assume that the table does not contain repeating groups and that any invoice number may reference more than one product. (Hint: This table uses a composite primary key.)

Table 1

Attribute Name / Sample Value / Sample Value / Sample Value / Sample Value / Sample Value
INV_NUM / 211347 / 211347 / 211347 / 211348 / 211349
PROD_NUM / AA-E3422QW / QD-300932X / RU-995748G / AA-E3422QW / GH-778345P
SALE_DATE / 15-Jan-2006 / 15-Jan-2006 / 15-Jan-2006 / 15-Jan-2006 / 16-Jan-2006
PROD_LABEL / Rotary sander / 0.25-in. drill bit / Band saw / Rotary sander / Power drill
VEND_CODE / 211 / 211 / 309 / 211 / 157
VEND_NAME / NeverFail, Inc. / NeverFail, Inc. / BeGood, Inc. / NeverFail, Inc. / ToughGo, Inc.
QUANT_SOLD / 1 / 8 / 1 / 2 / 1
PROD_PRICE / $49.95 / $3.45 / $39.99 / $49.95 / $87.75

Answer to 2.1:

Note:QUANT_SOLD = NUM_SOLD

2.2Using the initial dependency diagram drawn in question 2.1, remove all partial dependencies, draw the new dependency diagrams, and identify the normal forms for each table structure you created.

Answer to 2.2:

2.3Using the table structures you created in question 2.2, remove all transitive dependencies and draw the new dependency diagrams. Also identify the normal forms for each table structure you created

Answer to 2.3:

Question 3

Some Tiny University staff employees are information technology (IT) personnel. Some IT personnel provide technology support for academic programs. Some IT personnel provide technology infrastructure support. Some IT personnel provide technology support for academic programs and technology infrastructure support. IT personnel are not professors. IT personnel are required to take periodic training to retain their technical expertise. Tiny University tracks all IT personnel training by date, type, and results (completed vs. not completed). Given that information, create the complete ERD containing all primary keys, foreign keys, and main attributes.

Answer:

Question 4

4.1Explain the following statement: a transaction is a logical unit of work.

Answer to 4.1:
A transaction is a logical unit of work that must be entirely completed of aborted; no intermediate states are accepted. In other words, a transaction, composed of several database requests, is treated by the DBMS as a unit of work in which all transaction steps must be fully completed if the transaction is to be accepted by the DBMS.

Acceptance of an incomplete transaction will yield an inconsistent database state. To avoid such a state, the DBMS ensures that all of a transaction's database operations are completed before they are committed to the database. For example, a credit sale requires a minimum of three database operations:

4.2What is concurrency control?

Answer to 4.2:

Concurrency control is the activity of coordinating the simultaneous execution of transactions in a multiprocessing or multiuser database management system. The objective of concurrency control is to ensure the serializability of transactions in a multiuser database management system. (The DBMS's scheduler is in charge of maintaining concurrency control.) Because it helps to guarantee data integrity and consistency in a database system, concurrency control is one of the most critical activities performed by a DBMS. If concurrency control is not maintained, three serious problems may be caused by concurrent transaction execution: lost updates, uncommitted data, and inconsistent retrievals.

4.3 What is the advantages of DDBMS?

Answer to 4.3:

Data are located near the greatest demand site
Faster access
Faster data access
Growth facilitation

4.4What is the disadvantages of DDBMS?

Answer to 4.4:

Complexity of management and control
Security
Lack of standards
Increased storage requirements

Question 5

5.1Explain the difference between a distributed database and distributed processing

Answer to 5.1:

Distributed processing, a database’s logical processing is shared among two or more physically independent sites that are connected through a network. For example the data input/output(I/O), data selection and data validation might be performed on one computer, and a report based on that data might be on another computer. Distributed database, on the other hand, stores a logically related database over two or more physically independent sites. The sites are connected via computer network.

5.2Describe the different types of database requests and transactions.

Answer to 5.2:

A database transaction is formed by one or more database requests. Each database request is the equivalent of a single SQL statement. The basic difference between a local transaction and a distributed transaction is that the latter can update or request data from several remote sites on a network. In a DDBMS, a database request and a database transaction can be of two types: remote or distributed.

A remote request accesses data located at a single remote database processor (or DP site). In other words, an SQL statement (or request) can reference data at only one remote DP site.

A remote transaction, composed of several requests, accesses data at only a single remote DP site.

Adistributed transaction allows a transaction to reference several different local or remote DP sites. Although each single request can reference only one local or remote DP site, the complete transaction can reference multiple DP sites because each request can reference a different site. Use Figure 12.12 to illustrate the distributed transaction.

A distributed request lets us reference data from several different DP sites. Since each request can access data from more than one DP site, a transaction can access several DP sites. The ability to execute a distributed request requires fully distributed database processing because we must be able to:

Partition a database table into several fragments.
Reference one or more of those fragments with only one request. In other words, we must have fragmentation transparency.

The distributed request feature also allows a single request to reference a physically partitioned table.

5.3 What is the objective of query optimisation function?

Answer to 5.3:

The objective of query optimization functions is to minimize the total costs associated with the execution of a database request. The costs associated with a request are a function of:the access time (I/O) cost involved in accessing the physical data stored on disk

the communication cost associated with the transmission of data among nodes in distributed database systems
the CPU time cost.

It is difficult to separate communication and processing costs. Queryoptimization algorithms use different parameters, and the algorithms assign different weight to each parameter. For example, some algorithms minimize total time, others minimize the communication time, and still others do not factor in the CPU time, considering it insignificant relative to the other costs. Query optimization must provide distribution and replica transparency in distributed database systems.

5.4 What is XML and why is it important?

Answer to 5.4:

Extensible Markup Language (XML) is a metalanguage used to represent and manipulate data elements. XML is designed to facilitate the exchange of structured documents, such as orders and invoices, over the internet.

2.1Extras exercises to complete

Question 1:Problem 7 on page 200 of textbook

Question 2: Problem 7 on page 234 of textbook

Question 3: Problem 11 and 12 on page 284 of textbook

Answer to Question 1:

Solution to Problem 7 on page 200.

Answer to Question 2:

Solution to Problem 7on page 238

Answer to Question 3:

Solution to Problem 11 & 12 on page 284

The relational schemas are written as:

EMPLOYEE(EMP_CODE, EMP_LNAME, DEPT_CODE, JOB_CLASS, EMP_DOB,EMP_HIREDATE)

DEPENDENT(EMP_CODE, DEP_NUM, DEP_FNAME, DEP_TYPE)

DEPARTMENT(DEPT_CODE, DEPT_NAME, EMP_CODE)

JOB(JOB_CLASS, JOB_TITLE, JOB_BASE_SALARY)

EDUCATION(EDUC_CODE, EDUC_DESCRIPTION)

QUALIFICATION(EMP_CODE, EDUC_CODE, QUAL_DATE_EARNED)

UNISA 2011