University of Cincinnati
College of Evening and Continuing Education
Database Management Systems I
(30-IT-451-002)

Database Design Methodology Summary

This page gives a summary of the Database Design Methodology for Relational Databases and key success factors.

Critical Success Factors in Database Design

The following guidelines may prove to be critical to the success of database design:

·  Work interactively with the users as much as possible.

·  Follow a structured methodology throughout the data modeling process.

·  Employ a data-driven approach.

·  Incorporate structural and integrity considerations into the data models.

·  Combine conceptualization, normalization, and transaction validation techniques into the data modeling methodology.

·  Use diagrams to represent as much of the data models as possible.

·  Use a Database Design Language (DBDL) to represent additional data semantics.

·  Build a data dictionary to supplement the data model diagrams.

·  Be willing to repeat steps.

Database Design and Implementation Phases

The methodology divides database design into three main phases:

·  conceptual,

·  logical, and

·  physical database design.

The steps involved in the main phases of the database design methodology are outlined below.

Step 1 Build Local Conceptual Data Model for Each User View

Build a local conceptual data model of an enterprise for each specific user view.

Step 1.1 Identify entity types

Identify the main entity types in the users’ view of the enterprise. Document entity types.

Step 1.2 Identify relationship types

Identify the important relationships that exist between the entity types that we have identified. Determine the cardinality and participation constraints of relationship types. Document relationship types. Use Entity-Relationship (ER) modeling where necessary.
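
For example, in a hypothetical staffing view, Branch Has Staff might be recorded as a 1:* relationship with mandatory participation on the Staff side (every member of staff is assigned to exactly one branch) and optional participation on the Branch side (a newly opened branch may not yet have staff).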

Step 1.3 Identify and associate attributes with entity or relationship types

Associate attributes with the appropriate entity or relationship types. Identify simple/composite, single-valued, multi-valued, and derived attributes. Document attributes capturing information such as:

·  Attribute name and description.

·  Any aliases or synonyms by which the attribute is known.

·  Data type and length.

·  Default values for the attribute (if specified).

·  Whether the attribute must always be specified (in other words, whether the attribute allows or disallows nulls).

·  Whether the attribute is composite and, if so, which simple attributes make it up.

·  Whether the attribute is derived and, if so, how it should be computed.

·  Whether the attribute is multi-valued.

Step 1.4 Determine attribute domains

Determine domains for the attributes in the local conceptual model. Document attribute domains.

Step 1.5 Determine candidate and primary key attributes

Identify the candidate key(s) for each entity and, if there is more than one candidate key, choose one to be the primary key. Document primary and alternate keys for each strong entity.

A candidate key is an attribute or minimal set of attributes of an entity that uniquely identifies each occurrence of that entity. We may identify more than one candidate key. However, in this case, we must choose one to be the primary key; the remaining candidate keys are called alternate keys. When choosing a primary key from among the candidate keys, use the following guidelines to help make the selection:

·  The candidate key with the minimal set of attributes.

·  The candidate key that is less likely to have its values changed.

·  The candidate key that is less likely to lose uniqueness in the future.

·  The candidate key with fewest characters (for those with textual attribute(s)).

·  The candidate key that is easiest to use from the users' point of view.
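
Although the choice of key is made at the conceptual level, it carries through to the eventual DDL. As a minimal sketch, assuming a hypothetical Staff entity whose candidate keys are staffNo and ssn, the selection might later be declared in SQL as:

    CREATE TABLE Staff (
        staffNo  CHAR(5)     NOT NULL,
        ssn      CHAR(9)     NOT NULL,
        name     VARCHAR(50) NOT NULL,
        PRIMARY KEY (staffNo),   -- chosen primary key: short, stable, easy for users
        UNIQUE (ssn)             -- alternate key: still documented and enforced as unique
    );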

Step 1.6 Consider use of enhanced modeling concepts

Identify superclass and subclass entity types, where appropriate.

Step 1.7 Check model for redundancy

Re-examine one-to-one relationships and remove redundant relationships.

Step 1.8 Validate local conceptual model against user transactions

Ensure that the local conceptual data model supports the transactions required by the user's view of the enterprise.

Step 1.9 Review local conceptual data model with user

Review the local conceptual data model with the user to ensure that the model is a 'true' representation of the user's view of the enterprise.

Before completing Step 1, we should review the local conceptual data model with the user. The conceptual data model includes the ER diagram and the supporting documentation that describes the data model. If any anomalies are present in the data model, we must make the appropriate changes, which may require repeating the previous step(s).

We repeat this process until the user is prepared to 'sign off' the model as being a 'true' representation of the part of the enterprise that we are attempting to model.

Step 2 Build and Validate Local Logical Data Model for each User View

Build a logical data model based on the conceptual data model for each user view of the enterprise, and then validate the model using the technique of normalization and against the required transactions.

·  A logical database schema is a model of the data expressed in terms of the structures supported by a DBMS (relations, in the relational model).

·  Logical design is the process of defining a system's data requirements and grouping elements into logical units.

Step 2.1 Map local conceptual data model to local logical data model

Refine the local conceptual data model to remove undesirable features and to map this model to a local logical data model. Remove the following features, typically by decomposing the offending attribute, entity, or relationship and introducing an intermediate entity (a sketch of the first case follows below):

·  M:N relationships.

·  Complex relationships.

·  Recursive relationships.

·  Multi-valued attributes.

·  Relationships with attributes.

Re-examine relationships to see if there are redundancies.
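
As a sketch of the first point, assuming a hypothetical M:N relationship Student Enrolls-in Course with its own attribute enrolledOn, the decomposition introduces an intermediate entity that participates in two 1:* relationships; in relational terms the result might look like:

    CREATE TABLE Student (
        studentNo CHAR(8)     NOT NULL PRIMARY KEY,
        name      VARCHAR(50)
    );

    CREATE TABLE Course (
        courseNo  CHAR(8)     NOT NULL PRIMARY KEY,
        title     VARCHAR(50)
    );

    -- Intermediate entity replacing the M:N relationship;
    -- the relationship's own attribute (enrolledOn) moves onto it.
    CREATE TABLE Enrollment (
        studentNo  CHAR(8) NOT NULL REFERENCES Student(studentNo),
        courseNo   CHAR(8) NOT NULL REFERENCES Course(courseNo),
        enrolledOn DATE,
        PRIMARY KEY (studentNo, courseNo)
    );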

Step 2.2 Derive relations from local logical data model

Derive relations from the local logical data model to represent the entity and relationships described in the user's view of the enterprise. Document relations and foreign key attributes. Also, document any new primary or candidate keys that have been formed as a result of the process of deriving relations from the logical data model. [Turn the relationships into entities]
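
For example, using the DBDL-style notation mentioned earlier, the relations derived from a hypothetical 1:* relationship Branch Has Staff might be documented as:

    Branch (branchNo, street, city, postcode)
        Primary Key   branchNo

    Staff (staffNo, name, position, branchNo)
        Primary Key   staffNo
        Foreign Key   branchNo references Branch(branchNo)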

Step 2.3 Validate model using normalization

Validate the local logical data model using the technique of normalization. The objective of this step is to ensure that each relation derived from the logical data model is in at least Boyce-Codd Normal Form (BCNF). [In practice, most designs are validated only to third normal form: 1NF - remove repeating groups; 2NF - remove partial dependencies; 3NF - remove transitive dependencies.]
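
As a small hypothetical example, the relation Staff(staffNo, name, branchNo, branchAddress) contains the transitive dependency staffNo -> branchNo -> branchAddress; removing it yields two relations that satisfy 3NF (and BCNF):

    Staff  (staffNo, name, branchNo)      Primary Key  staffNo
    Branch (branchNo, branchAddress)      Primary Key  branchNo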

Step 2.4 Validate model against user transactions

Ensure that the logical data model supports the transactions that are required by the user view. The transactions that are required by each user view can be determined from the user's requirements specification. Using the ER diagram, the data dictionary, and the primary key/foreign key links shown in the relations, attempt to perform the operations manually. [Use a CASE tool to prototype the database tables; use sample data to validate.]

Step 2.5 Define integrity constraints

Identify the integrity constraints given in the user's view of the enterprise. These include specifying the required data, attribute domain constraints, entity integrity, referential integrity, and enterprise constraints. Document all integrity constraints. [Result is a high level view of all constraints]
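
A minimal SQL sketch of how these five kinds of constraint might eventually be written down, extending the earlier hypothetical Staff example (the table and column names are assumptions for illustration only):

    CREATE TABLE Staff (
        staffNo  CHAR(5)      NOT NULL,                       -- required data
        salary   DECIMAL(8,2) NOT NULL CHECK (salary > 0),    -- attribute domain constraint
        branchNo CHAR(4)      NOT NULL,
        PRIMARY KEY (staffNo),                                -- entity integrity
        FOREIGN KEY (branchNo) REFERENCES Branch(branchNo),   -- referential integrity
        CHECK (salary <= 100000)                              -- enterprise constraint (business rule)
    );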

Step 2.6 Review local logical data model with user

Ensure that the local logical data model is a true representation of the user view.

Step 3 Build and Validate Global Logical Data Model

Combine the individual local logical data models into a single global logical data model that can be used to represent the part of the enterprise that we are interested in modeling.

Step 3.1 Merge local logical data models into global model

Merge the individual local logical data models into a single global logical data model of the enterprise. Some typical tasks in this approach are as follows:

·  Review names of entities and their primary keys.

·  Review the names of relationships.

·  Merge entities from the local views.

·  Include (without merging) entities unique to each local view.

·  Merge relationships from the local views.

·  Include (without merging) relationships unique to each local view.

·  Check for missing entities and relationships.

·  Check foreign keys.

·  Draw the global logical data model.

·  Update the documentation.

Step 3.2 Validate global logical data model

Validate the global logical data model using normalization and against the required transactions, if necessary. This step is equivalent to Steps 2.3 and 2.4, in which we validated each local logical data model.

Step 3.3 Check for future growth

Determine whether there are any significant changes likely in the foreseeable future, and assess whether the global logical data model can accommodate these changes.

Step 3.4 Review global logical data model with users

Ensure that the global logical data model is a true representation of the enterprise.

Step 4 Translate Global Logical Data Model for Target DBMS

Produce a basic working relational database schema from the global logical data model.
In this step you will find out:

·  Whether the system supports the definition of primary keys, foreign keys, and alternate keys.

·  Whether the system supports the definition of required data (that is, whether the system allows attributes to be defined as NOT NULL).

·  Whether the system supports the definition of enterprise constraints.

·  How to create base relations.

Step 4.1 Design base relations for target DBMS

Decide how to represent the base relations we have identified in the global logical data model in the target DBMS. Document design of relations. Collate and assimilate the information about relations produced during logical data modeling. For each relation identified
in the global logical data model, we have a definition consisting of:

·  The name of the relation.

·  A list of simple attributes in brackets.

·  The primary key and, where appropriate, alternate keys (AK) and foreign keys (FK).

·  Integrity constraints for any foreign keys identified.

From the data dictionary, we also have, for each attribute:

·  Its domain, consisting of a data type, length, and any constraints on the domain.

·  An optional default value for the attribute.

·  Whether the attribute can hold nulls.

·  Whether the attribute is derived and, if so, how it should be computed.
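
Pulling this information together, a hypothetical relation documented in DBDL might be rendered for a target DBMS that supports the SQL2 features listed above roughly as follows (all names, domains, and defaults are assumed for illustration):

    -- DBDL:  PropertyForRent (propertyNo, street, city, rooms, rent, branchNo)
    --        Primary Key  propertyNo
    --        Foreign Key  branchNo references Branch(branchNo)
    CREATE DOMAIN RoomCount AS SMALLINT CHECK (VALUE BETWEEN 1 AND 15);

    CREATE TABLE PropertyForRent (
        propertyNo CHAR(5)      NOT NULL,
        street     VARCHAR(25)  NOT NULL,
        city       VARCHAR(15)  NOT NULL,
        rooms      RoomCount    DEFAULT 4,     -- default value from the data dictionary
        rent       DECIMAL(7,2) NOT NULL,
        branchNo   CHAR(4)      NOT NULL,
        PRIMARY KEY (propertyNo),
        FOREIGN KEY (branchNo) REFERENCES Branch(branchNo)
    );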

Step 4.2 Design representation of derived data

Step 4.3 Design enterprise constraints for target DBMS

Design the enterprise constraint rules for the target DBMS. Document the design of the enterprise constraints. Mechanisms for implementing relations and integrity constraints include:

·  The 1992 ISO SQL standard (SQL2).

·  Triggers.

·  INGRES 6.4.

·  Unique indexes.

·  Application code (stored procedures, client application logic, GUI validation, ...).
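
As a hedged sketch of the SQL2 option, a hypothetical enterprise rule such as "no branch may handle more than 100 properties at a time" could be expressed as an assertion. The syntax is defined in the standard but few DBMSs implement it, so in practice the same rule is often coded as a trigger or in application logic:

    CREATE ASSERTION BranchNotOverloaded
        CHECK (NOT EXISTS (SELECT branchNo
                           FROM   PropertyForRent
                           GROUP  BY branchNo
                           HAVING COUNT(*) > 100));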

Step 5 Design Physical Representation

Determine the file organizations and access methods that will be used to store the base relations; that is, the way in which relations and tuples will be held on secondary storage. A physical database schema is a plan of how to store data on a particular system.

Step 5.1 Analyze transactions

Understand the functionality of the transactions that will run on the database and analyze the important transactions.
For each transaction, we should determine:

·  The expected frequency at which the transaction will run.

·  The relations and attributes accessed by the transaction and the type of access; that is, query, insert, update, or delete.

·  The attributes used in any predicates (in SQL, the predicates are the conditions specified in the WHERE clause). Check whether the predicates involve pattern matching, range searches, or exact match key retrieval.

·  For a query, the attributes that are involved in the join of two or more relations.

·  The time constraints imposed on the transaction.
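
For instance, a hypothetical high-frequency query such as the one below would be recorded together with its frequency, predicates, and join attributes, which become candidates for the file organization and index choices in Steps 5.2 and 5.3 (the names and figures are assumed):

    -- Assumed to run ~100 times per day: exact-match predicate on city,
    -- range predicate on rent, join on branchNo.
    SELECT p.propertyNo, p.street, p.rent, b.city
    FROM   PropertyForRent p JOIN Branch b ON p.branchNo = b.branchNo
    WHERE  b.city = 'Cincinnati' AND p.rent BETWEEN 300 AND 600;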

Step 5.2 Choose file organizations

Determine an efficient file organization for each base relation.
Candidate file organizations include the following types:

·  Heap.

·  Hash.

·  Indexed Sequential Access Method (ISAM).

·  B+-Tree.

Step 5.3 Choose secondary indexes

Determine whether adding secondary indexes will improve the performance of the system. Secondary indexes provide a mechanism for specifying an additional key for a base relation that can be used to retrieve data more efficiently.
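
Continuing the hypothetical transaction from Step 5.1, secondary indexes on its predicate and join attributes might be declared as follows (CREATE INDEX syntax varies slightly between DBMSs):

    CREATE INDEX idx_property_branch ON PropertyForRent (branchNo);  -- supports the join
    CREATE INDEX idx_property_rent   ON PropertyForRent (rent);      -- supports the range search
    CREATE INDEX idx_branch_city     ON Branch (city);               -- supports the exact match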

Step 5.4 Estimate disk space requirements

Estimate the amount of disk space that will be required by the database.

Step 6 Design User Views

Design the user views that were identified during the requirements collection and analysis stage.
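
As a hypothetical sketch, a view restricting one branch's manager to the properties handled by that branch might be defined as:

    CREATE VIEW Branch003Properties AS
        SELECT propertyNo, street, city, rooms, rent
        FROM   PropertyForRent
        WHERE  branchNo = 'B003';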

Step 7 Design Security Mechanisms

Design the security measures for the database implementation as specified by the users.
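
A minimal sketch of one such measure, assuming the view from Step 6 and a hypothetical database user Manager3:

    GRANT SELECT ON Branch003Properties TO Manager3;        -- access only through the view
    REVOKE ALL PRIVILEGES ON PropertyForRent FROM PUBLIC;   -- no direct access to the base relation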

Step 8 Consider the Introduction of Controlled Redundancy

Determine whether introducing redundancy in a controlled manner by relaxing the normalization rules will improve the performance of the system. Consider introducing derived data and duplicating attributes or joining relations together.
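
For example, if the query analyzed in Step 5.1 runs very frequently, the city attribute might be duplicated from Branch into PropertyForRent so that the join can be avoided; this is a hypothetical trade-off, not a recommendation:

    ALTER TABLE PropertyForRent ADD COLUMN city VARCHAR(15);
    -- Removes the join from the frequent query, at the cost of keeping the two
    -- copies of city consistent (for example by a trigger or application code)
    -- whenever a branch moves.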

Step 9 Monitor and Tune the Operational System

Monitor the operational system and improve the performance of the system to correct inappropriate design decisions or reflect changing requirements.

Database Terminology

This list of terms was taken from the class book. The terms are listed in the order in which we will study them.

File-Based System

A collection of application programs that perform services for the end-users, such as the production of reports. Each program defines and manages its own data.

Database

A shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization.

DBMS

A software system that enables users to define, create, and maintain the database and provides controlled access to this database.

Conceptual level

The community view of the database. This level describes what data is stored in the database and the relationships among the data.

Internal level

The physical representation of the database on the computer. This level describes how the data is stored in the database.

Logical data independence

Logical data independence refers to the immunity of external schemas to changes in the conceptual schema.

Physical data independence

Physical data independence refers to the immunity of the conceptual schema to changes in the internal schema.

DDL

A descriptive language that allows the DBA or user to describe and name the entities required for the application and the relationships that may exist between the different entities.
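
A typical DDL statement, for a hypothetical entity:

    CREATE TABLE Department (
        deptNo   CHAR(4)     NOT NULL PRIMARY KEY,
        deptName VARCHAR(30) NOT NULL
    );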

DML

A language that provides a set of operations to support the basic data manipulation operations on the data held in the database.
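
Typical DML statements against the same hypothetical table:

    INSERT INTO Department (deptNo, deptName) VALUES ('D001', 'Accounting');
    UPDATE Department SET deptName = 'Finance' WHERE deptNo = 'D001';
    SELECT deptNo, deptName FROM Department;
    DELETE FROM Department WHERE deptNo = 'D001';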