Unit -3
DBMS (Database Management System)
Data Management
Without data and the ability to process it, an organization could not successfully complete most business activities. If data to be transformed into meaningful and useful information it must first be organized in a meaningful way.
The hierarchy of data
Data Hierarchy refers to the systematic organization of data, often in a hierarchical form. Data organization involves fields, records, files and so on.
A data field holds a single fact. Consider a date field, e.g. "September 19, 2004". This can be treated as a single date field (e.g birthdate), or 3 fields, namely, month, day of month and year.
A record is a collection of related fields. An Employee record may contain a name field(s), address fields, birthdate field and so on.
A file is a collection of related records. If there are 100 employees, then each employee would have a record (e.g. called Employee Personal Details record) and the collection of 100 such records would constitute a file (in this case, called Employee Personal Details file).
Files are integrated into a database. This is done using a Database Management System. If there are other facets of employee data that we wish to capture, then other files such as Employee Training History file and Employee Work History file could be created as well.
The above is a view of data seen by a computer user. The above structure can be seen in the hierarchical model, which is one way to organize data in a database.
In terms of data storage, data fields are made of bytes and these in turn are made up of bits.
The smallest element is a bit. (0 & 1)
8 bits= 1 byte
1 byte= a character
Entity, Attributes and keys
An entity is a generalized class of people, place or things (objects).
An attribute is a characteristic of an entity.
The specific value of an attribute called a data item
Any subset of a relation is called key.
Attributes
It is possible to define a set of entities and the relationships among them in a number of different ways. The main difference is in how we deal with attributes.
Example
Consider the entity set employee with attributes employee-name and phone number. We could argue that the phone be treated as an entity itself, with attributes phone-number and location. Then we have two entity sets, and the relationship set Emp Phn defining the association between employees and their phones.
Keys in DBMS (RDBMS)
A key is a single or combination of multiple fields in a table. It is used to fetch or retrieve records/data-rows from data table according to the condition/requirement. Keys are also used to create relationship among different database tables or views.
Types of SQL Keys
We have following types of keys in SQL which are used to fetch records from tables and to make relationship among tables or views.
1. Super Key
Super key is a set of one or more than one keys that can be used to identify a record uniquely in a table.Example : Primary key, Unique key, Alternate key are subset of Super Keys.
2. Candidate Key
A Candidate Key is a set of one or more fields/columns that can identify a record uniquely in a table. There can be multiple Candidate Keys in one table. Each Candidate Key can work as Primary Key.
Example: In below diagram ID, RollNo and EnrollNo are Candidate Keys since all these three fields can be work as Primary Key.
3. Primary Key
Primary key is a set of one or more fields/columns of a table that uniquely identify a record in database table. It cannot accept null, duplicate values. Only one Candidate Key can be Primary Key.
4. Alternate key
A Alternate key is a key that can be work as a primary key. Basically it is a candidate key that currently is not primary key.
Example: In below diagram RollNo and EnrollNo becomes Alternate Keys when we define ID as Primary Key.
5. Composite/Compound Key
Composite Key is a combination of more than one fields/columns of a table. It can be a Candidate key, Primary key.
E.g. of Composite Key, if we have used “Name, Address” as a Primary Key then it will be our Composite Key.
6. Unique Key
Unique key is a set of one or more fields/columns of a table that uniquely identify a record in database table. It is like Primary key but it can accept only one null value and it cannot have duplicate values.
7. Foreign Key
Foreign Key is a field in database table that is Primary key in another table. It can accept multiple null, duplicate values.
8. Secondary Key
The attributes that are not even the Super Key but can be still used for identification of records (not unique) are known as Secondary Key.
Data Models, Schemas and Instances
Data Model is a collection of concepts that can be used to describe the structure of database.
Structure of database means data types, relationships and constraints. In addition, most data model includes a set of basic operations for specifying retrievals and modifications on the database. Data Model provides a means to achieve Data Abstraction.
Data Abstraction is refers to the hiding of certain details of how the data are stored and maintained. With several levels of abstraction, the user’s view of the database is simplified and this leads to the improved understanding of data.
There are three levels of abstractions:
View level:
The highest level of abstraction describes only part of the entire database. Many users will not be concerned with the large database. Instead, they need to access only a part of it so that view level abstraction is defined. There are many views for the same database.
Logical level:
This level describes what data are stored in the whole database.
Physical level:
The lowest level of abstraction describes how the data are actually stored
Database Schema
Definition:
Overall design of data base. Schema contains 'No of records +
Type of data + No of attributes'
Database Instance
Definition:
The term instance is typically used to describe a complete database environment, including the RDBMS software, table structure, stored procedures and other functionality. It is most commonly used when administrators describe multiple instances of the same database. (At a particular point of time)
Also Known As: environment
Introduction (DBMS)
File Based Approach (Traditional approach)
One way to keep the information on a computer is to store it in the permanent files. The system has a number of application programs; each of them is defined to manipulate the data files. These application programs have been written on request of the users in the organization. New application will be added to the system as the need arises. The system just described is called the file-based system. Consider a traditional banking system which using the file-based system in managing the organization’s data in the picture below. As we can see, there are different departments in the Bank, each of them have their own applications which manage and manipulate different data files. For Banking system, the programs can be the one to debit or credit an account, find the balance of an account, add a new mortgage loan or generate monthly statements etc.
File-based approach for banking system
Keeping organizational information in this approach has a number of disadvantages, including:
Disadvantages
a.Data Redundancy:
Since files and applications are created by different programmer of various departments over long period of time, it might lead to several problems:
1. Inconsistency in data format
2. The same information may be kept in several different places (files).
3. Data inconsistency which means various copies of the same data are conflicting; waste storage space and duplication of effort
b. Data Isolation
It is difficult for new application to retrieve the appropriate data which might be stored in various files.
c. Integrity problems
1. Data values must satisfy certain consistency constraints which are specified in the application programs.
2. It is difficult to add change the programs to enforce new constraint
d. Security problems
1. There are constraint regarding accessing privileges
2. Application is added to the system in the ad-hoc manner so it is difficult to enforce those constraints Data may be accessed by many applications that have not been coordinated previously so it is not easy to provide a strategy to support multiple users to update data simultaneously These difficulties have prompted the development of a new approach in managing large amount of organizational information – database approach.
The Database Approach
Database and database technology play an important role in most of social areas where computer are used, including business, education, medicine etc.
Fundamental Concepts
Database is a shared collection of related data which will be used to support the activities of particular organization. Database can be viewed as a repository of data that is defined once and then is accessed by various users. A database has the following properties: It is a representation of some aspect of the real world; or perhaps, a collection of data elements (facts) representing real world information. Database is logical coherent and internally consistent. Database is designed, built, and populated with data for a specific purpose.
DBMS
Database Management System (DBMS) is a collection of programs that enable users to create, maintain database and control all the access to the database. The primary goal of the DBMS is to provide an environment that is both convenient and efficient for user to retrieve and store information. Application program accesses the data stored in the database by sending requests to the DBMS.
The components of a database system
Database approach for banking system
Characteristics of Database approach
There are a number of characteristics that distinguish the database approach with the file-based approach. In this section, we describe in detail some of those important characteristics.
1. Self-Describing Nature of a Database System:
Database System contains not only the database itself but also the descriptions of data structure and constraints (meta-data). This information is used by the DBMS software or database users if needed. This separation makes database system totally different from traditional file-based system in which data definition is a part of application programs
2. Insulation between Program and Data:
In the file base system, the structure of the data files is defined in the application programs so if user wants to change the structure of a file, all the programs access to that files might need to be changed. On the other hand, in database approach, data structure is stored in the system catalog not in the programs so such changes might not occur.
3. Support multiple views of data:
A view is a subset of the database which is defined and dedicated for particular users of the system. Multiple users in the system might have different views of the system. Each view might contain only the interested data of an user or a group of user. Sharing of data and Multi user system: A multi user database system must allow multiple users access the database at the same time. As the result, the multi user DBMS must have concurrency control strategies to ensure that several user try to access the same data item at a time do so in the manner so that the data always be correct.
Benefits of Database Approach (Advantages of DBMS)
1. To control Data Redundancy
In the Database approach, ideally each data item is stored in only one place in the database. However, in some case redundancy is still exists to improving system performance, but such redundancy is controlled and kept to minimum
2. Data Sharing
The integration of the whole data in an organization leads to the ability to produce more information from a given amount of data
3. Enforcing Integrity Constraints
DBMSs should provide capabilities to define and enforce certain constraints such as data type, data uniqueness.
4. Restricting Unauthorized Access
Not all users of the system have the same accessing privileges. DBMSs should provide a security subsystem to create and control the user accounts.
5. Data Independence
The system data descriptions are separated from the application programs. Changes to the data structure is handled by the DBMS and not embedded in the program.
6. Transaction Processing
The DBMS must include concurrency control subsystem to ensure that several users trying to update the same data do so in a controlled manner so that the result of the updates is correct.
7. Providing multiple views of data
A view may be a subset of the database. Various users may have different views of the database itself. Users may not need to be aware of how and where the data they refer to is stored
8. Providing backup and recovery facilities
If the computer system fails in the middle of a complex update program, the recovery subsystem is responsible for making sure that the database is restored to the stage it was in before the program started executing.
9. Easier modification and updating
10. A framework for program development
11. Improved data integrity
The Main disadvantages of DBMS
· Cost of Hardware and Software
A processor with high speed of data processing and memory of large size is required to run the DBMS software. It means that you have to upgrade the hardware used for file-based system. Similarly, DBMS software is also very costly,.
· Cost of Data Conversion
When a computer file-based system is replaced with database system, the data stored into data file must be converted to database file. It is very difficult and costly method to convert data of data file into database. You have to hire database system designers along with application programmers. Alternatively, you have to take the services of some software house. So a lot of money has to be paid for developing software.
· Cost of Staff Training
Most database management system are often complex systems so the training for users to use the DBMS is required. Training is required at all levels, including programming, application development, and database administration. The organization has to be paid a lot of amount for the training of staff to run the DBMS.