Entities and Relationships

Focus

Identifying entities and relationships

Purpose. This assignment introduces a single table (or ‘flat file’) representation of a marketing database. The goal is two-fold: 1) learn to identify entities embedded in data and 2) observe the data redundancy which results when entities and relationships are modeled in one table.

Description. The marketing department in an investment firm is developing a database to improve client reporting. This database will extract information from the existing computer systems, integrate the information and generate the reports. We will begin this development process with two entities Client and Fund, and one relationship HasFund. An entity is a classification of objects in the real-world application domain which share descriptive properties. An entity can be viewed as an abstraction that is characterized by one or more properties. The values of the properties are the data in the database. The Client entity represents information about the investors such as Client name and address. The Fund entity represents information about the products in which a client invests. The Contact Person is the individual handling the client’s requests. Consider the sample table below.

FundCode / tblFund.Name / Type / ClientCode / tblClient.Name / ContactPerson
104 / Domestic Fixed Income Fund / DFI / GEEL / General Electric Company / Tom Smith
104 / Domestic Fixed Income Fund / DFI / NYCM / New York City Municipal Employees Retirement / Joan Sikorski
104 / Domestic Fixed Income Fund / DFI / STNJ / State of New Jersey Public Employees Retirement Agency / Thomas Thomas
110 / International Large Capitalization Equity Fund / IE / LIPA / Long Island Power Authority / Annette Francis
150 / Domestic Small Capitalization Equity Fund / DE / LIPA / Long Island Power Authority / Annette Francis
220 / Domestic Large Capitalization Equity Fund / DE / IBME / IBM Employees Retirement Group / Annette Francis

1. There are six columns (or properties) in the table. Classify these columns as belonging either the Client or Fund entity.

2. In the table, Domestic Fixed Income Fund is repeated three times. Why ?

3. Long Island Power Authority is repeated twice. Why ?

4. Looking at the fund properties identified in (1), which property can serve as a unique identifier of funds? Note, for any two distinct funds, the respective unique identifiers are not the same.

5. Looking at the client properties identified in (1), which property can serve as a unique identifier of clients?

6. Construct tables corresponding to the Fund and Client entities. The properties become columns in the tables. Copy the values from the above table but do not repeat rows. The resultingtables will be called tblFund and tblClient respectively. To save time, abbreviate the values for fund and client names. How many rows are in each table?

7. When we partitioned the table on page 1 into tblClient and tblFund in question 6, we eliminated redundant rows (a good thing), but lost information regarding the relationship between a client and the funds invested in. In order to correct this, wewill introduce a table to represent this relationship.

a) The unique identifier of an entity becomes the primary key of the table representing the entity. What is the primary key of tblClient?

b) What is the primary key of tblFund?

8. Write each primary key on a line above a column in the table below. Fill in the tableby copying the respective columns from the table on page 1. Maintain the row correspondence. In other words, values in the same row on page 1 should be in the same row in the table below.

______

We have now instantiated the relationship table, which we will call tblHasFund. An instance of a table is the data it contains at a particular point in time. The primary keys above identify the related rows in their respective tables, tblClient and tblFund. Redundancy is reduced to simply repeating primary key values for each relationship in which the respective entity instance participates. In other words, if a client is invested in three funds, then its primary key value is repeated three times in tblHasFund, once for every fund invested in. We are not repeating the other properties of the client. Each related fund is represented in tblHasFund by its primary key value. A fund in which two clients are invested, will have its primary key value repeated twice in tblHasFund. However, the other properties of the fund are not repeated.

9. a) Can either ClientCode or FundCode by itself serve as the primary key of tblHasFund ? Why?

b) If not, what is the primary key of tblHasFund? Note, a primary key must always exist in a table in a relational database. A composite key is a primary key in which two or more columns are required to uniquely identify rows in a table. A table is all key if the values in every column participate in the primary key.

10. We now introduce a new property called InceptionDate. This represents the date a client started investing in the fund. Is InceptionDate a property of Client, Fund, or the relationship HasFund? Briefly explain your reasoning.