Code: T-19        Subject: DATA WAREHOUSE AND DATA MINING

Time: 3 Hours        Max. Marks: 100

NOTE: There are 11 Questions in all.

Question 1 is compulsory and carries 16 marks. The answer to Q.1 must be written in the space provided for it in the answer book supplied and nowhere else.

Answer any THREE Questions each from Part I and Part II. Each of these questions carries 14 marks.

Any required data not explicitly given may be suitably assumed and stated.

Q.1 Choose the correct or best alternative in the following: (2x8)

a. The abundance of data in present-day society is a

(A) data rich and information poor situation reversed: data poor and information poor situation.

(B) data rich but information poor situation.

(C) data rich and information rich situation.

(D) data poor and information rich situation.

b. Data mining is a synonym for

(A) DBMS.        (B) RDBMS.

(C) KDD.        (D) Statistical Analysis.

c. Data mining is a step in the

(A) data cleaning.        (B) data transformation.

(C) data selection.        (D) knowledge discovery process.

d. A data mart is a

(A) data warehouse        (B) database

(C) subset of data warehouse        (D) metadata

e. Operational databases do not typically include

(A) raw data.        (B) transaction data.

(C) historical data.        (D) indexing and hashing.

f. Principal component analysis is used in

(A) data reduction        (B) data compression

(C) data cleaning        (D) data pre-processing

g. Combining data from multiple sources into coherent data is termed as

(A) data cleaning.        (B) data integration.

(C) data transformation.        (D) data clustering.

h. In the data warehouse environment,

(A) data is time variant.

(B) there is a firm set of requirements.

(C) transaction response time is a major issue.

(D) development is done one application at a time.

PART I

Answer any THREE Questions. Each question carries 14 marks.

Q.2 a. Describe the evolution of decision support systems. (7)

b. Explain, with reasons, the crisis of data credibility in the naturally evolving architecture. (7)

Q.3 a. Explain how data warehousing helps executives to make strategic decisions. (5)

b. Suppose that a data warehouse consists of the three dimensions time, doctor and patient, and the two measures count and charge, where charge is the fee that a doctor charges a patient for a visit. (9)

i) Draw a schema diagram for the above warehouse, using one of the three classes of schema.

ii) Enumerate based on the kind of aggregate functions that could be used in a data cube.

Q.4 a. Explain how data cubes model n-dimensional data. Give an example of a 3-D view of the sales data of an electronics company according to the dimensions time, item and location. (7)

b. Distinguish between the snowflake schema and star schema models. (5)

c. What is metadata? (2)

Q.5 a. Describe the three-tier architecture of a data warehouse. (7)

b. What are the different data warehouse models from the architecture point of view? (7)

Q.6 a. In data warehouse technology, a multidimensional view can be implemented by ROLAP or by MOLAP. For each of these techniques, explain how each of the following functions is implemented. (7)

(i) Roll Up

(ii) Drill Down

b. Explain how a data warehouse enables the EIS analyst to deal with various management needs. (7)

PART II

Answer any THREE Questions. Each question carries 14 marks.

Q.7 a. Describe various problems relating to the use and storage of external and unstructured data in the data warehouse. (7)

b. Describe how metadata is a vital component of a data warehouse. Also list the components of typical metadata. (7)

Q.8 a. What are the various forms of data pre-processing? Describe the purpose of data cleaning routines. How do you detect outliers? (7)

b. What are Naive Bayesian Classification Networks? Explain. (7)

Q.9 a. Define z-score normalization. Find the z-score normalized value of Rs. 73,600 for income when the mean and the standard deviation of the attribute income are Rs. 54,000 and Rs. 16,000 respectively. (7)

b. Describe data compression and discretization techniques. (5)

c. Data quality can be assessed in terms of accuracy, completeness and consistency. Propose two other dimensions of data quality. (2)

Q.10 a. What is a corporate data model? How can it be transformed into a data model for data warehouse design? (7)

b. Describe market basket analysis as a form of association rule mining. (7)

Q.11 a. Describe the Apriori algorithm for mining frequent itemsets for Boolean association rules. (7)

b. What is a decision tree? Write the basic algorithm for inducing a decision tree from training samples. (7)