ISAM 5332DATA WAREHOUSING AND DATA MINING SPRING 2017

PREREQUISITE:ISAM 5331 – Database Design and SQL or equivalent knowledge.

CLASSROOM & TIME:SSB 3310/3.201.02 (MIS Database Lab); Mondays, 7:00 PM – 9:50 PM

INSTRUCTOR:Mohammad A. Rob, Ph.D. Office:SSB Suite 3-202-9 Voice: (281) 283-3191

E-mail: Course Website:

Office Hours:Mondays, 5-7 PM & Wednesdays:2-4 PM;and walk-ins & appointment.

Teaching Assistant: Check course website.

COURSE REQUIREMENTS:

  • Required Text: Data Warehousing Fundamentals: A Comprehensive Guide for IT Professionals, Author: Paulraj Ponniah, Second Edition, John Wiley & Sons; ISBN: 978-0-470-46207-2.
  • Required Material: Instructor’s Class Notes - available in the Course Web site.
  • Required Material: Create a Pivot table in Excel 2013. Online Version from Microsoft. Available through Instructor’s website.
  • Required Materials: Microsoft SQL Server 2012 Analysis Services Multidimensional Modeling Step by Step, Online Version from Microsoft. Available through Instructor’s website.
  • Recommended Material: Microsoft SQL Server 2012 Analysis Services Data Mining Step by Step, Online Version from Microsoft. Available through Instructor’s website.
  • Required Tool:Microsoft SQL Server 2012 Analysis Services (available in the database lab).
  • Recommended Text:Data Mining Concepts and Techniques, Authors: Jiawei Han and Micheline Kamber, Morgan Kaufmann Publishers/Academic Press.

COURSE FORMAT:FACE TO FACE

COURSE DESCRIPTION, GOALS, AND LEARNING OUTCOMES:

This course will acquaint students with the fundamental knowledge of data warehouse modeling and design.

  • It will also provide students with the knowledge of tools and techniques of data warehouse development in a corporate business environment.
  • It will further familiarize students on the tools and techniques of data interrogation of warehouse data using Pivot tables and OLAP (Online Analytical Processing) methodology.
  • It will prepare students for future careers in data warehouse planning, analysis, design, and implementation as well as making business forecasting using decision-making tools.
  • It will helpstudents to pass data warehousing certification.
  • If time permits, there will be discussions on data mining concepts and techniques.

COURSE ACTIVITIES:The course will contain lectures, homework assignments, a group project, group presentations, a test, a research paperand two reports. At least 50% of the activities will be hands-on practice by students using Microsoft tools such as Excel, SQL Server Data Warehouse, Analysis Services and Visual Studio.

RESEARCH PAPER:Students will write a paper understanding the current job prospect of a data warehouse analyst. Refer to the course web site for the details of the paper requirements.

GROUP PROJECT:Students will work in groups to model, design, and develop a data warehouse. Each group will choose a business industry such as airline, education, retail, financial, insurance, hospitality, investment, and healthcare. They will then develop/collect day-to-day business data of various formats which can be found in files, databases, spreadsheets, or text documents. They will then develop a strategy to transfer these data in a common format and in summarized form to a data warehouse developed in Microsoft Access. They will then transfer that data into a SQL Server data warehouse. Finally, students will apply OLAP tools to extract meaningful business intelligence reportson customers, products, purchases, and so on.Refer to course website for details. Each group needs to submit a final report as outlined in the website.

GROUP PRESENTATION:Each student group will make 2-3 presentations on the project mentioned above. The presentations will follow a schedule and they will cover topics of (i) project definition and planning, (ii) data warehouse modeling and expected business intelligence reports, and finally (iii) demonstrate OLAP implementation and reports. Each student must participate in the presentation.

ATTENDANCE:Students are expected to be physically present in the class and participate in the discussion on presentations by others. Class rolls may be called anytime.

EVALUATION/GRADING POLICY:

Midterm60%

Homework10%

Group Project - Presentation5%

Group Project - Final Report10%

Individual Report – Analysis Services/Excel Tabular Modeling5%

Discussion/Attendance5%

Research Paper5%

______

Total100%

GRADE DISTRIBUTION:

A–=91 – 95,A=96 – 100,

B–=81 – 83,B=84 – 87,B+=88 – 90,

C–=71 – 73,C=74 – 77, C+=78 – 80,

D–=61 – 63,D=64 – 67,D+=68 – 70,

F=60 and below

OTHER INFORMATION:

A.Class attendance: Regular class attendance is required and rolls may be called anytime.

B.UHCL Software:

C.Missing Tests and Assignments: Missing tests and laboratories will be counted as zero. Make-up of missing tests and any late submission of assignments will be acceptable only under extreme emergencies.

D.Academic Honesty: The Academic Honesty Policy at UHCL (found on the Dean of Students’ website, the Faculty Handbook, the Student Handbook, the Senior Vice President and Provost’s website, the Graduate Catalog, and the Undergraduate Catalog) states: Academic honesty is the cornerstone of the academic integrity of the university. It is the foundation upon which the student builds personal integrity and establishes a standard of personal behavior. Because honesty and integrity are such important factors in the professional community, you should be aware that failure to perform within the bounds of these ethical standards is sufficient grounds to receive a grade of "F" in this course and be recommended for suspension from UHCL. The Honesty Code of UHCL states "I will be honest in all my academic activities and will not tolerate dishonesty."

COPYING FROM ONLINE BOOKS & FROM EACH OTHER WILL NOT BE TOLERATED.ABSOLUTELY NO CELL PHONES WILL BE ALLOWED DURING THE TEST.

E.Special Academic Accommodations: If you believe you have a disability requiring an accommodation, contact Disability Services at 281‐283‐2648 or as soon as possible and complete their registration process. The University of Houston System complies with Section 504 of the Rehabilitation Act of 1973 and the Americans with Disabilities Act of 1990, pertaining to the provision of reasonable academic adjustments/auxiliary aids for students with a disability. In accordance with Section 504 and ADA guidelines, each University within the System strives to provide reasonable academic adjustments/auxiliary aids to students who request and require them.

F.Incomplete Grade: A grade of “I” (Incomplete) will be administered only under extreme, verifiable emergency” situation where the student in unable to complete some minor portion of the course work due to circumstance beyond his/her control provided the student is passing the course.

G.LAST DAY TO DROP/WITHDRAW FROM THE SEMESTER: APRIL 11, 2017.

COURSE SCHEDULE
(Subject to change if deemed necessary)
Date-2017 / Lecture & Skill/Activity / Due Dates
January 23 / Syllabus Review & Introduction
January 30 / Background and definitions of data warehouse, data marts, and data mining (Chapter 1 of Ponniah) / Formation of Groups (3 students)
February 6 / The data warehouse architecture (Chapters 2, 6, and 7 of Ponniah)
February 13 / The principles of dimensional modeling (Chapter 5 and 10 of Ponniah) / Submit Review Questions - Chapter 1 & 2
February 20 / Advanced topics in dimensional modeling (Chapter 11 of Ponniah) / Start of First Group Presentations – Problem Definition and Planning
February 27 / Online Analytical processing (Chapter 15 ofPonniah) / Submit DW Analyst Paper
March 6 / Introduction to SQL Server Database and Create a new Database / Submit Review Questions - Chapters 5 & 10
March 13 / Spring Holiday
March 20 / Create a Pivot Table in Excel using Contoso Database
March 27 / Test-I: Essay/short answers - Chapters 1, 2, 5, 10, 11 and Class Notes / Submit Review Questions - Chapters 11 & 15
April 3 / SQL Server 2012 Analysis Services: Lesson 1 / Start of Second Group Presentation –Multi-dimensional Data Modeling
April 10 / SQL Server 2012 Analysis Services: Lesson 2 / Submit Completed Contoso Pivot Table in the Web Folder
April 17 / SQL Server 2012 Analysis Services: Lesson 3 / Submit individual Report on Analysis Services (see below)
April 24 / Implementing the Group Project / Start of Third Group Presentation – Demonstration and OLAP Reports
May 1 / Continue working on the Group Project
May 8 / Finishing the Group Project / Submit Final Project Report (see below)

Individual Report on OLAP Test Project

After learning to develop Cubes through SQL Server Analysis Services, each student will be required to document through texts and screen shots, the steps of test Cube development and results of at least three OLAP queries from the Excel Pivot Table. In this case, students will be using Visual Studio/SQL Server 2012 along with a sample SQL Server AdventureWorks data warehouse to create a business intelligence project and go through the steps following the note: SQL Server 2012 analysis Services Multidimensional Modeling Step by Step. Refer to course website for a sample report. Again, your report should be about 10-15 pages long and it should contain 3-5 OLAP reports along with your analysis and conclusion.

Group Project Final Report

You are to write a report for the group project you have done during the semester. The paper should clearly describe the concepts and purpose of data warehousing, the business scenario of your project and proposed outcome, the approach of dimensional modeling, the design of data warehouse, the research methodology including data gathering, data cleansing, data transfer, as well as development of data warehouse and OLAP cube using a specific tool such as SQL Server Analysis Services 2012. The implementation part should describe dimensions and facts as well as the reasons to consider them. The results should provide appropriate strategic information for your business.

Use appropriate texts and graphics to describe and display necessary activities and data. The following is a typical outline of the paper; however, you are free to change the titles of the sections. But the sections must follow typical steps of your project development - beginning from the business scenario and ending with the explanation OLAP results along with figures and graphs:

Title

Name of the Students and affiliation

Abstract

Business Scenario: Problem statement, data origins, and expected value

Why Data Warehousing?

Methodology

Defining raw data format

Data cleansing

Data Transfer (Text file, Excel, Access/SQL Server)

Dimensional Modeling: Defining dimensions, facts, attributes, and hierarchies, database schema

DW implementation (in Access/SQL Server): Screen capture of relationships, dimensions, facts

Cube Implementation and OLAP in SQL Server

Results: Screen capture of drill-down, roll-up, slicing and dicing, graphs, along with explanations

Conclusion and Discussion

1