
Instructor: Dr. Vladimir Zanev
Office Location/Phone Number: CCT 442 / (706) 507-8182
Office Hours: Mon, Wed, Fri: 10:00 a.m.-12:00 noon; Tue, Thu: 2:00-4:00 p.m.
Website:

This course is offered as an online class in the Spring term of 2013. Class meets 100% online at
( )

Section / Days / Time / Location
CRN21362 / TR / Online, 75 min. sessions / Online

Online Interface:
CougarVIEW (Desire2Learn) will be the primary method of online interaction in this course. Course materials (course outline, schedule, assignments, calendar, Midterm and Final exams, course notes, resources, email, and grading) will be available through CougarVIEW. You can access CougarVIEW at:

At this page, log in with your username and password and open the My Home page. Your CougarVIEW (D2L) username and password are the same ones you use to log in to the CSU computers. If you are a newly accepted student at CSU, your username and password are:

Username: lastname_firstname
Password: MMDDYY
where MMDDYY is the student's birth date (for example, a birthday of Oct. 25, 1978 is 102578).

For password resets, call the CSU Helpdesk at 706-507-8199.

On the My Home page, find the link to our course and click it to open the Course Home page. The Course Home page, with its left-hand Course Content menu, gives you access to all course tools and materials.


Course Description and Objectives

Course Description:
Prerequisites: CPSC 5115 Algorithm Analysis and Design, CPSC 5138 Advanced DBMS.
These prerequisites are not in the Catalog and will not be enforced. Consider them suggested background that will help you pass this course with ease. You are not required to have taken the courses above; however, completing them or having a working knowledge of the respective areas will greatly help you succeed in this class.

This course is an introduction to data mining. Recent advances in database technology, along with the phenomenal growth of the Internet, have resulted in an explosion of data collected, stored, and disseminated by various organizations. Because of the massive size of this data, it is difficult for analysts to sift through it even though it may contain useful information. Data mining holds great promise to address this problem by providing efficient techniques to uncover useful information hidden in large data repositories. Data mining is a modern area of computer science concerned with the automated or convenient extraction of patterns that represent previously unknown knowledge implicitly stored in large databases, data warehouses, and other massive information repositories. In this course we will approach the data mining problem from the perspective of data mining algorithms, database design, and programming. We will discuss suitable data models, data preparation, and finally the different methods and algorithms one can implement to discover new knowledge from raw data. We will also introduce data warehouse and OLAP technology, data cube computation, and data generalization. The key objectives of this course are twofold: (1) to teach the fundamental concepts of data mining and (2) to provide extensive hands-on experience in applying the concepts to real-world applications. The core topics to be covered in this course include:

  • data and exploring/preprocessing data
  • data warehouse and OLAP, data cubes and data generalization
  • classification data mining algorithms and methods
  • association analysis data mining algorithms and methods
  • cluster data mining algorithms and methods
  • WEKA data mining environment
  • Data mining using data mining Add-Ins and Excel
  • Database data mining with SQL Server 2012

Expected Outcomes
At the completion of this course, students will have an understanding and knowledge of:

  • What is data mining?
  • Data and exploring data: sampling, data cleaning, feature selection, and dimensionality reduction
  • Data warehouse, OLAP technology, data cubes and data cube computation
  • Classification: basic concepts, decision trees, model evaluation
  • Classification: naive Bayes, time series, neural networks
  • Association analysis: basic concepts and algorithms, Apriori algorithm
  • Cluster analysis: basic concepts and algorithms, hierarchical clustering methods
  • SQL Server 2012 environment, tools, and algorithms
  • How to use SQL Server 2012 for database data mining


Textbook
Textbooks - required

Title: Data Mining: Practical Machine Learning Tools and Techniques
Authors: Ian H. Witten, Eibe Frank, Mark Hall
Edition: 3rd, 2011
Publisher: Morgan Kaufmann Publishers
ISBN: 978-0-12-374856-0

Title: Data Mining with SQL Server 2008
Authors: Jamie MacLennan, ZhaoHui Tang, Bogdan Crivat
Edition: 2009
Publisher: Wiley Publishing Inc.
ISBN: 978-0-470-27774-4

Additional Resources (available online at the class Resources page)

  • Chapter 3 (Data Warehouse and OLAP Technology), Chapter 4 (Data Cube Computation and Data Generalization), and Chapter 5 (Mining Frequent Patterns, Associations, and Correlations) from the textbook Data Mining: Concepts and Techniques by J. Han and M. Kamber
  • Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals by Jim Gray et al. (research paper)
  • SS08 Analysis Services and Data Cube Tutorials (developed from the SQL Server Books Online and SQL Server Developer Center)


Software

To complete all lessons, the data mining project, assignments, discussions, and exams, you will need a computer with:

  • Windows XP/Vista/7, Internet Explorer, Adobe Acrobat Reader, and Word
  • Access to CSU CougarVIEW Web site
  • SQL Server 2012 (see the Resources Web page for details on how to obtain SQL Server 2012)
  • WEKA data mining environment (see Resources Web page)
  • SQL Server 2012 Add-Ins and Excel 2010


Methods of Instruction

Methods of Instruction:

  • Online Study
  • Forums
  • Assignments
  • Data Mining Projects
  • Midterm Exam
  • Final Exam

Online Study
Each student is expected to complete all readings from the textbooks and the additional resources following the course schedule. Make your own notes. You can use your own notes during the exams.

Assignments

Several assignments will be given that build upon the concepts covered in the textbooks; they must be completed on your own time. Assignments will consist of problem-solving exercises on data mining algorithms. Assignment deadlines are not flexible for any reason. Late assignments are not accepted for credit. Assignment submissions are usually via D2L dropboxes.

Data Mining Projects
The purpose of the projects is to give you experience with data mining project development, implementation, analysis, interpretation of results, and drawing conclusions. The data mining projects are an opportunity to apply the data mining concepts, techniques, and tools studied in class to real data sets. All projects are completed individually. The objective is to study, implement, and run data mining algorithms that analyze real data sets. You have to use SQL Server 2012, WEKA, and the Data Mining Add-Ins as implementation tools. Late projects are not accepted for credit. Project submissions are usually via D2L dropboxes.

Forums
Three special forums will be opened on the course D2L Web site: Software Installation, Data Mining Projects, and Data Mining Assignments. The forums are study tools, and your participation in them is not graded. You can post any questions, answers, remarks, or essays in these forums. You can ask for help with specific issues, such as an error message in a project, or for hints or directions about parts of an assignment or a project. However, you cannot ask for the solution to an entire project or assignment, or to essential parts of one.

Exams
Your performance in this class will be measured by two online exams: the Midterm Exam and the Final Exam. No make-up exams will be given unless an exam was missed due to a documented emergency. The exams will be timed, problem-solving exams. The problems on the exams will be about data mining algorithms.


Evaluation

The final grade will be obtained from the following:

Assignments / 30%
Projects / 30%
Midterm Exam / 20%
Final Exam / 20%

The letter grade will be assigned as follows:

Grade / Points
A / 90-100
B / 80-89
C / 70-79
D / 60-69
F / 0-59
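
As an illustration of how the weights and the grade scale above combine, here is a minimal sketch in Python. It assumes that each component (assignments, projects, midterm, final) is scored as a percentage from 0 to 100 and that no rounding is applied before the letter grade is looked up; the syllabus does not state either point, and the function names and sample scores are hypothetical.

```python
# Weights taken from the Evaluation table above.
WEIGHTS = {"assignments": 0.30, "projects": 0.30, "midterm": 0.20, "final": 0.20}

# Lower bound of each letter grade, taken from the Grade / Points table.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def final_score(scores):
    """Weighted sum of component scores (each assumed to be 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Map a 0-100 score to a letter; no rounding is applied (an assumption)."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical scores, for illustration only.
example = {"assignments": 85, "projects": 92, "midterm": 78, "final": 88}
score = final_score(example)
print(f"{score:.1f} -> {letter_grade(score)}")  # prints "86.3 -> B"
```

With these sample scores the weighted sum is 25.5 + 27.6 + 15.6 + 17.6 = 86.3, which falls in the 80-89 range.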


Student Responsibilities


  • Each student is responsible for managing his/her time and maintaining the discipline required to meet the course requirements.
  • Each student is responsible for reading all topics covered in the class from the textbooks and the additional resources.
  • Each student is responsible for reading the forum messages and participating in the forums.
  • Each student is responsible for carrying out the data mining projects.
  • Each student is responsible for completing all assignments.
  • Each student is responsible for adhering to all course deadlines.
  • Each student is responsible for taking the exams as they are scheduled in the course schedule.

"I didn't know" is not an acceptable excuse for failing to meet the course requirements. Students who fail to meet their responsibilities do so at their own risk.

Attendance Policy

Attendance at all classes and other activities (lecture periods, quizzes, examinations, or other scheduled meetings) is required for every student at Columbus State University. The attendance record begins with the first meeting of the class, and one who registers late is responsible for class work missed. Class attendance is the responsibility of the student, and it is the student's responsibility to independently cover any materials missed. Class attendance and participation may also be used in determining grades. Students should note that the Computer Science Faculty does not initiate "class drops". A student wishing to drop should complete the official procedure before the deadline. Those who violate the attendance policy after that deadline may receive an "F" at the discretion of the instructor. Refer to the CSU Catalog for more information on class attendance and withdrawal.


Academic Dishonesty

Academic dishonesty includes, but is not limited to, activities such as cheating and plagiarism. It is a basis for disciplinary action. Any work turned in for individual credit must be entirely the work of the student submitting the work. All work must be your own. You may share ideas, but submitting identical assignments (for example) will be considered cheating. You may discuss the material in the course and help one another with debugging; however, any work you hand in for a grade must be your own. A simple way to avoid inadvertent plagiarism is to talk about the assignments, but don't read each other's work or write solutions together unless otherwise directed. For your own protection, keep scratch paper and old versions of assignments to establish ownership until after the assignment has been graded and returned to you. If you have any questions about this, please see me immediately. For assignments, access to notes, the course textbooks, books, and other publications is allowed. All work that is not your own MUST be properly cited. This includes any material found on the Internet. Stealing, giving, or receiving any code, diagrams, drawings, text, or designs from another person (CSU or non-CSU, including the Internet) is not allowed. Having access to another person’s work on the computer system or giving access to your work to another person is not allowed. It is your responsibility to keep your work confidential.
No cheating in any form will be tolerated. Penalties for academic dishonesty may include:

  • a zero grade on the assignment or exam/quiz
  • a failing grade for the course
  • suspension from the Computer Science program
  • dismissal from the Computer Science program.

All instances of cheating will be documented in writing with a copy placed in the Department's files. Students will be expected to discuss the academic misconduct with the faculty members and the chairperson.


ADA Accommodation Notice

If you have a documented disability as described by the Rehabilitation Act of 1973 (P.L. 93-112 Section 504) and the Americans with Disabilities Act (ADA) and would like to request academic and/or physical accommodations, please contact the Office of Disability Services in the Shuster Student Center (room 221), 706-507-8755, as soon as possible. Course requirements will not be waived, but reasonable accommodations may be provided as appropriate.
