Data Mining (Revision 3)
View previous syllabus
Delivery Mode: Individualized study online
Area of Study: IS Core
Prerequisite: COMP 602 or equivalent. Students registering in this course will need to have some background in database systems and statistics. Students who are concerned about not meeting the prerequisites for this course are encouraged to contact the course coordinator before registering.
This course is not available for challenge credit.
Faculty: Faculty of Science and Technology
Instructor: Dr. Larbi Esmahi
Our ability to generate and collect data has been increasing rapidly. The widespread use of information technology in our lives has flooded us with a tremendous amount of data. This explosive growth of stored and transient data has generated an urgent need for new techniques and automated tools that can assist in transforming this data into useful information and knowledge. Data mining has emerged as a multidisciplinary field that addresses this need.
This course discusses techniques for preprocessing data before mining and presents the concepts related to data warehousing, online analytical processing (OLAP), and data generalization. It presents methods for mining frequent patterns, associations, and correlations. It also presents methods for data classification and prediction, data-clustering approaches, and outlier analysis.
Students who successfully complete this course should be able to
- interpret the contribution of data warehousing and data mining to the decision-support level of organizations
- evaluate different models used for OLAP and data preprocessing
- categorize and carefully differentiate between situations for applying different data-mining techniques: frequent pattern mining, association, correlation, classification, prediction, and cluster and outlier analysis
- design and implement systems for data mining
- evaluate the performance of different data-mining algorithms
- propose data-mining solutions for different applications
Unit 1: Overview of Data Mining
- This unit provides some background on data objects and statistical concepts. It also discusses the types of data to be mined and presents a general classification of data-mining tasks.
Unit 2: Data Preprocessing
- This unit introduces techniques for preprocessing data before mining. Concepts such as the cleaning, integration, reduction, transformation, and discretization of data are discussed.
Unit 3: Overview of Data Warehousing and OLAP
- This unit provides a solid introduction to data warehousing, OLAP, and data generalization.
Unit 4: Data Cube Computation and Multidimensional Data Analysis
- This unit presents a detailed study of methods for data cube computation, advanced query processing, and multidimensional data analysis.
Unit 5: Mining Frequent Patterns, Associations, and Correlations
- This unit presents methods for mining frequent patterns, associations, and correlations.
Unit 6: Classification
- This unit discusses ways of classifying data: decision tree induction, Bayesian classification, rule-based classification, neural networks, support vector machines, associative classification, k-nearest-neighbor classifiers, case-based reasoning, genetic algorithms, and rough set and fuzzy set approaches.
Unit 7: Cluster Analysis
- This unit describes the partitioning, hierarchical, density-based, grid-based, and model-based methods of data clustering.
Unit 8: Outlier Detection
- This unit describes several major approaches to the detection of anomalies, such as the statistical, proximity-based, clustering-based, and classification-based methods.
To pass this course, students must achieve an average grade of at least 60% on the assignments and project, and a grade of at least 60% on the final examination.
To receive credit toward the Master of Science in IS for Core Courses, students must achieve a course composite grade of at least B- (70%).
To receive credit toward the Master of Science in IS for Electives/Career Track, students must achieve a course composite grade of at least C+ (67%).
The weighting of the composite grade is as follows:
| Assessment | Weight | Timing |
|---|---|---|
| Assignment 1 | 10% | after Unit 3 |
| Assignment 2 | 15% | after Unit 5 |
| Assignment 3 | 15% | after Unit 7 |
| Project | 30% | after Unit 8 |
| Final Invigilated Examination | 30% | after Unit 8 |
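The weighting and thresholds above can be worked through with a short Python sketch. This is an illustration only, not an official grade calculator: the function and key names are hypothetical, and it assumes that the "average grade on the assignments and project" is weighted by the same percentages as the composite grade, which the syllabus does not state.

```python
# Weights taken from the composite-grade table (as fractions of the final grade).
WEIGHTS = {
    "assignment_1": 0.10,
    "assignment_2": 0.15,
    "assignment_3": 0.15,
    "project": 0.30,
    "final_exam": 0.30,
}


def composite_grade(scores):
    """Weighted composite grade in percent; scores maps each component to a 0-100 mark."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def coursework_average(scores):
    """Average of assignments and project.

    ASSUMPTION: the average is weighted by the table's percentages;
    the syllabus does not say whether it is weighted or unweighted.
    """
    parts = [name for name in WEIGHTS if name != "final_exam"]
    total_weight = sum(WEIGHTS[name] for name in parts)
    return sum(WEIGHTS[name] * scores[name] for name in parts) / total_weight


def passes(scores):
    """Pass rule: at least 60% on coursework and at least 60% on the final exam."""
    return coursework_average(scores) >= 60 and scores["final_exam"] >= 60


# Hypothetical marks, for illustration only.
example = {
    "assignment_1": 80,
    "assignment_2": 70,
    "assignment_3": 75,
    "project": 65,
    "final_exam": 62,
}

grade = composite_grade(example)
print(f"composite: {grade:.2f}%, passes: {passes(example)}")
print(f"meets core threshold (70%): {grade >= 70}")
print(f"meets elective threshold (67%): {grade >= 67}")
```

With these hypothetical marks the composite grade is about 67.85%, which passes the course and meets the Electives/Career Track threshold but not the Core Courses threshold.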
Jiawei Han, Micheline Kamber, and Jian Pei. Data Mining: Concepts and Techniques (3rd ed.). Morgan Kaufmann, 2012. eText ISBN: 9780123814807.
Ian H. Witten, Eibe Frank, and Mark A. Hall. Data Mining: Practical Machine Learning Tools and Techniques (3rd ed.). Morgan Kaufmann, 2011. ISBN 978-0-12-374856-0. (Available as an e-book through the Athabasca University Library.)
The remainder of the course material is distributed through the online course site.
Special Course Features
COMP 682 is offered entirely online and can be completed at the student’s workplace or home. Students will need to order the final examination four weeks before the course end date.
Athabasca University reserves the right to amend course outlines occasionally and without notice. Courses offered by other delivery methods may vary from their individualized-study counterparts.
Opened in Revision 3, August 11, 2017.
Updated August 11, 2017, by Student & Academic Services