Data Mining

Course Information

 

For: graduate students

 

Textbook:

J. Han and M. Kamber. Data Mining: Concepts and Techniques, 2nd edition, Elsevier Inc., 2006.

 

Main reference books:

[1] P.-N. Tan, M. Steinbach, V. Kumar. Introduction to Data Mining, Addison-Wesley, 2006.

[2] I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques, 2nd edition, Elsevier Inc., 2005.

[3] D. Hand, H. Mannila, and P. Smyth. Principles of Data Mining, MIT Press, 2001.

 

 

Slides

(by Jiawei Han and Micheline Kamber; can be obtained from here)

 

Chapter 1. Introduction

Chapter 2. Data Preprocessing

Chapter 5. Mining Frequent Patterns, Associations and Correlations

Chapter 6. Classification and Prediction

Chapter 7. Cluster Analysis

Chapter 8. Mining Stream, Time-Series and Sequence Data

Chapter 9. Graph Mining, Social Network Analysis and Multi-Relational Data Mining

Chapter 10. Mining Object, Spatial, Multimedia, Text and Web Data

 

 

 

Homework 1: Data mining reading notes (mining frequent patterns)

 

First read Chapter 5 of the textbook, then complete the following tasks:

(1)   Find the original papers on three major frequent pattern mining methods: Apriori (Agrawal & Srikant, VLDB’94), FP-growth (Han, Pei & Yin, SIGMOD’00) and CHARM (Zaki & Hsiao, SDM’02), and write a brief note on each.

(2)   If possible, compare the running times of the three algorithms experimentally on benchmark datasets.

(3)   If possible, find one or two recent papers on each of the three algorithms, and write a brief note summarizing the developments.
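
As a possible starting point for tasks (1) and (2), the level-wise idea behind Apriori can be sketched in a few lines of Python. This is only a minimal illustration, not the optimized algorithm from the original paper: candidates are generated by pairwise union rather than a hash tree, min_support is assumed to be an absolute count, the toy transactions are made up, and time_miner is a hypothetical helper for timing any mining function with the same signature.

```python
from itertools import combinations
import time


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    min_support is an absolute transaction count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(s) in frequent for s in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over the transactions per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


def time_miner(miner, transactions, min_support):
    """Hypothetical helper: time one run of a miner; returns (result, seconds)."""
    start = time.perf_counter()
    result = miner(transactions, min_support)
    return result, time.perf_counter() - start


# Made-up toy data, for illustration only.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
itemsets, seconds = time_miner(apriori, toy, min_support=3)
```

On the toy data, every single item and every pair is frequent at a support count of 3, while {a, b, c} appears in only two transactions and is pruned. For the comparison in task (2), the same time_miner call could wrap FP-growth and CHARM implementations, provided they are run on the same dataset and support threshold.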

 

Attn: The deadline for Homework 1 is May 31, 2008.