MATH-656 Data Mining
Fall 2016-2017
This course presents an introduction to computational and statistical methods for exploring large data sets and discovering patterns in them. Visualization and other exploratory methods will be used throughout the course. The course surveys methods in predictive modeling (classification) including decision trees, Naïve Bayes and nearest neighbor methods. In the process, we will study discretization, data normalization and attribute selection as well as sampling methods like cross-validation, bagging and boosting. Other topics will include cluster analysis, association analysis, anomaly detection and text mining. For all topics studied, students will work with various real and constructed data sets to see the impact of different distributions on the performance of the algorithms. A variety of performance metrics will be studied.

The course uses the software packages Weka, R and Excel; only basic knowledge of R and Excel is assumed.

Fall semester.

Text: Introduction to Data Mining
Author: Tan
ISBN: 9780321321367
Copyright Year: 2006
Publisher: Addison Wesley

Must be enrolled in one of the following Levels:
MN or MC Graduate
Must be enrolled in one of the following Majors:
Mathematics and Statistics
Credits: 3
Prerequisites: MATH-503