MATH-656 Data Mining
Fall 2016-2017
This course introduces computational and statistical methods for exploring large data sets and discovering patterns in them. Visualization and other exploratory methods are used throughout. The course surveys predictive modeling (classification) methods, including decision trees, Naïve Bayes and nearest-neighbor methods. Along the way, we will study discretization, data normalization and attribute selection, as well as resampling and ensemble methods such as cross-validation, bagging and boosting. Other topics include cluster analysis, association analysis, anomaly detection and text mining. For all topics, students will work with a variety of real and constructed data sets to see how different distributions affect algorithm performance. A variety of performance metrics will also be studied.
The course uses Weka, R and Excel; only basic knowledge of R and Excel is assumed.
Text: Introduction to Data Mining
Copyright Year: 2006
Publisher: Addison Wesley
Must be enrolled in one of the following Levels:
MN or MC Graduate
Must be enrolled in one of the following Majors:
Mathematics and Statistics
Prerequisites: Math 503