Math 7740: Statistical Learning Theory: Classification, Pattern Recognition, Machine Learning (Fall 2009)

 

4 credits. Prerequisites: basic mathematical statistics (MATH 6740 or equivalent) and measure-theoretic probability (MATH 671 or equivalent), or permission of instructor.

 

Lecturer: Michael Nussbaum, <mn66>, Malott 441, 5-3403

Office hours: MF 2:30–3:30

Lecture: TR 2:55–4:10, Malott 230

Course website: http://mysite.verizon.net/vzey4zz5/math7740/

 

Required textbook:

Course packet: Nussbaum, M., Topics in Statistical Learning Theory, Cornell University, 2009.

Optional textbook: The Elements of Statistical Learning (Data Mining, Inference and Prediction) by T. Hastie, R. Tibshirani, J. H. Friedman, Second Edition, Springer, 2009.

The course aims to present the developing interface between machine learning theory and statistics. Topics include an introduction to classification and pattern recognition; the connection to nonparametric regression is emphasized throughout. Some classical statistical methodology is reviewed, such as discriminant analysis and logistic regression, as well as the notion of the perceptron, which played a key role in the development of machine learning theory. The empirical risk minimization principle is introduced, along with its justification by Vapnik-Chervonenkis bounds. Basic principles of constructing classifiers are treated in detail, including support vector machines, kernelization, neural networks, and tree methods. The course will conclude with an outline of boosting and aggregation, the most active research areas in learning theory today.

 

Primary reference books:

 

- Devroye, L., Györfi, L., Lugosi, G., A Probabilistic Theory of Pattern Recognition, Springer 1997. Mathematically rigorous and proof-oriented, written in a clear and accessible style; a useful complement to the course textbook by Hastie et al., which focuses on applications.

- Vapnik, V., Statistical Learning Theory, Wiley, 1998. A large treatise that has become a classic; also mathematically rigorous, focusing on one particular method (support vector machines) developed by the author.

- Vapnik, V., The Nature of Statistical Learning Theory, 2nd ed., Springer 1999. A very readable summary of the treatise above, with more background discussion in place of proofs.

- Wasserman, L., All of Statistics, Springer 2004. A first overview of learning theory can be obtained from Chapter 22, which contains a concise introduction to classification.

Further possible sources:

 

- An Introduction to Support Vector Machines and Other Kernel-based Learning Methods by Nello Cristianini and John Shawe-Taylor

- Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (Adaptive Computation and Machine Learning) by Bernhard Schölkopf and Alexander J. Smola

- Pattern Recognition and Machine Learning (Information Science and Statistics) by Christopher M. Bishop

- Learning Kernel Classifiers: Theory and Algorithms (Adaptive Computation and Machine Learning) by Ralf Herbrich

- Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) by Carl Edward Rasmussen and Christopher K. I. Williams

- Feedforward Neural Network Methodology (Springer Series in Statistics) by Terrence L. Fine

- Information Theory, Inference & Learning Algorithms by David J. C. MacKay