Statistics 860 - Estimation of functions from data.
[a. k. a. Statistical Model Building and Learning]

T Th 4:00-5:15 Fall 2007, Room 5295 MED SC CTR (1300 University Ave)
Stat 709 NOT required.

Grace Wahba, Instructor

This course is about various aspects of statistical model building, supervised machine learning, and multivariate function estimation given scattered, noisy, direct or indirect data, and/or heterogeneous information of various kinds including dissimilarity data, with a focus on what might be called the class of regularization methods. These methods obtain a model by solving an optimization problem which trades off the goodness of fit of the model to the data against the complexity of the model. In the machine learning literature, when the complexity measure involves a reproducing kernel Hilbert space (rkhs) norm, the methods are known as "kernel methods", which are presently a "hot topic" for good reason. These methods include many flavors of splines in one and several variables, as well as the classification tool known as support vector machines. Also included will be recent results in LASSO (l_1 penalty) methods.
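
As a rough illustration of the fit-versus-complexity tradeoff described above, here is a minimal Python sketch of one regularization method, penalized least squares with an rkhs norm penalty. It is not taken from the course materials: the Gaussian kernel, its bandwidth, the simulated data and the function names are all assumptions made for the example. The estimate minimizes (1/n) sum_i (y_i - f(x_i))^2 + lam * ||f||^2, and by the representer theorem it has the form f(x) = sum_j c_j K(x, x_j) with (K + n*lam*I) c = y.

```python
import numpy as np

def gaussian_kernel(x, z, bandwidth=0.2):
    """Gaussian kernel matrix between 1-D point sets x and z (an assumed choice of kernel)."""
    diff = x[:, None] - z[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def fit_kernel_ridge(x, y, lam, bandwidth=0.2):
    """Coefficients c of the minimizer of (1/n)*sum (y_i - f(x_i))^2 + lam*||f||^2.

    The minimizer is f(x) = sum_j c_j K(x, x_j), where c solves (K + n*lam*I) c = y.
    """
    n = len(x)
    K = gaussian_kernel(x, x, bandwidth)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def fit_and_penalty(x, y, c, bandwidth=0.2):
    """Return (residual sum of squares, rkhs norm penalty c'Kc) for coefficients c."""
    K = gaussian_kernel(x, x, bandwidth)
    residual = y - K @ c
    return float(residual @ residual), float(c @ K @ c)

# Simulated data (hypothetical): a smooth function observed with noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x) + 0.3 * rng.standard_normal(x.size)

# A small lam favors fit to the data; a large lam favors a small norm.
for lam in (1e-4, 1e-2, 1.0):
    c = fit_kernel_ridge(x, y, lam)
    rss, penalty = fit_and_penalty(x, y, c)
    print(f"lam={lam:g}  rss={rss:.3f}  penalty={penalty:.3f}")
```

As lam increases, the residual sum of squares cannot decrease and the penalty cannot increase, which is the tradeoff that the tuning methods in topic 2 are meant to balance.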

1. Background, introduction to the theory of reproducing kernel Hilbert spaces. Many flavors of univariate and multivariate spline models. Representer theory. Connections between smoothing splines, Bayes estimates, variational problems in rkhs and regularization.

2. Degrees of freedom for signal and the bias-variance tradeoff. Generalized cross validation, generalized approximate cross validation, unbiased risk, maximum likelihood and other model tuning methods.

3. Model selection, variable selection, pattern extraction and model building methods suitable for spline and related regularization-based models, including methods for very large attribute vectors. Penalized likelihood models for risk factor modeling with Bernoulli data and with data from other exponential families. Two-category and multicategory support vector machines for classification. "Hard" and "soft" classification.

4. Regularized and robust kernel estimation for noisy dissimilarity data. Semi-supervised learning.

5. Numerical methods for medium-sized to very large data sets. Randomized trace estimation for the degrees of freedom for signal. Early termination of iterative methods as a form of regularization. Global basis pursuit/LASSO basis function selection methods.

6. Applications in biostatistics and bioinformatics (risk factor modeling, classification), statistical learning theory (supervised machine learning, support vector machines), meteorology (ill-posed inverse problems and remote sensing), medical imaging and other areas will be discussed according to the interests of the class.
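
Topic 5 mentions randomized trace estimation for the degrees of freedom for signal. The following Python sketch shows one common form of the idea and is not necessarily the variant covered in the course. It assumes the penalized least squares setting with kernel matrix K, where the influence matrix is A(lam) = K (K + n*lam*I)^{-1} and the degrees of freedom for signal is trace A(lam). The Gaussian kernel, the design points, the Rademacher (+1/-1) probe vectors and all function names are assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(x, z, bandwidth=0.1):
    """Gaussian kernel matrix between 1-D point sets x and z (an assumed choice of kernel)."""
    diff = x[:, None] - z[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def influence_apply(K, lam, V):
    """Apply A(lam) = K (K + n*lam*I)^{-1} to the columns of V without forming A."""
    n = K.shape[0]
    return K @ np.linalg.solve(K + n * lam * np.eye(n), V)

def randomized_trace(K, lam, num_probes=500, seed=0):
    """Estimate trace A(lam) by averaging eps' A eps over random +1/-1 probe vectors eps."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    probes = rng.choice([-1.0, 1.0], size=(n, num_probes))
    applied = influence_apply(K, lam, probes)
    # Each column contributes one quadratic form eps' A eps.
    return float(np.mean(np.sum(probes * applied, axis=0)))

def exact_trace(K, lam):
    """Exact trace of A(lam), feasible here only because n is small."""
    n = K.shape[0]
    return float(np.trace(influence_apply(K, lam, np.eye(n))))

# Hypothetical design: 100 equally spaced points on [0, 1].
x = np.linspace(0.0, 1.0, 100)
K = gaussian_kernel(x, x)
lam = 1e-3

print("randomized estimate:", randomized_trace(K, lam))
print("exact trace:        ", exact_trace(K, lam))
```

The estimate needs only products of A(lam) with vectors, so for very large data sets the direct solve used here could be replaced by an iterative solver.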

Prerequisites: Statistics majors: mathematical maturity to the level of a year of graduate work, and either multivariate analysis, or some exposure to Hilbert spaces, or consent of instructor. Those unfamiliar with Hilbert spaces will be asked to read the first 33 pages of Akhiezer and Glazman, Theory of Linear Operators in Hilbert Space, vol. I, at the beginning of the course. Graduate students in biostatistics, CS, AOS and other physical sciences, engineering, economics, math and business may find some of the techniques studied here useful and are welcome to sit in, or to take the course for credit if they have exposure to linear algebra, sufficient math background to read Akhiezer and Glazman, and familiarity with the basic properties of the multivariate normal distribution, as found, e. g., in Anderson, Multivariate Analysis, or Wilks, Mathematical Statistics. Otherwise, the development will be self-contained. If in doubt, please contact the instructor by e-mail (wahba@stat.wisc.edu) or come to the first class.

This will be a seminar-type course. There will be no sit-down exams. Students taking the course for credit will be expected to do several small computer projects studying the behavior of some of the methods discussed on simulated or experimental data, and one or two projects in an area of application of their choice. One possible project is the presentation of a lecture in class on a recent paper or recent research.

Text: Wahba: Spline Models for Observational Data, SIAM (1990). Material from selected recent papers, books and conferences will be discussed, TBA. [If you are a member of SIAM it will be cheaper to order the book from them, but allow several weeks for delivery. See the BOOK link on my home page]