Course: STAT 37601=CMSC 25025
Title: Machine Learning and Large-Scale Data Analysis
Instructor(s): Yali Amit
Teaching Assistant(s): Zhisheng Xiao
Class Schedule: Section 1: TR 2:00 PM–3:20 PM (subject to change, TBD)
Office Hours:
Description: This course is an introduction to machine learning and the analysis of large data sets using distributed computation and storage infrastructure. Basic machine learning methodology and relevant statistical theory will be presented in lectures. Homework exercises will give students hands-on experience with the methods on different types of data. Methods include algorithms for clustering, binary classification, and hierarchical Bayesian modeling. Data types include images, archives of scientific articles, online ad clickthrough logs, and public records of the City of Chicago. Programming will be based on Python and R, but previous exposure to these languages is not assumed.