MA 267: Introduction to Statistical Learning with Applications

Credits: 3:0

Exploratory Data Analysis and Descriptive Statistics, with basic introductory programming in R using tidyverse for data visualisation.

Sampling Distribution and Limit Theorems: Order Statistics, Chi^2, F, Student's t. Sampling statistics from Normal Population, Law of Large Numbers, Central Limit Theorem, Variance Stabilising Transformation. Proofs via simulation in R.
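
As a rough illustration of what checking the Central Limit Theorem by simulation could look like, the sketch below standardises means of Exponential(1) samples and measures how often they fall within 1.96 of zero. It is written in Python with the standard library only, although the course itself uses R; the function name sample_means, the sample size, the number of replications and the seed are all assumptions made for this example.

```python
import random
import statistics


def sample_means(n, reps, seed=0):
    """Return `reps` sample means, each from n i.i.d. Exponential(1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


n = 50
means = sample_means(n=n, reps=5000)

# Exponential(1) has mean 1 and standard deviation 1, so the standardised
# sample mean is sqrt(n) * (mean - 1).
z = [(m - 1.0) * n ** 0.5 for m in means]

# If the normal approximation is good, about 95% of z lies in [-1.96, 1.96].
coverage = sum(abs(v) <= 1.96 for v in z) / len(z)
print(statistics.fmean(means), statistics.pstdev(z), coverage)
```

Increasing n should bring the standardised means closer to a standard normal, which is the behaviour the theorem predicts.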

Estimation: Method of Moments, Maximum Likelihood Estimate and Confidence intervals.
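
A minimal sketch of the estimation topics, assuming an Exponential model with unknown rate: the maximum likelihood estimate is 1 / (sample mean), which here coincides with the method-of-moments estimate, and an approximate (Wald) confidence interval uses the large-sample standard error rate / sqrt(n). The example is in Python rather than the course's R, and the function name exp_rate_mle, the true rate 2.0 and the sample size are hypothetical choices for illustration.

```python
import random
from statistics import NormalDist, fmean


def exp_rate_mle(data, level=0.95):
    """MLE of an exponential rate with an approximate Wald interval."""
    n = len(data)
    rate_hat = 1.0 / fmean(data)               # MLE (also the moment estimate)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # standard normal quantile
    half_width = z * rate_hat / n ** 0.5       # asymptotic standard error
    return rate_hat, (rate_hat - half_width, rate_hat + half_width)


rng = random.Random(1)
data = [rng.expovariate(2.0) for _ in range(400)]  # simulated, true rate 2.0
rate_hat, (lower, upper) = exp_rate_mle(data)
print(rate_hat, lower, upper)
```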

Hypothesis Testing: Binomial Test for proportion, Normal Test for mean when variance is known/unknown, two-sample t-test for equality of means when variance is known.
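
The binomial test for a proportion can be sketched as follows, using one common two-sided convention: the p-value is the total probability, under the null value p0, of all outcomes no more likely than the observed count. This is a Python illustration rather than the course's R code; the function name binom_test and the data (15 successes in 20 trials) are assumptions for the example.

```python
from math import comb


def binom_test(k, n, p0=0.5):
    """Exact two-sided binomial test p-value for k successes in n trials."""
    pmf = [comb(n, i) * p0 ** i * (1 - p0) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so that ties in probability are included
    # despite floating-point rounding.
    cutoff = pmf[k] * (1 + 1e-7)
    return min(1.0, sum(q for q in pmf if q <= cutoff))


p_value = binom_test(k=15, n=20, p0=0.5)
print(p_value)
```

A small p-value is evidence against the hypothesised proportion p0; other two-sided conventions (for example, doubling the smaller tail) can give slightly different values when p0 is not 0.5.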

Linear Models, Normal Equations, Gauss-Markov Theorem, Testing of linear hypotheses. One-way and two-way classification models: ANOVA, Random effects. Emphasis on numerical evaluation.

Regularisation and Subset Selection methods.
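
To illustrate the numerical side of the Normal Equations listed under Linear Models, the sketch below forms X^T X and X^T y for a small design matrix and solves (X^T X) b = X^T y by Gaussian elimination. It uses plain Python with no libraries, whereas the course works in R; the helper names solve and least_squares and the toy data are assumptions for this example, and the code assumes X^T X is invertible.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):                    # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares(X, y):
    """Least-squares coefficients from the normal equations."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]        # toy data lying exactly on y = 1 + 2x
X = [[1.0, x] for x in xs]            # intercept column plus one predictor
beta = least_squares(X, ys)
print(beta)
```

Solving the normal equations directly is fine for a small, well-conditioned example like this one; for ill-conditioned designs, QR-based solvers are usually numerically safer.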

Basics of Decision trees: Regression Trees, Classification trees and comparison with Linear Models.
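
The basic building block of a regression tree is a single split chosen to minimise the within-group sum of squared errors. The sketch below finds that split for one predictor; it is a Python illustration under assumed toy data rather than the course's R material, and the names sse and best_split are hypothetical.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty group)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, cost) of the best single split on one predictor."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                      # cannot split between equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


threshold, cost = best_split(
    [1, 2, 3, 10, 11, 12],
    [5.0, 5.5, 4.5, 20.0, 21.0, 19.0],
)
print(threshold, cost)
```

A full regression tree applies this search recursively within each resulting group, which is why it can capture step-like relationships that a single linear model cannot.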

Computational Optimal transport.

Applications from Epidemiology, Networks and Optimal transport.

Suggested books and references:

Siva Athreya, Deepayan Sarkar and Steve Tanner, Probability and Statistics with Examples Using R, Institute of Mathematical Statistics, Hayward, CA.

Sanford Weisberg, Applied Linear Regression, John Wiley and Sons, New York.

Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An Introduction to Statistical Learning, Springer-Verlag, New York.