Mathematics of Machine Learning M2 MAPI3 (Paul Sabatier, 2024-2025)
Lecturer: Clément Lalanne
Overview
This class aims to introduce the main theoretical foundations of modern machine learning, with a focus on supervised learning. Basic knowledge of linear algebra and probability theory (including measure theory) is required. For the last lectures, basic knowledge of functional analysis is recommended.
Evaluation
IMPORTANT, PLEASE READ: I have been informed that the evaluation methods are set by the university and cannot be altered. I am deeply sorry about the misleading information I gave you earlier. The updated evaluation scheme is described below:
For all students, the final grade will consist of 80% from a 2-hour final exam and 20% from project work. For part-time working students (étudiants en alternance), the project grade will be based on TP2, which I will evaluate (replacing the previous homework assignment). For the other students, the project grade will be split equally: 50% from TP2 and 50% from the projects themselves.
TP2 (which counts towards the evaluation) is out.
Survival Kit for the Exam
For the exam, you should be familiar with the following concepts and techniques:
- Decomposing the risk into an estimation error (bounded by the uniform worst-case error over the predictor set), an approximation error, and possibly other error sources (e.g., optimization errors). You should be able to recognize the different bias and variance terms in this decomposition; a worked version of the decomposition is recalled after this list.
- Basic linear algebra (vectors, linear maps, rank, matrices) and bilinear algebra (eigenvectors, eigenvalues, spectral theorem, basic matrix decompositions).
- Multivariable calculus (gradients, Hessians, Taylor expansions, higher-order derivative tensors), convexity, first- and second-order optimality conditions, KKT conditions.
- Basic probability theory in general probability spaces, conditional laws, conditional expectations, and independence.
- Common inequalities: the triangle inequality, convexity inequalities, AM-GM, Cauchy-Schwarz, Jensen, Hölder, Minkowski.
- Concentration inequalities: Markov's, Bienaymé-Chebyshev's, and McDiarmid's inequalities (McDiarmid's bounded-differences inequality is recalled after this list).
- Rademacher complexity: the definition, upper bounding the sup deviations (each side by two times the Rademacher complexity), the Lipschitz contraction principle, and typical use cases; the definition and the symmetrization bound are recalled after this list.
- Kernel methods: Aronszajn's theorem (equivalence between positive definite kernels and Hilbert space dot products), the representer theorem, the kernel trick, operations on kernels, and Bochner's theorem; a small kernel ridge regression sketch follows this list.
- To be continued...
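
As a reminder for the risk-decomposition item above, here is a sketch written in notation of my own choosing (it may differ from the lecture notes): R is the population risk, \hat R_n the empirical risk on n samples, \mathcal{F} the predictor set, \hat f_n an empirical risk minimizer over \mathcal{F}, and f^* a Bayes predictor.

```latex
% Excess-risk decomposition (notation chosen here, not necessarily the lecture's):
%   R        population risk          \hat R_n  empirical risk on n samples
%   F        predictor set            \hat f_n  empirical risk minimizer over F
%   f^*      Bayes (overall optimal) predictor
\[
  R(\hat f_n) - R(f^*)
  = \underbrace{R(\hat f_n) - \inf_{f \in \mathcal{F}} R(f)}_{\text{estimation error}}
  \; + \;
  \underbrace{\inf_{f \in \mathcal{F}} R(f) - R(f^*)}_{\text{approximation error}},
\]
% and the estimation error is controlled by the uniform worst-case deviation
% between empirical and population risks over the predictor set:
\[
  R(\hat f_n) - \inf_{f \in \mathcal{F}} R(f)
  \;\le\; 2 \sup_{f \in \mathcal{F}} \bigl| R(f) - \hat R_n(f) \bigr|.
\]
```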
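
For the concentration-inequalities item, a statement of McDiarmid's bounded-differences inequality in its standard form (check the lecture notes for the exact constants and conventions used in class):

```latex
% McDiarmid's (bounded differences) inequality.
% If Z_1, ..., Z_n are independent and f has bounded differences, i.e. for all i
%   |f(z_1, ..., z_i, ..., z_n) - f(z_1, ..., z_i', ..., z_n)| <= c_i,
% then for every t > 0:
\[
  \mathbb{P}\bigl( f(Z_1, \dots, Z_n) - \mathbb{E}\, f(Z_1, \dots, Z_n) \ge t \bigr)
  \;\le\; \exp\!\Bigl( - \frac{2 t^2}{\sum_{i=1}^{n} c_i^2} \Bigr).
\]
```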
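
For the Rademacher-complexity item, a reminder of the definition and of the symmetrization bound, again in notation of my own choosing:

```latex
% Notation chosen here: G is a class of functions, Z_1, ..., Z_n an i.i.d. sample,
% and sigma_1, ..., sigma_n i.i.d. Rademacher signs (+1/-1 with probability 1/2 each),
% independent of the sample.
\[
  \mathrm{Rad}_n(\mathcal{G})
  \;=\;
  \mathbb{E}\Bigl[ \sup_{g \in \mathcal{G}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i \, g(Z_i) \Bigr],
\]
% and symmetrization bounds each side of the uniform deviation by twice this quantity:
\[
  \mathbb{E}\Bigl[ \sup_{g \in \mathcal{G}}
      \Bigl( \mathbb{E}[g(Z)] - \frac{1}{n} \sum_{i=1}^{n} g(Z_i) \Bigr) \Bigr]
  \;\le\; 2 \, \mathrm{Rad}_n(\mathcal{G}).
\]
% For bounded classes, combining this expectation bound with McDiarmid's inequality
% above yields high-probability uniform deviation bounds.
```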
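
For the kernel-methods item, here is a minimal Python sketch of the kernel trick and the representer theorem through kernel ridge regression. The Gaussian kernel, the regularization convention (K + n·lam·I), and all function names are my own choices for illustration, not the course's. The point is that both fitting and prediction touch the data only through kernel evaluations k(x_i, x_j).

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-gamma * ||x_i - y_j||^2)."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-gamma * sq_dists)

def kernel_ridge_fit(X, y, lam=1e-1, gamma=1.0):
    """By the representer theorem, the minimizer of the regularized empirical risk
    lies in span{k(x_i, .)}; with the averaged-loss convention used here, its
    coefficients solve (K + n*lam*I) alpha = y."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return alpha

def kernel_ridge_predict(X_train, X_test, alpha, gamma=1.0):
    """Prediction only needs kernel evaluations (the 'kernel trick'):
    f(x) = sum_i alpha_i k(x_i, x)."""
    return rbf_kernel(X_test, X_train, gamma) @ alpha

if __name__ == "__main__":
    # Toy 1D regression problem: noisy sine observations.
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
    alpha = kernel_ridge_fit(X, y, lam=1e-2, gamma=2.0)
    y_hat = kernel_ridge_predict(X, X, alpha, gamma=2.0)
    print("training MSE:", np.mean((y - y_hat) ** 2))
```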
Lectures
TDs / TPs
References and External Resources
Machine Learning and Learning Theory
Optimization for Machine Learning
Measure and Probability Theory
- Intégration, Probabilités et Processus Aléatoires by Jean-François Le Gall, 2006: An excellent introduction to measure theory with elements of stochastic processes and conditional expectation.
- Probabilités 2 by Jean-Yves Ouvrard, 2009: A detailed treatment of advanced topics in probability theory.
- Concentration Inequalities: A Nonasymptotic Theory of Independence by Stéphane Boucheron, Gábor Lugosi, and Pascal Massart, 2013.
Analysis