## Master 2

## Mathematics of randomness

### Overview

This master is a top-level training programme covering various fields related to Probability, Statistics and Machine Learning. It is mostly devoted to the most fundamental aspects of these fields, without neglecting the constraints that arise in applications. It typically leads to academic or industrial PhD positions, but some students also start a career in industry directly after the Master. The courses are split into two specializations: on the one hand, Probability and Statistics, with a focus on mathematical research; on the other hand, Statistics and Machine Learning, which is oriented more toward research or positions in companies. During the second semester, students must submit a master's thesis based on the analysis of several research papers, under the supervision of a teacher from the Master programme. This thesis may be replaced by an internship in a company or a research laboratory.

**Language of instruction**: French and English

**ECTS**: 60

**Orientation**: Research

**Duration**: 1 year

**Course location**: Mainly at Paris-Saclay, occasionally at IP Paris.

### Educational objectives

At the end of this master, the student should be able to

• understand, master and use many modern mathematical tools for modelling randomness and for analysing and processing data.

• make predictions and take decisions.

### Program structure

* (C: course, T: tutorial class)

**Specialization: Probability and Statistics. First semester. 30 ECTS among the following courses.**

Ergodic Theory (7.5 ECTS, 25C/12T). Ergodicity, Kac theorem, Birkhoff theorem, invariant measures, entropy.

Lecturers: Hans Rugh, Damien Thomine.

Brownian motion and stochastic calculus (7.5 ECTS, 28C/20T). Itô calculus, Girsanov theorem, Dubins-Schwarz theorem, stochastic differential equations.

Lecturers: Jean-François Le Gall, Maxime Février.

Random walks and graphs (7.5 ECTS, 25C/12T). Potential Theory, recurrence, transience, random trees, random walks, random graphs.

Lecturer: Nicolas Curien.

Data Mining Project (7.5 ECTS, 36H) (open to students from both specializations). Nonlinear regression, random forests, gradient boosting, aggregation of experts.

Lecturer: Yannig Goude.

Semi-parametric statistics (5 ECTS, 20H) (open to students from both specializations). Le Cam Theory, efficiency, likelihood methods, Bayesian methods, spectral methods.

Lecturer: Elisabeth Gassiat.

High dimensional statistics (6 ECTS, 30 H) (open to students from both specializations). High dimension, complexity control.

Lecturers: Christophe Giraud, Tristan Mary-Huard.

Convergence of measures and Poisson process (5 ECTS, 20C/10T). Weak convergence of processes, Donsker theorem, Poisson random measures, Lévy processes.

Lecturer: Pierre-Loïc Méliot.

Advanced Markov Chains (5 ECTS, 32H) (open to students from both specializations). General state space Markov chains: uniform and non-uniform ergodicity, convergence control in total variation norm, V-norms or Wasserstein distance.

Lecturers: Eric Moulines, Randal Douc.

Concentration and model selection (5+5 ECTS, 20C/20C) (course divided into two independent parts) (open to students from both specializations). Non-asymptotic theory, concentration inequalities, empirical processes, penalisation, entropy method.

Lecturer: Pascal Massart.

Statistical learning and resampling (5 ECTS, 20H) (open to students from both specializations). Supervised learning, classification, regression, local averaging, empirical risk. Universal consistency, bootstrap, subsampling, cross-validation.

Lecturer: Sylvain Arlot.

Introduction to percolation theory (2.5 ECTS, 10H). Phase transition, percolation.

Lecturer: Hugo Duminil-Copin.

**Specialization: Probability and Statistics. Second semester. 16 ECTS among the following courses. 14 ECTS for the Master Thesis.**

Random Matrices (4 ECTS, 20H) (open to students from both specializations).

Lecturer: Bertrand Eynard.

Optimisation and statistics (4 ECTS, 20H) (open to students from both specializations). M-estimation and convex optimisation, non-asymptotic analysis of the stochastic approximation.

Lecturer: Francis Bach.

Large scale graph inference (4 ECTS, 20H). Community detection, graph alignment, tree reconstruction, spectral methods.

Lecturer: Laurent Massoulié.

Random population models (4 ECTS, 20H). One-dimensional stochastic differential equations with jumps, limiting processes, martingale properties, Poisson measure representation, stochastic calculus.

Lecturers: Vincent Bansaye, Sylvie Méléard.

Soluble models in probability (4 ECTS, 20H). Random walks in random environment, longest non-decreasing subsequence of a random permutation, Hardy-Ramanujan formula.

Lecturer: Nathanaël Enriquez.

Branching random walks (4 ECTS, 20H). Large deviations, additive martingale convergence, spine decomposition.

Lecturer: Pascal Maillard.

Robust learning (4 ECTS, 20H) (open to students from both specializations).

Lecturer: Matthieu Lerasle.

Malliavin Calculus (4 ECTS, 20H). Generalised Girsanov theorem, absolute continuity criterion for SDE solutions, explicit Itô-Clark formula, anticipative calculus, insider trading, computation of the Greeks, optimal transport.

Lecturer: Laurent Decreusefond.

Interacting particle systems (4 ECTS, 20H). Hydrodynamic limit, PDE.

Lecturer: Thierry Bodineau.

Local times and excursion theory (4 ECTS, 20H). Continuous semimartingales, Brownian local time process and branching processes, excursion theory.

Lecturer: Jean-François Le Gall.

The self-avoiding walk model (2 ECTS, 8H). Self-avoiding walk model on lattices, discrete holomorphicity.

Lecturer: Hugo Duminil-Copin.

Non parametric Bayesian estimation (4 ECTS, 20H) (open to students from both specializations). Gaussian and Dirichlet processes, asymptotics, density estimation, high dimension.

Lecturer: Vincent Rivoirard.

Topological data analysis (4 ECTS, 20H) (open to students from both specializations). Estimation of submanifold homology, persistent homology, geometric inference.

Lecturers: Frédéric Chazal, Quentin Mérigot.

System reliability (4 ECTS, 20H). Statistical process control, reliability of repairable systems, state-space Markov systems.

Lecturer: Patrick Pamphile.

Sequential learning and optimisation (4 ECTS, 20H) (open to students from both specializations). Multi-armed bandit problems, aggregation of expert advice.

Lecturer: Gilles Stoltz.

Extremes (4 ECTS, 20H) (open to students from both specializations). Weak convergence of maxima and threshold exceedances, limiting Poisson process. Univariate and multivariate settings.

Lecturer: Anne Sabourin.

Note that some courses are open to students from both specializations. These courses are marked as such and described in the “Probability and Statistics” section above.

**Specialization: Statistics and Machine Learning. First semester. 30 ECTS among the following courses.**

Seminar StatML (2.5 ECTS, 20H). Lectures are given either by researchers or by students, based on their reading of research papers.

Organizer: Christophe Giraud (PSud).

Reinforcement learning (2.5 ECTS, 20H). Reinforcement learning and Markov decision processes. Bandits. Dynamic programming, Monte Carlo methods. Planning and learning with tabular methods. Approximate methods in prediction, planning and learning.

Lecturer: Erwann Le Pennec.

Advanced optimization (5 ECTS, 40H). Mathematical tools for the construction of convex optimisation algorithms, distributed optimisation under Hadoop and Spark.

Lecturers: Pascal Bianchi, Olivier Fercoq.

Statistical Learning Theory (2.5 ECTS, 20H). Bayes predictor. Empirical risk minimization. Density estimation. Bias-variance tradeoff. Minimax risk over Hölder classes. Adaptive estimation: bandwidth selection by minimizing an unbiased risk estimator. Lepski’s method. Thresholding in nonparametric regression.

Lecturer: Arnak Dalalyan.

Optimization for Data Science (5 ECTS, 40H). This course covers the necessary theoretical results of convex optimization as well as the computational aspects. It also involves a fair amount of programming, as all the algorithms presented will be implemented and tested on real data. At the end of the course, students should be able to decide which algorithm is best suited to a machine learning problem given the size of the data (number of samples, sparsity, dimension of each observation).

Lecturers: Alexandre Gramfort, Robert Gower.

Nonparametric estimation (2.5 ECTS, 15C/9T). Kernel and projection estimators. Rates of convergence, oracle inequalities and adaptation. Nonparametric estimation of a regression function. Local polynomials, splines. BIC, Lasso. Statistical estimation in high dimensions.

Lecturer: Cristina Butucea.

Introduction to Bayesian Learning (2.5 ECTS, 14C/7T). Introduction to Bayesian methods and analysis. Regularization methods and specification of a prior distribution. Approximation methods: variational Bayes, Markov chain Monte Carlo sampling and sequential sampling schemes.

Lecturer: Anne Sabourin.

Introduction to Probabilistic Graphical Models (2.5 ECTS, 14C/7T). Undirected graphical models, learning and inference, Hidden Markov Models (HMMs), Expectation-Maximization algorithm.

Lecturer: Umut Şimşekli.

Hidden Markov models and sequential Monte Carlo methods (2.5 ECTS, 18H). Hidden Markov models, filtering, smoothing, prediction and estimation. Extensions to statistical estimation in Bayesian statistics.

Lecturer: Nicolas Chopin.

**Specialization: Statistics and Machine Learning. Second semester. 16 ECTS among the following courses. 14 ECTS for the Master Thesis.**

Geometric methods in machine learning (4 ECTS, 15C/6T). Visualization of metric data. Learning metrics. Metrics and kernels for exotic data types.

Lecturer: Marco Cuturi.

Introduction to compressed sensing (4 ECTS, 15C/9T). Compressed sensing: exact or approximate reconstruction of a high-dimensional signal. Matrix completion and recommendation systems. Community detection in graphs.

Lecturer: Guillaume Lecué.

Advanced learning techniques (4 ECTS, 18H).

Lecturer: Stéphan Clémençon.

### IP Paris labs involved

• LTCI: Information Processing and Communications Laboratory (Télécom Paris),

• SAMOVAR (Télécom SudParis),

• CMAP: Applied Mathematics Center (École Polytechnique),

• CREST: Center for Research in Economics and Statistics (ENSAE Paris)

### Career prospects

The wide range of courses offered prepares students to work in various fields of application where randomness has to be analysed, processed and summarized in order to make predictions or take decisions. These include, among others, insurance, banking, pharmaceutical laboratories, energy, climate, transport, aeronautics, communication and signaling.

### Institutional partners

• Université Paris-Sud,

• Télécom SudParis,

• Télécom ParisTech,

• École Polytechnique,

• ENSAE,

• AgroParisTech,

• École Centrale,

• IHES,

• INRIA,

• Université Paris-Saclay.

### Industrial partners

• EDF,

• Thales,

• Groupe Lagardère,

• ID Services,

• Quinten,

• Air Liquide,

• Laboratoire National de métrologie et d’essais,

• Service Transports ville de Paris,

• Institut Pasteur,

• Institut Curie,

• INRA,

• INRIA,

• INSERM.

### Admissions

Application guidelines for a master’s program at IP Paris

**Academic Prerequisites**:

• Bachelor's degree and first year of a master's (M1) in mathematics, or equivalent training from a ‘grande école’.

**Language prerequisites**:

• French, English

**Application timeline**

Deadlines for the Master application sessions are as follows:

– First session: February 28, 2020

– Second session: April 30, 2020

– Third session (optional): June 30, 2020 (only if places remain after the first two sessions)

Applications not finalized for a session will automatically be carried over to the next session.

You will receive an answer 2 months after the application deadline of the session.

### Tuition fees

**National Master**: Official tuition fees of the Ministry of Higher Education, Research and Innovation (2019-2020: EU students, 243 euros; non-EU students, 3770 euros).

### Contact

• IP Paris: **Matthieu Lerasle** (Email)

• Secretariat (Email)

• Coordinator « Probability and Statistics »: Nicolas Curien (Email)

• Coordinator « Statistics and Machine-Learning »: Christophe Giraud (Email)