AppStat

Statistical learning

Description: The objective of supervised learning is to design methods that, from a training set of examples, make a decision about a parameter based on observations, the decision being the best possible on average. An example is classifying images according to their content, i.e. deciding whether an image shows a cat, a dog, or something else. We will present the problem formally and study the generalization guarantees of supervised learning algorithms, i.e. the quality of the prediction of the output associated with an input not present in the training set. To this end, we will introduce the notions of PAC (probably approximately correct) learnability of a hypothesis space and of the Vapnik-Chervonenkis dimension of a hypothesis space. We will state and prove two fundamental theorems of supervised learning theory, giving a lower bound and an upper bound on the true risk for the binary classification problem.
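To make the distinction between risk on the training set and risk on unseen inputs concrete, here is a minimal sketch of empirical risk minimization. It is an illustration only, not course material: the setting (inputs uniform on [0, 1], a ground-truth threshold `theta = 0.5`, the hypothesis class of threshold classifiers) and all function names are assumptions chosen for simplicity.

```python
import random

def true_label(x, theta=0.5):
    # Hypothetical ground truth: label 1 if and only if x >= theta.
    return 1 if x >= theta else 0

def empirical_risk(threshold, sample):
    # Fraction of labelled points (x, y) misclassified by h(x) = 1[x >= threshold].
    errors = sum(1 for x, y in sample if (1 if x >= threshold else 0) != y)
    return errors / len(sample)

def erm_threshold(sample):
    # Candidate thresholds: the sample points, plus +inf (the classifier that always predicts 0).
    candidates = sorted(x for x, _ in sample) + [float("inf")]
    return min(candidates, key=lambda t: empirical_risk(t, sample))

def true_risk(threshold, theta=0.5):
    # For inputs uniform on [0, 1], the true risk is the length of the
    # interval on which the hypothesis and the ground truth disagree.
    t = min(max(threshold, 0.0), 1.0)
    return abs(t - theta)

rng = random.Random(0)
for m in (10, 100, 1000):
    sample = [(x, true_label(x)) for x in (rng.random() for _ in range(m))]
    h = erm_threshold(sample)
    print(m, empirical_risk(h, sample), round(true_risk(h), 4))
```

In this toy setting the selected threshold always has zero empirical risk, while its true risk is positive and tends to shrink as the sample size `m` grows. That gap is what generalization guarantees quantify.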

Content: Formalization of supervised learning problems; PAC learnability and uniform convergence; the bias-complexity trade-off; the VC (Vapnik-Chervonenkis) dimension of a hypothesis space; two fundamental theorems of PAC learning.
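As an illustration of the VC dimension, the sketch below checks by brute force whether a small set of points on the real line is shattered by a hypothesis class, i.e. whether every possible 0/1 labelling of the points is realized by some hypothesis. The two classes used (thresholds and closed intervals) and all function names are assumptions picked for the example; the input points are assumed to be distinct.

```python
def threshold_labelings(points):
    # All labelings of `points` realizable by h_t(x) = 1[x >= t].
    # It suffices to try t at each point and at +inf.
    cuts = sorted(points) + [float("inf")]
    return {tuple(1 if x >= t else 0 for x in points) for t in cuts}

def interval_labelings(points):
    # All labelings of `points` realizable by h_{a,b}(x) = 1[a <= x <= b].
    pts = sorted(points)
    labelings = {tuple(0 for _ in points)}  # an interval containing no point
    for i, a in enumerate(pts):
        for b in pts[i:]:
            labelings.add(tuple(1 if a <= x <= b else 0 for x in points))
    return labelings

def is_shattered(points, labelings_fn):
    # A set of n distinct points is shattered when all 2**n labelings are realized.
    return len(labelings_fn(points)) == 2 ** len(points)

print(is_shattered([0.3], threshold_labelings))
print(is_shattered([0.3, 0.7], threshold_labelings))
print(is_shattered([0.3, 0.7], interval_labelings))
print(is_shattered([0.1, 0.5, 0.9], interval_labelings))
```

Thresholds shatter one point but not two (the labelling "left point 1, right point 0" is never realized), which is consistent with a VC dimension of 1. Intervals shatter two points but not these three (the labelling 1, 0, 1 is never realized), which is consistent with a VC dimension of 2. A single failing set is not a proof in general; here the same argument applies to any three distinct points.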

Learning outcomes: At the end of this course, students will be able to: understand elements of the theory of supervised learning; understand the bias-complexity trade-off of a hypothesis class; understand and use PAC-Bayesian bounds of supervised learning (in particular those for the binary classification problem).

Teaching methods: 10.5 h of lectures + 10.5 h of tutorials + a 2 h written exam

Resources: The tutorials (TDs) consist of exercises that put the concepts covered in the lectures into practice.

Evaluation methods: 2 h written exam, documents allowed

Course supervisor: Paul Fraux

Geode ID: 3MD4140


Lectures (CM):

  1. Formal model of supervised statistical learning (1.5 h)
  2. PAC learnability (1.5 h)
  3. Bias-complexity trade-off (1.5 h)
  4. No-free-lunch theorem (1.5 h)
  5. VC dimension (1.5 h)
  6. Fundamental theorems of PAC learning (1.5 h)
  7. Fundamental theorems of PAC learning (1.5 h)

Tutorials (TD):

  1. Mathematical refreshers and supplements (1.5 h)
  2. Mathematical refreshers and supplements (1.5 h)
  3. Linear predictors (1.5 h)
  4. Linear predictors (1.5 h)
  5. Fundamental theorems of PAC learning (1.5 h)
  6. Fundamental theorems of PAC learning (1.5 h)
  7. Fundamental theorems of PAC learning (1.5 h)
  8. Reading group 1/3 (1.5 h)
  9. Reading group 2/3 (1.5 h)
  10. Reading group 3/3 (1.5 h)