Reinforcement learning

Description: The course first covers the theoretical foundations of reinforcement learning: Markov decision processes (MDPs), dynamic programming and value functions (policy iteration, value iteration), and model-free methods (temporal-difference learning, SARSA, Q-learning). These concepts are then extended to continuous systems, where the value function must be approximated (LSTDQ, deep RL, …). These fundamentals provide a better understanding of recent successes in artificial intelligence, such as AlphaZero and, to a lesser extent, ChatGPT.

Learning outcomes: An understanding of the theoretical aspects of reinforcement learning and of their implementation with deep learning techniques.

Evaluation methods: 2-hour written exam, with the possibility of a retake.

Evaluated skills:

Modelling
Research and Development

Course supervisor: Hervé Frezza-Buet

Geode ID: 3MD4120