Reinforcement
Reinforcement learning
Description: The course first covers the theoretical foundations of reinforcement learning: Markov decision processes (MDPs), dynamic programming and value functions (policy iteration, value iteration), and model-free methods (temporal-difference learning, SARSA, Q-learning). These concepts are then extended to continuous systems, where the value function must be approximated (LSTDQ, deep RL, …). These fundamentals give a better understanding of recent successes in artificial intelligence, such as AlphaZero and, to a lesser extent, ChatGPT.
Learning outcomes: Understanding of the theoretical aspects of reinforcement learning and their implementation with deep learning techniques.
Evaluation methods: 2-hour written test; a retake is possible.
Evaluated skills:
- Modelling
- Research and Development
Course supervisor: Hervé Frezza-Buet
Geode ID: 3MD4120