CS 5180/4180 • Reinforcement Learning

Interactive learning tools for reinforcement learning.

Explore hands-on simulations, visual walkthroughs, and guided experiments. Each chapter opens an interactive learning tool.

Bandits

Chapter 01

Multi-Armed Bandit Problem

Markov Decision Processes

Chapter 02

Markov Decision Process (MDP)

Dynamic Programming

Chapter 03

Dynamic Programming (DP)

Monte Carlo Methods

Chapter 04

Monte Carlo Methods

Monte Carlo ES (Blackjack)

Temporal-Difference Learning

Chapter 05

Temporal-Difference Learning

Random Walk (TD vs MC)

Windy Gridworld (SARSA/Q-learning)

n-step Bootstrapping

Chapter 07

n-step TD Bootstrapping (Random Walk)

Planning and Learning with Tabular Methods

Chapter 08

Dyna-Q Planning vs Q-learning

Eligibility Traces

Chapter 12

TD(0) vs TD(λ) on Random Walk