Implementation of RL Algorithms
Q-learning for solving block world
Abstract
Working on the autonomous stair-climbing robot highlighted the limitations of behavior cloning, which motivated me to study reinforcement learning. Below are some results from my implementations of basic reinforcement learning algorithms, written from scratch using PyTorch, NumPy, and OpenCV.
Implemented:
- DQN
- Vanilla Policy Gradient
- PPO
- DDPG
Results
Value and Policy Iteration

| Method | Deterministic Frozen Lake | Stochastic Frozen Lake |
|---|---|---|
| Value Iteration | 7 | 8 |
| Policy Iteration | 7 | 3 |
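
For readers unfamiliar with the method, here is a minimal sketch of tabular value iteration in NumPy. It is not the code behind the table above: the function name `value_iteration`, the array layout (`P` as a state-by-action-by-next-state transition tensor, `R` as a state-by-action reward matrix), and the three-state chain used to exercise it are all assumptions made for illustration.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Tabular value iteration (illustrative sketch).

    P: array of shape (S, A, S), P[s, a, s2] = transition probability.
    R: array of shape (S, A), expected immediate reward for (s, a).
    Returns the state values V and a greedy policy.
    """
    V = np.zeros(P.shape[0])
    for _ in range(max_iters):
        # Bellman optimality backup: Q[s, a] = R[s, a] + gamma * sum_s2 P[s, a, s2] * V[s2]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q.argmax(axis=1)


# Hypothetical 3-state chain: action 0 = left, action 1 = right.
# State 2 is absorbing; moving right from state 1 into it pays reward 1.
P = np.zeros((3, 2, 3))
P[0, 0, 0] = P[0, 1, 1] = 1.0
P[1, 0, 0] = P[1, 1, 2] = 1.0
P[2, :, 2] = 1.0
R = np.zeros((3, 2))
R[1, 1] = 1.0

V, policy = value_iteration(P, R, gamma=0.9)
```

On this toy chain the greedy policy moves right from both non-terminal states. Policy iteration differs in that it alternates a full policy evaluation with a greedy improvement step instead of applying the max backup directly.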
Q-Learning
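
As a reference point, the following is a minimal sketch of tabular Q-learning with an epsilon-greedy behavior policy. It is not the implementation that produced the results in this section: the four-state chain environment, the `step` and `q_learning` helpers, and all hyperparameter values are hypothetical and chosen only to keep the example small.

```python
import numpy as np

N_STATES, N_ACTIONS, GOAL = 4, 2, 3  # hypothetical chain: states 0..3, goal at 3


def step(state, action):
    """Toy deterministic dynamics: action 0 moves left, action 1 moves right."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == GOAL
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1,
               max_steps=100, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = int(rng.integers(N_ACTIONS))  # explore
            else:
                # Exploit, breaking ties between equal Q-values at random.
                best = np.flatnonzero(Q[s] == Q[s].max())
                a = int(rng.choice(best))
            s2, r, done = step(s, a)
            # Off-policy TD target: bootstrap from the best next action,
            # with no bootstrap term once the episode has terminated.
            target = r + (0.0 if done else gamma * Q[s2].max())
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
            if done:
                break
    return Q


Q = q_learning()
greedy = Q[:GOAL].argmax(axis=1)  # greedy action in each non-terminal state
```

Deep Q-learning keeps the same TD target but replaces the table `Q` with a neural network, typically adding a replay buffer and a target network for stability.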

Deep Q-Learning
