
Analysis of RL Algorithms for a Simulated Hill Climb Racing Agent
A comparative framework for analyzing reinforcement learning algorithms in a custom Hill Climb Racing environment. It implements PPO, DQN, and Expected SARSA, supporting neural network, linear and polynomial function approximators.
Topics: Reinforcement Learning, Python, Gymnasium, Pygame, Box2D
This project is a custom implementation of the classic Hill Climb Racing game using Python, Pygame, and the Box2D physics engine. The primary goal is to train and compare different reinforcement learning agents to master the challenge of navigating infinitely generated, rugged terrain.
Features:
- Custom Environment: A fully custom Hill Climb environment built from scratch using `gymnasium` and the `Box2D` physics engine (a minimal skeleton is sketched after this list).
- Multiple RL Algorithms: Implementation and comparison of three distinct reinforcement learning algorithms:
  - PPO (Proximal Policy Optimization): An advanced actor-critic method known for its stability and sample efficiency. It uses a clipped objective function to constrain policy updates (see the loss sketch below).
  - DQN (Deep Q-Network): A classic value-based algorithm that uses an experience replay buffer and a target network to stabilize the learning of the Q-value function (see the target-computation sketch below).
  - Expected SARSA: An on-policy temporal-difference algorithm that improves on SARSA by computing the expected Q-value over all possible next actions, reducing update variance (see the backup sketch below).
- Function Approximation: Support for different models, including Neural Networks (nn) and Polynomial (poly) function approximators.
- Train & Visualize: A command-line interface to easily train new agents and visualize the performance of saved models.
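
For orientation, the snippet below shows the bare structure of a `gymnasium` environment like the one described above. The class name, action set, observation shape, and reward logic are illustrative assumptions, not the repository's actual definitions.

```python
import gymnasium as gym
import numpy as np

class HillClimbEnv(gym.Env):
    """Bare-bones gymnasium.Env skeleton; spaces and physics hooks are placeholders."""

    def __init__(self):
        super().__init__()
        self.action_space = gym.spaces.Discrete(3)  # e.g. gas, brake, idle (assumed)
        self.observation_space = gym.spaces.Box(
            low=-np.inf, high=np.inf, shape=(8,), dtype=np.float32
        )  # e.g. chassis angle/velocity, wheel contacts, terrain samples (assumed)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        # The real environment would rebuild the Box2D world and generate terrain here.
        obs = np.zeros(self.observation_space.shape, dtype=np.float32)
        return obs, {}

    def step(self, action):
        # The real environment would apply motor torque and advance the physics simulation here.
        obs = np.zeros(self.observation_space.shape, dtype=np.float32)
        reward, terminated, truncated = 0.0, False, False
        return obs, reward, terminated, truncated, {}
```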
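The PPO item above refers to the clipped surrogate objective. A minimal sketch of that loss, assuming PyTorch tensors of log-probabilities and advantages; the repository's actual training loop and hyperparameters may differ.

```python
import torch

def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate objective of PPO, returned as a loss to minimize."""
    ratio = torch.exp(new_log_probs - old_log_probs)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO maximizes the elementwise minimum of the two terms; negate for gradient descent.
    return -torch.min(unclipped, clipped).mean()
```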
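For the DQN item, a hedged sketch of how one-step TD targets are typically computed from a frozen target network over a replay batch; `target_net` and the tensor layout are assumptions for illustration.

```python
import torch

def dqn_td_targets(rewards, next_states, dones, target_net, gamma=0.99):
    """One-step targets r + gamma * max_a' Q_target(s', a'), zeroed at terminal states."""
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values  # greedy value from the frozen target network
    return rewards + gamma * (1.0 - dones.float()) * next_q
```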
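For Expected SARSA, the backup averages the next-state Q-values under the behavior policy instead of sampling a single next action. A small sketch assuming an epsilon-greedy policy; the repository may use a different exploration scheme.

```python
import numpy as np

def expected_sarsa_target(reward, q_next, epsilon, gamma=0.99):
    """Backup r + gamma * E_{a'~pi}[Q(s', a')] for an epsilon-greedy policy pi."""
    n_actions = len(q_next)
    probs = np.full(n_actions, epsilon / n_actions)   # exploration mass spread uniformly
    probs[np.argmax(q_next)] += 1.0 - epsilon         # extra mass on the greedy action
    return reward + gamma * float(np.dot(probs, q_next))
```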
Collaborators
Giovanni Billo, Andrea Suklan