Deep Reinforcement Learning

Alien Cat and Mouse

Testing agent adaptability to environment variations and obstacles using the Proximal Policy Optimization (PPO) deep reinforcement learning algorithm.
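
PPO appears in most of the projects below, so here is a minimal sketch of its core idea, the clipped surrogate objective. It is an illustration in plain Python and not the training code used for these projects; the function name `ppo_clip_objective` and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
def ppo_clip_objective(new_probs, old_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_probs / old_probs: probability of each taken action under the
    current policy and under the policy that collected the data.
    advantages: advantage estimate for each of those actions.
    """
    total = 0.0
    for p_new, p_old, adv in zip(new_probs, old_probs, advantages):
        ratio = p_new / p_old
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Hypothetical numbers: the action became twice as likely.
    print(ppo_clip_objective([0.5], [0.25], [1.0]))
```

The clipping limits how much a single update can gain from moving the policy far from the one that gathered the data, which is one reason PPO tends to stay stable over long runs.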

Self Driving Agent

AWS DeepRacer League Challenge - Virtual Circuit World Tour

Best Average Time: 14.529 seconds (362/1983)

Super Mario Bros

Trained a Mario AI agent for 5 million time-steps using PPO (Proximal Policy Optimization).


Doom

Trained an AI agent for 600k time-steps using PPO (Proximal Policy Optimization) to play Doom (Deadly Corridor).


ClayFighter

After training for 5 million time-steps, AI Bad Mr. Frosty learned he could win by just throwing kicks nonstop.


Breakout

Training an agent to play Breakout across multiple environments using the A2C (Advantage Actor-Critic) algorithm.
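
As a rough illustration of what A2C computes when it collects experience from several environments at once, the sketch below derives per-environment advantages from a short rollout. It is a simplified example and not the code used for this project: the name `a2c_advantages`, the list-of-lists layout, and the default `gamma` are assumptions, and episode terminations are ignored for brevity.

```python
def a2c_advantages(rewards, values, bootstrap_values, gamma=0.99):
    """Discounted-return advantages for a rollout over parallel environments.

    rewards[t][e], values[t][e]: reward and critic estimate at step t in
    environment e. bootstrap_values[e]: critic estimate for the state
    reached after the last step in environment e.
    """
    n_steps = len(rewards)
    n_envs = len(bootstrap_values)
    running = list(bootstrap_values)
    returns = [[0.0] * n_envs for _ in range(n_steps)]
    # Walk backwards so each return reuses the one after it.
    for t in reversed(range(n_steps)):
        for e in range(n_envs):
            running[e] = rewards[t][e] + gamma * running[e]
            returns[t][e] = running[e]
    # Advantage = how much better the return was than the critic expected.
    return [
        [returns[t][e] - values[t][e] for e in range(n_envs)]
        for t in range(n_steps)
    ]


if __name__ == "__main__":
    # Hypothetical two-step rollout over two environments.
    print(a2c_advantages([[1, 0], [1, 0]], [[0.5, 0], [0.5, 0]], [0, 0], gamma=0.5))
```

Running several environments side by side gives the actor and critic a more varied batch per update than a single environment would.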

Lunar Lander

Trained an agent using the PPO (Proximal Policy Optimization) algorithm to land on the moon.


Fatal Fury 2

I trained a reinforcement learning agent to battle in the Fatal Fury 2 tournament and become the ML King of Fighters. After training for 20 million time steps, the agent was able to reach level 7.

I resumed training for another 10 million time steps, and the agent reached level 10. However, its fighting style shifted to an exploit of crouching and kicking, much like that of AI Bad Mr. Frosty from my previous ClayFighter training experiments.

It seems the nonstop kicking method is a very popular Artificial Intelligence martial arts fighting technique. If robots ever decide to take over the world using Deep Reinforcement Learning, watch out for those kicks!



Mortal Kombat II

Trained a Liu Kang AI agent using deep reinforcement learning (PPO) to play Mortal Kombat II. The agent wins, but more training is needed to make it an unstoppable flawless victory machine!

Ms. Pac-Man

Training a Ms. Pac-Man AI agent using reinforcement learning (PPO) to outsmart the ghosts and maximize its score through trial and error.

Flappy Bird

Trained an AI agent using the DQN (Deep Q-Network) deep reinforcement learning algorithm to play Flappy Bird, then processed the animation with a diffusion model.

