PyTorch implementations of algorithms from "Reinforcement Learning: An Introduction" by Sutton and Barto, along with various RL research papers.
Updated Aug 14, 2025 - Python
End-to-end RL trading framework with PPO agent, self-attention neural network, custom Gym environment, and advanced backtesting.
A complete collection of well-known deep RL algorithms, implemented in Gymnasium's most popular environments.
🚦 Next-generation AI Traffic Management System with real-time computer vision, reinforcement learning optimization, emergency vehicle detection, and immersive 3D visualization
This repository is dedicated to implementing algorithms "From Scratch". It goes beyond simple API calls, diving deep into the underlying logic of everything from basic training to cutting-edge techniques like DeepSeek-R1.
A project for S&P 500 trading with PPO.
2D orbital rocket sim with PPO in PyTorch. Models thrust, drag, gravity, fuel; agent learns efficient ascent. Includes telemetry & visualization
Reinforcement learning–based controller for balancing an inverted pendulum using Proximal Policy Optimization (PPO). Supports configurable mass, length, and gravity settings (Earth, lunar, microgravity) with automated training logs, reward visualization, and performance analysis.
This repository implements a Proximal Policy Optimization (PPO) agent that learns to play Super Mario Bros using TensorFlow/Keras and OpenAI Gym. Features CNNs for vision, Actor-Critic architecture, and parallel environments. Train your own Mario master or run a pre-trained one!
🐾 Implement Proximal Policy Optimization (PPO) for quadruped locomotion, achieving 96% of the performance of RSL-RL with a custom solution for enhanced robot control.
A legged locomotion project
A specialized Reinforcement Learning (RL) project focused on multi-task mastery across 10 distinct gaming environments. General-Gamer-AI-Lite implements a lightweight multi-task agent designed to learn shared representations and transfer knowledge between varied game mechanics, from classic arcade challenges to strategic grid worlds.
Stable Baselines3
AI agents for Trackmania using the TMRL package. Implements DDPG, PPO, and two SAC variants (with one or two critics) to train cars to navigate custom-built tracks.
Multi-modal RL trading agent (CNN + PPO) integrating market prices, macroeconomic indicators, and news signals. MSc dissertation artefact.
This Legal Document Analyzer is a proof-of-concept NLP project demonstrating the potential of transformers for legal document summarization.
Work in progress on a new variant of PPO, implemented in a predefined jvrc robot simulation with SLM model integration.
ClimatePredictor, implemented using Proximal Policy Optimization (PPO) with the Ray framework for a federated learning approach.
A cleaning-robot environment built on Gymnasium, with training via s-b3 PPO.
How close can LoRA get to full fine-tuning (FullFT) in terms of learning speed, performance, and compute tradeoffs? And under what conditions?
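Most of the repositories listed above build on PPO's clipped surrogate objective. As a rough illustration only (not taken from any listed repository), here is a minimal pure-Python sketch of that loss. The function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions made for this example; real implementations typically compute the same quantity on tensors and add value and entropy terms.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate loss over a batch (hypothetical helper).

    Returns the negated objective, so that minimizing the result
    maximizes the clipped surrogate.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the new and old policy for this action.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


if __name__ == "__main__":
    # The new policy doubles the probability of an action with positive advantage;
    # the clipped term caps how much credit that change receives.
    print(ppo_clip_loss([math.log(2.0)], [0.0], [1.0]))
```

When the ratio moves outside the clip range in the direction that would increase the objective, the clipped term takes over and removes the incentive for further change; moves that reduce the objective are not clipped.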