Algorithms
AgileRL already includes state-of-the-art evolvable on-policy, off-policy, offline, and multi-agent reinforcement learning algorithms with distributed training. We are constantly adding more algorithms, with a view to adding hierarchical algorithms soon.
Reinforcement learning algorithms implemented:
- Conservative Q-Learning (CQL)
- Deep Deterministic Policy Gradient (DDPG)
- Deep Q-Learning (DQN)
- Rainbow DQN
- Implicit Language Q-Learning (ILQL)
- Proximal Policy Optimization (PPO)
- Twin Delayed Deep Deterministic Policy Gradient (TD3)
- Multi-Agent Deep Deterministic Policy Gradient (MADDPG)
- Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (MATD3)
- Neural Contextual Bandits with UCB-based Exploration (NeuralUCB)
- Neural Contextual Bandits with Thompson Sampling (NeuralTS)