Training the Continuous Lunar Lander with Reinforcement Learning, RLlib and PPO
For an upcoming blog post, I would like a robotic arm to land a Lunar Lander autonomously. In [Part 1](/post/2020/5/30/coding-ai-rl-iot-xbox-robot-arm-creation) I already explained how to build such a robotic arm. Now we need to go deeper into how to train an agent in a simulated environment before deploying it on a physical device.