Title: Reinforcement Learning with Ensemble Model Predictive Safety Certification
Authors: Gronauer, Sven; Haider, Tom; Schmoeller da Roza, Felippe; Diepold, Klaus
Type: conference paper
Year: 2024 (record dates: 2024-06-11, 2024-06-13)
License: CC BY 4.0
Language: English
DOI: 10.24406/h-469571 (https://doi.org/10.24406/h-469571)
Alternate DOI: 10.5555/3635637.3662925
Handle: https://publica.fraunhofer.de/handle/publica/469571
Keywords: reinforcement learning (RL); safe reinforcement learning (safe RL); safe exploration; predictive safety filter; model-based learning

Abstract: Reinforcement learning algorithms need exploration to learn. However, unsupervised exploration prevents the use of such algorithms on safety-critical tasks and limits their real-world deployment. In this paper, we propose a new algorithm, Ensemble Model Predictive Safety Certification, that combines model-based deep reinforcement learning with tube-based model predictive control to correct the actions taken by a learning agent, keeping safety constraint violations at a minimum through planning. Our approach reduces the amount of prior knowledge required about the actual system by needing only offline data generated by a safe controller. Our results show that we achieve significantly fewer constraint violations than comparable reinforcement learning methods.
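The abstract describes a predictive safety filter: an ensemble of learned dynamics models certifies each action proposed by the learning agent, and a safe backup controller corrects actions predicted to violate constraints. The following is a minimal illustrative sketch of that filtering idea, not the paper's actual algorithm; the 1-D double-integrator dynamics, the braking backup controller, and all names here are assumptions for demonstration.

```python
import numpy as np

# Toy system (assumed for illustration): 1-D double integrator,
# state = [position, velocity], safety constraint |position| <= 1.

def step(state, action, dt=0.1):
    pos, vel = state
    return np.array([pos + vel * dt, vel + action * dt])

def safe(state):
    return abs(state[0]) <= 1.0

def backup_action(state):
    # Hypothetical safe controller: brake toward zero velocity.
    return -2.0 * state[1]

def ensemble_rollout_safe(state, first_action, models, horizon=20):
    """Certify first_action: after applying it, every ensemble member must
    predict a trajectory that stays safe when the backup controller takes over."""
    for model in models:
        s = model(state, first_action)
        if not safe(s):
            return False
        for _ in range(horizon):
            s = model(s, backup_action(s))
            if not safe(s):
                return False
    return True

def filtered_action(state, proposed, models):
    # Keep the learner's action only if the ensemble certifies it;
    # otherwise fall back to the safe controller's action.
    if ensemble_rollout_safe(state, proposed, models):
        return proposed
    return backup_action(state)

# Stand-in ensemble: nominal dynamics plus perturbed copies (model uncertainty).
models = [lambda s, a, k=k: step(s, a * (1.0 + 0.1 * k)) for k in (-1, 0, 1)]

state = np.array([0.9, 0.5])  # near the constraint boundary, moving toward it
corrected = filtered_action(state, 1.0, models)  # aggressive learner action gets overridden
```

In this sketch the filter overrides the aggressive action (replacing it with the backup braking action) because the predicted rollout crosses the position constraint, while benign actions far from the boundary pass through unmodified; this matches the abstract's goal of keeping constraint violations at a minimum through planning.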