Combining Model-Based Design and Model-Free Policy Optimization to Learn Safe, Stabilizing Controllers

Bibliographic Details
Title: Combining Model-Based Design and Model-Free Policy Optimization to Learn Safe, Stabilizing Controllers
Authors: Ayush Agrawal, Koushil Sreenath, Fernando Castañeda, S. Shankar Sastry, Tyler Westenbroek
Source: ADHS
Publication Info: Elsevier BV, 2021.
Publication Year: 2021
Subject Terms: Pointwise, Operating point, Mathematical optimization, Optimization problem, Control and Systems Engineering, Computer science, Model-based design, Stability (learning theory), Reinforcement learning, Penalty method, Control-Lyapunov function
Description: This paper introduces a framework for learning a safe, stabilizing controller for a system with unknown dynamics using model-free policy optimization algorithms. Using a nominal dynamics model, the user specifies a candidate Control Lyapunov Function (CLF) around the desired operating point, and specifies the desired safe set using a Control Barrier Function (CBF). Using penalty methods from the optimization literature, we then develop a family of policy optimization problems which attempt to minimize control effort while satisfying the pointwise constraints used to specify the CLF and CBF. We demonstrate that when the penalty terms are scaled correctly, the optimization prioritizes the maintenance of safety over stability, and stability over optimality. We discuss how standard reinforcement learning algorithms can be applied to the problem, and validate the approach through simulation. We then illustrate how the approach can be applied to a class of hybrid models commonly used in the dynamic walking literature, and use it to learn safe, stable walking behavior over a randomly spaced sequence of stepping stones.
ISSN: 2405-8963
Open Access: https://explore.openaire.eu/search/publication?articleId=doi_________::fed9a461be73867f7ba65c64aaf61e5c
https://doi.org/10.1016/j.ifacol.2021.08.468
Rights: OPEN
Accession Number: edsair.doi...........fed9a461be73867f7ba65c64aaf61e5c
Database: OpenAIRE
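
Illustrative note: the abstract above describes a penalty formulation in which control effort is minimized while violations of the pointwise CLF and CBF constraints are penalized, with the safety penalty weighted most heavily. The following is a minimal sketch of that idea only, not the paper's actual formulation. Everything in it is an assumption made for illustration: the scalar nominal model x_dot = a*x + u, the choices V(x) = 0.5*x^2 and h(x) = x_max - x, the hinge-style penalty, and all names and default weights (penalized_cost, lam_clf, lam_cbf, c, alpha).

```python
def hinge(z):
    """Return the amount by which a constraint 'z <= 0' is violated."""
    return max(0.0, z)


def penalized_cost(x, u, a=0.5, x_max=1.0, c=1.0, alpha=1.0,
                   lam_clf=10.0, lam_cbf=1000.0):
    """Pointwise cost: control effort plus weighted CLF/CBF violations.

    All quantities are hypothetical and chosen for illustration.
    """
    # Assumed nominal scalar model: x_dot = a*x + u
    x_dot = a * x + u

    # Candidate CLF V(x) = 0.5*x^2; constraint: V_dot + c*V <= 0
    v = 0.5 * x * x
    v_dot = x * x_dot
    clf_violation = hinge(v_dot + c * v)

    # CBF h(x) = x_max - x (safe set: x <= x_max);
    # constraint: h_dot + alpha*h >= 0
    h = x_max - x
    h_dot = -x_dot
    cbf_violation = hinge(-(h_dot + alpha * h))

    # lam_cbf >> lam_clf >> 1 so that safety outranks stability,
    # and stability outranks control effort.
    return u * u + lam_clf * clf_violation + lam_cbf * cbf_violation


if __name__ == "__main__":
    x = 0.5
    for u in (-1.0, 0.0, 1.0):
        print(f"u = {u:+.1f}  cost = {penalized_cost(x, u):.3f}")
```

In this toy setting, an input that satisfies both constraints is charged only for its effort, an input that violates only the CLF condition pays a moderate penalty, and an input that also violates the CBF condition pays by far the largest cost. In a policy optimization setting, a cost of this shape would be accumulated along trajectories and handed to a standard reinforcement learning algorithm.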