Academic Journal

Stochastic Double Deep Q-Network

Bibliographic Details
Title: Stochastic Double Deep Q-Network
Authors: Pingli Lv, Xuesong Wang, Yuhu Cheng, Ziming Duan
Source: IEEE Access, Vol 7, Pp 79446-79454 (2019)
Publication Information: IEEE, 2019.
Publication Year: 2019
Collection: LCC:Electrical engineering. Electronics. Nuclear engineering
Subject Terms: Estimation bias, deep reinforcement learning, maximum operation, double estimator operation, stochastic combination, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Description: Estimation bias seriously affects the performance of reinforcement learning algorithms. The maximum operation may result in overestimation, while the double estimator operation often leads to underestimation. To eliminate the estimation bias, our proposed algorithm, the stochastic double deep Q-learning network (SDDQN), combines these two operations through random selection. A tabular version, stochastic double Q-learning (SDQ), is also given. Both SDDQN and SDQ are based on the double estimator framework. At each step, either the maximum operation or the double estimator operation is used, chosen with a probability determined by a random selection parameter. The theoretical analysis shows that there exists a value of the random selection parameter that makes SDDQN and SDQ unbiased. Experiments on Grid World and Atari 2600 games illustrate that the proposed algorithms balance the estimation bias effectively and improve performance.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2169-3536
Relation: https://ieeexplore.ieee.org/document/8736298Test/; https://doaj.org/toc/2169-3536Test
DOI: 10.1109/ACCESS.2019.2922706
Open Access: https://doaj.org/article/913e8e57aeaf4ff4b13066c47cd43dc2Test
Accession Number: edsdoj.913e8e57aeaf4ff4b13066c47cd43dc2
Database: Directory of Open Access Journals
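
The abstract describes the tabular variant (SDQ) only in words: within a double estimator framework, each step uses either the maximum operation or the double estimator operation, chosen by a random selection parameter. The following is a minimal Python sketch of one way that idea could look. It is not taken from the paper. The function names `sdq_target` and `sdq_update`, the parameter name `beta`, the list-of-lists Q-tables, the 50/50 choice of which table to update, and the exact form of both targets are assumptions made for illustration.

```python
import random


def sdq_target(q_a, q_b, next_state, reward, gamma, beta, done, rng=random):
    """Bootstrapped target for updating table q_a (illustrative sketch).

    Assumption: beta is the probability of using the maximum operation;
    otherwise the double estimator operation is used.
    """
    if done:
        return reward
    next_values = q_a[next_state]
    # Greedy action according to the table being updated.
    best = max(range(len(next_values)), key=next_values.__getitem__)
    if rng.random() < beta:
        # Maximum operation: evaluate the greedy action with the same table.
        bootstrap = next_values[best]
    else:
        # Double estimator operation: evaluate it with the other table.
        bootstrap = q_b[next_state][best]
    return reward + gamma * bootstrap


def sdq_update(q_a, q_b, state, action, reward, next_state, done,
               alpha=0.1, gamma=0.99, beta=0.5, rng=random):
    """Update one of the two tables in place, chosen with probability 0.5."""
    if rng.random() < 0.5:
        q_a, q_b = q_b, q_a  # swap roles, as in standard double Q-learning
    target = sdq_target(q_a, q_b, next_state, reward, gamma, beta, done, rng)
    q_a[state][action] += alpha * (target - q_a[state][action])
```

In this sketch, `beta=1.0` always uses the maximum operation and `beta=0.0` always uses the double estimator operation. The abstract states that some intermediate value makes the estimate unbiased, but it does not give that value, so none is assumed here.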