Academic Journal

Universal Reinforcement Learning.

Bibliographic Details
Title: Universal Reinforcement Learning.
Authors: Farias, Vivek F. (vivekf@mit.edu); Moallemi, Ciamac C. (ciamac@gsb.columbia.edu); Van Roy, Benjamin (bvr@stanford.edu); Weissman, Tsachy (tsachy@stanford.edu)
Source: IEEE Transactions on Information Theory. May 2010, Vol. 56, Issue 5, pp. 2441-2454. 14 pp.
Subject Terms: *CODING theory, *DATA transmission systems, *DATA compression (Telecommunication), *ALGORITHMS, *REINFORCEMENT learning
Abstract: We consider an agent interacting with an unmodeled environment. At each time, the agent makes an observation, takes an action, and incurs a cost. Its actions can influence future observations and costs. The goal is to minimize the long-term average cost. We propose a novel algorithm, known as the active LZ algorithm, for optimal control based on ideas from the Lempel-Ziv scheme for universal data compression and prediction. We establish that, under the active LZ algorithm, if there exists an integer K such that the future is conditionally independent of the past given a window of K consecutive actions and observations, then the average cost converges to the optimum. Experimental results involving the game of Rock-Paper-Scissors illustrate merits of the algorithm. [ABSTRACT FROM AUTHOR]
Database: Academic Search Index
Description
ISSN: 0018-9448
DOI: 10.1109/TIT.2010.2043762