Stochastic control strategies and adaptive critic methods

Randa Herzallah*, David Lowe

*Corresponding author for this work

Research output: Chapter in Book/Published conference output › Conference publication

Abstract

Adaptive critic methods share common roots as generalizations of dynamic programming for neural reinforcement learning. Because they approximate dynamic programming solutions, they are potentially suitable for learning in noisy, nonlinear and nonstationary environments. In this study, a novel probabilistic dual heuristic programming (DHP) based adaptive critic controller is proposed. Unlike current approaches, the proposed probabilistic DHP adaptive critic method takes the uncertainties of the forward model and the inverse controller into consideration. It is therefore suitable for deterministic and stochastic control problems characterized by functional uncertainty. The theoretical development of the proposed method is validated by analytically evaluating the correct value of the cost function that satisfies the Bellman equation in a linear quadratic control problem. The target value of the critic network is then calculated and shown to be equal to the analytically derived correct value.
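The kind of check described in the abstract can be illustrated with a minimal sketch. It covers only the standard deterministic scalar linear quadratic case, not the paper's probabilistic formulation, and every name and number in it (`solve_riccati`, `lq_gain`, `dhp_target`, the values of `a`, `b`, `q`, `r`) is a hypothetical choice made for illustration. Assuming dynamics x' = a x + b u and stage cost q x^2 + r u^2, the cost-to-go is J(x) = p x^2, so a DHP critic, which approximates the derivative of J, should have the target 2 p x.

```python
# Illustrative sketch only: deterministic scalar LQ problem, standard DHP target.
# All function names and parameter values are assumptions, not from the paper.

def solve_riccati(a, b, q, r, iters=1000):
    """Fixed-point iteration of the scalar discrete-time Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p

def lq_gain(a, b, p, r):
    """Optimal state-feedback gain k for the control law u = -k * x."""
    return a * b * p / (r + b * b * p)

def dhp_target(x, a, b, q, r, p, k):
    """DHP critic target: total derivative of (stage cost + next cost-to-go) w.r.t. x."""
    u = -k * x
    x_next = a * x + b * u
    lam_next = 2.0 * p * x_next      # exact critic output at the next state
    dU_dx = 2.0 * q * x              # direct effect of the state on the stage cost
    dU_du = 2.0 * r * u              # effect of the control on the stage cost
    du_dx = -k                       # controller sensitivity
    dxnext_dx = a + b * du_dx        # closed-loop model sensitivity
    return dU_dx + dU_du * du_dx + lam_next * dxnext_dx

a, b, q, r = 0.9, 0.5, 1.0, 0.1      # hypothetical plant and cost parameters
p = solve_riccati(a, b, q, r)
k = lq_gain(a, b, p, r)
x = 1.5
analytic = 2.0 * p * x               # derivative of J(x) = p * x**2
target = dhp_target(x, a, b, q, r, p, k)
print(abs(target - analytic) < 1e-9)
```

In this sketch the agreement follows from the Bellman equation p = q + r k^2 + p (a - b k)^2, which holds at the Riccati solution; the probabilistic method in the paper additionally accounts for model and controller uncertainty, which this example omits.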

Original language: English
Title of host publication: Proceedings of the fifth international conference on informatics in control, automation and robotics
Subtitle of host publication: Funchal, Madeira, May 11 - 15, 2008. Robotics and automation
Editors: Filipe Joaquim
Place of Publication: (PT)
Pages: 281-288
Number of pages: 8
Publication status: Published - 2008
Event: 5th International Conference on Informatics in Control, Automation and Robotics - Madeira, Portugal
Duration: 11 May 2008 to 15 May 2008

Conference

Conference: 5th International Conference on Informatics in Control, Automation and Robotics
Abbreviated title: ICINCO 2008
Country/Territory: Portugal
City: Madeira
Period: 11/05/08 to 15/05/08

Keywords

  • adaptive critic methods
  • functional uncertainty
  • stochastic control
