Frank Lewis, University of Texas at Arlington
Reinforcement learning structures for real-time optimal control and differential games

Jan 28, 2019, 2:00pm; EEB 132


This talk will discuss some new adaptive control structures for learning online the solutions to optimal control problems and multi-player differential games. Techniques from reinforcement learning are used to design a new family of adaptive controllers based on actor-critic mechanisms that converge in real time to optimal control and game theoretic solutions. Continuous-time systems are considered. Application of reinforcement learning to continuous-time (CT) systems has been hampered because the system Hamiltonian contains the full system dynamics. Using our technique known as Integral Reinforcement Learning (IRL), we will develop reinforcement learning methods that do not require knowledge of the system drift dynamics. In the linear quadratic (LQ) case, the new RL adaptive control algorithms learn the solution to the Riccati equation by adaptation along the system motion trajectories. In the case of nonlinear systems with general performance measures, the algorithms learn the (approximate smooth local) solutions of HJ or HJI equations. New algorithms will be presented for solving online the non zero-sum and zero-sum multi-player games. Each player maintains two adaptive learning structures, a critic network and an actor network. The result is an adaptive control system that learns based on the interplay of agents in a game, to deliver true online gaming behavior. A new Experience Replay technique is given that uses past data for present learning and significantly speeds up convergence. New methods of Off-policy Learning allow learning of optimal solutions without knowing any dynamic information. New RL methods in Optimal Tracking allow solution of the Output Regulator Equations for heterogeneous multi-agent systems.


Member, National Academy of Inventors. Fellow IEEE, Fellow IFAC, Fellow AAAS, Fellow U.K. Institute of Measurement & Control, PE Texas, U.K. Chartered Engineer. UTA Distinguished Scholar Professor, UTA Distinguished Teaching Professor, and Moncrief-O'Donnell Chair at The University of Texas at Arlington Research Institute. Qian Ren Thousand Talents Consulting Professor, Northeastern University, Shenyang, China. Ranked at position 84 worldwide, 64 in the USA, and 3 in Texas of all scientists in Computer Science and Electronics, by Guide2Research. Bachelor's Degree in Physics/EE and MSEE at Rice University, MS in Aeronautical Engineering at the University of Western Florida, Ph.D. at Georgia Tech. He works in feedback control, reinforcement learning, intelligent systems, and distributed control systems. Author of 7 U.S. patents, 410 journal papers, 426 conference papers, 20 books, 48 chapters, and 12 journal special issues. He received the Fulbright Research Award, NSF Research Initiation Grant, ASEE Terman Award, Int. Neural Network Soc. Gabor Award 2009, U.K. Inst. Measurement & Control Honeywell Field Engineering Medal 2009. Received AACC Ragazzini Education Award 2018, IEEE Computational Intelligence Society Neural Networks Pioneer Award 2012 and AIAA Intelligent Systems Award 2016. IEEE Control Systems Society Distinguished Lecturer. Project 111 Professor at Northeastern University, China. Distinguished Foreign Scholar at Chongqing Univ. China. Received Outstanding Service Award from Dallas IEEE Section, selected as Engineer of the Year by Ft. Worth IEEE Section. Listed in Ft. Worth Business Press Top 200 Leaders in Manufacturing. Received the 2010 IEEE Region 5 Outstanding Engineering Educator Award and the 2010 UTA Graduate Dean's Excellence in Doctoral Mentoring Award. Elected to UTA Academy of Distinguished Teachers 2012. Texas Regents Outstanding Teaching Award 2013. He served on the NAE Committee on Space Station in 1995.