Sean Meyn, University of Florida
Reinforcement learning: hidden theory, and new super-fast algorithms

Feb 21, 2018, 2:00pm; EEB 132 (note day change)
Cross-listed with CCI-MHI Joint Seminar Series on Cyber-Physical Systems

Abstract

Stochastic approximation algorithms are used to approximate solutions to fixed-point equations that involve expectations of functions with respect to possibly unknown distributions. The most famous examples today are the TD-learning and Q-learning algorithms. The first half of this lecture will provide an overview of stochastic approximation, with a focus on optimizing the rate of convergence. A new approach to this optimization leads to the new Zap Q-learning algorithm. Analysis suggests that its transient behavior closely matches that of a deterministic Newton-Raphson implementation, and numerical experiments confirm super-fast convergence.
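
For readers new to the topic, the basic stochastic approximation recursion can be sketched as follows. This is a minimal illustration of the classical Robbins-Monro scheme with step size 1/n, not the Zap Q-learning algorithm from the talk. The names robbins_monro and noisy_f, the Gaussian noise model, and the target value 2.0 are all assumptions chosen for illustration. The fixed-point equation here is theta = E[W], which is the root of E[W - theta] = 0.

```python
import random


def robbins_monro(noisy_f, theta0, n_steps, seed=0):
    """Scalar Robbins-Monro recursion: theta <- theta + (1/n) * f(theta, W_n).

    noisy_f(theta, rng) returns one noisy sample of f(theta, W).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    theta = theta0
    for n in range(1, n_steps + 1):
        step = 1.0 / n  # classical vanishing step size
        theta += step * noisy_f(theta, rng)
    return theta


def noisy_f(theta, rng):
    # Hypothetical example: f(theta, W) = W - theta with W ~ N(2, 1),
    # so E[f(theta, W)] = 2 - theta and the root is theta* = 2.
    w = rng.gauss(2.0, 1.0)
    return w - theta


estimate = robbins_monro(noisy_f, theta0=0.0, n_steps=20000)
print(f"estimate of theta*: {estimate:.4f}")
```

With this step size and this choice of f, the iterate equals the running sample mean of the observations, so the estimate approaches 2 as the number of steps grows. The choice of step size (gain) governs how quickly that happens, which is the question the abstract highlights.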

Biosketch

Sean Meyn received the BA degree in mathematics from the University of California, Los Angeles, in 1982 and the PhD degree in electrical engineering from McGill University, Canada, in 1987 (with Prof. P. Caines). He is now Professor and Robert C. Pittman Eminent Scholar Chair in the Department of Electrical and Computer Engineering at the University of Florida, director of the Laboratory for Cognition and Control, and director of the Florida Institute for Sustainable Energy. His academic research interests include theory and applications of decision and control, stochastic processes, and optimization. He has received many awards for his research on these topics, and is a fellow of the IEEE. He has held visiting positions at universities all over the world, including the Indian Institute of Science, Bangalore, during 1997-1998, where he was a Fulbright Research Scholar. During his most recent sabbatical, in the 2006-2007 academic year, he was a visiting professor at MIT and at United Technologies Research Center (UTRC). His award-winning 1993 monograph with Richard Tweedie, Markov Chains and Stochastic Stability, has been cited thousands of times in journals from a range of fields. The latest version is published in the Cambridge Mathematical Library. For the past ten years his applied research has focused on engineering, markets, and policy in energy systems. He regularly participates in industry, government, and academic panels on these topics, and hosts an annual workshop at the University of Florida.