Finite-Time Analysis for Double Q-learning accepted for spotlight presentation at NeurIPS 2020.

  • Dec 2020

Our paper, “Finite-Time Analysis for Double Q-learning,” has been accepted for a spotlight presentation at NeurIPS 2020! Only 280 of the 1900 accepted papers were selected for spotlight presentations, out of a record-breaking 9454 submissions. In this paper, we provide the first non-asymptotic (i.e., finite-time) analysis of double Q-learning. We show that both synchronous and asynchronous double Q-learning are guaranteed to converge to an epsilon-accurate neighborhood of the global optimum within a finite number of iterations. Our analysis develops novel techniques for deriving finite-time bounds on the difference between two interconnected stochastic processes; these techniques are new to the stochastic approximation literature. Paper available at
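
For readers unfamiliar with the algorithm, here is a minimal sketch of the standard tabular double Q-learning update in its synchronous form, where every state-action pair is updated once per iteration. It is an illustration only and not the paper's implementation: the two-state, two-action toy MDP, its deterministic dynamics and rewards, the constant step size, and the names `double_q_update` and `run_synchronous` are all assumptions made for this example.

```python
import random


def double_q_update(q_a, q_b, s, a, r, s_next, alpha, gamma, rng):
    """Apply one tabular double Q-learning update in place."""
    # Choose which of the two estimators to update, each with probability 1/2.
    if rng.random() < 0.5:
        learner, evaluator = q_a, q_b
    else:
        learner, evaluator = q_b, q_a
    # The updated table selects the greedy action; the other table evaluates it.
    best = max(range(len(learner[s_next])), key=lambda u: learner[s_next][u])
    target = r + gamma * evaluator[s_next][best]
    learner[s][a] += alpha * (target - learner[s][a])


def run_synchronous(num_iters=5000, alpha=0.05, gamma=0.9, seed=0):
    """Run the synchronous variant on a hypothetical toy MDP."""
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    q_a = [[0.0] * n_actions for _ in range(n_states)]
    q_b = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(num_iters):
        # Synchronous: every (state, action) pair receives one sample per iteration.
        for s in range(n_states):
            for a in range(n_actions):
                s_next = a  # assumed toy dynamics: action a leads to state a
                r = 1.0 if (s == 1 and a == 1) else 0.0  # assumed toy reward
                double_q_update(q_a, q_b, s, a, r, s_next, alpha, gamma, rng)
    return q_a, q_b


if __name__ == "__main__":
    q_a, q_b = run_synchronous()
    print("Q^A:", q_a)
    print("Q^B:", q_b)
```

In this toy problem the optimal value of staying in state 1 with action 1 is 1 / (1 - 0.9) = 10, so both tables can be compared against that target. An asynchronous variant would instead update only the single state-action pair visited along a sampled trajectory at each step.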
