This article addresses the problem of adaptive control of nonlinear chemical processes with time-varying dynamics. Two algorithms from the reinforcement learning (RL) framework, Q-learning and policy iteration, are considered. The performance of the two algorithms is tested on a highly nonlinear simulated continuous stirred tank reactor (CSTR). Comparison with conventional methods shows that the RL techniques achieve better control performance and greater robustness against uncertainties. The policy-iteration algorithm converges faster and is more robust than Q-learning. Copyright © 2010 Curtin University of Technology and John Wiley & Sons, Ltd.
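For readers unfamiliar with the Q-learning side of the comparison, the sketch below shows a minimal tabular Q-learning loop for setpoint regulation. It is an illustration only, not the article's implementation: the toy first-order plant (standing in for the CSTR), the error discretization, the discrete control increments, and the reward are all assumptions introduced here for the example.

```python
import numpy as np

# Hypothetical setup (not the article's CSTR model): the state is the
# discretized tracking error and the action is a small control increment.
rng = np.random.default_rng(0)

N_STATES = 21                          # error bins over [-1, 1]
ACTIONS = np.array([-0.1, 0.0, 0.1])   # assumed control increments
ALPHA, GAMMA, EPS = 0.2, 0.95, 0.1     # step size, discount, exploration

def discretize(err):
    """Map a continuous tracking error in [-1, 1] to a bin index."""
    return int(np.clip((err + 1.0) / 2.0 * (N_STATES - 1), 0, N_STATES - 1))

def step(err, u):
    """Toy plant: the error decays, is pushed by the control move,
    and is perturbed by small process noise."""
    next_err = np.clip(0.9 * err - u + 0.01 * rng.standard_normal(), -1.0, 1.0)
    reward = -abs(next_err)            # penalize deviation from the setpoint
    return next_err, reward

Q = np.zeros((N_STATES, len(ACTIONS)))
for episode in range(300):
    err = rng.uniform(-1, 1)
    for t in range(50):
        s = discretize(err)
        # epsilon-greedy action selection
        if rng.random() < EPS:
            a = int(rng.integers(len(ACTIONS)))
        else:
            a = int(Q[s].argmax())
        err, r = step(err, ACTIONS[a])
        s2 = discretize(err)
        # Q-learning update: bootstrap on the greedy next-state value
        Q[s, a] += ALPHA * (r + GAMMA * Q[s2].max() - Q[s, a])
```

Policy iteration, by contrast, alternates full policy-evaluation and policy-improvement sweeps over a model of the process, which is one reason it can converge in fewer iterations than the sample-by-sample Q-learning update above.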