This paper presents an intelligent wind-speed-sensorless maximum power point tracking (MPPT) method for a variable-speed wind energy conversion system (VS-WECS) based on the Q-Learning algorithm. Q-Learning maintains a Q-value for each state-action pair, which is updated using a reward and a learning rate. The states are defined by two inputs: the electrical power received by the grid and the rotational speed of the generator. In this paper, Q-Learning is equipped with a peak detection technique that drives the system toward peak power even when learning is incomplete, which makes real-time tracking faster. To make learning uniform, each state has its own learning parameter instead of the single learning parameter shared by all states in conventional Q-Learning. Therefore, if a partially trained system is operating at the peak point, the learning of unvisited states is not affected. In addition, wind speed change detection is combined with the proposed algorithm, which enables it to operate under varying wind speed conditions. Neither knowledge of the wind turbine characteristics nor wind speed measurement is required. The algorithm is verified through simulations and experiments and is compared with the perturbation and observation (P&O) algorithm. © 1986-2012 IEEE.
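As an illustration of the per-state learning parameter idea, the following is a minimal sketch of a tabular Q-Learning update in which each state keeps its own learning rate. It is not the authors' implementation: the action set, the bin widths used to discretize power and generator speed, the decay schedule, and all names (`PerStateQLearner`, `discretize`, `alpha0`, `decay`, `gamma`) are assumptions made for this example, and peak detection and wind speed change detection are not modeled.

```python
from collections import defaultdict

# Hypothetical action set: lower, hold, or raise the generator speed reference.
ACTIONS = (-1, 0, +1)


def discretize(power_w, speed_rpm, power_bin=100.0, speed_bin=50.0):
    """Map measured power and speed to a discrete state (bin widths are assumed)."""
    return (int(power_w // power_bin), int(speed_rpm // speed_bin))


class PerStateQLearner:
    """Tabular Q-Learning in which every state has its own learning rate."""

    def __init__(self, alpha0=0.5, decay=0.9, gamma=0.9):
        # Q-values default to zero for every action of a new state.
        self.q = defaultdict(lambda: [0.0] * len(ACTIONS))
        # Each state starts from alpha0; only visited states decay.
        self.alpha = defaultdict(lambda: alpha0)
        self.decay = decay
        self.gamma = gamma

    def update(self, state, action_idx, reward, next_state):
        a = self.alpha[state]
        # Standard Q-Learning target: reward plus discounted best next value.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action_idx] += a * (target - self.q[state][action_idx])
        # Assumed schedule: shrink the learning rate of the visited state only,
        # so states that have not been visited keep their full learning rate.
        self.alpha[state] = a * self.decay


learner = PerStateQLearner()
s = discretize(350.0, 420.0)       # state (3, 8)
s_next = discretize(410.0, 460.0)  # state (4, 9)
learner.update(s, ACTIONS.index(+1), reward=1.0, next_state=s_next)
```

Because the learning rate is stored per state, repeated visits to the state at the peak point reduce only that state's rate; a state that has not yet been visited still learns at `alpha0` when it is first encountered.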