Estimating scene flow in RGB-D videos is attracting much interest of the computer vision researchers, due to its potential applications in robotics. The state-of-the-art techniques for scene flow estimation typically rely on the knowledge of scene structure of the frame and the correspondence between frames. However, with the availability of large RGB-D data captured from depth sensors, learning representations for estimation of scene flow has become possible. This paper introduces a first effort to apply a deep learning method for direct estimation of scene flow by presenting a fully convolutional neural network with an encoder-decoder (ED) architecture. The proposed network SceneEDNet involves estimation of three dimensional motion vectors of all the scene points from sequence of stereo images. The training for direct estimation of scene flow is done using consecutive pairs of stereo images and corresponding scene flow ground truth. © 2018 IEEE.