Despite recent advances in deep learning based techniques, automatic photo aesthetic assessment remains a challenging computer vision task. Existing approaches have focused on providing a single aesthetic score or category ("good" or "bad") for a photograph, rather than quantifying its "goodness" or "badness". They often ignore the importance of the different attributes that contribute to the artistic quality of a photograph. To make the aesthetic score of a photo human-interpretable, we advocate learning the aesthetic attributes along with predicting the general aesthetic score. We propose a multi-task deep CNN that jointly learns aesthetic attributes and a general aesthetic score for a photograph. To understand how the proposed model represents these attributes, we propose a visualization technique based on backpropagation of gradients. The resulting attribute visualizations correspond to the locations of objects in the images and show which part of an image "triggers" the classification outcome, thus providing insight into the model's understanding of these attributes. This paper also proposes an aesthetic feature vector based on the relative foreground position of the object in the image. The proposed aesthetic features outperform state-of-the-art methods, especially for the Rule of Thirds attribute. © 2020 The Institution of Engineering and Technology.
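As a rough illustration of what a feature vector based on relative foreground position might look like, the sketch below computes the normalized centroid of a binary foreground mask and its distances to the four Rule of Thirds intersection points. The abstract does not specify the actual feature definition, so the function names (`foreground_centroid`, `rule_of_thirds_features`), the use of a centroid, and the choice of Euclidean distance are all assumptions made for illustration only.

```python
from math import hypot

# Hypothetical illustration: the abstract does not define the feature vector,
# so the centroid and intersection-point distances below are assumptions.
# The four Rule of Thirds intersections, in normalized (x, y) coordinates.
POWER_POINTS = [(1 / 3, 1 / 3), (2 / 3, 1 / 3), (1 / 3, 2 / 3), (2 / 3, 2 / 3)]


def foreground_centroid(mask):
    """Return the (x, y) centroid of non-zero pixels, normalized to [0, 1]."""
    height = len(mask)
    width = len(mask[0])
    total = 0
    sum_x = 0.0
    sum_y = 0.0
    for row_idx, row in enumerate(mask):
        for col_idx, value in enumerate(row):
            if value:
                total += 1
                # Use pixel centers so the result is independent of resolution.
                sum_x += col_idx + 0.5
                sum_y += row_idx + 0.5
    if total == 0:
        raise ValueError("mask contains no foreground pixels")
    return sum_x / (total * width), sum_y / (total * height)


def rule_of_thirds_features(mask):
    """Return [cx, cy, d1, d2, d3, d4, min_d] for a binary foreground mask."""
    cx, cy = foreground_centroid(mask)
    distances = [hypot(cx - px, cy - py) for px, py in POWER_POINTS]
    return [cx, cy] + distances + [min(distances)]


# Example: a 6x6 mask whose 2x2 foreground block is centered on the
# upper-left intersection point.
mask = [[0] * 6 for _ in range(6)]
for r in (1, 2):
    for c in (1, 2):
        mask[r][c] = 1
features = rule_of_thirds_features(mask)
```

In this sketch, a small final value (the minimum distance) indicates that the foreground sits close to one of the intersection points, which is one plausible way to quantify adherence to the Rule of Thirds.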