Monitoring tree cover plays an important role in a wide range of applications, and advances in UAV technology have made it feasible to capture high-resolution imagery for this purpose. In this study, we adopt a state-of-the-art object detector, Mask Region-based CNN (Mask R-CNN¹), through transfer learning, for the task of tree segmentation and counting. One bottleneck for this task is the large amount of training data needed if the model is to scale across different geographical regions. To address this, we explore a sampling technique based on Gist descriptors and Gabor filtering that minimizes the amount of training data required to obtain strong model performance across images with varied geographical features. The study was conducted across four regions in India, each with a different geographical landscape. We captured a total of 2357 images across the four regions. The final training dataset comprised 48 images, sampled using this method to be representative of the entire dataset. Our method demonstrates high-quality and scalable tree detection results. © COPYRIGHT SPIE. Downloading of the abstract is permitted for personal use only.
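
The abstract does not spell out how the Gist-based sampling works, so the following is only a minimal sketch of one way such a step could look, not the authors' implementation. It assumes grayscale images, a small Gabor filter bank averaged over a 4×4 grid as a Gist-like descriptor, and greedy farthest-point selection as the sampling rule. The function names, filter parameters, and the selection rule are all illustrative assumptions.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Real (cosine) Gabor kernel: Gaussian envelope times an oriented sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    return kernel - kernel.mean()  # zero mean: flat regions give no response


def gist_like_descriptor(image, wavelengths=(4.0, 8.0), n_orientations=4, grid=4):
    """Mean absolute Gabor response per grid cell, concatenated over the bank.

    Assumed parameters (not from the paper): 2 scales x 4 orientations x 4x4 grid.
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    spectrum = np.fft.fft2(image)
    features = []
    for wavelength in wavelengths:
        size = int(4 * wavelength) | 1  # force an odd kernel size
        for k in range(n_orientations):
            theta = np.pi * k / n_orientations
            kernel = gabor_kernel(size, wavelength, theta, sigma=0.5 * wavelength)
            # Circular convolution via the FFT; the kernel is zero-padded to the
            # image size, which is adequate for coarse grid averages.
            response = np.abs(np.fft.ifft2(spectrum * np.fft.fft2(kernel, s=(h, w))))
            # Crop so both dimensions divide evenly into the grid, then average cells.
            cropped = response[: h - h % grid, : w - w % grid]
            cells = cropped.reshape(
                grid, cropped.shape[0] // grid, grid, cropped.shape[1] // grid
            )
            features.append(cells.mean(axis=(1, 3)).ravel())
    return np.concatenate(features)


def select_representatives(descriptors, n_samples):
    """Greedy farthest-point sampling in descriptor space (assumed selection rule)."""
    descriptors = np.asarray(descriptors, dtype=float)
    n_samples = min(n_samples, len(descriptors))
    # Start from the image closest to the dataset mean.
    center = descriptors.mean(axis=0)
    first = int(np.argmin(np.linalg.norm(descriptors - center, axis=1)))
    chosen = [first]
    min_dist = np.linalg.norm(descriptors - descriptors[first], axis=1)
    min_dist[first] = -np.inf  # never pick the same image twice
    while len(chosen) < n_samples:
        nxt = int(np.argmax(min_dist))  # image farthest from everything chosen so far
        chosen.append(nxt)
        dist_to_new = np.linalg.norm(descriptors - descriptors[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
        min_dist[nxt] = -np.inf
    return chosen
```

Under these assumptions, one would compute a descriptor for each of the 2357 captured images and call `select_representatives(descriptors, 48)` to obtain the indices of a training subset that spreads across the visual variety of the four regions.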