In this paper we present a novel methodology for recognizing human activity in egocentric video based on the Bag of Visual Features. The proposed technique rests on the assumption that only a portion of the whole video is sufficient to identify an activity. We further argue that, for activity recognition in egocentric videos, the proposed approach performs better than deep learning based methods, because in egocentric videos the person wearing the sensor often remains static for a long time or moves his or her head frequently. In both cases, it becomes difficult to learn the spatio-temporal pattern of the video during an action. The proposed approach divides the video into smaller segments called Video Units. Spatio-temporal features extracted from the units are clustered to construct a dictionary of Action Units (AUs). The AUs are ranked by a likeliness score, obtained by constructing a weighted graph with the AUs as vertices and edge weights computed from the frequencies of co-occurrence of the AUs during the activity. The less significant AUs are pruned from the dictionary, and the revised dictionary of key AUs is used for activity classification. We test our approach on a benchmark egocentric dataset and achieve good accuracy. © 2018 ACM.
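The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the clustering is a plain k-means stand-in, the function and parameter names (`build_au_dictionary`, `rank_and_prune_aus`, `n_aus`, `keep`) are invented for this sketch, and scoring a vertex by the sum of its edge weights is only an assumed proxy for the paper's likeliness score.

```python
import numpy as np

def build_au_dictionary(unit_features, n_aus, seed=0):
    """Cluster per-unit spatio-temporal feature vectors into Action Units
    (AUs) using a simple k-means (a stand-in for the paper's clustering).
    Returns the AU centres and the AU label assigned to each video unit."""
    rng = np.random.default_rng(seed)
    centers = unit_features[rng.choice(len(unit_features), n_aus, replace=False)].copy()
    labels = np.zeros(len(unit_features), dtype=int)
    for _ in range(20):
        # assign each video unit to its nearest AU centre
        dists = np.linalg.norm(unit_features[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # recompute each centre as the mean of its assigned units
        for k in range(n_aus):
            if (labels == k).any():
                centers[k] = unit_features[labels == k].mean(axis=0)
    return centers, labels

def rank_and_prune_aus(label_sequences, n_aus, keep):
    """Build a weighted graph with AUs as vertices; here an edge weight
    counts how often two AUs occur within the same activity sequence.
    Each AU is scored by its total edge weight (an assumed proxy for the
    likeliness score) and only the top `keep` AUs are retained."""
    w = np.zeros((n_aus, n_aus))
    for seq in label_sequences:
        present = np.unique(seq)
        for i in present:
            for j in present:
                if i != j:
                    w[i, j] += 1.0
    scores = w.sum(axis=1)
    # indices of the key AUs, highest score first
    return np.argsort(scores)[::-1][:keep]
```

As a usage sketch, features from the video units would first be clustered with `build_au_dictionary`; the resulting per-activity label sequences then feed `rank_and_prune_aus`, and the surviving key AUs form the revised dictionary used for classification.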