An investigation on linear SVM and its variants for text categorization

M.A. Kumar; Madan Gopal

doi:10.1109/ICMLC.2010.64



Published in 2010


Pages: 27 - 31

Abstract

Linear Support Vector Machines (SVMs) have been used successfully to classify text documents into a set of concepts. With the increasing number of publicly available linear SVM formulations and decomposition algorithms, this paper studies their efficiency and efficacy for text categorization tasks. Eight publicly available implementations are investigated in terms of Break Even Point (BEP), F1 measure, ROC plots, learning speed, and sensitivity to the penalty parameter, based on experimental results on two benchmark text corpora. The results show that, of the eight implementations, SVMlin and Proximal SVM perform better in terms of consistent performance and reduced training time. However, being an extremely simple algorithm whose training time is independent of both the penalty parameter and the category being trained, Proximal SVM is particularly appealing. We further investigated fuzzy proximal SVM on both text corpora; it showed improved generalization over proximal SVM. © 2010 IEEE.
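The claim that Proximal SVM training time does not depend on the penalty parameter can be illustrated with a minimal sketch. It assumes the standard linear proximal SVM formulation of Fung and Mangasarian, in which training reduces to solving one small linear system; this is not necessarily the implementation benchmarked in the paper. The function names `train_proximal_svm` and `predict`, the parameter name `nu`, and the toy data are all hypothetical.

```python
import numpy as np

def train_proximal_svm(A, d, nu=1.0):
    """Fit a linear proximal SVM (illustrative sketch, not the paper's code).

    A  : (m, n) feature matrix, one document per row
    d  : (m,) labels in {-1, +1}
    nu : penalty parameter (assumed name)
    Returns (w, gamma) for the decision rule sign(x @ w - gamma).
    """
    m, n = A.shape
    # Augment features with a -1 column so the bias gamma is solved jointly.
    E = np.hstack([A, -np.ones((m, 1))])
    # Closed form: (I / nu + E^T E) z = E^T d, with z = [w; gamma].
    # nu only rescales the diagonal, so the cost of the solve is the
    # same for every value of nu.
    lhs = np.eye(n + 1) / nu + E.T @ E
    rhs = E.T @ d
    z = np.linalg.solve(lhs, rhs)
    return z[:n], z[n]

def predict(A, w, gamma):
    """Assign +1 or -1 depending on the side of the separating plane."""
    return np.where(A @ w - gamma >= 0, 1, -1)

# Toy, linearly separable data (made up for illustration).
A = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])

w, gamma = train_proximal_svm(A, d, nu=10.0)
preds = predict(A, w, gamma)
```

Because the system has only n + 1 unknowns, where n is the number of features, one solve replaces the iterative optimization used by standard SVM solvers. In a one-vs-rest text categorization setup only the right-hand side changes between categories, which is consistent with the abstract's remark that training time is independent of the category.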

Topics: Support vector machine (55%) and Statistical classification (51%)
