2004
Conference Paper
Title
Text classification of news articles with support vector machines
Abstract
Support Vector Machines (SVMs) can classify objects described by a very high-dimensional feature vector. This allows them to use the counts of the different words in a document, i.e. more than 100,000 features, directly for classification. In this paper we describe the results of a large number of experiments with different preprocessing strategies for generating effective input features. It turns out that n-grams of syllables and phonemes are especially effective for classification.
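As an illustration only, the two kinds of input features mentioned in the abstract (word counts and n-grams over sub-word units) might be sketched as follows. This is not the paper's preprocessing pipeline: the function names `word_counts` and `ngram_counts` are hypothetical, and the syllable list is segmented by hand, since the abstract does not say how syllables or phonemes were obtained.

```python
import re
from collections import Counter


def word_counts(text):
    """Sparse bag-of-words: map each lowercased word to its count."""
    return Counter(re.findall(r"\w+", text.lower()))


def ngram_counts(units, n):
    """Count contiguous n-grams over a sequence of units (e.g. syllables)."""
    return Counter(tuple(units[i:i + n]) for i in range(len(units) - n + 1))


doc = "The market rose as the bank cut rates"
words = word_counts(doc)

# Assumption: hand-segmented syllables. A real system would need a
# syllabifier or phoneme transcriber, which is not shown here.
syllables = ["mar", "ket", "rose", "mar", "ket"]
bigrams = ngram_counts(syllables, 2)
```

Each `Counter` stores only the entries that actually occur in a document, so a vocabulary of more than 100,000 words still yields a short sparse vector per article. Vectors of this kind could then be passed to an SVM implementation of the reader's choice.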