Fraunhofer-Gesellschaft

Publica

Here you will find scientific publications from the Fraunhofer Institutes.

Text classification of news articles with support vector machines

Paaß, G.; Kindermann, J.; Leopold, E.

Sirmakessis, S.:
Text mining and its applications : Results of the NEMIS Launch Conference
Berlin: Springer, 2004 (Studies in fuzziness and soft computing 138)
ISBN: 3-540-20238-2
ISBN: 978-3-540-20238-7
pp. 53-64
NEMIS Launch Conference <1, 2003, Patrai>
International Workshop on Text Mining and its Applications <1, 2003, Patrai>
English
Conference Paper
Fraunhofer AIS (IAIS)
text mining; press archive; kernel classifier

Abstract
Support Vector Machines (SVMs) can classify objects described by an effectively infinite-dimensional feature vector. This allows them to use the counts of the different words in a document, i.e. counts for more than 100,000 distinct words, directly for classification. In this paper we describe the results of a large number of experiments with different preprocessing strategies for generating effective input features. It turns out that n-grams of syllables and phonemes are especially effective for classification.
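
The kind of input feature mentioned in the abstract, counts of n-grams over sub-word units, can be illustrated with a minimal Python sketch. This is not the authors' preprocessing pipeline: the function name `ngram_counts` and the hand-segmented syllable list are hypothetical, and the abstract does not say how syllables or phonemes were obtained. The resulting sparse counts are the sort of high-dimensional vector that could then be passed to an SVM.

```python
from collections import Counter
from typing import Sequence


def ngram_counts(units: Sequence[str], n: int) -> Counter:
    """Count every contiguous n-gram in a sequence of units.

    `units` may be words, syllables, or phonemes; the segmentation
    itself is assumed to have been done beforehand.
    """
    return Counter(tuple(units[i:i + n]) for i in range(len(units) - n + 1))


# Hypothetical, hand-segmented syllables (illustration only).
syllables = ["clas", "si", "fi", "ca", "tion", "clas", "si", "fy"]

# Sparse feature vector: each distinct syllable bigram is one dimension.
features = ngram_counts(syllables, 2)
```

Each distinct n-gram becomes one dimension of the feature vector, so the dimensionality grows quickly with `n` while each individual document stays sparse.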

http://publica.fraunhofer.de/documents/N-69622.html