2017
Journal article
Title

HASKER: An efficient algorithm for string kernels. Application to polarity classification in various languages

Abstract
String kernels have successfully been used for various NLP tasks, ranging from text categorization by topic to native language identification. In this paper, we present a simple and efficient algorithm for computing various spectrum string kernels. When comparing two strings, we store the p-grams in the first string into a hash table, and then we apply a hash table lookup for the p-grams that occur in the second string. In terms of time, we show that our algorithm can outperform a state-of-the-art tool for computing string similarity. In terms of accuracy, we show that our approach can reach state-of-the-art performance for polarity classification in various languages. Our efficient implementation is provided online for free at http://string-kernels.herokuapp.com.
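The hash-table idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `pgram_counts`, `spectrum_kernel`, and `intersection_kernel` are hypothetical, and the kernel definitions used here (sum of products of p-gram counts, and sum of minimum p-gram counts) are the standard textbook forms, assumed rather than taken from the paper.

```python
from collections import Counter


def pgram_counts(s, p):
    """Hash table mapping each contiguous p-gram of s to its number of occurrences."""
    return Counter(s[i:i + p] for i in range(len(s) - p + 1))


def spectrum_kernel(s, t, p):
    """p-spectrum kernel (assumed definition): sum over p-grams v of count_s(v) * count_t(v)."""
    table = pgram_counts(s, p)  # store the p-grams of the first string
    total = 0
    for j in range(len(t) - p + 1):
        # One lookup per p-gram occurrence in the second string.
        total += table.get(t[j:j + p], 0)
    return total


def intersection_kernel(s, t, p):
    """Intersection kernel (assumed definition): sum over p-grams v of min(count_s(v), count_t(v))."""
    table = pgram_counts(s, p)
    total = 0
    for j in range(len(t) - p + 1):
        gram = t[j:j + p]
        if table.get(gram, 0) > 0:
            # Consume one occurrence so each match is counted at most once.
            table[gram] -= 1
            total += 1
    return total


print(spectrum_kernel("abab", "bab", 2))      # 3
print(intersection_kernel("abab", "bab", 2))  # 2
```

In this sketch, building the table takes one pass over the first string and the comparison takes one pass over the second, so the number of hash operations grows linearly with the combined length of the two strings.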
Author(s)
Popescu, Marius
University of Bucharest
Grozea, Cristian
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS
Ionescu, Radu Tudor
University of Bucharest
Journal
Procedia Computer Science
Conference
International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES) 2017
DOI
10.1016/j.procs.2017.08.207
Language
English
FOKUS
Tags
  • string kernels

  • blended spectrum kern...

  • intersection kernel

  • kernel methods

  • similarity-based lear...

  • polarity classificati...

  • opinion mining

  • sentiment analysis

  • string kernels tool

  • open-source code
