2015
Journal Article
Title
Identifying ISI-indexed articles by their lexical usage: A text analysis approach
Abstract
This research creates an architecture for investigating whether lexical differences exist between articles indexed by the Institute for Scientific Information (ISI) and those that are not and, if such differences are found, for proposing the best available classification method. Based on a collection of ISI- and non-ISI-indexed articles in the areas of business and computer science, three classification models are trained. A sensitivity analysis is applied to demonstrate the impact of words in different syntactical forms on the classification decision. The results demonstrate that the lexical domains of ISI and non-ISI articles are distinguishable by machine learning techniques. Our findings indicate that the support vector machine identifies ISI-indexed articles in both disciplines with higher precision than do the Naïve Bayesian and K-Nearest Neighbors techniques.
Author(s)
Moohebat, Mohammadreza
Faculty of Computer Science and Information Technology, Department of Artificial Intelligence, University of Malaya, Kuala Lumpur, Federal Territory of Kuala Lumpur, Malaysia
Raj, Ram Gopal
Faculty of Computer Science and Information Technology, Department of Artificial Intelligence, University of Malaya, Kuala Lumpur, Federal Territory of Kuala Lumpur, Malaysia