Convex NMF on non-convex massiv data

Kersting, Kristian; Wahabzada, Mirwaes; Thurau, Christian; Bauckhage, Christian

2010

Conference Paper

Abstract

We present an extension of convex-hull non-negative matrix factorization (CH-NMF), which was recently proposed as a large-scale variant of convex non-negative matrix factorization (CNMF) or Archetypal Analysis (AA). CH-NMF factorizes a non-negative data matrix V into two non-negative matrix factors, V ≈ WH, such that the columns of W are convex combinations of certain data points and are therefore readily interpretable to data analysts. There is, however, no free lunch: imposing convexity constraints on W typically prevents adaptation to intrinsic, low-dimensional structures in the data. In cases where the data is distributed in a non-convex manner or consists of mixtures of lower-dimensional convex distributions, the cluster representatives obtained from CH-NMF will be less meaningful. In this paper, we present a hierarchical CH-NMF that automatically adapts to internal structures of a dataset and hence yields meaningful and interpretable clusters for non-convex datasets. This is also confirmed by our extensive evaluation on DBLP publication records of 760,000 authors, 4,000,000 images harvested from the web, and 150,000,000 votes on World of Warcraft guilds.
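
The convexity constraint described in the abstract can be illustrated with a small sketch. This is not the authors' CH-NMF algorithm: it simply takes a few data points as the columns of W (a special case of a convex combination) and fits each column of H by projected gradient descent. Restricting the columns of H to the probability simplex is an assumption borrowed from Archetypal Analysis; the abstract itself only requires non-negativity. The function names `project_to_simplex`, `fit_convex_weights`, and `reconstruct` are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {h : h >= 0, sum(h) == 1}."""
    u = sorted(v, reverse=True)
    css, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        css += ui
        t = (css - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold of the last index that passes
    return [max(x - theta, 0.0) for x in v]


def reconstruct(W_cols, h):
    """Return the convex combination sum_j h[j] * W_cols[j]."""
    dim = len(W_cols[0])
    return [sum(h[j] * W_cols[j][i] for j in range(len(W_cols)))
            for i in range(dim)]


def fit_convex_weights(W_cols, v, iters=500):
    """Projected gradient for min_h ||v - W h||^2 with h on the simplex.

    W_cols is a list of basis vectors (the columns of W); v is one data
    point (one column of V). Assumption: simplex-constrained H, as in AA.
    """
    k = len(W_cols)
    h = [1.0 / k] * k
    # Step 1/L, where L = 2 * ||W||_F^2 bounds the gradient's Lipschitz constant.
    frob2 = sum(x * x for col in W_cols for x in col)
    step = 1.0 / (2.0 * frob2) if frob2 > 0 else 1.0
    for _ in range(iters):
        resid = [r - t for r, t in zip(reconstruct(W_cols, h), v)]
        grad = [2.0 * sum(c * r for c, r in zip(W_cols[j], resid))
                for j in range(k)]
        h = project_to_simplex([hj - step * g for hj, g in zip(h, grad)])
    return h


# Three data points used as basis vectors (the columns of W).
W_cols = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

inside = fit_convex_weights(W_cols, [0.5, 0.5])   # lies in the convex hull
outside = fit_convex_weights(W_cols, [2.0, 2.0])  # lies outside the hull
print(inside, reconstruct(W_cols, inside))
print(outside, reconstruct(W_cols, outside))
```

In this toy setting a point inside the convex hull of the basis vectors is reconstructed closely, while a point outside it is mapped to the nearest point of the hull. This is the kind of limitation the abstract attributes to convexity constraints on non-convex data.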

Mainwork

LWA 2010 - Lernen, Wissen und Adaptivität

Conference

Workshop Lernen, Wissensentdeckung und Adaptivität (LWA) 2010
