
Imposing Category Trees Onto Word-Embeddings Using a Geometric Construction

Dong, Tiansi; Bauckhage, Christian; Jin, Hailong; Li, J.; Cremers, O.; Speicher, D.; Cremers, A.B.; Zimmermann, J.


ICLR 2019, Seventh International Conference on Learning Representations. Online resource : New Orleans, Louisiana, United States, May 6 - May 9, 2019
Online on the WWW, 2019
10 pp.
International Conference on Learning Representations (ICLR) <7, 2019, New Orleans/La.>
National Natural Science Foundation of China NSFC
Bundesministerium für Bildung und Forschung BMBF (Germany)
Conference Paper, Electronic Publication
Fraunhofer IAIS

We present a novel method to precisely impose tree-structured category information onto word embeddings, resulting in ball embeddings in higher-dimensional spaces (N-balls for short). Inclusion relations among N-balls implicitly encode subordinate relations among categories. The similarity measurement in terms of the cosine function is enriched by category information. Using a geometric construction method instead of back-propagation, we create large N-ball embeddings that satisfy two conditions: (1) category trees are precisely imposed onto word embeddings at zero energy cost; (2) pre-trained word embeddings are well preserved. A new benchmark data set is created for validating the category of unknown words. Experiments show that N-ball embeddings, carrying category information, significantly outperform word embeddings in the test of nearest neighborhoods, and demonstrate surprisingly good performance in validating categories of unknown words. Source codes and data-sets are free for public access and