Fraunhofer-Gesellschaft

Publica

Here you will find scientific publications from the Fraunhofer Institutes.

Imposing Category Trees Onto Word-Embeddings Using a Geometric Construction

 
Authors: Dong, Tiansi; Bauckhage, Christian; Jin, Hailong; Li, J.; Cremers, O.; Speicher, D.; Cremers, A.B.; Zimmermann, J.


ICLR 2019, Seventh International Conference on Learning Representations. Online resource : New Orleans, Louisiana, United States, May 6 - May 9, 2019
Online on the WWW, 2019
https://openreview.net/group?id=ICLR.cc/2019/Conference
10 pp.
International Conference on Learning Representations (ICLR) <7, 2019, New Orleans/La.>
National Natural Science Foundation of China NSFC
61472177
National Natural Science Foundation of China NSFC
61661146007
Bundesministerium für Bildung und Forschung BMBF (Germany)
01/S17064
Bundesministerium für Bildung und Forschung BMBF (Germany)
01/S18038C
English
Conference paper, electronic publication
Fraunhofer IAIS

Abstract
We present a novel method to precisely impose tree-structured category information onto word embeddings, resulting in ball embeddings in higher-dimensional spaces (N-balls for short). Inclusion relations among N-balls implicitly encode subordinate relations among categories. The similarity measurement in terms of the cosine function is enriched by category information. Using a geometric construction method instead of back-propagation, we create large N-ball embeddings that satisfy two conditions: (1) category trees are precisely imposed onto word embeddings at zero energy cost; (2) pre-trained word embeddings are well preserved. A new benchmark data set is created for validating the category of unknown words. Experiments show that N-ball embeddings, carrying category information, significantly outperform word embeddings in the nearest-neighbor test and perform surprisingly well in validating the categories of unknown words. Source code and data sets are publicly available at https://github.com/GnodIsNait/nball4tree.git and https://github.com/GnodIsNait/bp94nball.git.
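
To illustrate how inclusion among balls can stand in for a subordinate relation among categories, here is a minimal sketch of the underlying geometry. It is not taken from the authors' repositories: the `NBall` class, the function names, and the toy 2-D coordinates are hypothetical, and it assumes closed balls under the Euclidean distance, which may differ from the definitions used in the paper.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class NBall:
    """Hypothetical container: a center vector and a non-negative radius."""
    center: Sequence[float]
    radius: float


def contains(outer: NBall, inner: NBall) -> bool:
    """True if `inner` lies entirely inside `outer` (closed balls assumed).

    The farthest point of `inner` from the center of `outer` is at distance
    d + inner.radius, where d is the distance between the two centers.
    """
    d = math.dist(outer.center, inner.center)
    return d + inner.radius <= outer.radius


def disjoint(a: NBall, b: NBall) -> bool:
    """True if the two balls do not overlap (touching counts as disjoint here)."""
    d = math.dist(a.center, b.center)
    return d >= a.radius + b.radius


# Toy 2-D example with made-up coordinates, not real embeddings.
animal = NBall(center=(0.0, 0.0), radius=10.0)
dog = NBall(center=(3.0, 4.0), radius=2.0)      # sits inside `animal`
vehicle = NBall(center=(20.0, 0.0), radius=3.0)  # sits apart from `animal`

print(contains(animal, dog))      # dog ball is inside the animal ball
print(disjoint(animal, vehicle))  # sibling-like categories do not overlap
```

In this reading, a child category corresponds to a ball contained in its parent's ball, and unrelated categories correspond to disjoint balls, so a category tree can be checked with these two predicates alone.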

http://publica.fraunhofer.de/dokumente/N-593203.html