Fraunhofer-Gesellschaft

Publica

Hier finden Sie wissenschaftliche Publikationen aus den Fraunhofer-Instituten.

Min-hashing for probabilistic frequent subtree feature spaces

 
: Welke, P.; Horváth, T.; Wrobel, S.

:

Calders, T.:
Discovery science. 19th international conference, DS 2016 : Bari, Italy, October 19-21, 2016; Proceedings
Cham: Springer International Publishing, 2016 (Lecture Notes in Artificial Intelligence 9956)
ISBN: 978-3-319-46306-3 (Print)
ISBN: 978-3-319-46307-0 (Online)
S.67-82
International Conference on Discovery Science (DS) <19, 2016, Bari>
Englisch
Konferenzbeitrag
Fraunhofer IAIS ()

Abstract
We propose a fast algorithm for approximating graph similarities. For its advantageous semantic and algorithmic properties, we define the similarity between two graphs by the Jaccard-similarity of their images in a binary feature space spanned by the set of frequent subtrees generated for some training dataset. Since the feature space embedding is computationally intractable, we use a probabilistic subtree isomorphism operator based on a small sample of random spanning trees and approximate the Jaccard-similarity by min-hash sketches. The partial order on the feature set defined by subgraph isomorphism allows for a fast calculation of the min-hash sketch, without explicitly performing the feature space embedding. Experimental results on real-world graph datasets show that our technique results in a fast algorithm. Furthermore, the approximated similarities are well-suited for classification and retrieval tasks in large graph datasets.

: http://publica.fraunhofer.de/dokumente/N-444830.html