Publica
Hier finden Sie wissenschaftliche Publikationen aus den FraunhoferInstituten. Minhashing for probabilistic frequent subtree feature spaces
 Calders, T.: Discovery science. 19th international conference, DS 2016 : Bari, Italy, October 1921, 2016; Proceedings Cham: Springer International Publishing, 2016 (Lecture Notes in Artificial Intelligence 9956) ISBN: 9783319463063 (Print) ISBN: 9783319463070 (Online) pp.6782 
 International Conference on Discovery Science (DS) <19, 2016, Bari> 

 English 
 Conference Paper 
 Fraunhofer IAIS () 
Abstract
We propose a fast algorithm for approximating graph similarities. For its advantageous semantic and algorithmic properties, we define the similarity between two graphs by the Jaccardsimilarity of their images in a binary feature space spanned by the set of frequent subtrees generated for some training dataset. Since the feature space embedding is computationally intractable, we use a probabilistic subtree isomorphism operator based on a small sample of random spanning trees and approximate the Jaccardsimilarity by minhash sketches. The partial order on the feature set defined by subgraph isomorphism allows for a fast calculation of the minhash sketch, without explicitly performing the feature space embedding. Experimental results on realworld graph datasets show that our technique results in a fast algorithm. Furthermore, the approximated similarities are wellsuited for classification and retrieval tasks in large graph datasets.