Now showing 1 - 3 of 3
No Thumbnail Available
Publication

Maximal closed set and half-space separations in finite closure systems

2023-09-21 , Seiffarth, Florian , Horvath, Tamas , Wrobel, Stefan

Several concept learning problems can be regarded as special cases of half-space separation in abstract closure systems over finite ground sets. For the typical scenario that the closure system is given via a closure operator, we show that the half-space separation problem is NP-complete. As a first approach to overcome this negative result, we relax the problem to maximal closed set separation, give a simple generic greedy algorithm solving this problem with a linear number of closure operator calls, and show that this bound is sharp. For a second direction, we consider Kakutani closure systems and prove that they are algorithmically characterized by the greedy algorithm. As a first special case of the general problem setting, we consider Kakutani closure systems over graphs and give a sufficient condition for this kind of closure systems in terms of forbidden graph minors. For a second special case, we then focus on closure systems over finite lattices, give an improved adaptation of the generic greedy algorithm, and present an application concerning subsumption lattices.

No Thumbnail Available
Publication

A Fast Heuristic for Computing Geodesic Closures in Large Networks

2022-11-06 , Seiffarth, Florian , Horvath, Tamas , Wrobel, Stefan

Motivated by the increasing interest in applications of graph geodesic convexity in machine learning and data mining, we present a heuristic for approximating the geodesic convex hull of node sets in large networks. It generates a small set of (almost) maximal outerplanar spanning subgraphs for the input graph, computes the geodesic closure in each of these graphs, and regards a node as an element of the convex hull if it belongs to the closed sets for at least a user specified number of outerplanar graphs. Our heuristic algorithm runs in time linear in the number of edges of the input graph, i.e., it is faster with one order of magnitude than the standard algorithm computing the closure exactly. Its performance is evaluated empirically by approximating convexity based core-periphery decomposition of networks. Our experimental results with large real-world networks show that for most networks, the proposed heuristic was able to produce close approximations significantly faster than the standard algorithm computing the exact convex hulls. For example, while our algorithm calculated an approximate core-periphery decomposition in 5 h or less for networks with more than 20 million edges, the standard algorithm did not terminate within 50 days.

No Thumbnail Available
Publication

A generalized Weisfeiler-Lehman graph kernel

2022-04-27 , Schulz, Till Hendrik , Horvath, Tamas , Welke, Pascal , Wrobel, Stefan

After more than one decade, Weisfeiler-Lehman graph kernels are still among the most prevalent graph kernels due to their remarkable predictive performance and time complexity. They are based on a fast iterative partitioning of vertices, originally designed for deciding graph isomorphism with one-sided error. The Weisfeiler-Lehman graph kernels retain this idea and compare such labels with respect to equality. This binary valued comparison is, however, arguably too rigid for defining suitable graph kernels for certain graph classes. To overcome this limitation, we propose a generalization of Weisfeiler-Lehman graph kernels which takes into account a more natural and finer grade of similarity between Weisfeiler-Lehman labels than equality. We show that the proposed similarity can be calculated efficiently by means of the Wasserstein distance between certain vectors representing Weisfeiler-Lehman labels. This and other facts give rise to the natural choice of partitioning the vertices with the Wasserstein k-means algorithm. We empirically demonstrate on the Weisfeiler-Lehman subtree kernel, which is one of the most prominent Weisfeiler-Lehman graph kernels, that our generalization significantly outperforms this and other state-of-the-art graph kernels in terms of predictive performance on datasets which contain structurally more complex graphs beyond the typically considered molecular graphs.