Listing closed sets of strongly accessible set systems with applications to data mining
2010,
Boley, Mario,
Horvath, Tamas,
Poigné, Axel,
Wrobel, Stefan
We study the problem of listing all closed sets of a closure operator $\sigma$ that is a partial function on the power set of some finite ground set $E$, i.e., $\sigma : \mathcal{F} \to \mathcal{F}$ with $\mathcal{F} \subseteq \mathcal{P}(E)$. A very simple divide-and-conquer algorithm is analyzed that correctly solves this problem if and only if the domain of the closure operator is a strongly accessible set system. Strong accessibility is a strict relaxation of greedoids as well as of independence systems. This algorithm turns out to have delay $O(|E|\,(T_{\mathcal{F}} + T_\sigma + |E|))$ and space $O(|E| + S_{\mathcal{F}} + S_\sigma)$, where $T_{\mathcal{F}}$, $S_{\mathcal{F}}$, $T_\sigma$, and $S_\sigma$ are the time and space complexities of checking membership in $\mathcal{F}$ and computing $\sigma$, respectively. In contrast, we show that the problem becomes intractable for accessible set systems. We relate our results to the data mining problem of listing all support-closed patterns of a dataset and show that there is a corresponding closure operator for all datasets if and only if the set system satisfies a certain confluence property.
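As an illustration of the kind of divide-and-conquer listing the abstract refers to, the following is a minimal sketch and not the authors' exact procedure. It assumes that the empty set belongs to the family, that membership in the family and the closure operator are supplied as callables, and that the family is strongly accessible (here taken to mean that any member can be grown one element at a time towards any larger member without leaving the family). All names (`list_closed_sets`, `in_family`, `make_support_closure`) are hypothetical. The toy family is the full power set, and the closure maps an itemset to the intersection of the transactions containing it, a common notion of support closure. The sketch makes no attempt to achieve the delay bound stated above.

```python
from typing import Callable, FrozenSet, Iterator, List

ItemSet = FrozenSet[str]


def list_closed_sets(
    ground: ItemSet,
    in_family: Callable[[ItemSet], bool],
    closure: Callable[[ItemSet], ItemSet],
) -> Iterator[ItemSet]:
    """Yield every closed set once (assumes a strongly accessible family)."""

    def recurse(closed: ItemSet, banned: ItemSet) -> Iterator[ItemSet]:
        yield closed
        local_banned = set(banned)
        for e in sorted(ground - closed):
            if e in local_banned:
                continue
            candidate = closed | {e}
            if not in_family(candidate):
                # Not an augmentation element: skip it without banning it.
                continue
            new_closed = closure(candidate)
            # Branch 1: closed supersets of `closed` that contain e.
            # If the closure hits a banned element, that part of the
            # search space was already covered by an earlier branch.
            if not (new_closed & local_banned):
                yield from recurse(new_closed, frozenset(local_banned))
            # Branch 2: the remaining iterations list sets that avoid e.
            local_banned.add(e)

    yield from recurse(closure(frozenset()), frozenset())


def make_support_closure(
    transactions: List[ItemSet], ground: ItemSet
) -> Callable[[ItemSet], ItemSet]:
    """Closure = intersection of all transactions containing the itemset."""

    def closure(items: ItemSet) -> ItemSet:
        covering = [t for t in transactions if items <= t]
        if not covering:
            # Assumption: an unsupported itemset closes to the ground set.
            return ground
        return frozenset.intersection(*covering)

    return closure


# Toy dataset (illustrative only): four transactions over items a, b, c.
GROUND = frozenset("abc")
TRANSACTIONS = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("b")]

closed_sets = list(
    list_closed_sets(
        GROUND,
        lambda s: True,  # the power set is trivially strongly accessible
        make_support_closure(TRANSACTIONS, GROUND),
    )
)
```

On this toy dataset the generator yields six closed sets: the empty set, {a}, {b}, {a, b}, {a, c}, and {a, b, c}. The `banned` set is what keeps the two branches disjoint, so no closed set is reported twice.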