Fraunhofer-Gesellschaft

Publica

Here you will find scientific publications from the Fraunhofer Institutes.

Support Estimation in Frequent Itemset Mining by Locality Sensitive Hashing

Paper presented at Conference "Lernen, Wissen, Daten, Analysen", LWDA 2019, 30.09.-02.10.2019, Berlin
 
Authors: Pick, Annika; Horváth, Tamás; Wrobel, Stefan

Full text: urn:nbn:de:0011-n-5592469 (202 KByte PDF)
MD5 Fingerprint: f7d60f04014ac6c04019b720a3a92e31
License: CC BY
Created on: 26.9.2019


2019, 5 pp.
Conference "Lernen, Wissen, Daten, Analysen" (LWDA) <2019, Berlin>
English
Conference paper, electronic publication
Fraunhofer IAIS

Abstract
The main computational effort in generating all frequent itemsets in a transactional database lies in deciding whether an itemset is frequent. We present a method for estimating itemset supports with two-sided error. In a preprocessing step, our algorithm first partitions the database into groups of similar transactions using locality sensitive hashing and computes a summary for each group. The support of a query itemset is then estimated from these summaries. Our preliminary empirical results indicate that the proposed method achieves a speed-up of up to a factor of 50 on large datasets. The F-measure of the output patterns varies between 0.83 and 0.99.
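
The pipeline outlined in the abstract (hash similar transactions into groups, summarize each group, answer support queries from the summaries) might look roughly like the sketch below. The abstract does not specify the hash family or the form of the summaries, so everything here is an assumption made for illustration: MinHash signatures serve as the locality sensitive hash, each summary stores only the group size and per-item counts, and the estimate treats items as independent within a group. All function names and parameters are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict

PRIME = 2_147_483_647  # modulus of the assumed hash family; items must be ints below it


def make_hashes(k, seed=0):
    """Draw k pairs (a, b) defining h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(k)]


def minhash_signature(transaction, hashes):
    """MinHash signature: the smallest hash value of any item, per hash function."""
    if not transaction:
        return ()  # empty transactions share one bucket
    return tuple(min((a * item + b) % PRIME for item in transaction)
                 for a, b in hashes)


def build_summaries(transactions, k=2, seed=0):
    """Preprocessing: group transactions by signature, then summarize each group.

    Assumed summary format: (group size, Counter of per-item occurrences).
    """
    hashes = make_hashes(k, seed)
    groups = defaultdict(list)
    for t in transactions:
        t = set(t)
        groups[minhash_signature(t, hashes)].append(t)
    summaries = []
    for members in groups.values():
        counts = Counter(item for t in members for item in t)
        summaries.append((len(members), counts))
    return summaries


def estimate_support(itemset, summaries):
    """Estimate support from summaries alone.

    Assumption: items occur independently within a group, so a group of
    size n contributes n * prod(count(item) / n) to the estimate.
    """
    total = 0.0
    for size, counts in summaries:
        contribution = float(size)
        for item in itemset:
            contribution *= counts.get(item, 0) / size
        total += contribution
    return total


def exact_support(itemset, transactions):
    """Baseline: count the transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))
```

For a toy database such as `[{1, 2, 3}] * 4 + [{4, 5}] * 3`, calling `build_summaries` and then `estimate_support({1, 2}, summaries)` reproduces the exact support, because every group contains identical transactions. On heterogeneous groups the independence assumption can push the estimate above or below the true support, which is one way a two-sided error can arise in a scheme of this kind. A query touches only the summaries and never rescans the database, which is the general source of the speed-up in such a design.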

URL: http://publica.fraunhofer.de/dokumente/N-559246.html