Here you will find scientific publications from the Fraunhofer Institutes.

Support Estimation in Frequent Itemset Mining by Locality Sensitive Hashing

Authors: Pick, Annika; Horváth, Tamás; Wrobel, Stefan

Full text urn:nbn:de:0011-n-5592469 (202 KB PDF)
MD5 Fingerprint: f7d60f04014ac6c04019b720a3a92e31
License: CC BY
Created on: 26 September 2019

Jäschke, Robert:
Conference on "Lernen, Wissen, Daten, Analysen", LWDA 2019. Proceedings. Online resource : Berlin, Germany, September 30 - October 2, 2019
Berlin, 2019 (CEUR Workshop Proceedings 2454)
Conference "Lernen, Wissen, Daten, Analysen" (LWDA) <2019, Berlin>
Conference paper, electronic publication
Fraunhofer IAIS

The main computational effort in generating all frequent itemsets in a transactional database lies in deciding whether an itemset is frequent. We present a method for estimating itemset supports with two-sided error. In a preprocessing step, our algorithm first partitions the database into groups of similar transactions using locality sensitive hashing and computes a summary for each group. The support of a query itemset is then estimated from these summaries. Our preliminary empirical results indicate that the proposed method yields a speed-up of up to a factor of 50 on large datasets. The F-measure of the output patterns varies between 0.83 and 0.99.
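The pipeline described in the abstract (hash similar transactions into groups, summarize each group, answer support queries from the summaries) can be illustrated with a minimal Python sketch. The abstract does not say which hash family or which kind of summary the authors use, so everything concrete here is an assumption: MinHash signatures serve as the locality sensitive hash, each summary stores the group size and per-item counts, and items are assumed to occur independently within a group. The function names `minhash_signature`, `build_summaries`, and `estimate_support` are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # modulus for the (assumed) linear hash functions


def minhash_signature(transaction, hash_params):
    """MinHash signature of a set of non-negative integer items."""
    if not transaction:
        return ()  # all empty transactions share one group
    return tuple(
        min((a * item + b) % PRIME for item in transaction)
        for a, b in hash_params
    )


def build_summaries(transactions, num_hashes=2, seed=0):
    """Preprocessing: group transactions by signature, summarize each group.

    Assumed summary format: group size plus a count for every item.
    """
    rng = random.Random(seed)
    hash_params = [
        (rng.randrange(1, PRIME), rng.randrange(0, PRIME))
        for _ in range(num_hashes)
    ]
    groups = defaultdict(lambda: {"size": 0, "counts": defaultdict(int)})
    for transaction in transactions:
        group = groups[minhash_signature(transaction, hash_params)]
        group["size"] += 1
        for item in transaction:
            group["counts"][item] += 1
    return list(groups.values())


def estimate_support(itemset, summaries):
    """Estimate the absolute support of itemset from the summaries only.

    Assumption: items are independent within a group, so a group
    contributes size * product of the items' relative frequencies.
    The estimate can be too high or too low (two-sided error).
    """
    total = 0.0
    for group in summaries:
        estimate = float(group["size"])
        for item in itemset:
            estimate *= group["counts"].get(item, 0) / group["size"]
            if estimate == 0.0:
                break
        total += estimate
    return total


if __name__ == "__main__":
    database = [{1, 2, 3}, {1, 2, 3}, {1, 2}, {7, 8}, {7, 8, 9}]
    summaries = build_summaries(database)
    print(estimate_support({1, 2}, summaries))
```

Under these assumptions a query touches one summary per group rather than every transaction, which is where a speed-up would come from when groups are large. The estimate is exact for groups of identical transactions and degrades as groups become more heterogeneous, so the number of hash functions trades accuracy against the number of groups.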