Now showing 1 - 4 of 4
  • Publication
    One click mining-interactive local pattern discovery through implicit preference and performance learning
    ( 2013) ;
    Mampaey, M.
    ;
    Kang, B.
    ;
    Tokmakov, P.
    ;
    It is known that productive pattern discovery from data has to interactively involve the user as directly as possible. State-of-the-art toolboxes require the specification of sophisticated workows with an explicit selection of a data mining method, all its required parameters, and a corresponding algorithm. This hinders the desired rapid interaction-especially with users that are experts of the data domain rather than data mining experts. In this paper, we present a fundamentally new approach towards user involvement that relies exclusively on the implicit feedback available from the natural analysis behavior of the user, and at the same time allows the user to work with a multitude of pattern classes and discovery algorithms simultaneously without even knowing the details of each algorithm. To achieve this goal, we are relying on a recently proposed co-active learning model and a special feature representation of patterns to arrive at an adaptively tuned user interesti ngness model. At the same time, we propose an adaptive time-allocation strategy to distribute computation time among a set of underlying mining algorithms. We describe the technical details of our approach, present the user interface for gathering implicit feedback, and provide preliminary evaluation results.
  • Publication
    Efficient discovery of interesting patterns based on strong closedness
    Finding patterns that are interesting to a user in a certain application context is one of the central goals of data mining research. Regarding all patterns above a certain frequency threshold as interesting is one way of defining interestingness. In this paper, however, we argue that in many applications, a different notion of interestingness is required in order to be able to capture "long", and thus particularly informative, patterns that are correspondingly of low frequency. To identify such patterns, our proposed measure of interestingness is based on the degree or strength of closedness of the patterns. We show that (i) indeed this definition selects long interesting patterns that are difficult to identify with frequency-based approaches, and (ii) that it selects patterns that are robust against noise and/or dynamic changes. We prove that the family of interesting patterns proposed here forms a closure system and use the corresponding closure operator to design a mining algorithm listing these patterns in amortized quadratic time. In particular, for nonsparse datasets its time complexity is O(nm) per pattern, where n denotes the number of items and m the size of the database. This is equal to the best known time bound for listing ordinary closed frequent sets, which is a special case of our problem. We also report empirical results with real-world datasets.