Validation of similarity measures for industrial alarm flood analysis
The aim of industrial alarm flood analysis is to assist plant operators who face large numbers of alarms, referred to as alarm floods, in their daily work. Many methods used to this end rely on a similarity measure to detect similar alarm sequences. However, multiple similarity measures exist, and it is not clear which one is best suited for alarm analysis. In this paper, we analyse the behaviour of these similarity measures and attempt to validate the results in a semi-formalised way. To do so, we employ synthetically generated floods, based on the assumption that synthetic floods generated as 'similar' to the original floods should receive similarity scores close to those of the original floods. Conversely, synthetic floods generated as 'not-similar' to the original floods are expected to receive different similarity scores. The similarity measures are validated by comparing the results of clustering the original and the synthetic alarm floods. This comparison uses standard clustering validation measures as well as application-specific measures.
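The validation idea can be illustrated with a minimal sketch. It is not the procedure used in the paper: the Jaccard measure on sets of alarm tags, the two generator functions `make_similar` and `make_dissimilar`, the alarm tags, and the nearest-original assignment (a crude stand-in for clustering) are all assumptions introduced here for illustration.

```python
import random


def jaccard(flood_a, flood_b):
    """Jaccard similarity between the sets of alarm tags in two floods."""
    a, b = set(flood_a), set(flood_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def make_similar(flood, alphabet, rng, n_swaps=1):
    """Hypothetical generator: copy the flood and replace n_swaps alarms."""
    synthetic = list(flood)
    for idx in rng.sample(range(len(synthetic)), n_swaps):
        synthetic[idx] = rng.choice(alphabet)
    return synthetic


def make_dissimilar(flood, alphabet, rng):
    """Hypothetical generator: draw only alarms absent from the flood."""
    present = set(flood)
    unused = [tag for tag in alphabet if tag not in present]
    return [rng.choice(unused) for _ in range(len(flood))]


rng = random.Random(0)

# Two made-up original floods with disjoint alarm tags.
originals = {
    "A": [f"A{i}" for i in range(1, 7)],
    "B": [f"B{i}" for i in range(1, 7)],
}
alphabet = originals["A"] + originals["B"] + [f"C{i}" for i in range(1, 7)]

similar_scores = {}
dissimilar_scores = {}
nearest = {}
for name, flood in originals.items():
    similar = make_similar(flood, alphabet, rng)
    dissimilar = make_dissimilar(flood, alphabet, rng)
    similar_scores[name] = jaccard(flood, similar)
    dissimilar_scores[name] = jaccard(flood, dissimilar)
    # Assign the 'similar' synthetic flood to its most similar original.
    nearest[name] = max(originals, key=lambda k: jaccard(originals[k], similar))

print("similar:", similar_scores)
print("not-similar:", dissimilar_scores)
print("nearest original:", nearest)
```

Under these assumptions, a measure behaves as expected if each 'similar' flood scores high against its source and is assigned back to it, while each 'not-similar' flood scores low. A fuller study would replace the nearest-original assignment with an actual clustering step and a clustering validation measure.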