2025
Conference Paper
Title
Visualizing Extracted Patterns from Dictionary-Based Compression Algorithms
Abstract
Modern software systems continuously generate massive volumes of log files in many different formats. These logs record application activity, and analyzing them is necessary both to improve the application and to maintain the security and stability of the system. To manage their size, logs are typically stored in compressed form using algorithms that exploit repetitive patterns. This work presents an approach to detecting frequent patterns in textual data that records them during the file compression process itself. The log file is then visualized so that the extracted patterns can be explored using metrics based on properties such as pattern frequency and length. This allows an analyst to gain relevant insights more efficiently and reduces the need for labor-intensive manual inspection of the log data. The implemented extension of a dictionary-based compression algorithm recognizes patterns in log files of any format and eliminates the need for manual preprocessing of the log files.
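
The idea of registering patterns while compressing can be illustrated with a minimal sketch. The abstract does not name the specific dictionary-based algorithm or the exact metrics, so everything below is an assumption made for illustration: an LZW-style pass is used, the function names `lzw_patterns` and `rank_patterns` are hypothetical, and the score (frequency times length) is only one plausible way to combine the two properties the abstract mentions.

```python
from collections import Counter


def lzw_patterns(text):
    """Run an LZW-style pass and count how often each phrase is emitted.

    Assumption: the initial dictionary holds only the characters present
    in the input (a byte-oriented LZW would start with all 256 byte values).
    """
    dictionary = {ch: i for i, ch in enumerate(sorted(set(text)))}
    usage = Counter()  # phrase -> number of times it was emitted as a code
    codes = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            # Keep extending the current match.
            phrase = candidate
        else:
            # Emit the longest known phrase and register it as a pattern hit.
            codes.append(dictionary[phrase])
            usage[phrase] += 1
            # Learn the new, longer phrase for later matches.
            dictionary[candidate] = len(dictionary)
            phrase = ch
    if phrase:
        codes.append(dictionary[phrase])
        usage[phrase] += 1
    return codes, usage


def rank_patterns(usage, min_length=2):
    """Rank patterns by an illustrative score: frequency * length."""
    scored = [(p, n, n * len(p)) for p, n in usage.items() if len(p) >= min_length]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda item: (-item[2], item[0]))


if __name__ == "__main__":
    log = "GET /index 200\nGET /index 200\nGET /about 404\nGET /index 200\n"
    codes, usage = lzw_patterns(log)
    for pattern, count, score in rank_patterns(usage)[:5]:
        print(repr(pattern), count, score)
```

Because the counts are collected in the same loop that produces the compressed codes, no separate parsing step or format-specific template is needed, which is the property the abstract highlights. A visualization layer could then color spans of the log by the score of the pattern that covers them.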