Parsing and Extracting Features from OPC Unified Architecture in Industrial Environments
OPC Unified Architecture (OPC UA) is considered a significant part of future industrial networks, where it provides modular and well-structured access to machine data. In this paper, we propose a method for parsing OPC UA binary content and a novel two-phase approach for extracting features from OPC UA. The developed method parses OPC UA content sequentially and transfers the data into a structured form. Feature extraction from the payload is based on a new string representation. In the first phase, an algorithm extracts all numbers from the strings and processes them separately from the letters: a regular expression replaces every integer and floating-point number with a special character. In the second phase, strings that are not identical are represented by their ASCII distribution and an edit distance. This approach documents how OPC UA data is parsed and how the parsed data can be used in subsequent analysis steps. In addition, the extracted string representations are more accurate than existing approaches and are therefore usable by machine learning algorithms.
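The two phases of the string representation can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the regular expression, the placeholder character `#`, and the function names `split_numbers`, `ascii_distribution`, and `edit_distance` are assumptions chosen for illustration, and the edit distance shown is the standard Levenshtein distance.

```python
import re
from collections import Counter

# Assumed pattern: unsigned integers and simple decimal numbers.
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")
# Assumed special character that stands in for every number.
PLACEHOLDER = "#"


def split_numbers(text):
    """Phase 1: pull the numbers out and replace each with the placeholder."""
    numbers = [float(m) for m in NUMBER_RE.findall(text)]
    template = NUMBER_RE.sub(PLACEHOLDER, text)
    return template, numbers


def ascii_distribution(text):
    """Phase 2a: relative frequency of each of the 128 ASCII code points."""
    counts = Counter(ord(c) for c in text if ord(c) < 128)
    total = sum(counts.values())
    return [counts.get(i, 0) / total if total else 0.0 for i in range(128)]


def edit_distance(a, b):
    """Phase 2b: Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


template, numbers = split_numbers("Motor1.Speed=1500.5")
# template is "Motor#.Speed=#", numbers is [1.0, 1500.5]
```

In this sketch, strings whose templates are identical differ only in their numeric values, so only the extracted numbers need to be compared. For templates that differ, `ascii_distribution` and `edit_distance` give a fixed-length vector and a pairwise distance, respectively. How these values are combined into a final feature vector is not specified here and is left open.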