Decentral Runtime Adaptation for Fault Tolerance in Distributed Industrial Systems
Fueled by growing global competitive pressure, future industrial systems require a much higher degree of flexibility and reliability. The increasing complexity necessitates a paradigm shift towards a service-oriented organization of manufacturing processes, commonly associated with the keyword Industry 4.0. The higher flexibility results from using a large number of diverse, autonomous, and interoperable systems, which, however, pose fundamental challenges to achieving sufficiently high levels of reliability and robustness. Consequently, fault tolerance is key to sustaining production throughput when subsystems fail or malfunction. Prior research has identified the middleware layer as the most suitable place to implement fault tolerance mechanisms, owing to its high level of abstraction over both computational tasks and physical operations. This thesis proposes using the Robot Operating System (ROS) in the middleware layer as a means of achieving fault tolerance. ROS is an emerging framework for robotic applications and, as such, naturally overlaps with industrial manufacturing processes. Key features of ROS are its high degree of modularization, support for hardware abstraction, and built-in message-passing functionality for efficient peer-to-peer communication. Yet ROS was not primarily designed for industrial applications, and it has not previously been studied whether fault tolerance mechanisms are feasible in the context of ROS. As a first step, the literature on future requirements for industrial fault tolerance is reviewed and design goals are identified. Subsequently, a conceptual blueprint for decentralized runtime adaptation, based on dynamic reconfiguration of software components among subsystems, is proposed and implemented on a ROS-compatible demonstrator system that simulates modular and flexible manufacturing.
The effectiveness of the suggested approach is tested by means of two fault scenarios: partial functional degradation of a single subsystem and failure of an entire subsystem. The distributed adaptation units on the subsystems, implemented as ROS components, successfully monitor the communication between software components and initiate appropriate reconfigurations. Consequently, productivity is sustained and a much higher level of robustness is achieved for the overall system. The qualitative findings of this thesis underline the suitability of ROS as industrial middleware with an integrated fault tolerance mechanism. Real-time capable and optimization-based reconfiguration mechanisms are left for future work.
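As a rough illustration of the monitor-and-reconfigure idea described above, the following minimal sketch models an adaptation unit in plain Python, deliberately without any ROS dependency. It is not the implementation from the thesis. The class name `AdaptationUnit`, the silence `timeout`, the least-loaded placement rule, and the component and subsystem names are all assumptions made for this example. The sketch treats a component as faulty once no message has been observed from it for longer than the timeout, and moves it to another subsystem that still hosts at least one responsive component.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AdaptationUnit:
    """Hypothetical adaptation unit: watches message traffic and relocates silent components."""
    timeout: float                                  # seconds of silence treated as a fault (assumed)
    placement: dict = field(default_factory=dict)   # component name -> hosting subsystem
    last_seen: dict = field(default_factory=dict)   # component name -> time of last observed message

    def deploy(self, component, subsystem, now):
        """Place (or re-place) a component on a subsystem and reset its silence timer."""
        self.placement[component] = subsystem
        self.last_seen[component] = now

    def observe(self, component, now):
        """Record that a message from the component was seen at time `now`."""
        self.last_seen[component] = now

    def reconfigure(self, now):
        """Move every silent component to the least-loaded responsive subsystem."""
        silent = sorted(c for c, t in self.last_seen.items() if now - t > self.timeout)
        # Load counts only responsive components, so a fully failed subsystem never appears.
        load = Counter(s for c, s in self.placement.items() if c not in silent)
        moves = {}
        for component in silent:
            candidates = [s for s in load if s != self.placement[component]]
            if not candidates:
                continue  # nowhere to move this component; leave it in place
            target = min(candidates, key=lambda s: (load[s], s))
            self.deploy(component, target, now)
            load[target] += 1
            moves[component] = target
        return moves


unit = AdaptationUnit(timeout=2.0)
unit.deploy("gripper", "cell_a", 0.0)
unit.deploy("conveyor", "cell_a", 0.0)
unit.deploy("camera", "cell_b", 0.0)
unit.deploy("planner", "cell_c", 0.0)

# Scenario 1: partial degradation, only the gripper on cell_a stops sending messages.
for name in ("conveyor", "camera", "planner"):
    unit.observe(name, 3.0)
partial = unit.reconfigure(3.0)

# Scenario 2: cell_b fails entirely, so both components it now hosts fall silent.
for name in ("conveyor", "planner"):
    unit.observe(name, 6.0)
full = unit.reconfigure(6.0)
```

The two calls mirror the two fault scenarios named in the abstract: in the first only one component is relocated, while in the second every component hosted on the failed subsystem is redistributed. A real deployment would additionally have to restart the relocated component and restore its state, which this sketch leaves out.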
München, TU, Master Thesis, 2019