
Dynamic scheduling in large-scale stochastic processing networks for demand-driven manufacturing using distributed reinforcement learning

Authors: Qu, Shuhui; Wang, Jie; Jasperneite, Jürgen


Institute of Electrical and Electronics Engineers -IEEE-; IEEE Industrial Electronics Society -IES-:
IEEE 23rd International Conference on Emerging Technologies and Factory Automation, ETFA 2018. Proceedings : Politecnico di Torino, Torino, Italy, 04-07 September 2018
Piscataway, NJ: IEEE, 2018
ISBN: 978-1-5386-7108-5
ISBN: 978-1-5386-7107-8
ISBN: 978-1-5386-7109-2
International Conference on Emerging Technologies and Factory Automation (ETFA) <23, 2018, Torino>
Fraunhofer IOSB

In the area of demand-driven manufacturing systems, a critical current problem is how to dynamically and optimally schedule, i.e., allocate, actual jobs with different customer requirements in large-scale manufacturing systems in order to meet various objectives. Scheduling experts have proposed several scheduling methods with acceptable performance for manufacturing systems based on their understanding of the systems' characteristics. However, the problem remains challenging due to the complicated composition of the system's multiple objectives, the complex system dynamics and constraints, and the extremely high computational cost for large-scale manufacturing systems. In this paper, we apply a stochastic processing network that can capture the stochasticity and dynamics of discrete manufacturing systems. We then propose a data-driven, distributed reinforcement learning (DRL) method that requires little information about the system dynamics and reduces the cost of learning and searching for a scheduling policy with high production performance, so that the method scales to large-scale processing systems. In particular, we first use a stochastic processing network, i.e., a queueing model, to represent the production processes in a typical discrete manufacturing system so that it can be simulated. We then decompose the reinforcement learning into local processes. Each local process's agent can make decisions locally by assigning indices to jobs based on each job's real-time information (index policy). Because of this distributed learning characteristic and the index policy, our approach is much more scalable and efficient than either centralized methods or traditional decentralized reinforcement learning methods.
Based on our simulation, we find that our approach achieves higher production performance than other heuristics, previously proposed decentralized reinforcement learning methods, and centralized methods in stochastic processing networks of different scales.
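The index policy described in the abstract can be illustrated with a minimal sketch: a per-station agent scores each waiting job from its real-time features and always serves the highest-index job. All names, features, and weights below are illustrative assumptions, not the paper's implementation; in the paper the index function would be learned via distributed reinforcement learning rather than hand-set.

```python
# Hypothetical sketch of an index policy at one station of a queueing network.
# The agent's weights stand in for parameters that DRL would learn; here they
# are fixed by hand purely for illustration.

class LocalAgent:
    """Per-station agent: index(job) = w . features(job)."""

    def __init__(self, weights):
        self.weights = weights  # assumed learned parameters (hand-set here)

    def index(self, job):
        # Real-time job information: current waiting time and remaining work.
        features = (job["wait"], job["remaining"])
        return sum(w * f for w, f in zip(self.weights, features))

    def pick(self, queue):
        # Index policy: serve the job with the highest index.
        return max(queue, key=self.index)


def simulate(jobs, agent, steps=100):
    """Single-station queue: one unit of service per step; others accrue wait."""
    completed = []
    queue = [dict(j) for j in jobs]  # copy so the input is not mutated
    for _ in range(steps):
        if not queue:
            break
        job = agent.pick(queue)
        job["remaining"] -= 1
        if job["remaining"] == 0:
            queue.remove(job)
            completed.append(job["id"])
        for other in queue:
            if other is not job:
                other["wait"] += 1
    return completed


agent = LocalAgent(weights=(1.0, -0.5))  # prefer long-waiting, short jobs
jobs = [
    {"id": "A", "wait": 0, "remaining": 3},
    {"id": "B", "wait": 5, "remaining": 1},
    {"id": "C", "wait": 2, "remaining": 2},
]
print(simulate(jobs, agent))  # → ['B', 'C', 'A']
```

Because each agent needs only local queue state to compute its indices, stations can decide independently, which is the property the abstract credits for the method's scalability over centralized schedulers.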