Dynamic scheduling in large-scale stochastic processing networks for demand-driven manufacturing using distributed reinforcement learning
 Institute of Electrical and Electronics Engineers (IEEE); IEEE Industrial Electronics Society (IES): IEEE 23rd International Conference on Emerging Technologies and Factory Automation, ETFA 2018. Proceedings: Politecnico di Torino, Torino, Italy, 04-07 September 2018. Piscataway, NJ: IEEE, 2018. ISBN: 9781538671085; ISBN: 9781538671078; ISBN: 9781538671092. pp. 433-440 
 International Conference on Emerging Technologies and Factory Automation (ETFA) <23, 2018, Torino> 

 English 
 Conference paper 
 Fraunhofer IOSB 
Abstract
In demand-driven manufacturing systems, a critical open problem is how to dynamically and optimally schedule, i.e., allocate, jobs with different customer requirements in large-scale manufacturing systems so as to meet multiple objectives. Scheduling experts have proposed several scheduling methods with acceptable performance, based on their understanding of the systems' characteristics. However, the problem remains challenging because of the complicated composition of the system's multiple objectives, the complex system dynamics and constraints, and the extremely high computational cost for large-scale manufacturing systems. In this paper, we apply a stochastic processing network that captures the stochasticity and dynamics of discrete manufacturing systems. We then propose a data-driven, distributed reinforcement learning (DRL) method that requires little information about the system dynamics and reduces the cost of learning and searching for a high-performance scheduling policy, so the method scales to large processing systems. In particular, we first use a stochastic processing network, i.e., a queueing model, to represent the production processes of a typical discrete manufacturing system so that it can be simulated. We then decompose the reinforcement learning into local processes: each local process's agent makes decisions locally by assigning indices to jobs based on each job's real-time information (an index policy). Because of this distributed learning and the index policy, our approach is far more scalable and efficient than either centralized methods or traditional decentralized reinforcement learning methods. 
In simulations, our approach achieves higher production performance than other heuristics, prior decentralized reinforcement learning methods, and centralized methods on stochastic processing networks of different scales.
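
To make the index-policy idea concrete, here is a minimal, hypothetical sketch of index-based dispatching at a single station: each job receives a locally computed score, and the station's agent serves the highest-scoring job. The `Job` class, the weights, and the hand-crafted weighted-shortest-processing-time score (standing in for the learned index) are illustrative assumptions, not the paper's implementation.

```python
import random

random.seed(0)

class Job:
    def __init__(self, weight, service_time):
        self.weight = weight              # customer-priority weight
        self.service_time = service_time  # expected processing time

def index(job):
    # Illustrative hand-crafted index (a c-mu-like rule); in the paper's
    # approach, a learned index policy would produce this score instead.
    return job.weight / job.service_time

def serve_station(queue):
    """The station's agent picks the highest-index job using only
    local, real-time job information, then removes it from the queue."""
    best = max(queue, key=index)
    queue.remove(best)
    return best

# Toy run: one station, five random jobs served to completion.
queue = [Job(random.randint(1, 5), random.uniform(0.5, 2.0)) for _ in range(5)]
order = []
while queue:
    order.append(serve_station(queue))

# Jobs leave the station in non-increasing index order.
assert all(index(order[i]) >= index(order[i + 1]) for i in range(len(order) - 1))
```

Because each station scores only the jobs in its own queue, no global state exchange is needed, which is what makes this style of decision rule scale to large networks.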