Integration of the A2C Algorithm for Production Scheduling in a Two-Stage Hybrid Flow Shop Environment

Gerpott, Falk T.; Lang, Sebastian; Reggelin, Tobias; Zadek, Hartmut; Chaopaisarn, Poti; Ramingwong, Sakgasem

doi:10.1016/j.procs.2022.01.256

2022

Journal Article

Abstract

The paper introduces an approach to apply reinforcement learning (RL) to production scheduling in a two-stage hybrid flow shop (THFS) production system. The Advantage Actor-Critic (A2C) method is used to train multiple agents to minimize the total tardiness and makespan of a production program. The two-stage hybrid flow shop scheduling problem is an NP-hard combinatorial optimization problem that describes a production system with two stages, each consisting of a set of parallel machines. Our concept combines a discrete-event simulation with a pre-implemented RL algorithm from Stable Baselines3. Since similar research often lacks concrete implementation information, the configuration of the OpenAI Gym interface and the agent-environment interaction is presented.
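To make the two objectives in the abstract concrete, the following is a minimal sketch of how makespan and total tardiness can be evaluated for a given job sequence in a two-stage shop with parallel machines. It is not the authors' simulation model and involves no RL. The job data, the `Job` class, the `simulate_thfs` function, and the simple rule "assign each job to the earliest-free machine, feed stage 2 in order of stage-1 completion" are all assumptions chosen for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    p1: int   # processing time at stage 1 (hypothetical data)
    p2: int   # processing time at stage 2 (hypothetical data)
    due: int  # due date


def simulate_thfs(jobs, machines_stage1, machines_stage2):
    """Evaluate a job sequence in a two-stage hybrid flow shop.

    Assumed rule (illustrative only): each job takes the earliest-free
    machine at its stage; stage 2 serves jobs in order of stage-1 completion.
    Returns (makespan, total_tardiness).
    """
    # Min-heaps holding the time at which each machine becomes free.
    free1 = [0] * machines_stage1
    free2 = [0] * machines_stage2

    # Stage 1: release jobs in the given sequence.
    stage1_done = []
    for job in jobs:
        start = heapq.heappop(free1)
        end = start + job.p1
        heapq.heappush(free1, end)
        stage1_done.append((end, job))

    # Stage 2: first come, first served by stage-1 completion time.
    stage1_done.sort(key=lambda item: item[0])
    makespan = 0
    total_tardiness = 0
    for ready, job in stage1_done:
        start = max(ready, heapq.heappop(free2))
        end = start + job.p2
        heapq.heappush(free2, end)
        makespan = max(makespan, end)
        total_tardiness += max(0, end - job.due)
    return makespan, total_tardiness


jobs = [Job("A", 3, 2, 6), Job("B", 2, 4, 5), Job("C", 4, 1, 7), Job("D", 1, 3, 9)]
print(simulate_thfs(jobs, 2, 2))        # (8, 1)
print(simulate_thfs(jobs[::-1], 2, 2))  # (8, 4)
```

In this toy instance both sequences reach the same makespan, but reversing the order raises total tardiness from 1 to 4. Sequencing decisions of this kind are what a scheduling agent would have to learn.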

Author(s)

Gerpott, Falk T.

Otto von Guericke University of Magdeburg

Lang, Sebastian

Otto von Guericke University Magdeburg

Reggelin, Tobias

Otto von Guericke University of Magdeburg

Zadek, Hartmut

Otto von Guericke University of Magdeburg

Chaopaisarn, Poti

Chiang Mai University

Ramingwong, Sakgasem

Chiang Mai University

Journal

Procedia Computer Science

Conference

International Conference on Industry 4.0 and Smart Manufacturing 2021
