RISS Search - Thesis Details

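Abstract (translated from Korean)

Intralogistics refers to the process of optimizing, integrating, automating, and managing all physical logistics flows of supply, production, and distribution within a distribution or fulfillment center. These processes are growing more complex, and the courier, parcel, and postal sectors in particular need to handle growing volumes faster and more accurately. In most companies, however, this handling is still done manually, which leads to excessive labor. This thesis designs a multi-input/output grid sortation system that sorts parcels automatically and uses it as a reinforcement learning environment. The system is built from independent modules, so layouts can be created and changed freely. This flexibility allows a wide range of layout configurations, which makes the system applicable to real distribution centers.
This thesis also proposes Behavior Prediction Q-Learning, a multi-agent reinforcement learning algorithm based on the Deep Q-Learning algorithm.
In the proposed algorithm, each parcel is an agent that acts independently, and each agent predicts the behavior of the other agents, which induces diverse behaviors that raise the throughput of the whole system. Experiments are run on various layouts of the multi-input/output grid sortation system, and the Sortation Performance Index is presented as a metric of sortation efficiency. This metric is used to demonstrate that efficient control is possible in every layout, and additional experiments on generalization are conducted. Finally, the algorithm of GridSorter, developed by the Karlsruhe Institute of Technology and GEBHARDT in Germany, is compared experimentally with the proposed Behavior Prediction Q-Learning algorithm on the same layout.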

Multilingual Abstract

Intralogistics is the process of optimizing, integrating, automating, and managing all physical logistics flows of supply, production, and distribution within a distribution or fulfillment center.
These processes are becoming increasingly complex. In particular, the delivery, parcel, and postal sectors require faster and more accurate processing of a growing volume of goods. In most companies, however, this processing is done manually, which leads to overwork.
This paper designs a multi-input/output grid sortation system that can automatically sort parcels and uses it as a reinforcement learning environment. The system is built from independent modules, so layouts can be freely created and changed. This flexibility makes a variety of layout configurations possible, so the system can be applied to actual distribution centers. In addition, this paper proposes Behavior Prediction Q-Learning, a multi-agent reinforcement learning algorithm based on the Deep Q-Learning algorithm. In the proposed algorithm, each parcel is an agent that acts independently, and each agent is led to predict the actions of other agents, which induces varied actions that increase overall system throughput.
The experiments cover various layouts of the multi-input/output grid sortation system and present the Sortation Performance Index, a metric that indicates how efficiently parcels are sorted. This index is used to show that efficient control is possible in all layouts, and additional experiments on generalization are also conducted. Finally, the GridSorter algorithm, developed by the Karlsruhe Institute of Technology and GEBHARDT in Germany, is compared with the proposed Behavior Prediction Q-Learning algorithm on the same layout.
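
As a rough illustration of the Q-learning family that the abstract builds on, the following is a minimal Python sketch. It is not the thesis's Behavior Prediction Q-Learning: it uses a lookup table instead of a deep network, a single parcel instead of multiple agents, and no behavior prediction. The 3x3 grid, the reward values, the hyperparameters, and all function names are assumptions made purely for illustration.

```python
import random

# Hypothetical toy setup: a 3x3 grid, one parcel, one output cell.
# None of these names or values come from the thesis.
ROWS, COLS = 3, 3
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START, EXIT = (0, 0), (2, 2)


def step(state, action):
    """Move the parcel one cell; moves off the grid leave it in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < ROWS and 0 <= c < COLS):
        r, c = state
    next_state = (r, c)
    done = next_state == EXIT
    reward = 1.0 if done else -0.1  # small step cost favors short routes
    return next_state, reward, done


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    n_actions = len(ACTIONS)
    q = {((r, c), a): 0.0
         for r in range(ROWS) for c in range(COLS) for a in range(n_actions)}
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)  # explore
            else:
                a = max(range(n_actions), key=lambda i: q[(state, i)])  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else max(
                q[(next_state, i)] for i in range(n_actions))
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = next_state
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the greedy policy from START and return the visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the cells the trained parcel visits on its way from `START` to `EXIT`. A deep variant would replace the table `q` with a neural network, and a multi-agent variant would also have to handle interactions between parcels, which this sketch leaves out entirely.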

Table of Contents

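Chapter 1 Introduction
Section 1 Research Background and Purpose
Section 2 Thesis Organization
Chapter 2 Related Work
Section 1 Sortation Systems
1. Research Trends
Section 2 Reinforcement Learning
1. Deep Q-Network
2. Multi-Agent Reinforcement Learning
Chapter 3 Reinforcement Learning Design
Section 1 Environment Description and Characteristics
1. Scenario
2. Problem Definition
3. Collisions
Section 2 State Definition
Section 3 Action Definition
Section 4 Reward Definition
Chapter 4 Deep Reinforcement Learning Algorithm
Section 1 Multi-Agent
Section 2 Model Architecture
Section 3 Algorithm
Chapter 5 Experiments
Section 1 Layouts
Section 2 Performance Evaluation
1. Evaluation Metric
2. Training Graphs
3. Loss Graphs
4. Generalization
Section 3 Performance Comparison
1. GridSorter
2. Algorithm
3. Comparison Results
Chapter 6 Conclusion
Section 1 Summary
References
Abstract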

다중 입출력 그리드 분류 시스템에서 다중 에이전트 강화 학습 기반 분류 및 플로우 제어 = Multi-Agent Reinforcement Learning-based Sorting and Flow Control in Multi-Input/Output Grid Sortation System
