深度強化學習於二維裝箱問題的規劃策略 | Publication

Publications-Theses

Article View/Open

pdf(0)

Publication Export

Google Scholar^TM

題名	深度強化學習於二維裝箱問題的規劃策略 Deep Reinforcement Learning Strategies for Planning in Two-Dimensional Bin Packing Problems
作者	詹凱傑 Chan, Kai-Jie
貢獻者	李蔡彥 Li, Tsai-Yen 詹凱傑 Chan, Kai-Jie
關鍵詞	強化學習深度強化學習二維裝箱問題 Deep Reinforcement Learning Deep Reinforcement Learning Two-dimensional Bin Packing Problem 2DBP
日期	2025
上傳時間	4-Aug-2025 15:09:17 (UTC+8)
摘要	二維裝箱問題是典型的NP-hard組合最佳化問題，廣泛應用於製造業和物流業。傳統啟發式演算法運算效率高但缺乏適應性，深度強化學習具備學習能力但容易陷入決策困境。本研究提出創新的混合策略方法，將深度強化學習與First-Fit演算法相結合。研究設計基於雙流網路架構的深度強化學習模型，將複雜動作空間分解為物品選擇和位置決策兩個子問題，採用多層次獎勵機制和專家引導策略提升學習效果。實驗採用三種代表性資料集評估六種演算法。結果顯示，DRL + First-Fit混合策略在測試的資料集上，基於物品放置率和空間利用率兩個重要指標上，比傳統最佳演算法皆有更好的綜合表現，且明顯改善純深度強化學習的密集空間放置不穩定問題。本研究的貢獻包括提出深度強化學習與傳統演算法的創新混合策略、設計完整的二維裝箱問題解決系統及建立標準化評估框架。實驗結果驗證了混合策略的有效性，也為相關問題提供方法論的參考。 The two-dimensional bin-packing problem represents a typical NP-hard combinatorial op-timization challenge with extensive applications in manufacturing and logistics industries. While traditional heuristic algorithms demonstrate high computational efficiency, they lack adaptability. On the contrary, deep reinforcement learning has strong learning capabilities but tends to en-counter decision-making dilemmas. This research proposes an innovative hybrid strategy that combines deep reinforcement learning with the First-Fit algorithm. The study designs a deep reinforcement learning model based on a dual-stream network architecture, decomposing the complex action space into two sub-problems: item selection and position decision-making. The system employs multi-level reward mechanisms and expert guidance strategies to enhance learning effectiveness. The experiments utilize three representative datasets to evaluate six algorithms. The results demonstrate that the DRL+First-Fit hybrid strategy achieves superior comprehensive perfor-mance compared to the best traditional algorithms in the tested datasets, based on two critical metrics: item placement rate and space utilization rate. Furthermore, the approach significantly addresses the instability issues of pure deep reinforcement learning in dense spatial arrangement scenarios. Our research contributions include proposing an innovative hybrid strategy that combines deep reinforcement learning with traditional algorithms, designing a comprehensive two-dimensional bin packing solution system, and establishing a standardized evaluation framework. Experimental results validate the effectiveness of the hybrid strategy, providing methodological references for related fields.
參考文獻	[1] Lodi, A., Martello, S., & Vigo, D. (2002). Recent advances on two-dimensional bin packing problems. Discrete Applied Mathematics, 123(1-3), 379-396. [2] Martello, S., & Vigo, D. (1998). Exact solution of the two-dimensional finite bin packing problem. Management science, 44(3), 388-399. [3] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., ... & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533. [4] Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., ... & Hassabis, D. (2017). Mastering the game of go without human knowledge. Nature, 550(7676), 354-359. [5] Berkey, J. O., & Wang, P. Y. (1987). Two-dimensional finite bin-packing algorithms. Journal of the operational research society, 38(5), 423-429. [6] Burke, E. K., Kendall, G., & Whitwell, G. (2006). A new placement heuristic for the or-thogonal stock-cutting problem. Operations Research, 54(4), 655-671. [7] Hopper, E., & Turton, B. C. (2001). A review of the application of meta-heuristic algo-rithms to 2D strip packing problems. Artificial Intelligence Review, 16(4), 257-300. [8] Terashima-Marín, H., Ross, P., Farías-Zárate, C., López-Camacho, E., & Valenzue-la-Rendón, M. (2008). Generalized hyper-heuristics for solving 2D Regular and Irregular Packing Problems. Annals of Operations Research, 179(1), 369-392. [9] Bennell, J. A., Lee, L. S., & Potts, C. N. (2013). A genetic algorithm for two-dimensional bin packing with due dates. International Journal of Production Economics, 145(2), 547-560. [10] Gomez, J. C., & Terashima-Marín, H. (2017). Evolutionary hyper-heuristics for tackling bi-objective 2D bin packing problems. Genetic Programming and Evolvable Machines, 18(3), 221-259. [11] Bouganis, A., & Shanahan, M. (2007). A vision-based intelligent system for packing 2-D irregular shapes. IEEE Transactions on Automation Science and Engineering, 4(3), 382-394. [12] Wang, L., Cai, J., & Zhao, X. (2024). Bin Packing Optimization via Deep Reinforcement Learning. arXiv:2403.12420. [13] López-Camacho, E., Terashima-Marín, H., Ross, P., & Ochoa, G. (2013). A unified hy-per-heuristic framework for solving bin packing problems. Expert Systems with Applica-tions, 40(13), 5516-5531. [14] Guo, H., Li, H., Shen, Y., & Wang, X. (2022). Two-dimensional irregular packing prob-lems: A review of mathematical models, solution approaches and applications. European Journal of Operational Research, 298(1), 1-20.
描述	碩士國立政治大學資訊科學系碩士在職專班 106971025
資料來源	http://thesis.lib.nccu.edu.tw/record/#G0106971025
資料類型	thesis

dc.contributor.advisor	李蔡彥	zh_TW
dc.contributor.advisor	Li, Tsai-Yen	en_US
dc.contributor.author (Authors)	詹凱傑	zh_TW
dc.contributor.author (Authors)	Chan, Kai-Jie	en_US
dc.creator (作者)	詹凱傑	zh_TW
dc.creator (作者)	Chan, Kai-Jie	en_US
dc.date (日期)	2025	en_US
dc.date.accessioned	4-Aug-2025 15:09:17 (UTC+8)	-
dc.date.available	4-Aug-2025 15:09:17 (UTC+8)	-
dc.date.issued (上傳時間)	4-Aug-2025 15:09:17 (UTC+8)	-
dc.identifier (Other Identifiers)	G0106971025	en_US
dc.identifier.uri (URI)	https://nccur.lib.nccu.edu.tw/handle/140.119/158704	-
dc.description (描述)	碩士	zh_TW
dc.description (描述)	國立政治大學	zh_TW
dc.description (描述)	資訊科學系碩士在職專班	zh_TW
dc.description (描述)	106971025	zh_TW
dc.description.abstract (摘要)	二維裝箱問題是典型的NP-hard組合最佳化問題，廣泛應用於製造業和物流業。傳統啟發式演算法運算效率高但缺乏適應性，深度強化學習具備學習能力但容易陷入決策困境。本研究提出創新的混合策略方法，將深度強化學習與First-Fit演算法相結合。研究設計基於雙流網路架構的深度強化學習模型，將複雜動作空間分解為物品選擇和位置決策兩個子問題，採用多層次獎勵機制和專家引導策略提升學習效果。實驗採用三種代表性資料集評估六種演算法。結果顯示，DRL + First-Fit混合策略在測試的資料集上，基於物品放置率和空間利用率兩個重要指標上，比傳統最佳演算法皆有更好的綜合表現，且明顯改善純深度強化學習的密集空間放置不穩定問題。本研究的貢獻包括提出深度強化學習與傳統演算法的創新混合策略、設計完整的二維裝箱問題解決系統及建立標準化評估框架。實驗結果驗證了混合策略的有效性，也為相關問題提供方法論的參考。	zh_TW
dc.description.abstract (摘要)	The two-dimensional bin-packing problem represents a typical NP-hard combinatorial op-timization challenge with extensive applications in manufacturing and logistics industries. While traditional heuristic algorithms demonstrate high computational efficiency, they lack adaptability. On the contrary, deep reinforcement learning has strong learning capabilities but tends to en-counter decision-making dilemmas. This research proposes an innovative hybrid strategy that combines deep reinforcement learning with the First-Fit algorithm. The study designs a deep reinforcement learning model based on a dual-stream network architecture, decomposing the complex action space into two sub-problems: item selection and position decision-making. The system employs multi-level reward mechanisms and expert guidance strategies to enhance learning effectiveness. The experiments utilize three representative datasets to evaluate six algorithms. The results demonstrate that the DRL+First-Fit hybrid strategy achieves superior comprehensive perfor-mance compared to the best traditional algorithms in the tested datasets, based on two critical metrics: item placement rate and space utilization rate. Furthermore, the approach significantly addresses the instability issues of pure deep reinforcement learning in dense spatial arrangement scenarios. Our research contributions include proposing an innovative hybrid strategy that combines deep reinforcement learning with traditional algorithms, designing a comprehensive two-dimensional bin packing solution system, and establishing a standardized evaluation framework. Experimental results validate the effectiveness of the hybrid strategy, providing methodological references for related fields.	en_US
dc.description.tableofcontents	第一章緒論 1 第一節研究動機 1 第二節研究背景 2 第三節研究目標 3 第四節研究挑戰 4 第五節論文貢獻 5 第六節論文架構 6 第二章相關研究 7 第一節技術背景 7 第二節相關研究 9 第三節本章結語 14 第三章系統架構設計 15 第一節問題定義與系統目標 15 第二節系統架構設計 20 第三節深度強化學習模型設計 25 第四節混合演算法策略設計 34 第五節本章結語 37 第四章系統實作 38 第一節實作環境與技術選擇 38 第二節核心元件實作 40 第三節關鍵技術挑戰與解決方案 48 第四節實驗系統實作 52 第五節本章結語 61 第五章實驗設計與結果分析 62 第一節實驗目標 62 第二節實驗環境與設置 62 第三節實驗結果 66 第四節實驗結果分析與討論 75 第五節本章結語 83 第六章結論與未來研究 84 第一節研究總結 84 第二節研究限制 87 第三節未來研究方向 88 第四節研究總結 89 參考文獻 90	zh_TW
dc.format.extent	5758191 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (資料來源)	http://thesis.lib.nccu.edu.tw/record/#G0106971025	en_US
dc.subject (關鍵詞)	強化學習	zh_TW
dc.subject (關鍵詞)	深度強化學習	zh_TW
dc.subject (關鍵詞)	二維裝箱問題	zh_TW
dc.subject (關鍵詞)	Deep Reinforcement Learning	en_US
dc.subject (關鍵詞)	Deep Reinforcement Learning	en_US
dc.subject (關鍵詞)	Two-dimensional Bin Packing Problem	en_US
dc.subject (關鍵詞)	2DBP	en_US
dc.title (題名)	深度強化學習於二維裝箱問題的規劃策略	zh_TW
dc.title (題名)	Deep Reinforcement Learning Strategies for Planning in Two-Dimensional Bin Packing Problems	en_US
dc.type (資料類型)	thesis	en_US
dc.relation.reference (參考文獻)	[1] Lodi, A., Martello, S., & Vigo, D. (2002). Recent advances on two-dimensional bin packing problems. Discrete Applied Mathematics, 123(1-3), 379-396. [2] Martello, S., & Vigo, D. (1998). Exact solution of the two-dimensional finite bin packing problem. Management science, 44(3), 388-399. [3] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., ... & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533. [4] Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., ... & Hassabis, D. (2017). Mastering the game of go without human knowledge. Nature, 550(7676), 354-359. [5] Berkey, J. O., & Wang, P. Y. (1987). Two-dimensional finite bin-packing algorithms. Journal of the operational research society, 38(5), 423-429. [6] Burke, E. K., Kendall, G., & Whitwell, G. (2006). A new placement heuristic for the or-thogonal stock-cutting problem. Operations Research, 54(4), 655-671. [7] Hopper, E., & Turton, B. C. (2001). A review of the application of meta-heuristic algo-rithms to 2D strip packing problems. Artificial Intelligence Review, 16(4), 257-300. [8] Terashima-Marín, H., Ross, P., Farías-Zárate, C., López-Camacho, E., & Valenzue-la-Rendón, M. (2008). Generalized hyper-heuristics for solving 2D Regular and Irregular Packing Problems. Annals of Operations Research, 179(1), 369-392. [9] Bennell, J. A., Lee, L. S., & Potts, C. N. (2013). A genetic algorithm for two-dimensional bin packing with due dates. International Journal of Production Economics, 145(2), 547-560. [10] Gomez, J. C., & Terashima-Marín, H. (2017). Evolutionary hyper-heuristics for tackling bi-objective 2D bin packing problems. Genetic Programming and Evolvable Machines, 18(3), 221-259. [11] Bouganis, A., & Shanahan, M. (2007). A vision-based intelligent system for packing 2-D irregular shapes. IEEE Transactions on Automation Science and Engineering, 4(3), 382-394. [12] Wang, L., Cai, J., & Zhao, X. (2024). Bin Packing Optimization via Deep Reinforcement Learning. arXiv:2403.12420. [13] López-Camacho, E., Terashima-Marín, H., Ross, P., & Ochoa, G. (2013). A unified hy-per-heuristic framework for solving bin packing problems. Expert Systems with Applica-tions, 40(13), 5516-5531. [14] Guo, H., Li, H., Shen, Y., & Wang, X. (2022). Two-dimensional irregular packing prob-lems: A review of mathematical models, solution approaches and applications. European Journal of Operational Research, 298(1), 1-20.	zh_TW

Publications-Theses

Article View/Open

Publication Export

Google ScholarTM

NCCU Library

Citation Infomation

Related Publications in TAIR

Google Scholar^TM