Publications-Theses

Article View/Open

Publication Export

Google ScholarTM

NCCU Library

Citation Infomation

Related Publications in TAIR

題名 深度強化學習於二維裝箱問題的規劃策略
Deep Reinforcement Learning Strategies for Planning in Two-Dimensional Bin Packing Problems
作者 詹凱傑
Chan, Kai-Jie
貢獻者 李蔡彥
Li, Tsai-Yen
詹凱傑
Chan, Kai-Jie
關鍵詞 強化學習
深度強化學習
二維裝箱問題
Deep Reinforcement Learning
Deep Reinforcement Learning
Two-dimensional Bin Packing Problem
2DBP
日期 2025
上傳時間 4-Aug-2025 15:09:17 (UTC+8)
摘要 二維裝箱問題是典型的NP-hard組合最佳化問題,廣泛應用於製造業和物流業。傳統啟發式演算法運算效率高但缺乏適應性,深度強化學習具備學習能力但容易陷入決策困境。本研究提出創新的混合策略方法,將深度強化學習與First-Fit演算法相結合。研究設計基於雙流網路架構的深度強化學習模型,將複雜動作空間分解為物品選擇和位置決策兩個子問題,採用多層次獎勵機制和專家引導策略提升學習效果。實驗採用三種代表性資料集評估六種演算法。結果顯示,DRL + First-Fit混合策略在測試的資料集上,基於物品放置率和空間利用率兩個重要指標上,比傳統最佳演算法皆有更好的綜合表現,且明顯改善純深度強化學習的密集空間放置不穩定問題。 本研究的貢獻包括提出深度強化學習與傳統演算法的創新混合策略、設計完整的二維裝箱問題解決系統及建立標準化評估框架。實驗結果驗證了混合策略的有效性,也為相關問題提供方法論的參考。
The two-dimensional bin-packing problem represents a typical NP-hard combinatorial op-timization challenge with extensive applications in manufacturing and logistics industries. While traditional heuristic algorithms demonstrate high computational efficiency, they lack adaptability. On the contrary, deep reinforcement learning has strong learning capabilities but tends to en-counter decision-making dilemmas. This research proposes an innovative hybrid strategy that combines deep reinforcement learning with the First-Fit algorithm. The study designs a deep reinforcement learning model based on a dual-stream network architecture, decomposing the complex action space into two sub-problems: item selection and position decision-making. The system employs multi-level reward mechanisms and expert guidance strategies to enhance learning effectiveness. The experiments utilize three representative datasets to evaluate six algorithms. The results demonstrate that the DRL+First-Fit hybrid strategy achieves superior comprehensive perfor-mance compared to the best traditional algorithms in the tested datasets, based on two critical metrics: item placement rate and space utilization rate. Furthermore, the approach significantly addresses the instability issues of pure deep reinforcement learning in dense spatial arrangement scenarios. Our research contributions include proposing an innovative hybrid strategy that combines deep reinforcement learning with traditional algorithms, designing a comprehensive two-dimensional bin packing solution system, and establishing a standardized evaluation framework. Experimental results validate the effectiveness of the hybrid strategy, providing methodological references for related fields.
參考文獻 [1] Lodi, A., Martello, S., & Vigo, D. (2002). Recent advances on two-dimensional bin packing problems. Discrete Applied Mathematics, 123(1-3), 379-396. [2] Martello, S., & Vigo, D. (1998). Exact solution of the two-dimensional finite bin packing problem. Management science, 44(3), 388-399. [3] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., ... & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533. [4] Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., ... & Hassabis, D. (2017). Mastering the game of go without human knowledge. Nature, 550(7676), 354-359. [5] Berkey, J. O., & Wang, P. Y. (1987). Two-dimensional finite bin-packing algorithms. Journal of the operational research society, 38(5), 423-429. [6] Burke, E. K., Kendall, G., & Whitwell, G. (2006). A new placement heuristic for the or-thogonal stock-cutting problem. Operations Research, 54(4), 655-671. [7] Hopper, E., & Turton, B. C. (2001). A review of the application of meta-heuristic algo-rithms to 2D strip packing problems. Artificial Intelligence Review, 16(4), 257-300. [8] Terashima-Marín, H., Ross, P., Farías-Zárate, C., López-Camacho, E., & Valenzue-la-Rendón, M. (2008). Generalized hyper-heuristics for solving 2D Regular and Irregular Packing Problems. Annals of Operations Research, 179(1), 369-392. [9] Bennell, J. A., Lee, L. S., & Potts, C. N. (2013). A genetic algorithm for two-dimensional bin packing with due dates. International Journal of Production Economics, 145(2), 547-560. [10] Gomez, J. C., & Terashima-Marín, H. (2017). Evolutionary hyper-heuristics for tackling bi-objective 2D bin packing problems. Genetic Programming and Evolvable Machines, 18(3), 221-259. [11] Bouganis, A., & Shanahan, M. (2007). A vision-based intelligent system for packing 2-D irregular shapes. IEEE Transactions on Automation Science and Engineering, 4(3), 382-394. [12] Wang, L., Cai, J., & Zhao, X. (2024). Bin Packing Optimization via Deep Reinforcement Learning. arXiv:2403.12420. [13] López-Camacho, E., Terashima-Marín, H., Ross, P., & Ochoa, G. (2013). A unified hy-per-heuristic framework for solving bin packing problems. Expert Systems with Applica-tions, 40(13), 5516-5531. [14] Guo, H., Li, H., Shen, Y., & Wang, X. (2022). Two-dimensional irregular packing prob-lems: A review of mathematical models, solution approaches and applications. European Journal of Operational Research, 298(1), 1-20.
描述 碩士
國立政治大學
資訊科學系碩士在職專班
106971025
資料來源 http://thesis.lib.nccu.edu.tw/record/#G0106971025
資料類型 thesis
dc.contributor.advisor 李蔡彥zh_TW
dc.contributor.advisor Li, Tsai-Yenen_US
dc.contributor.author (Authors) 詹凱傑zh_TW
dc.contributor.author (Authors) Chan, Kai-Jieen_US
dc.creator (作者) 詹凱傑zh_TW
dc.creator (作者) Chan, Kai-Jieen_US
dc.date (日期) 2025en_US
dc.date.accessioned 4-Aug-2025 15:09:17 (UTC+8)-
dc.date.available 4-Aug-2025 15:09:17 (UTC+8)-
dc.date.issued (上傳時間) 4-Aug-2025 15:09:17 (UTC+8)-
dc.identifier (Other Identifiers) G0106971025en_US
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/158704-
dc.description (描述) 碩士zh_TW
dc.description (描述) 國立政治大學zh_TW
dc.description (描述) 資訊科學系碩士在職專班zh_TW
dc.description (描述) 106971025zh_TW
dc.description.abstract (摘要) 二維裝箱問題是典型的NP-hard組合最佳化問題,廣泛應用於製造業和物流業。傳統啟發式演算法運算效率高但缺乏適應性,深度強化學習具備學習能力但容易陷入決策困境。本研究提出創新的混合策略方法,將深度強化學習與First-Fit演算法相結合。研究設計基於雙流網路架構的深度強化學習模型,將複雜動作空間分解為物品選擇和位置決策兩個子問題,採用多層次獎勵機制和專家引導策略提升學習效果。實驗採用三種代表性資料集評估六種演算法。結果顯示,DRL + First-Fit混合策略在測試的資料集上,基於物品放置率和空間利用率兩個重要指標上,比傳統最佳演算法皆有更好的綜合表現,且明顯改善純深度強化學習的密集空間放置不穩定問題。 本研究的貢獻包括提出深度強化學習與傳統演算法的創新混合策略、設計完整的二維裝箱問題解決系統及建立標準化評估框架。實驗結果驗證了混合策略的有效性,也為相關問題提供方法論的參考。zh_TW
dc.description.abstract (摘要) The two-dimensional bin-packing problem represents a typical NP-hard combinatorial op-timization challenge with extensive applications in manufacturing and logistics industries. While traditional heuristic algorithms demonstrate high computational efficiency, they lack adaptability. On the contrary, deep reinforcement learning has strong learning capabilities but tends to en-counter decision-making dilemmas. This research proposes an innovative hybrid strategy that combines deep reinforcement learning with the First-Fit algorithm. The study designs a deep reinforcement learning model based on a dual-stream network architecture, decomposing the complex action space into two sub-problems: item selection and position decision-making. The system employs multi-level reward mechanisms and expert guidance strategies to enhance learning effectiveness. The experiments utilize three representative datasets to evaluate six algorithms. The results demonstrate that the DRL+First-Fit hybrid strategy achieves superior comprehensive perfor-mance compared to the best traditional algorithms in the tested datasets, based on two critical metrics: item placement rate and space utilization rate. Furthermore, the approach significantly addresses the instability issues of pure deep reinforcement learning in dense spatial arrangement scenarios. Our research contributions include proposing an innovative hybrid strategy that combines deep reinforcement learning with traditional algorithms, designing a comprehensive two-dimensional bin packing solution system, and establishing a standardized evaluation framework. Experimental results validate the effectiveness of the hybrid strategy, providing methodological references for related fields.en_US
dc.description.tableofcontents 第一章 緒論 1 第一節 研究動機 1 第二節 研究背景 2 第三節 研究目標 3 第四節 研究挑戰 4 第五節 論文貢獻 5 第六節 論文架構 6 第二章 相關研究 7 第一節 技術背景 7 第二節 相關研究 9 第三節 本章結語 14 第三章 系統架構設計 15 第一節 問題定義與系統目標 15 第二節 系統架構設計 20 第三節 深度強化學習模型設計 25 第四節 混合演算法策略設計 34 第五節 本章結語 37 第四章 系統實作 38 第一節 實作環境與技術選擇 38 第二節 核心元件實作 40 第三節 關鍵技術挑戰與解決方案 48 第四節 實驗系統實作 52 第五節 本章結語 61 第五章 實驗設計與結果分析 62 第一節 實驗目標 62 第二節 實驗環境與設置 62 第三節 實驗結果 66 第四節 實驗結果分析與討論 75 第五節 本章結語 83 第六章 結論與未來研究 84 第一節 研究總結 84 第二節 研究限制 87 第三節 未來研究方向 88 第四節 研究總結 89 參考文獻 90zh_TW
dc.format.extent 5758191 bytes-
dc.format.mimetype application/pdf-
dc.source.uri (資料來源) http://thesis.lib.nccu.edu.tw/record/#G0106971025en_US
dc.subject (關鍵詞) 強化學習zh_TW
dc.subject (關鍵詞) 深度強化學習zh_TW
dc.subject (關鍵詞) 二維裝箱問題zh_TW
dc.subject (關鍵詞) Deep Reinforcement Learningen_US
dc.subject (關鍵詞) Deep Reinforcement Learningen_US
dc.subject (關鍵詞) Two-dimensional Bin Packing Problemen_US
dc.subject (關鍵詞) 2DBPen_US
dc.title (題名) 深度強化學習於二維裝箱問題的規劃策略zh_TW
dc.title (題名) Deep Reinforcement Learning Strategies for Planning in Two-Dimensional Bin Packing Problemsen_US
dc.type (資料類型) thesisen_US
dc.relation.reference (參考文獻) [1] Lodi, A., Martello, S., & Vigo, D. (2002). Recent advances on two-dimensional bin packing problems. Discrete Applied Mathematics, 123(1-3), 379-396. [2] Martello, S., & Vigo, D. (1998). Exact solution of the two-dimensional finite bin packing problem. Management science, 44(3), 388-399. [3] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., ... & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533. [4] Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., ... & Hassabis, D. (2017). Mastering the game of go without human knowledge. Nature, 550(7676), 354-359. [5] Berkey, J. O., & Wang, P. Y. (1987). Two-dimensional finite bin-packing algorithms. Journal of the operational research society, 38(5), 423-429. [6] Burke, E. K., Kendall, G., & Whitwell, G. (2006). A new placement heuristic for the or-thogonal stock-cutting problem. Operations Research, 54(4), 655-671. [7] Hopper, E., & Turton, B. C. (2001). A review of the application of meta-heuristic algo-rithms to 2D strip packing problems. Artificial Intelligence Review, 16(4), 257-300. [8] Terashima-Marín, H., Ross, P., Farías-Zárate, C., López-Camacho, E., & Valenzue-la-Rendón, M. (2008). Generalized hyper-heuristics for solving 2D Regular and Irregular Packing Problems. Annals of Operations Research, 179(1), 369-392. [9] Bennell, J. A., Lee, L. S., & Potts, C. N. (2013). A genetic algorithm for two-dimensional bin packing with due dates. International Journal of Production Economics, 145(2), 547-560. [10] Gomez, J. C., & Terashima-Marín, H. (2017). Evolutionary hyper-heuristics for tackling bi-objective 2D bin packing problems. Genetic Programming and Evolvable Machines, 18(3), 221-259. [11] Bouganis, A., & Shanahan, M. (2007). A vision-based intelligent system for packing 2-D irregular shapes. IEEE Transactions on Automation Science and Engineering, 4(3), 382-394. [12] Wang, L., Cai, J., & Zhao, X. (2024). Bin Packing Optimization via Deep Reinforcement Learning. arXiv:2403.12420. [13] López-Camacho, E., Terashima-Marín, H., Ross, P., & Ochoa, G. (2013). A unified hy-per-heuristic framework for solving bin packing problems. Expert Systems with Applica-tions, 40(13), 5516-5531. [14] Guo, H., Li, H., Shen, Y., & Wang, X. (2022). Two-dimensional irregular packing prob-lems: A review of mathematical models, solution approaches and applications. European Journal of Operational Research, 298(1), 1-20.zh_TW