Title: 應用多顆 GPU 以及 PyTorch 指令進行平行處理於加速學習演算法之執行速度 (Applying multiple GPUs and PyTorch commands for parallel processing to accelerate the execution speed of learning algorithms)
Author: Fang, Kai-Rou (方凱柔)
Advisors: Tsaih, Rua-Huan (蔡瑞煌); Lin, Yi-Ling (林怡伶)
Keywords: Learning algorithm; Adaptive neural networks; PyTorch; Data parallelism
Date: 2024
Uploaded: 5-Aug-2024 12:07:57 (UTC+8)
Abstract: Research on the parallel processing and multi-GPU application of two-layer adaptive neural networks (2LANN) is relatively scarce. This study explores the data-parallel processing of 2LANN by leveraging the PyTorch framework and its related commands in combination with multiple GPUs. The Pupil Learning Mechanism algorithm is employed to achieve more efficient computation in a multi-GPU environment. Using a copper-price-prediction dataset, a series of experiments validates this approach and analyzes the impact of multi-GPU parallel processing on model training speed and accuracy, in order to comprehensively evaluate the practical effectiveness and application value of the proposed method. The study is expected to provide a simple parallel-processing module that lets future research on 2LANN apply parallel processing quickly and easily.
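The full thesis is not reproduced in this record, but the multi-GPU setup the abstract describes follows PyTorch's standard data-parallel pattern. A minimal sketch is given below, assuming a generic two-layer feed-forward regressor with dummy tensors; `TwoLayerNet`, `x`, and `y` are hypothetical names for illustration, and this shows `torch.nn.DataParallel` generically, not the thesis's actual Pupil Learning Mechanism implementation:

```python
import torch
import torch.nn as nn

class TwoLayerNet(nn.Module):
    """Generic two-layer feed-forward net (a stand-in for the thesis's 2LANN)."""
    def __init__(self, n_inputs: int, n_hidden: int):
        super().__init__()
        self.hidden = nn.Linear(n_inputs, n_hidden)
        self.output = nn.Linear(n_hidden, 1)

    def forward(self, x):
        return self.output(torch.relu(self.hidden(x)))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TwoLayerNet(n_inputs=8, n_hidden=16).to(device)

# DataParallel replicates the model on every visible GPU, scatters each
# input batch along dimension 0, and gathers the outputs on the first device.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

criterion = nn.L1Loss()  # MAE, the error measure the thesis reports
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(64, 8, device=device)  # dummy batch: 64 samples, 8 features
y = torch.randn(64, 1, device=device)  # dummy regression targets

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```

`nn.DataParallel` is single-process and multi-threaded; the `DistributedDataParallel` variant covered in the thesis outline further below avoids that bottleneck by running one process per GPU.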
References:
Bahrampour, S., Ramakrishnan, N., Schott, L., & Shah, M. (2015). Comparative study of deep learning software frameworks. arXiv preprint arXiv:1511.06435.
DataParallel — PyTorch 2.2 documentation. (2024). Retrieved March 6, 2024, from https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html
Distributed communication package - torch.distributed — PyTorch 2.3 documentation. (2024). Retrieved June 3, 2024, from https://pytorch.org/docs/stable/distributed.html
Distributed Data Parallel — PyTorch 2.3 documentation. (2024). Retrieved June 2, 2024, from https://pytorch.org/docs/stable/notes/ddp.html
Distributed data parallel training using Pytorch on AWS | Telesens. (2019). Retrieved May 31, 2024, from https://www.telesens.co/2019/04/04/distributed-data-parallel-training-using-pytorch-on-aws/
DistributedDataParallel — PyTorch 2.3 documentation. (2024). Retrieved June 1, 2024, from https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html
Fan, S., Rong, Y., Meng, C., Cao, Z., Wang, S., Zheng, Z., Wu, C., Long, G., Yang, J., Xia, L., Diao, L., Liu, X., & Lin, W. (2021). DAPPLE: A pipelined data parallel approach for training large models. Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 431–445.
Geng, J., Li, D., & Wang, S. (2019). ElasticPipe: An efficient and dynamic model-parallel solution to DNN training. ScienceCloud 2019: Proceedings of the 10th Workshop on Scientific Cloud Computing, co-located with HPDC 2019, 5–9.
Hara, K., Saito, D., & Shouno, H. (2015). Analysis of function of rectified linear unit used in deep learning. 2015 International Joint Conference on Neural Networks (IJCNN), 1–8.
Harlap, A., Narayanan, D., Phanishayee, A., Seshadri, V., Devanur, N., Ganger, G., & Gibbons, P. (2018). PipeDream: Fast and efficient pipeline parallel DNN training. arXiv preprint arXiv:1806.03377.
Ketkar, N., & Moolayil, J. (2021). Deep learning with Python: Learn best practices of deep learning models with PyTorch. Apress Media LLC. https://doi.org/10.1007/978-1-4842-5364-9
Khomenko, V., Shyshkov, O., Radyvonenko, O., & Bokhan, K. (2016). Accelerating recurrent neural network training using sequence bucketing and multi-GPU data parallelization. 2016 IEEE First International Conference on Data Stream Mining & Processing (DSMP), 100–103.
Krizhevsky, A. (2014). One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997.
Lee, S., Kang, Q., Madireddy, S., Balaprakash, P., Agrawal, A., Choudhary, A., Archibald, R., & Liao, W. (2019). Improving scalability of parallel CNN training by adjusting mini-batch size at run-time. 2019 IEEE International Conference on Big Data (Big Data), 830–839.
Nguyen, T. D. T., Park, J. H., Hossain, M. I., Hossain, M. D., Lee, S.-J., Jang, J. W., Jo, S. H., Huynh, L. N. T., Tran, T. K., & Huh, E.-N. (2019). Performance analysis of data parallelism technique in machine learning for human activity recognition using LSTM. 2019 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), 387–391.
Optional: Data Parallelism — PyTorch Tutorials 2.2.0+cu121 documentation. (2024). Retrieved February 21, 2024, from https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html
Owens, J. D., Houston, M., Luebke, D., Green, S., Stone, J. E., & Phillips, J. C. (2008). GPU computing. Proceedings of the IEEE, 96(5), 879–899.
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., & Lerer, A. (2017). Automatic differentiation in PyTorch. NIPS 2017 Workshop on Autodiff.
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., … Chintala, S. (2019). PyTorch: An imperative style, high-performance deep learning library. arXiv preprint arXiv:1912.01703.
Pérez-Sánchez, B., Fontenla-Romero, O., & Guijarro-Berdiñas, B. (2018). A review of adaptive online learning for artificial neural networks. Artificial Intelligence Review, 49(2), 281–299.
PyTorch Distributed Overview — PyTorch Tutorials 2.3.0+cu121 documentation. (2024). Retrieved June 1, 2024, from https://pytorch.org/tutorials/beginner/dist_overview.html
Ren-Han, Y. (2022). An adaptive learning-based model for copper price forecasting. Master's thesis, Department of Information Management, National Chengchi University, 1–78.
Sanders, J., Kandrot, E., & Jacoboni, E. (2011). CUDA par l'exemple [une introduction à la programmation parallèle de GPU]. Pearson.
torch.nn — PyTorch 2.2 documentation. (2024). Retrieved March 5, 2024, from https://pytorch.org/docs/stable/nn.html
torch.nn.parallel.data_parallel — PyTorch 2.3 documentation. (2024). Retrieved June 2, 2024, from https://pytorch.org/docs/stable/_modules/torch/nn/parallel/data_parallel.html
torch.utils.data — PyTorch 2.3 documentation. (2024). Retrieved June 3, 2024, from https://pytorch.org/docs/stable/data.html#single-and-multi-process-data-loading
Tsai, Y.-H., Jheng, Y.-J., & Tsaih, R.-H. (2019). The cramming, softening and integrating learning algorithm with parametric ReLU activation function for binary input/output problems. 2019 International Joint Conference on Neural Networks (IJCNN), 1–7.
Tsaih, R. R. (1998). An explanation of reasoning neural networks. Mathematical and Computer Modelling, 28(2).
Tsaih, R.-H., Chien, Y.-H., & Chien, S.-Y. (2023). Pupil learning mechanism. arXiv preprint arXiv:2307.16141.
Description: Master's thesis, Department of Information Management, National Chengchi University (student ID 111356049)
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/152416
Source: http://thesis.lib.nccu.edu.tw/record/#G0111356049
Format: application/pdf, 3189792 bytes
Type: thesis
Table of contents:
Chapter 1. Introduction
Chapter 2. Literature Review
  2.1 Pupil Learning Mechanism (Tsaih et al., 2023)
  2.2 PyTorch
  2.3 Data Parallel Processing in PyTorch
    2.3.1 DataParallel
    2.3.2 DistributedDataParallel
Chapter 3. Research Methodology
  3.1 Algorithm of RPLM
  3.2 Parallel Processing
    3.2.1 Organizing Module with DataParallel
    3.2.2 Organizing Module with DistributedDataParallel
Chapter 4. Experiment Design
  4.1 Dataset
  4.2 Experiment Evaluation
Chapter 5. Experiment Results
  5.1 Training Time
    5.1.1 Training Time of the Understanding Module of the Organizing Module
    5.1.2 Training Time of the Organizing Module
    5.1.3 Overall Training Time
  5.2 MAE Results
Chapter 6. Conclusion and Future Work
  6.1 Conclusion
  6.2 Limitation and Future Work
References
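Sections 2.3.2 and 3.2.2 of the outline cover DistributedDataParallel (DDP), which runs one process per GPU and synchronizes gradients by all-reduce, rather than DataParallel's single-process threading. Below is a rough single-node skeleton of that pattern under standard `torchrun` launching; the small `nn.Sequential` network and random tensors are illustrative assumptions standing in for the thesis's organizing module and copper-price data, not its actual code:

```python
# Launch with, e.g.: torchrun --nproc_per_node=2 ddp_sketch.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE in the environment.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # A small two-layer network stands in for the thesis's 2LANN.
    model = torch.nn.Sequential(
        torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1)
    ).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    # DistributedSampler hands each process a disjoint shard of the data.
    dataset = TensorDataset(torch.randn(1024, 8), torch.randn(1024, 1))
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    criterion = torch.nn.L1Loss()  # MAE, matching the thesis's evaluation
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for epoch in range(5):
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()  # DDP all-reduces gradients across processes here
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Because each DDP process owns its GPU and its data shard, scaling to more GPUs changes only the `--nproc_per_node` argument, which is what makes the pattern attractive as a reusable parallel-processing module.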