Title: 基於Kubernetes高可用集群的節點失效容錯研究:以HTTP Web服務為驗證案例 (The Study of Nodes Fault Tolerance Based on Kubernetes High-availability Clusters: A Practical Study on HTTP Web Service)
Author: Luo, Shih-Yu (羅時雨)
Contributors: Jang, Hung-Chin (張宏慶); Luo, Shih-Yu (羅時雨)
Keywords:
Fault Tolerance
Node Failure
Containerization Platform
High-Availability Cluster
Web Traffic Workload
Kubernetes
Container
Autoscaler
Date: 2022
Uploaded: 1-Mar-2022 18:21:32 (UTC+8)
Abstract: In recent years, microservice architectures and containerization have become widespread. Software packaged with Docker containers as the standard unit offers rapid deployment, flexible adjustment, and cross-platform operation, which lets industry focus on innovation and business needs while managing the underlying infrastructure easily. With the rise of the Internet of Things and big-data machine learning, large volumes of data must be processed in parallel across hosts, so system resources must remain available and stable when a service is interrupted unexpectedly. As container counts grew, Docker Inc. released Docker Swarm, a container management platform that schedules containers across hosts and scales them according to workload. When a container stops unexpectedly, the Docker Swarm cluster automatically creates a new one, which keeps the container service highly available. Google released Kubernetes in the same period, so this study also compares the Kubernetes-based Horizontal Pod Autoscaler, which automatically adjusts the number of service Pods according to the target memory utilization of the node and thereby improves overall resource utilization. Kubernetes simplifies application management and deployment, but performance inside the cluster after deployment has not been effectively evaluated or compared. This study targets node resource configuration and parameter settings within the cluster and tunes them with the Vertical-Pod-Autoscaler, Descheduler, Ingress Controller, and Scheduling Framework, then compares the result with the Docker Swarm architecture. It verifies behavior when a node in the cluster fails and optimizes the Web service traffic-workload results: average response time, longest response time, connection success rate, number of successes, and number of failures.
Description:
Master's thesis
National Chengchi University
In-service Master's Program, Department of Computer Science
107971018
Source: http://thesis.lib.nccu.edu.tw/record/#G0107971018
Type: thesis
dc.contributor.advisor: Jang, Hung-Chin (張宏慶)
dc.identifier (Other Identifiers): G0107971018
dc.identifier.uri (URI): http://nccur.lib.nccu.edu.tw/handle/140.119/139314
dc.description.tableofcontents:
Chapter 1. Research Background
1-1 Research Background
1-2 Research Motivation
1-3 Implementation Goals and Research Contributions
Chapter 2. Related Work
2-1 Fault Tolerance in Big Data Storage and Processing Systems
2-2 Docker Microservice Container Technology versus Traditional Virtual Machines
2-3 Docker Swarm Cluster Applications
2-4 ETCD Cluster Fault-Tolerant Service
Chapter 3. System Architecture
3-1 Functions and Roles of the Architecture Components
3-2 Operational Details of the Architecture Components
3-2-1 The Kubernetes Master Node
3-2-1-1 API Server on the Master Node
3-2-1-2 Controller Manager on the Master Node
3-2-1-3 Kube-scheduler on the Master Node
3-2-2 Worker Nodes in the Kubernetes Architecture
3-2-3 Deployment in the Kubernetes Architecture
3-2-4 Horizontal Pod Autoscaler in the Kubernetes Architecture
3-2-5 Running the Horizontal Pod Autoscaler from Metrics Server Monitoring Data under the Related-Work Architecture
3-3 Node Failure Scenarios in the Cluster under This Architecture
3-4 Definition of the Fault Type: Abnormal Web HTTP Service Connections
Chapter 4. Experiments
4-1 Experimental Environment Setup
4-2-1 Optimizing the Horizontal Pod Autoscaler Target Utilization
4-2-2 VPA (Vertical-pod-autoscaler)
4-2-3 Descheduler
4-2-4 Ingress Controller Load Balancing
4-2-5 Scheduling Framework
4-3-1 Experimental Cases and Test Scripts
4-3-2 Resource Measure
4-3-3 Resource Provision
4-3-4 Web Traffic Workload Verification Method
4-4 Experimental Results
Chapter 5. Conclusion
References
dc.format.extent: 5865079 bytes
dc.format.mimetype: application/pdf
dc.identifier.doi (DOI): 10.6814/NCCU202200253
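Illustrative note on the abstract: the abstract states that the Horizontal Pod Autoscaler adjusts the number of Pods against a target memory utilization. The sketch below shows how such a scaling rule can be computed, assuming the rule published in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with a 10% tolerance band. The function name, parameter names, replica bounds, and sample numbers are hypothetical and are not taken from the thesis.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     tolerance: float = 0.1,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Return the Pod count suggested by the documented HPA scaling rule.

    All names and default bounds here are illustrative assumptions.
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")

    # Ratio of observed utilization to the configured target.
    ratio = current_utilization / target_utilization

    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Clamp to the configured minimum and maximum replica counts.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 Pods at 90% utilization against a 60% target.
    print(desired_replicas(4, 90.0, 60.0))
```

For example, 4 Pods at 90% utilization against a 60% target give ceil(4 × 90 / 60) = 6 Pods, while a reading of 62% against the same target stays within the tolerance band and leaves the count at 4.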