Title The Study of Nodes Fault Tolerance Based on Kubernetes High-availability Clusters: A Practical Study on HTTP Web Service (基於Kubernetes 高可用集群的節點失效容錯研究 以HTTP Web服務為驗證案例)
Author Luo, Shih-Yu (羅時雨)
Contributors Jang, Hung-Chin (張宏慶)
Luo, Shih-Yu (羅時雨)
Keywords Fault Tolerance
Node Failure
Containerization Platform
High-availability Cluster
Web Traffic Workload
Kubernetes
Container
Autoscaler
Date 2022
Upload time 1-Mar-2022 18:21:32 (UTC+8)
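Abstract In recent years, microservice architectures and containerization have become widespread. Software packaged with the Docker container as its standardized unit can be deployed quickly, adjusted flexibly, and run across platforms, which lets industry concentrate on innovation and business needs while managing the underlying infrastructure with little effort. With the rise of the Internet of Things, big data, and machine learning, large volumes of data must be processed in parallel across hosts, so system resources must remain available and stable when a service is interrupted unexpectedly.
As the number of containers grew, Docker, Inc. released Docker Swarm, a container management platform that schedules containers across hosts and adjusts their scale according to the workload. When a container stops unexpectedly, the Docker Swarm cluster automatically creates a new one, which keeps the container service highly available. Google released Kubernetes in the same period, so this study also compares the Kubernetes-based Horizontal Pod Autoscaler, which automatically adjusts the number of service Pods according to a target node memory utilization and thereby raises overall resource utilization. Kubernetes simplifies application management and deployment, but in-cluster performance after deployment has not been effectively evaluated or compared. This study tunes node resource allocation and parameter settings within the cluster using the Vertical-Pod-Autoscaler, Descheduler, Ingress Controller, and Scheduling Framework, and then compares the result with a Docker Swarm architecture. It verifies behavior when a cluster node fails, with the aim of improving the Web service traffic workload results across the cluster: average response time, maximum response time, connection success rate, number of successes, and number of failures.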
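The abstract describes the Horizontal Pod Autoscaler adjusting the number of Pods against a target utilization. As a minimal sketch (not the thesis's implementation), the scaling rule documented for upstream Kubernetes can be written as below. The function name `desired_replicas` and its parameters are hypothetical. The 0.1 tolerance mirrors the upstream default, and real HPA behavior adds minimum/maximum replica bounds and stabilization windows that are omitted here. Note that upstream Kubernetes computes utilization from each Pod's requested resources rather than from the node as a whole.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     tolerance: float = 0.1) -> int:
    """Sketch of the documented HPA rule:
    desired = ceil(current_replicas * current_metric / target_metric).

    All names here are illustrative assumptions, not part of the thesis.
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    ratio = current_utilization / target_utilization
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Never scale below one replica in this simplified sketch.
    return max(1, math.ceil(current_replicas * ratio))
```

For example, four Pods averaging 80% memory utilization against a 50% target would be scaled to `desired_replicas(4, 80, 50)` replicas, while a reading of 52% falls inside the tolerance band and leaves the count unchanged.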
In recent years, the microservice architecture and Docker containers have become popular. Their rapid deployment, flexible adjustment, and cross-platform operation enable enterprises to focus on innovation and business needs and to manage infrastructure easily. With the arrival of the Internet of Things, big data, and machine learning, large amounts of data are processed across hosts. Thus, when a system's service is suddenly interrupted, the availability of resources must be sustained steadily. The Kubernetes-based Horizontal Pod Autoscaler automatically adjusts the number of Pods according to the target memory utilization of the node, improving overall resource utilization. Kubernetes simplifies the management and deployment of Pods, but its performance has not been effectively evaluated. This study focuses on node resource configuration and parameter settings in the cluster, using the Vertical-Pod-Autoscaler, Scheduling Framework, Descheduler, and Ingress Controller to make optimization adjustments. The results are also compared with a Docker Swarm cluster. When a node fails, the study aims to optimize the Web service's average response time, longest response time, and connection success rate.
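The evaluation metrics named in the abstract (average response time, longest response time, connection success rate, and success and failure counts) can be summarized from raw request samples. The following is a hedged sketch only: the `WorkloadSummary` type, the `summarize` function, and the `(ok, elapsed_ms)` sample format are assumptions made for illustration, and computing response times over successful requests only is a design choice of this sketch rather than something the record states.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class WorkloadSummary:
    successes: int
    failures: int
    success_rate: float
    avg_response_ms: Optional[float]
    max_response_ms: Optional[float]


def summarize(samples: Iterable[Tuple[bool, float]]) -> WorkloadSummary:
    """Aggregate (ok, elapsed_ms) samples from an HTTP load test.

    Hypothetical helper: response times are taken from successful
    requests only, so failed connections do not skew the averages.
    """
    ok_times = []
    failures = 0
    for ok, elapsed_ms in samples:
        if ok:
            ok_times.append(elapsed_ms)
        else:
            failures += 1
    successes = len(ok_times)
    total = successes + failures
    return WorkloadSummary(
        successes=successes,
        failures=failures,
        success_rate=successes / total if total else 0.0,
        avg_response_ms=sum(ok_times) / successes if successes else None,
        max_response_ms=max(ok_times) if successes else None,
    )
```

Running the same summary before and after a node is taken offline would give comparable figures for the two cluster configurations.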
Description Master's thesis
National Chengchi University
In-service Master's Program, Department of Computer Science
107971018
Source http://thesis.lib.nccu.edu.tw/record/#G0107971018
Type thesis
dc.contributor.advisor Jang, Hung-Chin (張宏慶)
dc.contributor.author (Authors) Luo, Shih-Yu (羅時雨)
dc.creator (Author) Luo, Shih-Yu (羅時雨)
dc.date (Date) 2022
dc.date.accessioned 1-Mar-2022 18:21:32 (UTC+8)
dc.date.available 1-Mar-2022 18:21:32 (UTC+8)
dc.date.issued (Upload time) 1-Mar-2022 18:21:32 (UTC+8)
dc.identifier (Other Identifiers) G0107971018
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/139314
dc.description (Description) Master's thesis
dc.description (Description) National Chengchi University
dc.description (Description) In-service Master's Program, Department of Computer Science
dc.description (Description) 107971018
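dc.description.tableofcontents Chapter 1: Research Background
1-1 Research Background
1-2 Research Motivation
1-3 Implementation Goals and Research Contributions
Chapter 2: Related Work
2-1 Fault Tolerance in a Big Data Storage and Processing System
2-2 Docker Microservice Container Technology versus Earlier Virtual Machines
2-3 Docker Swarm Cluster Applications
2-4 ETCD Cluster Fault-Tolerance Service
Chapter 3: System Architecture
3-1 Functions and Roles of the System Architecture Components
3-2 Operational Details of the System Architecture Components
3-2-1 The Kubernetes Master Node
3-2-1-1 API Server on the Master Node
3-2-1-2 Controller Manager on the Master Node
3-2-1-3 Kube-scheduler on the Master Node
3-2-2 Worker Nodes in the Kubernetes Architecture
3-2-3 Deployment in the Kubernetes Architecture
3-2-4 Horizontal Pod Autoscaler in the Kubernetes Architecture
3-2-5 Running the Horizontal Pod Autoscaler from Metrics Server Monitoring Data under the Related-Work Architecture
3-3 Node Failure Scenarios in the Cluster under This Architecture
3-4 Definition of the Fault Type: Web HTTP Service Connection Anomalies
Chapter 4: Experiments
4-1 Experimental Environment Setup
4-2-1 Optimizing the Horizontal Pod Autoscaler Target Utilization
4-2-2 VPA (Vertical-Pod-Autoscaler)
4-2-3 Descheduler
4-2-4 Ingress Controller Load Balancing
4-2-5 Scheduling Framework
4-3-1 Experiment Cases and Test Scripts
4-3-2 Resource Measure
4-3-3 Resource Provision
4-3-4 Web Traffic Workload Verification Method
4-4 Experimental Results
Chapter 5: Conclusion
References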
dc.format.extent 5865079 bytes
dc.format.mimetype application/pdf
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0107971018
dc.subject (Keywords) Fault Tolerance
dc.subject (Keywords) Node Failure
dc.subject (Keywords) Containerization Platform
dc.subject (Keywords) High-availability Cluster
dc.subject (Keywords) Web Traffic Workload
dc.subject (Keywords) Kubernetes
dc.subject (Keywords) Container
dc.subject (Keywords) Autoscaler
dc.title (Title) The Study of Nodes Fault Tolerance Based on Kubernetes High-availability Clusters: A Practical Study on HTTP Web Service (基於Kubernetes 高可用集群的節點失效容錯研究 以HTTP Web服務為驗證案例)
dc.type (Type) thesis
dc.identifier.doi (DOI) 10.6814/NCCU202200253