Title Enhancing Node Fault Tolerance through High-Availability Clusters in Kubernetes
Authors 張宏慶
Jang, Hung-Chin; Luo, Shih-Yu
Contributors Department of Computer Science
Keywords fault tolerance; Kubernetes; container; Autoscaler
Date 2023-04
Uploaded 16-Feb-2024 15:36:33 (UTC+8)
Abstract Microservices architecture and containerization technology have recently become ubiquitous. Docker containers have emerged as a standardized software packaging unit, offering rapid deployment, flexible scaling, and cross-platform operability. This enables the industry to focus on innovation and business needs while managing the underlying infrastructure effortlessly. With the advent of technologies such as the Internet of Things, big data, and machine learning, there is a growing demand for parallel processing of large amounts of data across multiple hosts. Ensuring system resource availability and stability is essential, especially when services experience unexpected interruptions. As the number of containers grows, Docker has introduced Docker Swarm, a container management platform that manages and schedules containers across multiple hosts and adjusts their operational scale based on workload. If a container unexpectedly stops operating, the Docker Swarm cluster automatically launches replacement containers, ensuring the high availability of container services. Meanwhile, Google has introduced Kubernetes, a container orchestration system. The Horizontal Pod Autoscaler in Kubernetes automatically adjusts the number of service Pods based on a target memory-usage threshold, improving overall resource utilization. While Kubernetes can simplify application management and deployment, the cluster's performance after deployment has yet to be effectively evaluated and compared. This study aims to optimize and adjust cluster node resource configuration and parameter settings using tools such as the Vertical Pod Autoscaler, Descheduler, Ingress Controller, and Scheduling Framework. The performance of Kubernetes is compared with the Docker Swarm architecture by analyzing the average response time, longest response time, connection success rate, success count, and failure count of the overall cluster's web service traffic workload. The optimization is carried out to ensure the high availability of container services in case of node failures.
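The memory-based Horizontal Pod Autoscaler behavior described in the abstract can be sketched as a standard Kubernetes `autoscaling/v2` manifest. This is a minimal illustrative configuration, not the paper's actual setup: the Deployment name `web-service`, the replica bounds, and the 70% target utilization are assumptions for the sketch.

```yaml
# Minimal HorizontalPodAutoscaler sketch (autoscaling/v2 API).
# Scales the hypothetical "web-service" Deployment between 2 and 10
# Pods, targeting 70% average memory utilization across Pods.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-service    # hypothetical Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
```

Applied with `kubectl apply -f hpa.yaml`, the controller would add Pods when average memory utilization exceeds the target and remove them when it falls below, which is the scaling mechanism the abstract contrasts with Docker Swarm's fixed replica model.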
Relation 2023 IEEE 3rd International Conference on Electronic Communications, Internet of Things and Big Data (ICEIB), IEEE, Asia University, International Institute of Knowledge Innovation and Invention, NTSC
Type conference
DOI https://doi.org/10.1109/ICEIB57887.2023.10170110
URI https://nccur.lib.nccu.edu.tw/handle/140.119/149875