Title Enhancing Node Fault Tolerance through High-Availability Clusters in Kubernetes
Authors 張宏慶
Jang, Hung-Chin; Luo, Shih-Yu
Contributors Department of Computer Science
Keywords fault tolerance; Kubernetes; container; Autoscaler
Date 2023-04
Uploaded 16-Feb-2024 15:36:33 (UTC+8)
Abstract Microservices architecture and containerization technology have recently become ubiquitous. Docker containers have emerged as a standardized software packaging unit, offering rapid deployment, flexible scaling, and cross-platform operability. This enables the industry to focus on innovation and business needs while managing the underlying infrastructure effortlessly. With the advent of technologies such as the Internet of Things, big data, and machine learning, there is a growing demand for parallel processing of large amounts of data across multiple hosts. Ensuring system resource availability and stability is essential, especially when services experience unexpected interruptions. As the number of containers grows, Docker has introduced Docker Swarm, a container management platform that manages and schedules containers across multiple hosts and adjusts their operational scale based on workload. If a container unexpectedly stops operating, the Docker Swarm cluster automatically launches replacement containers, ensuring the high availability of container services. Meanwhile, Google has introduced Kubernetes, a container orchestration system. The Horizontal Pod Autoscaler in Kubernetes automatically adjusts the number of service Pods based on a target memory-usage threshold, improving overall resource utilization. While Kubernetes can simplify application management and deployment, the cluster's performance after deployment has yet to be effectively evaluated and compared. This study aims to optimize and adjust cluster node resource configuration and parameter settings using tools such as the Vertical Pod Autoscaler, Descheduler, Ingress Controller, and Scheduling Framework. The performance of Kubernetes is compared with the Docker Swarm architecture by analyzing the average response time, longest response time, connection success rate, success count, and failure count of the overall cluster's web service traffic workload. The optimization is carried out to ensure the high availability of container services in case of node failures.
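The memory-based Horizontal Pod Autoscaler behavior described in the abstract can be sketched as a standard Kubernetes `autoscaling/v2` manifest. This is a minimal illustrative configuration, not the paper's actual setup: the Deployment name `web-service`, the replica bounds, and the 70% target utilization are assumptions for the sketch.

```yaml
# Minimal HorizontalPodAutoscaler sketch (autoscaling/v2 API).
# Scales the hypothetical "web-service" Deployment between 2 and 10
# Pods, targeting 70% average memory utilization across Pods.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-service    # hypothetical Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
```

Applied with `kubectl apply -f hpa.yaml`, the controller would add Pods when average memory utilization exceeds the target and remove them when it falls below, which is the scaling mechanism the abstract contrasts with Docker Swarm's fixed replica model.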
Relation 2023 IEEE 3rd International Conference on Electronic Communications, Internet of Things and Big Data (ICEIB), IEEE, Asia University, International Institute of Knowledge Innovation and Invention, NTSC
Type conference
DOI https://doi.org/10.1109/ICEIB57887.2023.10170110
URI https://nccur.lib.nccu.edu.tw/handle/140.119/149875