Kubernetes High Availability Based on LVScare
In cloud environments, high availability for the Kubernetes control plane is usually handed off to the cloud provider's LoadBalancer. In the following scenarios, however, that option is off the table:
- Private IDC / bare-metal clusters
- Edge computing and air-gapped environments
- Migrating from public cloud back to a self-hosted data center (no cloud LB / SLB available)
- Wanting fewer components and avoiding keepalived / VRRP
I had previously looked into sealos, which offers an approach that is unconventional yet elegant:
No VIP and no VRRP; instead, IPVS + LVScare maintain highly available access to kube-apiserver locally on every node.
Inspired by that idea, this post is a write-up and hands-on summary of using LVScare as a Kubernetes high-availability component.
A review of traditional Kubernetes API Server HA approaches
Option 1: Cloud LoadBalancer
Pros:
- Simple and stable
- Transparent to users
Cons:
- Hard dependency on a cloud provider
- Unavailable in private / IDC deployments
Option 2: VRRP + keepalived (VIP)
             VIP:6443
                |
     +----------+------------+
     |                       |
   Nginx <--keepalived--> Nginx (Backup)
     |
     +----------+-----------+
     |          |           |
 apiserver  apiserver   apiserver
The problems with this approach:
- Extra LB nodes are required
- VRRP / VIP has to be operated and maintained
- The architecture carries more moving parts than necessary
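To make that maintenance burden concrete, this is roughly what has to be kept in sync on the LB nodes; a minimal keepalived sketch, where the interface name, VIP and priority are illustrative values rather than anything used later in this post:
cat > /etc/keepalived/keepalived.conf << "EOF"
vrrp_instance VI_1 {
    state MASTER              # BACKUP on the standby LB node
    interface eth0            # NIC that carries the VIP
    virtual_router_id 51
    priority 100              # lower value on the standby node
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass k8s-ha
    }
    virtual_ipaddress {
        10.0.0.100/24         # the VIP that fronts the nginx/apiserver layer
    }
}
EOF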
The sealos approach: decentralized API Server high availability
The core idea behind sealos is:
Instead of providing a single central entry point, give every node the ability to reach all of the API Servers on its own.
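Concretely, every node carries a local IPVS virtual service whose real servers are all of the kube-apiservers. What lvscare maintains on each node is roughly equivalent to the following ipvsadm rules (a conceptual sketch only: lvscare also health-checks the real servers and adds or removes them automatically, and the round-robin scheduler and NAT forwarding shown here are illustrative assumptions; 100.100.100.100:6443 is the virtual address used throughout this post):
# one local virtual server per node, pointing at every control-plane apiserver
ipvsadm -A -t 100.100.100.100:6443 -s rr
ipvsadm -a -t 100.100.100.100:6443 -r 10.0.0.1:6443 -m
ipvsadm -a -t 100.100.100.100:6443 -r 10.0.0.2:6443 -m
ipvsadm -a -t 100.100.100.100:6443 -r 10.0.0.3:6443 -m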
hosts configuration
Assuming the node hostnames and IPs below, configure /etc/hosts on every node.
10.0.0.1 apiserver.cluster.local # temporarily resolves to the first master's IP; this is changed again once the cluster is up
10.0.0.1 master1
10.0.0.2 master2
10.0.0.3 master3
10.0.0.11 node1
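Since the same entries are needed everywhere, it is convenient to edit /etc/hosts once and copy it out; a small convenience sketch, assuming root SSH access from master1 to the other nodes:
# run on master1 after editing its /etc/hosts
for h in master2 master3 node1; do
  scp /etc/hosts root@$h:/etc/hosts
done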
Installing the software
Configure the Kubernetes package repository:
cat <<EOF | tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.33/rpm/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.33/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
Install the required packages:
yum install -y runc ipset ipvsadm
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
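ipset and ipvsadm alone are not enough: both lvscare and kube-proxy in ipvs mode (configured later) need the IPVS kernel modules, and kubeadm expects the usual forwarding and bridge sysctls. Depending on your base image some of this may already be in place; a sketch of what is typically required (module and sysctl names are standard, not taken from any particular setup):
cat > /etc/modules-load.d/k8s.conf << "EOF"
br_netfilter
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
systemctl restart systemd-modules-load.service

cat > /etc/sysctl.d/99-kubernetes.conf << "EOF"
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sysctl --system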
Install containerd:
- Download the latest release for your CPU architecture from the containerd releases page (https://github.com/containerd/containerd/releases). On x86_64, for example, that is containerd-2.x.x-linux-amd64.tar.gz.
- Extract the tarball into /usr/local:
tar xf containerd-2.x.x-linux-amd64.tar.gz -C /usr/local
- Configure the systemd service:
cat > /usr/local/lib/systemd/system/containerd.service << "EOF"
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target
[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
EOF
- Enable the services at boot:
systemctl enable containerd kubelet
- Configure containerd:
mkdir -p /etc/containerd
cat > /etc/containerd/config.toml << "EOF"
version = 3
root = "/var/lib/containerd"
state = "/run/containerd"
[grpc]
address = "/run/containerd/containerd.sock"
[plugins.'io.containerd.internal.v1.opt']
path = "/var/lib/containerd"
[plugins.'io.containerd.grpc.v1.cri']
stream_server_address = "127.0.0.1"
stream_server_port = "10010"
[plugins.'io.containerd.cri.v1.runtime']
enable_selinux = false
enable_unprivileged_ports = true
enable_unprivileged_icmp = true
device_ownership_from_security_context = false
[plugins.'io.containerd.cri.v1.images']
snapshotter = "overlayfs"
disable_snapshot_annotations = true
[plugins.'io.containerd.cri.v1.images'.pinned_images]
sandbox = "registry.k8s.io/pause:3.10"
[plugins.'io.containerd.cri.v1.runtime'.cni]
conf_dir = "/etc/cni/net.d"
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runhcs-wcow-process]
runtime_type = "io.containerd.runhcs.v1"
[plugins.'io.containerd.cri.v1.images'.registry]
config_path = "/etc/containerd/certs.d"
EOF
Configure the private registry:
You need a private registry that acts as a pull-through proxy:
- hub.yourdomain.com/docker.io -> proxies pulls from docker.io
- hub.yourdomain.com/registry.k8s.io -> proxies pulls from registry.k8s.io
If your registry cannot do this yet, you will need to copy the images over by hand and adapt the containerd configuration below accordingly.
mkdir -p /etc/containerd/certs.d/{docker.io,registry.k8s.io}
cat > /etc/containerd/certs.d/docker.io/hosts.toml << "EOF"
[host]
[host."https://hub.yourdomain.com/v2/docker.io"]
capabilities = ["pull", "resolve"]
override_path = true
EOF
cat > /etc/containerd/certs.d/registry.k8s.io/hosts.toml << "EOF"
[host]
[host."https://hub.yourdomain.com/v2/registry.k8s.io"]
capabilities = ["pull", "resolve"]
override_path = true
EOF
Start containerd:
systemctl start containerd
All of the steps above must be performed on every node.
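Before moving on, it is worth a quick sanity check that containerd is healthy and that the registry mirror works; crictl is normally pulled in as a dependency of the kubeadm packages, so adjust if you do not have it:
systemctl is-active containerd
ctr version
# verify the registry.k8s.io mirror works by pulling the pinned sandbox image
crictl --runtime-endpoint unix:///run/containerd/containerd.sock pull registry.k8s.io/pause:3.10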
Initializing the first control-plane node
Generate the kubeadm configuration file:
kubeadm config print init-defaults --component-configs KubeletConfiguration > kubeadm.yaml
Then edit it as follows:
apiVersion: kubeadm.k8s.io/v1beta4
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.0.0.1 # change this to the node's IP
bindPort: 6443
nodeRegistration:
criSocket: unix:///var/run/containerd/containerd.sock
imagePullPolicy: IfNotPresent
imagePullSerial: true
  name: master1 # change this to the node's hostname
taints: null
timeouts:
controlPlaneComponentHealthCheck: 4m0s
discovery: 5m0s
etcdAPICall: 2m0s
kubeletHealthCheck: 4m0s
kubernetesAPICall: 1m0s
tlsBootstrap: 5m0s
upgradeManifests: 5m0s
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1 # new section: switch kube-proxy to ipvs mode
kind: KubeProxyConfiguration
mode: ipvs
---
apiServer:
  certSANs: # new section: add the other master nodes' names and IPs
- master1
- master2
- master3
- 10.0.0.1
- 10.0.0.2
- 10.0.0.3
  - 100.100.100.100 # always 100.100.100.100; used later as the load-balancer IP
- apiserver.cluster.local
apiVersion: kubeadm.k8s.io/v1beta4
caCertificateValidityPeriod: 87600h0m0s # extend the CA certificate validity to 10 years
certificateValidityPeriod: 87600h0m0s # extend certificate validity to 10 years
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: apiserver.cluster.local:6443 # add this setting
controllerManager: {}
dns: {}
encryptionAlgorithm: RSA-2048
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: hub.yourdomain.com/docker.io/kubesphere # point this at your private registry; the kubesphere repository is used because it carries the full set of Kubernetes images
kind: ClusterConfiguration
kubernetesVersion: 1.33.3 # set the version; usually the latest patch release of this minor version
networking:
dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12 # adjust as needed
  podSubnet: 10.244.0.0/16 # adjust as needed
proxy: {}
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
anonymous:
enabled: false
webhook:
cacheTTL: 0s
enabled: true
x509:
clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
mode: Webhook
webhook:
cacheAuthorizedTTL: 0s
cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10 # if you changed serviceSubnet, change this too
clusterDomain: cluster.local
containerRuntimeEndpoint: ""
cpuManagerReconcilePeriod: 0s
crashLoopBackOff: {}
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMaximumGCAge: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging:
flushFrequency: 0
options:
json:
infoBufferSize: "0"
text:
infoBufferSize: "0"
verbosity: 0
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
Pull the images:
kubeadm config images pull --config kubeadm.yaml
Initialize the node:
kubeadm init --upload-certs --config kubeadm.yaml
When init finishes, it prints two join commands: one for adding control-plane nodes and one for adding worker nodes.
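To use kubectl on this node, copy the admin kubeconfig into place (these are the standard steps kubeadm prints at the end of init):
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config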
Adding control-plane nodes
Run the control-plane kubeadm join command printed by the first control-plane node:
kubeadm join apiserver.cluster.local:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:xxx \
--control-plane --certificate-key xxx
Adding worker nodes
To make access to kube-apiserver highly available, we introduce a load-balancing tool, lvscare. It works like this:
+----------+                     +---------------+  virtual server: 100.100.100.100:6443
| master1  |<--------------------|  nodes (ipvs) |  real servers:
+----------+                     +---------------+    10.0.0.1:6443
                                        |              10.0.0.2:6443
+----------+                            |              10.0.0.3:6443
| master2  |<---------------------------+
+----------+                            |
                                        |
+----------+                            |
| master3  |<---------------------------+
+----------+
lvscare runs as a static Pod and is started (and kept running) by the kubelet.
Create the static Pod manifest.
Note: adjust the rs addresses to your control-plane IPs.
mkdir -p /etc/kubernetes/manifests
cat > /etc/kubernetes/manifests/lvscare.yaml << "EOF"
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
name: kube-lvscare
namespace: kube-system
spec:
containers:
- args:
- care
- --vs
- 100.100.100.100:6443
- --health-path
- /healthz
- --health-schem
- https
- --rs
    - 10.0.0.1:6443 # apiserver on the first control-plane node
    - --rs
    - 10.0.0.2:6443 # apiserver on the second control-plane node
    - --rs
    - 10.0.0.3:6443 # apiserver on the third control-plane node
command:
- /usr/bin/lvscare
image: hub.yourdomain.com/docker.io/labring/lvscare:v5.0.1
imagePullPolicy: IfNotPresent
name: kube-lvscare
resources: {}
securityContext:
privileged: true
volumeMounts:
- mountPath: /lib/modules
name: lib-modules
readOnly: true
hostNetwork: true
priorityClassName: system-node-critical
volumes:
- hostPath:
path: /lib/modules
type: ""
name: lib-modules
status: {}
EOF
Then run the worker-node kubeadm join command printed earlier:
kubeadm join apiserver.cluster.local:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:xxx
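Once a worker has joined, you can confirm that lvscare has programmed the local IPVS rules and that the apiserver answers through the virtual address; a quick check (the exact output layout may vary):
# run on a worker node
ipvsadm -Ln
# expect a TCP virtual server on 100.100.100.100:6443 with
# 10.0.0.1:6443, 10.0.0.2:6443 and 10.0.0.3:6443 listed as real servers

# the health endpoint should answer through the virtual address
# (kubeadm's default RBAC exposes /healthz to unauthenticated clients)
curl -k https://100.100.100.100:6443/healthz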
Installing the canal network plugin
Download the canal manifest:
https://raw.githubusercontent.com/projectcalico/calico/v3.30.2/manifests/canal.yaml
The default Pod network is 10.244.0.0/16.
If you changed this subnet when deploying the cluster above, update the canal manifest to use the same subnet.
# Flannel network configuration. Mounted into the flannel container.
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "vxlan"
}
    }
Deploy canal:
kubectl apply -f canal.yaml
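After applying the manifest, check that the canal Pods come up and that the nodes turn Ready; a plain grep is used here rather than assuming any particular labels:
kubectl -n kube-system get pods -o wide | grep canal
kubectl get nodes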
Updating the apiserver.cluster.local hosts entry
- Worker nodes:
100.100.100.100 apiserver.cluster.local
- Control-plane nodes:
On control-plane nodes, resolve apiserver.cluster.local to the node's own IP (or 127.0.0.1).
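A sketch of that final switch, assuming the temporary 10.0.0.1 entry created at the beginning is still present in /etc/hosts (pick the target IP per node):
# on worker nodes: point the name at the lvscare virtual address
sed -i 's/^10\.0\.0\.1 apiserver\.cluster\.local.*/100.100.100.100 apiserver.cluster.local/' /etc/hosts
# on a control-plane node, e.g. master2: point the name at its own IP
sed -i 's/^10\.0\.0\.1 apiserver\.cluster\.local.*/10.0.0.2 apiserver.cluster.local/' /etc/hosts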