Offline Deployment of Multi-Master Node K8s Cluster on ARM64 Architecture Servers

1. Current Environment

Server: TaiShan 2280 V2

Operating System: openEuler 24.03 SP2

Deployment Environment: 5 virtual machines (3 masters + 2 worker nodes) running on the server under qemu+kvm. For virtual machine deployment, refer to the previous article: “Deploying Virtual Machines on openEuler”

K8s Version: 1.33.0

2. Cluster Network Planning

K8S Cluster Role	IP	Hostname	Installed Components
Control Node	192.168.1.91	master01	apiserver, controller-manager, scheduler, etcd, containerd, keepalived, nginx
Control Node	192.168.1.92	master02	apiserver, controller-manager, scheduler, etcd, containerd, keepalived, nginx
Control Node	192.168.1.93	master03	apiserver, controller-manager, scheduler, etcd, containerd, keepalived, nginx
Worker Node	192.168.1.94	node01	kubelet, kube-proxy, containerd, calico, coredns
Worker Node	192.168.1.95	node02	kubelet, kube-proxy, containerd, calico, coredns
VIP	192.168.1.99

3. Set Hostnames

# Run only the matching command on each machine
$ hostnamectl set-hostname master01
$ hostnamectl set-hostname master02
$ hostnamectl set-hostname master03
$ hostnamectl set-hostname node01
$ hostnamectl set-hostname node02

4. Modify Hosts File (All Nodes)

$ vim /etc/hosts

127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4
::1     localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.91 master01
192.168.1.92 master02
192.168.1.93 master03
192.168.1.94 node01
192.168.1.95 node02
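
Typing the same five entries on every machine is easy to get wrong. The following is a minimal sketch that prints the entries from the planning table so they can be appended in one step; the function name render_hosts is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print the /etc/hosts entries from the network plan.
render_hosts() {
    i=1
    for name in master01 master02 master03 node01 node02; do
        # Addresses run from 192.168.1.91 to 192.168.1.95 in plan order.
        printf '192.168.1.%d %s\n' "$((90 + i))" "$name"
        i=$((i + 1))
    done
}

# Preview the entries; append them as root with: render_hosts >> /etc/hosts
render_hosts
```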

5. Set Up Passwordless SSH Login (Each Master Node)

$ cd /root/.ssh
$ ssh-keygen -t rsa
$ ssh-copy-id root@master01
$ ssh-copy-id root@master02
$ ssh-copy-id root@master03
$ ssh-copy-id root@node01
$ ssh-copy-id root@node02
$ scp /etc/hosts root@master01:/etc/hosts
$ scp /etc/hosts root@master02:/etc/hosts
$ scp /etc/hosts root@master03:/etc/hosts
$ scp /etc/hosts root@node01:/etc/hosts
$ scp /etc/hosts root@node02:/etc/hosts
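
The ten commands above follow one pattern per node, so they can be expressed as a loop. This is a sketch only: the helper names run and distribute and the DRY_RUN switch are hypothetical, and by default the commands are printed instead of executed.

```shell
#!/bin/sh
# Hypothetical helper: push the SSH key and /etc/hosts to every node.
# With DRY_RUN=1 (the default here) the commands are only printed.
NODES="master01 master02 master03 node01 node02"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

distribute() {
    for node in $NODES; do
        run ssh-copy-id "root@$node"
        run scp /etc/hosts "root@$node:/etc/hosts"
    done
}

distribute
```

Setting DRY_RUN=0 before calling distribute runs the commands for real; each ssh-copy-id will still prompt for the node's root password once.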

6. Disable SELinux (All Nodes)

$ sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
# The change to the SELinux configuration file only takes effect after a reboot
$ reboot
# After the reboot, log in to the machine again and check the status:
$ getenforce
# If it shows Disabled, SELinux has been turned off

7. Disable Swap Partition to Improve Performance (All Nodes)

# Temporarily disable
$ swapoff -a
# Permanently disable: comment out the swap mount line in /etc/fstab by adding "#" at the beginning of the line (the device name below is an example and differs per system)
$ vim /etc/fstab
#/dev/mapper/centos-swap swap    swap   defaults     0 0
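
The fstab edit can also be done without an editor. The sketch below comments out every active line whose filesystem type is swap; comment_swap is a hypothetical helper, and it keeps a .bak copy of the file it changes.

```shell
#!/bin/sh
# Hypothetical helper: comment out every active swap entry in an fstab-style file.
comment_swap() {
    fstab="$1"
    # Match uncommented lines whose third field (filesystem type) is "swap".
    sed -i.bak -E 's/^([^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$fstab"
}

# Usage on a node (as root): comment_swap /etc/fstab && swapoff -a
```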

8. Modify Kernel Parameters (All Nodes)

$ modprobe br_netfilter
$ cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
$ sysctl -p /etc/sysctl.d/k8s.conf
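
A module loaded with modprobe is gone after the next reboot, and the two bridge sysctls only exist while br_netfilter is loaded. One way to make it persistent, sketched here under the assumption that the system uses systemd's modules-load.d mechanism, is to list the module in a file under /etc/modules-load.d. The helper name persist_br_netfilter is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a modules-load.d entry so br_netfilter loads at boot.
# The target directory is a parameter so the function can be tried outside /etc.
persist_br_netfilter() {
    dir="${1:-/etc/modules-load.d}"
    mkdir -p "$dir"
    printf 'br_netfilter\n' > "$dir/k8s.conf"
}

# Usage on a node (as root):
#   persist_br_netfilter
#   sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward   # both should report 1
```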

9. Disable Firewalld (All Nodes)

$ systemctl stop firewalld ; systemctl disable firewalld

10. Import RPM Packages and Install (All Nodes)

$ yum -y install runc
$ rpm -ivh containerd-1.6.22-15.oe2403.aarch64.rpm
$ rpm -ivh cri-tools-1.33.0-150500.1.1.aarch64.rpm
$ rpm -ivh kubeadm-1.33.0-150500.1.1.aarch64.rpm
$ rpm -ivh kubectl-1.33.0-150500.1.1.aarch64.rpm
$ rpm -ivh kubernetes-cni-1.6.0-150500.1.1.aarch64.rpm
# conntrack-tools can be installed from the local repository
$ yum -y install conntrack-tools.aarch64
$ rpm -ivh kubelet-1.33.0-150500.1.1.aarch64.rpm
# Set services to start on boot
$ systemctl enable kubelet ; systemctl enable containerd

11. Check Required Image Versions for Corresponding K8s Cluster Version

$ kubeadm config images list --kubernetes-version=v1.33.0

12. Configure Containerd (All Nodes)

$ mkdir -p /etc/containerd
$ containerd config default > /etc/containerd/config.toml
Modify the configuration file:
$ vim /etc/containerd/config.toml
Change SystemdCgroup = false to SystemdCgroup = true
Change sandbox_image = "k8s.gcr.io/pause:3.10" to sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6" # The tag can be set to the pause version reported in step 11; the image must be one that is available on the node
Find config_path = "" and change it to:
config_path = "/etc/containerd/certs.d"
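
On five nodes the three edits above are quicker to apply with sed than with vim. This is a sketch, assuming each of the three keys appears once in the generated config.toml; patch_containerd_config is a hypothetical helper and leaves a .bak copy of the original file.

```shell
#!/bin/sh
# Hypothetical helper: apply the three config.toml edits with sed instead of vim.
# $1 = path to config.toml, $2 = pause image reference to use as sandbox_image.
patch_containerd_config() {
    cfg="$1"
    pause_image="$2"
    sed -i.bak -E \
        -e 's|^([[:space:]]*)SystemdCgroup = false|\1SystemdCgroup = true|' \
        -e "s|^([[:space:]]*)sandbox_image = .*|\1sandbox_image = \"$pause_image\"|" \
        -e 's|^([[:space:]]*)config_path = ""|\1config_path = "/etc/containerd/certs.d"|' \
        "$cfg"
}

# Usage on a node (as root):
#   patch_containerd_config /etc/containerd/config.toml registry.aliyuncs.com/google_containers/pause:3.6
```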
Modify /etc/crictl.yaml file
$ cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF

$ mkdir -p /etc/containerd/certs.d/docker.io/
$ vim /etc/containerd/certs.d/docker.io/hosts.toml
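# Write the following content (each mirror needs its own [host] table):
server = "https://docker.io"

[host."https://vh3bm52y.mirror.aliyuncs.com"]
  capabilities = ["pull", "resolve"]

[host."https://registry.docker-cn.com"]
  capabilities = ["pull", "resolve"]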

Restart containerd:
$ systemctl restart containerd && systemctl enable containerd

13. Set Container Runtime (All Nodes)

$ crictl config runtime-endpoint unix:///run/containerd/containerd.sock

14. Achieve High Availability for K8s API Server Nodes Using Keepalived + Nginx

1. Install Nginx and Keepalived (Primary and Backup)

Install Nginx and Keepalived on all master nodes (local source is sufficient)
$ yum install nginx keepalived nginx-mod-stream -y

2. Modify the Nginx Configuration File; It Is Identical on Primary and Backup (Each Master Node)

$ vim /etc/nginx/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

# Layer 4 load balancing for the API servers on the three master nodes
stream {

    log_format  main  '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';

    access_log  /var/log/nginx/k8s-access.log  main;

    upstream k8s-apiserver {
            server 192.168.1.91:6443 weight=5 max_fails=3 fail_timeout=30s;
            server 192.168.1.92:6443 weight=5 max_fails=3 fail_timeout=30s;
            server 192.168.1.93:6443 weight=5 max_fails=3 fail_timeout=30s;
    }
    
    server {
       listen 16443; # Nginx runs on the master nodes themselves, so this port must differ from the API server port 6443 to avoid a conflict
       proxy_pass k8s-apiserver;
    }
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    server {
        listen       80 default_server;
        server_name  _;

        location / {
        }
    }
}
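
Once nginx is running with this configuration, it helps to confirm that the stream listener on port 16443 accepts connections before moving on to Keepalived. The sketch below uses bash's /dev/tcp pseudo-device for the check; port_open is a hypothetical helper and assumes bash and the coreutils timeout command are installed.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Usage on a master node after nginx has started:
#   nginx -t                                  # syntax check of /etc/nginx/nginx.conf
#   port_open 127.0.0.1 16443 && echo "stream listener is up"
```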

3. Keepalived Configuration

# Primary Keepalived (master01)
$ vim  /etc/keepalived/keepalived.conf 
global_defs { 
   notification_email { 
     admin@example.com   # placeholder; replace with the real notification addresses
   } 
   notification_email_from keepalived@example.com   # placeholder sender address
   smtp_server 127.0.0.1 
   smtp_connect_timeout 30
   router_id NGINX_MASTER
} 

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 { 
    state MASTER 
    interface enp3s0  # Change to the actual network card name
    virtual_router_id 51 # VRRP router ID; must match on all nodes of this instance and be unique among VRRP instances on the network
    priority 100    # Priority, set backup server to 90 
    advert_int 1    # Specify VRRP heartbeat announcement interval, default 1 second 
    authentication { 
        auth_type PASS      
        auth_pass 1111
    }  
    # Virtual IP
    virtual_ipaddress { 
        192.168.1.99/24
    } 
    track_script {
        check_nginx
    } 
}

# vrrp_script: Specify the script to check the working status of nginx (determine failover based on nginx status)
# virtual_ipaddress: Virtual IP (VIP)

$ vim  /etc/keepalived/check_nginx.sh 
#!/bin/bash
# Count running nginx processes, excluding the grep commands and this script's own PID
counter=$(ps -ef | grep nginx | grep sbin | grep -Ecv "grep|$$")
if [ "$counter" -eq 0 ]; then
    # nginx is not running: try to start it
    systemctl start nginx
    sleep 2
    # Wait 2 seconds, then check the nginx status again
    counter=$(ps -ef | grep nginx | grep sbin | grep -Ecv "grep|$$")
    # If nginx is still not running, stop keepalived so that the VIP drifts to another node
    if [ "$counter" -eq 0 ]; then
        systemctl stop keepalived
    fi
fi

$ chmod +x  /etc/keepalived/check_nginx.sh

# Backup Keepalived (master02 and master03)
$ vim  /etc/keepalived/keepalived.conf 
global_defs { 
   notification_email { 
     admin@example.com   # placeholder; replace with the real notification addresses
   } 
   notification_email_from keepalived@example.com   # placeholder sender address
   smtp_server 127.0.0.1 
   smtp_connect_timeout 30
   router_id NGINX_BACKUP
} 

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 { 
    state BACKUP 
    interface enp3s0
    virtual_router_id 51 # VRRP router ID; must be the same value as on the primary node
    priority 90
    advert_int 1
    authentication { 
        auth_type PASS      
        auth_pass 1111
    }  
    virtual_ipaddress { 
        192.168.1.99/24
    } 
    track_script {
        check_nginx
    } 
}

$ vim  /etc/keepalived/check_nginx.sh 
#!/bin/bash
# Count running nginx processes, excluding the grep commands and this script's own PID
counter=$(ps -ef | grep nginx | grep sbin | grep -Ecv "grep|$$")
if [ "$counter" -eq 0 ]; then
    # nginx is not running: try to start it
    systemctl start nginx
    sleep 2
    # Wait 2 seconds, then check the nginx status again
    counter=$(ps -ef | grep nginx | grep sbin | grep -Ecv "grep|$$")
    # If nginx is still not running, stop keepalived so that the VIP drifts to another node
    if [ "$counter" -eq 0 ]; then
        systemctl stop keepalived
    fi
fi

$ chmod +x /etc/keepalived/check_nginx.sh
# Note: Keepalived judges a tracked script by its exit status (0 means working normally, non-0 means not normal). This script additionally stops keepalived itself when nginx cannot be restarted, which releases the VIP.
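
Because Keepalived looks at the exit status, the check can also be written so that it only reports health and leaves the failover decision to Keepalived. The following is a sketch of such a variant, not the script used in this article; process_running and check_nginx are hypothetical names, and pgrep from procps is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical variant of check_nginx.sh that only reports health via its exit status.
# process_running succeeds when at least one process has exactly the given name.
process_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

check_nginx() {
    process_running nginx && return 0
    # Not running: try one restart, then report the result to keepalived.
    systemctl start nginx >/dev/null 2>&1
    sleep 2
    process_running nginx
}

# As a standalone script the last line would be: check_nginx; exit $?
```

With this variant keepalived keeps running on the failed node, so the node can take part in VRRP again as soon as nginx recovers.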

4. Start Services (Each Master Node)

$ systemctl daemon-reload
$ systemctl start nginx
$ systemctl start keepalived
$ systemctl enable nginx keepalived

5. Test if VIP is Successfully Bound

$ ip addr
# Check if VIP is bound to the corresponding network card
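
Reading the full ip addr output on three masters is tedious. A small filter can answer the question directly; the sketch below reads the one-line-per-address output of ip -o -4 addr show on standard input. The helper name vip_bound is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `ip -o -4 addr show` output on stdin and
# succeed if the given address is assigned to any interface.
vip_bound() {
    # Escape the dots so they match literally; the trailing "/" ends the address.
    grep -Eq "inet $(printf '%s' "$1" | sed 's/\./\\./g')/"
}

# Usage on a master node:
#   ip -o -4 addr show | vip_bound 192.168.1.99 && echo "VIP is bound here"
```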

6. Test Keepalived

# Stop keepalived on master01; the VIP will drift to a backup node (for example master02)
# On master01:
$ systemctl stop keepalived

# On master02:
$ ip addr

15. Initialize K8s Cluster Using Kubeadm

$ kubeadm config print init-defaults > kubeadm.yaml

# Modify the configuration as needed: change imageRepository if the images come from another registry, set the kube-proxy mode to ipvs, and, because containerd is the runtime, set cgroupDriver to systemd so that it matches the SystemdCgroup = true setting from step 12.

# kubeadm.yaml configuration file is as follows:
apiVersion: kubeadm.k8s.io/v1beta4
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
#localAPIEndpoint:   # Comment out these lines
#  advertiseAddress: 192.168.1.91
#  bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  imagePullSerial: true
  name: master01 # Node name
  taints: null
timeouts:
  controlPlaneComponentHealthCheck: 4m0s
  discovery: 5m0s
  etcdAPICall: 2m0s
  kubeletHealthCheck: 4m0s
  kubernetesAPICall: 1m0s
  tlsBootstrap: 5m0s
  upgradeManifests: 5m0s

---

apiServer: {}
apiVersion: kubeadm.k8s.io/v1beta4
caCertificateValidityPeriod: 87600h0m0s
certificateValidityPeriod: 8760h0m0s
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
encryptionAlgorithm: RSA-2048
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.33.0 # K8s version, adjust according to the actual version
controlPlaneEndpoint: 192.168.1.99:16443 # Cluster VIP
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16 # Pod subnet; set it explicitly so it does not overlap the 192.168.1.0/24 node network (Calico's default pool is 192.168.0.0/16)
proxy: {}
scheduler: {}

# Append the following content; the "---" separator lines must be pasted in as well
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
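
Since the file is assembled by hand from several YAML documents, a quick check that all four kinds are present can catch a missing "---" or a lost paste. This is a sketch only; list_kinds and has_all_kinds are hypothetical helpers that look at top-level kind: lines and do not validate the YAML itself.

```shell
#!/bin/sh
# Hypothetical helper: list the `kind:` of every YAML document in a file.
list_kinds() {
    sed -n -E 's/^kind:[[:space:]]*([A-Za-z]+).*/\1/p' "$1"
}

# Succeed only if all four configuration kinds used in this article are present.
has_all_kinds() {
    for kind in InitConfiguration ClusterConfiguration KubeProxyConfiguration KubeletConfiguration; do
        list_kinds "$1" | grep -qx "$kind" || { echo "missing: $kind" >&2; return 1; }
    done
}

# Usage: has_all_kinds kubeadm.yaml && kubeadm config validate --config kubeadm.yaml
```

The kubeadm config validate subcommand shown in the usage line is available in recent kubeadm releases and checks field names and values, which this simple filter does not.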

16. Unpack Image Tarballs on Each Node

# Upload the downloaded image packages to each node and import them with the ctr command. If you need an image package compatible with 1.33.0 on arm64, you can contact me directly to obtain it.
$ ctr -n=k8s.io images import <image-package-name>
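
When the images arrive as several tar archives, the import can be looped over a directory. A minimal sketch, assuming the archives end in .tar and sit in one directory; import_images is a hypothetical helper, and the CTR variable exists only so the commands can be previewed.

```shell
#!/bin/sh
# Hypothetical helper: import every *.tar archive in a directory into the k8s.io namespace.
# CTR can be overridden (for example CTR=echo) to preview the commands.
import_images() {
    dir="$1"
    for archive in "$dir"/*.tar; do
        [ -e "$archive" ] || continue   # skip when the glob matched nothing
        ${CTR:-ctr} -n=k8s.io images import "$archive" || return 1
    done
}

# Usage on each node (as root):
#   import_images /root/k8s-images      # directory name is an assumption
#   crictl images                       # confirm the images are visible to the CRI
```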

17. Deploy the Cluster

$ kubeadm init --config=kubeadm.yaml --ignore-preflight-errors=SystemVerification

The following line in the output indicates that the installation is complete:
Your Kubernetes control-plane has initialized successfully!

# Configure the kubectl configuration file, which is equivalent to authorizing kubectl, so that the kubectl command can use this certificate to manage the k8s cluster
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ kubectl get nodes

18. Scale K8s Master Nodes – Add master02 to the K8s Cluster

# Copy the certificates from master01 node to master02
# Create a certificate storage directory on master02:
$ cd /root && mkdir -p /etc/kubernetes/pki/etcd && mkdir -p ~/.kube/
# Copy the certificates from master01 node to master02:
$ scp /etc/kubernetes/pki/ca.crt master02:/etc/kubernetes/pki/
$ scp /etc/kubernetes/pki/ca.key master02:/etc/kubernetes/pki/
$ scp /etc/kubernetes/pki/sa.key master02:/etc/kubernetes/pki/
$ scp /etc/kubernetes/pki/sa.pub master02:/etc/kubernetes/pki/
$ scp /etc/kubernetes/pki/front-proxy-ca.crt master02:/etc/kubernetes/pki/
$ scp /etc/kubernetes/pki/front-proxy-ca.key master02:/etc/kubernetes/pki/
$ scp /etc/kubernetes/pki/etcd/ca.crt master02:/etc/kubernetes/pki/etcd/
$ scp /etc/kubernetes/pki/etcd/ca.key master02:/etc/kubernetes/pki/etcd/
# After copying the certificates, generate the join command on master01. Use the token and hash printed for your own cluster, not the sample values below:
$ kubeadm token create --print-join-command
# It shows as follows:
kubeadm join 192.168.1.99:16443 --token zwzcks.u4jd8lj56wpckcwv \
    --discovery-token-ca-cert-hash sha256:1ba1b274090feecfef58eddc2a6f45590299c1d0624618f1f429b18a064cb728
# Execute on master02, appending --control-plane so that it joins the cluster as a control node:
$ kubeadm join 192.168.1.99:16443 --token zwzcks.u4jd8lj56wpckcwv \
    --discovery-token-ca-cert-hash sha256:1ba1b274090feecfef58eddc2a6f45590299c1d0624618f1f429b18a064cb728 \
    --control-plane --ignore-preflight-errors=SystemVerification
# On master02, create the kubectl configuration directory:
$ mkdir -p $HOME/.kube
# On master01, copy /etc/kubernetes/admin.conf to /root/.kube/config on master02:
$ scp /etc/kubernetes/admin.conf master02:/root/.kube/config
# On master02, set the owner of the copied file:
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the cluster status on master01:
$ kubectl get nodes
NAME       STATUS     ROLES           AGE   VERSION
master01   NotReady   control-plane   49m   v1.33.0
master02   NotReady   control-plane   39s   v1.33.0
# You can see that master02 has joined the cluster, follow this method to add master03 to the cluster
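
Assembling the control-plane join command by hand for master03 invites copy errors. The sketch below takes a join command as text and appends the two extra flags used in this step, assuming the input does not already contain --control-plane. The helper name control_plane_join is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn a join command into a control-plane join command.
# It removes line-continuation backslashes and appends the two extra flags.
control_plane_join() {
    # Word splitting on the unquoted expansion collapses spaces and newlines.
    set -- $(printf '%s' "$1" | tr -d '\\')
    echo "$* --control-plane --ignore-preflight-errors=SystemVerification"
}

# Usage on master01 (prints the command to run on the new master):
#   control_plane_join "$(kubeadm token create --print-join-command)"
```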

19. Scale K8s Cluster by Adding Worker Nodes

# Check the command to join the node on master01:
$ kubeadm token create --print-join-command
# It shows as follows:
kubeadm join 192.168.1.99:16443 --token zwzcks.u4jd8lj56wpckcwv \
    --discovery-token-ca-cert-hash sha256:1ba1b274090feecfef58eddc2a6f45590299c1d0624618f1f429b18a064cb728
# Execute the command on the worker node:
$ kubeadm join 192.168.1.99:16443 --token zwzcks.u4jd8lj56wpckcwv \
    --discovery-token-ca-cert-hash sha256:1ba1b274090feecfef58eddc2a6f45590299c1d0624618f1f429b18a064cb728
# Check the cluster status on master01:
$ kubectl get nodes
NAME       STATUS     ROLES           AGE   VERSION
master01   NotReady   control-plane   49m   v1.33.0
master02   NotReady   control-plane   10m   v1.33.0
node01     NotReady   <none>          39s   v1.33.0
# You can see that node01 has joined the cluster; follow the same method to add node02 to the cluster

20. Modify Node Labels

$ kubectl label node master01 node-role.kubernetes.io/master=master
$ kubectl label node master02 node-role.kubernetes.io/master=master
$ kubectl label node master03 node-role.kubernetes.io/master=master
$ kubectl label node node01 node-role.kubernetes.io/node=node
$ kubectl label node node02 node-role.kubernetes.io/node=node

21. Install Kubernetes Network Component – Calico

# Upload the Calico image archive calico.tar.gz to each node and import it manually (it includes calico-cni.tar, calico-kube-controllers.tar, calico-node.tar, calico-pod2daemon-flexvol.tar, coredns.tar, pause.tar):
$ ctr -n=k8s.io images import calico.tar.gz
# Upload calico.yaml to master01, use the yaml file to install the Calico network plugin 
$ kubectl apply -f calico.yaml
# Note: the manifest can also be downloaded online from:
# https://raw.githubusercontent.com/projectcalico/calico/v3.29.0/manifests/calico.yaml
$ kubectl get node
NAME       STATUS   ROLES                  AGE   VERSION
master01   Ready    control-plane,master   36m   v1.33.0
master02   Ready    control-plane,master   33m   v1.33.0
master03   Ready    control-plane,master   30m   v1.33.0
node01     Ready    node                   21m   v1.33.0
node02     Ready    node                   21m   v1.33.0

22. Create Pod for Testing

# Import nginx.tar to the node
$ ctr -n=k8s.io images import nginx.tar
# Import pod.yaml to the master node
$ kubectl apply -f pod.yaml
# Pod creation successful, cluster setup complete
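
The pod.yaml used above is not shown in this article. The following sketch writes one possible minimal manifest; the helper name write_test_pod, the Pod name nginx-test, and the image reference are all assumptions, and the image reference must be changed to whatever crictl images reports after the import.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal test Pod manifest to the given path.
# The image name and tag are assumptions; use the reference shown by `crictl images`.
write_test_pod() {
    cat > "$1" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx-test
spec:
  containers:
  - name: nginx
    image: docker.io/library/nginx:latest
    imagePullPolicy: IfNotPresent   # use the imported image, do not pull
    ports:
    - containerPort: 80
EOF
}

# Usage on a master node:
#   write_test_pod pod.yaml && kubectl apply -f pod.yaml
#   kubectl wait --for=condition=Ready pod/nginx-test --timeout=120s
```

Setting imagePullPolicy to IfNotPresent matters in an offline cluster, because the kubelet would otherwise try to pull a latest-tagged image from the registry.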