ㅁ We have a working Kubernetes cluster with a set of applications running. Let us first explore the setup.
How many deployments exist in the cluster?
kubectl get deployments
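If the applications might live outside the default namespace, counting across all namespaces is safer (a small sketch, not required by the lab):
kubectl get deployments --all-namespaces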
ㅁ What is the version of ETCD running on the cluster? Check the ETCD Pod or Process
[v3.4.9], [v1.11], [v2.5], [v3.4], [v1.13]
kubectl describe pod etcd-controlplane --namespace=kube-system
or
kubectl exec etcd-controlplane --namespace=kube-system -- etcd --version
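The version also shows up in the image tag of the etcd container, so grepping the describe output works as well (assuming the kubeadm default pod name etcd-controlplane):
kubectl describe pod etcd-controlplane --namespace=kube-system | grep Image: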
ㅁ At what address do you reach the ETCD cluster from your master/controlplane node?
Check the ETCD service configuration in the etcd pod:
kubectl describe pod etcd-controlplane --namespace=kube-system
and look for the --listen-client-urls flag in the output.
Command:
etcd
--advertise-client-urls=https://172.17.0.29:2379
--cert-file=/etc/kubernetes/pki/etcd/server.crt
--client-cert-auth=true
--data-dir=/var/lib/etcd
--initial-advertise-peer-urls=https://172.17.0.29:2380
--initial-cluster=controlplane=https://172.17.0.29:2380
--key-file=/etc/kubernetes/pki/etcd/server.key
--listen-client-urls=https://127.0.0.1:2379,https://172.17.0.29:2379
--listen-metrics-urls=http://127.0.0.1:2381
--listen-peer-urls=https://172.17.0.29:2380
--name=controlplane
--peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
--peer-client-cert-auth=true
--peer-key-file=/etc/kubernetes/pki/etcd/peer.key
--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
--snapshot-count=10000
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
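The next few questions all read flags off this same listing, so one grep can answer them together (a convenience sketch; the flag names come from the output above):
kubectl describe pod etcd-controlplane --namespace=kube-system | grep -E 'listen-client-urls|cert-file|trusted-ca-file'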
ㅁ Where is the ETCD server certificate file located? (Note this path down as you will need to use it later)
kubectl describe pod etcd-controlplane --namespace=kube-system
and look for the --cert-file flag in the output.
The command listing is the same as above; the relevant flag is:
--cert-file=/etc/kubernetes/pki/etcd/server.crt
ㅁ Where is the ETCD CA Certificate file located? (Note this path down as you will need to use it later)
kubectl describe pod etcd-controlplane --namespace=kube-system
and look for the --trusted-ca-file flag in the output.
Again the listing is identical; the relevant flag is:
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
ㅁ The master nodes in our cluster are planned for a regular maintenance reboot tonight. While we do not anticipate anything going wrong, we are required to take the necessary backups. Take a snapshot of the ETCD database using the built-in snapshot functionality. (Store the backup file at /opt/snapshot-pre-boot.db)
On kubernetes.io, look up the built-in snapshot instructions for etcdctl:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=<trusted-ca-file> --cert=<cert-file> --key=<key-file> \
snapshot save <backup-file-location>
kubectl describe pod etcd-controlplane --namespace=kube-system
and fill in each value from the Command section of the output:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert="/etc/kubernetes/pki/etcd/ca.crt" --cert="/etc/kubernetes/pki/etcd/server.crt" --key="/etc/kubernetes/pki/etcd/server.key" \
snapshot save /opt/snapshot-pre-boot.db
The --endpoints flag can also be dropped, since etcdctl defaults to 127.0.0.1:2379:
ETCDCTL_API=3 etcdctl \
--cacert="/etc/kubernetes/pki/etcd/ca.crt" --cert="/etc/kubernetes/pki/etcd/server.crt" --key="/etc/kubernetes/pki/etcd/server.key" \
snapshot save /opt/snapshot-pre-boot.db
Once the snapshot completes, check it with snapshot status:
ETCDCTL_API=3 etcdctl snapshot status /opt/snapshot-pre-boot.db
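For a more readable summary (hash, revision, total keys, total size), etcdctl can also print the status as a table:
ETCDCTL_API=3 etcdctl snapshot status /opt/snapshot-pre-boot.db --write-out=table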
Wake up, we have a conference call! After the reboot the master nodes came back online, but none of our applications are accessible. Check the status of the applications on the cluster. What's wrong?
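A quick way to survey what survived the reboot (a sketch; include whichever namespaces your applications use):
kubectl get deployments,pods,services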
ㅁ Restore ETCD Snapshot to a new folder
/var/lib/etcd-from-backup is the new location (confirm the location beforehand).
ETCDCTL_API=3 etcdctl snapshot restore /opt/snapshot-pre-boot.db --data-dir=/var/lib/etcd-from-backup
--data-dir is the directory the etcd backup file is restored into.
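As a quick sanity check, the restore should have created a member/ subdirectory under the new data dir:
ls -l /var/lib/etcd-from-backup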
Next, edit the static pod manifest:
/etc/kubernetes/manifests/etcd.yaml
Change the hostPath of the etcd-data volume from:
volumes:
- hostPath:
path: /var/lib/etcd
type: DirectoryOrCreate
name: etcd-data
to:
volumes:
- hostPath:
path: /var/lib/etcd-from-backup
type: DirectoryOrCreate
name: etcd-data
Now etcd points at the restored data directory. Because etcd runs as a static pod, the kubelet recreates it automatically once the manifest changes.
Check that the pods come back up normally:
kubectl get pod
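Once the etcd pod is healthy, the applications from the start of the lab should reappear; re-running the first check confirms the restore:
kubectl get deployments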