2020.12.31
1. Removal targets
- Kubernetes node : iap07
- OSD (Object Storage Daemon) : 1, 3
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]# ceph osd status
+----+-------+-------+-------+--------+---------+--------+---------+-----------+
| id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
+----+-------+-------+-------+--------+---------+--------+---------+-----------+
| 0 | iap09 | 3324M | 276G | 0 | 0 | 0 | 0 | exists,up |
| 1 | iap07 | 3336M | 276G | 0 | 0 | 0 | 0 | exists,up |
| 2 | iap08 | 3354M | 276G | 0 | 0 | 0 | 0 | exists,up |
| 3 | iap07 | 3283M | 555G | 0 | 0 | 0 | 0 | exists,up |
…
| 19 | iap11 | 596G | 13.9T | 48 | 244k | 2 | 0 | exists,up |
+----+-------+-------+-------+--------+---------+--------+---------+-----------+
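Which host owns a given OSD can also be confirmed directly from the ceph toolbox (introduced in 3.1); a minimal check, not part of the original transcript, assuming the OSD IDs above:
# "ceph osd find" prints the CRUSH location of an OSD, including its host
ceph osd find 1
ceph osd find 3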
2. Environments
- Kubernetes v1.16.15
- Rook/ceph v1.3.8
- ceph v14.2.10
3. Workaround
3.1 Remove the OSDs (using the ceph toolbox)
$ kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]# ceph osd out osd.1
marked out osd.1.
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]# ceph osd out osd.3
marked out osd.3.
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]# ceph status
cluster:
id: 1ef6e249-005e-477e-999b-b874f9fa0854
health: HEALTH_OK
…
data:
pools: 5 pools, 192 pgs
objects: 518.06k objects, 2.0 TiB
usage: 4.0 TiB used, 131 TiB / 135 TiB avail
pgs: 126902/1036144 objects misplaced (12.248%)
177 active+clean
14 active+remapped+backfill_wait
1 active+remapped+backfilling
io:
client: 1.2 KiB/s rd, 2.3 MiB/s wr, 2 op/s rd, 255 op/s wr
recovery: 21 MiB/s, 5 objects/s
progress:
Rebalancing after osd.1 marked out
[..............................]
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]#
Purge only after the rebalancing has completed; whether it has finished can be checked in advance with the commands below.
In the output of 'ceph osd df', the PGS (placement groups) value for the OSDs being removed must have dropped to 0.
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]# ceph osd safe-to-destroy 1 3
OSD(s) 1,3 are safe to destroy without reducing data durability.
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
6 hdd 0.45479 1.00000 466 GiB 34 GiB 33 GiB 96 KiB 1024 MiB 432 GiB 7.24 2.45 2 up
…
1 hdd 0.27280 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up
3 hdd 0.54559 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up
12 hdd 14.55220 1.00000 15 TiB 535 GiB 533 GiB 103 KiB 1.5 GiB 14 TiB 3.59 1.21 55 up
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]#
To remove the OSDs, first stop the OSD pods in a separate terminal so that the OSD state changes to 'down'.
[iap@iap01 ~]$ kubectl scale --replicas=0 deployment/rook-ceph-osd-1 -n rook-ceph
deployment.apps/rook-ceph-osd-1 scaled
[iap@iap01 ~]$ kubectl scale --replicas=0 deployment/rook-ceph-osd-3 -n rook-ceph
deployment.apps/rook-ceph-osd-3 scaled
[iap@iap01 ~]$
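Before purging, it is worth confirming (a hedged extra check, not in the original log) that osd.1 and osd.3 now report 'down':
# Show only the OSDs that are currently down; osd.1 and osd.3 should be listed
ceph osd tree down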
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]# ceph osd purge 1 --yes-i-really-mean-it
purged osd.1
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]# ceph osd purge 3 --yes-i-really-mean-it
purged osd.3
[root@rook-ceph-tools-79d7c49c8d-4c4x5 /]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 145.52200 root default
-3 0 host iap07
-15 58.20880 host iap10
12 hdd 14.55220 osd.12 up 1.00000 1.00000
13 hdd 14.55220 osd.13 up 1.00000 1.00000
…
[root@rook-ceph-tools-79d7c49c8d-7zqbj /]#
The operator can automatically remove OSD deployments that are considered "safe-to-destroy" by Ceph.
[iap@iap01 ~]$ k edit cephclusters.ceph.rook.io rook-ceph -n rook-ceph
…
removeOSDsIfOutAndSafeToRemove: true
…
[iap@iap01 ~]$
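The same flag can presumably also be set non-interactively; a minimal sketch, assuming the CephCluster is named rook-ceph in the rook-ceph namespace:
# Merge-patch spec.removeOSDsIfOutAndSafeToRemove instead of editing the CR by hand
kubectl -n rook-ceph patch cephcluster rook-ceph --type merge \
  -p '{"spec":{"removeOSDsIfOutAndSafeToRemove":true}}'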
The OSD deployments were not removed automatically, so they were deleted manually.
When rook-ceph-operator is restarted, the OSD deployments that should have been removed come back up (to be investigated later); this does not occur, however, once the K8s worker node has been deleted.
[iap@iap01 ~]$ k delete deployments.apps rook-ceph-osd-1 -n rook-ceph
deployment.apps "rook-ceph-osd-2" deleted
[iap@iap01 ~]$ k delete deployments.apps rook-ceph-osd-3 -n rook-ceph
deployment.apps "rook-ceph-osd-5" deleted
[iap@iap01 ~]$
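One way to keep the deleted deployments from being recreated during cleanup (an assumption, not verified in this environment) is to stop the operator first and bring it back once the node has been removed:
# Stop the Rook operator while removing the OSD deployments and the worker node
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
# Restore it after the cleanup in section 3.2 is finished
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1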
3.2 Remove the K8s worker node and wipe the disks
[iap@iap01 ~]$ kubectl drain iap07 --ignore-daemonsets --delete-local-data
node/iap07 cordoned
WARNING: ignoring DaemonSet-managed Pods: gpu-monitor/kube-prometheus-stack-1608276926-prometheus-node-exporter-24qfr, ...
evicting pod "crunchy-grafana-5746f458bb-rfg5k"
evicting pod "rook-ceph-osd-prepare-iap07-42mgx"
evicting pod "postgres-operator-66cc8bd589-lmmc7"
evicting pod "rook-ceph-crashcollector-iap07-5d9fd58568-jvs4r"
evicting pod "pgo-deploy-7f5qh"
pod/rook-ceph-osd-prepare-iap07-42mgx evicted
pod/pgo-deploy-7f5qh evicted
pod/crunchy-grafana-5746f458bb-rfg5k evicted
pod/postgres-operator-66cc8bd589-lmmc7 evicted
pod/rook-ceph-crashcollector-iap07-5d9fd58568-jvs4r evicted
node/iap07 evicted
[iap@iap01 ~]$ kubectl delete node iap07
node "iap07” deleted
[iap@iap01 ~]$
Zapping Devices (https://github.com/rook/rook/blob/master/Documentation/ceph-teardown.md#zapping-devices)
[root@iap07 ~]# rm -rf /var/lib/rook
[root@iap07 ~]# sgdisk --zap-all "/dev/sdb"; sgdisk --zap-all "/dev/sdc" # yum install gdisk -y
Creating new GPT entries.
GPT data structures destroyed! You may now partition the disk using fdisk or other utilities.
[root@iap07 ~]# dd if=/dev/zero of="/dev/sdb" bs=1M count=100 oflag=direct,dsync
…
[root@iap07 ~]# dd if=/dev/zero of="/dev/sdc" bs=1M count=100 oflag=direct,dsync
…
[root@iap07 ~]# ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
[root@iap07 ~]# rm -rf /dev/ceph-*
[root@iap07 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 279.4G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 278.4G 0 part
├─centos-root 253:0 0 150G 0 lvm /
├─centos-swap 253:1 0 15.8G 0 lvm
└─centos-home 253:3 0 112.6G 0 lvm /home
sdb 8:16 0 279.4G 0 disk
sdc 8:32 0 558.7G 0 disk
sr0 11:0 1 1024M 0 rom
[root@iap07 ~]#
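A quick extra check (not in the original transcript) that no ceph LVM metadata remains before reusing the disks:
# Any leftover ceph-* volume groups or logical volumes would show up here
vgs | grep ceph
lvs | grep ceph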
- State before removal
[root@iap07 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 279.4G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 278.4G 0 part
├─centos-root 253:0 0 150G 0 lvm /
├─centos-swap 253:1 0 15.8G 0 lvm
└─centos-home 253:2 0 112.6G 0 lvm /home
sdb 8:16 0 279.4G 0 disk
└─ceph--9e491968--e4a7--405b--9523--153e6885eba0-osd--... 253:3 0 279.4G 0 lvm
sdc 8:32 0 558.7G 0 disk
└─ceph--15ffe261--abb2--4893--bbe1--f387b3cdc4bc-osd--... 253:4 0 558.7G 0 lvm
sr0 11:0 1 1024M 0 rom
[root@iap07 ~]#