- Confirm the StorageClass (sc) exists
- Confirm the CSI node DaemonSet has a pod on every node and that those pods are running normally; note that this DaemonSet must tolerate all taints (see the verification sketch below)
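A minimal verification sketch for the two checks above, assuming the driver is the Cinder CSI plugin deployed in kube-system as csi-cinder-nodeplugin (consistent with the pod listing further below):

# confirm the StorageClass referenced by the PVC exists
kubectl get sc

# the DaemonSet's DESIRED/READY counts should match the node count;
# if some nodes are missing a pod, compare its tolerations with the node taints
kubectl get ds -n kube-system csi-cinder-nodeplugin
kubectl get ds -n kube-system csi-cinder-nodeplugin -o jsonpath='{.spec.template.spec.tolerations}'

When the driver is not registered on a node, the stuck pod's kubectl describe output typically contains events like the following: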
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 43m default-scheduler Successfully assigned dev-ybdp/idbuilder-etcd-0 to eu-ev-base-s6-c1m2-4xlarge-asg-x4c-aqt-sw3-server-npl
Warning FailedMount 32m kubelet Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[default-token-qmvcj data etcd-scripts]: error processing PVC dev-ybdp/data-idbuilder-etcd-0: failed to fetch PVC from API server: Get https://localhost:6443/api/v1/namespaces/dev-ybdp/persistentvolumeclaims/data-idbuilder-etcd-0: dial tcp 127.0.0.1:6443: connect: connection refused
Warning FailedMount 32m (x2 over 32m) kubelet Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[data etcd-scripts default-token-qmvcj]: error processing PVC dev-ybdp/data-idbuilder-etcd-0: failed to fetch PVC from API server: Get https://localhost:6443/api/v1/namespaces/dev-ybdp/persistentvolumeclaims/data-idbuilder-etcd-0: dial tcp 127.0.0.1:6443: connect: connection refused
Warning FailedMount 27m (x2 over 29m) kubelet Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[default-token-qmvcj data etcd-scripts]: timed out waiting for the condition
Warning FailedMount 9m21s (x12 over 39m) kubelet Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[data etcd-scripts default-token-qmvcj]: timed out waiting for the condition
Warning FailedMount 7m6s (x2 over 41m) kubelet Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[etcd-scripts default-token-qmvcj data]: timed out waiting for the condition
Warning FailedAttachVolume 3m (x28 over 43m) attachdetach-controller AttachVolume.Attach failed for volume "pvc-08a21611-291c-4efd-ab3d-bd725e3732aa" : CSINode eu-ev-base-s6-c1m2-4xlarge-asg-x4c-aqt-sw3-server-npl does not contain driver cinder.csi.openstack.org
[root@eu-central-1-on-prem-k8s-master-1 ~]# kubectl get po -A -o wide | grep cinder-csi
[root@eu-central-1-on-prem-k8s-master-1 ~]# kubectl get po -A -o wide | grep cinder
kube-system csi-cinder-controllerplugin-86f6c96c8c-sd6ln 5/5 Running 0 46h 10.118.35.143 eu-central-1-on-prem-k8s-master-3 <none> <none>
kube-system csi-cinder-nodeplugin-4cm62 2/2 Running 209 205d 10.118.36.65 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-72b-nim-server-5sn <none> <none>
kube-system csi-cinder-nodeplugin-5xbdp 2/2 Running 209 205d 10.118.33.226 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-scj-kn3-server-2fm <none> <none>
kube-system csi-cinder-nodeplugin-8brtd 2/2 Running 2 205d 10.118.32.58 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-ge7-6bj-server-tqz <none> <none>
kube-system csi-cinder-nodeplugin-8kz4s 2/2 Running 4 128d 10.118.33.92 eu-central-1-on-prem-k8s-master-1 <none> <none>
kube-system csi-cinder-nodeplugin-8zs79 2/2 Running 3 205d 10.118.36.40 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-x2f-dl5-server-67f <none> <none>
kube-system csi-cinder-nodeplugin-dlbs2 2/2 Running 208 205d 10.118.32.15 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-iuj-qj5-server-j5h <none> <none>
kube-system csi-cinder-nodeplugin-f779n 2/2 Running 204 205d 10.118.33.59 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-x2i-anj-server-dv4 <none> <none>
kube-system csi-cinder-nodeplugin-gkg9h 2/2 Running 2 205d 10.118.35.108 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-7ls-f2i-server-pr5 <none> <none>
kube-system csi-cinder-nodeplugin-jxrh8 2/2 Running 6 201d 10.118.33.210 eu-central-1-on-prem-k8s-master-2 <none> <none>
kube-system csi-cinder-nodeplugin-l855s 2/2 Running 4 205d 10.118.36.17 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-p7f-znm-server-75o <none> <none>
kube-system csi-cinder-nodeplugin-p49q9 1/2 CrashLoopBackOff 27 205d 10.118.34.37 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-aqt-sw3-server-npl <none> <none>
kube-system csi-cinder-nodeplugin-r6jbw 2/2 Running 207 205d 10.118.34.36 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-d5y-mzw-server-ybv <none> <none>
kube-system csi-cinder-nodeplugin-sqp5m 2/2 Running 208 205d 10.118.32.67 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-izj-c6o-server-cal <none> <none>
kube-system csi-cinder-nodeplugin-th9fn 2/2 Running 207 205d 10.118.32.75 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-bwb-zin-server-27l <none> <none>
kube-system csi-cinder-nodeplugin-v6tmc 2/2 Running 207 205d 10.118.34.203 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-ybf-a6e-server-4wl <none> <none>
kube-system csi-cinder-nodeplugin-vdmjr 2/2 Running 7 205d 10.118.36.22 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-n5o-lpn-server-idx <none> <none>
kube-system csi-cinder-nodeplugin-vghsx 2/2 Running 7 205d 10.118.32.210 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-jci-cxc-server-ksh <none> <none>
kube-system csi-cinder-nodeplugin-w69ng 2/2 Running 3 205d 10.118.35.154 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-h2s-ea2-server-67x <none> <none>
kube-system csi-cinder-nodeplugin-xh7fq 2/2 Running 7 201d 10.118.35.143 eu-central-1-on-prem-k8s-master-3 <none> <none>
[root@eu-central-1-on-prem-k8s-master-1 ~]# kubectl get po -A -o wide | grep cinder | grep eu-ev-base-s6-c1m2-4xlarge-asg-x4c-aqt-sw3-server-npl
kube-system csi-cinder-nodeplugin-p49q9 1/2 CrashLoopBackOff 27 205d 10.118.34.37 eu-ev-base-s6-c1m2-4xlarge-asg-x4c-aqt-sw3-server-npl <none> <none>
[root@eu-central-1-on-prem-k8s-master-1 ~]# kubectl delete po -n kube-system csi-cinder-nodeplugin-p49q9
pod "csi-cinder-nodeplugin-p49q9" deleted
- Cinder-side problems
For example: a configuration change made by mistake, the volume's backend pool being changed, Ceph instability, a Ceph pool going missing, a Ceph user going missing, and so on.
- Nova-side problems

Handling inconsistent volume state
When the pod tries to mount the PVC, the attach fails with:
attachdetach-controller AttachVolume.Attach failed for volume "pvc-a2d5fdd1-b62c-49ae-aee4-8365be779eb8" : rpc error: code = Internal desc = ControllerPublishVolume Attach Volume failed with error failed to attach 051d57a7-40d7-4854-80b2-8e809088c060 volume to c9b82e1e-6f57-467f-b4c8-fc1bbdf50107 compute: Bad request with: [POST http://10.120.12.100:8774/v2.1/servers/c9b82e1e-6f57-467f-b4c8-fc1bbdf50107/os-volume_attachments], error message: {"badRequest": {"code": 400, "message": "Invalid volume: volume 051d57a7-40d7-4854-80b2-8e809088c060 already attached"}}
The VM details page does show this disk as already attached, with
pvc-a2d5fdd1-b62c-49ae-aee4-8365be779eb8 reported on /dev/vdc,
but inside the VM the disk is genuinely not visible:
cn-sz-common-s6-c1m2-4xl-asg-tvk-zzi-ppo-server-spc
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 50G 0 disk
└─vda1 253:1 0 50G 0 part /
vdb 253:16 0 80G 0 disk
└─vdb1 253:17 0 80G 0 part /var/lib/docker
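To see where the stale attachment record actually lives, it helps to compare what Cinder, Nova, and libvirt each report; a sketch (the server UUID comes from the error above, and the libvirt domain name on the hypervisor is something to look up first, not taken from the original notes):

# Cinder's view of the volume and its attachments
openstack volume show 051d57a7-40d7-4854-80b2-8e809088c060

# Nova's view of the instance and its attached volumes
openstack server show c9b82e1e-6f57-467f-b4c8-fc1bbdf50107

# on the hypervisor: what libvirt actually exposes to the guest
virsh list --all
virsh dumpxml <libvirt-domain-name> | grep -A5 '<disk'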
Attempts to detach and to re-attach the volume both fail. When detaching:
Invalid volume: Invalid input received: Invalid volume: Unable to detach volume. Volume status must be 'in-use' and attach_status must be 'attached' to detach. (HTTP 400) (Request-ID: req-7bc612bd-7af1-4c49-8555-ec36b9c91010) (HTTP 400) (Request-ID: req-be3d3b52-77af-4ad6-865f-b5170180ad86)
So the correct way to handle this is: always clean up the attachment first, then recreate the pod.
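Because the API refuses both operations while the state is inconsistent, the state has to be corrected on the Cinder side first. For reference, the failed detach/attach attempts would look roughly like this from the OpenStack CLI (a sketch, not necessarily the exact commands used):

# detach: rejected with the HTTP 400 above because status/attach_status are inconsistent
openstack server remove volume c9b82e1e-6f57-467f-b4c8-fc1bbdf50107 051d57a7-40d7-4854-80b2-8e809088c060

# re-attach: rejected with "volume ... already attached"
openstack server add volume c9b82e1e-6f57-467f-b4c8-fc1bbdf50107 051d57a7-40d7-4854-80b2-8e809088c060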
Fix up the database records (cinder.volumes); detach first:
select status,attach_status from volumes where id='051d57a7-40d7-4854-80b2-8e809088c060';
select * from volumes where id='051d57a7-40d7-4854-80b2-8e809088c060'\G;
+-----------+---------------+
| status    | attach_status |
+-----------+---------------+
| available | detached      |
+-----------+---------------+
| in-use    | attached      |
+-----------+---------------+
update volumes set status='in-use',attach_status='attached' where id='051d57a7-40d7-4854-80b2-8e809088c060';
update volumes set status='in-use',attach_status='attached' where id='';
Actually no: the volume is not really attached to the VM at all; it does not show up in the libvirt domain XML, although nova show still lists the disk.
So the Cinder volume should simply be set back to the detached state,
and the stale disk record on the Nova side should be removed:
cinder reset-state --state available 051d57a7-40d7-4854-80b2-8e809088c060
update volumes set attach_status='detached' where id='051d57a7-40d7-4854-80b2-8e809088c060';
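After resetting the state, a quick check that Cinder now reports the volume as available with no attachments (openstack volume show is an addition, not in the original notes):

# status should be "available" and the attachments list empty
openstack volume show 051d57a7-40d7-4854-80b2-8e809088c060 -c status -c attachments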
MariaDB [nova]> show tables;
+--------------------------------------------+
| Tables_in_nova                             |
+--------------------------------------------+
| block_device_mapping                       |   <-- contains the disk/attachment info
| instances                                  |   <-- no disk info here
+--------------------------------------------+
MariaDB [nova]> select updated_at,device_name,volume_id,deleted,attachment_id from block_device_mapping where instance_uuid='c9b82e1e-6f57-467f-b4c8-fc1bbdf50107';
+---------------------+-------------+--------------------------------------+---------+--------------------------------------+
| updated_at | device_name | volume_id | deleted | attachment_id |
+---------------------+-------------+--------------------------------------+---------+--------------------------------------+
| 2022-01-16 15:45:15 | /dev/vda | 56565a16-5649-44f7-9bd3-7e0f405b8738 | 0 | dc62f3f6-c9a9-4ffe-bd5d-3c58432cd3ee |
| 2022-01-16 15:45:15 | /dev/vdb | 932aea2d-e21d-4b65-a11d-f3f2cc732841 | 0 | a0120e97-1282-45ff-9402-8d1f7e56c2ad |
| 2021-07-22 05:42:45 | /dev/vdc | fc9de015-4147-4287-9145-90e1b6999a8b | 336016 | e6fd4209-8eb3-42a2-825f-b4e7ce38db87 |
| 2021-07-22 06:47:12 | /dev/vdc | 31f4a0a9-c713-48fc-953f-32c02a785d0d | 336067 | ef0e05d3-c71d-4539-84e1-c7e1433f3b8d |
| 2021-07-23 02:38:26 | /dev/vdd | c11a3863-acde-447c-8c66-3e1f3dc8ce3a | 336139 | 8fdb1ebc-f0bb-4b99-8f9e-536c1b087d47 |
| 2021-07-23 03:02:38 | /dev/vde | c6f15852-9e60-4969-addb-db748a1218c8 | 336160 | 9ee286c5-5e31-4f29-9ad4-5d83a02d142e |
| 2021-07-26 05:54:39 | /dev/vdf | 61bd21df-0348-4ec1-91b5-bb6c1dddc641 | 336433 | d99c24b1-0af4-47e5-9049-b9bf10133713 |
| 2022-01-18 00:58:24 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 347818 | 84a6bd64-749f-4406-a4b4-57be9c01bb0f |
| 2022-01-18 01:01:13 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 347852 | 8f3e03a9-d940-477d-be12-6835dc572bc8 |
| 2022-01-18 01:03:37 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 347870 | 7a4fef5e-a13e-42e3-bda2-2160c4d06ffb |
| 2022-01-18 01:06:42 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 347896 | e055369d-7e86-490f-a594-66c9a3f8cf25 |
| 2022-01-18 01:10:48 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 347918 | e02bcf80-2ae8-4619-8dee-a4658cfd5fb8 |
| 2022-01-18 01:13:53 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 347944 | dd3b0b76-9cae-4032-9127-0dfa91547971 |
| 2022-01-18 01:15:57 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 347958 | 0eee90e5-3f40-48a9-85a8-73efaec13831 |
| 2022-01-18 01:17:00 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 347974 | d6bb0cc3-caf3-4f3c-9161-8f36a8125523 |
| 2022-01-18 01:18:59 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 347992 | NULL |
| 2022-01-18 01:22:03 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348006 | 37360a51-26a9-44dc-95b0-d475ca633396 |
| 2022-01-18 01:24:06 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348034 | d7d5db37-c2c5-48fb-b5bf-4738bbf6962c |
| 2022-01-18 01:29:19 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348060 | fae11994-fb45-4e35-8ce0-26149bf51dc4 |
| 2022-01-18 01:31:21 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348072 | 47aef71c-f85d-46c0-ab89-4ab494558cf7 |
| 2022-01-18 01:35:24 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348090 | aff7ec9a-ad1e-420a-8394-d02c074979cd |
| 2022-01-18 01:36:26 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348108 | NULL |
| 2022-01-18 01:38:28 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348114 | 1f554112-fb2c-40e4-8d50-c0c032d2cc08 |
| 2022-01-18 01:42:31 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348138 | 5d53df66-ff9b-4ac6-8a52-87330fea1704 |
| 2022-01-18 01:44:34 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348152 | 2b7d0571-c3df-4def-8057-6c6fd0725476 |
| 2022-01-18 01:46:37 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348164 | 6a035859-8ed8-4cf8-955f-f4ab2a034b45 |
| 2022-01-18 01:48:38 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348186 | 899900d2-2f5d-4412-b476-1e5fd32fe6d7 |
| 2022-01-18 01:52:39 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348214 | d4a4355c-6a89-4e1a-8d81-16f0c485f688 |
| 2022-01-18 01:56:45 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348246 | 02d2195e-dc3e-4ec9-88ae-b91b8efb0cdf |
| 2022-01-18 01:58:48 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 348258 | 8c0baa4f-c69c-44e6-95e0-159ec99f6cbb |
| 2022-01-18 02:00:45 | /dev/vdc | 051d57a7-40d7-4854-80b2-8e809088c060 | 0 | NULL |
+---------------------+-------------+--------------------------------------+---------+--------------------------------------+
The mapping row for this disk has indeed not been deleted, yet it has no corresponding attachment_id, so the data is incomplete. Therefore reset the disk to the unattached state
and re-attach it afterwards:
select instance_uuid,updated_at,device_name,volume_id,deleted,attachment_id from block_device_mapping where volume_id='051d57a7-40d7-4854-80b2-8e809088c060' and deleted='0' ;
update block_device_mapping set deleted='348268' where volume_id='051d57a7-40d7-4854-80b2-8e809088c060' and deleted='0';
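Note on the deleted values: Nova soft-deletes rows instead of removing them, and by convention deleted is set to the row's own primary-key id (0 means live), which is where numbers such as 348268 come from. A generic sketch against the nova database (the deleted_at update is an optional extra, not in the original notes):

-- find the live mapping row and its primary key
select id, instance_uuid, device_name, volume_id, deleted from block_device_mapping where volume_id='051d57a7-40d7-4854-80b2-8e809088c060' and deleted=0;
-- soft-delete it the way Nova does: deleted = id
update block_device_mapping set deleted=id, deleted_at=now() where volume_id='051d57a7-40d7-4854-80b2-8e809088c060' and deleted=0;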
######### another volume stuck in the reserved state, handled the same way
cinder reset-state --state available a5690e57-b77f-4c91-8ea3-f4d0b8ddb4a7
update volumes set attach_status='detached' where id='a5690e57-b77f-4c91-8ea3-f4d0b8ddb4a7';
######### another one
MariaDB [nova]> select instance_uuid,updated_at,device_name,volume_id,deleted,attachment_id from block_device_mapping where volume_id='94b1b36d-c3c1-4cb7-a801-955bea1e9012';
update block_device_mapping set deleted='348449' where volume_id='94b1b36d-c3c1-4cb7-a801-955bea1e9012' and deleted='0';
######### another volume stuck in the reserved state, handled the same way
update volumes set attach_status='detached' where id='c6f15852-9e60-4969-addb-db748a1218c8';
######### another one
select instance_uuid,updated_at,device_name,volume_id,deleted,attachment_id from block_device_mapping where volume_id='b86b9190-a6ea-4676-b298-ff034c41a33b';
update block_device_mapping set deleted='348459' where volume_id='b86b9190-a6ea-4676-b298-ff034c41a33b' and deleted='0';
######### another volume stuck in the reserved state, handled the same way
update volumes set status='available',attach_status='detached' where id='a5690e57-b77f-4c91-8ea3-f4d0b8ddb4a7';