The connection to the server lb.

Author: 87d6dc4b11a7 | Published 2024-09-01 16:08

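Kubernetes and KubeSphere were deployed in All-in-One mode on a virtual machine. After the VM had been restarted and shut down frequently, the following errors appeared after one particular boot (k here is presumably a shell alias for kubectl).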

root@shawn-virtual-machine:~# k get node
E0902 13:17:50.626394   13014 memcache.go:265] couldn't get current server API group list: Get "https://lb.kubesphere.local:6443/api?timeout=32s": dial tcp 192.168.17.18:6443: connect: connection refused
E0902 13:17:50.626662   13014 memcache.go:265] couldn't get current server API group list: Get "https://lb.kubesphere.local:6443/api?timeout=32s": dial tcp 192.168.17.18:6443: connect: connection refused
E0902 13:17:50.628043   13014 memcache.go:265] couldn't get current server API group list: Get "https://lb.kubesphere.local:6443/api?timeout=32s": dial tcp 192.168.17.18:6443: connect: connection refused
E0902 13:17:50.628178   13014 memcache.go:265] couldn't get current server API group list: Get "https://lb.kubesphere.local:6443/api?timeout=32s": dial tcp 192.168.17.18:6443: connect: connection refused
E0902 13:17:50.629708   13014 memcache.go:265] couldn't get current server API group list: Get "https://lb.kubesphere.local:6443/api?timeout=32s": dial tcp 192.168.17.18:6443: connect: connection refused
The connection to the server lb.kubesphere.local:6443 was refused - did you specify the right host or port?
root@shawn-virtual-machine:~# k get pod -A
E0902 13:17:54.921584   13084 memcache.go:265] couldn't get current server API group list: Get "https://lb.kubesphere.local:6443/api?timeout=32s": dial tcp 192.168.17.18:6443: connect: connection refused
E0902 13:17:54.922120   13084 memcache.go:265] couldn't get current server API group list: Get "https://lb.kubesphere.local:6443/api?timeout=32s": dial tcp 192.168.17.18:6443: connect: connection refused
E0902 13:17:54.923808   13084 memcache.go:265] couldn't get current server API group list: Get "https://lb.kubesphere.local:6443/api?timeout=32s": dial tcp 192.168.17.18:6443: connect: connection refused
E0902 13:17:54.923964   13084 memcache.go:265] couldn't get current server API group list: Get "https://lb.kubesphere.local:6443/api?timeout=32s": dial tcp 192.168.17.18:6443: connect: connection refused
E0902 13:17:54.925327   13084 memcache.go:265] couldn't get current server API group list: Get "https://lb.kubesphere.local:6443/api?timeout=32s": dial tcp 192.168.17.18:6443: connect: connection refused
The connection to the server lb.kubesphere.local:6443 was refused - did you specify the right host or port?
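
Every log line above reports the same failure: lb.kubesphere.local resolves to 192.168.17.18, but the TCP connection to port 6443 is refused, so nothing is accepting connections on the API server port. As a small sketch, the helper below condenses such output to the distinct endpoints that refused the connection. The name refused_endpoints is made up for this example and is not part of kubectl.

```shell
# refused_endpoints: hypothetical helper. Reads kubectl error output on stdin
# and prints each distinct "ip:port" for which the connection was refused.
refused_endpoints() {
  grep -oE 'dial tcp [0-9.]+:[0-9]+: connect: connection refused' \
    | awk '{ sub(/:$/, "", $3); print $3 }' \
    | sort -u
}
```

Possible usage: k get node 2>&1 | refused_endpoints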

1. Check the Docker status

systemctl status docker

2. Check the kubelet status

systemctl status kubelet
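
The two status checks can also be run in one pass. The sketch below assumes the services are systemd units with the names used in this article. check_services is a hypothetical helper name. It relies on systemctl is-active, which prints the unit state and exits non-zero unless the unit is active.

```shell
# check_services: hypothetical helper. Prints one "unit: state" line per unit
# and returns non-zero if any of the units is not active.
check_services() {
  local rc=0 unit state
  for unit in "$@"; do
    # is-active prints the state (active, inactive, failed, ...) and exits
    # non-zero unless the unit is active.
    state=$(systemctl is-active "$unit" 2>/dev/null) || rc=1
    printf '%s: %s\n' "$unit" "${state:-unknown}"
  done
  return "$rc"
}
```

Possible usage: check_services docker kubelet etcd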

3. Check the state of port 6443

netstat -pnlt | grep 6443

Nothing is listening on port 6443.
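
On systems without netstat (net-tools), ss -lnt from iproute2 lists the same listening sockets. As a sketch, the hypothetical helper below reads either listing on stdin and succeeds only if a LISTEN socket is bound to the given port, which is handy in scripts where a bare grep could also match an unrelated column.

```shell
# port_is_listening: hypothetical helper. Reads `ss -lnt` or `netstat -lnt`
# output on stdin and succeeds if some LISTEN socket is bound to the port.
port_is_listening() {
  awk -v port="$1" '
    /LISTEN/ {
      # The local address is the 4th column in both ss and netstat output;
      # the port is whatever follows the last colon.
      n = split($4, parts, ":")
      if (parts[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}
```

Possible usage: ss -lnt | port_is_listening 6443 || echo "API server port is not listening"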

4. Check the kubelet logs

journalctl -xeu kubelet
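
journalctl -xeu kubelet opens a pager at the end of the log. For a non-interactive look, the sketch below keeps only error and fatal lines. It assumes the kubelet writes its default klog text format, where each line starts with a severity letter and the date (for example E0902), and that journalctl is run with -o cat so only the message text is printed. klog_errors is a hypothetical helper name.

```shell
# klog_errors: hypothetical helper. Keeps only klog lines at error (E) or
# fatal (F) severity, e.g. "E0902 13:17:50.626394 ...", and drops the rest.
klog_errors() {
  # "|| true" keeps the helper from failing when no line matches.
  grep -E '^[EF][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+' || true
}
```

Possible usage: journalctl -u kubelet --since "15 minutes ago" --no-pager -o cat | klog_errors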

5. Suspecting a problem with etcd, check the etcd status

systemctl status etcd

ETCDCTL_API=3 etcdctl --endpoints 192.168.17.18:2379 \
  --cert=/etc/ssl/etcd/ssl/node-shawn-virtual-machine.pem \
  --key=/etc/ssl/etcd/ssl/node-shawn-virtual-machine-key.pem \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  member list

ETCDCTL_API=3 etcdctl --endpoints 192.168.17.18:2379 \
  --cert=/etc/ssl/etcd/ssl/node-shawn-virtual-machine.pem \
  --key=/etc/ssl/etcd/ssl/node-shawn-virtual-machine-key.pem \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  endpoint health
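
Both commands repeat the same endpoint and certificate flags. A small wrapper keeps later calls short. This is only a sketch: etcdctl_local is a hypothetical name, and the endpoint and certificate paths are copied from the commands above, so they apply to this machine only.

```shell
# etcdctl_local: hypothetical wrapper around etcdctl that supplies the
# endpoint and certificate flags used in this article and passes any
# remaining arguments through unchanged.
etcdctl_local() {
  local ssl_dir=/etc/ssl/etcd/ssl
  local node=shawn-virtual-machine
  ETCDCTL_API=3 etcdctl \
    --endpoints 192.168.17.18:2379 \
    --cert="$ssl_dir/node-$node.pem" \
    --key="$ssl_dir/node-$node-key.pem" \
    --cacert="$ssl_dir/ca.pem" \
    "$@"
}
```

Possible usage: etcdctl_local member list, etcdctl_local endpoint health, or etcdctl_local endpoint status -w table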

6. Start etcd manually

etcd --data-dir=/var/lib/etcd --listen-client-urls=http://192.168.17.18:2379 \
   --advertise-client-urls=http://192.168.17.18:2379

Startup fails with:

panic: freepages: failed to get all reachable pages (key[0]=(hex)616c61726d on leaf page(437) needs to be < than key of the next element in ancestor (hex)000000000000b3f85f0000000000000000. Pages stack: [3100 437])
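
The panic comes from the consistency check that runs when the etcd backend database file is opened, which points to a damaged data file, plausibly a result of the repeated unclean shutdowns. The key in the message is hex-encoded. The sketch below decodes it with a hypothetical helper, hex_to_ascii. The result, alarm, suggests the damaged page belongs to a part of the database named alarm, but that reading is an inference from the message and not something the original notes state.

```shell
# hex_to_ascii: hypothetical helper. Decodes a string of hex digit pairs
# (such as the key printed in the panic) into plain characters.
hex_to_ascii() {
  printf '%s\n' "$1" | awk '{
    digits = "0123456789abcdef"
    s = tolower($0); out = ""
    for (i = 1; i < length(s); i += 2) {
      hi = index(digits, substr(s, i, 1)) - 1
      lo = index(digits, substr(s, i + 1, 1)) - 1
      out = out sprintf("%c", hi * 16 + lo)
    }
    print out
  }'
}

hex_to_ascii 616c61726d   # prints: alarm
```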

7. Restore from the backup (/root/tmp/snapshot.db)

rm -rf /var/lib/etcd
etcdutl --data-dir=/var/lib/etcd snapshot restore /root/tmp/snapshot.db
root@shawn-virtual-machine:/var/lib# etcdutl --data-dir=/var/lib/etcd snapshot restore /root/tmp/snapshot.db
2024-09-02T14:28:13+08:00       info    snapshot/v3_snapshot.go:260     restoring snapshot      {"path": "/root/tmp/snapshot.db", "wal-dir": "/var/lib/etcd/member/wal", "data-dir": "/var/lib/etcd", "snap-dir": "/var/lib/etcd/member/snap"}
2024-09-02T14:28:13+08:00       info    membership/store.go:141 Trimming membership information from the backend...
2024-09-02T14:28:13+08:00       info    membership/cluster.go:421       added member    {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"]}
2024-09-02T14:28:13+08:00       info    snapshot/v3_snapshot.go:287     restored snapshot       {"path": "/root/tmp/snapshot.db", "wal-dir": "/var/lib/etcd/member/wal", "data-dir": "/var/lib/etcd", "snap-dir": "/var/lib/etcd/member/snap"}
root@shawn-virtual-machine:/var/lib#
root@shawn-virtual-machine:/var/lib# ETCDCTL_API=3 etcdctl --endpoints 192.168.17.18:2379   --cert=/etc/ssl/etcd/ssl/node-shawn-virtual-machine.pem   --key=/etc/ssl/etcd/ssl/node-shawn-virtual-machine-key.pem   --cacert=/etc/ssl/etcd/ssl/ca.pem   member list
8e9e05c52164694d, started, etcd-shawn-virtual-machine, http://localhost:2380, https://192.168.17.18:2379, false
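
The restore above deletes the damaged data directory outright. A more cautious variant, sketched here, renames the directory so the old files stay available if the snapshot turns out to be unusable. move_aside is a hypothetical helper name, and the sketch assumes etcd has been stopped before the directory is touched.

```shell
# move_aside: hypothetical helper. Renames a directory to
# <dir>.bak.<timestamp> instead of deleting it and prints the new name.
move_aside() {
  local dir=${1%/} backup
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
  backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
  mv -- "$dir" "$backup" && printf '%s\n' "$backup"
}
```

Possible usage, assuming etcd runs as the systemd unit checked in step 5: systemctl stop etcd, then move_aside /var/lib/etcd, then the etcdutl snapshot restore command shown above, then systemctl start etcd.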

8. After etcd starts successfully, running k get node returns an error

Error from server (Forbidden): nodes is forbidden: User "kubernetes-admin" cannot list resource "nodes" in API group "" at the cluster scope
# kubeadm certs certificate-key
63881154cf600a52f90fc673e5dfaf529d0a91eca548bcb3afc6465407dd344b
# kubeadm init phase upload-certs --upload-certs --certificate-key 63881154cf600a52f90fc673e5dfaf529d0a91eca548bcb3afc6465407dd344b
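
The original notes list the kubeadm commands above without further comment. A Forbidden response means the API server accepted the client certificate but authorization denied the request, so one complementary check is which groups the admin certificate carries. The sketch below extracts the organization (O) fields, which Kubernetes treats as group names, from an openssl subject line. cert_groups is a hypothetical helper, and the pipeline in the usage note assumes the admin user is the first entry in the kubeconfig.

```shell
# cert_groups: hypothetical helper. Reads a subject line produced by
# `openssl x509 -noout -subject -nameopt RFC2253` on stdin and prints each
# O= value, i.e. the groups the certificate holder belongs to.
cert_groups() {
  sed 's/^subject=//' | tr ',' '\n' | sed -n 's/^ *O=//p'
}
```

Possible usage: kubectl config view --raw -o jsonpath='{.users[0].user.client-certificate-data}' | base64 -d | openssl x509 -noout -subject -nameopt RFC2253 | cert_groups

If the printed group has no ClusterRoleBinding in the restored data, that would explain the error. kubectl auth can-i list nodes reports the same decision without listing anything.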

Article link: https://www.haomeiwen.com/subject/gaecljtx.html