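1. Introduction
In the previous article, "Building a Kubernetes 1.10.0 Cluster from Scratch (Part 2: Setting Up the Virtual Machine Environment)", we set up the base virtual machine. Now we can begin our real K8S journey.
We will call the existing virtual machine Node1 and use it as the master node. To reduce the workload, after installing Kubernetes on Node1 we will use VirtualBox's clone feature to create two identical virtual machines as worker nodes. The three roles are:
- Node1: Master
- Node2: Worker
- Node3: Worker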
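2. Installing Kubernetes
As always, the official documentation is the best reference: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
Treat it as a reference only, though: because of network restrictions it does not fully apply to users in mainland China. The steps below describe in detail how to install Kubernetes on Node1; once that is done, clone the virtual machine to produce Node2 and Node3.
Configuring the K8S yum repository
The official repository is unreachable, so I recommend the Aliyun mirror. Run the following command to add the kubernetes.repo repository: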
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Disabling swap and the firewall
Both were already disabled in the previous article.
Disabling SELinux
Run: setenforce 0
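One thing worth knowing: setenforce 0 only changes the running system, and the virtual machines are rebooted later in this article. The following is a minimal sketch of making the setting persistent. The helper name set_selinux_permissive is made up for illustration, and it assumes the config file still contains the stock SELINUX=enforcing line.

```shell
#!/usr/bin/env bash
# Sketch only: persist the SELinux mode across reboots.
# Assumption: the file contains a line that is exactly "SELINUX=enforcing".

# set_selinux_permissive FILE
# Rewrites the SELINUX= line in FILE from enforcing to permissive.
set_selinux_permissive() {
    local cfg="$1"
    sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' "$cfg"
}

# On the VM, as root, this would be used as:
#   setenforce 0
#   set_selinux_permissive /etc/selinux/config
```

Keeping the file path as a parameter makes it easy to try the edit on a copy before touching /etc/selinux/config.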
Installing the K8S components
Run the following command to install kubelet, kubeadm, and kubectl:
yum install -y kubelet kubeadm kubectl
The result looks like this:
image.png
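The command above installs whatever version is newest in the repository. Several readers in the comments at the end of this article ran it after 1.11 was released and then found that kubeadm refused to deploy v1.10.0. A possible way around that is to pin the package versions. This is a sketch, not something I tested in the original walkthrough: the helper k8s_packages is hypothetical, and whether the mirror still serves these exact versions (and resolves their dependencies cleanly) is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: build a version-pinned package list for yum.

# k8s_packages VERSION
# Prints "kubelet-VERSION kubeadm-VERSION kubectl-VERSION".
k8s_packages() {
    local v="$1"
    printf 'kubelet-%s kubeadm-%s kubectl-%s\n' "$v" "$v" "$v"
}

# On the VM, as root, this would be used as:
#   yum install -y $(k8s_packages 1.10.0)
```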
Configuring the kubelet cgroup driver
Make sure the Docker cgroup driver and the kubelet cgroup driver are the same:
docker info | grep -i cgroup
cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
If they differ (here kubelet is set to systemd while Docker reports cgroupfs), run:
sed -i "s/cgroup-driver=systemd/cgroup-driver=cgroupfs/g" /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload
As shown:
image.png
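Comparing the two values by eye is easy to get wrong, so here is a small sketch that extracts both and compares them. The function names are made up for illustration. It assumes docker info prints a "Cgroup Driver:" line and that the kubelet drop-in contains a --cgroup-driver= flag, as in the 1.10 packages used here.

```shell
#!/usr/bin/env bash
# Sketch only: extract the cgroup driver from each tool's output.

# Reads `docker info` output on stdin and prints the driver name.
docker_cgroup_driver() {
    sed -n 's/^ *Cgroup Driver: *//p'
}

# Reads the kubelet drop-in file on stdin and prints the value of
# the --cgroup-driver flag, if the flag is present.
kubelet_cgroup_driver() {
    sed -n 's/.*--cgroup-driver=\([a-z]*\).*/\1/p'
}

# On the VM this would be used as:
#   d=$(docker info 2>/dev/null | docker_cgroup_driver)
#   k=$(kubelet_cgroup_driver < /etc/systemd/system/kubelet.service.d/10-kubeadm.conf)
#   if [ "$d" = "$k" ]; then echo "match: $d"; else echo "mismatch: docker=$d kubelet=$k"; fi
```

Note that the sed command in the article only rewrites systemd to cgroupfs; if the mismatch is the other way round, the substitution has to be reversed.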
Starting kubelet
Note: according to the official documentation, after installing kubelet, kubeadm, and kubectl you need to start kubelet:
systemctl enable kubelet && systemctl start kubelet
In my tests, however, it failed to start. The log shows that a certificate is missing:
unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
image.png
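I could not find a fix online, but I discovered by accident that the later kubeadm init step creates the certificate. In other words, the failure to start at this point does not affect the following steps, so carry on!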
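Downloading the K8S Docker images
This article uses kubeadm, the official K8S tool, to initialize the cluster. By default, kubeadm init pulls the Docker images the cluster depends on from Google's servers, which are unreachable from mainland China, so the pull times out and fails.
However, if we import the images in advance, kubeadm init finds them locally and no longer contacts Google. There are several ways to obtain the images, such as building them through Docker Hub, but they are somewhat tedious.
I have packaged all the Docker images needed for initialization, at version v1.10.0, and recommend using them.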
- Link: https://pan.baidu.com/s/11AheivJxFzc4X6Q5_qCw8A
- Password: 2zov
The prepared images are shown below:
image.png
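K8S releases quickly; the latest version at the time of writing is v1.10.2. The images I provide will become increasingly outdated, so readers who need the latest version can build their own by following tutorials online.
The script docker_images_load.sh imports the images: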
docker load < quay.io#calico#node.tar
docker load < quay.io#calico#cni.tar
docker load < quay.io#calico#kube-controllers.tar
docker load < k8s.gcr.io#kube-proxy-amd64.tar
docker load < k8s.gcr.io#kube-scheduler-amd64.tar
docker load < k8s.gcr.io#kube-controller-manager-amd64.tar
docker load < k8s.gcr.io#kube-apiserver-amd64.tar
docker load < k8s.gcr.io#etcd-amd64.tar
docker load < k8s.gcr.io#k8s-dns-dnsmasq-nanny-amd64.tar
docker load < k8s.gcr.io#k8s-dns-sidecar-amd64.tar
docker load < k8s.gcr.io#k8s-dns-kube-dns-amd64.tar
docker load < k8s.gcr.io#pause-amd64.tar
docker load < quay.io#coreos#etcd.tar
docker load < quay.io#calico#node.tar
docker load < quay.io#calico#cni.tar
docker load < quay.io#calico#kube-policy-controller.tar
docker load < gcr.io#google_containers#etcd.tar
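Put the images in the same directory as the script and run it to import them into Docker. Afterwards, run docker images; if the images are listed, the import succeeded.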
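If you build your own image archives, the file list will differ from the script above. As an alternative, here is a sketch of a loop that loads every .tar file in a directory, so the list never needs maintaining. The function name load_image_tars and its optional loader argument are made up for illustration; the loader exists only so the loop can be dry-run without Docker.

```shell
#!/usr/bin/env bash
# Sketch only: load every image archive found in a directory.

# load_image_tars DIR [LOADER...]
# Feeds each DIR/*.tar file to LOADER on stdin (default: docker load)
# and stops at the first failure.
load_image_tars() {
    local dir="$1" tar
    shift
    [ "$#" -gt 0 ] || set -- docker load
    for tar in "$dir"/*.tar; do
        [ -e "$tar" ] || continue   # the glob matched nothing
        echo "loading $tar"
        "$@" < "$tar" || return 1
    done
}

# On the VM this would be used as:
#   load_image_tars .           # real import through docker load
#   load_image_tars . true      # dry run: only lists the files
```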
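3. Cloning the Virtual Machine
As mentioned in the introduction, once Kubernetes is installed on Node1 it is time to clone the virtual machine.
Cloning
Shut the virtual machine down before cloning; choose "normal shutdown". Then right-click the virtual machine and click Clone: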
image.png
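As shown above, name the new node CentOS-Node2, and be sure to check "Reinitialize the MAC address of all network cards". Click "Clone" and wait a few minutes for the clone to finish.
Repeat the same steps to clone another node named CentOS-Node3.
Adding a network adapter
If you start the three virtual machines right after cloning, you will find that they all have the same IP address (adapter enp0s3):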
image.png
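This is because the adapter's address is copied along with the virtual machine, so the three nodes cannot reach each other. I therefore recommend not starting the virtual machines right after cloning; first add a second network adapter to each one for communication between nodes.
As shown below, choose "Host-Only" as the attachment mode: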
image.png
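After adding the adapters, start the three virtual machines and check each IP. Taking the master node Node1 as an example, run ip addr: enp0s8 is the newly added adapter 2, and its IP address is 192.168.56.101. The IPs of the three nodes are:
- Node1: 192.168.56.101
- Node2: 192.168.56.102
- Node3: 192.168.56.103
From each node, ping the other two using these IPs to confirm that the network is working.
Also, as described in the previous part, I recommend enabling port forwarding and connecting to the terminals of Node1 and Node2 with Xshell.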
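Configuring the virtual machines
Once the three virtual machines are running, a few simple settings are needed. Taking the master node Node1 as an example:
- Edit /etc/hostname and change the hostname to k8s-node1
- Edit /etc/hosts and append a line of the form: IP k8s-node1
Here IP is the address of adapter 2. Reboot for the changes to take effect. Configure the other two nodes the same way, with hostnames k8s-node2 and k8s-node3.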
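The two edits above can also be scripted. The sketch below appends a hosts entry only if the name is not already present, so running it twice does no harm. The helper add_host_entry is made up for illustration. Adding all three nodes to every machine's /etc/hosts goes slightly beyond what the article asks for and is my own assumption about what is convenient.

```shell
#!/usr/bin/env bash
# Sketch only: idempotently append "IP NAME" lines to a hosts file.

# add_host_entry FILE IP NAME
# Appends "IP NAME" to FILE unless a line already ends with NAME.
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    grep -qE "[[:space:]]${name}\$" "$file" || printf '%s %s\n' "$ip" "$name" >> "$file"
}

# On Node1, as root, this would be used as:
#   hostnamectl set-hostname k8s-node1
#   add_host_entry /etc/hosts 192.168.56.101 k8s-node1
#   add_host_entry /etc/hosts 192.168.56.102 k8s-node2
#   add_host_entry /etc/hosts 192.168.56.103 k8s-node3
```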
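4. Creating the Cluster
About kubeadm
With the preparation done, we can now actually create the cluster. We use the official kubeadm tool, which creates a K8S cluster quickly and conveniently. For details on kubeadm, see the official documentation: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
At the time of writing, kubeadm is still in beta and is not yet officially recommended for production, but a GA release is expected this year. I still suggest using kubeadm wherever possible: compared with a fully manual deployment it is more efficient and less error-prone.
Creating the cluster
On the master node (k8s-node1), run: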
kubeadm init --pod-network-cidr=192.168.0.0/16 --kubernetes-version=v1.10.0 --apiserver-advertise-address=192.168.56.101
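Notes:
1. --pod-network-cidr=192.168.0.0/16 is set because the cluster will use Calico networking; Calico's pod subnet must be specified up front.
2. --kubernetes-version=v1.10.0 sets the K8S version. It must match the v1.10.0 Docker images imported earlier; otherwise kubeadm contacts Google to download the images for the latest K8S version.
3. --apiserver-advertise-address is the IP of the network adapter to bind to. It must be the address of the enp0s8 adapter described above; otherwise the enp0s3 adapter is used by default.
4. If kubeadm init fails or is interrupted, run kubeadm reset before running it again.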
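Typing the advertise address by hand invites mistakes, so it can be read from the adapter instead. This is a sketch under the assumption that the host-only adapter is named enp0s8 as in this article; the helper iface_ipv4 is made up for illustration, and it relies on the one-line output format of ip -4 -o addr show.

```shell
#!/usr/bin/env bash
# Sketch only: take the IPv4 address of an adapter from `ip` output.

# Reads `ip -4 -o addr show dev IFACE` output on stdin and prints the
# first address without its prefix length.
iface_ipv4() {
    awk '{ split($4, a, "/"); print a[1]; exit }'
}

# On the master node this would be used as:
#   addr=$(ip -4 -o addr show dev enp0s8 | iface_ipv4)
#   kubeadm init --pod-network-cidr=192.168.0.0/16 \
#       --kubernetes-version=v1.10.0 \
#       --apiserver-advertise-address="$addr"
```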
Output of kubeadm init:
[root@k8s-node1 ~]# kubeadm init --pod-network-cidr=192.168.0.0/16 --kubernetes-version=v1.10.0 --apiserver-advertise-address=192.168.56.101
[init] Using Kubernetes version: v1.10.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.03.1-ce. Max validated version: 17.03
[WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [k8s-node1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.56.101]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [localhost] and IPs [127.0.0.1]
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [k8s-node1] and IPs [192.168.56.101]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 24.006116 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node k8s-node1 as master by adding a label and a taint
[markmaster] Master k8s-node1 tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: kt62dw.q99dfynu1kuf4wgy
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: kube-dns
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join 192.168.56.101:6443 --token kt62dw.q99dfynu1kuf4wgy --discovery-token-ca-cert-hash sha256:5404bcccc1ade37e9d80831ce82590e6079c1a3ea52a941f3077b40ba19f2c68
As you can see, the cluster was initialized successfully, and we are told to run the following commands:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
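We are also told to set up a pod network and to have the other nodes run kubeadm join... to join the cluster.
Creating the network
Without a pod network, the pod status shows the kube-dns component stuck and the cluster is unusable: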
image.png
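You can consult the official documentation and choose the network that fits your needs. Here we use Calico (already decided when we initialized the cluster).
According to the official documentation, run the following command on the master node: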
kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
Note, however:
The Calico Docker images used in this walkthrough are version v3.1.0, as shown below
image.png
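At the time of writing, however, the version in calico.yaml has been bumped to v3.1.1. We therefore need to download calico.yaml, edit it by hand to change the image versions to v3.1.0, and create the network from the edited file. Otherwise kubectl apply pulls the v3.1.1 images, which times out and fails, and kube-dns stays Pending because the network cannot be created: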
image.png
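Since several readers asked how to do the manual edit, here is one possible way to script it. The helper pin_calico_images is made up for illustration, and the sketch assumes the manifest references its images as quay.io/calico/NAME:TAG, which matches the image names seen in this article but may differ in other Calico releases.

```shell
#!/usr/bin/env bash
# Sketch only: rewrite the Calico image tags inside a downloaded manifest.

# pin_calico_images FILE FROM TO
# Changes every quay.io/calico/NAME:FROM reference in FILE to NAME:TO.
# Images from other repositories are left untouched.
pin_calico_images() {
    local file="$1" from="$2" to="$3"
    sed -i "s#\(quay\.io/calico/[a-z-]*\):${from}#\1:${to}#" "$file"
}

# On the master node this would be used as:
#   curl -LO https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
#   pin_calico_images calico.yaml v3.1.1 v3.1.0
#   grep 'image:' calico.yaml      # check the result before applying
#   kubectl apply -f calico.yaml
```

Editing the file in vi and changing the tags by hand achieves the same thing; the script just makes the change repeatable.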
Once the versions match, a successful run prints:
image.png image.png
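5. Cluster Settings
Using the master as a worker node
By default a K8S cluster does not schedule Pods onto the master, which wastes the master's resources. On the master (k8s-node1), run the following command to let it act as a worker node as well: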
kubectl taint nodes --all node-role.kubernetes.io/master-
With this approach we can create a single-node K8S cluster without using minikube.
On success it prints:
image.png
Joining the other nodes to the cluster
On the other two nodes, k8s-node2 and k8s-node3, run the kubeadm join command generated on the master node to join the cluster:
kubeadm join 192.168.56.101:6443 --token kt62dw.q99dfynu1kuf4wgy --discovery-token-ca-cert-hash sha256:5404bcccc1ade37e9d80831ce82590e6079c1a3ea52a941f3077b40ba19f2c68
On success it prints:
image.png
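The token and hash in the join command are specific to each kubeadm init run, so the command above cannot be copied as is. A small sketch for recovering your own command, assuming you saved the init output to a file (the file name kubeadm-init.log and the helper extract_join_command are made up for illustration):

```shell
#!/usr/bin/env bash
# Sketch only: pull the join command out of saved `kubeadm init` output.

# Reads the saved output on stdin and prints the last line that starts
# with "kubeadm join" (leading spaces are stripped).
extract_join_command() {
    sed -n 's/^ *\(kubeadm join .*\)$/\1/p' | tail -n 1
}

# On the master node this would be used as:
#   kubeadm init ... | tee kubeadm-init.log
#   extract_join_command < kubeadm-init.log
```

If the output was not saved, I believe kubeadm token create --print-join-command prints a fresh command on this kubeadm version, but I did not verify that in this walkthrough.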
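Verifying the cluster
After all nodes have joined, wait a moment and run kubectl get nodes on the master node.
If a node shows NotReady, it is not ready yet and may still be initializing; wait until all nodes become Ready.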
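Instead of rerunning the command by hand, the wait can be scripted. The sketch below lists the nodes whose STATUS column is not exactly Ready; the helper name not_ready_nodes is made up for illustration, and it assumes the column layout printed by kubectl get nodes --no-headers (name first, status second).

```shell
#!/usr/bin/env bash
# Sketch only: report nodes that are not Ready yet.

# Reads `kubectl get nodes --no-headers` output on stdin and prints the
# name of every node whose second column is anything other than "Ready".
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}

# On the master node this would be used as:
#   while [ -n "$(kubectl get nodes --no-headers | not_ready_nodes)" ]; do
#       sleep 5
#   done
#   echo "all nodes Ready"
```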
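You may wonder why the version shown here is v1.10.2 when we used v1.10.0 earlier. The version shown is that of the kubelet binary on each node, i.e. the default version installed by yum, which is backward compatible. v1.10.0 refers to the version of the Docker images K8S depends on, and it must match the version passed to kubeadm init.
I also recommend checking the status of all pods with kubectl get pods -n kube-system.
If every pod is Running, the cluster is healthy. With that, our K8S cluster is up. Time to go massage your neck and relax; that was tiring!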
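6. Closing Remarks
Our K8S cluster is now fully set up. Stay tuned for the next part, "Building a Kubernetes Cluster from Scratch (Part 4: Setting Up the K8S Dashboard)".
My knowledge is limited and mistakes or omissions are inevitable; corrections are welcome, and feel free to leave a comment.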
Reader comments
After that it shows:
this version of kubeadm only supports deploying clusters with the control plane version >= 1.11.0. Current version: v1.10.0
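What should I do about this version-too-low error? Thanks.
Where can I download calico.yaml, and how do I edit the file by hand after downloading it?
Please advise.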
cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
Please advise: on startup it reports that this config file cannot be found.
kubeadm init --pod-network-cidr=192.168.0.0/16 --kubernetes-version=v1.10.0 --apiserver-advertise-address=192.168.6.130
The error message is:
[init] using Kubernetes version: v1.10.0
[preflight] running pre-flight checks
I0827 21:16:43.558655 3770 kernel_validator.go:81] Validating kernel version
I0827 21:16:43.558719 3770 kernel_validator.go:96] Validating kernel config
[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.06.1-ce. Max validated version: 17.03
[preflight] Some fatal errors occurred:
[ERROR KubeletVersion]: the kubelet version is higher than the control plane version. This is not a supported version skew and may lead to a malfunctional cluster. Kubelet version: "1.11.2" Control plane version: "1.10.0"
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...
What is going on here? QAQ
configmaps "kubelet-config-1.11" is forbidden: User "system:bootstrap:2e5yeb" cannot get configmaps in the namespace "kube-system
error... I don't know what went wrong.
kube-dns-86f4d74b45-w66fk 0/3 Pending 0 33m
kubeadm join then ran successfully, but kubectl get nodes on the master node only ever shows k8s-node1.
What is going on? Please advise.
Add more computing resources / nodes to your cluster (preferred)
Adjusting the number of CPUs fixed it.
[init] This might take a minute or longer if the control plane images have to be pulled.
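It gets stuck at this step and does not proceed; errors start appearing after about an hour.
Could this be related to the kubeadm version? The kubeadm I downloaded is newer than the one in your article, but the images are the version from your article. Could that be the problem?
Also, here is part of the log: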
Jun 20 01:56:39 master kubelet: E0620 01:56:39.833958 15867 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://192.168.36.15:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.36.15:6443: getsockopt: connection refused
Jun 20 01:56:39 master kubelet: E0620 01:56:39.945760 15867 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.36.15:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dmaster&limit=500&resourceVersion=0: dial tcp 192.168.36.15:6443: getsockopt: connection refused
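Any advice would be greatly appreciated, thank you!
I have pulled and tagged all the images locally, but I get: The connection to the server localhost:8080 was refused - did you specify the right host or port?
This problem is really bothering me.
On the master node (k8s-node1), run: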
kubeadm init --pod-network-cidr=192.168.0.0/16 --kubernetes-version=v1.10.0 --apiserver-advertise-address=192.168.56.102
Your master node is 192.168.56.101, so the --apiserver-advertise-address argument in this command should be 101, not 102.
Please correct it.
configmap "calico-config" created
secret "calico-etcd-secrets" created
daemonset.extensions "calico-node" created
deployment.extensions "calico-kube-controllers" created
serviceaccount "calico-kube-controllers" created
serviceaccount "calico-node" created
[root@k8s-node1 ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-68d47cd995-82tbp 0/1 Error 2 1m
calico-node-6gxss 1/2 Running 0 1m
etcd-k8s-node1 1/1 Running 0 7m
kube-apiserver-k8s-node1 1/1 Running 0 8m
kube-controller-manager-k8s-node1 1/1 Running 0 7m
kube-dns-86f4d74b45-8kgtc 0/3 ContainerCreating 0 8m
kube-proxy-ssz7q 1/1 Running 0 8m
kube-scheduler-k8s-node1 1/1 Running 0 7m
[root@k8s-node1 ~]#
I downloaded the 3.1.0 config file and applied it directly, but that did not work either; it goes straight to Error.
2018-05-13 14:41:38.754 [INFO][1] main.go 90: Ensuring Calico datastore is initialized
[root@k8s-node1 ~]#
[root@k8s-node1 ~]# kubectl logs calico-kube-controllers-68d47cd995-82tbp -n kube-system
2018-05-13 14:41:38.745 [INFO][1] main.go 69: Loaded configuration from environment config=&config.Config{LogLevel:"info", ReconcilerPeriod:"5m", CompactionPeriod:"10m", EnabledControllers:"policy,profile,workloadendpoint,node", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:""}
2018-05-13 14:41:38.754 [INFO][1] main.go 90: Ensuring Calico datastore is initialized
2018-05-13 14:41:48.755 [INFO][1] etcdv3.go 327: Error returned from etcdv3 client error=context deadline exceeded etcdv3-etcdKey="/calico/resources/v3/projectcalico.org/clusterinformations/default" model-etcdKey=ClusterInformation(default) rev=""
2018-05-13 14:41:48.755 [ERROR][1] client.go 232: Error getting cluster information config ClusterInformation="default" error=context deadline exceeded
2018-05-13 14:41:48.755 [FATAL][1] main.go 95: Failed to initialize Calico datastore error=context deadline exceeded
Normal Scheduled 3m default-scheduler Successfully assigned calico-kube-controllers-68d47cd995-82tbp to k8s-node1
Normal SuccessfulMountVolume 3m kubelet, k8s-node1 MountVolume.SetUp succeeded for volume "etcd-certs"
Normal SuccessfulMountVolume 3m kubelet, k8s-node1 MountVolume.SetUp succeeded for volume "calico-kube-controllers-token-rhn8d"
Normal Pulled 1m (x5 over 3m) kubelet, k8s-node1 Container image "quay.io/calico/kube-controllers:v3.1.0" already present on machine
Normal Created 1m (x5 over 3m) kubelet, k8s-node1 Created container
Normal Started 1m (x5 over 3m) kubelet, k8s-node1 Started container
Warning BackOff 49s (x8 over 2m) kubelet, k8s-node1 Back-off restarting failed container
How exactly is the manual edit done here? I'm a beginner and can't find where to edit.