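This problem can have several causes. In my case the image pull failed, with an error like "image pull failed for gcr.io/google_containers/pause:2.0". By default k8s pulls this image from gcr.io/google_containers, which is not reachable from networks in mainland China. It turned out I had simply forgotten to connect to the VPN…
The mistake itself is fairly basic; what I mainly want to share is how to track it down. Run "kubectl describe pod PodName" to see the events recorded for the pod; the error message shows up in the event list.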
vagrant@vagrant-ubuntu-trusty-64:~/work/k8s-foo$ kubectl run foo --image=hello-world
deployment "foo" created
vagrant@vagrant-ubuntu-trusty-64:~/work/k8s-foo$ kubectl get pods
NAME READY STATUS RESTARTS AGE
foo-928603113-igh2x 0/1 ContainerCreating 0 4m
vagrant@vagrant-ubuntu-trusty-64:~/work/k8s-foo$ kubectl describe pod foo
Name: foo-928603113-igh2x
Namespace: default
Node: 127.0.0.1/127.0.0.1
Start Time: Mon, 11 Apr 2016 15:11:49 +0000
Labels: pod-template-hash=928603113,run=foo
Status: Pending
IP:
Controllers: ReplicaSet/foo-928603113
Containers:
foo:
Container ID:
Image: hello-world
Image ID:
Port:
QoS Tier:
memory: BestEffort
cpu: BestEffort
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment Variables:
Conditions:
Type Status
Ready False
Volumes:
default-token-fbasq:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-fbasq
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
7m 7m 1 {default-scheduler } Normal Scheduled Successfully assigned foo-928603113-igh2x to 127.0.0.1
4m 4m 1 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for gcr.io/google_containers/pause:2.0, this may be because there are no credentials on this request. details: (API error (500): unable to ping registry endpoint https://gcr.io/v0/\nv2 ping attempt failed with error: Get https://gcr.io/v2/: dial tcp 74.125.203.82:443: i/o timeout\n v1 ping attempt failed with error: Get https://gcr.io/v1/_ping: dial tcp 74.125.203.82:443: i/o timeout\n)"
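When the event list is long, it can help to pull out just the image that failed to download. The following is a minimal sketch, not part of kubectl: `failed_images` is a hypothetical helper name, and it assumes the event message contains the phrase "image pull failed for <image>" as in the output above. Other Kubernetes versions may word the message differently.

```shell
# Hypothetical helper (not part of kubectl): read `kubectl describe pod`
# output on stdin and print each image named in an
# "image pull failed for <image>" message, one per line, without duplicates.
failed_images() {
  grep -o 'image pull failed for [^, ]*' | awk '{ print $NF }' | sort -u
}

# Typical use, assuming a pod named foo as in the session above:
#   kubectl describe pod foo | failed_images
```

If the helper prints a gcr.io image, the next thing to check is whether the node can reach that registry at all.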
That evening I ran into a similar problem when trying to start kube-dns. The kube-dns Service itself looked perfectly normal:
vagrant@vagrant-ubuntu-trusty-64:~/work/k8s-foo$ kubectl get services kube-dns --namespace=kube-system
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns 10.0.0.10 <none> 53/UDP,53/TCP 56m
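However, after starting a Service, resolving it by its Service name through DNS failed. Running "kubectl get pods --namespace=kube-system" showed that the kube-dns pods had failed to start.
Checking the events of those pods with "kubectl describe" showed that kube-dns also needs to download new images on startup. I turned on the VPN and restarted the cluster, and that was that.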
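The check above can be sketched as a small filter that lists only the pods that are not running. `not_running` is a hypothetical helper name, and the sketch assumes the default `kubectl get pods` column layout shown earlier (NAME READY STATUS RESTARTS AGE); with other output formats the column index would need adjusting.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# name and status of every pod whose STATUS column is not "Running".
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
not_running() {
  # NR > 1 skips the header row; $3 is the STATUS column.
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Typical use:
#   kubectl get pods --namespace=kube-system | not_running
```

Each pod it prints is a candidate for "kubectl describe pod", where the events should show whether an image pull is the cause.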