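0. Preparation and checks
Prerequisites:
- First, you need a working Kubernetes cluster.
- Confirm that you have a default StorageClass and that dynamic PV provisioning is configured. To check: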
kubectl get sc
Output:
NAME PROVISIONER AGE
nfs (default) fuseim.pri/ifs 147m
slow kubernetes.io/gce-pd 5d
The (default) marker next to the name means that this StorageClass is the default.
To mark a StorageClass as the default:
kubectl patch storageclass <your-class-name> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
For details, see: Change the default StorageClass
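The check above can also be scripted. The following is a minimal sketch, assuming the column layout shown in the sample output (name first, `(default)` marker second); `default_storageclass` is a hypothetical helper name, not part of kubectl.

```shell
# Hypothetical helper: reads `kubectl get sc` output on stdin and prints
# the name of every StorageClass marked "(default)".
# Exits non-zero when no default StorageClass is found.
default_storageclass() {
  awk 'NR > 1 && $2 == "(default)" { print $1; found = 1 }
       END { exit !found }'
}
```

A possible use is `kubectl get sc | default_storageclass || echo "no default StorageClass"`.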
1. Download and install
1. Download kfctl
Choose the package you want from the kfctl release page.
I picked 0.6.2, the latest release at the time of writing:
wget https://github.com/kubeflow/kubeflow/releases/download/v0.6.2/kfctl_v0.6.2_linux.tar.gz
- Extract the archive
tar -xvf kfctl_v0.6.2_linux.tar.gz
- Copy kfctl to a directory on your executable path
cp kfctl /usr/bin/kfctl
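If you want to try a different release, the download URL can be built from the version number. This is only a sketch: it assumes other releases name their asset the same way as the v0.6.2 URL above, which may not hold for every version, and `kfctl_url` is a hypothetical helper name.

```shell
# Hypothetical helper: builds the download URL for a given kfctl version,
# ASSUMING the release asset follows the same naming as the v0.6.2 one.
kfctl_url() {
  version=$1
  printf 'https://github.com/kubeflow/kubeflow/releases/download/v%s/kfctl_v%s_linux.tar.gz\n' \
    "$version" "$version"
}
```

For example, `wget "$(kfctl_url 0.6.2)"` fetches the same archive as the command above.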
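3. Make sure the directory that contains kfctl is on your PATH (PATH entries must be directories, not the binary itself):
export PATH=$PATH:"/usr/bin"
4. Set the directory where the Kubeflow configuration will be saved; you can choose any path: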
#export KFAPP="<your choice of application directory name>"
export KFAPP="/home/kubeflow-config"
5. Choose the configuration file (this one installs Istio by default):
# Installs Istio by default. Comment out Istio components in the config file to skip Istio installation. See https://github.com/kubeflow/kubeflow/pull/3663
export CONFIG="https://raw.githubusercontent.com/kubeflow/kubeflow/v0.6-branch/bootstrap/config/kfctl_k8s_istio.0.6.2.yaml"
6. Run kfctl init
kfctl init ${KFAPP} --config=${CONFIG} -V
7. Generate the configuration files
cd ${KFAPP}
kfctl generate all -V
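- Have kfctl create the resources from the generated configuration
The current version has a pitfall here: the kubeflow-anonymous namespace is not created in advance, which leads to the following errors: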
E0916 12:32:59.188756 13066 memcache.go:135] couldn't get resource list for machinelearning.seldon.io/v1alpha2: the server could not find the requested resource
INFO[0012] creating Profile/kubeflow-anonymous filename="kustomize/kustomize.go:447"
WARN[0012] Could not find namespace kubeflow-anonymous, wait and retry: namespaces "kubeflow-anonymous" not found filename="kustomize/kustomize.go:353"
WARN[0016] Could not find namespace kubeflow-anonymous, wait and retry: namespaces "kubeflow-anonymous" not found filename="kustomize/kustomize.go:353"
WARN[0022] Could not find namespace kubeflow-anonymous, wait and retry: namespaces "kubeflow-anonymous" not found filename="kustomize/kustomize.go:353"
WARN[0030] Could not find namespace kubeflow-anonymous, wait and retry: namespaces "kubeflow-anonymous" not found filename="kustomize/kustomize.go:353"
WARN[0039] Could not find namespace kubeflow-anonymous, wait and retry: namespaces "kubeflow-anonymous" not found filename="kustomize/kustomize.go:353"
WARN[0053] Could not find namespace kubeflow-anonymous, wait and retry: namespaces "kubeflow-anonymous" not found filename="kustomize/kustomize.go:353"
The workaround is to create the kubeflow-anonymous namespace beforehand:
kubectl create namespace kubeflow-anonymous
Then create the remaining resources:
kfctl apply all -V
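Because `kubectl create namespace` fails when the namespace already exists, the workaround can be wrapped so that it is safe to re-run. A minimal sketch; `ensure_namespace` is a hypothetical helper name, not something kfctl or kubectl provides.

```shell
# Hypothetical helper: create a namespace only if it does not exist yet,
# so the step can be repeated without an "already exists" error.
ensure_namespace() {
  ns=$1
  if kubectl get namespace "$ns" >/dev/null 2>&1; then
    echo "namespace $ns already exists"
  else
    kubectl create namespace "$ns"
  fi
}
```

It could be used as `ensure_namespace kubeflow-anonymous && kfctl apply all -V`.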
- Verify that the installation succeeded:
List all resources in the kubeflow namespace:
kubectl -n kubeflow get all
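The full listing is long, so a narrower check is to print only the pods that are not yet Running or Completed. This sketch assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, ...); `not_running_pods` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints the names of pods whose STATUS column (3rd field) is neither
# "Running" nor "Completed".
not_running_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}
```

For example, `kubectl -n kubeflow get pods --no-headers | not_running_pods` prints nothing once every pod has started.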
- Delete Kubeflow
cd ${KFAPP}
# If you want to delete all the resources, run:
kfctl delete all -V
If you plan to install Kubeflow again using the steps above, you must first delete everything left under istio-system:
kubectl delete ns istio-system
2. Errors on a single-node cluster
On a single-node cluster, the taint on the master node causes this error:
message: '0/1 nodes are available: 1 node(s) had taints that the pod didn''t tolerate.'
Fix:
kubectl taint nodes --all node-role.kubernetes.io/master-
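To confirm whether the master taint is really the cause before removing it, you can look at the Taints line in `kubectl describe nodes`. A minimal sketch; `has_master_taint` is a hypothetical helper name, and it only inspects the first taint printed on each Taints line.

```shell
# Hypothetical helper: reads `kubectl describe nodes` output on stdin and
# succeeds if a "Taints:" line carries the master taint.
# Limitation: additional taints printed on continuation lines are not checked.
has_master_taint() {
  grep '^Taints:' | grep -q 'node-role.kubernetes.io/master'
}
```

For example, `kubectl describe nodes | has_master_taint && echo "master is tainted"`. Removing the taint lets ordinary workloads run on the master, which is usually fine for a single-node test cluster.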
3. Unable to pull images from gcr.io
message: Back-off pulling image "gcr.io/kubeflow-images-public/jupyter-web-app:9419d4d"
Fix:
- First, list all the images you need.
gcr.io/kubeflow-images-public/ingress-setup:latest
gcr.io/kubeflow-images-public/admission-webhook:v20190520-v0-139-gcee39dbc-dirty-0d8f4c
gcr.io/kubeflow-images-public/kubernetes-sigs/application:1.0-beta
gcr.io/kubeflow-images-public/centraldashboard:v20190823-v0.6.0-rc.0-69-gcb7dab59
gcr.io/kubeflow-images-public/jupyter-web-app:9419d4d
gcr.io/kubeflow-images-public/katib/v1alpha2/katib-controller:v0.6.0-rc.0
gcr.io/kubeflow-images-public/katib/v1alpha2/katib-manager:v0.6.0-rc.0
gcr.io/kubeflow-images-public/katib/v1alpha2/katib-manager-rest:v0.6.0-rc.0
gcr.io/kubeflow-images-public/katib/v1alpha2/suggestion-bayesianoptimization:v0.6.0-rc.0
gcr.io/kubeflow-images-public/katib/v1alpha2/suggestion-grid:v0.6.0-rc.0
gcr.io/kubeflow-images-public/katib/v1alpha2/suggestion-hyperband:v0.6.0-rc.0
gcr.io/kubeflow-images-public/katib/v1alpha2/suggestion-nasrl:v0.6.0-rc.0
gcr.io/kubeflow-images-public/katib/v1alpha2/suggestion-random:v0.6.0-rc.0
gcr.io/kubeflow-images-public/katib/v1alpha2/katib-ui:v0.6.0-rc.0
gcr.io/kubeflow-images-public/metadata:v0.1.8
gcr.io/kubeflow-images-public/metadata-frontend:v0.1.8
gcr.io/ml-pipeline/api-server:0.1.23
gcr.io/ml-pipeline/persistenceagent:0.1.23
gcr.io/ml-pipeline/scheduledworkflow:0.1.23
gcr.io/ml-pipeline/frontend:0.1.23
gcr.io/ml-pipeline/viewer-crd-controller:0.1.23
gcr.io/kubeflow-images-public/notebook-controller:v20190603-v0-175-geeca4530-e3b0c4
gcr.io/kubeflow-images-public/profile-controller:v20190619-v0-219-gbd3daa8c-dirty-1ced0e
gcr.io/kubeflow-images-public/pytorch-operator:v1.0.0-rc.0
gcr.io/google_containers/spartakus-amd64:v1.1.0
gcr.io/kubeflow-images-public/tf_operator:v0.6.0.rc0
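One way to collect such a list from a cluster where the resources have already been created is to ask kubectl for every container image and keep the gcr.io ones. A minimal sketch; `gcr_images` is a hypothetical helper name, and the approach only sees images of regular containers in existing pods (init containers are not covered).

```shell
# Hypothetical helper: reads a whitespace-separated list of image names on
# stdin and prints the unique gcr.io images, one per line.
gcr_images() {
  tr -s '[:space:]' '\n' | grep '^gcr\.io/' | sort -u
}
```

For example: `kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" | gcr_images > images.txt`.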
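- Find an image mirror
Azure China provides a mirror (proxy) service for gcr.io, k8s.gcr.io, and quay.io.
All you need to do is pull each image through the mirror and retag it with its original name.
Doing this by hand is tedious, though; see this article:
Azure China provides a gcr.io/k8s.gcr.io image proxy service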
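The pull-and-retag idea can be sketched as follows. The mirror hostnames `gcr.azk8s.cn` and `quay.azk8s.cn` and the `google_containers` prefix for k8s.gcr.io are assumptions that do not appear in this post; check that they match the mirror you actually use and that it is reachable from your network. `mirror_name` and `pull_via_mirror` are hypothetical helper names.

```shell
# Hypothetical helper: map an image name to its name on the mirror.
# ASSUMPTION: the mirror hostnames below; adjust them for your mirror.
mirror_name() {
  case "$1" in
    k8s.gcr.io/*) printf 'gcr.azk8s.cn/google_containers/%s\n' "${1#k8s.gcr.io/}" ;;
    gcr.io/*)     printf 'gcr.azk8s.cn/%s\n' "${1#gcr.io/}" ;;
    quay.io/*)    printf 'quay.azk8s.cn/%s\n' "${1#quay.io/}" ;;
    *)            printf '%s\n' "$1" ;;  # anything else is pulled as-is
  esac
}

# Hypothetical helper: pull through the mirror, then tag the image with its
# original name so that Kubernetes finds it locally.
pull_via_mirror() {
  src=$1
  mirror=$(mirror_name "$src")
  docker pull "$mirror" || return 1
  if [ "$mirror" != "$src" ]; then
    docker tag "$mirror" "$src"
  fi
}
```

For example, `pull_via_mirror gcr.io/kubeflow-images-public/jupyter-web-app:9419d4d`. The image has to be present on every node that may run the pod.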
Building on the script from that article's author, I added support for pulling the images listed in a file:
GitHub: docker_wrapper
How to use it
install
git clone https://github.com/silenceshell/docker_wrapper.git
sudo cp docker_wrapper/docker_wrapper.py /usr/local/bin/docker_wrapper
usage
You can use docker_wrapper to pull images from gcr.io, k8s.gcr.io, and quay.io, as well as from hub.docker.com. In the latter case, it pulls directly from hub.docker.com.
docker_wrapper pull k8s.gcr.io/kube-apiserver:v1.14.1
docker_wrapper pull gcr.io/google_containers/kube-apiserver:v1.14.1
docker_wrapper pull quay.io/coreos/flannel:v0.10.0-amd64
docker_wrapper pull nginx
docker_wrapper pull silenceshell/godaddy:0.0.2
You can also pull the images listed in a file; the full path of the file is required.
docker_wrapper pull -r {image_list_full_path}
Note: when pulling images from a file, {image_list_full_path} must be an absolute path.
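If your image list sits in the current directory, a small helper can turn its relative path into the absolute one that the option expects. A minimal sketch; `abs_path` is a hypothetical helper name and the file name `images.txt` is only an example.

```shell
# Hypothetical helper: print the absolute path of a file whose directory
# exists. The file name itself is not resolved (symlinks are kept as-is).
abs_path() {
  dir=$(cd "$(dirname "$1")" && pwd) || return 1
  printf '%s/%s\n' "$dir" "$(basename "$1")"
}
```

For example: `docker_wrapper pull -r "$(abs_path images.txt)"`.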