Setting Up a Ceph Cluster on Linux -- A Detailed Walkthrough


Author: ZNB_天玄 | Published 2017-07-07 23:46

I recently set up two open-source distributed file systems for distributed storage on Linux: GlusterFS and Ceph.

Preface: for the theory and the more abstract fundamentals, search GitHub for the source code or read the official documentation; I won't repeat them here. Instead, this post walks through the deployment process step by step, together with the problems I ran into and how I solved them. If you work in this area and have mature production-grade solutions or suggestions, please discuss them here so we can improve together; interested readers are also welcome to add me on WeChat (please note who you are and how you found me).

Now for the detailed Ceph walkthrough. Problems are resolved as they come up, and note in particular: unless a step says otherwise, everything below is configured on the same admin node (admin-node).

1. First, check whether the machine can reach the Internet. Many of these hosts are virtual machines and need a proxy, and the proxy must be configured in /etc/environment, otherwise it does not take effect (configuring the proxy in yum.conf also works). I chose /etc/environment; a sketch follows.
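A minimal sketch of the proxy settings, assuming a hypothetical proxy at proxy.example.com:8080 (the values below are placeholders, not from the original post; substitute your own):

# /etc/environment
http_proxy=http://proxy.example.com:8080
https_proxy=http://proxy.example.com:8080
no_proxy=localhost,127.0.0.1,10.112.178.135,10.112.178.141,10.112.178.142,10.112.178.143

Log out and back in (or start a new shell) for the settings to take effect.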

2. Check the hosts file and the hostname:

    [root@vm-10-112-178-135 gadmin]# vi /etc/hosts

    [root@vm-10-112-178-135 gadmin]# vi /etc/hostname
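For reference, a sketch of the /etc/hosts entries this cluster needs; the names and addresses below are taken from the transcripts later in this post, and each node's /etc/hostname should hold its own name:

10.112.178.135  vm-10-112-178-135
10.112.178.141  vm-10-112-178-141
10.112.178.142  vm-10-112-178-142
10.112.178.143  vm-10-112-178-143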

3. Install EPEL

    [root@vm-10-112-178-135 gadmin]# yum install -y epel-release

Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Resolving Dependencies
--> Running transaction check
---> Package epel-release.noarch 0:7-9 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================
 Package            Arch      Version    Repository       Size
================================================================
Installing:
 epel-release       noarch    7-9        epel             14 k

Transaction Summary
================================================================
Install  1 Package

Total download size: 14 k
Installed size: 24 k
Downloading packages:
warning: /var/cache/yum/x86_64/7/epel/packages/epel-release-7-9.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 352c64e5: NOKEY
Public key for epel-release-7-9.noarch.rpm is not installed
epel-release-7-9.noarch.rpm                 |  14 kB  00:00:00
Retrieving key from http://file.idc.pub/os/epel/RPM-GPG-KEY-EPEL-7
Importing GPG key 0x352C64E5:
 Userid     : "Fedora EPEL (7)"
 Fingerprint: 91e9 7d7c 4a5e 96f1 7f3e 888f 6a2f aea2 352c 64e5
 From       : http://file.idc.pub/os/epel/RPM-GPG-KEY-EPEL-7
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : epel-release-7-9.noarch                      1/1
  Verifying  : epel-release-7-9.noarch                      1/1

Installed:
  epel-release.noarch 0:7-9

Complete!

    [root@vm-10-112-178-135 gadmin]# yum info epel-release

Loaded plugins: fastestmirror
Repository epel-testing is listed more than once in the configuration
Repository epel-testing-debuginfo is listed more than once in the configuration
Repository epel-testing-source is listed more than once in the configuration
Repository epel is listed more than once in the configuration
Repository epel-debuginfo is listed more than once in the configuration
Repository epel-source is listed more than once in the configuration
Loading mirror speeds from cached hostfile
Installed Packages
Name        : epel-release
Arch        : noarch
Version     : 7
Release     : 9
Size        : 24 k
Repo        : installed
From repo   : epel
Summary     : Extra Packages for Enterprise Linux repository configuration
URL         : http://download.fedoraproject.org/pub/epel
License     : GPLv2
Description : This package contains the Extra Packages for Enterprise Linux (EPEL) repository
            : GPG key as well as configuration for yum.

4. Manually add a ceph.repo yum repository file; the Ceph packages are downloaded from the repositories defined in it.

    [root@vm-10-112-178-135 gadmin]# vi /etc/yum.repos.d/ceph.repo

Copy the following contents into the file:

    [ceph]

    name=Ceph packages for $basearch

    baseurl=http://download.ceph.com/rpm-jewel/el7/$basearch

    enabled=1

    priority=1

    gpgcheck=1

    type=rpm-md

    gpgkey=http://download.ceph.com/keys/release.asc

    [ceph-noarch]

    name=Ceph noarch packages

    baseurl=http://download.ceph.com/rpm-jewel/el7/noarch

    enabled=1

    priority=1

    gpgcheck=1

    type=rpm-md

    gpgkey=http://download.ceph.com/keys/release.asc

    [ceph-x86_64]

    name=Ceph x86_64 packages

    baseurl=http://download.ceph.com/rpm-jewel/el7/x86_64

    enabled=0

    priority=1

    gpgcheck=1

    type=rpm-md

    gpgkey=http://download.ceph.com/keys/release.asc

    [ceph-aarch64]

name=Ceph aarch64 packages

    baseurl=http://download.ceph.com/rpm-jewel/el7/aarch64

    enabled=0

    priority=1

    gpgcheck=1

    type=rpm-md

    gpgkey=http://download.ceph.com/keys/release.asc

    [ceph-source]

    name=Ceph source packages

    baseurl=http://download.ceph.com/rpm-jewel/el7/SRPMS

    enabled=0

    priority=1

    gpgcheck=1

    type=rpm-md

    gpgkey=http://download.ceph.com/keys/release.asc

    [apache2-ceph-noarch]

    name=Apache noarch packages for Ceph

    baseurl=http://gitbuilder.ceph.com/ceph-rpm-centos7-x86_64-basic/ref/master/SRPMS

    enabled=1

    priority=2

    gpgcheck=1

    type=rpm-md

    gpgkey=http://download.ceph.com/keys/autobuild.asc

    [apache2-ceph-source]

    name=Apache source packages for Ceph

    baseurl=http://gitbuilder.ceph.com/ceph-rpm-centos7-x86_64-basic/ref/master/SRPMS

    enabled=0

    priority=2

    gpgcheck=1

    type=rpm-md

    gpgkey=http://download.ceph.com/keys/autobuild.asc
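To confirm yum can now see the new repositories, a quick sanity check (these two commands are an addition, not in the original post):

sudo yum clean all
sudo yum repolist    # the ceph and ceph-noarch repos should be listed as enabled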

5. Install the ceph-deploy tool

ceph-deploy is the official Ceph cluster deployment tool, written in Python.

[root@vm-10-112-178-135 gadmin]# sudo yum update && sudo yum install ceph-deploy

6. Check whether the NTP service is installed; it keeps the clocks synchronized, which matters because all the nodes talk to each other and must agree on the time. A sketch follows.
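A minimal sketch of installing and enabling NTP on each node, assuming the stock ntp package (the original post does not show these commands):

sudo yum -y install ntp ntpdate
sudo systemctl enable ntpd
sudo systemctl start ntpd
ntpq -p    # verify that time sources are reachable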

7. Install openssh-server and configure passwordless SSH. Note: openssh-server must be installed on all nodes.

    [root@vm-10-112-178-135 gadmin]# sudo yum install openssh-server

    [root@vm-10-112-178-135 gadmin]#

Add the deployment user:

    [root@vm-10-112-178-135 gadmin]# sudo useradd -d /home/cephadmin -m cephadmin

    [root@vm-10-112-178-135 gadmin]# passwd cephadmin

Grant the user passwordless sudo (root) privileges:

    [root@vm-10-112-178-135 gadmin]# echo "cephadmin ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephadmin

    cephadmin ALL = (root) NOPASSWD:ALL

    [root@vm-10-112-178-135 gadmin]# sudo chmod 0440 /etc/sudoers.d/cephadmin

    [cephadmin@vm-10-112-178-135 ~]$ ssh-keygen

    [cephadmin@vm-10-112-178-135 .ssh]$ cd ~/.ssh/

    [cephadmin@vm-10-112-178-135 .ssh]$ ll

total 8

-rw------- 1 cephadmin cephadmin 1679 Jul  5 20:19 id_rsa

-rw-r--r-- 1 cephadmin cephadmin  409 Jul  5 20:19 id_rsa.pub

    [cephadmin@vm-10-112-178-135 .ssh]$ vi config

Copy the key to each Ceph node, replacing {username} with the user name you created in the Create a Ceph Deploy User step:

    ssh-copy-id {username}@node1

    ssh-copy-id {username}@node2

    ssh-copy-id {username}@node3

    (Recommended) Modify the ~/.ssh/config file of your ceph-deploy admin node so that ceph-deploy can log in to Ceph nodes as the user you created without requiring you to specify --username {username} each time you execute ceph-deploy. This has the added benefit of streamlining ssh and scp usage. Replace {username} with the user name you created:

    Host node1

    Hostname node1

    User {username}

    Host node2

    Hostname node2

    User {username}

    Host node3

    Hostname node3

    User {username}

    Enable Networking On Bootup

    Ceph OSDs peer with each other and report to Ceph Monitors over the network. If networking is off by default, the Ceph cluster cannot come online during bootup until you enable networking.

    The default configuration on some distributions (e.g., CentOS) has the networking interface(s) off by default. Ensure that, during boot up, your network interface(s) turn(s) on so that your Ceph daemons can communicate over the network. For example, on Red Hat and CentOS, navigate to /etc/sysconfig/network-scripts and ensure that the ifcfg-{iface} file has ONBOOT set to yes.
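A quick way to check (a sketch; ifcfg-eth0 is a placeholder for whatever interface file your machine actually uses):

grep ONBOOT /etc/sysconfig/network-scripts/ifcfg-eth0    # expect ONBOOT=yes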

    [cephadmin@vm-10-112-178-135 my-cluster]$ cd ~/.ssh/

    [cephadmin@vm-10-112-178-135 .ssh]$ ll

total 12

-rw-rw-r-- 1 cephadmin cephadmin  171 Jul  5 20:26 config

-rw------- 1 cephadmin cephadmin 1679 Jul  5 20:19 id_rsa

-rw-r--r-- 1 cephadmin cephadmin  409 Jul  5 20:19 id_rsa.pub

    [cephadmin@vm-10-112-178-135 .ssh]$ vi config

    [cephadmin@vm-10-112-178-135 .ssh]$ cat config

    Host 10.112.178.141

    Hostname vm-10-112-178-141

    User cephadmin

    Host 10.112.178.142

    Hostname vm-10-112-178-142

    User cephadmin

    Host 10.112.178.143

    Hostname vm-10-112-178-143

    User cephadmin

    [cephadmin@vm-10-112-178-135 .ssh]$

    [cephadmin@vm-10-112-178-135 .ssh]$ sudo ssh-copy-id -i ~/.ssh/id_rsa.pub  cephadmin@vm-10-112-178-141

    The authenticity of host 'vm-10-112-178-141 (10.112.178.141)' can't be established.

    ECDSA key fingerprint is af:7a:19:1d:76:b7:9b:51:0c:88:3a:4c:33:d0:f8:a5.

    Are you sure you want to continue connecting (yes/no)? yes

    /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

    /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

    cephadmin@vm-10-112-178-141's password:

    Number of key(s) added: 1

    Now try logging into the machine, with:  "ssh 'cephadmin@vm-10-112-178-141'"

    and check to make sure that only the key(s) you wanted were added.

    [cephadmin@vm-10-112-178-135 .ssh]$ sudo ssh-copy-id -i ~/.ssh/id_rsa.pub  cephadmin@vm-10-112-178-142

    The authenticity of host 'vm-10-112-178-142 (10.112.178.142)' can't be established.

    ECDSA key fingerprint is af:7a:19:1d:76:b7:9b:51:0c:88:3a:4c:33:d0:f8:a5.

    Are you sure you want to continue connecting (yes/no)? yes

    /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

    /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

    cephadmin@vm-10-112-178-142's password:

    Number of key(s) added: 1

    Now try logging into the machine, with:  "ssh 'cephadmin@vm-10-112-178-142'"

    and check to make sure that only the key(s) you wanted were added.

    [cephadmin@vm-10-112-178-135 .ssh]$ sudo ssh-copy-id -i ~/.ssh/id_rsa.pub  cephadmin@vm-10-112-178-143

    The authenticity of host 'vm-10-112-178-143 (10.112.178.143)' can't be established.

    ECDSA key fingerprint is af:7a:19:1d:76:b7:9b:51:0c:88:3a:4c:33:d0:f8:a5.

    Are you sure you want to continue connecting (yes/no)? yes

    /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

    /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

    cephadmin@vm-10-112-178-143's password:

    Number of key(s) added: 1

    Now try logging into the machine, with:  "ssh 'cephadmin@vm-10-112-178-143'"

    and check to make sure that only the key(s) you wanted were added.

    [cephadmin@vm-10-112-178-135 .ssh]$ ssh cephadmin@10.112.178.141

    Bad owner or permissions on /home/cephadmin/.ssh/config
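The original log stops at this error. The standard fix (an addition here, not shown in the original) is to tighten the permissions on the config file so that only its owner can write it:

chmod 600 ~/.ssh/config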

    [cephadmin@vm-10-112-178-135 .ssh]$ sudo firewall-cmd --zone=public --add-service=ceph-mon --permanent

    FirewallD is not running

The firewall is already off, so we don't need to worry about it here.

    [cephadmin@vm-10-112-178-135 ~]$ mkdir my-cluster

    [cephadmin@vm-10-112-178-135 ~]$ cd my-cluster/

    [cephadmin@vm-10-112-178-135 my-cluster]$

    [cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy new vm-10-112-178-135
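ceph-deploy new writes an initial ceph.conf, a monitor keyring, and a log file into the current directory. Since this walkthrough ends up with only two OSDs, a common tweak at this point (my addition, following the upstream quick-start, not shown in the original post) is to lower the default replica count:

# append under the [global] section of my-cluster/ceph.conf
osd pool default size = 2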

Running ceph-deploy install against the nodes then timed out while downloading packages:

[vm-10-112-178-135][DEBUG ] Total download size: 59 M
[vm-10-112-178-135][DEBUG ] Installed size: 218 M
[vm-10-112-178-135][DEBUG ] Downloading packages:
[vm-10-112-178-135][WARNIN] No data was received after 300 seconds, disconnecting...
[vm-10-112-178-135][INFO  ] Running command: sudo ceph --version
[vm-10-112-178-135][ERROR ] Traceback (most recent call last):
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/process.py", line 119, in run
[vm-10-112-178-135][ERROR ]     reporting(conn, result, timeout)
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/log.py", line 13, in reporting
[vm-10-112-178-135][ERROR ]     received = result.receive(timeout)
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/lib/vendor/execnet/gateway_base.py", line 704, in receive
[vm-10-112-178-135][ERROR ]     raise self._getremoteerror() or EOFError()
[vm-10-112-178-135][ERROR ] RemoteError: Traceback (most recent call last):
[vm-10-112-178-135][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/lib/vendor/execnet/gateway_base.py", line 1036, in executetask
[vm-10-112-178-135][ERROR ]     function(channel, **kwargs)
[vm-10-112-178-135][ERROR ]   File "", line 12, in _remote_run

    [vm-10-112-178-135][ERROR ]  File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__

    [vm-10-112-178-135][ERROR ]    errread, errwrite)

    [vm-10-112-178-135][ERROR ]  File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child

    [vm-10-112-178-135][ERROR ]    raise child_exception

    [vm-10-112-178-135][ERROR ] OSError: [Errno 2] No such file or directory

    [vm-10-112-178-135][ERROR ]

    [vm-10-112-178-135][ERROR ]

    [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: ceph --version

The cause: the network was slow enough to hit the 5-minute timeout. Possible fixes (a local-mirror sketch follows this list):

1. Install Ceph on each node beforehand with sudo yum -y install ceph.

2. If there are many nodes, simply rerun the command a few more times.

3. The best solution is to set up a local package mirror.
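A rough sketch of building such a local mirror with standard CentOS tooling; the /opt/ceph-repo path and these exact commands are my assumptions, not from the original post (the --local-mirror form itself appears further below):

sudo yum -y install yum-utils createrepo
reposync --repoid=ceph --repoid=ceph-noarch -p /opt/ceph-repo    # pull the packages once over the slow link
createrepo /opt/ceph-repo/ceph           # generate repo metadata for each synced repo
createrepo /opt/ceph-repo/ceph-noarch
# afterwards: ceph-deploy install {node} --local-mirror=/opt/ceph-repo --no-adjust-repos --release=jewel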

    [cephadmin@vm-10-112-178-135 my-cluster]$ sudo yum install -y ceph

    [cephadmin@vm-10-112-178-135 my-cluster]$ sudo yum -y install ceph-radosgw

Running ceph-deploy install against vm-10-112-178-141 then failed on sudo:

[ceph_deploy.cli][INFO  ]  dev                          : master

    [ceph_deploy.cli][INFO  ]  nogpgcheck                    : False

    [ceph_deploy.cli][INFO  ]  local_mirror                  : None

    [ceph_deploy.cli][INFO  ]  release                      : None

    [ceph_deploy.cli][INFO  ]  install_mon                  : False

    [ceph_deploy.cli][INFO  ]  gpg_url                      : None

    [ceph_deploy.install][DEBUG ] Installing stable version jewel on cluster ceph hosts vm-10-112-178-141

    [ceph_deploy.install][DEBUG ] Detecting platform for host vm-10-112-178-141 ...

    [vm-10-112-178-141][DEBUG ] connection detected need for sudo

sudo: sorry, you must have a tty to run sudo

    [ceph_deploy][ERROR ] RuntimeError: connecting to host: vm-10-112-178-141 resulted in errors: IOError cannot send (already closed?)

    [cephadmin@vm-10-112-178-135 my-cluster]$

Solution:

[Summary] Running a sudo command on a remote server over ssh fails with:

sudo: sorry, you must have a tty to run sudo

A web search turned up the fix: edit /etc/sudoers and comment out the Defaults requiretty line.

    sudo vi /etc/sudoers

#Defaults requiretty    # the Defaults requiretty line, commented out

Concretely, as a one-liner:

    sudo sed -i 's/Defaults    requiretty/#Defaults    requiretty/g' /etc/sudoers

    sudo cat /etc/sudoers | grep requiretty

Because the HTTP proxy was too slow, Ceph had to be installed manually on each VM node first, using the commands below, before rerunning ceph-deploy install:

    sudo yum -y install ceph

    sudo yum -y install ceph-radosgw

    [cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy install vm-10-112-178-135 vm-10-112-178-141 vm-10-112-178-142 vm-10-112-178-143

The general form, including the local-mirror options:

ceph-deploy install {ceph-node} [{ceph-node} ...] --local-mirror=/opt/ceph-repo --no-adjust-repos --release=jewel

    [cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy osd prepare vm-10-112-178-142:/var/local/osd0 vm-10-112-178-143:/var/local/osd1

    [vm-10-112-178-142][WARNIN] 2017-07-06 16:19:33.928816 7f2940a11800 -1  ** ERROR: error creating empty object store in /var/local/osd0: (13) Permission denied

    [vm-10-112-178-142][WARNIN]

    [vm-10-112-178-142][ERROR ] RuntimeError: command returned non-zero exit status: 1

    [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /var/local/osd0

Solution:

The /var/local/osd0 and /var/local/osd1 directories on those machines lacked the required permissions; grant access with chmod 777 on each directory (but see the note below).
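For the record, chmod 777 works but is wide open. A tighter alternative (my suggestion, not from the original post): jewel-era Ceph daemons run as the ceph user, so handing that user ownership is enough:

sudo chown ceph:ceph /var/local/osd0    # on vm-10-112-178-142
sudo chown ceph:ceph /var/local/osd1    # on vm-10-112-178-143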

    [cephadmin@vm-10-112-178-135 my-cluster]$ ceph-deploy osd activate vm-10-112-178-142:/var/local/osd0 vm-10-112-178-143:/var/local/osd1
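With the OSDs activated, a quick sanity check (a sketch added here; it assumes the monitor was brought up earlier with ceph-deploy mon create-initial, a step the original log does not show):

ceph-deploy admin vm-10-112-178-135 vm-10-112-178-142 vm-10-112-178-143
sudo chmod +r /etc/ceph/ceph.client.admin.keyring
ceph health    # expect HEALTH_OK once both OSDs are up and in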
