
Postgres-XL Installation

Author: CocoAdapter | Published 2018-10-28 22:13

    Standard Architecture

    Layout:
    In a standard deployment, the GTM and the GTM slave run on two separate hosts; each of the remaining hosts runs a GTM proxy, a Coordinator and a Datanode, with each pair of hosts holding the other's Datanode slave. Alternatively, spread the Datanodes and Datanode slaves across more hosts for better HA and lower I/O pressure (the concrete layout used in this article is sketched below).
    HA:
    Postgres-XL does not currently perform automatic master/standby failover internally; if a component fails, the switchover has to be done manually. Automatic failover can be added with external tooling (still to be investigated).
    Hardware requirements:
    The GTM needs little I/O and mainly consumes CPU and memory (actual load still to be measured). Coordinators parse, dispatch and aggregate SQL queries, so they need CPU and memory. Datanodes are mostly I/O-bound.
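
    For concreteness, the layout used in the rest of this article (it matches the pgxc_ctl.conf shown below) is roughly:

    node1: GTM master, gtm_pxy1, coord1, datanode1 master, datanode2 slave
    node2: GTM slave,  gtm_pxy2, coord2, datanode2 master, datanode1 slave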

    Test Environment

    The test environment consists of three CentOS 7.x hosts: two are virtual machines and one is an Alibaba Cloud instance, used to test adding and removing nodes dynamically.
    The specifications are as follows:


    [Screenshots in the original post: virtual machine and Alibaba Cloud specifications]

    Setting Up

    1 VMware Network Settings

    Open VMware's Virtual Network Editor and note the key values for NAT mode:

    • Starting IP address
    • Subnet mask
    • Ending IP address
    [Screenshots in the original post: gateway information and IP information]

    2 Installing CentOS

    VMware can clone virtual machines, so we only need to set up one host, clone it, and adjust the clone afterwards.

    2.1 Installation Options

    No GUI is needed: pick the minimal install and add packages later as required. Set the root password and do not add any extra users.

    2.2 Configuring the VM Network

    Normally the network will not be up yet, and even when it is, it still needs to be configured. CentOS 7.x has switched to the ip command and ifconfig may not be installed at all; check the IP address with:

    # as root
    ip address
    

    At this point the host may have a DHCP-assigned address or no address at all; the main thing is to note which network interface is yours, since the name can differ, for example:


    [Screenshots in the original post: interface names on VMware and on Alibaba Cloud]

    Then edit the interface configuration file:

    # as root
    vi /etc/sysconfig/network-scripts/ifcfg-<interface name>
    
    # Edit or add the following lines; the values depend on what the Virtual Network Editor shows. The DNS servers here are Alibaba's public DNS.
    BOOTPROTO=static
    ONBOOT=yes
    IPADDR=192.168.209.131
    NETMASK=255.255.255.0
    GATEWAY=192.168.209.2
    DNS1=223.5.5.5
    DNS2=223.6.6.6
    

    Then restart the network service:

    # as root
    service network restart
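
    To sanity-check the network after the restart (the addresses are the ones from the example above):

    ip address show            # the static IP should now appear on the interface
    ping -c 3 192.168.209.2    # the gateway configured above
    ping -c 3 223.5.5.5        # an external DNS server, to confirm outbound connectivity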
    

    2.3 Firewall Settings

    Ideally, define firewall rules for the connections the cluster needs; otherwise the firewall can simply be disabled.

    # as root
    systemctl stop firewalld.service
    systemctl disable firewalld.service
    
    # configure SELinux
    vi /etc/selinux/config
    # set SELINUX=disabled
    SELINUX=disabled
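
    If you would rather keep firewalld running, one alternative is to open only the ports this article's cluster uses (20001-20013/tcp in the pgxc_ctl.conf below), for example:

    # as root
    firewall-cmd --permanent --add-port=20001-20013/tcp
    firewall-cmd --reload
    # put SELinux into permissive mode right away; the config change above only takes effect after a reboot
    setenforce 0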
    

    2.4 Editing the hosts File

    Edit the hosts file according to the cluster plan:

    # as root
    vi /etc/hosts
    # add the following entries
    192.168.209.131 node1
    192.168.209.132 node2
    xxx.xxx.xxx.xxx aliyun1
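
    A quick check that the names resolve as planned:

    getent hosts node1 node2 aliyun1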
    
    

    2.5 Creating the User and Configuring SSH

    # as root
    adduser pgxl
    passwd pgxl
    # enter a password
    

    Consider granting the pgxl user sudo privileges:

    # as root
    sudo visudo
    # add the following line
    %pgxl ALL=(ALL) ALL
    

    Switch to the pgxl user and prefix commands with sudo where elevated privileges are needed. Unless stated otherwise, all commands below are run as the pgxl user.

    The ssh packages may not even be installed; if not, install them first. The minimal CentOS image is missing many common tools such as rsync, netstat and telnet; check for them and install whatever is missing.

    Set up passwordless SSH login:

    su pgxl
    mkdir ~/.ssh
    chmod 700 ~/.ssh
    ssh-keygen -t rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
    # On the VM, skip the following command for now and run it after cloning.
    # In fact, because the clones end up with identical keys, it can be skipped entirely;
    # it is shown here as the normal deployment procedure.
    # nodex is a hostname placeholder and must exist in the hosts file
    scp -p ~/.ssh/authorized_keys pgxl@nodex:~/.ssh/
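
    Once the key is in place on the other hosts (after cloning, in the VM case), verify that passwordless login works, for example:

    # should print the remote hostname without asking for a password
    ssh pgxl@node2 hostname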
    

    3 Installing Postgres-XL

    3.1 Download and Build

    sudo yum update
    sudo yum install gcc gcc-c++ kernel-devel readline-devel flex bison bison-devel zlib zlib-devel make docbook-style-dsssl jade
    # Download a Postgres-XL source tarball and extract it; assume the extracted directory is named postgres-xl
    cd postgres-xl
    ./configure --prefix=/usr/local/pgxl/
    make
    sudo make install
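
    To confirm the binaries landed under the prefix chosen above:

    ls /usr/local/pgxl/bin              # should list initdb, postgres, psql and the other server programs
    /usr/local/pgxl/bin/psql --version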
    

    3.2 Environment Variables

    Strictly speaking, the variables should go into pgxl's .bashrc, but if you keep hitting errors like bash: xx command not found and cannot track down the cause, they can be written into /etc/environment directly.
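
    A minimal sketch of the variables to set, assuming the /usr/local/pgxl prefix used above (in /etc/environment, drop the export keyword and spell out the full paths, since that file is not expanded by a shell):

    # append to /home/pgxl/.bashrc
    export PGHOME=/usr/local/pgxl
    export PATH=$PGHOME/bin:$PATH
    export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH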

    3.3 Generating and Editing the Configuration File

    pgxc_ctl can run in interactive mode or take its commands as arguments:

    pgxc_ctl prepare
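
    pgxc_ctl keeps its working files under $HOME/pgxc_ctl by default; roughly, the two modes look like this (-c points at a specific configuration file):

    # interactive mode: type commands such as prepare, init all or monitor all at the PGXC prompt
    pgxc_ctl
    # one-shot mode with an explicit configuration file
    pgxc_ctl -c $HOME/pgxc_ctl/pgxc_ctl.conf prepare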
    

    Edit pgxc_ctl.conf; the settings changed from the generated template are as follows:

    #---- GTM ------------------------------------------------------------------------------------
     
    # GTM is mandatory.  You must have at least (and only) one GTM master in your Postgres-XC cluster.
    # If GTM crashes and you need to reconfigure it, you can do it by pgxc_update_gtm command to update
    # GTM master with others.   Of course, we provide pgxc_remove_gtm command to remove it.  This command
    # will not stop the current GTM.  It is up to the operator.
     
    #---- Overall -------
    gtmName=gtm
     
    #---- GTM Master -----------------------------------------------
     
    #---- Overall ----
    gtmMasterServer=node1
    gtmMasterPort=20001
    gtmMasterDir=/data1/pgxc/nodes/gtm
     
    #---- Configuration ---
    gtmExtraConfig=none         # Will be added gtm.conf for both Master and Slave (done at initilization only)
    gtmMasterSpecificExtraConfig=none   # Will be added to Master's gtm.conf (done at initialization only)
     
    #---- GTM Slave -----------------------------------------------
     
    # Because GTM is a key component to maintain database consistency, you may want to configure GTM slave
    # for backup.
     
    #---- Overall ------
    gtmSlave=y                  # Specify y if you configure GTM Slave.   Otherwise, GTM slave will not be configured and
                                # all the following variables will be reset.
    gtmSlaveServer=node2        # value none means GTM slave is not available.  Give none if you don't configure GTM Slave.
    gtmSlavePort=20001          # Not used if you don't configure GTM slave.
    gtmSlaveDir=/data1/pgxc/nodes/gtm   # Not used if you don't configure GTM slave.
    # Please note that when you have GTM failover, then there will be no slave available until you configure the slave
    # again. (pgxc_add_gtm_slave function will handle it)
     
    #---- Configuration ----
    gtmSlaveSpecificExtraConfig=none # Will be added to Slave's gtm.conf (done at initialization only)
     
    #---- GTM Proxy -------------------------------------------------------------------------------------------------------
    # GTM proxy will be selected based upon which server each component runs on.
    # When fails over to the slave, the slave inherits its master's gtm proxy.  It should be
    # reconfigured based upon the new location.
    #
    # To do so, slave should be restarted.   So pg_ctl promote -> (edit postgresql.conf and recovery.conf) -> pg_ctl restart
    #
    # You don't have to configure GTM Proxy if you dont' configure GTM slave or you are happy if every component connects
    # to GTM Master directly.  If you configure GTL slave, you must configure GTM proxy too.
     
    #---- Shortcuts ------
    gtmProxyDir=/data1/pgxc/nodes/gtm_pxy
     
    #---- Overall -------
    gtmProxy=y              # Specify y if you conifugre at least one GTM proxy.   You may not configure gtm proxies
                            # only when you dont' configure GTM slaves.
                            # If you specify this value not to y, the following parameters will be set to default empty values.
                            # If we find there're no valid Proxy server names (means, every servers are specified
                            # as none), then gtmProxy value will be set to "n" and all the entries will be set to
                            # empty values.
    gtmProxyNames=(gtm_pxy1 gtm_pxy2)   # No used if it is not configured
    gtmProxyServers=(node1 node2)           # Specify none if you dont' configure it.
    gtmProxyPorts=(20002 20002)             # Not used if it is not configured.
    gtmProxyDirs=($gtmProxyDir $gtmProxyDir)    # Not used if it is not configured.
     
    #---- Configuration ----
    gtmPxyExtraConfig=none      # Extra configuration parameter for gtm_proxy.  Coordinator section has an example.
    gtmPxySpecificExtraConfig=(none none)
     
    #---- Coordinators ----------------------------------------------------------------------------------------------------
     
    #---- shortcuts ----------
    coordMasterDir=/data1/pgxc/nodes/coord
    coordSlaveDir=/data1/pgxc/nodes/coord_slave
    coordArchLogDir=/data1/pgxc/nodes/coord_archlog
     
    #---- Overall ------------
    coordNames=(coord1 coord2)      # Master and slave use the same name
    coordPorts=(20004 20004)            # Master and slave use the same port
    poolerPorts=(20010 20010)           # Master and slave use the same pooler port
    coordPgHbaEntries=(192.168.209.0/24)               # Assumes that all the coordinator (master/slave) accepts
                                                    # the same connection
                                                    # This entry allows only $pgxcOwner to connect.
                                                    # If you'd like to setup another connection, you should
                                                    # supply these entries through files specified below.
    # Note: The above parameter is extracted as "host all all 0.0.0.0/0 trust".   If you don't want
    # such setups, specify the value () to this variable and suplly what you want using coordExtraPgHba
    # and/or coordSpecificExtraPgHba variables.
     
    #---- Master -------------
    coordMasterServers=(node1 node2)        # none means this master is not available
    coordMasterDirs=($coordMasterDir $coordMasterDir)
    coordMaxWALsernder=5    # max_wal_senders: needed to configure slave. If zero value is specified,
                            # it is expected to supply this parameter explicitly by external files
                            # specified in the following.   If you don't configure slaves, leave this value to zero.
    coordMaxWALSenders=($coordMaxWALsernder $coordMaxWALsernder)
                            # max_wal_senders configuration for each coordinator.
     
    #---- Slave -------------
    coordSlave=n 
     
    #---- Configuration files---
    # Need these when you'd like setup specific non-default configuration 
    # These files will go to corresponding files for the master.
    # You may supply your bash script to setup extra config lines and extra pg_hba.conf entries 
    # Or you may supply these files manually.
    coordExtraConfig=coordExtraConfig   # Extra configuration file for coordinators.  
                            # This file will be added to all the coordinators'
                            # postgresql.conf
    # Pleae note that the following sets up minimum parameters which you may want to change.
    # You can put your postgresql.conf lines here.
    cat > $coordExtraConfig <<EOF
    #================================================
    # Added to all the coordinator postgresql.conf
    # Original: $coordExtraConfig
    log_destination = 'stderr'
    logging_collector = on
    log_directory = 'pg_log'
    listen_addresses = '*'
    max_connections = 100
    EOF
     
    # Additional Configuration file for specific coordinator master.
    # You can define each setting by similar means as above.
    coordSpecificExtraConfig=(none none none none)
    coordExtraPgHba=none    # Extra entry for pg_hba.conf.  This file will be added to all the coordinators' pg_hba.conf
    coordSpecificExtraPgHba=(none none none none)
     
    #----- Additional Slaves -----
    #
    # Please note that this section is just a suggestion how we extend the configuration for
    # multiple and cascaded replication.   They're not used in the current version.
    #
    coordAdditionalSlaves=n     # Additional slave can be specified as follows: where you
    coordAdditionalSlaveSet=(cad1)      # Each specifies set of slaves.   This case, two set of slaves are
                                                # configured
    cad1_Sync=n             # All the slaves at "cad1" are connected with asynchronous mode.
                                # If not, specify "y"
                                # The following lines specifies detailed configuration for each
                                # slave tag, cad1.  You can define cad2 similarly.
    cad1_Servers=(node08 node09 node06 node07)  # Hosts
    cad1_dir=/data1/pgxc/nodes/coord_slave_cad1
    cad1_Dirs=($cad1_dir $cad1_dir $cad1_dir $cad1_dir)
    cad1_ArchLogDir=/data1/pgxc/nodes/coord_archlog_cad1
    cad1_ArchLogDirs=($cad1_ArchLogDir $cad1_ArchLogDir $cad1_ArchLogDir $cad1_ArchLogDir)
     
     
    #---- Datanodes -------------------------------------------------------------------------------------------------------
     
    #---- Shortcuts --------------
    datanodeMasterDir=/data1/pgxc/nodes/dn_master
    datanodeSlaveDir=/data1/pgxc/nodes/dn_slave
    datanodeArchLogDir=/data1/pgxc/nodes/datanode_archlog
     
    #---- Overall ---------------
    #primaryDatanode=datanode1              # Primary Node.
    # At present, xc has a priblem to issue ALTER NODE against the primay node.  Until it is fixed, the test will be done
    # without this feature.
    primaryDatanode=datanode1               # Primary Node.
    datanodeNames=(datanode1 datanode2)
    datanodePorts=(20008 20009) # Master and slave use the same port!
    datanodePoolerPorts=(20012 20013)   # Master and slave use the same port!
    datanodePgHbaEntries=(192.168.209.0/24)    # Assumes that all the coordinator (master/slave) accepts
                                            # the same connection
                                            # This list sets up pg_hba.conf for $pgxcOwner user.
                                            # If you'd like to setup other entries, supply them
                                            # through extra configuration files specified below.
    # Note: The above parameter is extracted as "host all all 0.0.0.0/0 trust".   If you don't want
    # such setups, specify the value () to this variable and suplly what you want using datanodeExtraPgHba
    # and/or datanodeSpecificExtraPgHba variables.
     
    #---- Master ----------------
    datanodeMasterServers=(node1 node2) # none means this master is not available.
                                                        # This means that there should be the master but is down.
                                                        # The cluster is not operational until the master is
                                                        # recovered and ready to run.   
    datanodeMasterDirs=($datanodeMasterDir $datanodeMasterDir)
    datanodeMaxWalSender=5                              # max_wal_senders: needed to configure slave. If zero value is 
                                                        # specified, it is expected this parameter is explicitly supplied
                                                        # by external configuration files.
                                                        # If you don't configure slaves, leave this value zero.
    datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender)
                            # max_wal_senders configuration for each datanode
     
    #---- Slave -----------------
    datanodeSlave=y         # Specify y if you configure at least one coordiantor slave.  Otherwise, the following
                            # configuration parameters will be set to empty values.
                            # If no effective server names are found (that is, every servers are specified as none),
                            # then datanodeSlave value will be set to n and all the following values will be set to
                            # empty values.
    datanodeSlaveServers=(node2 node1)  # value none means this slave is not available
    datanodeSlavePorts=(20009 20008)
    datanodeSlavePoolerPorts=(20013 20012)
    datanodeSlaveSync=y     # If datanode slave is connected in synchronized mode
    datanodeSlaveDirs=($datanodeSlaveDir $datanodeSlaveDir)
    datanodeArchLogDirs=( $datanodeArchLogDir $datanodeArchLogDir)
    

    3.4 Initialization

    Run the cluster initialization; when it completes, it starts every component automatically. If it reports errors, the cluster will not run until every error is resolved.
    The key things to check are whether each setting in the conf file is correct, whether the environment variables are correct, and whether the required tools are installed (install whatever is missing).

    pgxc_ctl init all
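
    Once initialization succeeds, a few other pgxc_ctl subcommands are useful for day-to-day operation (a sketch; see the pgxc_ctl reference for the full list):

    pgxc_ctl monitor all    # check whether every GTM, proxy, coordinator and datanode is running
    pgxc_ctl stop all       # stop the whole cluster
    pgxc_ctl start all      # start it again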
    

    At this point the cluster installation is complete.

    Testing

    Connect on any node that runs a Coordinator, using the port configured earlier: 20004 is not PostgreSQL's default port, and the default database name matches the current user name, so both must be given explicitly. Because Postgres-XL was installed as the pgxl user, pgxc_ctl has already created a pgxl database role with the necessary privileges, as can be seen in the initialization log.

    psql -U pgxl -p 20004 -d postgres
    

    After logging in, inspect the cluster topology:

    select oid, * from pgxc_node;
    

    Create a distributed table and insert some test data:

    create table test (col1 int, col2 text);
    insert into test select generate_series(1, 100), 'test';
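
    When no distribution clause is given, Postgres-XL picks one automatically (typically hash on the first column with a hashable type). It can also be chosen explicitly; the table names below are just illustrative:

    create table test_hash (col1 int, col2 text) distribute by hash(col1);
    create table test_repl (col1 int, col2 text) distribute by replication;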
    

    Check how the rows are distributed across the datanodes:

    SELECT xc_node_id, count(*) FROM test GROUP BY xc_node_id;
    
