Clickhouse on K8s

Author: john瀚 | Published 2020-12-10 15:02

    1. Install clickhouse-operator

    ClickHouse Operator creates, configures and manages ClickHouse clusters running on Kubernetes.

    kubectl apply -f https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator/clickhouse-operator-install.yaml
    

    1.1 Verify clickhouse-operator

    kubectl -n kube-system get pod | grep clickhouse-operator
    

    If the pod status is Running, clickhouse-operator was deployed successfully. Its logs can be viewed with the command below (the pod name suffix will differ in your deployment):

    kubectl -n kube-system logs -f clickhouse-operator-5b45484748-kpg6t clickhouse-operator 
    

    2. Deploy the cluster

    2.1 Deployment architecture

    As shown in the figure below, we will deploy a cluster of 2 shards with 2 replicas each, i.e. four pods in total. Each pod stores its data on a local PV, so four machines are needed.


    [Figure: deployment architecture, 2 shards x 2 replicas spread across four nodes]

    2.2 Deploy the cluster

    The manifests below contain two parts (how to apply them is sketched after the YAML):

    • The local PV YAML; note that it pins the volumes to four specific machines.
    • The ClickHouseInstallation (CHI) YAML.
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: clickhouse-local-volume
    provisioner: kubernetes.io/no-provisioner
    volumeBindingMode: WaitForFirstConsumer
    
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv-clickhouse-0
    spec:
      capacity:
        storage: 100Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: clickhouse-local-volume
      hostPath:
        path: /mnt/data/clickhouse
        type: DirectoryOrCreate
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - "clickhouse1"
    
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv-clickhouse-1
    spec:
      capacity:
        storage: 100Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: clickhouse-local-volume
      hostPath:
        path: /mnt/data/clickhouse
        type: DirectoryOrCreate
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - "clickhouse2"
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv-clickhouse-2
    spec:
      capacity:
        storage: 100Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: clickhouse-local-volume
      hostPath:
        path: /mnt/data/clickhouse
        type: DirectoryOrCreate
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - "clickhouse3"
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv-clickhouse-3
    spec:
      capacity:
        storage: 100Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: clickhouse-local-volume
      hostPath:
        path: /mnt/data/clickhouse
        type: DirectoryOrCreate
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - "clickhouse4"
    ---
    apiVersion: "clickhouse.altinity.com/v1"
    kind: "ClickHouseInstallation"
    metadata:
      name: "aibee"
    spec:
      defaults:
        templates:
          serviceTemplate: service-template
          podTemplate: pod-template
          dataVolumeClaimTemplate: volume-claim
      configuration:
        settings:
          compression/case/method: zstd
          disable_internal_dns_cache: 1
          timezone: Asia/Shanghai
        zookeeper:
          nodes:
            - host: zk-svc
              port: 2181
          session_timeout_ms: 30000
          operation_timeout_ms: 10000
        clusters:
          - name: "clickhouse"
            layout:
              shardsCount: 2
              replicasCount: 2
      templates:
        serviceTemplates:
          - name: service-template
            spec:
              ports:
                - name: http
                  port: 8123
                - name: tcp
                  port: 9000
              type: LoadBalancer
    
        podTemplates:
          - name: pod-template
            spec:
              containers:
                - name: clickhouse
                  imagePullPolicy: Always
                  image: yandex/clickhouse-server:latest
                  volumeMounts:
                    # mount the data directory
                    - name: volume-claim
                      mountPath: /var/lib/clickhouse
                    # mount the log directory
                    - name: volume-claim
                      mountPath: /var/log/clickhouse-server
                  resources:
                    # CPU and memory sizes
                    limits:
                      memory: "1Gi"
                      cpu: "1"
                    requests:
                      memory: "1Gi"
                      cpu: "1"
    
        volumeClaimTemplates:
          - name: volume-claim
            reclaimPolicy: Retain
            spec:
              storageClassName: "clickhouse-local-volume"
              accessModes:
                - ReadWriteOnce
              resources:
                # requested PV size
                requests:
                  storage: 100Gi
    
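    Assuming the two manifests above are saved together as clickhouse-cluster.yaml (the filename is arbitrary), they can be applied and the rollout watched like this:

    # apply the local PVs and the CHI in one shot
    kubectl apply -f clickhouse-cluster.yaml
    # "chi" is the short name the operator registers for ClickHouseInstallation
    kubectl get chi aibee
    # all four pods should eventually reach Running, one per node
    kubectl get pods -o wide | grep chi-aibee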

    Note: reclaimPolicy in volumeClaimTemplates must be Retain, so that the data is kept even if the cluster is deleted. Otherwise, deleting the cluster drops every table whose engine matches 'Replicated%'. This bit me for a long time. The relevant operator source:

    // hostGetDropTables returns set of 'DROP TABLE ...' SQLs
    func (s *Schemer) hostGetDropTables(host *chop.ChiHost) ([]string, []string, error) {
       // There isn't a separate query for deleting views. To delete a view, use DROP TABLE
       // See https://clickhouse.yandex/docs/en/query_language/create/
       sql := heredoc.Doc(`
          SELECT
             distinct name, 
             concat('DROP TABLE IF EXISTS "', database, '"."', name, '"') AS drop_db_query
          FROM system.tables
          WHERE engine like 'Replicated%'`,
       )
    
       names, sqlStatements, _ := s.getObjectListFromClickHouse([]string{CreatePodFQDN(host)}, sql)
       return names, sqlStatements, nil
    }
    

    Pod distribution after a successful deployment (pods are named chi-<installation>-<cluster>-<shard>-<replica>-0):

    chi-aibee-clickhouse-0-0-0       1/1     Running        0          20m     192.168.35.196    clickhouse3   <none>           <none>
    chi-aibee-clickhouse-0-1-0       1/1     Running        0          20m     192.168.132.103   clickhouse2   <none>           <none>
    chi-aibee-clickhouse-1-0-0       1/1     Running        0          20m     192.168.13.41     clickhouse4   <none>           <none>
    chi-aibee-clickhouse-1-1-0       1/1     Running        0          19m     192.168.133.164   clickhouse1   <none>           <none>
    

    2.3 Check the service address

    kubectl get svc clickhouse-aibee
    
    NAME               TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                         AGE
    clickhouse-aibee   LoadBalancer   10.100.185.34   <pending>     8123:30745/TCP,9000:32346/TCP   22m
    

    2.4 Connect to the cluster

    Use the ClusterIP of the service above (the EXTERNAL-IP stays <pending> unless a LoadBalancer provider is available). The default credentials are clickhouse_operator/clickhouse_operator_password:

    clickhouse-client -h 10.100.185.34 -u clickhouse_operator --password clickhouse_operator_password 
    
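    Once connected, a quick sanity check is to list the topology the operator generated; the cluster name clickhouse comes from the CHI spec above:

    SELECT cluster, shard_num, replica_num, host_name
    FROM system.clusters
    WHERE cluster = 'clickhouse';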

    For more customization options, see: https://github.com/Altinity/clickhouse-operator/blob/master/docs/custom_resource_explained.md

    2.5 Built-in macros

    The operator provides a set of macros:

    1. {installation} -- ClickHouse Installation name
    2. {cluster} -- primary cluster name
    3. {replica} -- replica name in the cluster, maps to pod service name
    4. {shard} -- shard id

    ClickHouse also supports the internal macros {database} and {table}, which map to the current database and table respectively.

    The following shows the macros that were generated automatically for this cluster; they can be referenced when creating tables.

    <yandex>
        <macros>
            <installation>aibee</installation>
            <all-sharded-shard>0</all-sharded-shard>
            <cluster>clickhouse</cluster>
            <shard>0</shard>
            <replica>chi-aibee-clickhouse-0-0</replica>
        </macros>
    </yandex>
    
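    To check which values a particular replica actually resolves, you can query the system.macros table from inside that pod:

    SELECT macro, substitution FROM system.macros;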

    3. Create tables

    CREATE TABLE events_local on cluster '{cluster}' (
        event_date  Date,
        event_type  Int32,
        article_id  Int32,
        title       String
    ) engine=ReplicatedMergeTree('/clickhouse/{installation}/{cluster}/tables/{shard}/{database}/{table}', '{replica}', event_date, (event_type, article_id), 8192);
    
    CREATE TABLE events on cluster '{cluster}' AS events_local
    ENGINE = Distributed('{cluster}', default, events_local, rand());
    
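    Note that the engine arguments above use the legacy inline syntax (partition date column, sorting key, index granularity). On recent ClickHouse versions the same table would typically be declared with explicit PARTITION BY/ORDER BY clauses; a sketch, with the ZooKeeper path unchanged:

    CREATE TABLE events_local ON CLUSTER '{cluster}' (
        event_date  Date,
        event_type  Int32,
        article_id  Int32,
        title       String
    ) ENGINE = ReplicatedMergeTree('/clickhouse/{installation}/{cluster}/tables/{shard}/{database}/{table}', '{replica}')
    PARTITION BY toYYYYMM(event_date)
    ORDER BY (event_type, article_id);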

    3.1 Insert data

    INSERT INTO events SELECT today(), rand()%3, number, 'my title' FROM numbers(100);
    

    3.2 Query data

    The Distributed table events returns the total row count across all shards, while events_local only counts the rows stored on the shard you are connected to.

    SELECT count() FROM events;
    SELECT count() FROM events_local;
    
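    To see how the 100 rows were spread across the shards, you can also aggregate per host with the cluster table function (the literal cluster name clickhouse is used here, since macro substitution is not assumed to apply inside table functions):

    SELECT hostName() AS host, count() AS rows
    FROM cluster('clickhouse', default, events_local)
    GROUP BY host;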

    4. Cluster monitoring

    clickhouse-operator ships with a built-in metrics exporter. The following command shows the monitoring service:

    kubectl get service clickhouse-operator-metrics -n kube-system
    
    NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
    clickhouse-operator-metrics   ClusterIP   10.102.111.74   <none>        8888/TCP   48d
    

    Prometheus can scrape metrics from this address:
    http://<service/clickhouse-operator-metrics>:8888/metrics
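    A minimal scrape_configs entry for this, assuming Prometheus runs inside the same cluster and can resolve the service by its DNS name:

    scrape_configs:
      - job_name: 'clickhouse-operator'
        static_configs:
          - targets: ['clickhouse-operator-metrics.kube-system.svc:8888']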

    Grafana Dashboard

    https://github.com/Altinity/clickhouse-operator/blob/master/grafana-dashboard/Altinity_ClickHouse_Operator_dashboard.json
    

    For more details, see https://github.com/Altinity/clickhouse-operator/blob/master/docs/prometheus_setup.md

    Appendix:
    https://github.com/Altinity/clickhouse-operator
