Kubernetes EFK in Practice - Fluent Bit &

Author: cxj_hit | Published: 2018-06-11 14:15

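    Preparation

    Environment Planning

    With the ElasticSearch cluster from the previous article in place, we can now set up log collection. The two most widely recommended collectors are Logstash and Fluentd. In this article we use components from the Fluent ecosystem, which is also what the official Kubernetes add-on uses: Fluent Bit and Fluentd.
    The components are: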

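    Component      Purpose
    Fluent Bit     Runs on every host and collects the container logs on that host. (Fluent Bit is newer; it uses fewer resources and performs somewhat better than Fluentd, but its stability still needs to improve.)
    Fluentd        Two roles: (1) runs as the log aggregation server, deployed as a Deployment; (2) on hosts where Fluent Bit cannot run properly, runs as a DaemonSet to collect the host's logs and send them to the log aggregation server.
    ElasticSearch  Receives logs from the log aggregation server, which are then analyzed and displayed through Kibana. Because hardware resources are limited, only about one week of data is kept.
    Amazon S3      Receives logs from the log aggregation server for compressed archiving; the data can also be analyzed further later with Spark.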
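
    The table says ElasticSearch keeps only about a week of data but does not say how. One possible approach, sketched here, is to delete daily indices older than a cutoff. It assumes daily logstash-YYYY.MM.DD indices (the naming produced when logstash_format is enabled), GNU date, and an ES endpoint at http://elasticsearch:9200; the function names are hypothetical and not from the original setup.

```shell
#!/usr/bin/env bash
# Sketch: select daily logstash-YYYY.MM.DD indices older than a retention window.
# Assumptions (not from the article): GNU date, the function names, the ES URL.

# Print the name of the oldest index to keep, given "now" as epoch seconds.
cutoff_index() {
  local now_epoch=$1 keep_days=$2
  date -u -d "@$(( now_epoch - keep_days * 86400 ))" +logstash-%Y.%m.%d
}

# Read index names on stdin; print those strictly older than the cutoff.
# Zero-padded dates compare correctly as plain strings.
expired_indices() {
  local cutoff=$1
  awk -v cutoff="$cutoff" \
    '/^logstash-[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]$/ && $1 < cutoff'
}

# Example pipeline (not executed here):
#   curl -s 'http://elasticsearch:9200/_cat/indices/logstash-*?h=index' \
#     | expired_indices "$(cutoff_index "$(date +%s)" 7)" \
#     | while read -r idx; do
#         curl -s -XDELETE "http://elasticsearch:9200/$idx"
#       done
```

    Run from cron once a day, a pipeline like the one in the comment would keep roughly seven daily indices. Test it against a non-production cluster first, since index deletion cannot be undone.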

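    Deployment Architecture

    image.png

    This diagram only shows how Fluent Bit and Fluentd are integrated for log collection. The deployment architecture of the ES cluster and the overall architecture of the Kubernetes microservice cluster are not detailed here; if you are interested, see my other articles.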

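    Log Aggregation Server

    When there are many service nodes, we recommend a central aggregation server that collects the logs first and relays them onward. With fewer nodes, each node can send directly to ES or S3.

    Preparing the Docker Image

    Our log aggregation server has to forward the logs it receives from the collection points to two destinations, ElasticSearch and Amazon S3, so the Fluentd image needs some extra work.
    Since the Kubernetes project already provides a ready-made EFK setup, we start from that and adjust it to our own needs.

    Getting the Official Kubernetes Fluentd Configuration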

    [centos@master1 efk]$ wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/fluentd-es-ds.yaml
    --2018-06-08 17:46:35--  https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/fluentd-es-ds.yaml
    Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.72.133
    Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.72.133|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 2774 (2.7K) [text/plain]
    Saving to: ‘fluentd-es-ds.yaml’
    
    fluentd-es-ds.yaml   100%[===================>]   2.71K  --.-KB/s    in 0s
    
    2018-06-08 17:46:36 (8.56 MB/s) - ‘fluentd-es-ds.yaml’ saved [2774/2774]
    
    [centos@master1 efk]$
    

    Open fluentd-es-ds.yaml to find the Docker image used by the official manifest.

    ......
          - name: fluentd-es
            image: k8s.gcr.io/fluentd-elasticsearch:v2.0.4
            env:
    ......
    

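    Pull the image locally, tag it for the private registry, and push it there so it is easy to fetch later. Reaching k8s.gcr.io requires a proxy here, and the Docker daemon must be configured to use it.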

    [centos@master1 efk]$ docker pull k8s.gcr.io/fluentd-elasticsearch:v2.0.4
    Trying to pull repository k8s.gcr.io/fluentd-elasticsearch ... 
    v2.0.4: Pulling from k8s.gcr.io/fluentd-elasticsearch
    e7bb522d92ff: Pull complete 
    92e6b816bc34: Pull complete 
    ffb38dbddc64: Pull complete 
    4900a3591877: Pull complete 
    812a2bf6252f: Pull complete 
    f8d5892f0b74: Pull complete 
    e6736dda51ce: Pull complete 
    Digest: sha256:b8c94527b489fb61d3d81ce5ad7f3ddbb7be71e9620a3a36e2bede2f2e487d73
    Status: Downloaded newer image for k8s.gcr.io/fluentd-elasticsearch:v2.0.4
    [centos@master1 efk]$ docker tag k8s.gcr.io/fluentd-elasticsearch:v2.0.4 hub.***.***/google_containers/fluentd-elasticsearch:v2.1.0
    [centos@master1 efk]$ docker push hub.***.***/google_containers/fluentd-elasticsearch:v2.1.0
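
    The pull, tag, and push steps above are repeated later for the Fluent Bit image, so they can be wrapped in a small helper. This is only a sketch: the function name mirror_image and the DOCKER override variable are hypothetical conveniences, not part of the original setup.

```shell
#!/usr/bin/env bash
# Sketch: mirror an upstream image into a private registry.
# DOCKER can be overridden (for example DOCKER=echo) to do a dry run.
mirror_image() {
  local src=$1 dst=$2
  # Stop at the first failing step so a failed pull is never pushed.
  "${DOCKER:-docker}" pull "$src" &&
    "${DOCKER:-docker}" tag "$src" "$dst" &&
    "${DOCKER:-docker}" push "$dst"
}
```

    For example, pass k8s.gcr.io/fluentd-elasticsearch:v2.0.4 as the first argument and the private registry name used above as the second, quoting both.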
    

    Adding Amazon S3 Support

    Add S3 support on top of the Kubernetes image. Create a Dockerfile with the following content:

    FROM hub.***.***/google_containers/fluentd-elasticsearch:v2.1.0
    MAINTAINER X.J CHEN
    RUN \
        apt-get update -y && apt-get install ruby-dev -y && \
        gem install fluent-plugin-s3 && \
        apt-get clean
    

    Build the image and push it to our private registry.

    [centos@master1 efk]$ docker build -t hub.***.***/google_containers/fluentd-s3:v2.1.0 .
    [centos@master1 efk]$ docker push hub.***.***/google_containers/fluentd-s3:v2.1.0
    

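    The image is now ready.

    Preparing the Server YAML File

    Using the fluentd-es-ds.yaml obtained in the previous subsection as a reference, create our Fluentd Server YAML file, fluentd-server-s3.yaml, with the content shown below.
    The main changes are:

    • Change the ServiceAccount and the related rules.
    • Change the image to the one we just built.
    • Remove the volume mounts used to read local logs, such as /var/log.
    • Set replicas to 5. This is mainly because our environment produces a very large volume of logs; choose the number of Pods based on your own situation, with at least 2 recommended.
    • Add a Service for the Fluentd Server so that log collection nodes inside the Kubernetes cluster can reach it and upload their logs.
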
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: fluentd-server
      namespace: kube-system
      labels:
        k8s-app: fluentd-server
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    ---
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: fluentd-server
      labels:
        k8s-app: fluentd-server
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    rules:
    - apiGroups:
      - ""
      resources:
      - "namespaces"
      - "pods"
      verbs:
      - "get"
      - "watch"
      - "list"
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: fluentd-server
      labels:
        k8s-app: fluentd-server
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    subjects:
    - kind: ServiceAccount
      name: fluentd-server
      namespace: kube-system
      apiGroup: ""
    roleRef:
      kind: ClusterRole
      name: fluentd-server
      apiGroup: ""
    
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: fluentd-server
      namespace: kube-system
      labels:
        k8s-app: fluentd-server
        kubernetes.io/cluster-service: "true"
        kubernetes.io/name: "Fluentd"
    spec:
      ports:
      - port: 24224
        protocol: TCP
        targetPort: server
      selector:
        k8s-app: fluentd-server
        
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: fluentd-server-v2.0.4
      namespace: kube-system
      labels:
        k8s-app: fluentd-server
        version: v2.0.4
        kubernetes.io/cluster-service: "true"
    spec:
      selector:
        matchLabels:
          k8s-app: fluentd-server
          version: v2.0.4
      replicas: 5
      template:
        metadata:
          labels:
            k8s-app: fluentd-server
            kubernetes.io/cluster-service: "true"
            version: v2.0.4
        spec:
          serviceAccountName: fluentd-server
          containers:
          - name: fluentd-server
            #image: k8s.gcr.io/fluentd-elasticsearch:v2.0.4
            image: hub.***.***/google_containers/fluentd-s3:v2.1.0
            imagePullPolicy: Always
            env:
            - name: FLUENTD_ARGS
              value: --no-supervisor -q
            resources:
              limits:
                memory: 1024Mi
              requests:
                cpu: 1000m
                memory: 200Mi
            volumeMounts:
            - name: config-volume
              mountPath: /etc/fluent/config.d
            ports:
            - containerPort: 24224
              name: server
              protocol: TCP
          terminationGracePeriodSeconds: 160
          volumes:
          - name: config-volume
            configMap:
              name: fluentd-server-config-v0.1.4
          imagePullSecrets:
          - name: kube-sec
    

    Get the Fluentd configuration file from the official Kubernetes repository.

    [centos@master1 efk]$ wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml
    

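    On that basis, create our own configuration file, fluentd-server-s3-configmap.yaml, with the content shown below.
    Main changes:

    • Remove the log collection sections.
    • Add a forward input section with the Server's bind address and listening port.
    • Adjust the output section.
    • Note that both outputs, ES and S3, use a buffer, and both are file buffers. Because our Deployment no longer mounts the host's /var/log, the buffer lives inside the Docker container, and buffered data is lost if the container dies. Adjust this to your own situation (if the data is particularly sensitive, you could even use a StatefulSet with a PV).
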
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: fluentd-server-config-v0.1.4
      namespace: kube-system
      labels:
        addonmanager.kubernetes.io/mode: Reconcile
    data:
      system.conf: |-
        <system>
          root_dir /tmp/fluentd-buffers/
        </system>
    
      system.input.conf: |-
        # Listen for incoming forward-protocol data (no TLS is configured in this source)
        <source>
          @type forward
          port 24224
          bind 0.0.0.0
        </source>
    
      output.conf: |-
        # Enriches records with Kubernetes metadata
        <filter kubernetes.**>
          @type kubernetes_metadata
        </filter>
        # Store Data in Elasticsearch and S3
        <match *.**>
          @type copy
          <store>
            @id elasticsearch
            @type elasticsearch
            @log_level info
            host elasticsearch
            port 9200
            #include_tag_key true
            #tag_key @log_name
            logstash_format true
            request_timeout    30s
            slow_flush_log_threshold 30s
            <buffer>
              @type file
              path /var/log/fluentd-buffers/server.buffer
              flush_mode interval
              #retry_type exponential_backoff
              flush_thread_count 12 # adjust as needed, keeping the ES cluster's capacity in mind
              flush_interval 8s  # adjust as needed
              retry_max_interval 30
              chunk_limit_size 32M # adjust to what ES can actually handle; must stay below 100 MB
              #queue_limit_length 64 #8
              total_limit_size 20G
              retry_wait 10s
            </buffer>
          </store>
          <store>
            @id s3
            @type s3
            @log_level info
            #include_tag_key true
            aws_key_id *********  # fill in your own key ID and secret key
            aws_sec_key **********
            s3_bucket ******** # fill in your own bucket
            s3_region cn-north-1
            s3_object_key_format "%{path}dt=%{time_slice}_%{index}.%{file_extension}"
            #store_as json
            path hive/    
            time_slice_format %Y%m%d/%Y%m%d%H
            <buffer>
              @type file
              path /var/log/fluentd-buffers/s3.buffer
              timekey 3600 # 1 hour partition
              timekey_wait 10m
              timekey_use_utc true # use utc
              chunk_limit_size 256m
            </buffer>
          </store>
        </match>
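
    To see what object names the S3 store above should produce, the following sketch expands s3_object_key_format by hand for one chunk. It assumes GNU date, UTC time slices (timekey_use_utc true), and the plugin's default gzip storage, so the file extension is taken to be gz; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: expand "%{path}dt=%{time_slice}_%{index}.%{file_extension}"
# for a chunk whose time slice starts at the given epoch second.
# Assumes GNU date and gzip storage (extension "gz").
s3_object_key() {
  local epoch=$1 index=$2
  local path="hive/"                                   # `path` in the config
  local time_slice
  time_slice=$(date -u -d "@$epoch" +%Y%m%d/%Y%m%d%H)  # time_slice_format, in UTC
  printf '%sdt=%s_%s.gz\n' "$path" "$time_slice" "$index"
}
```

    A chunk for the hour starting 2018-06-11 03:00 UTC (epoch 1528686000) with index 0 expands to hive/dt=20180611/2018061103_0.gz. That gives one dt= directory per day, which is the directory layout checked in S3 later in this article.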
    

    Starting the Aggregation Server

    Create the Server configuration.

    [centos@master1 efk]$ kubectl create -f fluentd-server-s3-configmap.yaml
    

    Create the Server Deployment.

    [centos@master1 efk]$ kubectl create -f fluentd-server-s3.yaml 
    

    Check the Server status:

    [centos@master1 efk]$ kubectl get service -n kube-system -o wide | grep fluentd-server
    fluentd-server            ClusterIP   10.104.52.1      <none>        24224/TCP        4d        k8s-app=fluentd-server
    [centos@master1 efk]$ 
    [centos@master1 efk]$ kubectl get pods -n kube-system -o wide | grep fluentd-server
    fluentd-server-v2.0.4-855db7cfc5-4wn47   1/1       Running            0          2h        10.244.29.20     minion6
    fluentd-server-v2.0.4-855db7cfc5-pfmvd   1/1       Running            0          2h        10.244.3.211     minion17
    fluentd-server-v2.0.4-855db7cfc5-rjqxl   1/1       Running            0          2h        10.244.13.47     minion19
    fluentd-server-v2.0.4-855db7cfc5-shjfm   1/1       Running            0          2h        10.244.23.141    minion12
    fluentd-server-v2.0.4-855db7cfc5-w7m5f   1/1       Running            0          2h        10.244.30.233    minion5
    [centos@master1 efk]$ 
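
    Instead of grepping the Pod list by eye, the same check can be scripted. The sketch below waits for the Deployment rollout and then lists the Service endpoints, which should show one address per ready replica. The function name and the KUBECTL override variable are hypothetical; the resource names are the ones from fluentd-server-s3.yaml.

```shell
#!/usr/bin/env bash
# Sketch: confirm the aggregation Deployment is rolled out and that the
# Service has endpoints behind it.
# KUBECTL can be overridden (for example KUBECTL=echo) to do a dry run.
check_fluentd_server() {
  "${KUBECTL:-kubectl}" rollout status deployment/fluentd-server-v2.0.4 -n kube-system &&
    "${KUBECTL:-kubectl}" get endpoints fluentd-server -n kube-system
}
```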
    

    You can also check the logs:


    image.png

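    Log Agent

    For the log agent we use Fluent Bit, for the same reason as before: it performs somewhat better than Fluentd and consumes fewer resources. However, Fluent Bit's stability is a concern. It fails to run on some nodes (sometimes because of logs it cannot parse, sometimes for other reasons; having not touched C or C++ for a long time, I can sometimes only wait for an official patch), and on some nodes it crashes after running for a while. So where log requirements are strict, Fluentd is still the recommended choice.
    A common Fluent Bit failure is shown below. A JSON parsing error in a log file crashes Fluent Bit outright. It was reportedly fixed in version 0.13.3, but the problem persists: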

    [centos@master1 fluent-bit]$ 
    [centos@master1 fluent-bit]$ kubectl logs -f fluent-bit-5kvpl -n kube-system
    [2018/06/11 03:11:49] [ info] [engine] started (pid=1)
    [2018/06/11 03:11:49] [ info] [filter_kube] https=1 host=kubernetes.default.svc.cluster.local port=443
    [2018/06/11 03:11:49] [ info] [filter_kube] local POD info OK
    [2018/06/11 03:11:49] [ info] [filter_kube] testing connectivity with API server...
    [2018/06/11 03:11:49] [ info] [filter_kube] API server connectivity OK
    [2018/06/11 03:11:49] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
    [engine] caught signal (SIGSEGV)
    Fluent-Bit v0.13.2
    Copyright (C) Treasure Data
    
    #0  0x7fcc07f6eff1      in  ???() at ???:0
    #1  0x55b0b655dede      in  msgpack_sbuffer_write() at lib/msgpack-2.1.3/include/msgpack/sbuffer.h:84
    #2  0x55b0b6771ca5      in  msgpack_pack_ext_body() at lib/msgpack-2.1.3/include/msgpack/pack_template.h:890
    #3  0x55b0b6771ca5      in  msgpack_pack_object() at lib/msgpack-2.1.3/src/objectc.c:72
    #4  0x55b0b655e8c0      in  pack_map_content() at plugins/filter_kubernetes/kubernetes.c:321
    #5  0x55b0b655f129      in  cb_kube_filter() at plugins/filter_kubernetes/kubernetes.c:493
    #6  0x55b0b64feaea      in  flb_filter_do() at src/flb_filter.c:86
    #7  0x55b0b64fc53c      in  flb_input_dbuf_write_end() at include/fluent-bit/flb_input.h:642
    #8  0x55b0b64fe09c      in  flb_input_dyntag_append_raw() at src/flb_input.c:894
    #9  0x55b0b6522b1d      in  process_content() at plugins/in_tail/tail_file.c:290
    #10 0x55b0b6523968      in  flb_tail_file_chunk() at plugins/in_tail/tail_file.c:651
    #11 0x55b0b6521357      in  in_tail_collect_static() at plugins/in_tail/tail.c:129
    #12 0x55b0b64fe5db      in  flb_input_collector_fd() at src/flb_input.c:995
    #13 0x55b0b6505370      in  flb_engine_handle_event() at src/flb_engine.c:296
    #14 0x55b0b6505370      in  flb_engine_start() at src/flb_engine.c:515
    #15 0x55b0b64a5606      in  main() at src/fluent-bit.c:824
    #16 0x7fcc07e662e0      in  ???() at ???:0
    #17 0x55b0b64a3a89      in  ???() at ???:0
    #18 0xffffffffffffffff  in  ???() at ???:0
    [centos@master1 fluent-bit]$ 
    

    Both Fluent Bit and Fluentd are described below for reference.

    Fluent Bit

    Preparing the Fluent Bit YAML Files

    We base our own Fluent Bit YAML files on https://github.com/fluent/fluent-bit-kubernetes-logging.

    [centos@master1 kube-log]$ git clone https://github.com/fluent/fluent-bit-kubernetes-logging.git
    

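    Fluent Bit Configuration File

    We continue to use fluent-bit-configmap.yaml as the Fluent Bit configuration file, but add a forward output so that Fluent Bit can forward logs to the log aggregation server.

    • Add an output-fluentd.conf section that outputs to the Fluentd Server.
    • Disable the ES output (its @INCLUDE line is commented out).
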
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: fluent-bit-config
      namespace: kube-system
      labels:
        k8s-app: fluent-bit
    data:
      # Configuration files: server, input, filters and output
      # ======================================================
      fluent-bit.conf: |
        [SERVICE]
            Flush         1
            Log_Level     info
            Daemon        off
            Parsers_File  parsers.conf
            HTTP_Server   On
            HTTP_Listen   0.0.0.0
            HTTP_Port     2020
    
        @INCLUDE input-kubernetes.conf
        @INCLUDE filter-kubernetes.conf
        @INCLUDE output-fluentd.conf
        # @INCLUDE output-elasticsearch.conf
    
      input-kubernetes.conf: |
        [INPUT]
            Name              tail
            Tag               kube.*
            Path              /var/log/containers/*.log
            Parser            docker
            DB                /var/log/flb_kube.db
            Mem_Buf_Limit     5MB
            Skip_Long_Lines   On
            Refresh_Interval  5
    
      filter-kubernetes.conf: |
        [FILTER]
            Name                kubernetes
            Match               kube.*
            Kube_URL            https://kubernetes.default.svc.cluster.local:443
            Merge_Log           On
            K8S-Logging.Parser  On
    
      output-elasticsearch.conf: |
        [OUTPUT]
            Name            es
            Match           *
            Host            ${FLUENT_ELASTICSEARCH_HOST}
            Port            ${FLUENT_ELASTICSEARCH_PORT}
            Logstash_Format On
            Retry_Limit     False
    
      output-fluentd.conf: |
        [OUTPUT]
            Name          forward
            Match         *
            Host          ${FLUENTD_SERVER_HOST}
            Port          ${FLUENTD_SERVER_PORT}
            
      parsers.conf: |
        [PARSER]
            Name   apache
            Format regex
            Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
            Time_Key time
            Time_Format %d/%b/%Y:%H:%M:%S %z
    
        [PARSER]
            Name   apache2
            Format regex
            Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
            Time_Key time
            Time_Format %d/%b/%Y:%H:%M:%S %z
    
        [PARSER]
            Name   apache_error
            Format regex
            Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$
    
        [PARSER]
            Name   nginx
            Format regex
            Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
            Time_Key time
            Time_Format %d/%b/%Y:%H:%M:%S %z
    
        [PARSER]
            Name   json
            Format json
            Time_Key time
            Time_Format %d/%b/%Y:%H:%M:%S %z
    
        [PARSER]
            Name        docker
            Format      json
            Time_Key    time
            Time_Format %Y-%m-%dT%H:%M:%S.%L
            Time_Keep   On
            # Command      |  Decoder | Field | Optional Action
            # =============|==================|=================
            Decode_Field_As   escaped    log
    
        [PARSER]
            Name        syslog
            Format      regex
            Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
            Time_Key    time
            Time_Format %b %d %H:%M:%S
    
    

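    Fluent Bit Pod Configuration

    We merge the official Pod-related manifests (fluent-bit-service-account.yaml, fluent-bit-role.yaml, fluent-bit-role-binding.yaml) into a single file (fluent-bit-ds.yaml) for easier maintenance.

    • Point the Docker image at our private registry. We pull the original image, leave it unchanged, then tag it and push it to the private registry.
    • Add environment variables for the Fluentd Server.
    • Note the nodeSelector section: every node that should collect logs must carry this label. We recommend labeling nodes one at a time (one or two nodes every few minutes; this can be scripted). When log volume is high this is strongly recommended, a lesson learned the hard way. Labeling in bulk can cause a data storm that overwhelms the aggregation server and can also overwhelm ES, which then rejects Fluentd's connections.
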
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: fluent-bit
      namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRole
    metadata:
      name: fluent-bit-read
    rules:
    - apiGroups: [""]
      resources:
      - namespaces
      - pods
      verbs: ["get", "list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRoleBinding
    metadata:
      name: fluent-bit-read
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: fluent-bit-read
    subjects:
    - kind: ServiceAccount
      name: fluent-bit
      namespace: kube-system
    ---
    apiVersion: extensions/v1beta1
    kind: DaemonSet
    metadata:
      name: fluent-bit
      namespace: kube-system
      labels:
        k8s-app: fluent-bit-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      template:
        metadata:
          labels:
            k8s-app: fluent-bit-logging
            version: v1
            kubernetes.io/cluster-service: "true"
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/port: "2020"
            prometheus.io/path: /api/v1/metrics/prometheus
        spec:
          nodeSelector:
            beta.kubernetes.io/fluentd-ds-ready: "true"
          containers:
          - name: fluent-bit
            image: hub.***.***/google_containers/fluent-bit:0.13.2
            imagePullPolicy: Always
            ports:
              - containerPort: 2020
            env:
            - name: FLUENTD_SERVER_HOST
              value: "fluentd-server"
            - name: FLUENTD_SERVER_PORT
              value: "24224"
            volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: fluent-bit-config
              mountPath: /fluent-bit/etc/
          terminationGracePeriodSeconds: 10
          volumes:
          - name: varlog
            hostPath:
              path: /var/log
          - name: varlibdockercontainers
            hostPath:
              path: /var/lib/docker/containers
          - name: fluent-bit-config
            configMap:
              name: fluent-bit-config
          serviceAccountName: fluent-bit
          tolerations:
          - key: node-role.kubernetes.io/master
            operator: Exists
            effect: NoSchedule
          imagePullSecrets:
          - name: kube-sec
    

    Starting the Fluent Bit DaemonSet

    Create the configuration that Fluent Bit needs.

    [centos@master1 fluent-bit]$ ls -al
    total 28
    drwxrwxr-x 2 centos centos 4096 Jun  7 02:32 .
    drwxrwxr-x 7 centos centos 4096 Jun 11 03:03 ..
    -rw-rw-r-- 1 centos centos 3562 Jun  7 02:32 fluent-bit-configmap.yaml
    -rw-rw-r-- 1 centos centos 2248 Jun  6 12:51 fluent-bit-ds.yaml
    -rw-rw-r-- 1 centos centos  273 May 31 13:35 fluent-bit-role-binding.yaml
    -rw-rw-r-- 1 centos centos  194 May 31 13:33 fluent-bit-role.yaml
    -rw-rw-r-- 1 centos centos   90 May 31 13:35 fluent-bit-service-account.yaml
    [centos@master1 fluent-bit]$ 
    [centos@master1 fluent-bit]$ kubectl create -f fluent-bit-configmap.yaml
    

    Start the Fluent Bit DaemonSet.

    [centos@master1 fluent-bit]$ kubectl create -f fluent-bit-ds.yaml
    

    Check the Pods. At this point none have been created yet:

    [centos@master1 fluent-bit]$ kubectl get pods -n kube-system -o wide | grep fluent-bit
    [centos@master1 fluent-bit]$
    

    Don't panic. Next we label the nodes, which brings the Pods up. Leave some time between nodes.

    [centos@master1 fluent-bit]$ kubectl label node minion1 beta.kubernetes.io/fluentd-ds-ready=true
    [centos@master1 fluent-bit]$ kubectl label node minion2 beta.kubernetes.io/fluentd-ds-ready=true
    [centos@master1 fluent-bit]$ kubectl label node minion3 beta.kubernetes.io/fluentd-ds-ready=true
    [centos@master1 fluent-bit]$ kubectl label node minion4 beta.kubernetes.io/fluentd-ds-ready=true
    [centos@master1 fluent-bit]$ kubectl label node minion5 beta.kubernetes.io/fluentd-ds-ready=true
    .....
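
    Since the article recommends labeling nodes gradually, the commands above can be wrapped in a loop that pauses between nodes. This is a minimal sketch: the function name and the KUBECTL override variable are hypothetical, and the delay to use depends on your log volume.

```shell
#!/usr/bin/env bash
# Sketch: label nodes one at a time, pausing between nodes so the
# aggregation server and ES are not hit by every node's backlog at once.
# KUBECTL can be overridden (for example KUBECTL=echo) to do a dry run.
label_nodes_staggered() {
  local delay=$1; shift
  local node
  for node in "$@"; do
    "${KUBECTL:-kubectl}" label node "$node" \
      beta.kubernetes.io/fluentd-ds-ready=true || return 1
    sleep "$delay"
  done
}
```

    For example, label_nodes_staggered 180 minion1 minion2 minion3 waits three minutes after each node; the 180 seconds is an arbitrary illustration, not a measured value.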
    

    Check the Pods again:

    [centos@master1 fluent-bit]$ kubectl get pods -n kube-system -o wide | grep flu
    fluent-bit-2sd9k                         1/1       Running   0          4d        10.244.30.213    minion5
    fluent-bit-5jd4w                         1/1       Running   0          4d        10.244.32.12     minion3
    fluent-bit-952fn                         1/1       Running   0          4d        10.244.15.14     minion13
    fluent-bit-cz2xq                         1/1       Running   0          4d        10.244.29.250    minion6
    fluent-bit-fx22k                         1/1       Running   0          4d        10.244.25.235    minion10
    fluent-bit-g4fmw                         1/1       Running   0          4d        10.244.23.99     minion12
    fluent-bit-gnfxg                         1/1       Running   0          4d        10.244.28.207    minion7
    fluent-bit-h9t9l                         1/1       Running   0          4d        10.244.11.91     minion22
    fluent-bit-ld9fx                         1/1       Running   0          4d        10.244.3.191     minion17
    fluent-bit-pgc2f                         1/1       Running   0          4d        10.244.14.48     minion20
    fluent-bit-st2qq                         1/1       Running   0          3d        10.244.16.3      minion11
    fluent-bit-tm5hl                         1/1       Running   0          4d        10.244.12.46     minion18
    fluent-bit-tt44q                         1/1       Running   0          4d        10.244.21.24     minion14
    fluent-bit-vgptk                         1/1       Running   0          4d        10.244.31.9      minion4
    fluent-bit-vptft                         1/1       Running   0          4d        10.244.34.93     minion1
    fluent-bit-wpwl4                         1/1       Running   0          4d        10.244.13.35     minion19
    fluent-bit-xdvbz                         1/1       Running   0          4d        10.244.9.99      minion2
    fluent-bit-zrmsj                         1/1       Running   0          4d        10.244.20.33     minion15
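
    A Running status does not show whether an agent is actually processing records. Because the ConfigMap enables Fluent Bit's HTTP server on port 2020, one way to look closer is to fetch the metrics endpoint named in the DaemonSet annotations. This is a sketch with hypothetical function names; it assumes the Pod IP is reachable from where the command runs.

```shell
#!/usr/bin/env bash
# Sketch: fetch Fluent Bit's built-in metrics from one Pod.
# Port 2020 and the path come from the ConfigMap and DaemonSet annotations.
fluent_bit_metrics_url() {
  printf 'http://%s:2020/api/v1/metrics/prometheus\n' "$1"
}

# Requires curl and network access to the Pod IP.
fetch_fluent_bit_metrics() {
  curl -s --max-time 5 "$(fluent_bit_metrics_url "$1")"
}
```

    For example, fetch_fluent_bit_metrics 10.244.30.213 queries the Pod on minion5 from the listing above.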
    

    Check Kibana to confirm that logs are flowing:


    image.png

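    The logs have reached ES. Next we adjust the fields Kibana displays so they better suit how we read logs.

    • Add Host to the displayed columns.
    • Add Pod Name to the displayed columns.
    • Add Log to the displayed columns.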
    image.png

    Next we check the logs host by host:

    • Add a filter condition to query the logs of a single node.
    image.png

    Each host's logs are reaching ES correctly, so next we check S3.
    The logs have been created following the directory layout we defined.


    image.png

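    Open one of the directories and you will find the files there. They can also be downloaded for local inspection, which this article does not cover further.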

    image.png

    The Fluent Bit deployment is now complete. While starting Fluent Bit, some nodes crash and cannot be brought up. For those nodes we use Fluentd to collect logs instead.
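
    To find out which nodes need the Fluentd fallback, the Pod listing can be filtered for Fluent Bit Pods that are not Running. The sketch below reads kubectl output on stdin; the function name is hypothetical, and it assumes the column order NAME READY STATUS RESTARTS AGE IP NODE shown in the listings above.

```shell
#!/usr/bin/env bash
# Sketch: from `kubectl get pods -o wide` output on stdin, print the nodes
# whose fluent-bit Pod is not in the Running state.
# Assumed columns: NAME READY STATUS RESTARTS AGE IP NODE
unhealthy_fluent_bit_nodes() {
  awk '$1 ~ /^fluent-bit-/ && $3 != "Running" { print $7 }' | sort -u
}

# Example (not executed here):
#   kubectl get pods -n kube-system -o wide --no-headers | unhealthy_fluent_bit_nodes
```

    Pods that crash only occasionally may show as Running at the moment of the check, so the RESTARTS column is also worth a look.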

    Fluentd

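    We prepare Fluentd following the Fluentd Server section above. The Docker image can be the same one used in the Server section.

    Preparing the Fluentd YAML Files

    Fluentd Configuration File

    Create a new file, fluentd-standalone-configmap.yaml, for running Fluentd standalone.

    • Adjust output.conf.
    • Remove the ES section.
    • Add a forward section.
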
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: fluentd-sa-config-v0.1.4
      namespace: kube-system
      labels:
        addonmanager.kubernetes.io/mode: Reconcile
    data:
      system.conf: |-
        <system>
          root_dir /tmp/fluentd-buffers/
        </system>
    
      containers.input.conf: |-
        # This configuration file for Fluentd / td-agent is used
        # to watch changes to Docker log files. The kubelet creates symlinks that
        # capture the pod name, namespace, container name & Docker container ID
        # to the docker logs for pods in the /var/log/containers directory on the host.
        # If running this fluentd configuration in a Docker container, the /var/log
        # directory should be mounted in the container.
        #
        # These logs are then submitted to Elasticsearch which assumes the
        # installation of the fluent-plugin-elasticsearch & the
        # fluent-plugin-kubernetes_metadata_filter plugins.
        # See https://github.com/uken/fluent-plugin-elasticsearch &
        # https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter for
        # more information about the plugins.
        #
        # Example
        # =======
        # A line in the Docker log file might look like this JSON:
        #
        # {"log":"2014/09/25 21:15:03 Got request with path wombat\n",
        #  "stream":"stderr",
        #   "time":"2014-09-25T21:15:03.499185026Z"}
        #
        # The time_format specification below makes sure we properly
        # parse the time format produced by Docker. This will be
        # submitted to Elasticsearch and should appear like:
        # $ curl 'http://elasticsearch-logging:9200/_search?pretty'
        # ...
        # {
        #      "_index" : "logstash-2014.09.25",
        #      "_type" : "fluentd",
        #      "_id" : "VBrbor2QTuGpsQyTCdfzqA",
        #      "_score" : 1.0,
        #      "_source":{"log":"2014/09/25 22:45:50 Got request with path wombat\n",
        #                 "stream":"stderr","tag":"docker.container.all",
        #                 "@timestamp":"2014-09-25T22:45:50+00:00"}
        #    },
        # ...
        #
        # The Kubernetes fluentd plugin is used to write the Kubernetes metadata to the log
        # record & add labels to the log record if properly configured. This enables users
        # to filter & search logs on any metadata.
        # For example a Docker container's logs might be in the directory:
        #
        #  /var/lib/docker/containers/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b
        #
        # and in the file:
        #
        #  997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b-json.log
        #
        # where 997599971ee6... is the Docker ID of the running container.
        # The Kubernetes kubelet makes a symbolic link to this file on the host machine
        # in the /var/log/containers directory which includes the pod name and the Kubernetes
        # container name:
        #
        #    synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
        #    ->
        #    /var/lib/docker/containers/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b-json.log
        #
        # The /var/log directory on the host is mapped to the /var/log directory in the container
        # running this instance of Fluentd and we end up collecting the file:
        #
        #   /var/log/containers/synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
        #
        # This results in the tag:
        #
        #  var.log.containers.synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
        #
        # The Kubernetes fluentd plugin is used to extract the namespace, pod name & container name
        # which are added to the log message as a kubernetes field object & the Docker container ID
        # is also added under the docker field object.
        # The final tag is:
        #
        #   kubernetes.var.log.containers.synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
        #
        # And the final log record look like:
        #
        # {
        #   "log":"2014/09/25 21:15:03 Got request with path wombat\n",
        #   "stream":"stderr",
        #   "time":"2014-09-25T21:15:03.499185026Z",
        #   "kubernetes": {
        #     "namespace": "default",
        #     "pod_name": "synthetic-logger-0.25lps-pod",
        #     "container_name": "synth-lgr"
        #   },
        #   "docker": {
        #     "container_id": "997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b"
        #   }
        # }
        #
        # This makes it easier for users to search for logs by pod name or by
        # the name of the Kubernetes container regardless of how many times the
        # Kubernetes pod has been restarted (resulting in a several Docker container IDs).
    
        # Json Log Example:
        # {"log":"[info:2016-02-16T16:04:05.930-08:00] Some log text here\n","stream":"stdout","time":"2016-02-17T00:04:05.931087621Z"}
        # CRI Log Example:
        # 2016-02-17T00:04:05.931087621Z stdout F [info:2016-02-16T16:04:05.930-08:00] Some log text here
        <source>
          @id fluentd-containers.log
          @type tail
          path /var/log/containers/*.log
          pos_file /var/log/es-containers.log.pos
          time_format %Y-%m-%dT%H:%M:%S.%NZ
          tag raw.kubernetes.*
          read_from_head true
          <parse>
            @type multi_format
            <pattern>
              format json
              time_key time
              time_format %Y-%m-%dT%H:%M:%S.%NZ
            </pattern>
            <pattern>
              format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
              time_format %Y-%m-%dT%H:%M:%S.%N%:z
            </pattern>
          </parse>
        </source>
    
        # Detect exceptions in the log output and forward them as one log entry.
        <match raw.kubernetes.**>
          @id raw.kubernetes
          @type detect_exceptions
          remove_tag_prefix raw
          message log
          stream stream
          multiline_flush_interval 5
          max_bytes 500000
          max_lines 1000
        </match>
    
      system.input.conf: |-
        # Example:
        # 2015-12-21 23:17:22,066 [salt.state       ][INFO    ] Completed state [net.ipv4.ip_forward] at time 23:17:22.066081
        <source>
          @id minion
          @type tail
          format /^(?<time>[^ ]* [^ ,]*)[^\[]*\[[^\]]*\]\[(?<severity>[^ \]]*) *\] (?<message>.*)$/
          time_format %Y-%m-%d %H:%M:%S
          path /var/log/salt/minion
          pos_file /var/log/salt.pos
          tag salt
        </source>
    
        # Example:
        # Dec 21 23:17:22 gke-foo-1-1-4b5cbd14-node-4eoj startupscript: Finished running startup script /var/run/google.startup.script
        <source>
          @id startupscript.log
          @type tail
          format syslog
          path /var/log/startupscript.log
          pos_file /var/log/es-startupscript.log.pos
          tag startupscript
        </source>
    
        # Examples:
        # time="2016-02-04T06:51:03.053580605Z" level=info msg="GET /containers/json"
        # time="2016-02-04T07:53:57.505612354Z" level=error msg="HTTP Error" err="No such image: -f" statusCode=404
        # TODO(random-liu): Remove this after cri container runtime rolls out.
        <source>
          @id docker.log
          @type tail
          format /^time="(?<time>[^)]*)" level=(?<severity>[^ ]*) msg="(?<message>[^"]*)"( err="(?<error>[^"]*)")?( statusCode=(?<status_code>\d+))?/
          path /var/log/docker.log
          pos_file /var/log/es-docker.log.pos
          tag docker
        </source>
    
        # Example:
        # 2016/02/04 06:52:38 filePurge: successfully removed file /var/etcd/data/member/wal/00000000000006d0-00000000010a23d1.wal
        <source>
          @id etcd.log
          @type tail
          # Not parsing this, because it doesn't have anything particularly useful to
          # parse out of it (like severities).
          format none
          path /var/log/etcd.log
          pos_file /var/log/es-etcd.log.pos
          tag etcd
        </source>
    
        # Multi-line parsing is required for all the kube logs because very large log
        # statements, such as those that include entire object bodies, get split into
        # multiple lines by glog.
    
        # Example:
        # I0204 07:32:30.020537    3368 server.go:1048] POST /stats/container/: (13.972191ms) 200 [[Go-http-client/1.1] 10.244.1.3:40537]
        <source>
          @id kubelet.log
          @type tail
          format multiline
          multiline_flush_interval 5s
          format_firstline /^\w\d{4}/
          format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
          time_format %m%d %H:%M:%S.%N
          path /var/log/kubelet.log
          pos_file /var/log/es-kubelet.log.pos
          tag kubelet
        </source>
    
        # Example:
        # I1118 21:26:53.975789       6 proxier.go:1096] Port "nodePort for kube-system/default-http-backend:http" (:31429/tcp) was open before and is still needed
        <source>
          @id kube-proxy.log
          @type tail
          format multiline
          multiline_flush_interval 5s
          format_firstline /^\w\d{4}/
          format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
          time_format %m%d %H:%M:%S.%N
          path /var/log/kube-proxy.log
          pos_file /var/log/es-kube-proxy.log.pos
          tag kube-proxy
        </source>
    
        # Example:
        # I0204 07:00:19.604280       5 handlers.go:131] GET /api/v1/nodes: (1.624207ms) 200 [[kube-controller-manager/v1.1.3 (linux/amd64) kubernetes/6a81b50] 127.0.0.1:38266]
        <source>
          @id kube-apiserver.log
          @type tail
          format multiline
          multiline_flush_interval 5s
          format_firstline /^\w\d{4}/
          format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
          time_format %m%d %H:%M:%S.%N
          path /var/log/kube-apiserver.log
          pos_file /var/log/es-kube-apiserver.log.pos
          tag kube-apiserver
        </source>
    
        # Example:
        # I0204 06:55:31.872680       5 servicecontroller.go:277] LB already exists and doesn't need update for service kube-system/kube-ui
        <source>
          @id kube-controller-manager.log
          @type tail
          format multiline
          multiline_flush_interval 5s
          format_firstline /^\w\d{4}/
          format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
          time_format %m%d %H:%M:%S.%N
          path /var/log/kube-controller-manager.log
          pos_file /var/log/es-kube-controller-manager.log.pos
          tag kube-controller-manager
        </source>
    
        # Example:
        # W0204 06:49:18.239674       7 reflector.go:245] pkg/scheduler/factory/factory.go:193: watch of *api.Service ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [2578313/2577886]) [2579312]
        <source>
          @id kube-scheduler.log
          @type tail
          format multiline
          multiline_flush_interval 5s
          format_firstline /^\w\d{4}/
          format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
          time_format %m%d %H:%M:%S.%N
          path /var/log/kube-scheduler.log
          pos_file /var/log/es-kube-scheduler.log.pos
          tag kube-scheduler
        </source>
    
        # Example:
        # I1104 10:36:20.242766       5 rescheduler.go:73] Running Rescheduler
        <source>
          @id rescheduler.log
          @type tail
          format multiline
          multiline_flush_interval 5s
          format_firstline /^\w\d{4}/
          format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
          time_format %m%d %H:%M:%S.%N
          path /var/log/rescheduler.log
          pos_file /var/log/es-rescheduler.log.pos
          tag rescheduler
        </source>
    
        # Example:
        # I0603 15:31:05.793605       6 cluster_manager.go:230] Reading config from path /etc/gce.conf
        <source>
          @id glbc.log
          @type tail
          format multiline
          multiline_flush_interval 5s
          format_firstline /^\w\d{4}/
          format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
          time_format %m%d %H:%M:%S.%N
          path /var/log/glbc.log
          pos_file /var/log/es-glbc.log.pos
          tag glbc
        </source>
    
        # Example:
        # I0603 15:31:05.793605       6 cluster_manager.go:230] Reading config from path /etc/gce.conf
        <source>
          @id cluster-autoscaler.log
          @type tail
          format multiline
          multiline_flush_interval 5s
          format_firstline /^\w\d{4}/
          format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
          time_format %m%d %H:%M:%S.%N
          path /var/log/cluster-autoscaler.log
          pos_file /var/log/es-cluster-autoscaler.log.pos
          tag cluster-autoscaler
        </source>
    
        # Logs from systemd-journal for interesting services.
        # TODO(random-liu): Remove this after cri container runtime rolls out.
        <source>
          @id journald-docker
          @type systemd
          filters [{ "_SYSTEMD_UNIT": "docker.service" }]
          <storage>
            @type local
            persistent true
          </storage>
          read_from_head true
          tag docker
        </source>
    
        <source>
          @id journald-container-runtime
          @type systemd
          filters [{ "_SYSTEMD_UNIT": "{{ container_runtime }}.service" }]
          <storage>
            @type local
            persistent true
          </storage>
          read_from_head true
          tag container-runtime
        </source>
    
        <source>
          @id journald-kubelet
          @type systemd
          filters [{ "_SYSTEMD_UNIT": "kubelet.service" }]
          <storage>
            @type local
            persistent true
          </storage>
          read_from_head true
          tag kubelet
        </source>
    
        <source>
          @id journald-node-problem-detector
          @type systemd
          filters [{ "_SYSTEMD_UNIT": "node-problem-detector.service" }]
          <storage>
            @type local
            persistent true
          </storage>
          read_from_head true
          tag node-problem-detector
        </source>
        
        <source>
          @id kernel
          @type systemd
          filters [{ "_TRANSPORT": "kernel" }]
          <storage>
            @type local
            persistent true
          </storage>
          <entry>
            fields_strip_underscores true
            fields_lowercase true
          </entry>
          read_from_head true
          tag kernel
        </source>
    
      forward.input.conf: |-
        # Takes the messages sent over TCP
        <source>
          @type forward
        </source>
    
      monitoring.conf: |-
        # Prometheus Exporter Plugin
        # input plugin that exports metrics
        <source>
          @type prometheus
        </source>
    
        <source>
          @type monitor_agent
        </source>
    
        # input plugin that collects metrics from MonitorAgent
        <source>
          @type prometheus_monitor
          <labels>
            host ${hostname}
          </labels>
        </source>
    
        # input plugin that collects metrics for output plugin
        <source>
          @type prometheus_output_monitor
          <labels>
            host ${hostname}
          </labels>
        </source>
    
        # input plugin that collects metrics for in_tail plugin
        <source>
          @type prometheus_tail_monitor
          <labels>
            host ${hostname}
          </labels>
        </source>
    
      output.conf: |-
        # Enriches records with Kubernetes metadata
        <filter kubernetes.**>
          @type kubernetes_metadata
        </filter>
    
        <match **>
          @type forward
          require_ack_response true
          ack_response_timeout 30
          recover_wait 10s
          heartbeat_interval 1s
          phi_threshold 16
          send_timeout 10s
          hard_timeout 10s
          expire_dns_cache 15
          heartbeat_type tcp
          buffer_chunk_limit 2M
          buffer_queue_limit 64
          flush_interval 5s
          max_retry_wait 15
          disable_retry_limit
          num_threads 8
          
          <server>
            name fluentd-server
            host fluentd-server
            port 24224
            weight 100
          </server>
        </match>
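
    To make the parsing rules in the ConfigMap above easier to follow, here is a minimal Python sketch of two of them: the multi_format fallback for container logs (JSON first, then the CRI pattern) and the glog pattern shared by the kube-* sources. It is not how Fluentd implements parsing. The function names are hypothetical, the regular expressions are ported from Ruby syntax (`(?<name>...)`) to Python syntax (`(?P<name>...)`), and the timestamps are left as strings rather than converted with time_format.

```python
import json
import re

# CRI pattern, ported from the second <pattern> of the multi_format parser.
CRI_RE = re.compile(r"^(?P<time>.+) (?P<stream>stdout|stderr) [^ ]* (?P<log>.*)$")

# glog pattern, ported from the format1 expression of the kube-* sources.
GLOG_RE = re.compile(
    r"^(?P<severity>\w)(?P<time>\d{4} [^\s]*)\s+(?P<pid>\d+)\s+"
    r"(?P<source>[^ \]]+)\] (?P<message>.*)"
)


def parse_container_line(line):
    """Try the JSON pattern first, then fall back to the CRI pattern."""
    try:
        record = json.loads(line)
        if isinstance(record, dict) and "log" in record:
            return {"format": "json", "stream": record.get("stream"), "log": record["log"]}
    except ValueError:
        pass  # not JSON, so try the next pattern
    match = CRI_RE.match(line)
    if match:
        return {"format": "cri", "stream": match.group("stream"), "log": match.group("log")}
    return None


def parse_glog_line(line):
    """Split a glog line into severity, time, pid, source and message."""
    match = GLOG_RE.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Sample lines taken from the comments in the ConfigMap.
    print(parse_container_line(
        "2016-02-17T00:04:05.931087621Z stdout F "
        "[info:2016-02-16T16:04:05.930-08:00] Some log text here"
    ))
    print(parse_glog_line(
        "I0204 07:32:30.020537    3368 server.go:1048] POST /stats/container/: "
        "(13.972191ms) 200 [[Go-http-client/1.1] 10.244.1.3:40537]"
    ))
```

    Running sample lines through a sketch like this before editing a pattern in the ConfigMap is a cheap way to catch a regular expression that no longer matches.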
    

    Fluentd Pod Yaml

    We also create fluentd-standalone.yaml to control how the Fluentd Pods are started.
    Note that this YAML also uses a nodeSelector. Unlike the Fluent Bit manifest, the selector here is
    beta.kubernetes.io/fluentd-ds-ready = "fluentd".

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: fluentd-es
      namespace: kube-system
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    ---
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: fluentd-es
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    rules:
    - apiGroups:
      - ""
      resources:
      - "namespaces"
      - "pods"
      verbs:
      - "get"
      - "watch"
      - "list"
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: fluentd-es
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    subjects:
    - kind: ServiceAccount
      name: fluentd-es
      namespace: kube-system
      apiGroup: ""
    roleRef:
      kind: ClusterRole
      name: fluentd-es
      apiGroup: ""
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: fluentd-es-v2.0.4
      namespace: kube-system
      labels:
        k8s-app: fluentd-es
        version: v2.0.4
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    spec:
      selector:
        matchLabels:
          k8s-app: fluentd-es
          version: v2.0.4
      template:
        metadata:
          labels:
            k8s-app: fluentd-es
            kubernetes.io/cluster-service: "true"
            version: v2.0.4
          # This annotation ensures that fluentd does not get evicted if the node
          # supports critical pod annotation based priority scheme.
          # Note that this does not guarantee admission on the nodes (#40573).
          annotations:
            scheduler.alpha.kubernetes.io/critical-pod: ''
        spec:
          priorityClassName: system-node-critical
          serviceAccountName: fluentd-es
          containers:
          - name: fluentd-es
            #image: k8s.gcr.io/fluentd-elasticsearch:v2.0.4
            image: hub.***.***/google_containers/fluentd-s3:v2.1.0
            imagePullPolicy: Always
            env:
            - name: FLUENTD_ARGS
              value: --no-supervisor -q
            resources:
              limits:
                memory: 500Mi
              requests:
                cpu: 100m
                memory: 200Mi
            volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: config-volume
              mountPath: /etc/fluent/config.d
          nodeSelector:
            beta.kubernetes.io/fluentd-ds-ready: "fluentd"
          terminationGracePeriodSeconds: 30
          volumes:
          - name: varlog
            hostPath:
              path: /var/log
          - name: varlibdockercontainers
            hostPath:
              path: /var/lib/docker/containers
          - name: config-volume
            configMap:
              name: fluentd-sa-config-v0.1.4
          imagePullSecrets:
          - name: kube-sec
    

    Starting the Fluentd DaemonSet

    Create the ConfigMap that Fluentd needs.

    [centos@master1 efk]$ kubectl create -f fluentd-standalone-configmap.yaml 
    

    Start the Fluentd DaemonSet.

    [centos@master1 efk]$ kubectl create -f fluentd-standalone.yaml 
    

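    As when starting Fluent Bit, checking the Pods at this point shows only the five Fluentd Server Pods. No DaemonSet Pod has started yet, because no node carries the label that the nodeSelector requires.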

    [centos@master1 efk]$ kubectl get pods -n kube-system -o wide | grep fluentd
    fluentd-server-v2.0.4-855db7cfc5-4wn47   1/1       Running   0          6h        10.244.29.20     minion6
    fluentd-server-v2.0.4-855db7cfc5-pfmvd   1/1       Running   0          6h        10.244.3.211     minion17
    fluentd-server-v2.0.4-855db7cfc5-rjqxl   1/1       Running   0          6h        10.244.13.47     minion19
    fluentd-server-v2.0.4-855db7cfc5-shjfm   1/1       Running   0          6h        10.244.23.141    minion12
    fluentd-server-v2.0.4-855db7cfc5-w7m5f   1/1       Running   0          6h        10.244.30.233    minion5
    [centos@master1 efk]$ 
    

    Next we label the nodes so that the DaemonSet Pods are scheduled onto them. Leave a short interval between nodes.

    [centos@master1 efk]$ kubectl label node minion8 beta.kubernetes.io/fluentd-ds-ready=fluentd
    [centos@master1 efk]$ kubectl label node minion21 beta.kubernetes.io/fluentd-ds-ready=fluentd
    [centos@master1 efk]$ kubectl label node minion9 beta.kubernetes.io/fluentd-ds-ready=fluentd
    [centos@master1 efk]$ kubectl label node minion16 beta.kubernetes.io/fluentd-ds-ready=fluentd
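
    The DaemonSet only lands on a node when every key in its nodeSelector is present on that node with exactly the same value. A minimal Python sketch of that matching rule follows. The selector is taken from fluentd-standalone.yaml; the function name and the three example nodes are hypothetical and serve only as illustration.

```python
def node_matches(node_labels, node_selector):
    """Return True when every selector key exists on the node with an equal value."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


# Selector from fluentd-standalone.yaml.
SELECTOR = {"beta.kubernetes.io/fluentd-ds-ready": "fluentd"}

# Hypothetical nodes, for illustration only.
NODES = {
    "labelled-fluentd": {"beta.kubernetes.io/fluentd-ds-ready": "fluentd"},
    "labelled-true": {"beta.kubernetes.io/fluentd-ds-ready": "true"},
    "unlabelled": {"kubernetes.io/hostname": "unlabelled"},
}

eligible = sorted(name for name, labels in NODES.items() if node_matches(labels, SELECTOR))

if __name__ == "__main__":
    print(eligible)
```

    Only the node whose label value is exactly "fluentd" is eligible. A node that has the same key with a different value, or no such key at all, is skipped by the scheduler.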
    

    Check the Pods again to confirm that they have started:

    [centos@master1 efk]$ kubectl get pods -n kube-system -o wide | grep fluentd
    fluentd-es-v2.0.4-75gdf                  1/1       Running   1          4d        10.244.27.245    minion8
    fluentd-es-v2.0.4-kx5pz                  1/1       Running   0          2h        10.244.10.96     minion21
    fluentd-es-v2.0.4-n89xj                  1/1       Running   6          4d        10.244.26.92     minion9
    fluentd-es-v2.0.4-zsrln                  1/1       Running   0          6h        10.244.19.67     minion16
    fluentd-server-v2.0.4-855db7cfc5-4wn47   1/1       Running   0          6h        10.244.29.20     minion6
    fluentd-server-v2.0.4-855db7cfc5-pfmvd   1/1       Running   0          6h        10.244.3.211     minion17
    fluentd-server-v2.0.4-855db7cfc5-rjqxl   1/1       Running   0          6h        10.244.13.47     minion19
    fluentd-server-v2.0.4-855db7cfc5-shjfm   1/1       Running   0          6h        10.244.23.141    minion12
    fluentd-server-v2.0.4-855db7cfc5-w7m5f   1/1       Running   0          6h        10.244.30.233    minion5
    [centos@master1 efk]$ 
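
    With more than a handful of nodes, reading this listing by eye becomes tedious. The following Python sketch shows one way to automate the check, assuming the seven-column layout shown above (NAME, READY, STATUS, RESTARTS, AGE, IP, NODE); newer kubectl versions may print additional columns. The function names are hypothetical, and the sample rows are copied from the listing above.

```python
SAMPLE = """\
fluentd-es-v2.0.4-75gdf                  1/1       Running   1          4d        10.244.27.245    minion8
fluentd-es-v2.0.4-kx5pz                  1/1       Running   0          2h        10.244.10.96     minion21
fluentd-es-v2.0.4-n89xj                  1/1       Running   6          4d        10.244.26.92     minion9
fluentd-es-v2.0.4-zsrln                  1/1       Running   0          6h        10.244.19.67     minion16
fluentd-server-v2.0.4-855db7cfc5-4wn47   1/1       Running   0          6h        10.244.29.20     minion6
"""


def parse_pod_rows(text):
    """Parse rows of the form NAME READY STATUS RESTARTS AGE IP NODE."""
    pods = []
    for line in text.splitlines():
        fields = line.split()
        # Skip prompts, header rows and anything that is not a seven-column pod row.
        if len(fields) != 7 or not fields[3].isdigit():
            continue
        name, _ready, status, restarts, _age, _ip, node = fields
        pods.append({"name": name, "status": status, "restarts": int(restarts), "node": node})
    return pods


def nodes_missing_daemonset_pod(pods, labelled_nodes, prefix="fluentd-es-"):
    """Return the labelled nodes that have no Running pod whose name starts with prefix."""
    running = {p["node"] for p in pods if p["name"].startswith(prefix) and p["status"] == "Running"}
    return sorted(set(labelled_nodes) - running)


pods = parse_pod_rows(SAMPLE)
missing = nodes_missing_daemonset_pod(pods, ["minion8", "minion21", "minion9", "minion16"])
restarted = [p["name"] for p in pods if p["restarts"] > 0]

if __name__ == "__main__":
    print("nodes without a running DaemonSet pod:", missing)
    print("pods that have restarted:", restarted)
```

    An empty list of missing nodes means every labelled node has a running DaemonSet Pod. The pods that have restarted are worth a look with kubectl logs.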
    

    As with Fluent Bit, go to Kibana and verify that logs from these nodes are arriving in ES and S3 as expected. The details are not repeated here.
