Using Ansible: the way of playbooks

Author: 9c46ece5b7bd | Published 2017-04-24 21:48, read 279 times

    A guide to using Playbooks

    1. Hosts and users

    In the YAML file, hosts specifies a host group or a pattern matching hosts, with multiple entries separated by commas;
    remote_user specifies the remote user to run as; sudo makes the remote user run commands with sudo privileges.
    Note: each task can also define its own remote user, and a single task can use sudo even when the play as a whole does not. (In newer Ansible releases, sudo/sudo_user have been replaced by become/become_user.)

    ---
    - hosts: webservers
      remote_user: pe
      vars:
        http_port: 80
        max_clients: 200
    #  sudo: yes
      tasks:
        - name: Test connectivity
          ping:
          sudo: yes
        - name: Sync the nginx config (port {{ http_port }})   # a variable defined under vars can be referenced anywhere
          template: src=/srv/nginx.j2 dest=/etc/nginx.conf  # sync the nginx configuration file
          #service: name=nginx state=restarted
          sudo: yes          # use sudo in this one task
          sudo_user: supdev  # switch to another user via sudo
          notify:            # run when the file has been changed
            - restart nginx  # "restart nginx" is defined under handlers at the end

        - name: Set SELinux to permissive
          command: /sbin/setenforce 0
        - name: Run a command whose failure should be ignored
          shell: /usr/bin/somecommand || /bin/true  # force exit code 0 when a "successful" command returns non-zero
          ignore_errors: True                       # alternatively, ignore the failure with this parameter
        - name: Copy a file
          copy: src=/etc/ansible/hosts dest=/etc/ansible/hosts
                owner=root group=root mode=0644
      handlers:
        - name: restart nginx
          service: name=nginx state=restarted
    

    Note: if sudo requires a password, add --ask-sudo-pass when running the ansible-playbook command (newer releases use --ask-become-pass).
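The "|| /bin/true" idiom shown in the play above works at the shell level: it replaces a non-zero exit code with 0, so Ansible treats the command as successful. A quick sketch in plain shell, outside Ansible:

```shell
# "false" always exits non-zero; "|| /bin/true" masks that failure,
# so the overall exit code becomes 0 (which is what Ansible checks).
false || /bin/true
echo "rc=$?"   # prints: rc=0
```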

    2. The tasks list

    Notes:

    1. Each play contains a list of tasks; a task must finish on all of its targeted hosts before the next task starts.
    2. Playbooks run from top to bottom; if a host fails a task, that host is dropped from the rotation for the rest of the playbook.
    3. The goal of each task is to run a module, usually with specific arguments; variables can be used in those arguments.
    Examples: the shell, command, user, template (copy), service, and yum modules, each followed by its own arguments.
    4. Every task should have a name, so each task's progress can be identified clearly in the run output.
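A minimal sketch of points 3 and 4 above: one module with its arguments, a variable reference, and a task name (the pkg_name variable and the yum usage here are illustrative, not from the original article):

```yaml
tasks:
  - name: Install {{ pkg_name }}            # point 4: give every task a name
    yum: name={{ pkg_name }} state=present  # point 3: one module plus its arguments
```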

    3. Handlers: actions run on change

    When a task changes a configuration file (i.e. the task reports a changed state), the entries listed under notify trigger the matching operations defined under handlers.

    - name: template configuration file
      template: src=template.j2 dest=/etc/foo.conf
      notify:
         - restart memcached
         - restart apache
    handlers:
        - name: restart memcached
          service: name=memcached state=restarted
        - name: restart apache
          service: name=apache state=restarted
    
    

    Note: handlers run in the order they are declared in the handlers section, not the order they are notified. The best use cases for handlers are restarting services or triggering a system reboot.

    4. Running a playbook

    ansible-playbook playbooks.yml -f 10  # run ansible in parallel with 10 forks
    

    5. Using Ansible-Pull (pulling configuration)

    Ansible-pull is a small script that checks out a repo of configuration instructions from git and then runs ansible-playbook against that checkout. A typical invocation looks like: ansible-pull -U <repo-url> local.yml (the repo URL and playbook name here are placeholders).

    6. Tips and tricks

    When running playbooks, use the --verbose flag if you want to see the output of modules that succeeded (otherwise only failed modules produce output).
    Before executing a playbook, if you want to see which hosts it would affect, you can do this:

    ansible-playbook playbook.yml --list-hosts
    

    Appendix: Playbook case studies

    1. Using playbooks to adjust an application's JVM settings

    Directory layout:

    sh-4.1$ tree 
    .
    ├── playbooks.yml
    ├── start.sh.j2
    ├── stop.sh.j2
    └── vars.yml
    

    The playbooks.yml configuration:

    ---
    #file: playbooks.yml
    - hosts: local
    #  remote_user: pe
    #  sudo: yes
      vars:
        service: "Nginx service"
      vars_files:
        - vars.yml
      tasks:
      - name: "{{ service }} connectivity test {{ ansible_date_time.iso8601 }}"
        ping:
      - name: Update the tomcat start/stop scripts
        remote_user: pe
        sudo: yes
        template:
    #      src: "start.sh.j2"
    #      dest: "/tmp/start{{ ansible_date_time.iso8601_basic }}.sh"
    #      src: "stop.sh.j2"
    #      dest: "/tmp/stop{{ ansible_date_time.iso8601_basic }}.sh"
          src: "{{ item.src }}"
          dest: "{{ item.dest }}"
          owner: admin
          group: admin
          mode: 0755
        with_items:
          - { src: "start.sh.j2", dest: "/tmp/start{{ ansible_date_time.iso8601 }}.sh" }
          - { src: "stop.sh.j2", dest: "/tmp/stop{{ ansible_date_time.iso8601 }}.sh" }
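As a side note: on Ansible 2.5 and later, with_items loops are typically written with the loop keyword instead. A minimal sketch of the same template loop (dest paths simplified here for illustration):

```yaml
tasks:
  - name: Render the start/stop scripts
    template:
      src: "{{ item.src }}"
      dest: "{{ item.dest }}"
      mode: 0755
    loop:
      - { src: "start.sh.j2", dest: "/tmp/start.sh" }
      - { src: "stop.sh.j2", dest: "/tmp/stop.sh" }
```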
    

    The variable definitions file vars.yml:

    ---
    # tomcat version
    tomcat_version: tomcat6.0.33
    # jdk version
    jdk_version: jdk1.6.0_25

    # application name
    app_name: xxbandy.test.local
    # server id
    server_id: 1
    

    The start.sh.j2 template file:

    #!/bin/bash
    
    #chown 555 -R /export/home/tomcat/domains/
    export CATALINA_HOME=/export/servers/{{ tomcat_version }}
    export CATALINA_BASE=/export/Domains/{{ app_name }}/server{{ server_id }}
    export CATALINA_PID=$CATALINA_BASE/work/catalina.pid
    export LANG=zh_CN.UTF-8
    ###JAVA
    export JAVA_HOME=/export/servers/{{ jdk_version }}
    export JAVA_BIN=/export/servers/{{ jdk_version }}/bin
    export PATH=$JAVA_BIN:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/bin
    export CLASSPATH=.:/lib/dt.jar:/lib/tools.jar
    export JAVA_OPTS="-Djava.library.path=/usr/local/lib -server -Xms2048m -Xmx2048m -XX:MaxPermSize=512m -XX:+UnlockExperimentalVMOptions -Djava.awt.headless=true -Dsun.net.client.defaultConnectTimeout=60000 -Dsun.net.client.defaultReadTimeout=60000 -Djmagick.systemclassloader=no -Dnetworkaddress.cache.ttl=300 -Dsun.net.inetaddr.ttl=300 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$CATALINA_BASE/logs -XX:ErrorFile=$CATALINA_BASE/logs/java_error_%p.log"
    export JAVA_HOME JAVA_BIN PATH CLASSPATH JAVA_OPTS
    $CATALINA_HOME/bin/startup.sh -config $CATALINA_BASE/conf/server.xml
    
    

    2. Using playbooks to update the telegraf monitoring agent configuration on docker hosts

    The telegraf.conf template file:

    [global_tags]
      dc = "bigdata-1"
      # dc = "us-east-1" # will tag all metrics with dc=us-east-1
      # rack = "1a"
      ## Environment variables can be used as tags, and throughout the config file
      # user = "$USER"
    [agent]
      ## Default data collection interval for all inputs
      # data collection interval
      interval = "10s"
      ## Rounds collection interval to 'interval'
      ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
      # round collection timestamps to the interval
      round_interval = true
      ## Telegraf will send metrics to outputs in batches of at
      ## most metric_batch_size metrics.
      # batch size of metrics sent to each output
      metric_batch_size = 1000
      ## For failed writes, telegraf will cache metric_buffer_limit metrics for each
      ## output, and will flush this buffer on a successful write. Oldest metrics
      ## are dropped first when this buffer fills.
      # per-output buffer of metrics kept for failed writes
      metric_buffer_limit = 10000
      ## Collection jitter is used to jitter the collection by a random amount.
      ## Each plugin will sleep for a random time within jitter before collecting.
      ## This can be used to avoid many plugins querying things like sysfs at the
      ## same time, which can have a measurable effect on the system.
      # jitter collection times so inputs do not all query at the same moment
      collection_jitter = "0s"
      ## Default flushing interval for all outputs. You shouldn't set this below
      ## interval. Maximum flush_interval will be flush_interval + flush_jitter
      # default interval for flushing data to outputs (at most flush_interval + flush_jitter)
      flush_interval = "10s"
      ## Jitter the flush interval by a random amount. This is primarily to avoid
      ## large write spikes for users running a large number of telegraf instances.
      ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
      # random jitter added to the flush interval
      flush_jitter = "0s"
      ## By default, precision will be set to the same timestamp order as the
      ## collection interval, with the maximum being 1s.
      ## Precision will NOT be used for service inputs, such as logparser and statsd.
      ## Valid values are "ns", "us" (or "µs"), "ms", "s".
      precision = ""
      ## Run telegraf in debug mode
      debug = false
      ## Run telegraf in quiet mode
      quiet = false
      ## Override default hostname, if empty use os.Hostname()
      hostname = ""
      ## If set to true, do not set the "host" tag in the telegraf agent.
      omit_hostname = false
    [[outputs.influxdb]]
      ## The full HTTP or UDP endpoint URL for your InfluxDB instance.
      ## Multiple urls can be specified as part of the same cluster,
      ## this means that only ONE of the urls will be written to each interval.
      # urls = ["udp://localhost:8089"] # UDP endpoint example
      urls = ["http://10.0.0.1:8086"] # required
      ## The target database for metrics (telegraf will create it if not exists).
      database = "bigdata" # required
      ## Retention policy to write to. Empty string writes to the default rp.
      retention_policy = ""
      ## Write consistency (clusters only), can be: "any", "one", "quorum", "all"
      write_consistency = "any"
      ## Write timeout (for the InfluxDB client), formatted as a string.
      ## If not provided, will default to 5s. 0s means no timeout (not recommended).
      timeout = "5s"
      # username = "telegraf"
      # password = "metricsmetricsmetricsmetrics"
      ## Set the user agent for HTTP POSTs (can be useful for log differentiation)
      # user_agent = "telegraf"
      ## Set UDP payload size, defaults to InfluxDB UDP Client default (512 bytes)
      # udp_payload = 512
      ## Optional SSL Config
      # ssl_ca = "/etc/telegraf/ca.pem"
      # ssl_cert = "/etc/telegraf/cert.pem"
      # ssl_key = "/etc/telegraf/key.pem"
      ## Use SSL but skip chain & host verification
      # insecure_skip_verify = false
    [[inputs.cpu]]
      ## Whether to report per-cpu stats or not
      percpu = true
      ## Whether to report total system cpu stats or not
      totalcpu = true
      ## Comment this line if you want the raw CPU time metrics
      fielddrop = ["time_*"]
    [[inputs.disk]]
      ## By default, telegraf gather stats for all mountpoints.
      ## Setting mountpoints will restrict the stats to the specified mountpoints.
      mount_points = ["/export"]
      fieldpass = ["inodes*"]
      ## Ignore some mountpoints by filesystem type. For example (dev)tmpfs (usually
      ## present on /run, /var/run, /dev/shm or /dev).
      ## By default, telegraf will gather stats for all devices including
      ## disk partitions.
      ## Setting devices will restrict the stats to the specified devices.
      # devices = ["sda", "sdb"]
      ## Uncomment the following line if you need disk serial numbers.
      # skip_serial_number = false
    [[inputs.mem]]
      # no configuration
    [[inputs.docker]]
      endpoint = "tcp://127.0.0.1:5256"
      container_names = []
    

    The playbook that deploys the configuration and restarts telegraf:

    ---
    #file: playbooks.yml
    - hosts: bigdata
      remote_user: root
      vars:
        service: "dockers telegraf update"
      tasks:
      - name: "{{ service }} connectivity test {{ ansible_date_time.iso8601 }}"
        ping:
      - name: "{{ service }}: update config file"
        template:
          src: "telegraf.j2"
          dest: "/etc/telegraf/telegraf.conf"
        notify: restart telegraf

      handlers:
        - name: restart telegraf
          service: name=telegraf state=restarted
    

    Run output:

    sh-4.2# ansible-playbook telegraf.yml 
     [WARNING]: While constructing a mapping from /export/ansible/telegraf.yml, line 3, column 3, found a duplicate dict key (tasks). Using last
    defined value only.
    
    
    PLAY [bigdata] *****************************************************************
    
    TASK [setup] *******************************************************************
    ok: [10.0.0.1]
    ok: [10.0.0.2]
    ok: [10.0.0.3]
    ok: [10.0.0.4]
    ok: [10.0.0.5]
    
    TASK [dockers telegraf update: update config file] *****************************
    ok: [10.0.0.1]
    ok: [10.0.0.2]
    ok: [10.0.0.3]
    ok: [10.0.0.4]
    ok: [10.0.0.5]
    
    PLAY RECAP *********************************************************************
    10.0.0.1              : ok=2    changed=0    unreachable=0    failed=0
    10.0.0.2              : ok=2    changed=0    unreachable=0    failed=0
    10.0.0.3              : ok=2    changed=0    unreachable=0    failed=0
    10.0.0.4              : ok=2    changed=0    unreachable=0    failed=0
    10.0.0.5              : ok=2    changed=0    unreachable=0    failed=0
    

    Note: the [WARNING] in the output appears because the play file defined the tasks: key twice; YAML keeps only the last value for a duplicate key, so the ping task was silently dropped (the two task lists must be merged under a single tasks: key). And because the rendered configuration file had not changed, the restart telegraf handler was not triggered.

    Original link: https://www.haomeiwen.com/subject/oeqmzttx.html