Varnish (primarily used as a cache)

Author: 尛尛大尹 | Published: 2017-11-15 20:15

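    1. Introduction

    HTTP is the dominant protocol used by today's services
    RESTful
    Caching:
               Proxy cache (proxy): similar to a recursive lookup
               Look-aside cache: similar to an iterative lookup
            (Recursive: the cache hands back the answer directly; if it does not have it, it fetches the answer on the client's behalf)
            (Iterative: the client asks each level itself, one after another)
    Caches rely on an expiration mechanism and on conditional requests
    Varnish is a proxy-style cache
                   Forward proxy: sends requests on behalf of the client
                   Reverse proxy: answers on behalf of the server
    httpd can act as both a forward proxy and a reverse proxy
    Nginx is a proxy at its core and can also cache
    (squid relates to varnish roughly as httpd relates to Nginx)
    squid    : httpd
    varnish : Nginx
                     Nginx uses epoll for event-driven I/O
    C10K: 10k concurrent connections
    C100K
    Varnish sits between clients and servers as a reverse proxy and is used as a cache
    MySQL is a relational database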
    

    [Figures: day-to-day operations]

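    2. Web Page Cache:

                 squid --> varnish
                  Program execution exhibits locality:
                              Temporal locality: data that has just been accessed is likely to be accessed again soon
                              Spatial locality: when a piece of data is accessed, nearby data is likely to be accessed as well
                   cache: hit
                                Hot spots: a consequence of locality (much like product recommendations on a shopping site, which are both time-sensitive and localized)
                                              Time sensitivity:
                                                    Cache space exhausted: evict by LRU (least recently used);
                                                    Expired entries: cleaned out of the cache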
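    The LRU eviction mentioned above can be illustrated with a small Python sketch. The class name `LRUCache` and its capacity are hypothetical, and this is only a toy model of the idea, not how Varnish manages its storage.

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache: evicts the least recently used key when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None               # miss
        self.items.move_to_end(key)   # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used entry
```

    With a capacity of 2, storing "/a" and "/b", reading "/a", and then storing "/c" evicts "/b", because "/a" was touched more recently.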
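    Cache hit ratio: hit / (hit + miss), i.e. hits divided by the sum of hits and misses
                        The value lies between 0 and 1. hits + misses does not necessarily equal the total number of requests, because some requests bypass the cache
                        Page hit ratio: measured by the number of pages served from cache
                        Byte hit ratio: measured by the volume (bytes) of the pages served from cache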
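    As a worked example of the formula, the following Python sketch computes both ratios. The counter values are made up for illustration.

```python
def hit_ratio(hits, misses):
    """Return hits / (hits + misses), or 0.0 when nothing was looked up."""
    total = hits + misses
    return hits / total if total else 0.0

# Hypothetical counters: 90 hits and 10 misses by request count,
# 400 kB served from cache and 600 kB fetched from the backend.
page_ratio = hit_ratio(90, 10)
byte_ratio = hit_ratio(400_000, 600_000)
```

    The two ratios can differ considerably: many small objects served from cache give a high page hit ratio even when a few large misses dominate the traffic volume.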
    Cacheable or not:
                   Private data: private; may only be stored in a private cache
                   Public data: public; may be stored in a public or a private cache
    Cache-related header fields
                The most important caching header fields are:
                   Expires: the expiration time, for example
                   Expires: Thu, 22 Oct 2026 06:34:30 GMT
                   Cache-Control: max-age=

                   Etag
                   If-None-Match

                   Last-Modified
                   If-Modified-Since

                   Vary
                   Age
     
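    Mechanisms for judging cache validity:
    Expiration time:
    HTTP/1.0
    Expires: an absolute expiration date
    HTTP/1.1
    Cache-Control: max-age=
    Cache-Control: s-maxage=
    Conditional requests:
    Last-Modified/If-Modified-Since: decides based on the file's modification timestamp;
    Etag/If-None-Match: decides based on a checksum-like validator of the file;
    Example response headers:
    Expires: Thu, 13 Aug 2026 02:05:12 GMT
    Cache-Control: max-age=315360000
    ETag: "1ec5-502264e2ae4c0"
    Last-Modified: Wed, 03 Sep 2014 10:00:27 GMT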
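    The two mechanisms can be sketched in Python as follows. The function names are hypothetical and the logic is deliberately simplified: a real cache also computes the age from the Date and Age headers and handles weak validators.

```python
def is_fresh(age_seconds, max_age):
    """A cached response is fresh while its age is below max-age."""
    return age_seconds < max_age

def conditional_status(if_none_match, current_etag):
    """Status an origin server would return for an If-None-Match revalidation."""
    # 304 Not Modified: the cached copy is still valid, no body is sent.
    # 200 OK: the resource changed, a full response is sent.
    return 304 if if_none_match == current_etag else 200
```

    A fresh object is served without contacting the origin at all; only a stale object triggers the conditional request.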
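    Cache hierarchy:
          Private cache: the local cache built into the user agent;
          Public cache: the cache of a reverse proxy server;
          User-Agent <--> private cache <--> public cache <--> public cache 2 <--> Original Server
    Request directives tell the cache how it may use cached content to answer the request:
    (the following directives are available in requests)
    cache-request-directive =
    "no-cache"
    | "no-store"
    | "max-age" "=" delta-seconds
    | "max-stale" [ "=" delta-seconds ]
    | "min-fresh" "=" delta-seconds
    | "no-transform"
    | "only-if-cached"
    | cache-extension
    Response directives tell the cache server how to store the content returned by the upstream server:
    (the following directives are available in responses)
    cache-response-directive =
    "public"
    | "private" [ "=" <"> 1#field-name <"> ]
    | "no-cache" [ "=" <"> 1#field-name <"> ]   (may be stored, but must be revalidated with a conditional request before being served to a client)
    | "no-store"   (the response must not be stored in the cache)
    | "no-transform"
    | "must-revalidate"
    | "proxy-revalidate"
    | "max-age" "=" delta-seconds   (maximum cache lifetime, i.e. the expiration time)
    | "s-maxage" "=" delta-seconds   (maximum cache lifetime, applies to public/shared caches only)
    | cache-extension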
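    To make the response directives concrete, here is a minimal Python sketch of how a shared (public) cache might derive a storage lifetime from a Cache-Control value. The function names are hypothetical, and the parser is simplified: it does not handle quoted arguments that contain commas, ignores no-cache revalidation, and returns 0 when no explicit lifetime is given, whereas a real cache may apply a default TTL.

```python
def parse_cache_control(value):
    """Split a Cache-Control header into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or None
    return directives

def shared_cache_ttl(value):
    """Seconds a shared cache may store the response, or 0 if it must not."""
    d = parse_cache_control(value)
    if "no-store" in d or "private" in d:
        return 0
    for name in ("s-maxage", "max-age"):   # s-maxage wins for shared caches
        if d.get(name) is not None:
            return int(d[name])
    return 0
```

    For "public, max-age=60, s-maxage=300" the sketch returns 300, since s-maxage takes precedence over max-age in a shared cache.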
    Open-source solutions:
       squid
       varnish
       Official Varnish site: http://www.varnish-cache.org/
    Editions: Community, Enterprise
     This is Varnish Cache, a high-performance HTTP accelerator.

    [Figure: request processing flow in Varnish 2.0 and 3.0]

    [Figure: request processing flow in Varnish 4.0]

    Varnish program environment (by default only GET and HEAD requests are cached)

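    3. Varnish program architecture:

    Varnish consists of a Manager process and a Cacher process, plus a shared memory log component
    Manager process (the master process)
    Cacher process, which contains several types of threads:
    accept, worker, expiry, ...
    (The Cacher handles all cache-related work, such as processing requests, managing the cache, and cleaning up expired objects)
    shared memory log:
    (To keep logging from becoming a performance bottleneck, log data is written directly to memory)
    Statistics: counters;
    Log area: log records;
    varnishlog, varnishncsa, varnishstat...
    Configuration interface: VCL
    Varnish Configuration Language,
    vcl compiler --> c compiler --> shared object
    /etc/varnish/varnish.params: configures how the varnish service process runs, e.g. the listening address and port and the cache storage mechanism;
    /etc/varnish/default.vcl: configures the caching policy applied by the Child/Cache process;
    Main program:
    /usr/sbin/varnishd
    CLI interface:
    /usr/bin/varnishadm
    Tools that read the Shared Memory Log:
    /usr/bin/varnishhist
    /usr/bin/varnishlog
    /usr/bin/varnishncsa
    /usr/bin/varnishstat
    /usr/bin/varnishtop
    Test tool:
    /usr/bin/varnishtest
    VCL reload program:
    /usr/sbin/varnish_reload_vcl
    Systemd Unit Files:
    /usr/lib/systemd/system/varnish.service
    the varnish service itself
    /usr/lib/systemd/system/varnishlog.service
    /usr/lib/systemd/system/varnishncsa.service
    services that persist the logs;
    Varnish cache storage mechanisms (Storage Types):
    -s [name=]type[,options]
    · malloc[,size]
    In-memory storage; [,size] sets the amount of space; all cached objects are lost on restart;
    · file[,path[,size[,granularity]]]
    Disk-file storage, treated as a black box; all cached objects are lost on restart;
    · persistent,path,size
    File storage, treated as a black box; cached objects survive a restart; experimental and not yet ready for use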


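    Options of the varnish program:

    Program options: set in /etc/varnish/varnish.params
    -a address[:port][,address[:port][...]]: listening address, port 6081 by default;
    -T address[:port]: management interface, port 6082 by default;
    -s [name=]type[,options]: defines the cache storage mechanism;
    -u user
    -g group
    -f config: the VCL configuration file;
    -F: run in the foreground;
    ...
    Runtime parameters: set via DAEMON_OPTS in /etc/varnish/varnish.params
    DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"
    -p param=value: sets a runtime parameter and its value; may be given multiple times;
    -r param[,param...]: makes the listed parameters read-only;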



    Reloading the VCL configuration file:

    ~ ]# varnish_reload_vcl
    # varnishadm (the varnish command-line client)
    -S /etc/varnish/secret -T [ADDRESS:]PORT
    help [<command>]
    ping [<timestamp>]
    auth <response>
    quit
    banner
    status
    start
    stop
    vcl.load <configname> <filename>: loads a VCL file, comparable to what varnish_reload_vcl does
    vcl.inline <configname> <quoted_VCLstring>
    vcl.use <configname>
    vcl.discard <configname>
    vcl.list
    param.show [-l] [<param>]
    param.set <param> <value>
    panic.show
    panic.clear
    storage.list
    vcl.show [-v] <configname>
    backend.list [<backend_expression>]
    backend.set_health <backend_expression> <state>
    ban <field> <operator> <arg> [&& <field> <oper> <arg>]...
    ban.list



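    Configuration-related commands:

    vcl.list
    vcl.load: loads and compiles a configuration;
    vcl.use: activates it;
    vcl.discard: deletes it;
    vcl.show [-v] <configname>: shows the details of the named configuration;
    Runtime parameters:
    param.show -l: shows the list;
    param.show <PARAM>
    param.set <PARAM> <VALUE>
    Cache storage:
    storage.list
    Backend servers:
    backend.list
    VCL:
    A domain-specific configuration language;
    state engine;
    VCL has multiple state engines. The states are related, but the engines are isolated from one another; each engine uses return(x) to name the next engine to hand over to. Each state engine corresponds to one section of the vcl file, i.e. a subroutine
    vcl_hash --> return(hit) --> vcl_hit
    Default configuration of vcl_recv: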
    sub vcl_recv {
    if (req.method == "PRI") {
    /* We do not support SPDY or HTTP/2.0 */
    return (synth(405));
    }
    if (req.method != "GET" &&
    req.method != "HEAD" &&
    req.method != "PUT" &&
    req.method != "POST" &&
    req.method != "TRACE" &&
    req.method != "OPTIONS" &&
    req.method != "DELETE") {
    /* Non-RFC2616 or CONNECT which is weird. */
    return (pipe);
    }
    if (req.method != "GET" && req.method != "HEAD") {
    /* We only deal with GET and HEAD by default */
    return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
    /* Not cacheable by default */
    return (pass);
    }
    return (hash);
    }
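    The decision order of the default vcl_recv shown above can be mirrored in a few lines of Python. This is only a model for reasoning about which action a request ends up with; the function name `recv_action` and the plain-dict header representation are assumptions for illustration.

```python
KNOWN_METHODS = {"GET", "HEAD", "PUT", "POST", "TRACE", "OPTIONS", "DELETE"}

def recv_action(method, headers):
    """Return the action the default vcl_recv would choose for a request."""
    if method == "PRI":
        return "synth(405)"           # SPDY / HTTP/2 preface is rejected
    if method not in KNOWN_METHODS:
        return "pipe"                 # unknown method: shuffle bytes blindly
    if method not in ("GET", "HEAD"):
        return "pass"                 # only GET and HEAD are looked up
    if "Authorization" in headers or "Cookie" in headers:
        return "pass"                 # personalised requests bypass the cache
    return "hash"                     # proceed to cache lookup
```

    Note that a plain GET with a Cookie header is passed to the backend, which is why cookie handling matters so much for the hit ratio.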

    Client Side:

    vcl_recv, vcl_pass, vcl_hit, vcl_miss, vcl_pipe, vcl_purge, vcl_synth, vcl_deliver
    vcl_recv:
    hash:vcl_hash
    pass: vcl_pass
    pipe: vcl_pipe
    synth: vcl_synth
    purge: vcl_hash --> vcl_purge
    vcl_hash:
    lookup:
    hit: vcl_hit
    miss: vcl_miss
    pass, hit_for_pass: vcl_pass
    purge: vcl_purge
    Backend Side:
    vcl_backend_fetch, vcl_backend_response, vcl_backend_error
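    Two special engines:
    vcl_init: VCL code executed before any request is processed; mainly used to initialize VMODs;
    vcl_fini: called when the VCL configuration is discarded, after all requests using it have finished; mainly used to clean up VMODs;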

    VCL syntax:

    (1) VCL files start with vcl 4.0; this declares the version
    (2) //, # and /* foo */ for comments; // and # start single-line comments, /* ... */ encloses multi-line comments
    (3) Subroutines are declared with the sub keyword, e.g. sub vcl_recv { ...};
    (4) No loops, state-limited variables (built-in variables are only available in certain engines);
    (5) Terminating statements with a keyword for next action as argument of the return() function, i.e.: return(action); this performs the state engine transition;
    (6) Domain-specific;
    The VCL Finite State Machine
    (1) Each request is processed separately;
    (2) Each request is independent from others at any given time;
    (3) States are related, but isolated;
    (4) return(action); exits one state and instructs Varnish to proceed to the next state;
    (5) Built-in VCL code is always present and appended below your own VCL;
    The built-in VCL acts as the default VCL

    Three main syntax forms:

    sub subroutine {
    ...
    }
    if (CONDITION) {
    ...
    } else {
    ...
    }
    return(), hash_data()
    VCL Built-in Functions and Keywords
    Functions:
    regsub(str, regex, sub)
    regsuball(str, regex, sub)
    ban(boolean expression)
    hash_data(input)
    synthetic(str)
    Keywords:
    call subroutine, return(action), new, set, unset
    Operators:
    ==, !=, ~, >, >=, <, <=
    Logical operators: &&, ||, !
    Assignment: =
    Example: obj.hits is a built-in variable that holds the number of times a cached object has been served from the cache;
    if (obj.hits > 0) {
    # fixed string followed by the varnish server's IP
    set resp.http.X-Cache = "HIT via " + server.ip;
    } else {
    # the object was not served from the cache
    set resp.http.X-Cache = "MISS from " + server.ip;
    }



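    Common variables:

    bereq.*, req.*
    bereq.http.HEADERS
    bereq.method: the request method;
    bereq.url: the requested URL;
    bereq.proto: the protocol version of the request;
    bereq.backend: the backend host to use;
    req.http.Cookie: the value of the Cookie header in the client's request;
    req.http.User-Agent ~ "chrome"
    beresp.*, resp.*
    beresp.http.HEADERS
    beresp.status: the response status code;
    beresp.proto: the protocol version;
    beresp.backend.name: the name of the backend that served the response;
    beresp.ttl: the remaining time for which the backend's response may be cached;
    obj.*
    obj.hits: the number of times this object has been served from the cache;
    obj.ttl: the object's ttl, i.e. the time left before it expires
    server.*
    server.ip: the IP address of the varnish host;
    server.hostname: the hostname of the varnish host

    client.*
    client.ip: the IP address of the client that sent the request to varnish;
    User-defined:
    set
    unset
    Example 1: force requests for certain resources to bypass the cache:
    sub vcl_recv {
    # (?i) makes the match case-insensitive; req.url holds the URI only,
    # without host name or port
    if (req.url ~ "(?i)^/(login|admin)") {
    return(pass);
    }
    }

    vcl.show test2 (view the contents of the loaded configuration test2)
    vcl.use test2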

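    Example 2: for certain types of resources, such as public images, remove the private marker and force the time for which varnish may cache them; defined in vcl_backend_response;
    (Not every response that carries a cookie is uncacheable; the cookie can be stripped)
    if (beresp.http.cache-control !~ "s-maxage") {
    if (bereq.url ~ "(?i)\.(jpg|jpeg|png|gif|css|js)$") {
    unset beresp.http.Set-Cookie;
    set beresp.ttl = 3600s;
    }
    }
    Example 3: pass the client IP to the backend in X-Forwarded-For; defined in vcl_recv;
    # only when the request has not been restarted
    if (req.restarts == 0) {
    # append to an existing X-Forwarded-For header, if the request has one
    if (req.http.X-Forwarded-For) {
    set req.http.X-Forwarded-For = req.http.X-Forwarded-For + "," + client.ip;
    } else {
    set req.http.X-Forwarded-For = client.ip;
    }
    }
    On the backend host: # vim /etc/httpd/conf/httpd.conf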
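    The header handling in Example 3 boils down to "append if present, otherwise set". A small Python sketch of the same rule, with a hypothetical function name:

```python
def forwarded_for(existing, client_ip):
    """Append client_ip to an existing X-Forwarded-For value, if any."""
    if existing:
        return existing + "," + client_ip
    return client_ip
```

    The backend can then log the left-most address in the header as the original client, instead of the address of the varnish host.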



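    purge: manually removes one specific cached object
    ban: invalidates a whole class of cached objects
    (1) Make purge possible
    sub vcl_purge {
    return (synth(200,"Purged"));
    }
    (2) Decide when to purge
    sub vcl_recv {
    if (req.method == "PURGE") {
    return(purge);
    }
    ...
    }
    Add an access control rule for such requests: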
    acl purgers {
    "127.0.0.0"/8;
    "10.1.0.0"/16;
    }
    sub vcl_recv {
    if (req.method == "PURGE") {
    if (!client.ip ~ purgers) {
    return(synth(405,"Purging not allowed for " + client.ip));
    }
    return(purge);
    }
    ...
    }


    Banning:

    (1) varnishadm:
    ban <field> <operator> <arg>
    Example:
    ban req.url ~ ^/javascripts
    (2) In the configuration file, using the ban() function;
    Example:
    if (req.method == "BAN") {
    ban("req.http.host == " + req.http.host + " && req.url == " + req.url);
    # Throw a synthetic page so the request won't go to the backend.
    return(synth(200, "Ban added"));
    }
    ban req.http.host==www.ilinux.io && req.url==/test1.html
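    Unlike a purge, a ban does not remove objects immediately: it records an expression, and objects that were already in the cache when the ban was added are treated as invalid if they match. The following Python sketch models that idea with a logical clock. The class `BanList` and its methods are hypothetical, it only matches on the URL, and it is not how Varnish implements its ban list internally.

```python
import re

class BanList:
    """Toy ban list: an object is invalid if a newer ban matches its URL."""

    def __init__(self):
        self.bans = []   # list of (sequence number, compiled regex)
        self.clock = 0   # logical time, increases with every event

    def tick(self):
        self.clock += 1
        return self.clock

    def ban_url(self, pattern):
        self.bans.append((self.tick(), re.compile(pattern)))

    def is_banned(self, url, stored_at):
        # Only bans added after the object was stored can invalidate it.
        return any(seq > stored_at and rx.search(url) for seq, rx in self.bans)
```

    An object stored before `ban_url(r"^/javascripts")` is invalidated, while an object fetched afterwards under the same URL is served normally.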

    Configuring multiple backend hosts:

    # one backend block per backend host; "default" is the name of this backend
    backend default {
    .host = "172.16.100.6";   # address of the real host
    .port = "80";             # port of the real host
    }
    backend appsrv {
    .host = "172.16.100.7";
    .port = "80";
    }
    sub vcl_recv {
    if (req.url ~ "(?i)\.php$") {
    set req.backend_hint = appsrv;
    } else {
    set req.backend_hint = default;
    }
    ...
    }



    Director:

    A varnish module (VMOD);
    It must be imported before use:
    import directors;

    Two scheduling algorithms are covered here: round-robin and random

    Example:
    import directors; # load the directors
    backend server1 {
    .host =
    .port =
    }
    backend server2 {
    .host =
    .port =
    }
    sub vcl_init {
    new GROUP_NAME = directors.round_robin();
    GROUP_NAME.add_backend(server1);
    GROUP_NAME.add_backend(server2);
    }
    sub vcl_recv {
    # send all traffic to the GROUP_NAME director:
    set req.backend_hint = GROUP_NAME.backend();
    }
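    Round-robin scheduling simply hands out the backends in a fixed rotating order. A minimal Python sketch of that behaviour follows; the class name is hypothetical and, unlike a real director, it does not skip unhealthy backends.

```python
from itertools import cycle

class RoundRobinDirector:
    """Hands out backends in a fixed rotating order."""

    def __init__(self):
        self.backends = []
        self._rotation = None

    def add_backend(self, name):
        self.backends.append(name)
        self._rotation = cycle(self.backends)  # restart rotation with the new list

    def backend(self):
        return next(self._rotation)
```

    With server1 and server2 added, four consecutive calls to `backend()` return server1, server2, server1, server2.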



    Three ways to preserve sessions:

                                Session stickiness: bound by source IP or at the application layer
                                Session replication
                                Session server


    Cookie-based session stickiness:

    sub vcl_init {
    new h = directors.hash();
    h.add_backend(one, 1); // backend 'one' with weight '1'
    h.add_backend(two, 1); // backend 'two' with weight '1'
    }
    sub vcl_recv {
    // pick a backend based on the cookie header of the client
    set req.backend_hint = h.backend(req.http.cookie);
    }
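    The point of a hash director is that the same key always maps to the same backend, so a client that keeps sending the same cookie keeps reaching the same server. The Python sketch below shows the principle only; the function name and the SHA-256 based slot selection are assumptions for illustration, not the algorithm used by the directors VMOD.

```python
import hashlib

def pick_backend(key, backends):
    """Map a key (e.g. the Cookie header) to one backend, deterministically.

    backends is a list of (name, weight) pairs with positive integer weights.
    """
    # Each backend gets as many slots as its weight.
    slots = [name for name, weight in backends for _ in range(weight)]
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return slots[int.from_bytes(digest[:8], "big") % len(slots)]
```

    Because the result depends only on the key and the backend list, adding or removing a backend reshuffles many clients, which is one reason session replication or a session server is often preferred.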
    BE Health Check:
    backend BE_NAME {
    .host =
    .port =
    .probe = {
    .url=
    .timeout=
    .interval=
    .window=
    .threshold=
    }
    }
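    .probe: defines how health is checked;
    .url: the URL requested by the check, "/" by default;
    .request: the exact request to send;
    .request =
    "GET /.healthtest.html HTTP/1.1"
    "Host: www.magedu.com"
    "Connection: close";
    .window: how many of the most recent checks are considered when judging health;
    .threshold: how many of the last .window checks must have succeeded for the backend to be considered healthy;
    .interval: how often a check is run;
    .timeout: how long to wait before a check times out;
    .expected_response: the expected status code, 200 by default;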
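    The interplay of .window and .threshold can be modelled as a sliding window over the most recent probe results. The Python sketch below uses a hypothetical class name and ignores details such as the initial state of a freshly started probe.

```python
from collections import deque

class ProbeWindow:
    """Healthy when at least `threshold` of the last `window` probes passed."""

    def __init__(self, window=5, threshold=4):
        self.results = deque(maxlen=window)   # old results fall out automatically
        self.threshold = threshold

    def record(self, ok):
        self.results.append(bool(ok))

    def healthy(self):
        return sum(self.results) >= self.threshold
```

    With window=5 and threshold=4, one failed probe is tolerated, while two failures among the last five checks mark the backend as sick.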


    Ways to configure health checks:

    (1) probe PB_NAME { }
    backend NAME {
    .probe = PB_NAME;
    ...
    }
    (2) backend NAME {
    .probe = {
    ...
    }
    }
    Example:
    probe check {
    .url = "/.healthcheck.html";
    .window = 5;
    .threshold = 4;
    .interval = 2s;
    .timeout = 1s;
    }
    backend default {
    .host = "10.1.0.68";
    .port = "80";
    .probe = check;
    }
    backend appsrv {
    .host = "10.1.0.69";
    .port = "80";
    .probe = check;
    }
    Manually setting the state of a backend host (backend.set_health):
    sick: administratively down;
    healthy: administratively up;
    auto: decided by the probe;


    Setting backend host attributes:

    backend BE_NAME {
    ...
    .connect_timeout = 0.5s;
    .first_byte_timeout = 20s;
    .between_bytes_timeout = 5s;   # max gap between two bytes; a timeout here also counts as down
    .max_connections = 50;
    }
    Varnish runtime parameters:
    Thread model:
    cache-worker
    cache-main
    ban lurker
    acceptor:
    epoll/kqueue:
    ...
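    Thread-related parameters: threads are managed through thread pools;
    Within a pool each request is handled by one thread, so the maximum number of worker threads determines how many requests varnish can serve concurrently;
    Each parameter is passed with -p
    thread_pools: Number of worker thread pools. Best kept at or below the number of CPU cores;
    thread_pool_max: the maximum number of threads per pool;
    thread_pool_min: The minimum number of worker threads in each pool. It also acts as the maximum number of idle threads;
    Maximum concurrent connections = thread_pools * thread_pool_max
    thread_pool_timeout: Thread idle threshold. Threads in excess of thread_pool_min, which have been idle for at least this long, will be destroyed.
    thread_pool_add_delay: Wait at least this long after creating a thread. The default value is usually fine
    thread_pool_destroy_delay: Wait this long after destroying a thread.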
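    The capacity formula above is plain multiplication. A short Python sketch with hypothetical sizing (2 pools, and the thread_pool_max=500 from the DAEMON_OPTS example earlier):

```python
def max_concurrent_requests(thread_pools, thread_pool_max):
    """Upper bound on requests handled at once: one worker thread each."""
    return thread_pools * thread_pool_max

# Hypothetical sizing: 2 pools with thread_pool_max=500.
capacity = max_concurrent_requests(2, 500)
```

    Here the upper bound is 1000 concurrent requests; raising thread_pool_max is usually preferable to adding more pools than there are CPU cores.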
    Timer-related parameters:
    send_timeout:Send timeout for client connections. If the HTTP response hasn't been transmitted in this many seconds the session is closed.
    timeout_idle:Idle timeout for client connections.
    timeout_req: Max time to receive clients request headers, measured from first non-white-space character to double CRNL.
    cli_timeout:Timeout for the childs replies to CLI requests from the mgt_param.

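    How to set them:
    vcl.param
    param.set
    To make them permanent:
    varnish.params
    DAEMON_OPTS="-p PARAM1=VALUE -p PARAM2=VALUE"
    These are varnish runtime parameters; applying them this way requires a restart, which empties the cache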


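    Varnish log area:

    shared memory log
    Counters
    Log records
    1. varnishstat - Varnish Cache statistics
    -1: print the statistics once and exit
    -1 -f FIELD_NAME
    -f FIELD_NAME: show only the named field
    -l: lists the field names that can be used with -f;

    MAIN.cache_hit: cache hits
    MAIN.cache_miss: cache misses
    # varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss
    Shows the current values of the given counters;
    # varnishstat -l -f MAIN -f MEMPOOL
    Lists the meaning of every counter in the given sections;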



    2. varnishtop - Varnish log entry ranking

    -1 Instead of a continuously updated display, print the statistics once and exit.
    -i taglist: include list; several -i options may be given, and one option may carry several tags;
    -I <[taglist:]regex>
    -x taglist: exclude list; everything except the listed tags is shown
    -X <[taglist:]regex>



    3. varnishlog - Display Varnish logs
    4. varnishncsa - Display Varnish logs in Apache / NCSA combined log format



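    Built-in functions:

    hash_data():
    Specifies the data fed into the hash; reducing variation in it raises the hit ratio;
    regsub(str,regex,sub):
    Replaces the first match of regex in str with sub; mainly used for URL rewriting
    regsuball(str,regex,sub):
    Replaces every match of regex in str with sub;
    return():
    ban(expression)
    ban_url(regex):
    Bans all cached objects whose URL matches regex;
    synth(status,"STRING"): generates a synthetic response, e.g. in reply to a purge request;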
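    The difference between regsub and regsuball is only whether the first match or every match is replaced. The Python sketch below reproduces that distinction with the standard `re` module; note that Python's regex dialect is not identical to the PCRE syntax used by VCL, so this illustrates the semantics rather than exact pattern compatibility.

```python
import re

def regsub(s, regex, sub):
    """Replace only the first match, like VCL regsub()."""
    return re.sub(regex, sub, s, count=1)

def regsuball(s, regex, sub):
    """Replace every match, like VCL regsuball()."""
    return re.sub(regex, sub, s)
```

    For the URL "/a//b//c" and the pattern "//", regsub collapses only the first double slash, while regsuball normalises the whole path.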

    Summary
    varnish: state engine, vcl
    varnish 4.0:
    vcl_init
    vcl_recv
    vcl_hash
    vcl_hit
    vcl_pass
    vcl_miss
    vcl_pipe
    vcl_waiting
    vcl_purge
    vcl_deliver
    vcl_synth
    vcl_fini
    vcl_backend_fetch
    vcl_backend_response
    vcl_backend_error
    sub VCL_STATE_ENGINE {
    ...
    }
    backend BE_NAME {}
    probe PB_NAME {}
    acl ACL_NAME {}

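    Hands-on project: deploy WordPress on two LAMP hosts behind an Nginx reverse proxy and load-test it; then put a varnish cache behind nginx, tune the VCL, and load-test repeatedly;
    ab, http_load, webbench, siege, jmeter, loadrunner,...
    Further reading: the Varnish Book
    http://book.varnish-software.com/4.0/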

    Example:
    import directors;
    backend imgsrv1 {
    .host = "192.168.10.11";
    .port = "80";
    }
    backend imgsrv2 {
    .host = "192.168.10.12";
    .port = "80";
    }
    backend appsrv1 {
    .host = "192.168.10.21";
    .port = "80";
    }
    backend appsrv2 {
    .host = "192.168.10.22";
    .port = "80";
    }
    sub vcl_init {
    # weighted random scheduling for the image servers
    new imgsrvs = directors.random();
    imgsrvs.add_backend(imgsrv1,10);
    imgsrvs.add_backend(imgsrv2,20);
    # round-robin scheduling for static files
    new staticsrvs = directors.round_robin();
    staticsrvs.add_backend(appsrv1);
    staticsrvs.add_backend(appsrv2);
    # hash scheduling for application requests
    new appsrvs = directors.hash();
    appsrvs.add_backend(appsrv1,1);
    appsrvs.add_backend(appsrv2,1);
    }
    sub vcl_recv {
    if (req.url ~ "(?i)\.(css|js)$") {
    set req.backend_hint = staticsrvs.backend();
    } else if (req.url ~ "(?i)\.(jpg|jpeg|png|gif)$") {
    set req.backend_hint = imgsrvs.backend();
    } else {
    set req.backend_hint = appsrvs.backend(req.http.cookie);
    }
    }


          Article link: https://www.haomeiwen.com/subject/qcvhvxtx.html