Performance Testing for Microservices

Author: 老瓦在霸都 | Published: 2019-10-28 21:37

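    How should performance testing be done, and which tool is the handiest? Opinions differ.

    The purpose of performance testing is to understand a system and its services thoroughly:

    • What is the maximum load it can sustain?
    • Are there any performance bottlenecks?
    • If so, where are they?

    Three metrics matter most:
    1) Response time
    2) Throughput
    3) Success rate

    For a single worker that handles requests one after another, throughput=\frac{1s}{response time}; with N requests in flight at once, the ideal throughput is about \frac{N}{response time}.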
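
To make the link between response time and throughput concrete, here is a minimal sketch. The function name, the `concurrency` parameter, and the sample numbers are assumptions for illustration; real services rarely scale this linearly.

```python
def throughput_per_second(response_time_s, concurrency=1):
    """Ideal throughput when each of `concurrency` workers handles requests back to back."""
    if response_time_s <= 0:
        raise ValueError("response time must be positive")
    return concurrency / response_time_s


# Hypothetical numbers: a 50 ms response time.
single_worker = throughput_per_second(0.05)
ten_workers = throughput_per_second(0.05, concurrency=10)
print(single_worker, ten_workers)
```

The figure for ten workers is an upper bound: once a shared resource saturates, response time grows and throughput flattens.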

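    1. Protocol-level considerations

    Taking HTTP, the most commonly used protocol, as an example, we need to watch the following:

    1) Response code

    The response code tells whether an HTTP REST API call succeeded. 5xx codes deserve particular attention, so watch closely for these errors in production.

    2) Response time

    Response time is a key measure of REST API performance: it tells us how quickly the microservice responds. We usually need the maximum (max), the average, and the P99 value (the response time that 99% of requests stay within).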
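
The three response-time figures can be computed from raw samples as in the sketch below. It uses the nearest-rank definition of a percentile, which is only one of several conventions, and the sample data is made up.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples_ms):
    """Return the max, average, and P99 of a list of response times."""
    return {
        "max": max(samples_ms),
        "average": statistics.mean(samples_ms),
        "p99": percentile(samples_ms, 99),
    }


# Hypothetical response times in milliseconds: 98 fast requests and two slow ones.
samples = [20] * 98 + [900, 950]
print(summarize(samples))
```

In this sample the average stays low while P99 and max expose the slow requests, which is why all three are worth tracking.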

    3) Request volume

    Request volume is the number of requests. The absolute count means little; the number of requests per unit of time is more informative.

    4) Request frequency

    Common metrics are QPS (Queries Per Second) and TPS (Transactions Per Second).
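
As a rough sketch of how QPS can be derived from raw data, the code below buckets request arrival times into one-second windows. The function names and the arrival times are assumptions for the example, and timestamps are taken to be non-negative seconds.

```python
from collections import Counter


def requests_per_second(timestamps):
    """Count requests in each one-second bucket, keyed by the whole second."""
    return Counter(int(ts) for ts in timestamps)


def peak_and_mean_qps(timestamps):
    """Return (peak QPS, mean QPS over the observed span, idle seconds included)."""
    if not timestamps:
        raise ValueError("timestamps must not be empty")
    buckets = requests_per_second(timestamps)
    seconds = max(buckets) - min(buckets) + 1
    return max(buckets.values()), len(timestamps) / seconds


# Hypothetical arrival times, in seconds since the test started.
arrivals = [0.1, 0.4, 0.9, 1.2, 1.3, 3.5]
print(peak_and_mean_qps(arrivals))
```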

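    5) Application Performance Index (Apdex)

    APDEX=\frac{satisfied requests + \frac{tolerating requests}{2}}{total requests}

    Assume a response time within T seconds is satisfactory and one beyond F seconds is frustrating:

    • 1) Satisfied

    The response time is below the threshold (T seconds), and the user is satisfied.

    • 2) Tolerating

    The response time is above T seconds but below F seconds. Performance is poor but the service is still usable, and the user can tolerate it.

    • 3) Frustrated

    The response time exceeds F seconds. The user finds it unacceptable, gives up, and is left frustrated.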
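
The Apdex formula translates directly into code. In this sketch the boundary handling (a response of exactly T counts as satisfied, exactly F as tolerating) is a choice made for the example, and the sample values are hypothetical. The Apdex specification itself fixes F at 4T, whereas here both thresholds are passed in.

```python
def apdex(response_times_s, t, f):
    """Apdex score: satisfied <= t, tolerating in (t, f], frustrated above f."""
    if not response_times_s:
        raise ValueError("at least one response time is required")
    satisfied = sum(1 for rt in response_times_s if rt <= t)
    tolerating = sum(1 for rt in response_times_s if t < rt <= f)
    return (satisfied + tolerating / 2) / len(response_times_s)


# Hypothetical sample with T = 0.5 s and F = 2 s:
# three satisfied, two tolerating, one frustrated.
times = [0.2, 0.3, 0.4, 0.6, 1.5, 3.0]
print(round(apdex(times, t=0.5, f=2.0), 3))
```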

    Other protocols have similar metrics of their own. SIP, for example, has:

    • SRD (Session Request Delay)
    • SDD (Session Disconnect Delay)
    • SDT (Session Duration Time)
    • SER (Session Establishment Ratio)
    • SEER (Session Establishment Effectiveness Ratio)
    • ISAs (Ineffective Session Attempts)
    • SCR (Session Completion Ratio)

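    2. System-level considerations

    As load on the system grows, can the underlying resources keep up?
    We need to examine the following metrics:

    • CPU utilization

    • Memory utilization

    • Free disk space

    • Disk I/O rate

    • Network I/O rate

    • JVM statistics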
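
A few of these figures can be sampled with the Python standard library alone, as in the sketch below. It covers only CPU load average and disk space. Memory, I/O rates, and JVM statistics need other tools (for example the third-party psutil package or the JDK's own utilities), which are not shown. The function name and the returned keys are assumptions.

```python
import os
import shutil


def snapshot(path="/"):
    """Collect a few host-level figures using only the standard library."""
    disk = shutil.disk_usage(path)
    used_percent = round(disk.used / disk.total * 100, 1) if disk.total else 0.0
    metrics = {
        "cpu_count": os.cpu_count(),
        "disk_free_bytes": disk.free,
        "disk_used_percent": used_percent,
    }
    # os.getloadavg() exists only on Unix-like systems and may fail there too.
    if hasattr(os, "getloadavg"):
        try:
            metrics["load_avg_1m"] = os.getloadavg()[0]
        except OSError:
            pass
    return metrics


print(snapshot())
```

Calling `snapshot()` at a fixed interval during a load test gives a time series that can be lined up with the response-time data.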

    3. How to run the tests

    Take a simple microservice, account.py, as an example

    import os
    import json
    import redis
    from flask import Flask, make_response, render_template, request
    from flask_httpauth import HTTPBasicAuth
    
    ACCOUNTS_API_PATH = "/api/v1/accounts"
    REDIS_KEY = "walter_accounts"
    
    app = Flask(__name__)
    
    current_path = os.path.dirname(os.path.realpath(__file__))
    
    auth = HTTPBasicAuth()
    
    users = {
        "walter": "pass1234"
    }
    
    json_file = "{}/account.json".format(current_path)
    redis_enabled = True
    # A local Redis instance is needed, e.g.:
    # docker run --restart always -p 6379:6379 -d --name local-redis redis
    
    class RedisClient:
        def __init__(self):
            self.redis_host = "localhost"
            self.redis_port = 6379
            self.redis_password = ''
            self.redis_conn = None
    
        def connect(self):
            #if(redis_enabled):
            pool = redis.ConnectionPool(host=self.redis_host, port=self.redis_port)
            self.redis_conn = redis.Redis(connection_pool=pool)
    
        def set(self, key, value):
            self.redis_conn.set(key, value)
    
        def get(self, key):
            return self.redis_conn.get(key)
    
    redis_client = RedisClient()
    
    if(redis_enabled):
        redis_client.connect()
    
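    def read_data():
        """Load all accounts as a dict, from Redis or from the JSON file."""
        if redis_enabled:
            jsonStr = redis_client.get(REDIS_KEY)
            if not jsonStr:
                jsonStr = "{}"
            return json.loads(jsonStr)
        else:
            # Use a context manager so the file handle is always closed.
            with open(json_file, "r") as json_fp:
                return json.load(json_fp)
    
    
    def save_data(accounts):
        """Persist the whole accounts dict, to Redis or to the JSON file."""
        if redis_enabled:
            redis_client.set(REDIS_KEY, json.dumps(accounts))
        else:
            # Closing the file flushes the written data to disk.
            with open(json_file, "w") as json_fp:
                json.dump(accounts, json_fp, sort_keys=True, indent=4)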
    
    
    @auth.get_password
    def get_pw(username):
        if username in users:
            return users.get(username)
        return None
    
    
    def generate_response(arg, response_code=200):
        response = make_response(json.dumps(arg, sort_keys=True, indent=4))
        response.headers['Content-type'] = "application/json"
        response.status_code = response_code
        return response
    
    
    @app.route('/')
    def index():
        return render_template('index.html')
    
    
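    # NOTE: app.route registers the function it receives, so with login_required
    # placed above it (as here and in the routes below) authentication is not
    # enforced. To enforce it, put @app.route first and @auth.login_required
    # directly above the function.
    @auth.login_required
    @app.route(ACCOUNTS_API_PATH, methods=['GET'])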
    def list_account():
        accounts = read_data()
        return generate_response(accounts)
    
    
    # Create account
    @auth.login_required
    @app.route(ACCOUNTS_API_PATH, methods=['POST'])
    def create_account():
        account = request.json
        sitename = account["siteName"]
        accounts = read_data()
        if sitename in accounts:
            return generate_response({"error": "conflict"}, 409)
        accounts[sitename] = account
        save_data(accounts)
        return generate_response(account)
    
    
    # Retrieve account
    @auth.login_required
    @app.route(ACCOUNTS_API_PATH + '/<sitename>', methods=['GET'])
    def retrieve_account(sitename):
        accounts = read_data()
        if sitename not in accounts:
            return generate_response({"error": "not found"}, 404)
    
        return generate_response(accounts[sitename])
    
    
    # Update account
    @auth.login_required
    @app.route(ACCOUNTS_API_PATH + '/<sitename>', methods=['PUT'])
    def update_account(sitename):
        accounts = read_data()
        if sitename not in accounts:
            return generate_response({"error": "not found"}, 404)
    
        account = request.json
        print(account)
        accounts[sitename] = account
        save_data(accounts)
        return generate_response(account)
    
    
    # Delete account
    @auth.login_required
    @app.route(ACCOUNTS_API_PATH + '/<sitename>', methods=['DELETE'])
    def delete_account(sitename):
        accounts = read_data()
        if sitename not in accounts:
            return generate_response({"error": "not found"}, 404)
    
        del (accounts[sitename])
        save_data(accounts)
        return generate_response("", 204)
    
    
    if __name__ == "__main__":
        app.run(port=5000, debug=True)
    
    '''
    http --auth walter:pass --json POST http://localhost:5000/api/v1/accounts \
        userName=walter password=pass siteName=163 siteUrl=http://163.com
    '''
    

    Preparing the environment

    First install libev and python3

    brew install libev
    brew install python3
    

    The required libraries are listed in requirements.txt:

    flask
    flask-httpauth
    requests
    httpie
    redis
    locust
    

    Then install virtualenv and the required libraries

    
    pip3 install virtualenv
    virtualenv -p python3 venv
    source venv/bin/activate
    # then install the required libraries
    pip install -r requirements.txt
    

    Start the service

    python account.py
    

    You can use httpie (see https://httpie.org/) for a quick test

    # Add a NetEase (163) account
    http --auth walter:pass --json POST http://localhost:5000/api/v1/accounts userName=walter password=pass siteName=163 siteUrl=http://163.com
    HTTP/1.0 200 OK
    Content-Length: 108
    Content-type: application/json
    Date: Thu, 24 Oct 2019 14:08:05 GMT
    Server: Werkzeug/0.12.2 Python/3.7.3
    
    {
        "password": "pass",
        "siteName": "163",
        "siteUrl": "http://163.com",
        "userName": "walter"
    }
    # Add a Weibo account
    http --auth walter:pass --json POST http://localhost:5000/api/v1/accounts userName=walter password=pass siteName=weibo siteUrl=http://weibo.com
    HTTP/1.0 200 OK
    Content-Length: 108
    Content-type: application/json
    Date: Thu, 24 Oct 2019 14:08:05 GMT
    Server: Werkzeug/0.12.2 Python/3.7.3
    
    {
        "password": "pass",
        "siteName": "weibo",
        "siteUrl": "http://weibo.com",
        "userName": "walter"
    }
    # List all accounts
    http --auth walter:pass --json GET http://localhost:5000/api/v1/accounts
    HTTP/1.0 200 OK
    Content-Length: 290
    Content-type: application/json
    Date: Thu, 24 Oct 2019 14:20:54 GMT
    Server: Werkzeug/0.12.2 Python/3.7.3
    
    {
        "163": {
            "password": "pass",
            "siteName": "163",
            "siteUrl": "http://163.com",
            "userName": "walter"
        },
        "weibo": {
            "password": "pass",
            "siteName": "weibo",
            "siteUrl": "http://weibo.com",
            "userName": "walter"
        }
    }
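
The same POST can be scripted. The sketch below builds the request with the standard library's `urllib` and leaves the actual send commented out, since that needs the service to be running. The helper name `build_create_account_request` is an assumption for this example. The credentials and payload are the ones used in the httpie call.

```python
import base64
import json
import urllib.request


def build_create_account_request(base_url, username, password, account):
    """Build (but do not send) a POST to the accounts API with HTTP Basic auth."""
    token = base64.b64encode("{}:{}".format(username, password).encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url + "/api/v1/accounts",
        data=json.dumps(account).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/json",
        },
    )


req = build_create_account_request(
    "http://localhost:5000", "walter", "pass",
    {"userName": "walter", "password": "pass", "siteName": "163", "siteUrl": "http://163.com"},
)
print(req.get_method(), req.full_url)

# To send it (requires the service to be running):
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, json.load(resp))
```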
    

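    Performance testing

    There are many traditional performance testing tools, such as ab, JMeter, and LoadRunner, which are used to apply steadily increasing load.
    Taking ab (Apache Benchmark) as an example, we test with a concurrency of 100 and 10,000 requests in total

    $ ab -c 100 -n 10000 http://127.0.0.1:5000/api/v1/accounts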
    This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/
    
    Benchmarking 127.0.0.1 (be patient)
    Completed 1000 requests
    Completed 2000 requests
    Completed 3000 requests
    Completed 4000 requests
    Completed 5000 requests
    Completed 6000 requests
    Completed 7000 requests
    Completed 8000 requests
    Completed 9000 requests
    Completed 10000 requests
    Finished 10000 requests
    
    
    Server Software:        Werkzeug/0.12.2
    Server Hostname:        127.0.0.1
    Server Port:            5000
    
    Document Path:          /api/v1/accounts
    Document Length:        290 bytes
    
    Concurrency Level:      100
    Time taken for tests:   30.550 seconds
    Complete requests:      10000
    Failed requests:        0
    Total transferred:      4370000 bytes
    HTML transferred:       2900000 bytes
    Requests per second:    327.33 [#/sec] (mean)
    Time per request:       305.504 [ms] (mean)
    Time per request:       3.055 [ms] (mean, across all concurrent requests)
    Transfer rate:          139.69 [Kbytes/sec] received
    
    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    1   2.5      1     144
    Processing:     5  303  78.7    288     587
    Waiting:        3  303  78.7    288     587
    Total:         15  304  78.8    289     588
    
    Percentage of the requests served within a certain time (ms)
      50%    289
      66%    347
      75%    362
      80%    372
      90%    405
      95%    435
      98%    458
      99%    571
     100%    588 (longest request)
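
The headline figures in this report follow from three inputs, and checking the arithmetic helps when reading it. The sketch below uses only the numbers printed above. Small differences in the last digit come from ab rounding the elapsed time.

```python
# Figures copied from the ab report above.
concurrency = 100
complete_requests = 10000
time_taken_s = 30.550

# "Requests per second": completed requests divided by wall-clock time.
requests_per_second = complete_requests / time_taken_s

# "Time per request (mean)": latency as seen by one of the concurrent clients.
time_per_request_ms = concurrency * time_taken_s / complete_requests * 1000

# "Time per request (mean, across all concurrent requests)": wall-clock time per completed request.
time_per_request_all_ms = time_taken_s / complete_requests * 1000

print("{:.2f} req/s".format(requests_per_second))
print("{:.1f} ms per request (per client)".format(time_per_request_ms))
print("{:.3f} ms per request (across all clients)".format(time_per_request_all_ms))
```

Note that the per-client latency (about 305 ms) times the throughput (about 327 requests per second) gives roughly the concurrency of 100.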
    

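    ab is adequate for simple HTTP endpoint tests, but it falls short when business APIs need to be tested as a chained sequence.
    JMeter is certainly powerful and reasonably extensible, but we prefer not to use it here, for two reasons:
    1) JMeter is resource-hungry: every task/user takes up a thread.
    2) JMeter is configuration-driven, whereas Locust is a performance testing tool driven by code, which allows more flexible control.

    Locust, named after the insect, is an open-source performance testing tool. Let's try it out.
    It starts with a script file, account_load_test.py (do not name it locust.py, which would shadow the locust package on import)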

    # TaskSequence runs @seq_task methods in their declared order;
    # a plain TaskSet would pick among them at random.
    from locust import HttpLocust, TaskSequence, seq_task
    
    import load_test_util
    import json
    
    from queue import Queue
    logger = load_test_util.init_logger("account-load-test")
    token_refresh_time = 300
    
    
    
    class UserBehavior(TaskSequence):
    
        def on_start(self):
            logger.info("on_start")
            self.auth_headers = load_test_util.getAuthHeaders()
            self.account_queue = Queue()
    
        def on_stop(self):
            logger.info("on_stop, clear queue")
    
        def list_account(self):
            self.client.get("/api/v1/accounts")
    
        @seq_task(1)
        def create_account(self):
            post_dict = load_test_util.create_account_request()
    
            post_data = json.dumps(post_dict)
    
            logger.info("auth_headers: %s", json.dumps(self.auth_headers))
            logger.info("post_data: %s", post_data)
    
            response = self.client.post("/api/v1/accounts", headers = self.auth_headers, data=post_data)
            logger.info("response: %d, %s", response.status_code, response.text)
            if (200 <= response.status_code < 300):
                siteName = post_dict['siteName']
                logger.info("siteName: %s" % siteName)
                self.account_queue.put(siteName)
    
            return response
    
        @seq_task(2)
        def retrieve_account(self):
    
            if not self.account_queue.empty():
                siteName = self.account_queue.get(True, 1)
                logger.info("retrieve_account by siteName %s", siteName)
                response = self.client.get("/api/v1/accounts/" + siteName, headers=self.auth_headers, name="/api/v1/accounts/siteName")
                logger.info("retrieve_account's response: %d, %s", response.status_code, response.text)
                self.account_queue.put(siteName)
    
    
        @seq_task(3)
        def update_account(self):
             if not self.account_queue.empty():
                siteName = self.account_queue.get(True, 1)
                post_dict = load_test_util.create_account_request()
    
                put_data = json.dumps(post_dict)
                response = self.client.put("/api/v1/accounts/"+ siteName, headers = self.auth_headers, data=put_data, name="/api/v1/accounts/siteName")
                logger.info("response: %d, %s", response.status_code, response.text)
                self.account_queue.put(siteName)
    
        @seq_task(4)
        def delete_account(self):
            if not self.account_queue.empty():
                siteName = self.account_queue.get(True, 1)
                response = self.client.delete("/api/v1/accounts/" + siteName, headers = self.auth_headers, name="/api/v1/accounts/siteName")
                logger.info("response: %d, %s", response.status_code, response.text)
    
    class WebsiteUser(HttpLocust):
        task_set = UserBehavior
        min_wait = 500
        max_wait = 3000
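
The script imports a helper module, `load_test_util`, that the article does not show. The following is a guess at what a minimal version could look like, not the author's actual module. The three function names come from the calls in the script. Their bodies, the default credentials (taken from the `users` dict of the service), and the payload values are assumptions.

```python
# load_test_util.py (hypothetical reconstruction)
import base64
import logging
import uuid


def init_logger(name, level=logging.INFO):
    """Return a logger that writes timestamped lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger


def getAuthHeaders(username="walter", password="pass1234"):
    """Build HTTP Basic auth headers plus a JSON content type."""
    raw = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": "Basic " + token, "Content-Type": "application/json"}


def create_account_request():
    """Return an account payload with a unique siteName so repeated POSTs do not conflict."""
    site = "site-" + uuid.uuid4().hex[:8]
    return {
        "userName": "walter",
        "password": "pass",
        "siteName": site,
        "siteUrl": "http://{}.example.com".format(site),
    }
```

The `Content-Type` header matters here: the script sends the body as a pre-serialized string, and Flask's `request.json` only parses it when the request is marked as JSON.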
    

    Start Locust to run the performance test

    $ locust -f account_load_test.py --host=http://localhost:5000
    Starting web monitor at *:8089
    Starting Locust 0.11.0
    

    Open http://localhost:8089

    [Figure: Locust UI]

    [Figure: Locust UI charts of RPS, response time, and number of users]

    The load can be stepped up from the command line as follows

    locust -f account_load_test.py --host=http://localhost:5000 --no-web -c 100 -r 100 -t 30m --csv=100.csv
    locust -f account_load_test.py --host=http://localhost:5000 --no-web -c 200 -r 200 -t 30m --csv=200.csv
    locust -f account_load_test.py --host=http://localhost:5000 --no-web -c 400 -r 400 -t 30m --csv=400.csv
    locust -f account_load_test.py --host=http://localhost:5000 --no-web -c 800 -r 800 -t 30m --csv=800.csv
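
When the list of load levels grows, the step commands can be generated instead of typed. This sketch only builds the argument lists, using the same Locust flags as the step-load runs in this section. The function name and defaults are assumptions, and running the commands is left to the caller.

```python
def build_step_commands(levels, locustfile="account_load_test.py",
                        host="http://localhost:5000", run_time="30m"):
    """Return one locust argument list per load level."""
    commands = []
    for users in levels:
        commands.append([
            "locust", "-f", locustfile, "--host=" + host, "--no-web",
            "-c", str(users), "-r", str(users), "-t", run_time,
            "--csv={}.csv".format(users),
        ])
    return commands


for cmd in build_step_commands([100, 200, 400, 800]):
    print(" ".join(cmd))
    # To run each step in turn, one could call subprocess.run(cmd, check=True).
```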
    

    When a single machine cannot generate enough load, several servers can be used: one master server and a number of slave servers

    • Start the master locust first
    locust -f account_load_test.py --master --host=http://localhost:5000
    
    • Then start the slave locusts
    locust -f account_load_test.py --slave --master-host=10.224.77.11
    
    [Figure: Locust distributed testing]

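    The test reports generated by Locust can be exported as CSV files for detailed analysis, combined with the microservice's own metrics:

    1. System-level metrics: CPU, memory, disk I/O, network I/O, etc.
    2. Application-level metrics: response time, response codes, throughput, and so on
    3. Business-level metrics: data tied to specific business flows.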
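
One simple way to read stepped results is to look for the load level after which throughput stops growing. The sketch below is an illustration of that idea only: the function, the 10% threshold, and the result figures are all hypothetical and are not measurements from this article.

```python
def find_saturation(steps, min_gain=0.10):
    """Return the first user count after which throughput grows by less than min_gain.

    steps: list of (users, requests_per_second) tuples sorted by users.
    """
    for (users, rps), (_next_users, next_rps) in zip(steps, steps[1:]):
        if next_rps < rps * (1 + min_gain):
            return users
    return None  # throughput was still scaling at the highest level tested


# Hypothetical results from four runs.
results = [(100, 310.0), (200, 590.0), (400, 640.0), (800, 630.0)]
print(find_saturation(results))
```

A level flagged this way is where the system-level metrics are worth inspecting first, since some resource has probably become the bottleneck.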

    After that, profile the service with the profiler for its language (C/C++, Python, Java, and so on), such as gprof, cProfile, hprof, VisualVM, or JMC.
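
For Python code, a profiling session with the standard library's cProfile can be as small as the sketch below. The workload `slow_sum` and the helper `profile_call` are made up for the example. In practice the profiled function would be a request handler or another hot path.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive workload to give the profiler something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_call(slow_sum, 100000)
print(report)
```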

    Taking Java, which I know best, as an example: when starting the Java service on the server, add the following options to the command line

    java -Dcom.sun.management.jmxremote.host=10.224.112.73 \
    -Dcom.sun.management.jmxremote.port=9091 \
    -Dcom.sun.management.jmxremote.rmi.port=9012 \
    -Dcom.sun.management.jmxremote.ssl=false \
    -Dcom.sun.management.jmxremote.authenticate=false \
    -jar potato-server.jar
    

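    Then start JVisualVM, which ships with the JDK, on the local machine, add the remote host 10.224.112.73, and create a JMX connection to 10.224.112.73:9091

    [Figure: VisualVM Overview]

    Observe CPU, memory, classes, and threads

    [Figure: VisualVM Monitor]

    Threads

    [Figure: VisualVM Threads]

    CPU or memory can also be sampled over a period of time

    [Figure: VisualVM Sampler]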


    Source: https://www.haomeiwen.com/subject/lvckvctx.html