Restarting and Updating a gunicorn Service Without Downtime

Author: 梟遙書眚 | Published: 2021-01-21 19:39

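    The most painful part of every release is the gap while the service restarts. Without load balancing, or with load balancing that is poorly set up, the restart causes a brief outage, which is a bad experience for users. gunicorn uses a pre-fork master-worker model and manages the processes it forks, so worker processes can be added or removed dynamically. This post goes straight to how gunicorn can update a service without stopping it. The official documentation is at https://docs.gunicorn.org/en/stable/signals.html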

    Signals

    gunicorn manages its processes through signal handling. These are the signals the master process accepts:

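    • QUIT: quick shutdown

    • TERM: graceful shutdown. Waits for workers to finish their current requests, up to the graceful timeout

    • HUP: reload the configuration, start new worker processes with the new configuration, and gracefully shut down the older workers

    • TTIN: increase the number of worker processes by one

    • TTOU: decrease the number of worker processes by one

    • USR1: reopen the log files

    • USR2: upgrade gunicorn on the fly

    • WINCH: gracefully shut down the worker processes when gunicorn is running as a daemon (in the background)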
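
    All of the signals in the list are sent to the master process, so a small helper that reads the master PID from a pidfile can be handy. The following is a minimal sketch, not part of gunicorn: the helper names and the pidfile path are hypothetical, and it assumes gunicorn was started with a pidfile configured.

```python
import os
import signal
from pathlib import Path

# Signal names from the list above, mapped to the standard library constants.
MASTER_SIGNALS = {
    "QUIT": signal.SIGQUIT,
    "TERM": signal.SIGTERM,
    "HUP": signal.SIGHUP,
    "TTIN": signal.SIGTTIN,
    "TTOU": signal.SIGTTOU,
    "USR1": signal.SIGUSR1,
    "USR2": signal.SIGUSR2,
    "WINCH": signal.SIGWINCH,
}


def read_master_pid(pidfile):
    """Return the master PID stored in the pidfile (a single integer)."""
    return int(Path(pidfile).read_text().strip())


def signal_master(name, pidfile, kill=os.kill):
    """Send the named signal to the master process.

    `kill` defaults to os.kill and can be replaced for a dry run.
    """
    pid = read_master_pid(pidfile)
    kill(pid, MASTER_SIGNALS[name])
    return pid


# Example (hypothetical pidfile path): add one worker, then remove one.
# signal_master("TTIN", "/var/run/gunicorn.pid")
# signal_master("TTOU", "/var/run/gunicorn.pid")
```

    This is equivalent to running kill -TTIN or kill -TTOU against the master PID by hand.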

    Of these signals, this post covers only three: HUP, USR2, and TERM.

    HUP

    According to the documentation, HUP can be used to get the effect of a restart. The log from my test looks like this:

    [2021-01-21 17:25:14 +0800] [20388] [INFO] Handling signal: hup
    [2021-01-21 17:25:14 +0800] [20388] [INFO] Hang up: Master
    [2021-01-21 17:25:14 +0800] [29249] [INFO] Booting worker with pid: 29249
    [2021-01-21 17:25:14 +0800] [29248] [INFO] Booting worker with pid: 29248
    [2021-01-21 17:25:14 +0800] [29250] [INFO] Booting worker with pid: 29250
    [2021-01-21 17:25:14 +0800] [28643] [INFO] Shutting down
    [2021-01-21 17:25:14 +0800] [28643] [INFO] Error while closing socket [Errno 9] Bad file descriptor
    [2021-01-21 17:25:14 +0800] [28640] [INFO] Shutting down
    [2021-01-21 17:25:14 +0800] [28640] [INFO] Error while closing socket [Errno 9] Bad file descriptor
    [2021-01-21 17:25:14 +0800] [28642] [INFO] Shutting down
    [2021-01-21 17:25:14 +0800] [28642] [INFO] Error while closing socket [Errno 9] Bad file descriptor
    [2021-01-21 17:25:14 +0800] [28643] [INFO] Finished server process [28643]
    [2021-01-21 17:25:14 +0800] [28643] [INFO] Worker exiting (pid: 28643)
    [2021-01-21 17:25:14 +0800] [28640] [INFO] Finished server process [28640]
    [2021-01-21 17:25:14 +0800] [28640] [INFO] Worker exiting (pid: 28640)
    [2021-01-21 17:25:14 +0800] [28642] [INFO] Finished server process [28642]
    [2021-01-21 17:25:14 +0800] [28642] [INFO] Worker exiting (pid: 28642)
    [2021-01-21 17:25:15 +0800] [29248] [INFO] Started server process [29248]
    [2021-01-21 17:25:15 +0800] [29248] [INFO] Waiting for application startup.
    [2021-01-21 17:25:15 +0800] [29248] [INFO] ASGI 'lifespan' protocol appears unsupported.
    [2021-01-21 17:25:15 +0800] [29248] [INFO] Application startup complete.
    [2021-01-21 17:25:15 +0800] [29249] [INFO] Started server process [29249]
    [2021-01-21 17:25:15 +0800] [29249] [INFO] Waiting for application startup.
    [2021-01-21 17:25:15 +0800] [29249] [INFO] ASGI 'lifespan' protocol appears unsupported.
    [2021-01-21 17:25:15 +0800] [29249] [INFO] Application startup complete.
    

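    In the log, the old workers have all exited at 17:25:14, before the new workers report "Application startup complete" at 17:25:15, so it looks as if the old processes are stopped first and the new ones started afterwards. The gunicorn source, however, starts the new workers first and then compares the number of running workers with the configured number to kill the old ones: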

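    # Simplified version of the method that handles HUP
    # spawn new workers
    for _ in range(self.cfg.workers):
        self.spawn_worker()  # the new workers are started here
    # manage workers
    self.manage_workers()  # kills the oldest workers, ranked by the age value assigned when each worker is started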
    
    # The manage_workers method
    def manage_workers(self):
        """\
        Maintain the number of workers by spawning or killing
        as required.
        """
        if len(self.WORKERS) < self.num_workers:
            self.spawn_workers()

        workers = self.WORKERS.items()
        workers = sorted(workers, key=lambda w: w[1].age)
        while len(workers) > self.num_workers:
            (pid, _) = workers.pop(0)
            self.kill_worker(pid, signal.SIGTERM)

        active_worker_count = len(workers)
        if self._last_logged_active_worker_count != active_worker_count:
            self._last_logged_active_worker_count = active_worker_count
            self.log.debug("{0} workers".format(active_worker_count),
                           extra={"metric": "gunicorn.workers",
                                  "value": active_worker_count,
                                  "mtype": "gauge"})
    
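    The bookkeeping in the two excerpts above can be tried out without forking anything. The toy model below is only a sketch: ToyArbiter and ToyWorker are hypothetical names, PIDs are made-up counters, and a "killed" worker is dropped from the table immediately, whereas a real worker receives SIGTERM and exits on its own schedule.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class ToyWorker:
    pid: int
    age: int  # spawn counter: a lower value means an older worker


class ToyArbiter:
    """Toy model of the reload bookkeeping only; no real processes are forked."""

    def __init__(self, num_workers):
        self.num_workers = num_workers
        self.workers = {}   # pid -> ToyWorker
        self.killed = []    # pids that would have been sent SIGTERM, in order
        self._age = count(1)
        self._pid = count(100)
        for _ in range(num_workers):
            self.spawn_worker()

    def spawn_worker(self):
        worker = ToyWorker(pid=next(self._pid), age=next(self._age))
        self.workers[worker.pid] = worker
        return worker.pid

    def manage_workers(self):
        # Oldest first, mirroring sorted(workers, key=lambda w: w[1].age).
        by_age = sorted(self.workers.values(), key=lambda w: w.age)
        while len(by_age) > self.num_workers:
            oldest = by_age.pop(0)
            self.killed.append(oldest.pid)
            del self.workers[oldest.pid]

    def reload(self):
        for _ in range(self.num_workers):
            self.spawn_worker()   # the new workers are started first
        self.manage_workers()     # then the surplus, oldest workers are stopped


arbiter = ToyArbiter(num_workers=3)
arbiter.reload()
print("stopped:", arbiter.killed, "running:", sorted(arbiter.workers))
```

    Note that the stop signal goes out right after the new workers are forked, not after they have finished starting up. That would be consistent with the log above, where the "Booting worker" lines come before the "Shutting down" lines but the new workers only complete startup after the old ones have exited.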

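    I tested it and there is indeed a problem: requests sent at the moment of the restart raise exceptions. (My service is Django 3.1 served by uvicorn. uvicorn has no process management of its own, so I start uvicorn under gunicorn, which is also what the uvicorn documentation recommends.)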
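
    A test of this kind can be reproduced with a small probe that keeps requesting one URL while the signal is sent from another terminal. This is a sketch using only the standard library; the function names, the URL in the usage comment, and the attempt count and interval are all assumptions to adapt.

```python
import http.client
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=2.0):
    """Return the HTTP status code, or None if the request failed before a response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return None      # refused, reset, or timed out


def probe(url, attempts=200, interval=0.01, fetch=fetch_status):
    """Request `url` repeatedly and return (attempt index, status) for every non-200 result."""
    failures = []
    for i in range(attempts):
        status = fetch(url)
        if status != 200:
            failures.append((i, status))
        time.sleep(interval)
    return failures


# Example (hypothetical address): start this, then send HUP to the master.
# print(probe("http://127.0.0.1:8000/"))
```

    An empty list means no request failed during the window. Any entries show which attempts failed and whether the failure was an HTTP error status or a connection-level error.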

    USR2

    It executes a new binary whose PID file is postfixed with .2 (e.g. /var/run/gunicorn.pid.2), which in turn starts a new master process and new worker processes
    In other words, after the USR2 signal is sent, a new master process and new worker processes are started.
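
    The ".2" pidfile mentioned in the quote gives a script something to wait on after sending USR2. The sketch below assumes gunicorn was started with a pidfile configured; the function name, timeout, and polling interval are hypothetical. The file appearing only shows that the new master has started, not that its workers are ready to serve.

```python
import time
from pathlib import Path


def wait_for_new_master(pidfile, timeout=30.0, poll=0.2):
    """Wait for `<pidfile>.2` to appear and return the PID written in it.

    Raises TimeoutError if the file does not show up (or stays empty) in time.
    """
    new_pidfile = Path(str(pidfile) + ".2")
    deadline = time.monotonic() + timeout
    while True:
        try:
            text = new_pidfile.read_text().strip()
        except FileNotFoundError:
            text = ""
        if text:
            return int(text)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{new_pidfile} did not appear within {timeout}s")
        time.sleep(poll)


# Example (hypothetical path, matching the one in the quoted documentation):
# new_pid = wait_for_new_master("/var/run/gunicorn.pid")
```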

    First, the current processes (for readability I have trimmed the last column of the ps output):

    [root@Luckybamboo report-web]# ps -ef | grep uvicorn.workers
    root      9146     1  0 17:30 pts/7    00:00:00 gunicorn
    root      9168  9146  1 17:30 pts/7    00:00:00 gunicorn
    root      9169  9146  1 17:30 pts/7    00:00:00 gunicorn
    root      9170  9146  1 17:30 pts/7    00:00:00 gunicorn
    

    The current master process is 9146 and its worker processes are 9168, 9169, and 9170.

    After the signal is sent, the process list changes to:

    [root@Luckybamboo report-web]# kill -USR2 9146
    [root@Luckybamboo report-web]# ps -ef | grep uvicorn.workers
    root      9146     1  0 17:30 pts/7    00:00:00 gunicorn 
    root      9168  9146  1 17:30 pts/7    00:00:00 gunicorn 
    root      9169  9146  1 17:30 pts/7    00:00:00 gunicorn 
    root      9170  9146  1 17:30 pts/7    00:00:00 gunicorn 
    root     11562  9146  9 17:32 pts/7    00:00:00 gunicorn 
    root     11564 11562 30 17:32 pts/7    00:00:00 gunicorn 
    root     11565 11562 64 17:32 pts/7    00:00:00 gunicorn 
    root     11566 11562 60 17:32 pts/7    00:00:00 gunicorn 
    

    A new master process, 11562, has been started, along with new worker processes 11564, 11565, and 11566.
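
    Reading the PID and PPID columns by eye gets tedious when two masters are running. The sketch below groups workers under their master from ps -ef text. The function name is hypothetical, and it assumes the default ps -ef column order (UID, PID, PPID, ...). The sample is the listing shown above.

```python
from collections import defaultdict


def group_by_master(ps_output):
    """Map each master PID to the sorted PIDs of its workers, from `ps -ef` text.

    A master is any listed process with listed children; its workers are
    the children that have no children of their own.
    """
    parents = {}
    for line in ps_output.splitlines():
        fields = line.split()
        # `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8 or not fields[1].isdigit() or not fields[2].isdigit():
            continue  # skip headers, shell prompts, and blank lines
        parents[int(fields[1])] = int(fields[2])

    children = defaultdict(list)
    for pid, ppid in parents.items():
        if ppid in parents:
            children[ppid].append(pid)

    masters = set(children)
    return {
        master: sorted(pid for pid in kids if pid not in masters)
        for master, kids in children.items()
    }


sample = """\
root      9146     1  0 17:30 pts/7    00:00:00 gunicorn
root      9168  9146  1 17:30 pts/7    00:00:00 gunicorn
root      9169  9146  1 17:30 pts/7    00:00:00 gunicorn
root      9170  9146  1 17:30 pts/7    00:00:00 gunicorn
root     11562  9146  9 17:32 pts/7    00:00:00 gunicorn
root     11564 11562 30 17:32 pts/7    00:00:00 gunicorn
root     11565 11562 64 17:32 pts/7    00:00:00 gunicorn
root     11566 11562 60 17:32 pts/7    00:00:00 gunicorn
"""
print(group_by_master(sample))
```

    The new master 11562 is itself a child of the old master 9146, so it is listed as a master rather than as one of 9146's workers.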

    At this point the old master 9146 can be stopped with the TERM signal, leaving only the new processes:

    [root@Luckybamboo report-web]# kill -TERM 9146
    [root@Luckybamboo report-web]# ps -ef | grep uvicorn.workers
    root     11562     1  0 17:32 pts/7    00:00:00 gunicorn 
    root     11564 11562  2 17:32 pts/7    00:00:00 gunicorn 
    root     11565 11562  2 17:32 pts/7    00:00:00 gunicorn 
    root     11566 11562  2 17:32 pts/7    00:00:00 gunicorn
    

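    Now only the new processes remain. What I was hoping for is that once the new processes are up, the old ones stop handling new requests. In my tests that is what happened, but I ran only a few tests and did not find this logic in the source, and this signal is meant for upgrading gunicorn itself in place. So it is safer to treat the old processes as ordinary live processes. The documentation also says that if you decide not to keep the new processes you can kill them, and that the old processes keep responding to signals as usual. If anyone knows how to get the behavior I was hoping for, please add it.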
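
    One possible route, going by the signal documentation linked at the top, is to send WINCH to the old master between USR2 and TERM, so that its workers are shut down gracefully while the old master itself stays around for a rollback. The sketch below only wraps those signals: the function names are hypothetical, and since WINCH is documented as applying when gunicorn is daemonized, this may not help for a master running in the foreground.

```python
import os
import signal


def start_upgrade(old_pid, kill=os.kill):
    """USR2: the old master starts a new master, which starts its own workers."""
    kill(old_pid, signal.SIGUSR2)


def drain_old_workers(old_pid, kill=os.kill):
    """WINCH: the old master gracefully shuts down its workers (daemon mode)."""
    kill(old_pid, signal.SIGWINCH)


def finish_upgrade(old_pid, kill=os.kill):
    """TERM: stop the old master once the new one is serving correctly."""
    kill(old_pid, signal.SIGTERM)


def roll_back(old_pid, new_pid, kill=os.kill):
    """HUP the old master so it starts workers again, then TERM the new master."""
    kill(old_pid, signal.SIGHUP)
    kill(new_pid, signal.SIGTERM)


# Example with the PIDs from the listings above (do not run against unrelated PIDs):
# start_upgrade(9146)        # wait until 11562 and its workers are up
# drain_old_workers(9146)    # only the new workers should be serving now
# finish_upgrade(9146)       # or roll_back(9146, 11562) if the new version misbehaves
```

    Keeping each step as a separate call leaves room to check the new workers, for example by requesting a health URL, before deciding between finish_upgrade and roll_back.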
