A Brief Analysis of Django Middleware

Author: llicety | Published: 2018-02-27 09:45

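    Documentation and tutorials often say that when Django handles a request, the middleware runs first, and only if no middleware returns a response does the request reach the view we wrote (either a function-based view or a class-based view, the latter here based on django-rest-framework).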

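    So when and how does Django load middleware, and what does the middleware actually do?
    First, note that a middleware is a class with a set of fixed-name methods (the process_* family). Starting with Django 1.10, middleware classes typically inherit from MiddlewareMixin in django/utils/deprecation.py, which makes each instance a callable object. Its code is as follows: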

    class MiddlewareMixin(object):
        def __init__(self, get_response=None):
            self.get_response = get_response
            super(MiddlewareMixin, self).__init__()
    
        def __call__(self, request):
            response = None
            if hasattr(self, 'process_request'):
                response = self.process_request(request)
            if not response:
                response = self.get_response(request)
            if hasattr(self, 'process_response'):
                response = self.process_response(request, response)
            return response
    

    Other middleware classes can inherit from this class and implement the fixed-name methods they need, which gives you your own middleware.
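As a rough illustration of that pattern, the sketch below copies the mixin into a standalone script so that it runs without Django installed. `TagMiddleware`, `fake_view`, and the plain dicts used as request and response are hypothetical stand-ins, not Django APIs.

```python
class MiddlewareMixin(object):
    """Standalone copy of the mixin shown above, so the sketch needs no Django."""

    def __init__(self, get_response=None):
        self.get_response = get_response

    def __call__(self, request):
        response = None
        if hasattr(self, 'process_request'):
            response = self.process_request(request)
        if not response:
            response = self.get_response(request)
        if hasattr(self, 'process_response'):
            response = self.process_response(request, response)
        return response


class TagMiddleware(MiddlewareMixin):
    """Hypothetical middleware: marks the request, then stamps the response."""

    def process_request(self, request):
        request['seen_by'] = 'TagMiddleware'
        return None  # returning None lets the request continue to the view

    def process_response(self, request, response):
        response['X-Tagged'] = request['seen_by']
        return response


def fake_view(request):
    """Stand-in for a view; plain dicts play the role of request and response."""
    return {'body': 'hello', 'view_saw': request.get('seen_by')}


handler = TagMiddleware(fake_view)
result = handler({'path': '/'})
print(result)
```

The view sees the change made in process_request, and the response carries the header added in process_response, which is the before/after behaviour that `__call__` provides.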

    Now let's trace how Django handles a request from the start, and see where middleware fits in.
    Start with the WSGIHandler class:

    class WSGIHandler(base.BaseHandler):
        request_class = WSGIRequest
    
        def __init__(self, *args, **kwargs):
            super(WSGIHandler, self).__init__(*args, **kwargs)
            self.load_middleware()
    
        def __call__(self, environ, start_response):
            set_script_prefix(get_script_name(environ))
            signals.request_started.send(sender=self.__class__, environ=environ)
            try:
                request = self.request_class(environ)
                # Debug output added by the author; not part of Django's source.
                print("request.COOKIES: %s" % request.COOKIES)
                print("request.HTTP_AUTHORIZATION: %s" % request.META.get('HTTP_AUTHORIZATION', 'No HTTP_AUTHORIZATION'))
            except UnicodeDecodeError:
                logger.warning(
                    'Bad Request (UnicodeDecodeError)',
                    exc_info=sys.exc_info(),
                    extra={
                        'status_code': 400,
                    }
                )
                response = http.HttpResponseBadRequest()
            else:
                response = self.get_response(request)
    
            response._handler_class = self.__class__
    
            status = '%d %s' % (response.status_code, response.reason_phrase)
            response_headers = [(str(k), str(v)) for k, v in response.items()]
            for c in response.cookies.values():
                response_headers.append((str('Set-Cookie'), str(c.output(header=''))))
            start_response(force_str(status), response_headers)
            if getattr(response, 'file_to_stream', None) is not None and environ.get('wsgi.file_wrapper'):
                response = environ['wsgi.file_wrapper'](response.file_to_stream)
            
            # Debug output added by the author; not part of Django's source.
            print("type(response), response: %s %s" % (type(response), response))
            return response
    

    For now, focus on self.load_middleware() in __init__ and on response = self.get_response(request) in __call__.
    The source of load_middleware is as follows:

        def load_middleware(self):
            """
            Populate middleware lists from settings.MIDDLEWARE (or the deprecated
            MIDDLEWARE_CLASSES).
    
            Must be called after the environment is fixed (see __call__ in subclasses).
            """
            self._request_middleware = []
            self._view_middleware = []
            self._template_response_middleware = []
            self._response_middleware = []
            self._exception_middleware = []
    
            if settings.MIDDLEWARE is None:
                warnings.warn(
                    "Old-style middleware using settings.MIDDLEWARE_CLASSES is "
                    "deprecated. Update your middleware and use settings.MIDDLEWARE "
                    "instead.", RemovedInDjango20Warning
                )
                handler = convert_exception_to_response(self._legacy_get_response)
                for middleware_path in settings.MIDDLEWARE_CLASSES:
                    mw_class = import_string(middleware_path)
                    try:
                        mw_instance = mw_class()
                    except MiddlewareNotUsed as exc:
                        if settings.DEBUG:
                            if six.text_type(exc):
                                logger.debug('MiddlewareNotUsed(%r): %s', middleware_path, exc)
                            else:
                                logger.debug('MiddlewareNotUsed: %r', middleware_path)
                        continue
    
                    if hasattr(mw_instance, 'process_request'):
                        self._request_middleware.append(mw_instance.process_request)
                    if hasattr(mw_instance, 'process_view'):
                        self._view_middleware.append(mw_instance.process_view)
                    if hasattr(mw_instance, 'process_template_response'):
                        self._template_response_middleware.insert(0, mw_instance.process_template_response)
                    if hasattr(mw_instance, 'process_response'):
                        self._response_middleware.insert(0, mw_instance.process_response)
                    if hasattr(mw_instance, 'process_exception'):
                        self._exception_middleware.insert(0, mw_instance.process_exception)
            else:
                handler = convert_exception_to_response(self._get_response)
                for middleware_path in reversed(settings.MIDDLEWARE):
                    middleware = import_string(middleware_path)
                    try:
                        mw_instance = middleware(handler)
                    except MiddlewareNotUsed as exc:
                        if settings.DEBUG:
                            if six.text_type(exc):
                                logger.debug('MiddlewareNotUsed(%r): %s', middleware_path, exc)
                            else:
                                logger.debug('MiddlewareNotUsed: %r', middleware_path)
                        continue
    
                    if mw_instance is None:
                        raise ImproperlyConfigured(
                            'Middleware factory %s returned None.' % middleware_path
                        )
    
                    if hasattr(mw_instance, 'process_view'):
                        self._view_middleware.insert(0, mw_instance.process_view)
                    if hasattr(mw_instance, 'process_template_response'):
                        self._template_response_middleware.append(mw_instance.process_template_response)
                    if hasattr(mw_instance, 'process_exception'):
                        self._exception_middleware.append(mw_instance.process_exception)
    
                    handler = convert_exception_to_response(mw_instance)
    
            # We only assign to this when initialization is complete as it is used
            # as a flag for initialization being complete.
            self._middleware_chain = handler
    

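    The main job of self.load_middleware() is to read the middleware configured in settings and initialize the handler's middleware-related attributes. With the legacy MIDDLEWARE_CLASSES setting, the process_* methods are collected into lists such as self._request_middleware, self._view_middleware, and self._response_middleware. With MIDDLEWARE, only the view, template-response, and exception lists are filled, and the middleware instances themselves are nested into self._middleware_chain.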
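The MIDDLEWARE branch can be sketched in a few lines. This is a simplified, Django-free approximation: `build_chain`, `Recorder`, `First`, and `Second` are hypothetical names, and exception wrapping and MiddlewareNotUsed handling are left out.

```python
def build_chain(view, middleware_classes):
    """Simplified take on the MIDDLEWARE branch of load_middleware."""
    handler = view
    view_middleware = []
    # Iterate in reverse so that the first class listed ends up outermost.
    for mw_class in reversed(middleware_classes):
        mw_instance = mw_class(handler)
        if hasattr(mw_instance, 'process_view'):
            view_middleware.insert(0, mw_instance.process_view)
        handler = mw_instance
    return handler, view_middleware


class Recorder(object):
    """Hypothetical middleware that logs when it is entered and left."""
    name = None

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        request['log'].append(self.name + ' in')
        response = self.get_response(request)
        request['log'].append(self.name + ' out')
        return response

    def process_view(self, request, view_func, view_args, view_kwargs):
        return None


class First(Recorder):
    name = 'First'


class Second(Recorder):
    name = 'Second'


def view(request):
    request['log'].append('view')
    return 'response'


chain, view_middleware = build_chain(view, [First, Second])
request = {'log': []}
chain(request)
view_order = [method.__self__.name for method in view_middleware]
print(request['log'])
print(view_order)
```

Loading in reverse order is what lets the first entry in the setting see the request first and the response last, while insert(0, ...) keeps the process_view list in the same order as the setting.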

    The call response = self.get_response(request) in WSGIHandler's __call__ is the entry point where Django starts processing the request:

        def get_response(self, request):
            """Return an HttpResponse object for the given HttpRequest."""
            # Setup default url resolver for this thread
            set_urlconf(settings.ROOT_URLCONF)
    
            response = self._middleware_chain(request)
    
            # This block is only needed for legacy MIDDLEWARE_CLASSES; if
            # MIDDLEWARE is used, self._response_middleware will be empty.
            try:
                # Apply response middleware, regardless of the response
                for middleware_method in self._response_middleware:
                    response = middleware_method(request, response)
                    # Complain if the response middleware returned None (a common error).
                    if response is None:
                        raise ValueError(
                            "%s.process_response didn't return an "
                            "HttpResponse object. It returned None instead."
                            % (middleware_method.__self__.__class__.__name__))
            except Exception:  # Any exception should be gathered and handled
                signals.got_request_exception.send(sender=self.__class__, request=request)
                response = self.handle_uncaught_exception(request, get_resolver(get_urlconf()), sys.exc_info())
    
            response._closable_objects.append(request)
    
            # If the exception handler returns a TemplateResponse that has not
            # been rendered, force it to be rendered.
            if not getattr(response, 'is_rendered', True) and callable(getattr(response, 'render', None)):
                response = response.render()
    
            if response.status_code == 404:
                logger.warning(
                    'Not Found: %s', request.path,
                    extra={'status_code': 404, 'request': request},
                )
    
            return response
    

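    In get_response, the line to focus on is response = self._middleware_chain(request). self._middleware_chain is initialized when WSGIHandler's __init__ calls self.load_middleware(). When the middleware is configured through MIDDLEWARE_CLASSES, _middleware_chain is simply the decorated _legacy_get_response method. When it is configured through MIDDLEWARE, _middleware_chain is the (decorated) middleware object listed first in the setting, whose get_response attribute wraps all the middleware loaded before it, with _get_response at the innermost level (this is my own way of putting it). See the load_middleware source above for details.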

    Next, look at _get_response, the function that actually processes the request. Once you understand it, you basically understand how Django handles a request:

        def _get_response(self, request):
            """
            Resolve and call the view, then apply view, exception, and
            template_response middleware. This method is everything that happens
            inside the request/response middleware.
            """
            response = None
    
            if hasattr(request, 'urlconf'):
                urlconf = request.urlconf
                set_urlconf(urlconf)
                resolver = get_resolver(urlconf)
            else:
                resolver = get_resolver()
    
            resolver_match = resolver.resolve(request.path_info)
            callback, callback_args, callback_kwargs = resolver_match
            request.resolver_match = resolver_match
    
            # Apply view middleware
            for middleware_method in self._view_middleware:
                response = middleware_method(request, callback, callback_args, callback_kwargs)
                if response:
                    break
    
            if response is None:
                wrapped_callback = self.make_view_atomic(callback)
                try:
                    response = wrapped_callback(request, *callback_args, **callback_kwargs)
                except Exception as e:
                    response = self.process_exception_by_middleware(e, request)
    
            # Complain if the view returned None (a common error).
            if response is None:
                if isinstance(callback, types.FunctionType):    # FBV
                    view_name = callback.__name__
                else:                                           # CBV
                    view_name = callback.__class__.__name__ + '.__call__'
    
                raise ValueError(
                    "The view %s.%s didn't return an HttpResponse object. It "
                    "returned None instead." % (callback.__module__, view_name)
                )
    
            # If the response supports deferred rendering, apply template
            # response middleware and then render the response
            elif hasattr(response, 'render') and callable(response.render):
                for middleware_method in self._template_response_middleware:
                    response = middleware_method(request, response)
                    # Complain if the template response middleware returned None (a common error).
                    if response is None:
                        raise ValueError(
                            "%s.process_template_response didn't return an "
                            "HttpResponse object. It returned None instead."
                            % (middleware_method.__self__.__class__.__name__)
                        )
    
                try:
                    response = response.render()
                except Exception as e:
                    response = self.process_exception_by_middleware(e, request)
    
            return response
    

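    _get_response first resolves the requested URL to find the view written by the developer, which is the callback in callback, callback_args, callback_kwargs = resolver_match. The view is wrapped by wrapped_callback = self.make_view_atomic(callback) and actually called in response = wrapped_callback(request, *callback_args, **callback_kwargs). The order of execution shows that the view runs only if no middleware has already produced a response. This is what the opening paragraph meant: Django's request handling starts with middleware and reaches the view last.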
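The short-circuit described here can be reproduced with a stripped-down version of the view-middleware loop. This is only a sketch: `get_response_sketch`, `allow`, and `block` are hypothetical names, strings stand in for HttpResponse objects, and URL resolution and exception handling are omitted.

```python
def get_response_sketch(request, view, view_middleware):
    """Reduced version of the view-middleware loop in _get_response."""
    response = None
    for middleware_method in view_middleware:
        response = middleware_method(request, view, (), {})
        if response:
            break  # a middleware answered, so the view is skipped

    if response is None:
        response = view(request)

    if response is None:
        raise ValueError("The view didn't return a response.")
    return response


calls = []


def view(request):
    calls.append('view')
    return 'view response'


def allow(request, view_func, view_args, view_kwargs):
    calls.append('allow')
    return None  # no opinion: continue to the next middleware or the view


def block(request, view_func, view_args, view_kwargs):
    calls.append('block')
    return 'blocked by middleware'


first = get_response_sketch({}, view, [allow])
calls_first = list(calls)
del calls[:]
second = get_response_sketch({}, view, [allow, block])
calls_second = list(calls)
print(first, calls_first)
print(second, calls_second)
```

In the second call the view never runs, because block returned a response before the loop finished.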
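    In the Django 1.10 source, nothing appears to call the middleware methods such as process_request explicitly, so who calls them? The key is _middleware_chain. Before 1.10, load_middleware placed each middleware method into a fixed list, and the handler ran those lists at fixed points in the flow. From 1.10 on, _middleware_chain is itself a middleware object: each time load_middleware loads another middleware, it passes the current handler in as that middleware's get_response, so the newest object wraps everything loaded before it. Calling _middleware_chain therefore calls the outermost middleware's __call__, which calls the next middleware's __call__ through get_response, and so on, and each __call__ runs that middleware's own process_* methods. This is the hard part; studying convert_exception_to_response, which load_middleware uses, makes it clear.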

    def convert_exception_to_response(get_response):
        """
        Wrap the given get_response callable in exception-to-response conversion.
    
        All exceptions will be converted. All known 4xx exceptions (Http404,
        PermissionDenied, MultiPartParserError, SuspiciousOperation) will be
        converted to the appropriate response, and all other exceptions will be
        converted to 500 responses.
    
        This decorator is automatically applied to all middleware to ensure that
        no middleware leaks an exception and that the next middleware in the stack
        can rely on getting a response instead of an exception.
        """
        @wraps(get_response, assigned=available_attrs(get_response))
        def inner(request):
            try:
                response = get_response(request)
            except Exception as exc:
                response = response_for_exception(request, exc)
            return response
        return inner
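To see the exception-to-response conversion on its own, here is a minimal sketch that runs without Django. It assumes dicts in place of HttpResponse objects and a hypothetical `NotFound` exception in place of Http404; the real decorator delegates to response_for_exception instead of building responses inline.

```python
from functools import wraps


class NotFound(Exception):
    """Hypothetical stand-in for Http404."""


def convert_exception_to_response(get_response):
    """Simplified stand-in: every exception becomes a response dict."""
    @wraps(get_response)
    def inner(request):
        try:
            response = get_response(request)
        except NotFound as exc:
            response = {'status': 404, 'reason': str(exc)}
        except Exception as exc:
            response = {'status': 500, 'reason': str(exc)}
        return response
    return inner


def broken_view(request):
    raise RuntimeError('boom')


def missing_view(request):
    raise NotFound('no such page')


def outer_middleware(get_response):
    """Hypothetical middleware that assumes it always gets a response back."""
    def middleware(request):
        response = get_response(request)
        response['seen_by_outer'] = True
        return response
    return middleware


chain = convert_exception_to_response(
    outer_middleware(convert_exception_to_response(broken_view)))
result_500 = chain({})
result_404 = convert_exception_to_response(missing_view)({})
print(result_500)
print(result_404)
```

Because each layer is wrapped, the outer middleware receives a 500 response rather than an exception, which is the guarantee described in the docstring above.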
    

    When a piece of code is hard to understand, it helps to write a small example and experiment with it.

    from functools import wraps

    def available_attrs(fn):
        """
        Return the list of functools-wrappable attributes on a callable.
        This is required as a workaround for http://bugs.python.org/issue3445
        under Python 2.
        """
        WRAPPER_ASSIGNMENTS = ('__module__', '__name__', '__doc__')

        return tuple(a for a in WRAPPER_ASSIGNMENTS if hasattr(fn, a))

    def convert_exception_to_response(get_response):
        """
        Simplified version of Django's decorator: call get_response and
        print any exception instead of converting it to a response.
        """
        @wraps(get_response, assigned=available_attrs(get_response))
        def inner():
            response = None
            try:
                response = get_response()
            except Exception as exc:
                print(exc)
            return response
        return inner

    def get_response():
        # Stands in for the innermost handler (_get_response in Django).
        print("xxxxx")

    class A1(object):
        def __init__(self, f):
            self.f = f
            print("A1 init")

        def __call__(self, *args, **kwargs):
            self.f()
            print("A1 call")

    class A2(object):
        def __init__(self, f):
            self.f = f
            print("A2 init")

        def __call__(self, *args, **kwargs):
            self.f()
            print("A2 call")

    class A3(object):
        def __init__(self, f):
            self.f = f
            print("A3 init")

        def __call__(self, *args, **kwargs):
            self.f()
            print("A3 call")


    # Build the chain the same way load_middleware does: each new object
    # receives the handler built so far.
    f = convert_exception_to_response(get_response)
    f = convert_exception_to_response(A1(f))
    f = convert_exception_to_response(A2(f))
    # wraps() copies the wrapped object's __dict__ onto inner, so f.f is the
    # handler stored on the A2 instance, i.e. the wrapped A1.
    f.f()
    f = convert_exception_to_response(A3(f))
    # Calling the outermost handler runs A3 -> A2 -> A1 -> get_response.
    f()
    

    The output is:

    A1 init
    A2 init
    xxxxx
    A1 call
    A3 init
    xxxxx
    A1 call
    A2 call
    A3 call
    

    The small example makes it fairly clear what convert_exception_to_response does.

    For example, process_request, the authentication hook in django.contrib.auth.middleware.AuthenticationMiddleware, is called at this point:

    class AuthenticationMiddleware(MiddlewareMixin):
        def process_request(self, request):
            assert hasattr(request, 'session'), (
                "The Django authentication middleware requires session middleware "
                "to be installed. Edit your MIDDLEWARE%s setting to insert "
                "'django.contrib.sessions.middleware.SessionMiddleware' before "
                "'django.contrib.auth.middleware.AuthenticationMiddleware'."
            ) % ("_CLASSES" if settings.MIDDLEWARE is None else "")
            request.user = SimpleLazyObject(lambda: get_user(request))
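The last line of process_request defers the user lookup until request.user is first used. The idea can be sketched with a much-reduced lazy wrapper; `LazyObject`, `User`, and this `get_user` are hypothetical stand-ins, not Django's SimpleLazyObject or its auth API, and a dict plays the role of the request.

```python
class LazyObject(object):
    """Hypothetical, much-reduced stand-in for SimpleLazyObject."""

    def __init__(self, factory):
        self._factory = factory
        self._wrapped = None
        self._loaded = False

    def __getattr__(self, name):
        # Only reached for attributes not found on the wrapper itself,
        # i.e. attributes that belong to the wrapped object.
        if not self._loaded:
            self._wrapped = self._factory()
            self._loaded = True
        return getattr(self._wrapped, name)


class User(object):
    def __init__(self, username):
        self.username = username


lookups = []


def get_user(request):
    """Pretend database lookup that records each time it runs."""
    lookups.append(request['session_key'])
    return User('alice')


request = {'session_key': 'abc123'}
request['user'] = LazyObject(lambda: get_user(request))

lookups_before = len(lookups)          # nothing has been looked up yet
name = request['user'].username        # first access triggers the lookup
name_again = request['user'].username  # later accesses reuse the result
lookups_after = len(lookups)
print(name, lookups_before, lookups_after)
```

A request that never touches the user therefore pays nothing for the lookup.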
    

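    According to material online, middleware started inheriting from MiddlewareMixin in Django 1.10; in earlier versions middleware classes had no such base class, and those are the traditional (legacy) middleware.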

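    To summarize: before Django 1.10, all middleware methods were added to specific lists, and the handler called the methods in those lists in order to process the request and the response. From 1.10 on, a middleware is a callable object: calling it invokes __call__, which in turn runs process_request, get_response, and process_response. AuthenticationMiddleware, for example, uses process_request to initialize request.user.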
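The two styles can be compared side by side in a small Django-free sketch. `Mw`, `legacy_style`, and `new_style` are hypothetical names, and the sketch covers only the normal path where no middleware short-circuits.

```python
class Mw(object):
    """Hypothetical middleware usable in both styles."""

    def __init__(self, name, log, get_response=None):
        self.name = name
        self.log = log
        self.get_response = get_response

    def process_request(self, request):
        self.log.append(self.name + '.process_request')
        return None

    def process_response(self, request, response):
        self.log.append(self.name + '.process_response')
        return response

    def __call__(self, request):
        response = self.process_request(request)
        if not response:
            response = self.get_response(request)
        return self.process_response(request, response)


def view(request):
    return 'ok'


def legacy_style(names, log, request):
    """Pre-1.10 idea: collect the methods in lists, then run the lists."""
    middlewares = [Mw(name, log) for name in names]
    request_methods = [m.process_request for m in middlewares]
    response_methods = [m.process_response for m in reversed(middlewares)]
    response = None
    for method in request_methods:
        response = method(request)
        if response:
            break
    if response is None:
        response = view(request)
    for method in response_methods:
        response = method(request, response)
    return response


def new_style(names, log, request):
    """1.10+ idea: nest callables, then call the outermost one."""
    handler = view
    for name in reversed(names):
        handler = Mw(name, log, handler)
    return handler(request)


legacy_log, new_log = [], []
legacy_result = legacy_style(['A', 'B'], legacy_log, {})
new_result = new_style(['A', 'B'], new_log, {})
print(legacy_log)
print(new_log)
```

Both approaches call the hooks in the same order here; the difference is who drives the calls, the handler's loops or the middleware objects themselves.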
    [Figure borrowed from the web: image.png]

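    Use cases for middleware
    Because middleware runs before and after the view function (much like a decorator applied to every view), it is well suited to processing all requests, or a subset of them, in one place.

    1. IP restriction
    Add it to the middleware list to block access from certain IPs.

    2. URL access filtering
    If the user requests the login view, let the request through.
    If the user requests any other view, check for an existing session: let the request through if there is one, otherwise redirect to login. This saves writing a decorator on many view functions.

    3. Caching (remember CDNs?)
    When a client request arrives, the middleware checks the cache; if the data is there it is returned directly, otherwise the request goes on to the logic layer and the view function runs.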
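Two of these use cases, IP restriction and caching, can be sketched as new-style callable middleware. The example runs without Django: `BlockedIPMiddleware`, `TinyCacheMiddleware`, the blocked address, and the dict-based request and response are all hypothetical, and a real cache would also need expiry and per-user handling.

```python
class BlockedIPMiddleware(object):
    """Hypothetical middleware that rejects requests from listed IPs."""
    BLOCKED = {'10.0.0.66'}

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if request['REMOTE_ADDR'] in self.BLOCKED:
            # Short-circuit: neither the cache nor the view is reached.
            return {'status': 403, 'body': 'Forbidden'}
        return self.get_response(request)


class TinyCacheMiddleware(object):
    """Hypothetical middleware that caches responses by path, in memory."""

    def __init__(self, get_response):
        self.get_response = get_response
        self.cache = {}

    def __call__(self, request):
        path = request['path']
        if path in self.cache:
            return self.cache[path]  # cache hit: the view is skipped
        response = self.get_response(request)
        self.cache[path] = response
        return response


view_calls = []


def view(request):
    view_calls.append(request['path'])
    return {'status': 200, 'body': 'page for ' + request['path']}


# The IP check wraps the cache, which wraps the view.
handler = BlockedIPMiddleware(TinyCacheMiddleware(view))

blocked = handler({'REMOTE_ADDR': '10.0.0.66', 'path': '/a'})
first = handler({'REMOTE_ADDR': '10.0.0.1', 'path': '/a'})
second = handler({'REMOTE_ADDR': '10.0.0.1', 'path': '/a'})
print(blocked['status'], first['status'], len(view_calls))
```

Putting the IP check outermost means a blocked client never reaches the cache, and the second allowed request is served without running the view again.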

    References:
    https://docs.djangoproject.com/en/2.0/topics/http/middleware/
    https://code.ziqiangxuetang.com/django/django-middleware.html
    http://www.cnblogs.com/huchong/p/7819296.html
    http://daoluan.net/%E5%AD%A6%E4%B9%A0%E6%80%BB%E7%BB%93/2013/09/13/decode-django-have-look-at-middleware.html
