Flask Multipart Parsing

Author: 霡霂976447044 | Published: 2019-09-29 08:38

    1. multipart/form-data

    POST http://www.example.com HTTP/1.1
    Content-Type:multipart/form-data; boundary=----WebKitFormBoundaryrGKCBY7qhFd3TrwA
    
    ------WebKitFormBoundaryrGKCBY7qhFd3TrwA
    Content-Disposition: form-data; name="text"
    
    title
    ------WebKitFormBoundaryrGKCBY7qhFd3TrwA
    Content-Disposition: form-data; name="file"; filename="chrome.png"
    Content-Type: image/png
    
    PNG ... content of chrome.png ...
    ------WebKitFormBoundaryrGKCBY7qhFd3TrwA--
    

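    Excerpt from the web:
    This is another common way to submit POST data. To upload a file through a form, the <form> element's enctype must be set to multipart/form-data. The request at the top of this section is an example.
    The example is slightly more involved. First a boundary is generated to separate the fields; it is long and random-looking so that it does not collide with the body content. The Content-Type header then declares that the data is encoded as multipart/form-data and states the boundary used for this request. The message body is divided into one structurally similar part per field. Each part starts with --boundary, followed by the part's descriptive headers, then a blank line (CRLF), and finally the field's content (text or binary). If the part carries a file, the headers also include the file name and the file type. The body ends with --boundary--. For the full definition of multipart/form-data, see RFC 1867 (RFC 7578 is the current specification).
    This encoding is mainly used for file uploads, and all major server-side languages support it well.
    Both POST encodings discussed here, application/x-www-form-urlencoded and multipart/form-data, are supported natively by browsers, and in the current standard a native <form> supports essentially only these two (selected with the enctype attribute of <form>, which defaults to application/x-www-form-urlencoded; enctype also accepts text/plain, but that is rarely used).
    As more and more websites, web apps in particular, use Ajax for all data exchange, we are free to define new ways of submitting data that make development more convenient.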
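
The body layout described in the excerpt can be reproduced by hand. The following is a minimal sketch, assuming field names and file names contain no quotes or line breaks (a real client would have to escape them). `encode_multipart` and the `SketchBoundary` prefix are hypothetical names used only for illustration.

```python
import uuid


def encode_multipart(fields, files):
    """Encode text fields and files as a multipart/form-data body.

    fields: dict of name -> str
    files:  dict of name -> (filename, content_type, bytes)
    Returns (content_type_header_value, body_bytes).
    """
    # A random boundary makes a collision with the payload very unlikely.
    boundary = "----SketchBoundary" + uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []

    for name, value in fields.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",  # blank line separates the part headers from the content
            value.encode("utf-8"),
        ]

    for name, (filename, content_type, data) in files.items():
        disposition = f'form-data; name="{name}"; filename="{filename}"'
        lines += [
            delimiter,
            f"Content-Disposition: {disposition}".encode("utf-8"),
            f"Content-Type: {content_type}".encode("utf-8"),
            b"",
            data,
        ]

    lines += [delimiter + b"--", b""]  # closing delimiter: --boundary--
    body = b"\r\n".join(lines)
    return f"multipart/form-data; boundary={boundary}", body


content_type, body = encode_multipart(
    {"text": "title"},
    {"file": ("chrome.png", "image/png", b"\x89PNG...")},
)
print(content_type)
print(body.decode("latin-1"))
```

Note that every line ends with CRLF, and that the delimiter in the body is the boundary from the header prefixed with two extra hyphens.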

    2. socket.socket.makefile

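    This method returns a file object associated with a socket connection, which makes stream-style operations on the socket (such as reading line by line) convenient and efficient.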
    server.py

    import socket
    import io

    HOST = 'localhost'
    PORT = 8001

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts without "address already in use" errors.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    s.bind((HOST, PORT))
    s.listen(5)

    print('Server start at: %s:%s' % (HOST, PORT))
    print('wait for connection...')

    while True:
        conn, addr = s.accept()
        print('Connected by ', addr)
        # Wrap the socket in a buffered binary reader so readline() is available.
        f_conn: io.BufferedReader = conn.makefile('rb')
        print(type(f_conn))
        while True:
            try:
                data = f_conn.readline()
                if not data:  # b'' means the client closed the connection
                    break
                print(data)
                conn.sendall(b"server received your message.")
            except Exception as e:
                print(e)
                break
        f_conn.close()
        conn.close()
    

    client.py

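    import socket

    HOST = 'localhost'
    PORT = 8001

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST, PORT))

    while True:
        try:
            cmd = input("Please input msg:")
            # The trailing newline lets the server's readline() return.
            s.sendall((cmd + '\n').encode())
            print(cmd.encode())
        except (EOFError, KeyboardInterrupt):
            break  # Ctrl-D / Ctrl-C ends the session
        except OSError as e:
            print(e)
            break  # the connection is gone; stop instead of looping forever
    s.close()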
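
The same line-oriented reading can be tried in a single process. This is a small sketch that assumes `socket.socketpair()` as a stand-in for the separate server and client scripts; everything else uses only `makefile` and `readline` as in the example above.

```python
import socket

# socketpair() returns two already-connected sockets, so no listening
# server or second script is needed for the demonstration.
server_sock, client_sock = socket.socketpair()

client_sock.sendall(b"first line\nsecond line\n")
client_sock.close()  # closing the sender lets the reader see end-of-stream

received = []
with server_sock, server_sock.makefile("rb") as reader:
    while True:
        line = reader.readline()
        if not line:  # b"" means the peer closed the connection
            break
        received.append(line)

print(received)  # [b'first line\n', b'second line\n']
```

The file object does the buffering and the splitting on newlines, which is exactly what a line-based multipart parser needs from its input.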
    

    3. Flask Code (Werkzeug form parser)

    def _make_chunk_iter(stream, limit, buffer_size):
        """Helper for the line and chunk iter functions."""
        if isinstance(stream, (bytes, bytearray, text_type)):
            raise TypeError(
                "Passed a string or byte object instead of true iterator or stream."
            )
        if not hasattr(stream, "read"):
            for item in stream:
                if item:
                    yield item
            return
        if not isinstance(stream, LimitedStream) and limit is not None:
            stream = LimitedStream(stream, limit)
        _read = stream.read
        while 1:
            item = _read(buffer_size)
            if not item:
                break
            yield item
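
The role of the `limit` argument can be illustrated without Werkzeug. The sketch below is a simplified stand-in for the LimitedStream wrapping above, not the library's code: `iter_chunks` is a hypothetical helper that stops after `limit` bytes, which matters because reading past Content-Length may block on some WSGI servers.

```python
import io


def iter_chunks(stream, limit, buffer_size):
    """Yield chunks of at most buffer_size bytes, never reading past limit."""
    remaining = limit
    while remaining > 0:
        chunk = stream.read(min(buffer_size, remaining))
        if not chunk:  # the stream ended before the limit was reached
            break
        remaining -= len(chunk)
        yield chunk


# Ten bytes of "request body" followed by bytes that must not be consumed.
stream = io.BytesIO(b"0123456789EXTRA")
chunks = list(iter_chunks(stream, limit=10, buffer_size=4))
print(chunks)         # [b'0123', b'4567', b'89']
print(stream.read())  # b'EXTRA'
```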
    
    def parse_lines(self, file, boundary, content_length, cap_at_buffer=True):
        """Generate parts of
        ``('begin_form', (headers, name))``
        ``('begin_file', (headers, name, filename))``
        ``('cont', bytestring)``
        ``('end', None)``
    
        Always obeys the grammar
        parts = ( begin_form cont* end |
                  begin_file cont* end )*
        """
        next_part = b"--" + boundary
        last_part = next_part + b"--"
    
        iterator = chain(
            make_line_iter(
                file,
                limit=content_length,
                buffer_size=self.buffer_size,
                cap_at_buffer=cap_at_buffer,
            ),
            _empty_string_iter,
        )
    
        terminator = self._find_terminator(iterator)
    
        if terminator == last_part:
            return
        elif terminator != next_part:
            self.fail("Expected boundary at start of multipart data")
    
        while terminator != last_part:
            headers = parse_multipart_headers(iterator)
    
            disposition = headers.get("content-disposition")
            if disposition is None:
                self.fail("Missing Content-Disposition header")
            disposition, extra = parse_options_header(disposition)
            transfer_encoding = self.get_part_encoding(headers)
            name = extra.get("name")
            filename = extra.get("filename")
    
            # if no content type is given we stream into memory.  A list is
            # used as a temporary container.
            if filename is None:
                yield _begin_form, (headers, name)
    
            # otherwise we parse the rest of the headers and ask the stream
            # factory for something we can write in.
            else:
                yield _begin_file, (headers, name, filename)
    
            buf = b""
            for line in iterator:
                if not line:
                    self.fail("unexpected end of stream")
    
                if line[:2] == b"--":
                    terminator = line.rstrip()
                    if terminator in (next_part, last_part):
                        break
    
                if transfer_encoding is not None:
                    if transfer_encoding == "base64":
                        transfer_encoding = "base64_codec"
                    try:
                        line = codecs.decode(line, transfer_encoding)
                    except Exception:
                        self.fail("could not decode transfer encoded chunk")
    
                # we have something in the buffer from the last iteration.
                # this is usually a newline delimiter.
                if buf:
                    yield _cont, buf
                    buf = b""
    
                # If the line ends with windows CRLF we write everything except
                # the last two bytes.  In all other cases however we write
                # everything except the last byte.  If it was a newline, that's
                # fine, otherwise it does not matter because we will write it
                # the next iteration.  this ensures we do not write the
                # final newline into the stream.  That way we do not have to
                # truncate the stream.  However we do have to make sure that
                # if something else than a newline is in there we write it
                # out.
                if line[-2:] == b"\r\n":
                    buf = b"\r\n"
                    cutoff = -2
                else:
                    buf = line[-1:]
                    cutoff = -1
                yield _cont, line[:cutoff]  # emit this piece of the part's content (form value or uploaded file data)
    
            else:  # pragma: no cover
                raise ValueError("unexpected end of part")
    
            # if we have a leftover in the buffer that is not a newline
            # character we have to flush it, otherwise we will chop of
            # certain values.
            if buf not in (b"", b"\r", b"\n", b"\r\n"):
                yield _cont, buf
    
            yield _end, None
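
The event grammar used by parse_lines can be tried out on a small scale. What follows is a simplified sketch rather than Werkzeug's implementation: it reads with `readline()` instead of `make_line_iter`, ignores transfer encodings and buffer caps, and parses header parameters naively (it assumes no `;` or escaped quotes inside values). The name `parse_lines_sketch` is hypothetical.

```python
import io


def parse_lines_sketch(stream, boundary):
    """Yield ('begin_form' | 'begin_file' | 'cont' | 'end', payload) events."""
    next_part = b"--" + boundary
    last_part = next_part + b"--"
    lines = iter(stream.readline, b"")  # stops at end of stream

    # Skip any preamble up to the first delimiter line.
    terminator = b""
    for line in lines:
        terminator = line.rstrip()
        if terminator in (next_part, last_part):
            break
    if terminator not in (next_part, last_part):
        raise ValueError("expected boundary at start of multipart data")

    while terminator == next_part:
        # Part headers run until the first blank line.
        headers = {}
        for line in lines:
            line = line.strip()
            if not line:
                break
            key, _, value = line.decode("utf-8").partition(":")
            headers[key.strip().lower()] = value.strip()

        params = {}
        disposition = headers.get("content-disposition", "")
        for item in disposition.split(";")[1:]:
            key, _, value = item.strip().partition("=")
            params[key] = value.strip('"')

        if "filename" in params:
            yield "begin_file", (headers, params.get("name"), params["filename"])
        else:
            yield "begin_form", (headers, params.get("name"))

        held_back = b""  # line ending of the previous line, not yet emitted
        for line in lines:
            if line[:2] == b"--" and line.rstrip() in (next_part, last_part):
                terminator = line.rstrip()
                break
            if held_back:
                yield "cont", held_back
            # Hold the line ending back: if a delimiter comes next, the
            # newline belongs to the delimiter and not to the content.
            if line.endswith(b"\r\n"):
                held_back, line = b"\r\n", line[:-2]
            elif line.endswith(b"\n"):
                held_back, line = b"\n", line[:-1]
            else:
                held_back = b""
            yield "cont", line
        else:
            raise ValueError("unexpected end of stream")

        yield "end", None


body = (
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="text"\r\n'
    b"\r\n"
    b"title\r\n"
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="file"; filename="a.txt"\r\n'
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"line one\r\nline two\r\n"
    b"--XYZ--\r\n"
)
events = list(parse_lines_sketch(io.BytesIO(body), b"XYZ"))
for kind, payload in events:
    print(kind, payload)
```

Joining the `cont` payloads of the file part gives `b"line one\r\nline two"`: the inner CRLF is kept, while the CRLF in front of the closing delimiter is dropped, which is the purpose of the buffering in the original code.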
    
    class MultiPartParser(object):
        def parse_parts(self, file, boundary, content_length):
            """Generate ``('file', (name, val))`` and
            ``('form', (name, val))`` parts.
            """
            in_memory = 0
    
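            # parse_lines yields (event type, payload) pairs
            for ellt, ell in self.parse_lines(file, boundary, content_length):
                if ellt == _begin_file:  # a file part starts; _begin_file is just a string constant
                    headers, name, filename = ell
                    is_file = True
                    guard_memory = False
                    # container is the stream the upload is written into; it is what
                    # request.files.get('avatar').save() later copies to local disk
                    filename, container = self.start_file_streaming(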
                        filename, headers, content_length
                    )
                    _write = container.write
    
                elif ellt == _begin_form:
                    headers, name = ell
                    is_file = False
                    container = []
                    _write = container.append
                    guard_memory = self.max_form_memory_size is not None
    
                elif ellt == _cont:  # a chunk of the current part's content
                    _write(ell)  # write to the file container, or append to the in-memory list for form fields
                    # if we write into memory and there is a memory size limit we
                    # count the number of bytes in memory and raise an exception if
                    # there is too much data in memory.
                    if guard_memory:
                        in_memory += len(ell)
                        if in_memory > self.max_form_memory_size:
                            self.in_memory_threshold_reached(in_memory)
    
                elif ellt == _end:
                    if is_file:
                        container.seek(0)
                        yield (
                            "file",
                            (name, FileStorage(container, filename, name, headers=headers)),
                        )
                    else:
                        part_charset = self.get_part_charset(headers)
                        yield (
                            "form",
                            (name, b"".join(container).decode(part_charset, self.errors)),
                        )

        def parse(self, file, boundary, content_length):
            formstream, filestream = tee(
                self.parse_parts(file, boundary, content_length), 2
            )
            form = (p[1] for p in formstream if p[0] == "form")  # (name, str) pairs for plain fields
            files = (p[1] for p in filestream if p[0] == "file")  # (name, FileStorage) pairs for uploads
            # self.cls turns the (name, value) pairs into a dict-like object keyed by
            # name, which is why request.files.get('avatar').save(...) works directly
            return self.cls(form), self.cls(files)
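
The `tee` trick in `parse` can be shown in isolation. In this sketch `tagged_parts` is a hypothetical stand-in for `parse_parts`, and a plain `dict` stands in for `self.cls` (in Werkzeug that is a multi-valued dictionary type, so repeated names would not overwrite each other as they do here).

```python
from itertools import tee


def tagged_parts():
    """Stand-in for parse_parts(): yields ('form' | 'file', (name, value))."""
    yield "form", ("text", "title")
    yield "file", ("avatar", b"PNG bytes")
    yield "form", ("comment", "hello")


# tee() gives two independent iterators over the same underlying generator.
form_stream, file_stream = tee(tagged_parts(), 2)
form = dict(p[1] for p in form_stream if p[0] == "form")
files = dict(p[1] for p in file_stream if p[0] == "file")

print(form)   # {'text': 'title', 'comment': 'hello'}
print(files)  # {'avatar': b'PNG bytes'}
```

Because the form iterator is consumed completely before the file iterator, `tee` keeps every item in its internal buffer until the second pass has read it.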
    
    class FormDataParser(object):
        def __init__(
            self,
            stream_factory=None,
            charset="utf-8",
            errors="replace",
            max_form_memory_size=None,
            max_content_length=None,
            cls=None,
            silent=True,
        ):
            ...  # body omitted in this excerpt

        @exhaust_stream
        def _parse_multipart(self, stream, mimetype, content_length, options):
            parser = MultiPartParser(
                self.stream_factory,
                self.charset,
                self.errors,
                max_form_memory_size=self.max_form_memory_size,
                cls=self.cls,
            )
            boundary = options.get("boundary")  # multipart/form-data requires a boundary parameter in the Content-Type options
            if boundary is None:
                raise ValueError("Missing boundary")
            if isinstance(boundary, text_type):
                boundary = boundary.encode("ascii")
            form, files = parser.parse(stream, boundary, content_length)  # delegate to MultiPartParser.parse
            return stream, form, files

        # Class attribute in Werkzeug: maps a mimetype to the method that parses it.
        # (_parse_urlencoded is not shown in this excerpt.)
        parse_functions = {
            "multipart/form-data": _parse_multipart,
            "application/x-www-form-urlencoded": _parse_urlencoded,
            "application/x-url-encoded": _parse_urlencoded,
        }

        def parse(self, stream, mimetype, content_length, options=None):
            if (
                self.max_content_length is not None
                and content_length is not None
                and content_length > self.max_content_length
            ):
                raise exceptions.RequestEntityTooLarge()
            if options is None:
                options = {}
            # Pick the parser from parse_functions by Content-Type; the one of
            # interest here is the first entry, multipart/form-data.
            parse_func = self.get_parse_func(mimetype, options)
            if parse_func is not None:
                try:
                    # For multipart/form-data this calls _parse_multipart.
                    return parse_func(self, stream, mimetype, content_length, options)
                except ValueError:
                    if not self.silent:
                        raise
    
            return stream, self.cls(), self.cls()
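
The dispatch-and-fallback pattern of `FormDataParser.parse` can be reduced to a few lines. This is a sketch under simplifying assumptions: the parser functions, the `PARSE_FUNCTIONS` table and the return values are hypothetical stand-ins, and the multipart branch only checks for the boundary instead of parsing anything.

```python
from urllib.parse import parse_qsl


def parse_urlencoded(body, options):
    return dict(parse_qsl(body.decode("utf-8")))


def parse_multipart(body, options):
    if "boundary" not in options:
        raise ValueError("Missing boundary")
    # A real parser would split the body here; the sketch only echoes the boundary.
    return {"boundary": options["boundary"]}


# Mimetype -> parser function, looked up once per request.
PARSE_FUNCTIONS = {
    "multipart/form-data": parse_multipart,
    "application/x-www-form-urlencoded": parse_urlencoded,
}


def parse(body, mimetype, options=None, silent=True):
    parse_func = PARSE_FUNCTIONS.get(mimetype)
    if parse_func is not None:
        try:
            return parse_func(body, options or {})
        except ValueError:
            if not silent:
                raise
    return {}  # unknown mimetype, or a silenced parse error


print(parse(b"a=1&b=2", "application/x-www-form-urlencoded"))  # {'a': '1', 'b': '2'}
print(parse(b"ignored", "multipart/form-data"))                # {}
print(parse(b"{}", "application/json"))                        # {}
```

With `silent=True` a malformed request simply produces empty form data, which mirrors the final `return stream, self.cls(), self.cls()` in the original.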
     
    class BaseRequest(object):   
        def make_form_data_parser(self):
            return self.form_data_parser_class(
                self._get_file_stream,  # stream factory that creates the writable container for each uploaded file
                self.charset,
                self.encoding_errors,
                self.max_form_memory_size,
                self.max_content_length,
                self.parameter_storage_class,
            )
    
    
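    # Inside BaseRequest._load_form_data:
    # _get_stream_for_parsing() supplies the request body stream, normally a wrapper around environ["wsgi.input"]
    parser = self.make_form_data_parser()
    data = parser.parse(self._get_stream_for_parsing(), mimetype, content_length, options)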
    
