Tomcat7 Request Line Source Code Analysis

Author: 绝尘驹 | Published: 2017-09-17 22:40

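    This article tries to answer the following questions about Tomcat:

    • How many layers of buffer Tomcat has at the bottom, and how data is read up through them, layer by layer, to the application
    • How Tomcat parses the request line

    To follow the read path you first need to know Tomcat's NIO threading model; without that background this article is hard to follow.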

    First, a diagram of how Tomcat's buffers relate to each other:

    tomcat-buffer-1.jpg

    Tomcat's overall flow is roughly as follows:

    tomcat-flow.jpg

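    SocketBufferHandler

    SocketBufferHandler holds Tomcat's buffers at the NIO level and is the lowest of Tomcat's buffer layers. The buffer sizes and whether to use direct buffers are configurable on the Connector, so one tuning option is to set direct to true.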

    public SocketBufferHandler(int readBufferSize, int writeBufferSize,
                boolean direct) {
            this.direct = direct;
            if (direct) {
                readBuffer = ByteBuffer.allocateDirect(readBufferSize);
                writeBuffer = ByteBuffer.allocateDirect(writeBufferSize);
            } else {
                readBuffer = ByteBuffer.allocate(readBufferSize);
                writeBuffer = ByteBuffer.allocate(writeBufferSize);
            }
        }
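
    The branch in the constructor above can be tried out in isolation. The following is a minimal sketch, not Tomcat code: the class name ReadBufferAllocationSketch and its allocate helper are made up for illustration, and only the standard java.nio.ByteBuffer calls are real.

```java
import java.nio.ByteBuffer;

// Hypothetical helper that mirrors the direct/heap branch of the constructor above.
final class ReadBufferAllocationSketch {

    // Returns an off-heap buffer when direct is true, otherwise a heap buffer.
    static ByteBuffer allocate(int size, boolean direct) {
        return direct ? ByteBuffer.allocateDirect(size) : ByteBuffer.allocate(size);
    }

    public static void main(String[] args) {
        ByteBuffer heap = allocate(8192, false);
        ByteBuffer offHeap = allocate(8192, true);
        // A heap buffer is backed by a byte[]; a direct buffer lives outside the Java heap.
        System.out.println("heap: direct=" + heap.isDirect() + ", hasArray=" + heap.hasArray());
        System.out.println("offHeap: direct=" + offHeap.isDirect() + ", capacity=" + offHeap.capacity());
    }
}
```

    Both buffers have the same capacity; what differs is where the memory lives, which is why the direct variant needs the explicit cleanup discussed next.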
    

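    readBufferSize defaults to 8192 bytes (8 KB). A POST body larger than this does not fit in the buffer in one go and has to be read in several passes.

    When off-heap memory (DirectByteBuffer) is configured, Tomcat cleans it up actively: it looks up DirectByteBuffer's cleaner method by reflection, invokes it by reflection to obtain the on-heap Cleaner object that references the memory, and runs that Cleaner when the buffer is freed.

    From this it appears that Tomcat's low-level socket buffer is released once it is no longer needed rather than reused. Netty deserves credit here: it has two layers of optimization, a memory pool and a buffer object pool.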

    Http11Processor

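    After Tomcat receives an HTTP request from a client, reading the request message is handled by the service method of Http11Processor.

    Http11Processor is created by ConnectionHandler. Tomcat reuses its key objects to avoid the cost of frequent creation and destruction, so a processor is first popped from recycledProcessors and only created when none is available: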

    if (processor == null) {
        processor = recycledProcessors.pop();
        if (getLog().isDebugEnabled()) {
            getLog().debug(sm.getString("abstractConnectionHandler.processorPop",
                    processor));
        }
    }
    if (processor == null) {
        processor = getProtocol().createProcessor();
        register(processor);
    }
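
    The pop-or-create pattern above can be sketched on its own. This is an illustrative stand-in, not Tomcat's implementation: RecyclingPoolSketch, acquire, release and createdCount are hypothetical names, and a plain ArrayDeque guarded by synchronized replaces whatever stack Tomcat uses internally.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical pool: reuse a recycled instance when one exists, otherwise create one.
final class RecyclingPoolSketch<T> {
    private final Deque<T> recycled = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created;

    RecyclingPoolSketch(Supplier<T> factory) {
        this.factory = factory;
    }

    synchronized T acquire() {
        T item = recycled.pollFirst();   // null when nothing has been recycled yet
        if (item == null) {
            item = factory.get();        // corresponds to createProcessor() in the excerpt
            created++;
        }
        return item;
    }

    // Hands an instance back so that a later acquire() can reuse it.
    synchronized void release(T item) {
        recycled.addFirst(item);
    }

    synchronized int createdCount() {
        return created;
    }
}
```

    Acquiring, releasing and acquiring again returns the same instance, so only one object is ever created for a strictly sequential workload.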
    

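    Creating Http11Processor

    Creating an Http11Processor requires the sizes that bound Tomcat's read buffer, such as the maximum request header size and the limits that apply to the request body.

    maxHttpHeaderSize defaults to 8 KB.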

    public Http11Processor(int maxHttpHeaderSize, AbstractEndpoint<?> endpoint,int maxTrailerSize,
                Set<String> allowedTrailerHeaders, int maxExtensionSize, int maxSwallowSize,
                Map<String,UpgradeProtocol> httpUpgradeProtocols, boolean sendReasonPhrase) {
    
            super(endpoint);
            userDataHelper = new UserDataHelper(log);
    
           inputBuffer = new Http11InputBuffer(request, maxHttpHeaderSize);
            request.setInputBuffer(inputBuffer);
    
            outputBuffer = new Http11OutputBuffer(response, maxHttpHeaderSize, sendReasonPhrase);
            response.setOutputBuffer(outputBuffer);
    
            // Create and add the identity filters.
            // Tomcat reads the body through filters; IdentityInputFilter reads a non-chunked body
            inputBuffer.addFilter(new IdentityInputFilter(maxSwallowSize));
            outputBuffer.addFilter(new IdentityOutputFilter());
    
            // Create and add the chunked filters.
            inputBuffer.addFilter(new ChunkedInputFilter(maxTrailerSize, allowedTrailerHeaders,
                    maxExtensionSize, maxSwallowSize));
            outputBuffer.addFilter(new ChunkedOutputFilter());
    
            // Create and add the void filters.
            inputBuffer.addFilter(new VoidInputFilter());
            outputBuffer.addFilter(new VoidOutputFilter());
    
            // Create and add buffered input filter
            inputBuffer.addFilter(new BufferedInputFilter());
    
            // Create and add the chunked filters.
    
            //inputBuffer.addFilter(new GzipInputFilter());
            outputBuffer.addFilter(new GzipOutputFilter());
    
            pluggableFilterIndex = inputBuffer.getFilters().length;
    
            this.httpUpgradeProtocols = httpUpgradeProtocols;
        }
    

    Creating Http11InputBuffer

    public Http11InputBuffer(Request request, int headerBufferSize) {
    
            this.request = request;
            headers = request.getMimeHeaders();
    
            this.headerBufferSize = headerBufferSize;
    
            filterLibrary = new InputFilter[0];
            activeFilters = new InputFilter[0];
            lastActiveFilter = -1;
    
            parsingHeader = true;
            parsingRequestLine = true;
            parsingRequestLinePhase = 0;
            parsingRequestLineEol = false;
            parsingRequestLineStart = 0;
            parsingRequestLineQPos = -1;
            headerParsePos = HeaderParsePosition.HEADER_START;
            swallowInput = true;
    
            inputStreamInputBuffer = new SocketInputBuffer();
        }
      
    

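    When Http11InputBuffer is created, headerBufferSize is fixed. There is also inputStreamInputBuffer, which is used when reading the HTTP body and is covered in the analysis of body parsing.

    Once an Http11Processor is available, its core method, service, is called. The method is long, so we look only at a few key points.

    1. Initialize the read and write buffers

     // initialize the read buffer; inputBuffer is the Http11InputBuffer
     inputBuffer.init(socketWrapper);
     // initialize the write buffer
     outputBuffer.init(socketWrapper);
    

    The code of Http11InputBuffer.init is as follows: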

    void init(SocketWrapperBase<?> socketWrapper) {
        wrapper = socketWrapper;
        wrapper.setAppReadBufHandler(this);
        int bufLength = headerBufferSize +
                wrapper.getSocketBufferHandler().getReadBuffer().capacity();
        if (byteBuffer == null || byteBuffer.capacity() < bufLength) {
            byteBuffer = ByteBuffer.allocate(bufLength);
            byteBuffer.position(0).limit(0);
        }
    }
    

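    init creates the read buffer byteBuffer inside Http11InputBuffer. This same byteBuffer is used later when reading the request line, the headers and the body; it is one of Tomcat's core buffers.

    Initial size of byteBuffer: headerBufferSize + the size of the socket read buffer

    headerBufferSize defaults to 8 * 1024
    the socket buffer size defaults to 8192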
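
    With the two defaults the buffer comes out at 16384 bytes. The sizing rule from init can be checked with a small stand-alone sketch; InputBufferSizingSketch and ensureCapacity are hypothetical names, and the logic simply restates the excerpt above.

```java
import java.nio.ByteBuffer;

// Hypothetical restatement of the sizing logic in Http11InputBuffer.init.
final class InputBufferSizingSketch {

    // Reuses 'current' when it is already large enough, otherwise allocates a new, empty buffer.
    static ByteBuffer ensureCapacity(ByteBuffer current, int headerBufferSize,
            int socketReadBufferCapacity) {
        int bufLength = headerBufferSize + socketReadBufferCapacity;
        if (current == null || current.capacity() < bufLength) {
            current = ByteBuffer.allocate(bufLength);
            // position == limit == 0: nothing to read yet, so the first parse step must fill
            current.position(0).limit(0);
        }
        return current;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ensureCapacity(null, 8 * 1024, 8192);
        System.out.println("capacity=" + buffer.capacity() + ", readable=" + buffer.remaining());
    }
}
```

    Setting both position and limit to 0 is what makes the buffer look empty to the parser even though its capacity is 16 KB.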

    With buffer initialization covered, the next step is parsing the HTTP message. An HTTP request consists of three parts:
    request line + request headers + request body

    http-protocol.jpg

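    The first thing parsed is the request line:

    // parse the request line; inputBuffer is the Http11InputBuffer mentioned above
    if (!inputBuffer.parseRequestLine(keptAlive)) {
        // a complete request line has not been read yet;
        // a phase of -1 is handled as a protocol upgrade
        if (inputBuffer.getParsingRequestLinePhase() == -1) {
            return SocketState.UPGRADING;
        } else if (handleIncompleteRequestLineRead()) {
            // the request line is incomplete, so wait for more data,
            // i.e. the read event has to be registered again
            break;
        }
    }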
    

    Next, inputBuffer.parseRequestLine. The method keeps a read marker, parsingRequestLinePhase, whose value tells which part of the request line is currently being read.
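
    The reason for keeping a phase is that a non-blocking read may deliver only part of the request line; the parser then returns false and has to pick up where it stopped on the next read event. The following is a minimal sketch of that idea, restricted to the method token. ResumableMethodParserSketch is a hypothetical class, and unlike Tomcat, which records offsets into one shared byteBuffer, it copies bytes into a StringBuilder to stay short.

```java
import java.nio.ByteBuffer;

// Hypothetical resumable parser: state survives between calls, so input may arrive in pieces.
final class ResumableMethodParserSketch {
    private int phase = 2;                              // 2 = reading the method, 3 = method complete
    private final StringBuilder method = new StringBuilder();

    // Returns true once the method is complete, false when more bytes are needed.
    boolean parse(ByteBuffer in) {
        while (phase == 2) {
            if (!in.hasRemaining()) {
                return false;                           // caller waits for more data, then calls parse again
            }
            byte chr = in.get();
            if (chr == ' ' || chr == '\t') {
                phase = 3;                              // separator seen, the method token is finished
            } else {
                method.append((char) chr);
            }
        }
        return true;
    }

    String method() {
        return method.toString();
    }

    int phase() {
        return phase;
    }
}
```

    Feeding the bytes of "GE" and then "T /index.html" in two calls yields false on the first call and true on the second, with the method reported as GET.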

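    parsingRequestLinePhase = 0

    The initial value. byteBuffer is empty at this point (position == limit == 0), which triggers the first read; the read itself is analyzed later.

    The first job is to determine parsingRequestLineStart, because the data may begin with CR or LF characters; if there are none, the first byte is the start of this request. parsingRequestLinePhase is then set to 2.

    parsingRequestLinePhase = 2

    In phase 2 the method is read, up to the first space. The request's method is set and parsingRequestLinePhase becomes 3.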

    if (parsingRequestLinePhase == 2) {
        boolean space = false;
        while (!space) {
            // Read new bytes if needed
            if (byteBuffer.position() >= byteBuffer.limit()) {
                if (!fill(false)) // request line parsing
                    return false;
            }
            // Spec says method name is a token followed by a single SP but
            // also be tolerant of multiple SP and/or HT.
            int pos = byteBuffer.position();
            byte chr = byteBuffer.get();
            if (chr == Constants.SP || chr == Constants.HT) {
                space = true;
                // a space was read, so the method length is known:
                // pos - parsingRequestLineStart; the request's method can be set now
                request.method().setBytes(byteBuffer.array(), parsingRequestLineStart,
                        pos - parsingRequestLineStart);
            } else if (!HttpParser.isToken(chr)) {
                byteBuffer.position(byteBuffer.position() - 1);
                throw new IllegalArgumentException(sm.getString("iib.invalidmethod"));
            }
        }
        parsingRequestLinePhase = 3;
    }
    
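    parsingRequestLinePhase = 3

    Phase 3 finds the offset where the request URL starts: it reads forward until the first character that is not a space, then sets parsingRequestLinePhase to 4.

    // skip the spaces so that parsingRequestLineStart ends up at the first byte of the URL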
            if (parsingRequestLinePhase == 3) {
                // Spec says single SP but also be tolerant of multiple SP and/or HT
                boolean space = true;
                while (space) {
                    // Read new bytes if needed
                    if (byteBuffer.position() >= byteBuffer.limit()) {
                        if (!fill(false)) // request line parsing
                            return false;
                    }
                    byte chr = byteBuffer.get();
                    if (!(chr == Constants.SP || chr == Constants.HT)) {
                        space = false;
                        byteBuffer.position(byteBuffer.position() - 1);
                    }
                }
                parsingRequestLineStart = byteBuffer.position();
                parsingRequestLinePhase = 4;
            }
    
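    parsingRequestLinePhase = 4

    This phase works out two things. A GET request URL may carry parameters, so the parser needs the offset and length of the URL, the offset (parsingRequestLineQPos) and length of the query string, and the offset end where the URL finishes. The start offset of the URL was already computed in phase 3.

    When the URL contains a ?, the URL length is
    url length = parsingRequestLineQPos - parsingRequestLineStart
    and the query string length is
    queryStr length = end - parsingRequestLineQPos - 1
    Without a ?, the URL length is simply end - parsingRequestLineStart.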

    // there is a query string
    if (parsingRequestLineQPos >= 0) {
        request.queryString().setBytes(byteBuffer.array(), parsingRequestLineQPos + 1,
                end - parsingRequestLineQPos - 1);
        request.requestURI().setBytes(byteBuffer.array(), parsingRequestLineStart,
                parsingRequestLineQPos - parsingRequestLineStart);
    } else {
        request.requestURI().setBytes(byteBuffer.array(), parsingRequestLineStart,
                end - parsingRequestLineStart);
    }
    parsingRequestLinePhase = 5;
    

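    parsingRequestLinePhase = 5

    Like phase 3, phase 5 skips the spaces that follow, in case there is more than one. This gives parsingRequestLineStart, the offset of the last part of the request line, the protocol version; parsingRequestLinePhase is then set to 6.

    parsingRequestLinePhase = 6

    Reading starts at parsingRequestLineStart. A CR marks end, and an LF means the request line is finished, at which point the parsingRequestLineEol flag is set.

    The length of the protocol is:
    end - parsingRequestLineStart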
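
    The offset arithmetic of phases 2 to 6 can be replayed on a complete request line. The sketch below is a simplification under stated assumptions: the whole line is already in memory, it is well formed, and no token validation is done. RequestLineOffsetsSketch and its fields are hypothetical names; only the start, qPos and end bookkeeping follows the description above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical one-shot splitter that uses the same offsets as phases 2 to 6.
final class RequestLineOffsetsSketch {
    final String method;
    final String uri;
    final String query;      // null when the URL has no '?'
    final String protocol;

    private RequestLineOffsetsSketch(String method, String uri, String query, String protocol) {
        this.method = method;
        this.uri = uri;
        this.query = query;
        this.protocol = protocol;
    }

    static RequestLineOffsetsSketch parse(byte[] buf) {
        int pos = 0;
        int start = pos;
        while (pos < buf.length && !isBlank(buf[pos])) pos++;   // phase 2: method up to the first blank
        String method = text(buf, start, pos - start);
        while (pos < buf.length && isBlank(buf[pos])) pos++;    // phase 3: skip blanks
        start = pos;
        int qPos = -1;
        while (pos < buf.length && !isBlank(buf[pos])) {        // phase 4: URL, remember the first '?'
            if (buf[pos] == '?' && qPos == -1) qPos = pos;
            pos++;
        }
        int end = pos;
        String uri = qPos >= 0 ? text(buf, start, qPos - start) : text(buf, start, end - start);
        String query = qPos >= 0 ? text(buf, qPos + 1, end - qPos - 1) : null;
        while (pos < buf.length && isBlank(buf[pos])) pos++;    // phase 5: skip blanks
        start = pos;
        while (pos < buf.length && buf[pos] != '\r' && buf[pos] != '\n') pos++;  // phase 6: up to CR or LF
        String protocol = text(buf, start, pos - start);
        return new RequestLineOffsetsSketch(method, uri, query, protocol);
    }

    private static boolean isBlank(byte b) {
        return b == ' ' || b == '\t';
    }

    private static String text(byte[] buf, int offset, int length) {
        return new String(buf, offset, length, StandardCharsets.ISO_8859_1);
    }
}
```

    For the line "GET /index.html?a=1&b=2 HTTP/1.1" followed by CRLF, the URL starts at offset 4, the ? sits at offset 15 and the URL ends at offset 23, which gives a URL length of 11 and a query string length of 7.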

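    That completes Tomcat's parsing of the request line. Everything above assumed that byteBuffer already holds data. How the data gets into byteBuffer, and how many copies it goes through on the way, is still open, so let's look at that next.

    Every phase above checks whether byteBuffer still has data to read:

    /*
     * position == limit means everything has been consumed and fill must
     * be called to refill the buffer. The argument false makes it a
     * non-blocking read. Blocking reads happen when we read the body
     * through getInputStream().
     */
    if (byteBuffer.position() >= byteBuffer.limit()) {
        if (!fill(false)) // request line parsing
            return false;
    }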
    

    fill calls the read method of NioSocketWrapper:

    @Override
    public int read(boolean block, ByteBuffer to) throws IOException {
    
                // first read from Tomcat's own socket buffer; if it still holds unread data,
                // there is no need to go down to the OS read buffer
                int nRead = populateReadBuffer(to);
                if (nRead > 0) {
                    return nRead;
                    /*
                     * Since more bytes may have arrived since the buffer was last
                     * filled, it is an option at this point to perform a
                     * non-blocking read. However correctly handling the case if
                     * that read returns end of stream adds complexity. Therefore,
                     * at the moment, the preference is for simplicity.
                     */
                }
                // reaching this point, the read buffer of Tomcat's socketBufferHandler has been fully consumed
                // The socket read buffer capacity is socket.appReadBufSize
                // so the next data has to come from the OS buffer
                int limit = socketBufferHandler.getReadBuffer().capacity();
                /*
                 * On the first read, to is empty, so to.remaining() is socket buffer capacity + header size.
                 * to.remaining() >= limit is therefore true and the data goes straight from the OS into byteBuffer.
                 */
                if (to.remaining() >= limit) {
                    // the writable space in to is at least the capacity of the socketBufferHandler read buffer;
                    // cap this read at limit, i.e. the size of the socket buffer
                    to.limit(to.position() + limit);
                    // read directly from the OS buffer into the application buffer to,
                    // which avoids one copy
                    nRead = fillReadBuffer(block, to);
                    updateLastRead();
                } else {
                    // Fill the read buffer as best we can.
                    // read into the read buffer of Tomcat's socketBufferHandler first
                    nRead = fillReadBuffer(block);
                    updateLastRead();
                    // Fill as much of the remaining byte array as possible with the
                    // data that was just read
                    if (nRead > 0) {
                        nRead = populateReadBuffer(to);
                    }
                }
                return nRead;
            }
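
    The decision made in read, direct into the application buffer when it has room and through the socket buffer otherwise, can be simulated without a real socket. The sketch below is an illustration under assumptions, not Tomcat code: TwoLevelReadSketch is a hypothetical class, a channel over a byte array stands in for the OS socket, and the staging buffer is kept in read mode between calls for simplicity.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical simulation of the two read paths in NioSocketWrapper.read.
final class TwoLevelReadSketch {
    private final ReadableByteChannel channel;   // stands in for the OS socket
    private final ByteBuffer staging;            // stands in for the socket read buffer
    private int channelReads;                    // how often the "OS" was asked for data

    TwoLevelReadSketch(byte[] incoming, int stagingCapacity) {
        channel = Channels.newChannel(new ByteArrayInputStream(incoming));
        staging = ByteBuffer.allocate(stagingCapacity);
        staging.limit(0);                        // read mode, nothing buffered yet
    }

    // Appends bytes to 'to' (in write mode); returns the count, or -1 at end of stream.
    int read(ByteBuffer to) {
        if (!to.hasRemaining()) return 0;        // guard added for the sketch
        int n = drainStaging(to);                // leftovers from an earlier staged read come first
        if (n > 0) return n;
        try {
            if (to.remaining() >= staging.capacity()) {
                int savedLimit = to.limit();
                to.limit(to.position() + staging.capacity());   // cap the read at the staging size
                n = channel.read(to);                           // direct path, no intermediate copy
                to.limit(savedLimit);
            } else {
                staging.clear();
                n = channel.read(staging);                      // staged path
                staging.flip();
                if (n > 0) n = drainStaging(to);
            }
            channelReads++;
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private int drainStaging(ByteBuffer to) {
        int n = Math.min(staging.remaining(), to.remaining());
        if (n > 0) {
            ByteBuffer chunk = staging.duplicate();
            chunk.limit(chunk.position() + n);
            to.put(chunk);
            staging.position(staging.position() + n);
        }
        return n;
    }

    int channelReads() {
        return channelReads;
    }
}
```

    With 20 incoming bytes, a staging capacity of 8 and a 12-byte target, the first call takes the direct path and returns 8, and the second takes the staged path and returns the 4 bytes that still fit, leaving 4 bytes in the staging buffer for a later call.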
    

    populateReadBuffer copies the content of Tomcat's low-level socket buffer, i.e. the NIO buffer, into the byteBuffer used by Http11Processor.

    protected int populateReadBuffer(ByteBuffer to) {
            // Is there enough data in the read buffer to satisfy this request?
            // Copy what data there is in the read buffer to the byte array
            // the read buffer has just been written to, so prepare it for reading with a flip:
            // limit = position, position = 0
            socketBufferHandler.configureReadBufferForRead();
            // copy from the socketBufferHandler buffer into to
            int nRead = transfer(socketBufferHandler.getReadBuffer(), to);
            if (log.isDebugEnabled()) {
                log.debug("Socket: [" + this + "], Read from buffer: [" + nRead + "]");
            }
            return nRead;
        }
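
    The flip-then-copy step can be shown with two plain buffers. This is a sketch of the idea only: FlipThenTransferSketch is a hypothetical class, and its transfer method is written from the description above (copy as much as both sides allow), not taken from Tomcat's source.

```java
import java.nio.ByteBuffer;

// Hypothetical demonstration of flipping a freshly written buffer and copying it onward.
final class FlipThenTransferSketch {

    // Copies as many bytes as both buffers allow and returns that count.
    static int transfer(ByteBuffer from, ByteBuffer to) {
        int max = Math.min(from.remaining(), to.remaining());
        if (max > 0) {
            int fromLimit = from.limit();
            from.limit(from.position() + max);   // expose only what fits into 'to'
            to.put(from);
            from.limit(fromLimit);               // restore so the rest stays readable
        }
        return max;
    }

    public static void main(String[] args) {
        ByteBuffer socketBuffer = ByteBuffer.allocate(8);
        socketBuffer.put(new byte[] {1, 2, 3, 4, 5});   // write mode: position = 5, limit = 8
        socketBuffer.flip();                            // read mode: position = 0, limit = 5
        ByteBuffer appBuffer = ByteBuffer.allocate(3);
        int copied = transfer(socketBuffer, appBuffer);
        System.out.println("copied=" + copied + ", left=" + socketBuffer.remaining());
    }
}
```

    Without the flip, position would still point past the written bytes and the copy would read unwritten space instead of the data that was just received.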
    
    

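    As shown above, to is the byteBuffer used by Http11Processor, and its capacity is header size + socket buffer size. On the first read to is empty, so to.remaining() equals socket buffer capacity + header size - 0 and to.remaining() >= limit is true: the data is read straight from the OS into byteBuffer. Only when the space left in byteBuffer drops below the socket buffer size is data read into the socket buffer first. At that point the socket buffer is larger than the space left in byteBuffer, so a single read can pull more from the OS, and the data is then copied from the socket buffer into byteBuffer. The point of this is presumably to save a read system call.

    To summarize:

    If request line + headers + body is under 8 KB, one system I/O read is enough; everything after that is served from byteBuffer.

    If request line + headers + body is between 8 KB and 16 KB, two system I/O reads are needed, which at most fills the byteBuffer of Http11Processor. The first 8 KB is read while Tomcat parses the request; the body part is read when the application calls getParameter or getInputStream.

    If request line + headers + body is over 16 KB, at least three system I/O reads are needed. The first two go into byteBuffer; the rest goes through the socket buffer, which the later analysis of body parsing can confirm.

    That wraps up parsing of the HTTP request line. Parsing of the headers and the body needs a separate article; this one is already long enough that nobody would read more.

    Note: if anything in this analysis is wrong, please point it out. Discussion is welcome.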


        Article link: https://www.haomeiwen.com/subject/suimsxtx.html