美文网首页
大师兄的Python源码学习笔记(四十七): Python的内存

大师兄的Python源码学习笔记(四十七): Python的内存

作者: superkmi | 来源:发表于2021-12-24 20:17 被阅读0次

    大师兄的Python源码学习笔记(四十六): Python的内存管理机制(一)
    大师兄的Python源码学习笔记(四十八): Python的内存管理机制(三)

    二、小块空间的内存池

    • 在Python中,经常需要申请小块内存,这些小块内存在申请后,很快会被释放。
    • 由于这些小块内存的申请并不是为了创建对象,所以没有对象一级的内存池机制,这意味着需要执行大量的mallocfree操作。
    • 为了提高效率,Python引入了一个内存池机制,用于管理对小块内存的申请和释放,也就是Pymalloc机制,通过PyObject_MallocPyObject_ReallocPyObject_Free三个接口显示给Python。
    • 这个小块内存的内存池可以视为一个层次结构,从下至上分别是blockpoolarena内存池
    • 其中blockpoolarena在代码中可以找到实体,而内存池只是一个概念,表示Python对整个小块内存分配和释放行为的内存管理机制。
    1. block
    • 在最底层,block是一个确定大小的内存块。
    • block有很多种,不同种类的block有不同的内存大小,这个值为size class
    • 为了在当前32位和64位平台获得最佳性能,所有blocksize class都是8字节对齐的。
    Objects\obmalloc.c
    
     * Allocation strategy abstract:
     *
     * For small requests, the allocator sub-allocates <Big> blocks of memory.
     * Requests greater than SMALL_REQUEST_THRESHOLD bytes are routed to the
     * system's allocator.
     *
     * Small requests are grouped in size classes spaced 8 bytes apart, due
     * to the required valid alignment of the returned address. Requests of
     * a particular size are serviced from memory pools of 4K (one VMM page).
     * Pools are fragmented on demand and contain free lists of blocks of one
     * particular size class. In other words, there is a fixed-size allocator
     * for each size class. Free pools are shared by the different allocators
     * thus minimizing the space reserved for a particular size class.
     *
     * This allocation strategy is a variant of what is known as "simple
     * segregated storage based on array of free lists". The main drawback of
     * simple segregated storage is that we might end up with lot of reserved
     * memory for the different free lists, which degenerate in time. To avoid
     * this, we partition each free list in pools and we share dynamically the
     * reserved space between all free lists. This technique is quite efficient
     * for memory intensive programs which allocate mainly small-sized blocks.
     *
     * For small requests we have the following table:
     *
     * Request in bytes     Size of allocated block      Size class idx
     * ----------------------------------------------------------------
     *        1-8                     8                       0
     *        9-16                   16                       1
     *       17-24                   24                       2
     *       25-32                   32                       3
     *       33-40                   40                       4
     *       41-48                   48                       5
     *       49-56                   56                       6
     *       57-64                   64                       7
     *       65-72                   72                       8
     *        ...                   ...                     ...
     *      497-504                 504                      62
     *      505-512                 512                      63
     *
     *      0, SMALL_REQUEST_THRESHOLD + 1 and up: routed to the underlying
     *      allocator.
     */
    
    /*==========================================================================*/
    
    /*
     * -- Main tunable settings section --
     */
    
    /*
     * Alignment of addresses returned to the user. 8-bytes alignment works
     * on most current architectures (with 32-bit or 64-bit address busses).
     * The alignment value is also used for grouping small requests in size
     * classes spaced ALIGNMENT bytes apart.
     *
     * You shouldn't change this unless you know what you are doing.
     */
    #define ALIGNMENT               8               /* must be 2^N */
    #define ALIGNMENT_SHIFT         3
    
    /* Return the number of bytes in size class I, as a uint. */
    #define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
    
    • 此外,block的上限被设定为512,当申请内存小于这个上限时,交给不同种类的block处理;当申请的内存超过上限,则交给第一层的内存管理机制,即PyMem函数来处理。
    Objects\obmalloc.c
    
    #define SMALL_REQUEST_THRESHOLD 512
    #define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD / ALIGNMENT)
    
    • 根据SMALL_REQUEST_THRESHOLDALIGNMENT的限定,可以看到不同种类blocksize class对应一个Size class idx,下面两个公式用来转换size classsize class idx
    Objects\obmalloc.c
    
    #define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
    
    size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
    
    • 但是在Python中,block只是一个概念,源码中并没有与之对应的实体存在,我们知道他是具有一定大小的内存,但不与Python中的某个实体对应。
    • 然后Python提供了一个用来管理block的实体,就是pool
    2. pool
    • 一组block的集合称为一个pool,一个pool管理着一堆固定大小的内存块。
    • 事实上,pool管理着一大块内存,并通过策略将这块大内存划分为多个小内存块。
    • 一个pool的大小通常为一个系统内存页,由于当前大多数Python支持的系统内存页都是4KB,所以pool的大小通常定义为4KB。
    Objects\obmalloc.c
    
    /*
     * The system's VMM page size can be obtained on most unices with a
     * getpagesize() call or deduced from various header files. To make
     * things simpler, we assume that it is 4K, which is OK for most systems.
     * It is probably better if this is the native page size, but it doesn't
     * have to be.  In theory, if SYSTEM_PAGE_SIZE is larger than the native page
     * size, then `POOL_ADDR(p)->arenaindex' could rarely cause a segmentation
     * violation fault.  4K is apparently OK for all the platforms that python
     * currently targets.
     */
    #define SYSTEM_PAGE_SIZE        (4 * 1024)
    #define SYSTEM_PAGE_SIZE_MASK   (SYSTEM_PAGE_SIZE - 1)
    ... ...
    /*
     * Size of the pools used for small blocks. Should be a power of 2,
     * between 1K and SYSTEM_PAGE_SIZE, that is: 1k, 2k, 4k.
     */
    #define POOL_SIZE               SYSTEM_PAGE_SIZE        /* must be 2^N */
    #define POOL_SIZE_MASK          SYSTEM_PAGE_SIZE_MASK
    
    • pool在Python对应的结构体是pool_header:
    Objects\obmalloc.c
    
    /* Pool for small blocks. */
    struct pool_header {
        union { block *_padding;
                uint count; } ref;          /* number of allocated blocks    */
        block *freeblock;                   /* pool's free list head         */
        struct pool_header *nextpool;       /* next pool of this size class  */
        struct pool_header *prevpool;       /* previous pool       ""        */
        uint arenaindex;                    /* index into arenas of base adr */
        uint szidx;                         /* block size class index        */
        uint nextoffset;                    /* bytes to virgin block         */
        uint maxnextoffset;                 /* largest valid nextoffset      */
    };
    
    • 从名字可以看出,pool_header是一个pool的头部。
    • 一个pool管理的所有block大小都是一样的,也就是他们的size class idx是一样的,由pool_header中的szidx定义。
    • 假如有一块4KB的内存,看看Python如何将其改造成pool
    Objects\obmalloc.c
    static int
    pymalloc_alloc(void *ctx, void **ptr_p, size_t nbytes)
    {
        block *bp;
        poolp pool;
        poolp next;
        uint size;
     ... ...
        init_pool:
            /* Frontlink to used pools. */
            next = usedpools[size + size]; /* == prev */
            pool->nextpool = next;
            pool->prevpool = next;
            next->nextpool = pool;
            next->prevpool = pool;
            pool->ref.count = 1;
            if (pool->szidx == size) {
                /* Luckily, this pool last contained blocks
                 * of the same size class, so its header
                 * and free list are already initialized.
                 */
                bp = pool->freeblock;
                assert(bp != NULL);
                pool->freeblock = *(block **)bp;
                goto success;
            }
            /*
             * Initialize the pool header, set up the free list to
             * contain just the second block, and return the first
             * block.
             */
            pool->szidx = size;
            size = INDEX2SIZE(size);
            bp = (block *)pool + POOL_OVERHEAD;
            pool->nextoffset = POOL_OVERHEAD + (size << 1);
            pool->maxnextoffset = POOL_SIZE - size;
            pool->freeblock = bp + size;
            *(block **)(pool->freeblock) = NULL;
            goto success;
        }
    ... ...
    success:
        UNLOCK();
        assert(bp != NULL);
        *ptr_p = (void *)bp;
        return 1;
    }
    
    • 最后传出的*ptr_p(bp)也就是从pool中取出的第一块block的指针。
    • 由于第一块block已经分配了,也就是已经分配了1块block,所以pool->ref.count = 1
    • 此外,*ptr_p(bp)指向的地址后有将近4KB内存都是可用的,但申请内存的函数只会使用[bp,bp+size]这个区间的内存,这是由size class idx保证的,所以经过改造后的4KB内存如下:
    • 图中的实线箭头是指针,虚线箭头是偏移位置,在nextoffsetmaxnextoffset中存储的是相对于pool头部的偏移位置
    • 现在假设申请5块28字节的内存,由于28个字节对应的size class idx为3,所以实际会申请5块32字节的内存:
    Objects\obmalloc.c
    
    static int
    pymalloc_alloc(void *ctx, void **ptr_p, size_t nbytes)
    {
        block *bp;
        poolp pool;
        poolp next;
        uint size;
    
    ... ...
        if (pool != pool->nextpool) {
            /*
             * There is a used pool for this size class.
             * Pick up the head block of its free list.
             */
            ++pool->ref.count;
            bp = pool->freeblock;
            assert(bp != NULL);
            if ((pool->freeblock = *(block **)bp) != NULL) {
                goto success;
            }
    
            /*
             * Reached the end of the free list, try to extend it.
             */
            if (pool->nextoffset <= pool->maxnextoffset) {
                /* There is room for another block. */
                pool->freeblock = (block*)pool +
                                  pool->nextoffset;
                pool->nextoffset += INDEX2SIZE(size);
                *(block **)(pool->freeblock) = NULL;
                goto success;
            }
    
            ... ...
        }
    }
    
    • 可以看出freeblock是下一个可用block的起始地址,当再次申请32字节的block时,只需要返回freeblock只想的地址就可以。
    • nextoffsetmaxoffset是两个用于对pool中的block集合进行迭代的变量,它所指的偏移位置正好指向了freeblock之后的下一个可用的block地址,在分配block之后,freeblocknextoffset都会向前移动一个block的距离,如此反复进行遍历。
    • maxoffset指向pool中的最后一个可用block,它界定了pool的边界。
    • 除了freeblock之外,Python还设置了*freeblock
    Objects\obmalloc.c
    
    static int
    pymalloc_free(void *ctx, void *p)
    {
        poolp pool;
        block *lastfree;
        poolp next, prev;
        uint size;
    
    ... ...
    
        /* Link p to the start of the pool's freeblock list.  Since
         * the pool had at least the p block outstanding, the pool
         * wasn't empty (so it's already in a usedpools[] list, or
         * was full and is in no list -- it's not in the freeblocks
         * list in any case).
         */
        assert(pool->ref.count > 0);            /* else it was empty */
        *(block **)p = lastfree = pool->freeblock;
        pool->freeblock = (block *)p;
    }
    
    • 当一个block被释放时,freeblock虽然指向了一个有效的pool内地址,但*freeblock为NULL。
    • 假设Python释放的是blockA,在代码清单1之后,A中的第一个字节的值被设置成为当前的freeblock值,而在代码清单2之后,freeblock的值被更新了,指向block A的首地址。
    • 这样,一个block被插入到了离散自由的block链表中。
    • 当第2块和第4块block都被释放后,可以看到一个初具规模的离散自由block链表:
    • 对于这个离散自由block链表,可以从freeblock开始很容易的以freeblock = *freeblock的方式遍历这条链表,当*freeblock为NULL时,表明到达了链表尾部。
    Objects\obmalloc.c
    
    static int
    pymalloc_alloc(void *ctx, void **ptr_p, size_t nbytes)
    {
        block *bp;
        poolp pool;
        poolp next;
        uint size;
    
    ... ...
        if (pool != pool->nextpool) {
            /*
             * There is a used pool for this size class.
             * Pick up the head block of its free list.
             */
            ++pool->ref.count;
            bp = pool->freeblock;
            assert(bp != NULL);
            if ((pool->freeblock = *(block **)bp) != NULL) {
                goto success;
            }
    ... ...
    }
    
    • 在前面出现过的这段代码中,可以看到判断pool->freeblock = *(block **)bp,如果为真,则表示离散链表已经不存在了。
    • 如果可能则继续分配poolnextoffset指定的下一块block,否则就提供另一个pool
    • 这意味着也存在一个pool的集合

    相关文章

      网友评论

          本文标题:大师兄的Python源码学习笔记(四十七): Python的内存

          本文链接:https://www.haomeiwen.com/subject/ifbjfrtx.html