美文网首页
大师兄的Python源码学习笔记(四十七): Python的内存

大师兄的Python源码学习笔记(四十七): Python的内存

作者: superkmi | 来源:发表于2021-12-24 20:17 被阅读0次

大师兄的Python源码学习笔记(四十六): Python的内存管理机制(一)
大师兄的Python源码学习笔记(四十八): Python的内存管理机制(三)

二、小块空间的内存池

  • 在Python中,经常需要申请小块内存,这些小块内存在申请后,很快会被释放。
  • 由于这些小块内存的申请并不是为了创建对象,所以没有对象一级的内存池机制,这意味着需要执行大量的mallocfree操作。
  • 为了提高效率,Python引入了一个内存池机制,用于管理对小块内存的申请和释放,也就是Pymalloc机制,通过PyObject_MallocPyObject_ReallocPyObject_Free三个接口显示给Python。
  • 这个小块内存的内存池可以视为一个层次结构,从下至上分别是blockpoolarena内存池
  • 其中blockpoolarena在代码中可以找到实体,而内存池只是一个概念,表示Python对整个小块内存分配和释放行为的内存管理机制。
1. block
  • 在最底层,block是一个确定大小的内存块。
  • block有很多种,不同种类的block有不同的内存大小,这个值为size class
  • 为了在当前32位和64位平台获得最佳性能,所有blocksize class都是8字节对齐的。
Objects\obmalloc.c

 * Allocation strategy abstract:
 *
 * For small requests, the allocator sub-allocates <Big> blocks of memory.
 * Requests greater than SMALL_REQUEST_THRESHOLD bytes are routed to the
 * system's allocator.
 *
 * Small requests are grouped in size classes spaced 8 bytes apart, due
 * to the required valid alignment of the returned address. Requests of
 * a particular size are serviced from memory pools of 4K (one VMM page).
 * Pools are fragmented on demand and contain free lists of blocks of one
 * particular size class. In other words, there is a fixed-size allocator
 * for each size class. Free pools are shared by the different allocators
 * thus minimizing the space reserved for a particular size class.
 *
 * This allocation strategy is a variant of what is known as "simple
 * segregated storage based on array of free lists". The main drawback of
 * simple segregated storage is that we might end up with lot of reserved
 * memory for the different free lists, which degenerate in time. To avoid
 * this, we partition each free list in pools and we share dynamically the
 * reserved space between all free lists. This technique is quite efficient
 * for memory intensive programs which allocate mainly small-sized blocks.
 *
 * For small requests we have the following table:
 *
 * Request in bytes     Size of allocated block      Size class idx
 * ----------------------------------------------------------------
 *        1-8                     8                       0
 *        9-16                   16                       1
 *       17-24                   24                       2
 *       25-32                   32                       3
 *       33-40                   40                       4
 *       41-48                   48                       5
 *       49-56                   56                       6
 *       57-64                   64                       7
 *       65-72                   72                       8
 *        ...                   ...                     ...
 *      497-504                 504                      62
 *      505-512                 512                      63
 *
 *      0, SMALL_REQUEST_THRESHOLD + 1 and up: routed to the underlying
 *      allocator.
 */

/*==========================================================================*/

/*
 * -- Main tunable settings section --
 */

/*
 * Alignment of addresses returned to the user. 8-bytes alignment works
 * on most current architectures (with 32-bit or 64-bit address busses).
 * The alignment value is also used for grouping small requests in size
 * classes spaced ALIGNMENT bytes apart.
 *
 * You shouldn't change this unless you know what you are doing.
 */
#define ALIGNMENT               8               /* must be 2^N */
#define ALIGNMENT_SHIFT         3

/* Return the number of bytes in size class I, as a uint. */
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
  • 此外,block的上限被设定为512,当申请内存小于这个上限时,交给不同种类的block处理;当申请的内存超过上限,则交给第一层的内存管理机制,即PyMem函数来处理。
Objects\obmalloc.c

#define SMALL_REQUEST_THRESHOLD 512
#define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD / ALIGNMENT)
  • 根据SMALL_REQUEST_THRESHOLDALIGNMENT的限定,可以看到不同种类blocksize class对应一个Size class idx,下面两个公式用来转换size classsize class idx
Objects\obmalloc.c

#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)

size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
  • 但是在Python中,block只是一个概念,源码中并没有与之对应的实体存在,我们知道他是具有一定大小的内存,但不与Python中的某个实体对应。
  • 然后Python提供了一个用来管理block的实体,就是pool
2. pool
  • 一组block的集合称为一个pool,一个pool管理着一堆固定大小的内存块。
  • 事实上,pool管理着一大块内存,并通过策略将这块大内存划分为多个小内存块。
  • 一个pool的大小通常为一个系统内存页,由于当前大多数Python支持的系统内存页都是4KB,所以pool的大小通常定义为4KB。
Objects\obmalloc.c

/*
 * The system's VMM page size can be obtained on most unices with a
 * getpagesize() call or deduced from various header files. To make
 * things simpler, we assume that it is 4K, which is OK for most systems.
 * It is probably better if this is the native page size, but it doesn't
 * have to be.  In theory, if SYSTEM_PAGE_SIZE is larger than the native page
 * size, then `POOL_ADDR(p)->arenaindex' could rarely cause a segmentation
 * violation fault.  4K is apparently OK for all the platforms that python
 * currently targets.
 */
#define SYSTEM_PAGE_SIZE        (4 * 1024)
#define SYSTEM_PAGE_SIZE_MASK   (SYSTEM_PAGE_SIZE - 1)
... ...
/*
 * Size of the pools used for small blocks. Should be a power of 2,
 * between 1K and SYSTEM_PAGE_SIZE, that is: 1k, 2k, 4k.
 */
#define POOL_SIZE               SYSTEM_PAGE_SIZE        /* must be 2^N */
#define POOL_SIZE_MASK          SYSTEM_PAGE_SIZE_MASK
  • pool在Python对应的结构体是pool_header:
Objects\obmalloc.c

/* Pool for small blocks. */
struct pool_header {
    union { block *_padding;
            uint count; } ref;          /* number of allocated blocks    */
    block *freeblock;                   /* pool's free list head         */
    struct pool_header *nextpool;       /* next pool of this size class  */
    struct pool_header *prevpool;       /* previous pool       ""        */
    uint arenaindex;                    /* index into arenas of base adr */
    uint szidx;                         /* block size class index        */
    uint nextoffset;                    /* bytes to virgin block         */
    uint maxnextoffset;                 /* largest valid nextoffset      */
};
  • 从名字可以看出,pool_header是一个pool的头部。
  • 一个pool管理的所有block大小都是一样的,也就是他们的size class idx是一样的,由pool_header中的szidx定义。
  • 假如有一块4KB的内存,看看Python如何将其改造成pool
Objects\obmalloc.c
static int
pymalloc_alloc(void *ctx, void **ptr_p, size_t nbytes)
{
    block *bp;
    poolp pool;
    poolp next;
    uint size;
 ... ...
    init_pool:
        /* Frontlink to used pools. */
        next = usedpools[size + size]; /* == prev */
        pool->nextpool = next;
        pool->prevpool = next;
        next->nextpool = pool;
        next->prevpool = pool;
        pool->ref.count = 1;
        if (pool->szidx == size) {
            /* Luckily, this pool last contained blocks
             * of the same size class, so its header
             * and free list are already initialized.
             */
            bp = pool->freeblock;
            assert(bp != NULL);
            pool->freeblock = *(block **)bp;
            goto success;
        }
        /*
         * Initialize the pool header, set up the free list to
         * contain just the second block, and return the first
         * block.
         */
        pool->szidx = size;
        size = INDEX2SIZE(size);
        bp = (block *)pool + POOL_OVERHEAD;
        pool->nextoffset = POOL_OVERHEAD + (size << 1);
        pool->maxnextoffset = POOL_SIZE - size;
        pool->freeblock = bp + size;
        *(block **)(pool->freeblock) = NULL;
        goto success;
    }
... ...
success:
    UNLOCK();
    assert(bp != NULL);
    *ptr_p = (void *)bp;
    return 1;
}
  • 最后传出的*ptr_p(bp)也就是从pool中取出的第一块block的指针。
  • 由于第一块block已经分配了,也就是已经分配了1块block,所以pool->ref.count = 1
  • 此外,*ptr_p(bp)指向的地址后有将近4KB内存都是可用的,但申请内存的函数只会使用[bp,bp+size]这个区间的内存,这是由size class idx保证的,所以经过改造后的4KB内存如下:
  • 图中的实线箭头是指针,虚线箭头是偏移位置,在nextoffsetmaxnextoffset中存储的是相对于pool头部的偏移位置
  • 现在假设申请5块28字节的内存,由于28个字节对应的size class idx为3,所以实际会申请5块32字节的内存:
Objects\obmalloc.c

static int
pymalloc_alloc(void *ctx, void **ptr_p, size_t nbytes)
{
    block *bp;
    poolp pool;
    poolp next;
    uint size;

... ...
    if (pool != pool->nextpool) {
        /*
         * There is a used pool for this size class.
         * Pick up the head block of its free list.
         */
        ++pool->ref.count;
        bp = pool->freeblock;
        assert(bp != NULL);
        if ((pool->freeblock = *(block **)bp) != NULL) {
            goto success;
        }

        /*
         * Reached the end of the free list, try to extend it.
         */
        if (pool->nextoffset <= pool->maxnextoffset) {
            /* There is room for another block. */
            pool->freeblock = (block*)pool +
                              pool->nextoffset;
            pool->nextoffset += INDEX2SIZE(size);
            *(block **)(pool->freeblock) = NULL;
            goto success;
        }

        ... ...
    }
}
  • 可以看出freeblock是下一个可用block的起始地址,当再次申请32字节的block时,只需要返回freeblock只想的地址就可以。
  • nextoffsetmaxoffset是两个用于对pool中的block集合进行迭代的变量,它所指的偏移位置正好指向了freeblock之后的下一个可用的block地址,在分配block之后,freeblocknextoffset都会向前移动一个block的距离,如此反复进行遍历。
  • maxoffset指向pool中的最后一个可用block,它界定了pool的边界。
  • 除了freeblock之外,Python还设置了*freeblock
Objects\obmalloc.c

static int
pymalloc_free(void *ctx, void *p)
{
    poolp pool;
    block *lastfree;
    poolp next, prev;
    uint size;

... ...

    /* Link p to the start of the pool's freeblock list.  Since
     * the pool had at least the p block outstanding, the pool
     * wasn't empty (so it's already in a usedpools[] list, or
     * was full and is in no list -- it's not in the freeblocks
     * list in any case).
     */
    assert(pool->ref.count > 0);            /* else it was empty */
    *(block **)p = lastfree = pool->freeblock;
    pool->freeblock = (block *)p;
}
  • 当一个block被释放时,freeblock虽然指向了一个有效的pool内地址,但*freeblock为NULL。
  • 假设Python释放的是blockA,在代码清单1之后,A中的第一个字节的值被设置成为当前的freeblock值,而在代码清单2之后,freeblock的值被更新了,指向block A的首地址。
  • 这样,一个block被插入到了离散自由的block链表中。
  • 当第2块和第4块block都被释放后,可以看到一个初具规模的离散自由block链表:
  • 对于这个离散自由block链表,可以从freeblock开始很容易的以freeblock = *freeblock的方式遍历这条链表,当*freeblock为NULL时,表明到达了链表尾部。
Objects\obmalloc.c

static int
pymalloc_alloc(void *ctx, void **ptr_p, size_t nbytes)
{
    block *bp;
    poolp pool;
    poolp next;
    uint size;

... ...
    if (pool != pool->nextpool) {
        /*
         * There is a used pool for this size class.
         * Pick up the head block of its free list.
         */
        ++pool->ref.count;
        bp = pool->freeblock;
        assert(bp != NULL);
        if ((pool->freeblock = *(block **)bp) != NULL) {
            goto success;
        }
... ...
}
  • 在前面出现过的这段代码中,可以看到判断pool->freeblock = *(block **)bp,如果为真,则表示离散链表已经不存在了。
  • 如果可能则继续分配poolnextoffset指定的下一块block,否则就提供另一个pool
  • 这意味着也存在一个pool的集合

相关文章

网友评论

      本文标题:大师兄的Python源码学习笔记(四十七): Python的内存

      本文链接:https://www.haomeiwen.com/subject/ifbjfrtx.html