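大师兄's Python Source Code Study Notes (47): Python's Memory Management Mechanism (2)
II. The Memory Pool for Small Blocks
3. arena
- In Python, an arena is what you get by grouping multiple pools together.
- By default an arena holds 64 pools: with the default pool size of 4 KB, the default arena size is 256 KB.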
Objects/obmalloc.c
#define ARENA_SIZE (256 << 10) /* 256KB */
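The relationship between the two sizes can be checked with a few lines of C. This is only a sketch: `POOL_SIZE` is assumed to be 4 KB as stated above, and `pools_per_arena` is a hypothetical helper that does not exist in obmalloc.c.

```c
#include <stddef.h>

/* Sizes as described in the text: 4 KB pools, 256 KB arenas. */
#define POOL_SIZE   (4 * 1024)    /* bytes per pool (assumed default) */
#define ARENA_SIZE  (256 << 10)   /* bytes per arena, as in obmalloc.c */

/* Hypothetical helper: how many whole pools fit into one arena. */
size_t pools_per_arena(void)
{
    return ARENA_SIZE / POOL_SIZE;
}
```

With these values the helper yields 64, which is where the "64 pools per arena" figure comes from.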
- In the source code, an arena is described by the arena_object struct:
Objects/obmalloc.c
typedef uint8_t block;
... ...
/* Record keeping for arenas. */
struct arena_object {
    /* The address of the arena, as returned by malloc.  Note that 0
     * will never be returned by a successful malloc, and is used
     * here to mark an arena_object that doesn't correspond to an
     * allocated arena.
     */
    uintptr_t address;

    /* Pool-aligned pointer to the next pool to be carved off. */
    block* pool_address;

    /* The number of available pools in the arena:  free pools + never-
     * allocated pools.
     */
    uint nfreepools;

    /* The total number of pools in the arena, whether or not available. */
    uint ntotalpools;

    /* Singly-linked list of available pools. */
    struct pool_header* freepools;

    /* Whenever this arena_object is not associated with an allocated
     * arena, the nextarena member is used to link all unassociated
     * arena_objects in the singly-linked `unused_arena_objects` list.
     * The prevarena member is unused in this case.
     *
     * When this arena_object is associated with an allocated arena
     * with at least one available pool, both members are used in the
     * doubly-linked `usable_arenas` list, which is maintained in
     * increasing order of `nfreepools` values.
     *
     * Else this arena_object is associated with an allocated arena
     * all of whose pools are in use.  `nextarena` and `prevarena`
     * are both meaningless in this case.
     */
    struct arena_object* nextarena;
    struct arena_object* prevarena;
};
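- An arena_object is only one part of an arena: a complete arena consists of an arena_object plus the collection of pools managed through it.
3.1 Unused arenas and usable arenas
- The arena_object struct contains nextarena and prevarena. Does that mean arenas are chained into one or more linked lists?
- There is indeed a collection of multiple arena_objects, but the collection itself is not a linked list: it is an array of arena_objects.
- The address of the first element of this array is kept in arenas, and the array is Python's general-purpose memory pool for small blocks.
- On the other hand, nextarena and prevarena really are used to link arena_objects into lists. Why?
- An arena manages a collection of pools, so arena_object and pool_header appear to play the same role.
- However, there is a subtle difference in the memory they manage:
  - a pool_header and the memory it manages form one contiguous chunk;
  - an arena_object is separate from the memory it manages.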
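The difference in layout can be illustrated with a deliberately simplified sketch. None of the `sketch_*` names below exist in CPython; the struct fields and sizes are assumptions chosen only to show "header inside the managed memory" versus "descriptor pointing at memory elsewhere".

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_POOL_SIZE  4096u          /* assumed pool size */
#define SKETCH_ARENA_SIZE (256u << 10)   /* assumed arena size */

/* Pool: the header sits at the start of the very chunk it manages. */
struct sketch_pool_header {
    unsigned int used_blocks;
};

/* Arena: the descriptor only records where the managed memory lives. */
struct sketch_arena_object {
    uintptr_t address;   /* 0 while no arena memory is attached */
};

/* Allocating a pool yields header and payload in one contiguous chunk. */
struct sketch_pool_header *sketch_pool_new(void)
{
    struct sketch_pool_header *pool = malloc(SKETCH_POOL_SIZE);
    if (pool != NULL)
        pool->used_blocks = 0;
    return pool;
}

/* First payload byte of a pool: directly behind its own header
 * (alignment padding is ignored in this sketch). */
unsigned char *sketch_pool_payload(struct sketch_pool_header *pool)
{
    return (unsigned char *)pool + sizeof(*pool);
}

/* Attaching memory to an arena descriptor is a separate, later step.
 * Returns 0 on success, -1 if the allocation fails. */
int sketch_arena_attach(struct sketch_arena_object *ao)
{
    void *mem = malloc(SKETCH_ARENA_SIZE);
    if (mem == NULL)
        return -1;
    ao->address = (uintptr_t)mem;
    return 0;
}
```

A descriptor can therefore exist with `address == 0` for an arbitrary time before `sketch_arena_attach` is called, which is not possible for a pool in this model.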
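- This means that once a pool_header has been allocated, the blocks it manages have necessarily been allocated too.
- When an arena_object is allocated, however, the memory for its pools has not been allocated yet.
- So an arena_object has to be linked to its pool memory at some later point. That moment is critical, because it divides the life of an arena_object into two states.
- An arena whose arena_object is not yet linked to pool memory is in the unused state; once the link is made, it moves to the usable state.
- Each state has its own list of arenas. The list of unused arenas has unused_arena_objects as its head and is singly linked through nextarena.
- The list of usable arenas has usable_arenas as its head and is doubly linked through nextarena and prevarena. (As the struct comment notes, an arena whose pools are all in use is on neither list.)
3.2 Allocating an arena
- Python uses new_arena to create an arena: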
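The two lists described above can be modelled in a few lines. This is a minimal sketch, not CPython's code: the `sketch_*` names are hypothetical, and the usable list here simply pushes at the front, whereas the real `usable_arenas` list is kept sorted by `nfreepools`.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_arena {
    uintptr_t address;               /* 0 = unused, nonzero = usable */
    struct sketch_arena *nextarena;  /* used by both lists */
    struct sketch_arena *prevarena;  /* used only by the usable list */
};

/* Pop the head of the singly-linked unused list (NULL if it is empty). */
struct sketch_arena *sketch_pop_unused(struct sketch_arena **unused_head)
{
    struct sketch_arena *ao = *unused_head;
    if (ao != NULL)
        *unused_head = ao->nextarena;
    return ao;
}

/* Push an arena at the front of the doubly-linked usable list. */
void sketch_push_usable(struct sketch_arena **usable_head,
                        struct sketch_arena *ao)
{
    ao->prevarena = NULL;
    ao->nextarena = *usable_head;
    if (*usable_head != NULL)
        (*usable_head)->prevarena = ao;
    *usable_head = ao;
}
```

A state change is then "pop from the unused list, set `address` to the newly allocated memory, push onto the usable list".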
Objects/obmalloc.c
/* Array of objects used to track chunks of memory (arenas). */
static struct arena_object* arenas = NULL;
/* Number of slots currently allocated in the `arenas` vector. */
static uint maxarenas = 0;

/* The head of the singly-linked, NULL-terminated list of available
 * arena_objects.
 */
static struct arena_object* unused_arena_objects = NULL;

/* The head of the doubly-linked, NULL-terminated at each end, list of
 * arena_objects associated with arenas that have pools available.
 */
static struct arena_object* usable_arenas = NULL;

/* How many arena_objects do we initially allocate?
 * 16 = can allocate 16 arenas = 16 * ARENA_SIZE = 4MB before growing the
 * `arenas` vector.
 */
#define INITIAL_ARENA_OBJECTS 16

/* Number of arenas allocated that haven't been free()'d. */
static size_t narenas_currently_allocated = 0;
... ...
/* Allocate a new arena.  If we run out of memory, return NULL.  Else
 * allocate a new arena, and return the address of an arena_object
 * describing the new arena.  It's expected that the caller will set
 * `usable_arenas` to the return value.
 */
static struct arena_object*
new_arena(void)
{
    struct arena_object* arenaobj;
    uint excess;        /* number of bytes above pool alignment */
    void *address;
    static int debug_stats = -1;

    if (debug_stats == -1) {
        const char *opt = Py_GETENV("PYTHONMALLOCSTATS");
        debug_stats = (opt != NULL && *opt != '\0');
    }
    if (debug_stats)
        _PyObject_DebugMallocStats(stderr);

    if (unused_arena_objects == NULL) {
        uint i;
        uint numarenas;
        size_t nbytes;

        /* Double the number of arena objects on each allocation.
         * Note that it's possible for `numarenas` to overflow.
         */
        numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS;
        if (numarenas <= maxarenas)
            return NULL;                /* overflow */
#if SIZEOF_SIZE_T <= SIZEOF_INT
        if (numarenas > SIZE_MAX / sizeof(*arenas))
            return NULL;                /* overflow */
#endif
        nbytes = numarenas * sizeof(*arenas);
        arenaobj = (struct arena_object *)PyMem_RawRealloc(arenas, nbytes);
        if (arenaobj == NULL)
            return NULL;
        arenas = arenaobj;

        /* We might need to fix pointers that were copied.  However,
         * new_arena only gets called when all the pages in the
         * previous arenas are full.  Thus, there are *no* pointers
         * into the old array. Thus, we don't have to worry about
         * invalid pointers.  Just to be sure, some asserts:
         */
        assert(usable_arenas == NULL);
        assert(unused_arena_objects == NULL);

        /* Put the new arenas on the unused_arena_objects list. */
        for (i = maxarenas; i < numarenas; ++i) {
            arenas[i].address = 0;              /* mark as unassociated */
            arenas[i].nextarena = i < numarenas - 1 ?
                                   &arenas[i+1] : NULL;
        }

        /* Update globals. */
        unused_arena_objects = &arenas[maxarenas];
        maxarenas = numarenas;
    }

    /* Take the next available arena object off the head of the list. */
    assert(unused_arena_objects != NULL);
    arenaobj = unused_arena_objects;
    unused_arena_objects = arenaobj->nextarena;
    assert(arenaobj->address == 0);
    address = _PyObject_Arena.alloc(_PyObject_Arena.ctx, ARENA_SIZE);
    if (address == NULL) {
        /* The allocation failed: return NULL after putting the
         * arenaobj back.
         */
        arenaobj->nextarena = unused_arena_objects;
        unused_arena_objects = arenaobj;
        return NULL;
    }
    arenaobj->address = (uintptr_t)address;

    ++narenas_currently_allocated;
    ++ntimes_arena_allocated;
    if (narenas_currently_allocated > narenas_highwater)
        narenas_highwater = narenas_currently_allocated;
    arenaobj->freepools = NULL;
    /* pool_address <- first pool-aligned address in the arena
       nfreepools <- number of whole pools that fit after alignment */
    arenaobj->pool_address = (block*)arenaobj->address;
    arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE;
    assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE);
    excess = (uint)(arenaobj->address & POOL_SIZE_MASK);
    if (excess != 0) {
        --arenaobj->nfreepools;
        arenaobj->pool_address += POOL_SIZE - excess;
    }
    arenaobj->ntotalpools = arenaobj->nfreepools;

    return arenaobj;
}
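The alignment step at the end of new_arena is easier to follow in isolation. The sketch below lifts that computation into a standalone function; `sketch_align_arena` and `struct aligned_arena` are hypothetical names, and `POOL_SIZE` is assumed to be 4 KB with `POOL_SIZE_MASK` equal to `POOL_SIZE - 1`.

```c
#include <stdint.h>

#define POOL_SIZE       4096u             /* assumed default pool size */
#define POOL_SIZE_MASK  (POOL_SIZE - 1)   /* low bits of an unaligned address */
#define ARENA_SIZE      (256u << 10)

/* Result of aligning a raw arena address to a pool boundary. */
struct aligned_arena {
    uintptr_t pool_address;    /* first pool-aligned address in the arena */
    unsigned int nfreepools;   /* whole pools that fit after alignment */
};

/* Mirrors the tail of new_arena: if the raw address is not pool-aligned,
 * skip ahead to the next pool boundary and give up one pool. */
struct aligned_arena sketch_align_arena(uintptr_t address)
{
    struct aligned_arena result;
    unsigned int excess = (unsigned int)(address & POOL_SIZE_MASK);

    result.pool_address = address;
    result.nfreepools = ARENA_SIZE / POOL_SIZE;
    if (excess != 0) {
        --result.nfreepools;
        result.pool_address += POOL_SIZE - excess;
    }
    return result;
}
```

For an already aligned address such as 0x10000 nothing changes and all 64 pools remain. For 0x10010 the excess is 0x10 bytes, so the first pool starts at 0x11000 and only 63 whole pools fit.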
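- This code first checks whether the unused_arena_objects list still contains an unused arena_object.
- If one exists:
  - An arena_object is taken off the head of unused_arena_objects and unlinked from the list.
  - A block of ARENA_SIZE bytes is allocated and its address is stored in the arena_object's address field; this block is where the pools will live.
  - At this point the arena_object is linked to its pool memory and is ready to serve as usable memory.
  - Next, the bookkeeping fields used to maintain the pools are set in the arena_object.
  - If the returned address is not pool-aligned, Python gives up part of the allocated memory: pool_address is moved up to the next POOL_SIZE boundary (4 KB by default, the same as the system page size assumed by obmalloc.c) and nfreepools is reduced by one.
- If none exists:
  - Python doubles the number of arena_objects (or starts with INITIAL_ARENA_OBJECTS, 16, on the first call) to enlarge the arenas array, the small-block memory pool.
  - It then checks whether the new count, or the byte size computed from it, has overflowed.
  - If not, it calls PyMem_RawRealloc to grow the memory that arenas points to and initializes the newly added arena_objects.
  - The address field of each new arena_object is set to 0.
  - address doubles as the state flag of an arena: 0 means unused, nonzero means the arena_object is associated with allocated arena memory.
  - Finally the new entries are chained onto unused_arena_objects and maxarenas is updated. The list is no longer empty, so execution continues with the first case and creates the arena.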
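The growth branch described above can be sketched on its own. This is an illustration under assumptions, not CPython's implementation: the globals are bundled into a hypothetical `struct sketch_state`, the arena_object is reduced to the two fields involved, and plain `realloc` stands in for `PyMem_RawRealloc`.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_INITIAL_ARENA_OBJECTS 16u

struct sketch_arena_object {
    uintptr_t address;                      /* 0 marks "unused" */
    struct sketch_arena_object *nextarena;  /* link in the unused list */
};

/* Hypothetical bundle of the globals used by the growth branch. */
struct sketch_state {
    struct sketch_arena_object *arenas;
    unsigned int maxarenas;
    struct sketch_arena_object *unused_arena_objects;
};

/* Double the descriptor array and thread the new slots onto the unused
 * list. Like the original, this is only meant to be called when the
 * unused list is empty. Returns 0 on success, -1 on overflow or
 * allocation failure. */
int sketch_grow(struct sketch_state *st)
{
    unsigned int numarenas =
        st->maxarenas ? st->maxarenas << 1 : SKETCH_INITIAL_ARENA_OBJECTS;
    struct sketch_arena_object *grown;
    unsigned int i;

    if (numarenas <= st->maxarenas)
        return -1;                          /* the shift overflowed */
    if (numarenas > SIZE_MAX / sizeof(*st->arenas))
        return -1;                          /* byte count would overflow */

    grown = realloc(st->arenas, numarenas * sizeof(*st->arenas));
    if (grown == NULL)
        return -1;
    st->arenas = grown;

    /* Only the new slots are initialized; old slots keep their contents. */
    for (i = st->maxarenas; i < numarenas; ++i) {
        grown[i].address = 0;
        grown[i].nextarena = (i < numarenas - 1) ? &grown[i + 1] : NULL;
    }
    st->unused_arena_objects = &grown[st->maxarenas];
    st->maxarenas = numarenas;
    return 0;
}
```

Starting from an empty state, the first call creates 16 descriptors and the second grows the array to 32, with the unused list starting at index 16.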