Go语言——内存管理

作者: 陈先生_9e91 | 来源:发表于2018-10-25 14:14 被阅读0次

Go语言——内存管理
Go 语言内存管理（一）：系统内存管理
GO 内存管理
图解 Go 内存管理器的内存分配策略
Go 语言内存管理（二）：Go 内存管理
Golang源码之图解内存分配
go语言内存管理
Go的内存管理
go 使用pprof 排查内存泄露
如何用 GVM 管理 Go 项目

Go语言——内存管理

参考：

图解 TCMalloc

Golang 内存管理

Go 内存管理

问题

内存碎片：避免内存碎片，提高内存利用率。
多线程：稳定性，效率问题。

内存分配

内存划分

arena即为所谓的堆区，应用中需要的内存从这里分配, 大小为512G，为了方便管理把arena区域划分成一个个的page，每个page为8KB,一共有512GB/8KB个页
spans区域存放span的指针，每个指针对应一个page，所以span区域的大小为(512GB/8KB) * 指针大小8byte = 512M
bitmap区域大小也是通过arena计算出来512GB / (指针大小(8 byte) * 8 / 2) = 16G，用于表示arena区域中哪些地址保存了对象, 并且对象中哪些地址包含了指针，主要用于GC。

分配细节

object size > 32K，则使用 mheap 直接分配。
object size < 16 byte，不包含指针使用 mcache 的小对象分配器 tiny 直接分配；包含指针分配策略与[16 B, 32 K]类似。
object size >= 16 byte && size <=32K byte 时，先使用 mcache 中对应的 size class 分配。
如果 mcache 对应的 size class 的 span 已经没有可用的块，则向 mcentral 请求。
如果 mcentral 也没有可用的块，则向 mheap 申请，并切分。
如果 mheap 也没有合适的 span，则向操作系统申请。

span

可以看出span是一个非常重要的数据结构，每个span包含若干个连续的page。

小对象分配会在span page中划分更小的粒度；大对象通过多页实现。

size class

go1.10\src\runtime\sizeclasses.go

// class  bytes/obj  bytes/span  objects  tail waste  max waste
//     1          8        8192     1024           0     87.50%
//     2         16        8192      512           0     43.75%
//     3         32        8192      256           0     46.88%
//     4         48        8192      170          32     31.52%
//     5         64        8192      128           0     23.44%
//     6         80        8192      102          32     19.07%
//     7         96        8192       85          32     15.95%
//     8        112        8192       73          16     13.56%
//     9        128        8192       64           0     11.72%
//    10        144        8192       56         128     11.82%

//    ...
//    65      28672       57344        2           0      4.91%
//    66      32768       32768        1           0     12.50%

上表中每列含义如下：

class： class ID，每个span结构中都有一个class ID, 表示该span可处理的对象类型
bytes/obj：该class代表对象的字节数
bytes/span：每个span占用堆的字节数，也即页数*页大小
objects: 每个span可分配的对象个数，也即（bytes/spans）/（bytes/obj）
tail bytes: 每个span产生的内存碎片，也即（bytes/spans）%（bytes/obj）

上表可见最大的对象是32K大小，超过32K大小的由特殊的class表示，该class ID为0，每个class只包含一个对象。所以上面只有列出了1-66。

有点像装箱算法，按照规格分配，减少内存碎片。

struct

span是内存管理的基本单位，每个span用来管子特定的size class对象，根据size class，span将若干个页分成块进行管理。

go1.10\src\runtime\mheap.go

type mspan struct {
    next *mspan     // next span in list, or nil if none
    prev *mspan     // previous span in list, or nil if none
   
    startAddr uintptr // address of first byte of span aka s.base()
    npages    uintptr // number of pages in span
    
    nelems uintptr // number of object in the span.
    
    allocBits  *gcBits
    gcmarkBits *gcBits
    
    allocCount  uint16     // number of allocated objects
    spanclass   spanClass  // size class and noscan (uint8)
    
    elemsize    uintptr    // computed from sizeclass or from npages
}

以size class 10为例，npages=1，nelems=56，spanclass=10，elemsize=144；startAddr指arena区位置；next和prev指spans区，span链表；allocBits是一个bitmap，标记分配块分配情况，这个设计我也用过，之前用redis bitmap实现了IPAM。

cache

从上面我们知道go通过span来分配内存，那在哪里用span？通过之前的学习Go语言——goroutine并发模型，我们知道每个P都有mcache，通过mcache管理每个G需要的内存。

go1.10\src\runtime\mcache.go

type mcache struct {
   tiny             uintptr
   tinyoffset       uintptr
    
   alloc [numSpanClasses]*mspan // spans to allocate from, indexed by spanClass
}

numSpanClasses = _NumSizeClasses << 1
_NumSizeClasses = 67

alloc是span数组，长度是67 << 1，说明每种size class有2组元素。第一组span对象中包含了指针，叫做scan，表示需要gc scan；第二组没有指针，叫做noscan。提高gc scan性能。

mcache初始没有span，G先从central动态申请span，并缓存在cache。

central

go1.10\src\runtime\mcentral.go

type mcentral struct {
   lock      mutex
   spanclass spanClass
   nonempty  mSpanList // list of spans with a free object, ie a nonempty free list
   empty     mSpanList // list of spans with no free objects (or cached in an mcache)

   // nmalloc is the cumulative count of objects allocated from
   // this mcentral, assuming all spans in mcaches are
   // fully-allocated. Written atomically, read under STW.
   nmalloc uint64
}

lock: 多个G并发从central申请span，所以需要lock，保证一致性
spanclass : 每个mcentral管理着一组有相同size class的span列表
nonempty: 指还有内存可用的span列表
empty: 指没有内存可用的span列表
nmalloc: 指累计分配的对象个数

线程从central获取span步骤如下：

加锁
从nonempty列表获取一个可用span，并将其从链表中删除
将取出的span放入empty链表
将span返回给线程
解锁
线程将该span缓存进cache

线程将span归还步骤如下：

加锁
将span从empty列表删除
将span加入nonempty列表
解锁

heap

central只管理特定的size class span，所以必然有一个更上层的数据结构，管理所有的sizeclass central，这就是heap。

go1.10\src\runtime\mheap.go

type mheap struct {
   lock      mutex
   
   spans []*mspan

   // Malloc stats.
   largealloc  uint64                  // bytes allocated for large objects
   nlargealloc uint64                  // number of large object allocations
   largefree   uint64                  // bytes freed for large objects (>maxsmallsize)
   nlargefree  uint64                  // number of frees for large objects (>maxsmallsize)
    
   // range of addresses we might see in the heap
   bitmap        uintptr // Points to one byte past the end of the bitmap
   bitmap_mapped uintptr

   arena_start uintptr
   arena_used  uintptr // Set with setArenaUsed.

   arena_alloc uintptr
   arena_end   uintptr

   arena_reserved bool

   central [numSpanClasses]struct {
      mcentral mcentral
      pad      [sys.CacheLineSize - unsafe.Sizeof(mcentral{})%sys.CacheLineSize]byte
   }
}

spans：映射span -> page
large：大对象，>32K
bitmap： gc
arena： arena区相关信息，pages，堆区
central：通过size class管理span，每种size class对应两个central

heap

code

前面都是背景知识，现在开始具体的分配过程

go1.10\src\runtime\malloc.go

注释

小对象

申请

// Allocating a small object proceeds up a hierarchy of caches:
//
// 1. Round the size up to one of the small size classes
//    and look in the corresponding mspan in this P's mcache.
//    Scan the mspan's free bitmap to find a free slot.
//    If there is a free slot, allocate it.
//    This can all be done without acquiring a lock.
//
// 2. If the mspan has no free slots, obtain a new mspan
//    from the mcentral's list of mspans of the required size
//    class that have free space.
//    Obtaining a whole span amortizes the cost of locking
//    the mcentral.
//
// 3. If the mcentral's mspan list is empty, obtain a run
//    of pages from the mheap to use for the mspan.
//
// 4. If the mheap is empty or has no page runs large enough,
//    allocate a new group of pages (at least 1MB) from the
//    operating system. Allocating a large run of pages
//    amortizes the cost of talking to the operating system.

将申请的内存大小，向上取整成对应的size class；并且从P的mcache中找对应的mspan。
如果mspan有空闲slot，就分配。这个过程不需要lock，因为只有一个G会向P申请。
如果mspan没有空闲slot，就从mcentral获取新的mspan。这个过程需要lock，因为会有多个G同时申请。
如果mcentral没有mspan，就从mheap申请。
如果mheap空间不足，就想OS申请一组page，最少1MB。

释放

// Sweeping an mspan and freeing objects on it proceeds up a similar
// hierarchy:
//
// 1. If the mspan is being swept in response to allocation, it
//    is returned to the mcache to satisfy the allocation.
//
// 2. Otherwise, if the mspan still has allocated objects in it,
//    it is placed on the mcentral free list for the mspan's size
//    class.
//
// 3. Otherwise, if all objects in the mspan are free, the mspan
//    is now "idle", so it is returned to the mheap and no longer
//    has a size class.
//    This may coalesce it with adjacent idle mspans.
//
// 4. If an mspan remains idle for long enough, return its pages
//    to the operating system.

如果mspan在响应分配时被扫描，就返回mcache以满足分配。（不是很理解）
如果mspan中有分配的对象，就将它放置到mcentral的free list中
如果mspan空闲，就返回mheap，并且不关联size class，即变成page
如果msapn空闲很久，就把page还给OS，缩容。

大对象

// Allocating and freeing a large object uses the mheap
// directly, bypassing the mcache and mcentral.

大对象都是直接操作mheap，跳过mcache和mcentral

实现

tiny( < 16 byte )

off := c.tinyoffset
// Align tiny pointer for required (conservative) alignment.
if size&7 == 0 {
   off = round(off, 8)
} else if size&3 == 0 {
   off = round(off, 4)
} else if size&1 == 0 {
   off = round(off, 2)
}
if off+size <= maxTinySize && c.tiny != 0 {
   // The object fits into existing tiny block.
   x = unsafe.Pointer(c.tiny + off)
   c.tinyoffset = off + size
   c.local_tinyallocs++
   mp.mallocing = 0
   releasem(mp)
   return x
}
// Allocate a new maxTinySize block.
span := c.alloc[tinySpanClass]
v := nextFreeFast(span)
if v == 0 {
   v, _, shouldhelpgc = c.nextFree(tinySpanClass)
}
x = unsafe.Pointer(v)
(*[2]uint64)(x)[0] = 0
(*[2]uint64)(x)[1] = 0
// See if we need to replace the existing tiny block with the new one
// based on amount of remaining free space.
if size < c.tinyoffset || c.tiny == 0 {
   c.tiny = uintptr(x)
   c.tinyoffset = size
}
size = maxTinySize

tinyoffset表示tiny当前分配到什么地址了，之后的分配根据 tinyoffset 寻址。

先根据要分配的对象大小进行地址对齐，比如size是8的倍数，tinyoffset 和 8 对齐。
然后就是进行分配。如果tiny剩余的空间不够用，则重新申请一个 16 byte 的内存块，并分配给 object。
如果新块剩余空间比老快大，就用新的内存块替换。

large（> 32 k）

var s *mspan
shouldhelpgc = true
systemstack(func() {
   s = largeAlloc(size, needzero, noscan)
})
s.freeindex = 1
s.allocCount = 1
x = unsafe.Pointer(s.base())
size = s.elemsize

func largeAlloc(size uintptr, needzero bool, noscan bool) *mspan {
   if size+_PageSize < size {
      throw("out of memory")
   }
   npages := size >> _PageShift
   if size&_PageMask != 0 {
      npages++
   }

   // Deduct credit for this span allocation and sweep if
   // necessary. mHeap_Alloc will also sweep npages, so this only
   // pays the debt down to npage pages.
   deductSweepCredit(npages*_PageSize, npages)

   s := mheap_.alloc(npages, makeSpanClass(0, noscan), true, needzero)
   if s == nil {
      throw("out of memory")
   }
   s.limit = s.base() + size
   heapBitsForSpan(s.base()).initSpan(s)
   return s
}

跳过了mspan和mcentral，直接在mheap上面分配。

normal（[16 b, 32 k]）

var sizeclass uint8
if size <= smallSizeMax-8 {
   sizeclass = size_to_class8[(size+smallSizeDiv-1)/smallSizeDiv]
} else {
   sizeclass = size_to_class128[(size-smallSizeMax+largeSizeDiv-1)/largeSizeDiv]
}
size = uintptr(class_to_size[sizeclass])
spc := makeSpanClass(sizeclass, noscan)
span := c.alloc[spc]
v := nextFreeFast(span)
if v == 0 {
   v, span, shouldhelpgc = c.nextFree(spc)
}
x = unsafe.Pointer(v)
if needzero && span.needzero != 0 {
   memclrNoHeapPointers(unsafe.Pointer(v), size)
}

对于 size 介于 16 ~ 32K byte 的内存分配

先计算应该分配的 sizeclass，
然后去 mcache 里面 alloc[sizeclass] 申请，
如果 mcache.alloc[sizeclass] 不足以申请，则 mcache 向 mcentral 申请，然后再分配。
mcentral 给 mcache 分配完之后会判断自己需不需要扩充，如果需要则向 mheap 申请。

计算出 sizeclass，那么就可以去 mcache.alloc[sizeclass] 分配了，注意这是一个 mspan 指针，真正的分配函数是 nextFreeFast() 函数:

// nextFreeFast returns the next free object if one is quickly available.
// Otherwise it returns 0.
func nextFreeFast(s *mspan) gclinkptr {
   theBit := sys.Ctz64(s.allocCache) // Is there a free object in the allocCache?
   if theBit < 64 {
      result := s.freeindex + uintptr(theBit)
      if result < s.nelems {
         freeidx := result + 1
         if freeidx%64 == 0 && freeidx != s.nelems {
            return 0
         }
         s.allocCache >>= uint(theBit + 1)
         s.freeindex = freeidx
         s.allocCount++
         return gclinkptr(result*s.elemsize + s.base())
      }
   }
   return 0
}

allocCache 这里是用位图表示内存是否可用，1 表示可用。然后通过 span 里面的 freeindex 和 elemsize 来计算地址即可。
当mcache没有可用地址时，通过nextFree向mcentral甚至mheap申请:

func (c *mcache) nextFree(spc spanClass) (v gclinkptr, s *mspan, shouldhelpgc bool) {
   s = c.alloc[spc]
   shouldhelpgc = false
   freeIndex := s.nextFreeIndex()
    
   if freeIndex == s.nelems {
      systemstack(func() {
         c.refill(spc)
      })
      shouldhelpgc = true
      s = c.alloc[spc]

      freeIndex = s.nextFreeIndex()
   }

   if freeIndex >= s.nelems {
      throw("freeIndex is not valid")
   }

   v = gclinkptr(freeIndex*s.elemsize + s.base())
   s.allocCount++
   return
}

mcache 向 mcentral，如果 mcentral 不够，则向 mheap 申请。

// Gets a span that has a free object in it and assigns it
// to be the cached span for the given sizeclass. Returns this span.
func (c *mcache) refill(spc spanClass) {
   _g_ := getg()

   _g_.m.locks++
   // Return the current cached span to the central lists.
   s := c.alloc[spc]

   // Get a new cached span from the central lists.
   s = mheap_.central[spc].mcentral.cacheSpan()

   c.alloc[spc] = s
   _g_.m.locks--
}

Go语言——内存管理
Go语言——内存管理参考：图解 TCMalloc Golang 内存管理 Go 内存管理问题内存碎片：避免...
Go 语言内存管理（一）：系统内存管理
介绍要搞明白 Go 语言的内存管理，就必须先理解操作系统以及机器硬件是如何管理内存的。因为 Go 语言的内部机制...
GO 内存管理
要搞明白 Go 语言的内存管理，就必须先理解操作系统以及机器硬件是如何管理内存的。因为 Go 语言的内部机制是建立...
图解 Go 内存管理器的内存分配策略
关于Go的内存分配在 Go 语言里，从内存的分配到不再使用后内存的回收等等这些内存管理工作都是由 Go 在底层完...
Go 语言内存管理（二）：Go 内存管理
介绍了解操作系统对内存的管理机制后，现在可以去看下 Go 语言是如何利用底层的这些特性来优化内存的。Go 的内存...
Golang源码之图解内存分配
转载自图解Go语言内存分配 Go语言内置运行时（就是runtime），抛弃了传统的内存分配方式，改为自主管理。这样...
go语言内存管理
参考连接： https://www.cnblogs.com/xumaojun/p/8547439.html htt...
Go的内存管理
本文翻译自Memory Management in Go，介绍了Go语言中内存管理的相关概念。所有的计算机程序语...
go 使用pprof 排查内存泄露
go 是自带gc的语言，会自动管理内存，不用像C/C++那样，需要程序员手动释放内存，不用手动管理内存，就能少掉很...
如何用 GVM 管理 Go 项目
使用 Go 版本管理器管理多个版本的 Go 语言环境及其模块。 Go 语言版本管理器（GVM）是管理 Go 语言环...

Go语言——内存管理

Go语言——内存管理

问题

内存分配

分配细节

span

size class

struct

cache

central

heap

code

注释

小对象

大对象

实现

tiny( < 16 byte )

large（> 32 k）

normal（[16 b, 32 k]）

相关文章

Go语言——内存管理

Go 语言内存管理（一）：系统内存管理

GO 内存管理

图解 Go 内存管理器的内存分配策略

Go 语言内存管理（二）：Go 内存管理

Golang源码之图解内存分配

go语言内存管理

Go的内存管理

go 使用pprof 排查内存泄露

如何用 GVM 管理 Go 项目

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读

go语言