1. 内存分配步骤
go 给对象分配内存的主要流程:
- object size > 32K,则使用 mheap 直接分配。
- object size < 16 byte,使用 mcache 的小对象分配器 tiny 直接分配。 (其实 tiny * 就是一个指针,暂且这么说吧。)
- object size > 16 byte && size <=32K byte 时,先使用 mcache 中对应的 size class 分配。
- 如果 mcache 对应的 size class 的 span 已经没有可用的块,则向 mcentral 请求。
- 如果 mcentral 也没有可用的块,则向 mheap 申请,并切分。
- 如果 mheap 也没有合适的 span,则向操作系统申请。
1.1 关键数据结构
1.1.1 mcache
我们知道每个 Gorontine 的运行都是绑定到一个 P 上面,mcache 是每个 P 的 cache. 这样一个M处理mcache时就不需要加锁(因为同一时刻P只会绑定到一个M下)
mcache 结构如下:
// Per-thread (in Go, per-P) cache for small objects.
// No locking needed because it is per-thread (per-P).
type mcache struct {
// 小对象分配器,小于 16 byte 的小对象都会通过 tiny 来分配。
tiny uintptr
tinyoffset uintptr
local_tinyallocs uintptr // number of tiny allocs not counted in other stats
// The rest is not accessed on every malloc.
alloc [_NumSizeClasses]*mspan // spans to allocate from
stackcache [_NumStackOrders]stackfreelist
.......
}
其中
alloc [_NumSizeClasses]*mspan,这是一个大小为 67 的指针(指针指向 mspan )数组(_NumSizeClasses = 67)),每个数组元素用来包含特定大小的块。当要分配内存大小时,为 object 在 alloc 数组中选择合适的元素来分配。67 种块大小为 0,8 byte, 16 byte, …
var class_to_size = [_NumSizeClasses]uint16{0, 8, 16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240, 256, 288, 320, 352, 384, 416, 448, 480, 512, 576, 640, 704, 768, 896, 1024, 1152, 1280, 1408, 1536, 1792, 2048, 2304, 2688, 3072, 3200, 3456, 4096, 4864, 5376, 6144, 6528, 6784, 6912, 8192, 9472, 9728, 10240, 10880, 12288, 13568, 14336, 16384, 18432, 19072, 20480, 21760, 24576, 27264, 28672, 32768}
而mspan的结构如下:
type mspan struct {
next *mspan // next span in list, or nil if none
prev *mspan // previous span in list, or nil if none
list *mSpanList // For debugging. TODO: Remove.
// 用位图来管理可用的 free object,1 表示可用
allocCache uint64
...
sizeclass uint8 // size class,大小的类别
...
elemsize uintptr // computed from sizeclass or from npages
...
}
1.1.2 mcentral
当 mcache 不够用的时候,会从 mcentral 申请。mcentral存储在mheap当中
type mcentral struct {
lock mutex // 可能多个P竞争,因此要有锁
sizeclass int32 // 也会有67个块大小的类别区分
nonempty mSpanList // list of spans with a free object, ie a nonempty free list
empty mSpanList // list of spans with no free objects (or cached in an mcache)
}
type mSpanList struct {
first *mspan
last *mspan
}
1.1.3 mheap
type mheap struct {
lock mutex
free [_MaxMHeapList]mSpanList // free lists of given length
freelarge mSpanList // free lists length >= _MaxMHeapList
busy [_MaxMHeapList]mSpanList // busy lists of large objects of given length
busylarge mSpanList // busy lists of large objects length >= _MaxMHeapList
sweepgen uint32 // sweep generation, see comment in mspan
sweepdone uint32 // all spans are swept
// allspans is a slice of all mspans ever created. Each mspan
// appears exactly once.
//
// The memory for allspans is manually managed and can be
// reallocated and move as the heap grows.
//
// In general, allspans is protected by mheap_.lock, which
// prevents concurrent access as well as freeing the backing
// store. Accesses during STW might not hold the lock, but
// must ensure that allocation cannot happen around the
// access (since that may free the backing store).
allspans []*mspan // all spans out there
// spans is a lookup table to map virtual address page IDs to *mspan.
// For allocated spans, their pages map to the span itself.
// For free spans, only the lowest and highest pages map to the span itself.
// Internal pages map to an arbitrary span.
// For pages that have never been allocated, spans entries are nil.
//
// This is backed by a reserved region of the address space so
// it can grow without moving. The memory up to len(spans) is
// mapped. cap(spans) indicates the total reserved memory.
spans []*mspan
// sweepSpans contains two mspan stacks: one of swept in-use
// spans, and one of unswept in-use spans. These two trade
// roles on each GC cycle. Since the sweepgen increases by 2
// on each cycle, this means the swept spans are in
// sweepSpans[sweepgen/2%2] and the unswept spans are in
// sweepSpans[1-sweepgen/2%2]. Sweeping pops spans from the
// unswept stack and pushes spans that are still in-use on the
// swept stack. Likewise, allocating an in-use span pushes it
// on the swept stack.
sweepSpans [2]gcSweepBuf
// central free lists for small size classes.
// the padding makes sure that the MCentrals are
// spaced CacheLineSize bytes apart, so that each MCentral.lock
// gets its own cache line.
central [_NumSizeClasses]struct {
mcentral mcentral
pad [sys.CacheLineSize]byte
}
.....
}
mheap_ 是一个全局变量,会在系统初始化的时候初始化(在函数 mallocinit() 中)
- allspans []*mspan: 所有的 spans 都是通过 mheap_ 申请,所有申请过的 mspan 都会记录在 allspans。结构体中的 lock 就是用来保证并发安全的。
- central [_NumSizeClasses]…: 这个就是之前介绍的 mcentral ,每种大小的块对应一个 mcentral。
- spans []*mspan: 记录 arena 区域页号(page number)和 mspan 的映射关系。
1.2 初始化
runtime·rt0_go 会调用schedinit,初始化环境
func schedinit() {
...
mallocinit()
...
gcinit()
......
// 初始化P
if procresize(procs) != nil {
throw("unknown runnable goroutine during bootstrap")
}
.......
}
1.2.1 mallocinit
// 操作系统内存的管理抽象层
// runtime管理的内存空间可能为下面四种状态
// 1. None 没有被映射管理的,region的默认状态
// 2. Reserved 已经被runtime拥有,但是访问的话会出错,还没有计入进程的内存占用
// 3. Prepared 在此类区域中访问内存未定义(防问可能故障,可能会返回意外的零等)
// 4. Ready 可以安全访问
//对于每个操作系统,都有一组通用的帮助程序定义了该过渡这些状态之间的存储区域。 助手如下:
// sysAlloc transitions an OS-chosen region of memory from None to Ready.它从操作系统获取一块内存一般为100kb或者1mb,一般立刻可以使用
// sysFree transitions a memory region from any state to None 用来释放内存给操作系统
// sysReserve transitions a memory region from None to Reserved
// sysMap transitions a memory region from Reserved to Prepared 可以使内存确保快速转换为Ready
// sysUsed transitions a memory region from Prepared to Ready
// sysUnused transitions a memory region from Ready to Prepared.通知OS,进程不再使用这块内存,OS可以做其他用途
// sysFault transitions a memory region from Ready or Prepared to Reserved 如果访问这块内存会返回错误,只会在runtime debugging的时候使用
func mallocinit() {
.....
// 初始化mheap
mheap_.init()
// 初始化mcache
_g_ := getg()
_g_.m.mcache = allocmcache()
.....
}
1.2.2 mheap_.init() 初始化mheap
func (h *mheap) init() {
h.treapalloc.init(unsafe.Sizeof(treapNode{}), nil, nil, &memstats.other_sys)
h.spanalloc.init(unsafe.Sizeof(mspan{}), recordspan, unsafe.Pointer(h), &memstats.mspan_sys)
h.cachealloc.init(unsafe.Sizeof(mcache{}), nil, nil, &memstats.mcache_sys)
h.specialfinalizeralloc.init(unsafe.Sizeof(specialfinalizer{}), nil, nil, &memstats.other_sys)
h.specialprofilealloc.init(unsafe.Sizeof(specialprofile{}), nil, nil, &memstats.other_sys)
h.arenaHintAlloc.init(unsafe.Sizeof(arenaHint{}), nil, nil, &memstats.other_sys)
// Don't zero mspan allocations. Background sweeping can
// inspect a span concurrently with allocating it, so it's
// important that the span's sweepgen survive across freeing
// and re-allocating a span to prevent background sweeping
// from improperly cas'ing it from 0.
//
// This is safe because mspan contains no heap pointers.
h.spanalloc.zero = false
// h->mapcache needs no init
// 初始化central
for i := range h.central {
h.central[i].mcentral.init(spanClass(i))
}
}
1.2.3 allocmcache()
func allocmcache() *mcache {
var c *mcache
systemstack(func() {
lock(&mheap_.lock)
c = (*mcache)(mheap_.cachealloc.alloc())
c.flushGen = mheap_.sweepgen
unlock(&mheap_.lock)
})
for i := range c.alloc {
c.alloc[i] = &emptymspan
}
c.next_sample = nextSample()
return c
}
1.2.4 procresize 初始化P
func procresize(nprocs int32) *p {
.......
//确保P的个数为nprocs个
if nprocs > int32(len(allp)) {
// Synchronize with retake, which could be running
// concurrently since it doesn't run on a P.
lock(&allpLock)
if nprocs <= int32(cap(allp)) {
allp = allp[:nprocs]
} else {
nallp := make([]*p, nprocs)
// Copy everything up to allp's cap so we
// never lose old allocated Ps.
copy(nallp, allp[:cap(allp)])
allp = nallp
}
unlock(&allpLock)
}
// initialize new P's
// 初始化新的P
for i := old; i < nprocs; i++ {
pp := allp[i]
if pp == nil {
pp = new(p)
}
pp.init(i)
atomicstorep(unsafe.Pointer(&allp[i]), unsafe.Pointer(pp))
}
......
}
- 其中 pp.init(i)
// init initializes pp, which may be a freshly allocated p or a
// previously destroyed p, and transitions it to status _Pgcstop.
func (pp *p) init(id int32) {
pp.id = id
pp.status = _Pgcstop
pp.sudogcache = pp.sudogbuf[:0]
for i := range pp.deferpool {
pp.deferpool[i] = pp.deferpoolbuf[i][:0]
}
pp.wbBuf.reset()
if pp.mcache == nil {
if id == 0 {
// p0使用m0的mcache
if getg().m.mcache == nil {
throw("missing mcache?")
}
pp.mcache = getg().m.mcache // bootstrap
} else {
// 其他p使用新创建分配mcache
pp.mcache = allocmcache()
}
}
if raceenabled && pp.raceprocctx == 0 {
if id == 0 {
pp.raceprocctx = raceprocctx0
raceprocctx0 = 0 // bootstrap
} else {
pp.raceprocctx = raceproccreate()
}
}
}
2 内存申请
申请内存时会调用newobject方法
2.1 newobject
func newobject(typ *_type) unsafe.Pointer {
return mallocgc(typ.size, typ, true)
}
2.2 newobject
分配一个size byte的对象
大于32K的直接从heap上分配
其他则从 per-P cache分配
// Allocate an object of size bytes.
// Small objects are allocated from the per-P cache's free lists.
// Large objects (> 32 kB) are allocated straight from the heap.
func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer {
// _GCoff 为 GC not running,后台清理阶段,write barrier disabled
// GC marking roots and workbufs: allocate black, write barrier ENABLED
// _GCmarktermination GC mark termination: allocate black, P's help GC, write barrier ENABLED
if gcphase == _GCmarktermination {
throw("mallocgc called with gcphase == _GCmarktermination")
}
// 如果申请大小为0,直接返回
if size == 0 {
return unsafe.Pointer(&zerobase)
}
....
// getg().m.mcache 获取当前m的mcache
c := gomcache()
....
if size <= maxSmallSize {
if noscan && size < maxTinySize {
// 不包含指针并且小于16bytes的直接在mcahce Tiny allocator
....
// 如果当前的tiny还有足够空间,则直接从tiny分配
if off+size <= maxTinySize && c.tiny != 0 {
// The object fits into existing tiny block.
x = unsafe.Pointer(c.tiny + off)
// 更新c.tinyoffset
c.tinyoffset = off + size
c.local_tinyallocs++
mp.mallocing = 0
releasem(mp)
return x
}
//否则新建一个tiny,并分配size
// Allocate a new maxTinySize block.
span := c.alloc[tinySpanClass]
v := nextFreeFast(span)
if v == 0 {
// 如果mcache中的空闲span已经用完,则会激活gc
v, _, shouldhelpgc = c.nextFree(tinySpanClass)
}
x = unsafe.Pointer(v)
(*[2]uint64)(x)[0] = 0
(*[2]uint64)(x)[1] = 0
// See if we need to replace the existing tiny block with the new one
// based on amount of remaining free space.
//看这意思,如果剩余空间比旧的tiny大,则tiny指向当前的span,否则还在旧的span
if size < c.tinyoffset || c.tiny == 0 {
c.tiny = uintptr(x)
c.tinyoffset = size
}
size = maxTinySize
}
} else {
// 如果size <= maxSmallSize && (!noscan || size > maxTinySize)则从mcache中分配
var sizeclass uint8
if size <= smallSizeMax-8 {
sizeclass = size_to_class8[(size+smallSizeDiv-1)/smallSizeDiv]
} else {
sizeclass = size_to_class128[(size-smallSizeMax+largeSizeDiv-1)/largeSizeDiv]
}
size = uintptr(class_to_size[sizeclass])
spc := makeSpanClass(sizeclass, noscan)
span := c.alloc[spc]
v := nextFreeFast(span)
if v == 0 {
// 如果mcache中的空闲span已经用完,则会激活gc
v, span, shouldhelpgc = c.nextFree(spc)
}
x = unsafe.Pointer(v)
// 如果需要清0的话,则清0
if needzero && span.needzero != 0 {
memclrNoHeapPointers(unsafe.Pointer(v), size)
}
} else { // 如果size > maxSmallSize(32k) ,则从mheap分配
var s *mspan
// 每次从mheap分配内存都会激活gc
shouldhelpgc = true
systemstack(func() {
s = largeAlloc(size, needzero, noscan)
})
s.freeindex = 1
s.allocCount = 1
x = unsafe.Pointer(s.base())
size = s.elemsize
}
....
// 开启gc
if shouldhelpgc {
if t := (gcTrigger{kind: gcTriggerHeap}); t.test() {
gcStart(t)
}
}
return x
}
- 其中nextFree
返回一个spc类型的span,如果没有空闲的span,则refills cache
并根据分配过程决定是否要开启一轮gc
func (c *mcache) nextFree(spc spanClass) (v gclinkptr, s *mspan, shouldhelpgc bool) {
s = c.alloc[spc]
shouldhelpgc = false
freeIndex := s.nextFreeIndex()
if freeIndex == s.nelems {
// The span is full.
if uintptr(s.allocCount) != s.nelems {
println("runtime: s.allocCount=", s.allocCount, "s.nelems=", s.nelems)
throw("s.allocCount != s.nelems && freeIndex == s.nelems")
}
// 没有空闲的span,则从 mheap_.central[spc].mcentral.cacheSpan() 获取span补充
c.refill(spc)
// 激活gc
shouldhelpgc = true
s = c.alloc[spc]
freeIndex = s.nextFreeIndex()
}
if freeIndex >= s.nelems {
throw("freeIndex is not valid")
}
v = gclinkptr(freeIndex*s.elemsize + s.base())
s.allocCount++
if uintptr(s.allocCount) > s.nelems {
println("s.allocCount=", s.allocCount, "s.nelems=", s.nelems)
throw("s.allocCount > s.nelems")
}
return
}
3 垃圾回收
go使用三色标记法做垃圾回收
- 起初所有对象都是白色。
- 从根出发扫描所有可达对象,标记为灰色,放入待处理队列。
- 从队列取出灰色对象,将其引用对象标记为灰色放入队列,自身标记为黑色。
-
重复 3,直到灰色对象队列为空。此时白色对象即为垃圾,进行回收。
关于上图有几点需要说明的是:
- 首先从 root 开始遍历,root 包括全局指针和 goroutine 栈上的指针。
- mark 有两个过程。
- 从 root 开始遍历,标记为灰色。遍历灰色队列。
- re-scan 全局指针和栈。因为 mark 和用户程序是并行的,所以在过程 1 的时候可能会有新的对象分配,这个时候就需要通过写屏障(write barrier)记录下来。re-scan 再完成检查一下。
- Stop The World 有两个过程。
- 第一个是 GC 将要开始的时候,这个时候主要是一些准备工作,比如 enable write barrier。
- 第二个过程就是上面提到的 re-scan 过程。如果这个时候没有 stw,那么 mark 将无休止。
另外针对上图各个阶段对应 GCPhase 如下:
- Off: _GCoff
- Stack scan ~ Mark: _GCmark
- Mark termination: _GCmarktermination
3.1 垃圾回收周期
- sweep termination
- mark
- mark termination
- sweep
3.2 何时开启gc?
- 如果分配内存时mcache没有空闲的span
- 如果申请超过32k的内存块
- 调用runtime.GC(),主动GC
- sysmon保证最多每隔forcegcperiod(2分钟)一次gc
3.3 gcStart
gcStart 开启GC,从_GCoff到_GCmark(debug.gcstoptheworld == 0时)
func gcStart(trigger gcTrigger) {
.....
gcBgMarkStartWorkers()
.......
}
3.3.1 gcBgMarkStartWorkers
gcBgMarkStartWorkers prepares 后台mark worker goroutines
直到mark phase才会运行这些g
func gcBgMarkStartWorkers() {
// Background marking is performed by per-P G's. Ensure that
// each P has a background GC G.
// 会给每个p构建一个markworker,并且阻塞markwoker直到调用gcController.findRunnable唤醒
for _, p := range allp {
if p.gcBgMarkWorker == 0 {
go gcBgMarkWorker(p)
notetsleepg(&work.bgMarkReady, -1)
noteclear(&work.bgMarkReady)
}
}
}
- gcBgMarkWorker
func gcBgMarkWorker(_p_ *p) {
notewakeup(&work.bgMarkReady)
for {
// 一直休眠,直到通过 gcController.findRunnable.唤醒
// Go to sleep until woken by gcController.findRunnable.
// We can't releasem yet since even the call to gopark
// may be preempted.
gopark(func(g *g, parkp unsafe.Pointer) bool {
park := (*parkInfo)(parkp)
// The worker G is no longer running, so it's
// now safe to allow preemption.
releasem(park.m.ptr())
// If the worker isn't attached to its P,
// attach now. During initialization and after
// a phase change, the worker may have been
// running on a different P. As soon as we
// attach, the owner P may schedule the
// worker, so this must be done after the G is
// stopped.
if park.attach != 0 {
p := park.attach.ptr()
park.attach.set(nil)
// cas the worker because we may be
// racing with a new worker starting
// on this P.
if !p.gcBgMarkWorker.cas(0, guintptr(unsafe.Pointer(g))) {
// The P got a new worker.
// Exit this worker.
return false
}
}
return true
}, unsafe.Pointer(park), waitReasonGCWorkerIdle, traceEvGoBlock, 0)
// 等待gcController.findRunnable.唤醒后执行
systemstack(func() {
// Mark our goroutine preemptible so its stack
// can be scanned. This lets two mark workers
// scan each other (otherwise, they would
// deadlock). We must not modify anything on
// the G stack. However, stack shrinking is
// disabled for mark workers, so it is safe to
// read from the G stack.
casgstatus(gp, _Grunning, _Gwaiting)
switch _p_.gcMarkWorkerMode {
default:
throw("gcBgMarkWorker: unexpected gcMarkWorkerMode")
case gcMarkWorkerDedicatedMode:
gcDrain(&_p_.gcw, gcDrainUntilPreempt|gcDrainFlushBgCredit)
if gp.preempt {
// We were preempted. This is
// a useful signal to kick
// everything out of the run
// queue so it can run
// somewhere else.
lock(&sched.lock)
for {
gp, _ := runqget(_p_)
if gp == nil {
break
}
globrunqput(gp)
}
unlock(&sched.lock)
}
// Go back to draining, this time
// without preemption.
gcDrain(&_p_.gcw, gcDrainFlushBgCredit)
case gcMarkWorkerFractionalMode:
gcDrain(&_p_.gcw, gcDrainFractional|gcDrainUntilPreempt|gcDrainFlushBgCredit)
case gcMarkWorkerIdleMode:
gcDrain(&_p_.gcw, gcDrainIdle|gcDrainUntilPreempt|gcDrainFlushBgCredit)
}
casgstatus(gp, _Gwaiting, _Grunning)
})
}
}
}
- 调度时优先调度gcworker
func schedule() {
.......
if gp == nil && gcBlackenEnabled != 0 {
gp = gcController.findRunnableGCWorker(_g_.m.p.ptr())
tryWakeP = tryWakeP || gp != nil
}
.......
}
- findRunnableGCWorker
// 返回the background mark worker for _p_ if it should be run
func (c *gcControllerState) findRunnableGCWorker(_p_ *p) *g {
......
// Run the background mark worker
gp := _p_.gcBgMarkWorker.ptr()
casgstatus(gp, _Gwaiting, _Grunnable)
if trace.enabled {
traceGoUnpark(gp, 0)
}
return gp
}
- 唤醒gcBgMarkWorker后,执行gcDrain做具体的标记操作
3.4 标记
- gcDrain
gcstart启动阶段准备了N个goMarkWorkers。每个worker都处理以下相同流程。 - 如果是第一次mark则首先markroot将所有root区的指针入队。
- 从gcw中取节点出对开始扫描处理scanobject,节点出队列就是黑色了。
- 扫描时获取该节点所有子节点的类型信息判断是不是指针,若是指针且并没有被标记则greyobject入队。
- 每个worker都去gcw中拿任务直到为空break。
gcDrain 遍历roots和work buffers中的objects,一直对对象做标记
func gcDrain(gcw *gcWork, flags gcDrainFlags) {
// Drain root marking jobs.
if work.markrootNext < work.markrootJobs {
for !(preemptible && gp.preempt) {
job := atomic.Xadd(&work.markrootNext, +1) - 1
if job >= work.markrootJobs {
break
}
// 标记root
markroot(gcw, job)
if check != nil && check() {
goto done
}
}
}
}
参考
Golang 内存管理
Golang 垃圾回收剖析
Go 垃圾回收
推荐
强烈推荐 legendtkl的博客
http://legendtkl.com/