Common Go Performance Optimizations


Author: 迪克dike | Published 2021-04-28 14:37

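    []byte and string

    Conversion

    • Avoid converting between []byte and string where possible. A Go string is immutable, and the standard conversions in both directions copy the underlying data.
    • Where the copy shows up on a hot path and the result is only read, a forced conversion (a zero-copy cast through unsafe) can be used instead.
    // Forced conversion: zero-copy, relies on the runtime's string and slice header layout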
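    func stringToBytes(s string) []byte {
       // A string header is {data pointer, len}; view it as two words.
       x := (*[2]uintptr)(unsafe.Pointer(&s))
       // A slice header is {data pointer, len, cap}; reuse len as cap.
       b := [3]uintptr{x[0], x[1], x[1]}
       return *(*[]byte)(unsafe.Pointer(&b))
    }
    
    func bytesToString(b []byte) string {
       // The first two words of a slice header match a string header.
       return *(*string)(unsafe.Pointer(&b))
    }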
    

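    Use the forced conversion only in read-only scenarios: writing through the returned []byte, or modifying the source []byte while the string is still in use, breaks string immutability and can crash the program.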
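As a point of comparison, the same zero-copy conversion can be sketched with the helpers that the unsafe package gained in Go 1.20 (unsafe.String, unsafe.StringData, unsafe.Slice, unsafe.SliceData). This assumes a Go 1.20 or newer toolchain; the function names below are hypothetical and are not from the original article.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytesNoCopy returns a []byte that shares memory with s.
// The result must never be written to: strings are immutable and
// their bytes may live in read-only memory.
// Hypothetical helper name; requires Go 1.20 or newer.
func stringToBytesNoCopy(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// bytesToStringNoCopy returns a string that shares memory with b.
// b must not be modified for as long as the string is in use.
func bytesToStringNoCopy(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	b := stringToBytesNoCopy("hello")
	fmt.Println(len(b), string(b)) // read-only use of b

	s := bytesToStringNoCopy([]byte("world"))
	fmt.Println(s)
}
```

The read-only restriction is unchanged: these helpers only avoid depending on the header layout by hand.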

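    Memory allocation

    Estimate capacity up front

    • Give slices and maps an estimated size when creating them. This cuts the number of allocations, and the gain is usually obvious.
    • Avoid append calls that outgrow the capacity: each one allocates a new backing array and copies the existing elements, and the slice may escape to the heap (in the author's test on a Mac, a slice was allocated on the heap once its length after append exceeded 8).
    • If the size cannot be estimated, consider allocating a generously sized buffer and, where the scenario allows, reusing the slice.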
    func useCap1() {
       arr := make([]int, 0, 2048)
       for i := 0; i < 2048; i++ {
          arr = append(arr, i)
       }
    }
    
    func useCap2() {
       arr := make([]int, 2048)
       for i := 0; i < 2048; i++ {
           arr[i] = i
       }
    }
    
    func noCap() {
       var arr []int
       for i := 0; i < 2048; i++ {
          arr = append(arr, i)
       }
    }
    

    Benchmark

    goos: darwin
    goarch: amd64
    BenchmarkUseCap1-12       966577              1212 ns/op               0 B/op          0 allocs/op
    BenchmarkUseCap2-12      2398420               499 ns/op               0 B/op          0 allocs/op
    BenchmarkNoCap-12         192712              6016 ns/op           58616 B/op         14 allocs/op
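The third point in the list above, reusing a slice, is not covered by the benchmark functions. A minimal sketch of the idea follows, assuming a buffer that is filled repeatedly; fillSquares is a hypothetical name used only for illustration.

```go
package main

import "fmt"

// fillSquares resets buf to length 0 while keeping its capacity, then
// appends n values. While n <= cap(buf), append never reallocates.
func fillSquares(buf []int, n int) []int {
	buf = buf[:0]
	for i := 0; i < n; i++ {
		buf = append(buf, i*i)
	}
	return buf
}

func main() {
	buf := make([]int, 0, 2048) // allocated once, reused in every round
	for round := 0; round < 3; round++ {
		buf = fillSquares(buf, 1000)
		fmt.Println(len(buf), cap(buf))
	}
}
```

Re-slicing with buf[:0] keeps the backing array, so each round pays only for the writes and not for a new allocation.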
    

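    The main slice-growth code from the runtime is shown below. In the common case the capacity doubles while the old length is below 1024 and grows by 25% per step after that (this is the logic of the runtime version quoted here; Go 1.18 and later use a 256-element threshold and a smoother growth formula). This matches the benchmark above, where noCap() was allocated on the heap and grew 14 times.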

    newcap := old.cap
    doublecap := newcap + newcap
    if cap > doublecap {
       newcap = cap
    } else {
       if old.len < 1024 {
          newcap = doublecap
       } else {
          // Check 0 < newcap to detect overflow
          // and prevent an infinite loop.
          for 0 < newcap && newcap < cap {
             newcap += newcap / 4
          }
          // Set newcap to the requested cap when
          // the newcap calculation overflowed.
          if newcap <= 0 {
             newcap = cap
          }
       }
    }
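The growth behavior can also be observed from user code by watching cap() change across append calls. The sketch below does that; countGrowths is a hypothetical helper, and the exact count it reports depends on the Go version and on size-class rounding in the allocator, so no specific number is claimed here.

```go
package main

import "fmt"

// countGrowths appends n ints to a nil slice and returns how many times
// the capacity changed (each change means a new backing array was
// allocated), together with the final capacity.
func countGrowths(n int) (growths, finalCap int) {
	var arr []int
	for i := 0; i < n; i++ {
		before := cap(arr)
		arr = append(arr, i)
		if cap(arr) != before {
			growths++
		}
	}
	return growths, cap(arr)
}

func main() {
	growths, finalCap := countGrowths(2048)
	fmt.Printf("reallocations: %d, final cap: %d\n", growths, finalCap)
}
```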
    

    Prefer stack allocation

    func BenchmarkHeap(b *testing.B) {
       m := make([]*string, 1000)
       for i := 0; i < b.N; i++ {
          for i := 0; i < 1000; i++ {
             s := "test"
             m[i] = &s
          }
       }
    }
    
    func BenchmarkStack(b *testing.B) {
       m := make([]string, 1000)
       for i := 0; i < b.N; i++ {
          for i := 0; i < 1000; i++ {
             s := "test"
             m[i] = s
          }
       }
    }
    

    Benchmark

    goos: darwin
    goarch: amd64
    BenchmarkHeap-12           44640         23033 ns/op       16000 B/op       1000 allocs/op
    BenchmarkStack-12        4650966           252 ns/op           0 B/op          0 allocs/op
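Outside a benchmark, the same difference can be checked by counting allocations with testing.AllocsPerRun from the standard library. The sketch below is an illustration under assumptions: point, sink, valSink, heapAlloc, stackAlloc and measure are hypothetical names, and the package-level variables exist only to keep the compiler from optimizing the work away.

```go
package main

import (
	"fmt"
	"testing"
)

// point is a 32-byte value type with no pointers.
type point struct{ x, y, z, w int64 }

// sink holds a pointer at package level, which forces the pointed-to
// value to escape to the heap.
var sink *point

// valSink holds a copy, so the local value it is copied from can stay
// on the stack.
var valSink point

// heapAlloc stores the address of a local value where it outlives the call.
func heapAlloc(i int64) {
	p := point{i, i, i, i}
	sink = &p
}

// stackAlloc stores only a copy of the local value.
func stackAlloc(i int64) {
	p := point{i, i, i, i}
	valSink = p
}

// measure reports the average number of heap allocations per call.
func measure() (heap, stack float64) {
	heap = testing.AllocsPerRun(1000, func() { heapAlloc(1) })
	stack = testing.AllocsPerRun(1000, func() { stackAlloc(1) })
	return heap, stack
}

func main() {
	heap, stack := measure()
	fmt.Printf("heap allocs/op: %v, stack allocs/op: %v\n", heap, stack)
}
```

Building with go build -gcflags=-m prints the compiler's escape-analysis decisions, which is another way to see which values move to the heap.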
    

    Map/Slice

    Avoid pointers for simple values in a map

    map[int]*int

    func gcTime() time.Duration {
        start := time.Now()
        runtime.GC()
        return time.Since(start)
    }
    
    func f1() {
        s := make(map[int]int, 5e7)
        for i := 0; i < 5e7; i++ {
            s[i] = i
        }
        fmt.Printf("With %T, GC took %s\n", s, gcTime())
        _ = s[0]
    }
    
    func f2() {
        s := make(map[int]*int, 5e7)
        for i := 0; i < 5e7; i++ {
            s[i] = &i
        }
        fmt.Printf("With %T, GC took %s\n", s, gcTime())
        _=s[0]
    }
    

    Output:

    With map[int]int, GC took 31.956029ms
    With map[int]*int, GC took 184.174966ms
    

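    A map whose keys and values contain no pointers does not need scanobject during GC.
    In addition, according to the map implementation (search for bmap in the runtime source), values larger than 128 bytes are stored in the bucket indirectly, through a pointer, so such a map still has to be scanned.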

    type BigStruct struct {
       C01 int
       C02 int
       //...
       C16 int // 128 bytes: the GC scan threshold
       C17 int // 136 bytes
    }
    
    func f3() {
       s := make(map[int]BigStruct, N)
       for i := 0; i < N; i++ {
          s[i] = BigStruct{}
       }
       fmt.Printf("With %T, GC took %s\n", s, gcTime())
       _ = s[0]
    }
    

    Output:

    With map[int]main.BigStruct, GC took 1.628134832s
    With map[int]main.NoBigStruct, GC took 44.708865ms
    

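    BigStruct has just one more field, C17, than NoBigStruct (presumably the same struct ending at C16), yet the GC time increases dramatically.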
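To check which side of the 128-byte threshold a value type falls on, its size can be read with unsafe.Sizeof. The types below are hypothetical stand-ins for the article's structs; they use fixed-width int64 fields so the sizes do not depend on the platform, whereas the article's int fields are 8 bytes only on 64-bit targets.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Hypothetical stand-ins for NoBigStruct and BigStruct.
type noBigStruct struct{ c [16]int64 } // 16 * 8 = 128 bytes
type bigStruct struct{ c [17]int64 }   // 17 * 8 = 136 bytes

// inlineLimit is the 128-byte threshold described in the text above.
const inlineLimit = 128

// storedInline reports whether a value of the given size is at or
// under the threshold.
func storedInline(size uintptr) bool { return size <= inlineLimit }

// sizes returns the sizes of the two stand-in types.
func sizes() (small, big uintptr) {
	return unsafe.Sizeof(noBigStruct{}), unsafe.Sizeof(bigStruct{})
}

func main() {
	small, big := sizes()
	fmt.Println(small, storedInline(small))
	fmt.Println(big, storedInline(big))
}
```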

    Comparing []*int, []int, and []BigStruct

    func f4() {
        s := make([]*int, N)
        for i := 0; i < N; i++ {
            s[i] = &i
        }
        fmt.Printf("With %T, GC took %s\n", s, gcTime())
        _ = s[0]
    }
    
    func f5() {
        s := make([]int, N)
        for i := 0; i < N; i++ {
            s[i] = i
        }
        fmt.Printf("With %T, GC took %s\n", s, gcTime())
        _ = s[0]
    }
    
    func f6() {
        s := make([]BigStruct, N)
        for i := 0; i < N; i++ {
            s[i] = BigStruct{}
        }
        fmt.Printf("With %T, GC took %s\n", s, gcTime())
        _ = s[0]
    }
    

    Output:

    With []*int, GC took 137.308395ms
    With []int, GC took 211.862µs
    With []main.BigStruct, GC took 173.504µs
    
    

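    A slice that contains pointers likewise needs scanobject, but a pointer-free slice is unaffected by element size, and its GC cost is far lower than a map's.
    The optimizations above come with many restrictions. For example, map[int]string actually contains pointers (see the definition of string), so it cannot benefit from the cheaper GC, which makes the technique look impractical. A widely used optimization does build on it, though: convert a large map into a map[int]int (used as an index) plus a slice, which moves the GC pressure onto the slice (cheaper to collect than a map). bigcache is a typical example.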
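The index-plus-slice idea can be sketched as follows. This is a simplified illustration under assumptions, not bigcache's actual code or API: the store type, its methods, and the length-prefixed layout are all hypothetical, and overwritten entries are simply left behind in the buffer.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// store keeps all values in one pointer-free []byte and finds them
// through a pointer-free map[int]int of key -> offset.
type store struct {
	index map[int]int // key -> offset of the entry in data
	data  []byte      // entries: 4-byte little-endian length, then payload
}

func newStore() *store {
	return &store{index: make(map[int]int)}
}

// set appends the value to the buffer and records its offset.
// An overwritten entry stays in the buffer as garbage in this sketch.
func (s *store) set(key int, val string) {
	s.index[key] = len(s.data)
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(val)))
	s.data = append(s.data, hdr[:]...)
	s.data = append(s.data, val...)
}

// get copies the payload out of the buffer into a new string.
func (s *store) get(key int) (string, bool) {
	off, ok := s.index[key]
	if !ok {
		return "", false
	}
	n := int(binary.LittleEndian.Uint32(s.data[off : off+4]))
	return string(s.data[off+4 : off+4+n]), true
}

func main() {
	s := newStore()
	s.set(1, "alpha")
	s.set(2, "beta")
	v, ok := s.get(2)
	fmt.Println(v, ok)
}
```

The trade-off is an extra copy on every read and manual space management, in exchange for a map and a buffer that contain no pointers for the GC to follow.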


    Article link: https://www.haomeiwen.com/subject/reqvrltx.html