GO基础学习笔记（2）数组、切片和map

作者: 温岭夹糕 | 来源:发表于2022-07-29 18:42 被阅读0次

GO基础学习笔记（2）数组、切片和map
go语言数组、切片、映射
Go语言实战(三) - 内置容器
Go基础 - 4 数组,切片,集合,通道
golang系列教程
Go语言探索 - 10（原创）
go编程基础视频笔记-数组与切片map
<>
Go语言探索 - 11（原创）
go学习笔记(数组&切片)

数组

An array is a numbered sequence of elements of a single type, called the element type. The number of elements is called the length of the array and is never negative.

根据ref描述可知数组是单一元素的集合

var arr =[...]int{1,2,3}
var arr1 [3]

数组的比较也没啥好说的，元素类型和长度一样就可以用等价==对比（最经典的就是传参）
查看底层实现demo.go

var a= [...]int{0,1,2,3,4,5,6,7,8,9}
fmt.Println(a)

启动dlv

dlv debug main.go
b main.go:9  //var a的行数
c
disass  //查看底层汇编

image.png

不像string在runtime上有特殊的底层结构，实质上就是分配一块连续的内存，那么数组的大小就和类型和长度有关（复习：string的len会读取底层结构体的len字段）

slice

A slice is a descriptor for a contiguous segment of an underlying array and provides access to a numbered sequence of elements from that array. A slice type denotes the set of all slices of arrays of its element type. The number of elements is called the length of the slice and is never negative. The value of an uninitialized slice is nil.

切片是底层数组连续段的合集，长度是元素的数量，切片一旦被初始化，总是与其元素的底层数组相关联--即共享存储空间

var a =[...]int{0,1,2,3,4,5,6,7,8,9}
var b = a[:7]
b[0]=1
fmt.Println(a)
b[7] =2 //报错，超出范围
fmt.Println(a)

切片声明

var a []int{1,2,3}

image.png

看不太懂汇编，但与上图看起来很相似，猜到大概就算先建立数组再划分切片范围

make

另一种声明方式make

var a = make([]int,3)

image.png

我们看到make底层是使用了makeslice方法，具体方法自己看吧

r
b makeslice
c  //找到文件是在runtime.slice.go

makeslice，创建slice前要判断是否超过分配的最大内存（溢出检查），再调用mallocgc在堆上申请一片连续的内存

func makeslice(et *_type, len, cap int) unsafe.Pointer {
        mem, overflow := math.MulUintptr(et.size, uintptr(cap))
        if overflow || mem > maxAlloc || len < 0 || len > cap {
                mem, overflow := math.MulUintptr(et.size, uintptr(len))
                if overflow || mem > maxAlloc || len < 0 {
                        panicmakeslicelen()
                }
                panicmakeslicecap()
        }

        return mallocgc(mem, et, true)
}

既然没有关联数组，make直接当老大分配并初始化类型所需的内存和结构，并返回复合类型本身(slice是复合类型)的地址

image.png

make返回地址后，第8行的编译代码并没有结束，而是划分了地址空间
我们在makeslice所在的文件slice.go中找到runtime对slice的定义

type slice struct {
        array unsafe.Pointer
        len   int
        cap   int
}

这个时候我们不就恍然大悟了，知道划分这三个地址的原因了，slice在运行时是一个有三个成员的结构体（切片a是数组a的切片，切片b是切片a的切片，切片b的底层数组是切片a的底层数组），那么len和cap就能直接找到

image.png

这三行汇编代码，第一行是var b=len(a)的汇编，二三是cap的汇编，基本地址的读取范围就算make出的范围
ref对make的描述

Call             Core type    Result

make(T, n)       slice        slice of type T with length n and capacity n
make(T, n, m)    slice        slice of type T with length n and capacity m

make(T)          map          map of type T
make(T, n)       map          map of type T with initial space for approximately n elements

make(T)          channel      unbuffered channel of type T
make(T, n)       channel      buffered channel of type T, buffer size n

make适用于三种类型，我们根据实验可以猜到不同的类型调用底层不同的函数

new

官网还有种产生slice的方法就是new

new([10]int)[0:5]

image.png

底层调用newobect，直接是一个分配内存函数

func newobject(typ *_type) unsafe.Pointer {
        return mallocgc(typ.size, typ, true)
}

我们修改下代码对比下

new([10]int)

打印一下发现打印的是一个内存地址，确实没毛病啊，mallocgc是返回一个地址指针，我们后续需要自己用切片的形式[:5]来初始化这个切片！而make是在mallocgc基础上帮我们多包装了一步切片的初始化
官方文档对new的描述

// The new built-in function allocates memory. The first argument is a type,
// not a value, and the value returned is a pointer to a newly
// allocated zero value of that type.
func new(Type) *Type

他是一个分配内存的内置函数，返回该类型新分配的零指针

扩容

a=append(a,11)

根据汇编找到底层实现函数是slice.go/growslice，贴一下伪代码

func growslice(et *_type, old slice, cap int) slice {
#一些检查
//先尝试两倍扩容
//如果需要的容量超过原来切片的两倍，直接使用需要的容量作为新容量，否则，当原切片小于256，直接翻倍，大于256就不停增加(需要的容量+3*256)/4直到容量满足(好多资料都是基于1024，但我这1.18源代码是256)
        newcap := old.cap
        doublecap := newcap + newcap
        if cap > doublecap {
                newcap = cap
        } else {
                const threshold = 256
                if old.cap < threshold {
                        newcap = doublecap
                } else {
                        // Check 0 < newcap to detect overflow
                        // and prevent an infinite loop.
                        for 0 < newcap && newcap < cap {
                                // Transition from growing 2x for small slices
                                // to growing 1.25x for large slices. This formula
                                // gives a smooth-ish transition between the two.
                                newcap += (newcap + 3*threshold) / 4
                        }
                        // Set newcap to the requested cap when
                        // the newcap calculation overflowed.
                        if newcap <= 0 {
                                newcap = cap
                        }
                }
        }

  #计算内存所需代码
  #内存分配代码
//这个结构体上文见过
return slice{p, old.len, newcap}

小结一下扩容算法:如果需要的容量超过原来切片的两倍，直接使用需要的容量作为新容量，否则，当原切片小于256，直接翻倍，大于256就不停增加(需要的容量+3*256)/4直到容量满足
注意扩容时对slice的array成员也进行了重新赋值，他是动态分配新数组，新数组长度按一定规律扩容(上面算法)，把旧数组拷贝到新数组，之后新数组成了底层数组，旧数组被垃圾回收

func main(){
        var a  =[...]int{1}
        var b=a[:]
        b=append(b,2)
        fmt.Println(a,b)

}
// [1] [1,2]

一旦扩容，切片和原数组就会解除绑定，后续修改不会反应到原数组上

map

func main(){
        var a map[string]int
        fmt.Println(a,unsafe.Sizeof(a))

}
// [] 8

我们看到大小输出是8，哪些数据结构大小是8？int ?指针?底层跟这两八九不离十，dlv看一看很遗憾就是一块内存地址段mov qword ptr [rsp+0x18], 0x0，那我们根据前面的经验,string.go中有string结构体，slice.go有slice结构体，那map呢，看了文件我们发现有一个hmap结构体，是map的头部类型结构

// A header for a Go map.
type hmap struct {
        count     int // # live cells == size of map.  Must be first (used by len() builtin)
        flags     uint8
        B         uint8  // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
        noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
        hash0     uint32 // hash seed

        buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
        oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
        nevacuate  uintptr        // progress counter for evacuation (buckets less than this have been evacuated)

        extra *mapextra // optional fields
}

image.png

有经验的同学一眼就看出来了，这不就跟哈希链表很像吗（redis不就这样存键值对吗，利用哈希存键，哈希冲突时，拉出一个哈希链表/哈希桶），挑几个重要的解释

count是当前map中元素的个数（len会直接找这个）
B。B=log2(bucket) 或者说B^2 =buckets
overflow是溢出桶的最大数量
hash0哈希种子
buckets指向哈希桶的指针
oldbuckets是在map扩容前指向的前一个buckets指针
nevacuate是扩容进度计数器
extra与overflow有关

真正的键值对数据存储在buckets指针指向的哈希桶中，默认为8，在全局中有定义

image.png

1右移三位1000b = 8d，哈希桶填满了，且map没到扩容条件时会建立overflow，overflow挂在bucket末尾即overflow头指针指向bucket末尾指针
bucket长啥样

// A bucket for a Go map.
type bmap struct {
        // tophash generally contains the top byte of the hash value
        // for each key in this bucket. If tophash[0] < minTopHash,
        // tophash[0] is a bucket evacuation state instead.
        tophash [bucketCnt]uint8
        // Followed by bucketCnt keys and then bucketCnt elems.
        // NOTE: packing all the keys together and then all the elems together makes the
        // code a bit more complicated than alternating key/elem/key/elem/... but it allows
        // us to eliminate padding which would be needed for, e.g., map[int64]int8.
        // Followed by an overflow pointer.
}

就一个长度为8的数组字段，不再多了，当map插入数据，先用哈希函数对key做运算获得hashcode，低位用于选定bucket，高位用于在bmp中确定key存的值的位置
map扩容再看看？

我们直到map不能使用cap函数，意味着它没有cap，那何来的扩容？我们再看看make一个map，底层使用makemap方法

image.png
确实也没有cap，传入的第三个值是hmp
其实map会对底层内存自动管理，当插入元素超出一定个数后自动扩容，在mapassign函数中有这几行代码

        if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
                hashGrow(t, h)
                goto again // Growing the table invalidates everything, so try again
        }

overLoadFactor会检查负载因子LoadFactor,当count>loadfactor*2^B或over flow bucket溢出桶太多时自动扩容，again是该函数中的一段标志，也就是说如果中途发现还不够还要继续扩容（建立一个更大规模的数组bmp，但是真正把数组的旧数据踢到新桶中的时候是操作该数据的时候，oldbucket指向旧桶，直到数据迁移完再释放旧桶）。

map的并发问题

package main
func main() {
    m := make(map[int]int)
    go func() {
        for {
            _ = m[1]
        }
    }()
    go func() {
        for {
            m[2] = 2
        }
    }()
    select {}
}

冲突检测(吐槽下我这服务器1CPU1核只有一个本地队列，运行根本不出现数据竞争)

go run -race main.go

image.png

并发读写时是有冲突的，它本身并没有实现锁，所以我们可以自定义带锁的map结构体，或使用sync.Map(自带成员 mutex锁)

参考

1.append
2.ref
3.深度解密slice
4.深入了解slice
5.一文解决map并发问题
6.gomap扩容机制

GO基础学习笔记（2）数组、切片和map
数组 An array is a numbered sequence of elements of a singl...
go语言数组、切片、映射
go的一些语法有点晦涩，这些很基础，做一下笔记数组数组声明数组声明初始化切片切片的声明切片的追加切片...
Go语言实战(三) - 内置容器
本节我们来学习数组，切片，Map和字符串。在Go语言中，我们一般不直接使用数组，而是使用切片来管理线性表结构，它的...
Go基础 - 4 数组,切片,集合,通道
数组 array 切片 slice 集合 map Map的迭代顺序是随机的通道 channel 相关阅读学习笔记
golang系列教程
Go包管理 Go开发工具 Go Doc 文档 Go 数组 Go 切片 Go Map Go 类型 Go 函数方法 G...
Go语言探索 - 10（原创）
上一篇文章主要学习了Go语言的结构体以及结构体指针，本篇文章主要学习Go语言的切片以及Map。 Go语言数组的长度...
go编程基础视频笔记-数组与切片map
数组Array 定义数组的格式：var [n]，n>=0数组长度也是类型的一部分，因此具有不同长度...
<>
数组和切片切片和数组的区别数组是属于值类型。结构体，基础类型数据也属于值类型。注意GO语言对于“传值和传引用...
Go语言探索 - 11（原创）
Go语言基础系列博客用到的所有示例代码上一篇文章主要学习了Go语言的切片以及Map。本篇文章主要学习的是Go语言...
go学习笔记(数组&切片)
数组数组是存储在一段连续内存中的固定长度的数据类型。数组中的数据类型是一致的可以是内置的类型，也可以是自定义的数...