1.数组
数组是切片和映射的基础数据结构。
数组是一个长度固定的数据类型,用于存储一段具有相同类型元素的连续块。
1.1 数组赋值是深拷贝
在 Go 中,与 C 数组变量隐式作为指针使用不同,Go 数组是值类型,赋值和函数传参操作都会复制整个数组数据。
var arr1 [3]string
arr2 := [3]string{"red", "yellow", "green"}
arr1 = arr2 //数组拷贝(深拷贝)
fmt.Println(arr1)
fmt.Println(arr2)
arr1[0] = "black"
fmt.Println(arr1)
fmt.Println(arr2)
结果:
[red yellow green]
[red yellow green]
--------
[black yellow green]
[red yellow green]
1.2 数组传参最好用指针
var arr [1e6]int
func foo1(arr [1e6]int) { //每次拷贝整个数组
...
}
func foo2(arr *[1e6]int) { //传递指针,效率更高
...
}
测试函数
func testCopy() {
arrayA := [2]int{100, 200}
var arrayB [2]int
arrayB = arrayA
fmt.Printf("arrayA : %p , %v\n", &arrayA, arrayA)
fmt.Printf("arrayB : %p , %v\n", &arrayB, arrayB)
testArray(arrayA)
}
func testArray(x [2]int) {
fmt.Printf("func Array : %p , %v\n", &x, x)
}
结果:
arrayA : 0xc000084000 , [100 200]
arrayB : 0xc000084010 , [100 200]
func Array : 0xc0000180e0 , [100 200]
2.切片
切片是引用传递,所以它们不需要使用额外的内存并且比使用数组更有效率。
切片本身并不是动态数组或者数组指针。它内部实现的数据结构通过指针引用底层数组,设定相关属性将数据读写操作限定在指定的区域内。切片本身是一个只读对象,其工作机制类似数组指针的一种封装。
切片(slice)是对数组一个连续片段的引用,所以切片是一个引用类型(因此更类似于 C++ 中的 Vector 类型,或者 Python 中的 list 类型)。这个片段可以是整个数组,或者是由起始和终止索引标识的一些项的子集。需要注意的是,终止索引标识的项不包括在切片内。切片提供了一个与指向数组的动态窗口。
给定项的切片索引可能比相关数组的相同元素的索引小。和数组不同的是,切片的长度可以在运行时修改,最小为 0 ,最大为相关数组的长度:切片是一个长度可变的数组。
2.1 数据结构
src/runtime/slice.go:
type slice struct {
array unsafe.Pointer
len int
cap int
}
切片的结构体由3部分构成,Pointer 是指向一个数组的指针,len 代表当前切片的长度,cap 是当前切片的容量。cap 总是大于等于 len 的。
2.2 创建切片
src/runtime/slice.go:
func makeslice(et *_type, len, cap int) unsafe.Pointer {
mem, overflow := math.MulUintptr(et.size, uintptr(cap))
if overflow || mem > maxAlloc || len < 0 || len > cap {
// NOTE: Produce a 'len out of range' error instead of a
// 'cap out of range' error when someone does make([]T, bignumber).
// 'cap out of range' is true too, but since the cap is only being
// supplied implicitly, saying len is clearer.
// See golang.org/issue/4085.
mem, overflow := math.MulUintptr(et.size, uintptr(len))
if overflow || mem > maxAlloc || len < 0 {
panicmakeslicelen()
}
panicmakeslicecap()
}
return mallocgc(mem, et, true)
}
上图是用 make 函数创建的一个 len = 4, cap = 6 的切片。内存空间申请了6个 int 类型的内存大小。由于 len = 4,所以后面2个暂时访问不到,但是容量还是在的。这时候数组里面每个变量都是0 。
这里是用字面量创建的一个 len = 6,cap = 6 的切片,这时候数组里面每个元素的值都初始化完成了。需要注意的是 [ ] 里面不要写数组的容量,因为如果写了个数以后就是数组了,而不是切片了。
2.2.1 nil切片
nil切片:var slice []int
nil 切片被用在很多标准库和内置函数中,描述一个不存在的切片的时候,就需要用到 nil 切片。比如函数在发生异常的时候,返回的切片就是 nil 切片。nil 切片的指针指向 nil。
2.2.2 空切片
空切片一般会用来表示一个空的集合。比如数据库查询,一条结果也没有查到,那么就可以返回一个空切片。
silce := make( []int , 0 )
slice := []int{ }
空切片和 nil 切片的区别在于,空切片指向的地址不是nil,指向的是一个内存地址,但是它没有分配任何内存空间,即底层元素包含0个元素。
2.3 切片扩容
详细内容请参考:https://segmentfault.com/a/1190000040413412
func growslice(oldPtr unsafe.Pointer, newLen, oldCap, num int, et *_type) slice {
oldLen := newLen - num
if raceenabled {
callerpc := getcallerpc()
racereadrangepc(oldPtr, uintptr(oldLen*int(et.size)), callerpc, abi.FuncPCABIInternal(growslice))
}
if msanenabled {
msanread(oldPtr, uintptr(oldLen*int(et.size)))
}
if asanenabled {
asanread(oldPtr, uintptr(oldLen*int(et.size)))
}
if newLen < 0 {
panic(errorString("growslice: len out of range"))
}
if et.size == 0 {
// append should not create a slice with nil pointer but non-zero len.
// We assume that append doesn't need to preserve oldPtr in this case.
return slice{unsafe.Pointer(&zerobase), newLen, newLen}
}
newcap := oldCap
doublecap := newcap + newcap
if newLen > doublecap {
newcap = newLen
} else {
const threshold = 256
if oldCap < threshold {
newcap = doublecap
} else {
// Check 0 < newcap to detect overflow
// and prevent an infinite loop.
for 0 < newcap && newcap < newLen {
// Transition from growing 2x for small slices
// to growing 1.25x for large slices. This formula
// gives a smooth-ish transition between the two.
newcap += (newcap + 3*threshold) / 4
}
// Set newcap to the requested cap when
// the newcap calculation overflowed.
if newcap <= 0 {
newcap = newLen
}
}
}
var overflow bool
var lenmem, newlenmem, capmem uintptr
// Specialize for common values of et.size.
// For 1 we don't need any division/multiplication.
// For goarch.PtrSize, compiler will optimize division/multiplication into a shift by a constant.
// For powers of 2, use a variable shift.
switch {
case et.size == 1:
lenmem = uintptr(oldLen)
newlenmem = uintptr(newLen)
capmem = roundupsize(uintptr(newcap))
overflow = uintptr(newcap) > maxAlloc
newcap = int(capmem)
case et.size == goarch.PtrSize:
lenmem = uintptr(oldLen) * goarch.PtrSize
newlenmem = uintptr(newLen) * goarch.PtrSize
capmem = roundupsize(uintptr(newcap) * goarch.PtrSize)
overflow = uintptr(newcap) > maxAlloc/goarch.PtrSize
newcap = int(capmem / goarch.PtrSize)
case isPowerOfTwo(et.size):
var shift uintptr
if goarch.PtrSize == 8 {
// Mask shift for better code generation.
shift = uintptr(sys.TrailingZeros64(uint64(et.size))) & 63
} else {
shift = uintptr(sys.TrailingZeros32(uint32(et.size))) & 31
}
lenmem = uintptr(oldLen) << shift
newlenmem = uintptr(newLen) << shift
capmem = roundupsize(uintptr(newcap) << shift)
overflow = uintptr(newcap) > (maxAlloc >> shift)
newcap = int(capmem >> shift)
capmem = uintptr(newcap) << shift
default:
lenmem = uintptr(oldLen) * et.size
newlenmem = uintptr(newLen) * et.size
capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
capmem = roundupsize(capmem)
newcap = int(capmem / et.size)
capmem = uintptr(newcap) * et.size
}
// The check of overflow in addition to capmem > maxAlloc is needed
// to prevent an overflow which can be used to trigger a segfault
// on 32bit architectures with this example program:
//
// type T [1<<27 + 1]int64
//
// var d T
// var s []T
//
// func main() {
// s = append(s, d, d, d, d)
// print(len(s), "\n")
// }
if overflow || capmem > maxAlloc {
panic(errorString("growslice: len out of range"))
}
var p unsafe.Pointer
if et.ptrdata == 0 {
p = mallocgc(capmem, nil, false)
// The append() that calls growslice is going to overwrite from oldLen to newLen.
// Only clear the part that will not be overwritten.
// The reflect_growslice() that calls growslice will manually clear
// the region not cleared here.
memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
} else {
// Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory.
p = mallocgc(capmem, et, true)
if lenmem > 0 && writeBarrier.enabled {
// Only shade the pointers in oldPtr since we know the destination slice p
// only contains nil pointers because it has been cleared during alloc.
bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(oldPtr), lenmem-et.size+et.ptrdata)
}
}
memmove(p, oldPtr, lenmem)
return slice{p, newLen, newcap}
}
2.3.1 生成新数组
因为原来数组的容量已经达到了最大值,再想扩容, Go 默认会先开一片内存区域,把原来的值拷贝过来,然后再执行 append() 操作。这种情况丝毫不影响原数组。
测试扩容:
func testGrow() {
slice := []int{10, 20, 30, 40}
newSlice := append(slice, 50)
fmt.Printf("Before slice = %v, Pointer = %p, len = %d, cap = %d\n", slice, &slice, len(slice), cap(slice))
fmt.Printf("Before newSlice = %v, Pointer = %p, len = %d, cap = %d\n", newSlice, &newSlice, len(newSlice), cap(newSlice))
newSlice[1] += 10
fmt.Printf("After slice = %v, Pointer = %p, len = %d, cap = %d\n", slice, &slice, len(slice), cap(slice))
fmt.Printf("After newSlice = %v, Pointer = %p, len = %d, cap = %d\n", newSlice, &newSlice, len(newSlice), cap(newSlice))
}
结果:
Before slice = [10 20 30 40], Pointer = 0xc000008108, len = 4, cap = 4
Before newSlice = [10 20 30 40 50], Pointer = 0xc000008120, len = 5, cap = 8
After slice = [10 20 30 40], Pointer = 0xc000008108, len = 4, cap = 4
After newSlice = [10 30 30 40 50], Pointer = 0xc000008120, len = 5, cap = 8
Go 中切片扩容的策略是这样的:
- 首先判断,如果新申请容量(cap)大于2倍的旧容量(old.cap),最终容量(newcap)就是新申请的容量(cap)
- 否则判断,如果旧切片的长度小于1024,则最终容量(newcap)就是旧容量(old.cap)的两倍,即(newcap=doublecap)
- 否则判断,如果旧切片长度大于等于1024,则最终容量(newcap)从旧容量(old.cap)开始循环增加原来的 1/4,即(newcap=old.cap,for {newcap += newcap/4})直到最终容量(newcap)大于等于新申请的容量(cap),即(newcap >= cap)
- 如果最终容量(cap)计算值溢出,则最终容量(cap)就是新申请容量(cap)
总结:
- 如果期望容量大于当前容量的两倍就会使用期望容量;
- 如果当前切片的长度小于 1024 就会将容量翻倍;
- 如果当前切片的长度大于 1024 就会每次增加 25% 的容量,直到新容量大于期望容量;
- 这里只是确定切片的大致容量,接下来还需要根据切片中元素的大小对它们进行对齐
2.3.2 复用老数组,特别注意容易产生bug
func testSameArray() {
array := [4]int{10, 20, 30, 40}
slice := array[0:2]
newSlice := append(slice, 50)
fmt.Printf("Before slice = %v, Pointer = %p, len = %d, cap = %d\n", slice, &slice, len(slice), cap(slice))
fmt.Printf("Before newSlice = %v, Pointer = %p, len = %d, cap = %d\n", newSlice, &newSlice, len(newSlice), cap(newSlice))
newSlice[1] += 10
fmt.Printf("After slice = %v, Pointer = %p, len = %d, cap = %d\n", slice, &slice, len(slice), cap(slice))
fmt.Printf("After newSlice = %v, Pointer = %p, len = %d, cap = %d\n", newSlice, &newSlice, len(newSlice), cap(newSlice))
fmt.Printf("After array = %v\n", array)
}
结果
Before slice = [10 20], Pointer = 0xc00008e180, len = 2, cap = 4
Before newSlice = [10 20 50], Pointer = 0xc00008e198, len = 3, cap = 4
After slice = [10 30], Pointer = 0xc00008e180, len = 2, cap = 4
After newSlice = [10 30 50], Pointer = 0xc00008e198, len = 3, cap = 4
After array = [10 30 50 40]
在这种情况下,扩容以后并没有新建一个新的数组,扩容前后的数组都是同一个,这也就导致了新的切片修改了一个值,也影响到了老的切片了。并且 append() 操作也改变了原来数组里面的值。一个 append() 操作影响了这么多地方,如果原数组上有多个切片,那么这些切片都会被影响!无意间就产生了莫名的 bug!
这种情况,由于原数组还有容量可以扩容,所以执行 append() 操作以后,会在原数组上直接操作,所以这种情况下,扩容以后的数组还是指向原来的数组。
2.3.3 对比示例
func testGrow1() {
slice := []int{1, 2, 3, 4, 5}
newSlice := slice[1:3] //cap(newSlice) : 4
fmt.Printf("Before slice = %v, Pointer = %p, len = %d, cap = %d\n", slice, &slice, len(slice), cap(slice))
fmt.Printf("Before newSlice = %v, Pointer = %p, len = %d, cap = %d\n", newSlice, &newSlice, len(newSlice), cap(newSlice))
newSlice = append(newSlice, 60) // slice当前结果为{1, 2, 3, 60, 5}
//newSlice的append操作影响了slice的结果,可以尝试 newSlice := slice[1:3:3]再看看结果
fmt.Printf("After slice = %v, Pointer = %p, len = %d, cap = %d\n", slice, &slice, len(slice), cap(slice))
fmt.Printf("After newSlice = %v, Pointer = %p, len = %d, cap = %d\n", newSlice, &newSlice, len(newSlice), cap(newSlice))
}
func testGrow2() {
slice := []int{1, 2, 3, 4, 5}
newSlice := slice[1:3:3] //cap(newSlice) : 4
fmt.Printf("Before slice = %v, Pointer = %p, len = %d, cap = %d\n", slice, &slice, len(slice), cap(slice))
fmt.Printf("Before newSlice = %v, Pointer = %p, len = %d, cap = %d\n", newSlice, &newSlice, len(newSlice), cap(newSlice))
newSlice = append(newSlice, 60) // slice当前结果为{1, 2, 3, 60, 5}
fmt.Printf("After slice = %v, Pointer = %p, len = %d, cap = %d\n", slice, &slice, len(slice), cap(slice))
fmt.Printf("After newSlice = %v, Pointer = %p, len = %d, cap = %d\n", newSlice, &newSlice, len(newSlice), cap(newSlice))
}
结果:
Before slice = [1 2 3 4 5], Pointer = 0xc00010e180, len = 5, cap = 5
Before newSlice = [2 3], Pointer = 0xc00010e198, len = 2, cap = 4
After slice = [1 2 3 60 5], Pointer = 0xc00010e180, len = 5, cap = 5
After newSlice = [2 3 60], Pointer = 0xc00010e198, len = 3, cap = 4
----------
Before slice = [1 2 3 4 5], Pointer = 0xc00010e210, len = 5, cap = 5
Before newSlice = [2 3], Pointer = 0xc00010e228, len = 2, cap = 2
After slice = [1 2 3 4 5], Pointer = 0xc00010e210, len = 5, cap = 5
After newSlice = [2 3 60], Pointer = 0xc00010e228, len = 3, cap = 4
2.4 切片复制
func slicecopy(toPtr unsafe.Pointer, toLen int, fromPtr unsafe.Pointer, fromLen int, width uintptr) int {
if fromLen == 0 || toLen == 0 {
return 0
}
n := fromLen
if toLen < n {
n = toLen
}
if width == 0 {
return n
}
size := uintptr(n) * width
if raceenabled {
callerpc := getcallerpc()
pc := abi.FuncPCABIInternal(slicecopy)
racereadrangepc(fromPtr, size, callerpc, pc)
racewriterangepc(toPtr, size, callerpc, pc)
}
if msanenabled {
msanread(fromPtr, size)
msanwrite(toPtr, size)
}
if asanenabled {
asanread(fromPtr, size)
asanwrite(toPtr, size)
}
if size == 1 { // common case worth about 2x to do here
// TODO: is this still worth it with new memmove impl?
*(*byte)(toPtr) = *(*byte)(fromPtr) // known to be a byte pointer
} else {
memmove(toPtr, fromPtr, size)
}
return n
}
在这个方法中,slicecopy 方法会把源切片值(即 fm Slice )中的元素复制到目标切片(即 to Slice )中,并返回被复制的元素个数,copy 的两个类型必须一致。slicecopy 方法最终的复制结果取决于较短的那个切片,当较短的切片复制完成,整个复制过程就全部完成了。
网友评论