美文网首页设计模式
手撸golang 基本数据结构与算法 快速排序

手撸golang 基本数据结构与算法 快速排序

作者: 老罗话编程 | 来源:发表于2021-02-26 06:59 被阅读0次

    缘起

    最近阅读<<我的第一本算法书>>(【日】石田保辉;宫崎修一)
    本系列笔记拟采用golang练习之

    快速排序(Quick Sort)

    快速排序算法首先会在序列中随机选择一个基准值(pivot),
    然后将除了基准值以外的数, 
    分为“比基准值小的数”和“比基准值大的数”这两个类别,
    再将其排列成以下形式:
    [ 比基准值小的数 ] 基准值 [ 比基准值大的数 ]
    
    接着,对两个“[]”中的数据进行排序之后,
    整体的排序便完成了。
    
    对“[]”里面的数据进行排序时同样也会使用快速排序。
    
    快速排序是一种“分治法”。
    它将原本的问题分成两个子问题(比基准值小的数和比基准值大的数),
    然后再分别解决这两个问题(递归地)。
    
    平均运行时间为O(nlogn)
    
    摘自 <<我的第一本算法书>> 【日】石田保辉;宫崎修一
    

    流程(非递归, 原地快速排序)

    1. 给定待排序数组data[N]
    2. 定义待排序栈stack, 其中元素是一个(left, right)整型坐标, 表示待排序子序列的范围
    3. 初始时, 将(0, N-1)压入stack, 表示需要将整个序列进行排序
    4. 当stack不为空时, 循环执行:
      1. 待排序子序列出栈: left, right = stack.pop()
      2. 取基准值v = data[left], 然后data[left]置为nil, 腾出一个格子备用
      3. 取左指针l = left, 右指针r = right, 当前指针标识(左/右)rside = true
      4. 如果rside == true, 将右指针r向左移动, 直到: data[r] < v, 或r=l
      5. 如果找到data[r] < v, 则把data[r]置入data[l]指向的空位, data[r]设nil, 腾出一个格子
      6. 如果rside == false, 将左指针l向右移动, 直到: data[l] > v, 或l=r
      7. 如果找到data[l] > v, 则把data[l]置入data[r]指向的空位, data[l]设nil, 腾出一个格子
      8. 如果l == r, 左右序列切分完成, 将基准值v置入data[l], 返回
      9. 循环执行步骤4-8
    5. stack为空, 排序完成

    为什么要非递归

    • 极端情况下(比如特别大的数组, 刚好已经是倒序排列, 而每次取基准值是取left位置), 递归算法可能导致栈嵌套过深, 一个是占用大量内存, 二个是可能导致栈溢出错误.
    • 快速排序需要左右子序列的中间结果, 再进行合并, 因此无法通过尾递归优化消除栈嵌套

    目标

    • 实现并验证快速排序
    • 使用辅助的子序列坐标栈, 实现非递归执行
    • 原地排序, 不占用额外空间

    设计

    • ISorter: 定义排序器接口. 定义值比较函数以兼容任意数值类型, 通过调整比较函数实现倒序排序
    • tQStack: 实现一个堆栈, 辅助快速排序时, 记录待排序的子序列坐标
    • tQuickSort: 非递归的原地快速排序器, 实现ISorter接口, 使用辅助栈消除递归

    单元测试

    quick_sort_test.go, 测试过程与堆排序, 归并排序类似, 样本规模为10万元素

    package sorting
    
    import (
        "fmt"
        "learning/gooop/sorting"
        "learning/gooop/sorting/quick_sort"
        "math/rand"
        "testing"
        "time"
    )
    
    func Test_QuickSort(t *testing.T) {
        fnAssertTrue := func(b bool, msg string) {
            if !b {
                t.Fatal(msg)
            }
        }
    
        reversed := false
        fnCompare := func(a interface{}, b interface{}) sorting.CompareResult {
            i1 := a.(int)
            i2 := b.(int)
    
            if i1 < i2 {
                if reversed {
                    return sorting.GREATER
                } else {
                    return sorting.LESS
                }
            } else if i1 == i2 {
                return sorting.EQUAL
            } else {
                if reversed {
                    return sorting.LESS
                } else {
                    return sorting.GREATER
                }
            }
        }
    
        fnTestSorter := func(sorter sorting.ISorter) {
            reversed = false
    
            // test simple array
            samples := []interface{} { 2,3,1,5,4,7,6 }
            samples = sorter.Sort(samples, fnCompare)
            fnAssertTrue(fmt.Sprintf("%v", samples) == "[1 2 3 4 5 6 7]",  "expecting 1,2,3,4,5,6,7")
            t.Log("pass sorting [2 3 1 5 4 7 6] >> [1 2 3 4 5 6 7]")
    
            // test 10000 items sorting
            rnd := rand.New(rand.NewSource(time.Now().UnixNano()))
            for plus := 0;plus < 5;plus++ {
                sampleCount := 100 * 1000 + plus
                t.Logf("prepare large array with %v items", sampleCount)
                samples = make([]interface{}, sampleCount)
                for i := 0; i < sampleCount; i++ {
                    samples[i] = rnd.Intn(sampleCount * 10)
                }
    
                t.Logf("sorting large array with %v items", sampleCount)
                t0 := time.Now().UnixNano()
                samples = sorter.Sort(samples, fnCompare)
                cost := time.Now().UnixNano() - t0
                for i := 1; i < sampleCount; i++ {
                    fnAssertTrue(fnCompare(samples[i-1], samples[i]) != sorting.GREATER, "expecting <=")
                }
                t.Logf("end sorting large array, cost = %v ms", cost/1000000)
            }
    
            // test 0-20
            sampleCount := 20
            t.Log("sorting 0-20")
            samples = make([]interface{}, sampleCount)
            for i := 0;i < sampleCount;i++ {
                for {
                    p := rnd.Intn(sampleCount)
                    if samples[p] == nil {
                        samples[p] = i
                        break
                    }
                }
            }
            t.Logf("unsort = %v", samples)
    
            samples = sorter.Sort(samples, fnCompare)
            t.Logf("sorted = %v", samples)
            fnAssertTrue(fmt.Sprintf("%v", samples) == "[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]", "expecting 0-20")
            t.Log("pass sorting 0-20")
    
            // test special
            samples = []interface{} {}
            samples = sorter.Sort(samples, fnCompare)
            t.Log("pass sorting []")
    
            samples = []interface{} { 1 }
            samples = sorter.Sort(samples, fnCompare)
            t.Log("pass sorting [1]")
    
            samples = []interface{} { 3,1 }
            samples = sorter.Sort(samples, fnCompare)
            fnAssertTrue(fmt.Sprintf("%v", samples) == "[1 3]",  "expecting 1,3")
            t.Log("pass sorting [1 3]")
    
            reversed = true
            samples = []interface{} { 2, 3,1 }
            samples = sorter.Sort(samples, fnCompare)
            fnAssertTrue(fmt.Sprintf("%v", samples) == "[3 2 1]",  "expecting 3,2,1")
            t.Log("pass sorting [3 2 1]")
        }
    
        t.Log("\ntesting QuickSorter")
        fnTestSorter(quick_sort.QuickSorter)
    }
    

    测试输出

    • 快速排序真的很快, 与堆排序,归并排序是一个数量级, 10万随机元素排序耗时仅数十毫秒
    • 对随机数据的排序比归并排序还稍快一些, 这可能是因为原地排序不需要预分配缓冲区
    $ go test -v quick_sort_test.go 
    === RUN   Test_QuickSort
        quick_sort_test.go:111: 
            testing QuickSorter
        quick_sort_test.go:48: pass sorting [2 3 1 5 4 7 6] >> [1 2 3 4 5 6 7]
        quick_sort_test.go:54: prepare large array with 100000 items
        quick_sort_test.go:60: sorting large array with 100000 items
        quick_sort_test.go:67: end sorting large array, cost = 27 ms
        quick_sort_test.go:54: prepare large array with 100001 items
        quick_sort_test.go:60: sorting large array with 100001 items
        quick_sort_test.go:67: end sorting large array, cost = 28 ms
        quick_sort_test.go:54: prepare large array with 100002 items
        quick_sort_test.go:60: sorting large array with 100002 items
        quick_sort_test.go:67: end sorting large array, cost = 33 ms
        quick_sort_test.go:54: prepare large array with 100003 items
        quick_sort_test.go:60: sorting large array with 100003 items
        quick_sort_test.go:67: end sorting large array, cost = 32 ms
        quick_sort_test.go:54: prepare large array with 100004 items
        quick_sort_test.go:60: sorting large array with 100004 items
        quick_sort_test.go:67: end sorting large array, cost = 27 ms
        quick_sort_test.go:72: sorting 0-20
        quick_sort_test.go:83: unsort = [11 3 4 2 9 19 18 7 12 6 13 5 10 0 15 14 17 1 8 16]
        quick_sort_test.go:86: sorted = [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
        quick_sort_test.go:88: pass sorting 0-20
        quick_sort_test.go:93: pass sorting []
        quick_sort_test.go:97: pass sorting [1]
        quick_sort_test.go:102: pass sorting [1 3]
        quick_sort_test.go:108: pass sorting [3 2 1]
    --- PASS: Test_QuickSort (0.18s)
    PASS
    ok      command-line-arguments  0.184s
    

    ISorter.go

    定义排序器接口. 定义值比较函数以兼容任意数值类型, 通过调整比较函数实现倒序排序

    package sorting
    
    type ISorter interface {
        Sort(data []interface{}, comparator CompareFunction) []interface{}
    }
    
    type CompareFunction func(a interface{}, b interface{}) CompareResult
    
    type CompareResult int
    const LESS CompareResult = -1
    const EQUAL CompareResult = 0
    const GREATER CompareResult = 1
    

    tQStack.go

    实现一个堆栈, 辅助快速排序时, 记录待排序的子序列坐标

    package quick_sort
    
    type tQStack struct {
        stack []tIntPair
        capacity int
        size int
    }
    
    type tIntPair [2]int
    var gEmptyPair = [2]int{ -1, -1 }
    
    func newQStack() *tQStack {
        return &tQStack{
            stack: make([]tIntPair, 0),
            capacity: 0,
            size: 0,
        }
    }
    
    func (me *tQStack) push(left,right int) {
        node := tIntPair([2]int{left,right})
        if me.size < me.capacity {
            me.stack[me.size] = node
        } else {
            me.stack = append(me.stack, node)
            me.capacity++
        }
        me.size++
    }
    
    func (me *tQStack) pop() (left, right int) {
        me.size--
        it := me.stack[me.size]
        me.stack[me.size] = gEmptyPair
        return it[0], it[1]
    }
    
    func (me *tQStack) isEmpty() bool {
        return me.size <= 0
    }
    
    func (me *tQStack) isNotEmpty() bool {
        return me.size > 0
    }
    

    tQuickSort.go

    非递归的原地快速排序器, 实现ISorter接口, 使用辅助栈消除递归

    package quick_sort
    
    import (
        "learning/gooop/sorting"
    )
    
    type tQuickSort struct {
    }
    
    func newQuickSort() sorting.ISorter {
        return &tQuickSort{}
    }
    
    func (me *tQuickSort) Sort(data []interface{}, comparator sorting.CompareFunction) []interface{} {
        if data == nil {
            return nil
        }
    
        size := len(data)
        if size <= 1 {
            return data
        }
    
        if size == 2 {
            if comparator(data[0], data[1]) == sorting.GREATER {
                data[0],data[1] = data[1], data[0]
                return data
            }
        }
    
        stack := newQStack()
        stack.push(0, size - 1)
        me.qsort(data, comparator, stack)
        return data
    }
    
    func (me *tQuickSort) qsort(data []interface{}, comparator sorting.CompareFunction, stack *tQStack) {
        for ;stack.isNotEmpty(); {
            left, right := stack.pop()
            lfrom, lto, rfrom, rto := me.split(data, comparator, left, right)
    
            if lfrom < lto {
                stack.push(lfrom, lto)
            }
            if rfrom < rto  {
                stack.push(rfrom, rto)
            }
        }
    }
    
    func (me *tQuickSort) split(data []interface{}, comparator sorting.CompareFunction, left int, right int) (lfrom, lto, rfrom, rto int) {
        if left >= right {
            return
        }
    
        v := data[left]
        l := left
        r := right
        rside := true
    
        for {
            hit := false
            if rside {
                for ; r > l; r-- {
                    if comparator(data[r], v) == sorting.LESS {
                        hit = true
                        break
                    }
                }
    
                if hit {
                    data[l], data[r] = data[r], nil
                    l++
                    rside = false
                }
            } else {
                for ; l < r;l++ {
                    if comparator(data[l], v) == sorting.GREATER {
                        hit = true
                        break
                    }
                }
    
                if hit {
                    data[r], data[l] = data[l], nil
                    r--
                    rside = true
                }
            }
    
            if l == r {
                data[l] = v
                break
            }
        }
    
        return left, l - 1, r + 1, right
    }
    
    var QuickSorter = newQuickSort()
    

    (end)

    相关文章

      网友评论

        本文标题:手撸golang 基本数据结构与算法 快速排序

        本文链接:https://www.haomeiwen.com/subject/pcopfltx.html