小王职场记谈谈你的STL理解（1）

std:sort代码解析开始

首先看一段视频。
https://www.youtube.com/watch?v=67ta5WTjjUo

然后看一段代码会有什么问题。

std:sort

当数据元素相同时候
stl sort会概率造成core dump（如果你测试，不一定会重现，猜一下需要什么条件？）

image.png

一、问题
std::sort()在排序的时候，如果对排序中的仿函数对相等的值返回true，会导致程序core掉。

二、解决办法
让比较函数对相等的值返回false

三、原因分析std：sort 分析
完整版请看：
文档注释：https://github.com/wangcy6/weekly/blob/master/stl.md
代码注释：https://github.com/wangcy6/weekly/blob/master/KM/code/stl_algo.h

版本

gcc 使用 4.8.4 版本，STL源码在 Linux 系统的位置是：/usr/include/c++/4.8.4/bits (79个文件)

函数对象模块

定义：

重载了“operaotr()”操作符的普通类对象，

这个对象具备了具有函数行为

调用类(), 相当于调用类.成员函数()


 // 大于
template <class _Tp>
struct greater : public binary_function<_Tp,_Tp,bool> 
{
  bool operator()(const _Tp& __x, const _Tp& __y) const { return __x > __y; }
};
//这个函数对象没有数据成员、没有虚函数、没有显示声明的构造函数和析构函数，且对operator()的实现是内联的。用作STL比较器的函数对象一般都很小

greater<int> greaterobj;
greaterobj(3, 5)//等价于greaterobj.operator(3,5) 效果等价于函数调用function(3, 5);

算法部分

代码：

stl_algo.h

std:compare：

Effective STL： Item 21：永远让比较函数对相同元素返回false

std:sort

template <class _RandomAccessIter>
inline void sort(_RandomAccessIter __first, _RandomAccessIter __last) {
  __STL_REQUIRES(_RandomAccessIter, _Mutable_RandomAccessIterator);
  __STL_REQUIRES(typename iterator_traits<_RandomAccessIter>::value_type,
                 _LessThanComparable);
  if (__first != __last) { //只有一个记录 ，不需要排序
    __introsort_loop(__first, __last,
                     __VALUE_TYPE(__first),
                     __lg(__last - __first) * 2);//快速排序，整体有序
    __final_insertion_sort(__first, __last); //剩下未排序的数据，直接插入排序
    
  }
}
template <class _RandomAccessIter, class _Tp, class _Size>
void __introsort_loop(_RandomAccessIter __first,
                      _RandomAccessIter __last, _Tp*,
                      _Size __depth_limit)
{
  while (__last - __first > __stl_threshold) { ///长度大于16才进行一次快排分割
    if (__depth_limit == 0) 
    {
      partial_sort(__first, __last, __last); //堆排序
      return;
    }
    --__depth_limit;
    _RandomAccessIter __cut =
      __unguarded_partition(__first, __last,
                            _Tp(__median(*__first,
                                         *(__first + (__last - __first)/2),
                                         *(__last - 1))));////找三个位置的中位数作为快排依据
    __introsort_loop(__cut, __last, (_Tp*) 0, __depth_limit); //排后一部分
    __last = __cut; //排前一部分
  }
}

std:sort描述

维基百科内省排序

内省排序（英语：Introsort）是由David Musser在1997年设计的排序算法。

这个排序算法首先从快速排序开始，当递归深度超过一定深度（深度为排序元素数量的对数值）后转为堆排序。采用这个方法，

内省排序既能在常规数据集上实现快速排序的高性能，又能在最坏情况下仍保持{\displaystyle O(nlogn)}[图片上传失败...(image-b3b9a4-1548141339074)]的时间复杂度。由于这两种算法都属于比较排序算法，所以内省排序也是一个比较排序算法。

2000年6月，SGI的C++标准模板库的stl_algo.h中的不稳定排序算法采用了Musser的内省排序算法。在此实现中，切换到插入排序的数据量阈值为16个。

主要因素：
if 递归深度多大 then 改为堆排序有不稳定排序改成稳定排序
if 数据少于16个 then 改为插入排序，降低递归堆栈消耗

，因此Musser在1996年发表了一遍论文，提出了Introspective Sorting(内省式排序)，这里可以找到PDF版本。它是一种混合式的排序算法，集成了前面提到的三种算法各自的优点：

在数据量很大时采用正常的快速排序，此时效率为O(logN)。
一旦分段后的数据量小于某个阈值，就改用插入排序，因为此时这个分段是基本有序的，这时效率可达O(N)。
在递归过程中，如果递归层次过深，分割行为有恶化倾向时，它能够自动侦测出来，使用堆排序来处理，在此情况下，使其效率维持在堆排序的O(N logN)，但这又比一开始使用堆排序好。