A couple of days ago, testing at my company turned up a bug. After the postmortem, a colleague sent around an email sharing this interesting GCC "bug":
#include <iostream>
#include <string>
#include <limits>
#include <inttypes.h>
void ABSTest(int64_t value1) {
  // intended as "absolute value of value1, stored as unsigned"
  uint64_t value = static_cast<uint64_t>(value1 < 0 ? 0 - value1 : value1);
  std::cout << "value " << value;
  if (value < 2ul)
    std::cout << " less than " << 2 << "?!\n";
  else
    std::cout << " compare correct\n";
}
int main()
{
  int64_t value1 = std::numeric_limits<int64_t>::min();
  ABSTest(value1);
}
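What this code does is convert a 64-bit signed integer to an unsigned integer and compare it with the unsigned value 2. With GCC 4.9 or later (you can test on Ubuntu 16.04, or check your compiler version with gcc -v) and no optimization at all, the output is: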
test@instance-3 /tmp/gcc $ g++ test.cpp -o test && ./test
value 9223372036854775808 compare correct
With -O2, the same program prints:
test@instance-3 /tmp/gcc $ g++ test.cpp -o test -O2 && ./test
value 9223372036854775808 less than 2?!
Casting to unsigned before the subtraction makes the problem go away:
uint64_t value = static_cast<uint64_t>(value1 < 0 ? 0 - (uint64_t)value1 : value1);
I found this problem fascinating too, because apart from Jeff Dean I have hardly ever seen anyone successfully look down on a compiler.
g++ test.cpp -O2 -S -o test.S
sarq $63, %rax
movq %rdi, %rbx
xorq %rax, %rbx
subq %rax, %rbx
movl $6, %edx
movl $.LC0, %esi
movl $_ZSt4cout, %edi
call _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l
movq %rbx, %rsi
movl $_ZSt4cout, %edi
call _ZNSo9_M_insertImEERSoT_
cmpq $1, %rbx
ja .L2
sarq $63, %rax
movl $6, %edx
movl $.LC0, %esi
xorq %rax, %rbx
movl $_ZSt4cout, %edi
subq %rax, %rbx
call _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l
movq %rbx, %rsi
movl $_ZSt4cout, %edi
call _ZNSo9_M_insertImEERSoT_
cmpq $1, %rbx
jle .L5
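At first glance this really does look like a gcc bug. Programmers with a bit of C++ experience will know that the only difference between -O1 and -O2 is that -O2 enables some additional optimization options. To get at the deeper cause, we need to pin down which of those options is responsible. The gcc documentation lists all of them: Optimize Options
-ftree-vrp and -fstrict-overflow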
Perform Value Range Propagation on trees. This is similar to the constant propagation pass, but instead of values, ranges of values are propagated. This allows the optimizers to remove unnecessary range checks like array bound checks and null pointer checks. This is enabled by default at -O2 and higher. Null pointer check elimination is only done if -fdelete-null-pointer-checks is enabled.
for (i = 1; i < 100; i++)
if (i)
g ();
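At compile time it is already known that inside the loop body i is an integer in the range [1, 99], so the condition if (i) is always true and can be optimized away, giving: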
for (i = 1; i < 100; i++)
g ();
>Allow the compiler to assume strict signed overflow rules, depending on the language being compiled. For C (and C++) this means that overflow when doing arithmetic with signed numbers is undefined, which means that the compiler may assume that it does not happen. This permits various optimizations. For example, the compiler assumes that an expression like i + 10 > i is always true for signed i. This assumption is only valid if signed overflow is undefined, as the expression is false if i + 10 overflows when using twos complement arithmetic. When this option is in effect any attempt to determine whether an operation on signed numbers overflows must be written carefully to not actually involve overflow.
>This option also allows the compiler to assume strict pointer semantics: given a pointer to an object, if adding an offset to that pointer does not produce a pointer to the same object, the addition is undefined. This permits the compiler to conclude that p + u > p is always true for a pointer p and unsigned integer u. This assumption is only valid because pointer wraparound is undefined, as the expression is false if p + u overflows using twos complement arithmetic.
>See also the -fwrapv option. Using -fwrapv means that integer signed overflow is fully defined: it wraps. When -fwrapv is used, there is no difference between -fstrict-overflow and -fno-strict-overflow for integers. With -fwrapv certain types of overflow are permitted. For example, if the compiler gets an overflow when doing arithmetic on constants, the overflowed value can still be used with -fwrapv, but not otherwise.
>The -fstrict-overflow option is enabled at levels -O2, -O3, -Os.
>Source: [Optimize Options](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html)
First, value1 is set to the minimum 64-bit signed integer, -2^{63}; the type's range is [-2^{63}, 2^{63}-1]. Evaluating 0 - value1 would give 2^{63}, which is outside that range, so the signed subtraction overflows.
>"If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]"
>Source: [What happens if I assign a negative value to an unsigned variable?](http://stackoverflow.com/questions/2711522/what-happens-if-i-assign-a-negative-value-to-an-unsigned-variable)
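My colleague's intent was to take the absolute value. But the largest positive value is 1 smaller than the absolute value of the smallest negative value, so a signed integer of the same width cannot represent that result. He expected that converting to an unsigned integer would make it representable. That sounds workable, but do not forget that most computers represent negative numbers in two's complement. Take a 1-byte signed integer with 8 bits: its -1 is 1111 1111, and converting that directly to an unsigned integer gives 1+2+4+8+16+32+64+128 = 255, which is clearly not an absolute-value operation.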
std::numeric_limits<int64_t>::min() + 1
Turn on compiler warnings. Adding -Wstrict-overflow=2 to the compile options is enough to produce a warning:
test@instance-3 /tmp/gcc $ g++ test1.cpp -o test1.S -S -O2 --std=c++11 -Wstrict-overflow=2
test1.cpp: In function ‘size_t test(int64_t)’:
test1.cpp:6:3: warning: assuming signed overflow does not occur when simplifying conditional [-Wstrict-overflow]
   if (value < (((uint64_t)1) << 6)) return 1;
   ^
>This option instructs the compiler to assume that signed arithmetic overflow of addition, subtraction and multiplication wraps around using twos-complement representation. This flag enables some optimizations and disables others. This option is enabled by default for the Java front-end, as required by the Java language specification.
>Source: [Using the GNU Compiler Collection (GCC): Code Gen Options](https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html)
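- This is not a bug but a feature.
- C++ programmers really do need to understand how compilers work, read the compiler documentation, and know what assumptions the compiler makes when it optimizes.
- GCC's documentation is considerably more complete than LLVM's. If you are considering a wholesale migration to LLVM, the benefits are real, but the gaps in documentation deserve careful thought.
- Modern software engineering emphasizes decoupling and modularity, trying to separate the layers so that programmers at each layer can focus on their own. But software is written by people, and where there are people there are inevitably human mistakes (bugs). You will often run into a bug like the one in this post that reaches into the layer below. So to become an excellent software engineer you need some understanding of the whole stack, from top to bottom. The curriculum at Fudan University's School of Computer Science is well designed in this respect: Dean Zhao Yiming has said that students should build a complete body of knowledge, from basic circuits, to building a CPU, to computer architecture, to designing compilers and operating systems. The teaching itself still lags far behind the top American universities, but the overall curriculum design has benefited me a great deal.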
#include <iostream>
#include <string>
#include <limits>
#include <inttypes.h>
size_t test(int64_t value1) {
  uint64_t value = static_cast<uint64_t>(value1 < 0 ? 0 - value1 : value1);
  if (value < (((uint64_t)1) << 6)) return 1;
  if (value < (((uint64_t)1) << 13)) return 2;
  if (value < (((uint64_t)1) << 20)) return 3;
  if (value < (((uint64_t)1) << 27)) return 4;
  if (value < (((uint64_t)1) << 34)) return 5;
  if (value < (((uint64_t)1) << 41)) return 6;
  if (value < (((uint64_t)1) << 48)) return 7;
  if (value < (((uint64_t)1) << 55)) return 8;
  if (value < (((uint64_t)1) << 62)) return 9;
  return 10;
}
int main(){
  int64_t test0 = std::numeric_limits<int64_t>::max();
  std::cout << "1: " << test(test0) << "\n";
  uint64_t test1 = (((uint64_t)1) << ((10 - 1) * 7));
  uint64_t test2 = std::numeric_limits<uint64_t>::max();
  std::cout << "2: " << test(test2) << "\n";
  return 0;
}
test@instance-3 /tmp/gcc $ g++ test1.cpp -O2;./a.out
1: 10
2: 1
test@instance-3 /tmp/gcc $ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
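2. In real debugging, the source code is usually more complex and full of function calls, so the generated assembly is hard to follow. Using GCC 4.9 or later, compare the output of the following code under -O1 and -O2, and explain the difference.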
#include <iostream>
#include <string>
#include <limits>
#include <inttypes.h>
void ABSTest(int64_t value1) {
  uint64_t value = static_cast<uint64_t>(value1 < 0 ? 0 - value1 : value1);
  std::cout << "true is " << true << ", " << (true ? "true" : "false") << "\n";
  std::cout << "false is " << false << ", " << (false ? "true" : "false") << "\n";
  std::cout << "(value < 2u) is " << (value < 2u) << ", " << ((value < 2u) ? "true" : "false") << "\n";
  std::cout << "(value < 1u) is " << (value < 1u) << ", " << ((value < 1u) ? "true" : "false") << "\n";
  std::cout << "(value > 1u) is " << (value > 1u) << ", " << ((value > 1u) ? "true" : "false") << "\n";
  std::cout << "(value > 1u && value < 2u) is " << (value > 1u && value < 2u) << ", " << ((value > 1u && value < 2u) ? "true" : "false") << "\n";
}
int main()
{
  int64_t value1 = std::numeric_limits<int64_t>::min();
  ABSTest(value1);
}
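Besides Jeff Dean, Linus Torvalds, the creator of Linux, also operates at compiler level. The difference is that he has a very distinctive personality. He does not stop at warning the compiler: he says gcc "shouldn't have been allowed to graduate from kindergarten" and compares it to a "sloth that was dropped on the head as a baby". The man is a black belt in English invective:
>Lookie here, your compiler does some absolutely insane things with the spilling, including spilling a *constant*. For chrissake, that compiler shouldn't have been allowed to graduate from kindergarten. We're talking "sloth that was dropped on the head as a baby" level retardation levels here:……
>Source: [Random panic in load_balance() with 3.16-rc](https://lkml.org/lkml/2014/7/24/584)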
I think this one really is a genuine bug. Interestingly, it appears to be specific to the amd64 (x86-64) architecture; for background, see [Stack frame layout on x86-64](http://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64/)