目前各大互联网公司都开始注重代码质量,在我司,单元测试已经在进行全面推广和覆盖中,这次,我们通过一起实际的例子,聊一聊另一种非常重要的测试,也就是微基准性能测试。
Java中数字转字符串相信大家都有做过,四种常用的转换方式,究竟用哪种最优呢?本次通过对
Integer.toString(a)
String.valueOf(a)
a + ""
"" + a
四种数字转字符串的方式进行性能探究和分析,使大家对性能测试有正确的认识,逐步了解和掌握JMH微基准测试。
常规的测试方式
提到性能测试,如果是没有经验的小伙伴,通常会写出以下代码,企图通过循环来放大每次的运行速度偏差,代码如下:
publicclassCommonTest{
publicstaticvoidmain(String[] args){
intnum =1000000;
inta =123456789;
longstart = System.currentTimeMillis();
for(inti=0; i
Stringm = a +"";
}
longend= System.currentTimeMillis();
System.out.println("a+\"\" = "+ (end- start));
start = System.currentTimeMillis();
for(inti=0; i
Stringm =""+ a;
}
end= System.currentTimeMillis();
System.out.println("\"\"+a = "+ (end- start));
start = System.currentTimeMillis();
for(inti=0; i
Stringn =String.valueOf(a);
}
end= System.currentTimeMillis();
System.out.println("String.valueOf(a) = "+(end-start));
start = System.currentTimeMillis();
for(inti=0; i
Stringn = Integer.toString(a);
}
end= System.currentTimeMillis();
System.out.println("Integer.toString(a) = "+(end-start));
}
}
本机环境为MacBook,处理器:2.6 GHz 六核Intel Core i7,内存:16 GB 2667 MHz DDR4
1000000数据测试结果:
a+""=56
""+a =30
String.valueOf(a) =31
Integer.toString(a) =29
耗时结果为 a+"" > “”+a , String.valueOf(a) > Integer.toString(a),
多次测试,结果显示Integer.toString(a)与String.valueOf(a)消耗时间相似,String.valueOf(a)大于Integer.toString(a),a+""始终大于""+a,且"" + a 消耗时长最小。
500000000数据测试结果:
a+""=9527
""+a =9358
String.valueOf(a) =13620
Integer.toString(a) =13501
耗时结果为String.valueOf(a) > Integer.toString(a) > a+"" > “”+a ,但是a+““和”“+a的差距明显缩小了。
如果以上结果你认为是偶然性偏差,那么下面我再来做一个操作,可能会再次改变你的认知。
publicclass CommonTest {
public static void main(String[] args) {
intnum =500000000;
inta =123456789;
// 本次代码的改变是,将“” + a 和 a + “”的测试顺序进行了改变,先测试“” + a
long start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String m = "" + a;
}
long end = System.currentTimeMillis();
System.out.println("\"\"+a = " + (end - start));
start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String m = a + "";
}
end = System.currentTimeMillis();
System.out.println("a+\"\" = " + (end - start));
start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String n = String.valueOf(a);
}
end = System.currentTimeMillis();
System.out.println("String.valueOf(a) = " +(end-start));
start = System.currentTimeMillis();
for (int i=0; i<num; i++){
String n = Integer.toString(a);
}
end = System.currentTimeMillis();
System.out.println("Integer.toString(a) = " +(end-start));
}
}
大家可以自行测试一下,这里我可以直接告诉大家测试结果,a+"" 和 “”+a的测试结果出现了反转,由之前的a+"" 始终大于 “”+a变为了a+"" 始终小于 “”+a。
在不同数据量,甚至测试顺序的变化,都导致测试结果发生了差异,究竟哪个结果更可信?我们该如何正确的进行性能测试?
结果探究
在介绍正确的性能测试方法之前,我们可以先对以上测试结果的偏差原因进行分析。
点开String.valueOf(int i)源码不难看出:
String.valueOf(int i)其实是调用的Integer.toString(i),所以String.valueOf(int i)调用时间大于Integer.toString(i)比较正常,两者时间应该非常相似。
为了验证字符串相加的编译结果,下面给出探究过程:
测试代码:
package com.bestqiang.commontest;
publicclassCommonTest{
publicstaticvoidmain(String[] args) {
inta =1;
Stringstr ="hello"+ a;
Stringstr2 ="hello2"+1;
Stringstr3 = a +"hello3";
Stringstr4 =1+"hello4";
}
}
首先看 str2, 我们把其他代码注释掉,只留下 String str2 = "hello2" + 1;
使用javap解析可得:
"C:\Program Files\Java\jdk1.8.0_221\bin\javap.exe"-v com.bestqiang.commontest.CommonTest
Classfile/D:/idea-space2/learning-technology/target/test-classes/com/bestqiang/commontest/CommonTest.class
Last modified 2021-8-19; size 481 bytes
MD5 checksum 2d6ee54fb564dc0a7fc51ab0a617cfcc
Compiled from "CommonTest.java"
public class com.bestqiang.commontest.CommonTest
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #4.#20 // java/lang/Object."<init>":()V
#2 = String #21 // hello21
#3 = Class #22 // com/bestqiang/commontest/CommonTest
#4=Class#23// java/lang/Object
#5 = Utf8 <init>
#6 = Utf8 ()V
#7 = Utf8 Code
#8 = Utf8 LineNumberTable
#9 = Utf8 LocalVariableTable
#10 = Utf8 this
#11 = Utf8 Lcom/bestqiang/commontest/CommonTest;
#12 = Utf8 main
#13 = Utf8 ([Ljava/lang/String;)V
#14 = Utf8 args
#15 = Utf8 [Ljava/lang/String;
#16 = Utf8 str2
#17 = Utf8 Ljava/lang/String;
#18 = Utf8 SourceFile
#19 = Utf8 CommonTest.java
#20 = NameAndType #5:#6 // "<init>":()V
#21 = Utf8 hello21
#22 = Utf8 com/bestqiang/commontest/CommonTest
#23 = Utf8 java/lang/Object
{
public com.bestqiang.commontest.CommonTest();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 9: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/bestqiang/commontest/CommonTest;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=2, args_size=1
0: ldc #2 // String hello21
2: astore_1
3: return
LineNumberTable:
line 14: 0
line 19: 3
LocalVariableTable:
Start Length Slot Name Signature
0 4 0 args [Ljava/lang/String;
3 1 1 str2 Ljava/lang/String;
}
SourceFile: "CommonTest.java"
Process finished with exit code 0
解析结果显示,编译器直接优化为了hello21,没有那种StringBuilder追加的情况发生。
接着,我们把其他代码注释掉,只留下 String str = "hello" + a;
使用javap解析可得:
Classfile/D:/idea-space2/learning-technology/target/test-classes/com/bestqiang/commontest/CommonTest.class
Last modified 2021-8-19; size 705 bytes
MD5 checksum 69bc22cd50230bd882b407a17dcff463
Compiled from "CommonTest.java"
public class com.bestqiang.commontest.CommonTest
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #9.#27 // java/lang/Object."<init>":()V
#2 = Class #28 // java/lang/StringBuilder
#3 = Methodref #2.#27 // java/lang/StringBuilder."<init>":()V
#4 = String #29 // hello
#5 = Methodref #2.#30 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#6 = Methodref #2.#31 // java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
#7 = Methodref #2.#32 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#8 = Class #33 // com/bestqiang/commontest/CommonTest
#9=Class#34// java/lang/Object
#10 = Utf8 <init>
#11 = Utf8 ()V
#12 = Utf8 Code
#13 = Utf8 LineNumberTable
#14 = Utf8 LocalVariableTable
#15 = Utf8 this
#16 = Utf8 Lcom/bestqiang/commontest/CommonTest;
#17 = Utf8 main
#18 = Utf8 ([Ljava/lang/String;)V
#19 = Utf8 args
#20 = Utf8 [Ljava/lang/String;
#21 = Utf8 a
#22 = Utf8 I
#23 = Utf8 str
#24 = Utf8 Ljava/lang/String;
#25 = Utf8 SourceFile
#26 = Utf8 CommonTest.java
#27 = NameAndType #10:#11 // "<init>":()V
#28 = Utf8 java/lang/StringBuilder
#29 = Utf8 hello
#30 = NameAndType #35:#36 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#31 = NameAndType #35:#37 // append:(I)Ljava/lang/StringBuilder;
#32 = NameAndType #38:#39 // toString:()Ljava/lang/String;
#33 = Utf8 com/bestqiang/commontest/CommonTest
#34 = Utf8 java/lang/Object
#35 = Utf8 append
#36 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#37 = Utf8 (I)Ljava/lang/StringBuilder;
#38 = Utf8 toString
#39 = Utf8 ()Ljava/lang/String;
{
public com.bestqiang.commontest.CommonTest();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 9: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/bestqiang/commontest/CommonTest;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=3, args_size=1
0: iconst_1
1: istore_1
2: new #2 // class java/lang/StringBuilder
5: dup
6: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
9: ldc #4 // String hello
11: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
14: iload_1
15: invokevirtual #6 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
18: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
21: astore_2
22: return
LineNumberTable:
line 11: 0
line 12: 2
line 19: 22
LocalVariableTable:
Start Length Slot Name Signature
0 23 0 args [Ljava/lang/String;
2 21 1 a I
22 1 2 str Ljava/lang/String;
}
SourceFile: "CommonTest.java"
Process finished with exit code 0
如上图所示,其实a + “hello”内部是用stringBuilder进行追加操作的。
a + “hello3”仅仅是顺序发生了变化,底层实现是否是相同的呢?我们把其他代码注释掉,只留下 String str3 = a + “hello3”;
使用javap解析可得:
Classfile/D:/idea-space2/learning-technology/target/test-classes/com/bestqiang/commontest/CommonTest.class
Last modified 2021-8-19; size 707 bytes
MD5 checksum b767896bc82bc01ac0153354ac6e5886
Compiled from "CommonTest.java"
public class com.bestqiang.commontest.CommonTest
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #9.#27 // java/lang/Object."<init>":()V
#2 = Class #28 // java/lang/StringBuilder
#3 = Methodref #2.#27 // java/lang/StringBuilder."<init>":()V
#4 = Methodref #2.#29 // java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
#5 = String #30 // hello3
#6 = Methodref #2.#31 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#7 = Methodref #2.#32 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#8 = Class #33 // com/bestqiang/commontest/CommonTest
#9=Class#34// java/lang/Object
#10 = Utf8 <init>
#11 = Utf8 ()V
#12 = Utf8 Code
#13 = Utf8 LineNumberTable
#14 = Utf8 LocalVariableTable
#15 = Utf8 this
#16 = Utf8 Lcom/bestqiang/commontest/CommonTest;
#17 = Utf8 main
#18 = Utf8 ([Ljava/lang/String;)V
#19 = Utf8 args
#20 = Utf8 [Ljava/lang/String;
#21 = Utf8 a
#22 = Utf8 I
#23 = Utf8 str3
#24 = Utf8 Ljava/lang/String;
#25 = Utf8 SourceFile
#26 = Utf8 CommonTest.java
#27 = NameAndType #10:#11 // "<init>":()V
#28 = Utf8 java/lang/StringBuilder
#29 = NameAndType #35:#36 // append:(I)Ljava/lang/StringBuilder;
#30 = Utf8 hello3
#31 = NameAndType #35:#37 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#32 = NameAndType #38:#39 // toString:()Ljava/lang/String;
#33 = Utf8 com/bestqiang/commontest/CommonTest
#34 = Utf8 java/lang/Object
#35 = Utf8 append
#36 = Utf8 (I)Ljava/lang/StringBuilder;
#37 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#38 = Utf8 toString
#39 = Utf8 ()Ljava/lang/String;
{
public com.bestqiang.commontest.CommonTest();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 9: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/bestqiang/commontest/CommonTest;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=3, args_size=1
0: iconst_1
1: istore_1
2: new #2 // class java/lang/StringBuilder
5: dup
6: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
9: iload_1
10: invokevirtual #4 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
13: ldc #5 // String hello3
15: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
18: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
21: astore_2
22: return
LineNumberTable:
line 11: 0
line 16: 2
line 19: 22
LocalVariableTable:
Start Length Slot Name Signature
0 23 0 args [Ljava/lang/String;
2 21 1 a I
22 1 2 str3 Ljava/lang/String;
}
SourceFile: "CommonTest.java"
很明显,同样是使用StringBuilder进行的追加操作。
二者底层实现上并无什么不同,
目前有两个疑问产生了,一个是测试循环数据量提升后为何导致“”+a和a+“”的性能反超Integer.toString(a)?另一个是为何“”+a和a+“”的底层实现相同,但是测试顺序不同,导致结果也不同,数据量大后这个差异就缩小了?
其实,结果之所以有这些差异,原因就是JVM未进行预热,且JIT编译器会对我们所写的热点代码进行优化,具体方式有循环展开(loop unrolling)、OSR(On-Stack Replacement)、方法内联等,由于篇幅问题,不再这里详细的一一介绍,后续会附上总的梳理脑图,感兴趣的小伙伴可以单独再去深入了解。也就是说,你所写的代码!=真正执行的代码,我们的JVM会对你写的代码进行隐形优化,导致你的测试结果发生偏差。
简单验证JIT编译优化:
先简单验证一下,验证方法很简单,我们先把把OSR给关闭,具体方法就是在运行JVM时使用JVM参数:-XX:-UseOnStackReplacement,接下来,我们再使用不同数据量级数据进行测试。
1000000数据测试结果:
""+a =116
a+""=74
String.valueOf(a) =43
Integer.toString(a) =68
可以看到,”“+a和a + ”“的耗时已经超过了String.valueOf(a) 和 Integer.toString(a)。
500000000数据测试结果:
""+a =37581
a+""=36621
String.valueOf(a) =20173
Integer.toString(a) =19934
可以看到,”“+a和a + ”“的耗时依然是超过了String.valueOf(a) 和 Integer.toString(a)的。
简单验证JVM预热影响:
验证这个很好办,我们在进行真正测试之前,关闭OSR,再加上一段测试代码先跑一段时间即可,如:
for(inti=0;i<100000000;i++){
result =""+ a;
}
添加代码后,使用1000000数据测试结果:
""+a =71
a+""=70
String.valueOf(a) =41
Integer.toString(a) =40
可以看到”“ + a 与 a + ”“耗时的差异已经消失了,且始终大于String.valueOf(a)与Integer.toString(a)。
开启OSR后,使用500000000数据测试结果:
""+a =11390
a+""=11220
String.valueOf(a) =14500
Integer.toString(a) =14423
可以看到,”“ + a 与 a + ”“性能再次反超Integer.toString(a)。
由此我们可以得出结论:String.valueOf(a)内部调用Integer.toString(a)在源码中可以看出。a+""与“”+a在javap解析后内部均使用StringBuilder进行相加,时间耗时测试也无特殊差异,另外对于非变量的相加如1+“hello”,使用javap分析可以看出直接优化为“1hello”放入常量池。关键在于Integer.toString(a)与“”+a在不同数据量循环的速度差异问题,考虑到低数据量与高数量结果迥然不同,所以怀疑是热点数据触发OSR编译导致的结果差异,考虑到JVM采用分层编译,主要为解释执行+JIT即时编译,JIT检测热点数据的方法为基于调用计数器(Invocation Counter)和回边计数器(Back Edge Counter)的热点探测,这里测试为直接用-XX:-UseOnStackReplacement直接关闭OSR,测试结果为无论数据量大还是小Integer.toString(a)耗时均小于""+a,初步得出结论为OSR对热点数据优化为机器指令后“”+a效率大于Integer.toString(a),否则Integer.toString(a)效率大于“”+a。
我们仅仅为了测试一个Java数字转字符串的性能,就需要关注这么多影响变量,探究这么多原理,那么,有没有一个工具,能够把这些都考虑进去,自动化统计得到真实环境运行时的相对准确的性能测试结果呢?微基准测试神器JMH正可以满足我们的需求。
使用JMH
官方介绍:JMH 是一个 Java 工具,用于构建、运行和分析用 Java 和其他面向 JVM 的语言编写的 nano/micro/milli/macro 基准测试。
JMH全称为Java Microbenchmark Harness,是一个专为Java做基准测试的工具,因为JVM层面在编译期、加载、运行时对代码做很多优化,同样的一段代码,在整个程序运行时不一定会真正生效,所以对Java做微基准测试是非常困难的,需要对JVM的原理非常了解,所以,JMH由JVM团队亲自进行开发,而且开发者给我们提供了38个例子,供我们进行学习,避坑。
我们如果想要测试一段代码片段的性能,就比如以上的例子:int类型转字符串这种代码粒度,往往不是接口维度的,需要较高精度才能准确测试,我们的一般的性能测试工具,如JMeter,往往测试的为接口维度,且其框架本身的运行成本也会使得测试精度下降,测试精度是无法达到微基准测试的要求的。JMH提供了不同的测试模式,并且帮我们自动进行测试预热、JVM多环境隔离、控制方法内联,并且提供多种参数对象,如黑洞对象、实验状态对象帮我们更好的进行微基准测试和避坑。
Talk is cheap, show me the code!让我们看看如何使用JMH进行上面我们的int类型转字符串的性能测试吧!
@State(Scope.Thread)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
publicclass JHMTest {
privateinta =123456;
@Benchmark
publicStringwork1(){
return""+ a;
}
@Benchmark
publicStringwork2(){
returna +"";
}
@Benchmark
publicStringwork3(){
returnInteger.toString(a);
}
@Benchmark
publicStringwork4(){
returnString.valueOf(a);
}
publicstaticvoidmain(String[] args)throwsRunnerException{
Options opt =newOptionsBuilder()
.include(JHMTest.class.getSimpleName())
.forks(1)
.build();
newRunner(opt).run();
}
}
测试结果为:
# JMH version:1.21
# VM version: JDK1.8.0_291, Java HotSpot(TM)64-Bit Server VM,25.291-b10
# VM invoker:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup:5iterations,1seach
# Measurement:5iterations,1seach
# Timeout:10min per iteration
# Threads:1thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work1
# Run progress: 0.00% complete, ETA 00:00:40
# Fork: 1 of 1
# Warmup Iteration 1: 29.389 ns/op
# Warmup Iteration2:20.454ns/op
# Warmup Iteration 3: 16.060 ns/op
# Warmup Iteration4:16.014ns/op
# Warmup Iteration 5: 15.806 ns/op
Iteration1:15.982ns/op
Iteration 2: 15.901 ns/op
Iteration3:16.003ns/op
Iteration 4: 15.865 ns/op
Iteration5:15.929ns/op
Result "bestqiang.test.jmh.JHMTest.work1":
15.936 ±(99.9%) 0.219 ns/op [Average]
(min, avg, max) =(15.865,15.936,16.003), stdev =0.057
CI(99.9%): [15.717,16.155] (assumes normal distribution)
# JMH version:1.21
# VM version: JDK1.8.0_291, Java HotSpot(TM)64-Bit Server VM,25.291-b10
# VM invoker:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup:5iterations,1seach
# Measurement:5iterations,1seach
# Timeout:10min per iteration
# Threads:1thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work2
# Run progress: 25.00% complete, ETA 00:00:32
# Fork: 1 of 1
# Warmup Iteration 1: 24.678 ns/op
# Warmup Iteration2:19.042ns/op
# Warmup Iteration 3: 15.775 ns/op
# Warmup Iteration4:15.769ns/op
# Warmup Iteration 5: 15.847 ns/op
Iteration1:15.915ns/op
Iteration 2: 15.923 ns/op
Iteration3:15.836ns/op
Iteration 4: 15.898 ns/op
Iteration5:15.853ns/op
Result "bestqiang.test.jmh.JHMTest.work2"
15.885 ±(99.9%) 0.149 ns/op [Average]
(min, avg, max) =(15.836,15.885,15.923), stdev =0.039
CI(99.9%): [15.736,16.034] (assumes normal distribution)
# JMH version:1.21
# VM version: JDK1.8.0_291, Java HotSpot(TM)64-Bit Server VM,25.291-b10
# VM invoker:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup:5iterations,1seach
# Measurement:5iterations,1seach
# Timeout:10min per iteration
# Threads:1thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work3
# Run progress: 50.00% complete, ETA 00:00:21
# Fork: 1 of 1
# Warmup Iteration 1: 34.360 ns/op
# Warmup Iteration2:26.414ns/op
# Warmup Iteration 3: 25.597 ns/op
# Warmup Iteration4:25.619ns/op
# Warmup Iteration 5: 25.743 ns/op
Iteration1:25.723ns/op
Iteration 2: 25.671 ns/op
Iteration3:25.862ns/op
Iteration 4: 25.863 ns/op
Iteration5:25.774ns/op
Result "bestqiang.test.jmh.JHMTest.work3":
25.779 ±(99.9%) 0.327 ns/op [Average]
(min, avg, max) =(25.671,25.779,25.863), stdev =0.085
CI(99.9%): [25.452,26.106] (assumes normal distribution)
# JMH version:1.21
# VM version: JDK1.8.0_291, Java HotSpot(TM)64-Bit Server VM,25.291-b10
# VM invoker:/Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=64049:/Users/chenyaqiang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7142.45/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup:5iterations,1seach
# Measurement:5iterations,1seach
# Timeout:10min per iteration
# Threads:1thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: bestqiang.test.jmh.JHMTest.work4
# Run progress: 75.00% complete, ETA 00:00:10
# Fork: 1 of 1
# Warmup Iteration 1: 34.378 ns/op
# Warmup Iteration2:26.560ns/op
# Warmup Iteration 3: 25.823 ns/op
# Warmup Iteration4:25.633ns/op
# Warmup Iteration 5: 26.084 ns/op
Iteration1:25.934ns/op
Iteration 2: 26.429 ns/op
Iteration3:30.559ns/op
Iteration 4: 28.399 ns/op
Iteration5:26.415ns/op
Result "bestqiang.test.jmh.JHMTest.work4":
27.547 ±(99.9%) 7.439 ns/op [Average]
(min, avg, max) =(25.934,27.547,30.559), stdev =1.932
CI(99.9%): [20.108,34.987] (assumes normal distribution)
# Run complete. Total time:00:00:42
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments,perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, askforreviewsfromthe domain experts.
Donot assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
JHMTest.work1 avgt515.936±0.219ns/op
JHMTest.work2 avgt 5 15.885 ± 0.149 ns/op
JHMTest.work3 avgt525.779±0.327ns/op
JHMTest.work4 avgt 5 27.547 ± 7.439 ns/op
@Benchmark代表了开启基准测试的方法,avgt代表平均耗时,@Fork代表了我们开启的JVM环境,比如1就代表我们为每个进行基准测试的方法都单独开启一个JVM环境,我们几个进行基准测试的方法可以进行环境隔离避免互相影响,@Warmup代表我们的预热参数,@Measurement代表我们的测试参数,可以看到,基准测试的运行结果和我们使用常规测试的最终结果是一致的。
编写一个合格的基准测试,远不止这么简单,上面的用例看似简单,但是蕴含了一些坑点,比如变量a,如果定义为final,你会发现,最终结果会截然不同,这里我直接贴出最终测试结果:
Benchmark Mode Cnt ScoreErrorUnits
JHMTest.work1 avgt53.166±0.184ns/op
JHMTest.work2 avgt 5 3.167 ± 0.103 ns/op
JHMTest.work3 avgt526.091±1.276ns/op
JHMTest.work4 avgt 5 25.424 ± 1.306 ns/op
可以明显的看到,“”+a和a+“”的平均耗时降低为了3 ns/op。这其实就是常量折叠(Constant folding)起了作用,它是一个在编译时期简化常数的一个过程,这个优化导致我们拿到了错误的结果。但是我们也可以看到work3和work4并未受到太大影响,这种优化我们是难以准确预测的,所以我们需要按照标准格式来写我们的测试用例,否则一旦JVM进行优化,我们就会拿到异常测试结果。又比如,上面的例子中,我们在每次计算完毕后就将结果返回,这就是为了避免”死码消除“,如果一块代码我们一直没有用到,那么可能它就会被编译器优化掉,我们将计算结果返回可以避免这种情况,又或者,我们可以使用Blackholes来将我们的计算值进行消耗,避免”死码消除“的情况发生,Blackholes因为本身也有一定的计算逻辑,为避免对测试结果造成影响,这里使用了直接将计算结果返回的方式。
我们在进行微基准测试时,并没有再使用循环,那么使用循环后会有什么后果呢?JMH的官方案例 JMHSample_11_Loops 中特别提到:don't overuse loops, rely on JMH to get the measurement right. 也就是说,不要使用循环,依赖JMH去获取正确的结果,因为循环是在是坑点太多,优化点太多,如果想要写出一个安全的循环也是非常的繁琐的,所以我们日常使用中应该尽量避免。如果实在想要使用循环的话,官方提供了一个例子 JMHSample_34_SafeLooping 用来写正确的循环,应用在我们本次的测试用例上面,代码可以这么写,供大家参考:
importorg.openjdk.jmh.annotations.*;
importorg.openjdk.jmh.runner.Runner;
importorg.openjdk.jmh.runner.RunnerException;
importorg.openjdk.jmh.runner.options.Options;
importorg.openjdk.jmh.runner.options.OptionsBuilder;
importjava.util.concurrent.TimeUnit;
@State(Scope.Thread)
@Warmup(iterations =5, time =1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations =5, time =1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
publicclass JHMTest {
@Param({"1000000"})
intsize;
static String work1(int x) {
return""+ x;
}
static String work2(int x) {
returnx +"";
}
static String work3(int x) {
returnInteger.toString(x);
}
static String work4(int x) {
returnString.valueOf(x);
}
@Benchmark
public void test1() {
for(inti =0; i < size; i ++) {
work1(i);
}
}
@Benchmark
public void test2() {
for(inti =0; i < size; i ++) {
work2(i);
}
}
@Benchmark
public void test3() {
for(inti =0; i < size; i ++) {
work3(i);
}
}
@Benchmark
public void test4() {
for(inti =0; i < size; i ++) {
work4(i);
}
}
@CompilerControl(CompilerControl.Mode.DONT_INLINE)
public static void sink(String v) {
// IT IS VERY IMPORTANT TO MATCH THE SIGNATURE TO AVOID AUTOBOXING.
// The method intentionally dBenchmarkProcessoroes nothing.
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(JHMTest.class.getSimpleName())
.forks(3)
.build();
new Runner(opt).run();
}
}
CompilerControl.Mode.INLINE 这个注解的意思就是强制跳过内联的意思,避免内联优化。
关于JMH的使用例子,一共有38个,这里我们不再一一介绍,大家可以去查看 https://github.com/openjdk/jmh 官方代码,这些例子可以帮助大家更好的使用JMH。关于这38个例子和涉及的知识点,我进行了脑图梳理总结,可以帮助大家更好的学习和理解:
以上,本次的分享就结束了,关于微基准性能测试,你学到了吗?如果看完的小伙伴有兴趣了解更多的话,欢迎添加vx小助手:ON7luck 详细了解~
网友评论