Code
https://github.com/gykimo/c_plusplus_optimize/tree/master/spinlock_vs_mutex
Timing
[case 1] mutex:
real 0m1.009s
user 0m0.744s
sys 0m0.865s
==========================
rm: spinlock_loop_nothing: No such file or directory
[case 2] spinlock_loop_nothing:
real 0m2.009s
user 0m3.959s
sys 0m0.019s
==========================
rm: spinlock_loop_sleep: No such file or directory
[case 3] spinlock_loop_sleep:
real 0m0.397s
user 0m0.521s
sys 0m0.261s
==========================
rm: spinlock_loop_nanosleep: No such file or directory
[case 4] spinlock_loop_nanosleep:
real 0m0.500s
user 0m0.766s
sys 0m0.207s
==========================
rm: spinlock_webrtc: No such file or directory
[case 5] spinlock_webrtc:
real 0m0.472s
user 0m0.853s
sys 0m0.064s
==========================
rm: mutex_heavy_task: No such file or directory
[case 6] mutex_heavy_task:
fail, count: 20000
real 0m3.135s
user 0m3.095s
sys 0m0.081s
==========================
rm: spinlock_webrtc_heavy_task: No such file or directory
[case 7] spinlock_webrtc_heavy_task:
fail, count: 20000
real 0m3.158s
user 0m5.706s
sys 0m0.445s
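Explanation
When the critical section is short, as in cases 1-5, spinlock_webrtc's user+sys time (0.917s) is much lower than mutex's (1.609s), so a spinlock is the better fit in this situation.
When the critical section is long, as in cases 6-7, mutex_heavy_task's user+sys time (3.176s) is much lower than spinlock_webrtc_heavy_task's (6.151s), so a mutex is the better fit in this situation.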
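The comparison above rests on CPU time (user+sys) rather than wall-clock time alone. A minimal sketch of how such a measurement could be taken from inside a program follows. The names `RunResult`, `RunWithMutex`, and the `work` parameter are hypothetical and do not come from the linked repository; `std::clock()` is used as a rough stand-in for the user+sys total that `time` reports on POSIX systems.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical result record, not from the linked repository.
struct RunResult {
    long count;           // final value of the shared counter
    double cpu_seconds;   // process CPU time, roughly user+sys on POSIX
    double wall_seconds;  // elapsed time, comparable to "real"
};

// Each thread takes the mutex `iters` times. While holding it, the thread
// runs `work` loop iterations (a stand-in for a long critical section)
// and then increments the shared counter.
RunResult RunWithMutex(int threads, int iters, int work)
{
    assert(threads > 0 && iters >= 0 && work >= 0);
    std::mutex m;
    long count = 0;
    volatile long sink = 0;  // keeps the simulated work from being optimized away

    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();

    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&]() {
            for (int i = 0; i < iters; ++i) {
                std::lock_guard<std::mutex> guard(m);
                for (int w = 0; w < work; ++w) {
                    sink = sink + w;
                }
                ++count;
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }

    const double cpu =
        static_cast<double>(std::clock() - cpu_start) / CLOCKS_PER_SEC;
    const double wall = std::chrono::duration<double>(
        std::chrono::steady_clock::now() - wall_start).count();
    return RunResult{count, cpu, wall};
}
```

Calling `RunWithMutex(2, 10000, 0)` models a short critical section, and a large `work` value models a long one. Swapping the `std::mutex` for a spinlock in the same harness would let the two CPU-time figures be compared directly.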
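A poorly implemented spinlock performs worse than a mutex
In case 2 (spinlock_loop_nothing), Lock busy-loops until lock_acquired becomes 0 (free), and performance is poor: while lock_acquired is 1, each waiting thread keeps its CPU fully loaded, which drives user time very high (3.959s, versus 0.744s for the mutex).
The code is as follows: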
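void Lock()
{
    // Spin until the CAS swaps lock_acquired from 0 (free) to 1 (held).
    while (__sync_val_compare_and_swap(&lock_acquired, 0, 1))
    {
        // Busy-wait: the thread never gives up the CPU.
    }
}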
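Only Lock is shown here. For completeness, the following sketch shows one way the matching release could be written with the same family of GCC builtins. The declaration of `lock_acquired` and the `Unlock` and `TryLock` functions are assumptions for illustration, not code from the repository.

```cpp
#include <cassert>

// Assumed declaration: 0 = free, 1 = held.
static volatile int lock_acquired = 0;

// Same busy-wait acquire as in the article.
void Lock()
{
    while (__sync_val_compare_and_swap(&lock_acquired, 0, 1))
    {
    }
}

// Assumed counterpart: __sync_lock_release writes 0 with release semantics,
// so writes made inside the critical section are visible to the next holder.
void Unlock()
{
    assert(lock_acquired == 1);  // only the holder should release
    __sync_lock_release(&lock_acquired);
}

// Hypothetical helper: a single attempt that never spins.
// Returns true if the lock was free and is now held by the caller.
bool TryLock()
{
    return __sync_val_compare_and_swap(&lock_acquired, 0, 1) == 0;
}
```

`__sync_val_compare_and_swap` returns the value that was stored before the operation, so a return of 0 means the caller acquired the lock. That is why the `while` loop in Lock keeps spinning on a non-zero result.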
The sleep version performs better
void Lock()
{
    while (__sync_val_compare_and_swap(&lock_acquired, 0, 1))
    {
        // Call sleep(0) between attempts instead of spinning flat out.
        sleep(0);
    }
}
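Case 4 (spinlock_loop_nanosleep) appears in the timings, but its Lock is not shown. The following is a guess at what such a variant might look like, not the repository's code: the function names and the pause length are assumptions, and `nanosleep` is the standard POSIX call from `<time.h>`.

```cpp
#include <cassert>
#include <time.h>

// Assumed declaration: 0 = free, 1 = held.
static volatile int lock_acquired = 0;

// Hypothetical variant: sleep for pause_ns nanoseconds between attempts.
void LockWithPause(long pause_ns)
{
    assert(pause_ns >= 0 && pause_ns < 1000000000L);  // tv_nsec must stay below 1 s
    while (__sync_val_compare_and_swap(&lock_acquired, 0, 1))
    {
        struct timespec pause = {0, pause_ns};  // {tv_sec, tv_nsec}
        nanosleep(&pause, nullptr);
    }
}

// Assumed release counterpart.
void Unlock()
{
    __sync_lock_release(&lock_acquired);
}
```

A longer pause lowers CPU use while waiting but adds latency before a waiter notices that the lock is free, so the value has to be tuned against the length of the critical section.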
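The WebRTC version has the lowest kernel overhead
On each iteration it calls sched_yield, which gives up the CPU so that other runnable threads can be scheduled, instead of occupying the CPU the whole time. In the timings above its sys time (0.064s) is the lowest of the three waiting variants, although the sleep(0) version finished with lower real time and lower user+sys.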
void Lock()
{
    while (__sync_val_compare_and_swap(&lock_acquired, 0, 1))
    {
        // Let another runnable thread use the CPU before retrying.
        sched_yield();
    }
}
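To show the yield-based lock in context, here is a small self-contained sketch that wraps it in a class and uses it to protect a shared counter. The class name `YieldSpinLock`, the `Unlock` method, and the `CountWithYieldSpinLock` helper are hypothetical; only the Lock loop mirrors the snippet above.

```cpp
#include <cassert>
#include <sched.h>
#include <thread>
#include <vector>

// Hypothetical wrapper around the yield-based lock.
class YieldSpinLock
{
public:
    void Lock()
    {
        while (__sync_val_compare_and_swap(&lock_acquired_, 0, 1))
        {
            sched_yield();  // give other runnable threads a turn
        }
    }

    // Assumed release: stores 0 with release semantics.
    void Unlock()
    {
        __sync_lock_release(&lock_acquired_);
    }

private:
    volatile int lock_acquired_ = 0;  // 0 = free, 1 = held
};

// Each thread increments a shared counter `iters` times under the lock
// and the function returns the final counter value.
long CountWithYieldSpinLock(int threads, int iters)
{
    assert(threads > 0 && iters >= 0);
    YieldSpinLock lock;
    long count = 0;

    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&]() {
            for (int i = 0; i < iters; ++i) {
                lock.Lock();
                ++count;  // short critical section
                lock.Unlock();
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return count;
}
```

If the lock works, `CountWithYieldSpinLock(4, 5000)` returns exactly 20000. Any smaller value would mean that two threads were inside the critical section at the same time.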