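Installation
CentOS Stream
yum install systemtap systemtap-runtime
stap-prep # install the kernel debuginfo packages
If installing the kernel debuginfo fails, download the RPM packages that match the running kernel version and install them manually:
- kernel-debuginfo
- kernel-debuginfo-common (named kernel-debuginfo-common-x86_64 on x86_64)
- kernel-devel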
~/go/src/ceph (test ✗) uname -r
4.18.0-490.el8.x86_64
~/go/src/ceph (test ✗) rpm -ivh kernel-debuginfo-common-x86_64-4.18.0-490.el8.x86_64.rpm
Verifying... ################################# [100%]
Preparing... ################################# [100%]
Updating / installing...
1:kernel-debuginfo-common-x86_64-4.################################# [100%]
~/go/src/ceph (test ✗) rpm -ivh kernel-debuginfo-4.18.0-490.el8.x86_64.rpm
Verifying... ################################# [100%]
Preparing... ################################# [100%]
Updating / installing...
1:kernel-debuginfo-4.18.0-490.el8 ################################# [100%]
~/go/src/ceph (test ✗) rpm -ivh https://kojihub.stream.centos.org/kojifiles/packages/kernel/4.18.0/490.el8/x86_64/kernel-devel-4.18.0-490.el8.x86_64.rpm
Retrieving https://kojihub.stream.centos.org/kojifiles/packages/kernel/4.18.0/490.el8/x86_64/kernel-devel-4.18.0-490.el8.x86_64.rpm
Verifying... ################################# [100%]
Preparing... ################################# [100%]
package kernel-devel-4.18.0-490.el8.x86_64 is already installed
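The manual steps above depend on getting the package file names exactly right for the running kernel. The following is a minimal sketch that derives them from a kernel release string. The helper names kernel_rpm_names and koji_kernel_dir are hypothetical, the file-name pattern is inferred from the transcript above rather than from any official specification, and the Koji directory is derived only from the kernel-devel URL shown above.

```shell
#!/usr/bin/env bash
# Sketch: derive RPM file names from a kernel release such as the output of `uname -r`.
# Assumption: names follow the pattern seen in the transcript above.

kernel_rpm_names() {
    local release="$1"            # e.g. 4.18.0-490.el8.x86_64
    local arch="${release##*.}"   # text after the last dot, e.g. x86_64
    # Listed in the order the transcript installs them (common package first).
    printf '%s\n' \
        "kernel-debuginfo-common-${arch}-${release}.rpm" \
        "kernel-debuginfo-${release}.rpm" \
        "kernel-devel-${release}.rpm"
}

# Directory part of the kernel-devel URL used above, rebuilt from the release string.
koji_kernel_dir() {
    local release="$1"
    local arch="${release##*.}"   # x86_64
    local vr="${release%.*}"      # 4.18.0-490.el8
    local version="${vr%%-*}"     # 4.18.0
    local rel="${vr#*-}"          # 490.el8
    printf 'https://kojihub.stream.centos.org/kojifiles/packages/kernel/%s/%s/%s\n' \
        "$version" "$rel" "$arch"
}

# Fixed release from the transcript; on a real host pass "$(uname -r)" instead.
kernel_rpm_names "4.18.0-490.el8.x86_64"
koji_kernel_dir "4.18.0-490.el8.x86_64"
```

The sketch only prints names, so it is safe to run before deciding what to download or install.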
Testing
If the following script runs and returns normally, the dependencies are installed correctly.
~/go/src/ceph (test ✗) stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'
Pass 1: parsed user script and 490 library scripts using 301984virt/97936res/17212shr/82532data kb, in 210usr/40sys/246real ms.
Pass 2: analyzed script: 2 probes, 1 function, 5 embeds, 0 globals using 548244virt/345876res/19004shr/328792data kb, in 3270usr/450sys/3749real ms.
Pass 3: using cached /root/.systemtap/cache/81/stap_8173f38ff10dbf47e3b530681b423838_2760.c
Pass 4: using cached /root/.systemtap/cache/81/stap_8173f38ff10dbf47e3b530681b423838_2760.ko
Pass 5: starting run.
read performed
Pass 5: run completed in 10usr/110sys/433real ms.
Usage
For now we focus on user-space probing.
SystemTap requires the uprobes module to perform user-space probing. If your Linux kernel is version 3.5 or higher, it already includes uprobes. To verify that the current kernel supports uprobes natively, run the following command:
~/go/src/ceph (test ✗) grep CONFIG_UPROBES /boot/config-`uname -r`
CONFIG_UPROBES=y
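A plain grep for CONFIG_UPROBES also matches a line such as "# CONFIG_UPROBES is not set", so a script that automates this check may want an exact match. A minimal sketch follows; has_uprobes is a hypothetical helper name, and the demo runs against a temporary file rather than a real kernel config.

```shell
#!/usr/bin/env bash
# Prints "yes" if the given kernel config file enables uprobes, otherwise "no".
has_uprobes() {
    local config="$1"
    # -x requires the whole line to match, so commented-out entries are rejected.
    if grep -qx 'CONFIG_UPROBES=y' "$config" 2>/dev/null; then
        echo yes
    else
        echo no
    fi
}

# Demo on a throwaway file so the sketch runs anywhere;
# on a real host pass "/boot/config-$(uname -r)" instead.
tmp="$(mktemp)"
printf '# CONFIG_UPROBES is not set\n' > "$tmp"
has_uprobes "$tmp"    # prints "no"
printf 'CONFIG_UPROBES=y\n' > "$tmp"
has_uprobes "$tmp"    # prints "yes"
rm -f "$tmp"
```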
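To list every probe point available in a shared library, use stap -L, which prints the probe points together with their parameters.
- List the probe points in the Python shared library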
~/go/src/ceph (test ✗) stap -L 'process("/lib64/libpython*.so.*").mark("*")'
process("/usr/lib64/libpython3.6dm.so.1.0").mark("function__entry") $arg1:long $arg2:long $arg3:long
process("/usr/lib64/libpython3.6dm.so.1.0").mark("function__return") $arg1:long $arg2:long $arg3:long
process("/usr/lib64/libpython3.6dm.so.1.0").mark("gc__done") $arg1:long
process("/usr/lib64/libpython3.6dm.so.1.0").mark("gc__start") $arg1:long
process("/usr/lib64/libpython3.6dm.so.1.0").mark("line") $arg1:long $arg2:long $arg3:long
process("/usr/lib64/libpython3.6m.so.1.0").mark("function__entry") $arg1:long $arg2:long $arg3:long
process("/usr/lib64/libpython3.6m.so.1.0").mark("function__return") $arg1:long $arg2:long $arg3:long
process("/usr/lib64/libpython3.6m.so.1.0").mark("gc__done") $arg1:long
process("/usr/lib64/libpython3.6m.so.1.0").mark("gc__start") $arg1:long
process("/usr/lib64/libpython3.6m.so.1.0").mark("line") $arg1:long $arg2:long $arg3:long
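The listing above repeats the same marks for each library variant. As a rough sketch, the unique mark names can be pulled out of the stap -L output with sed; list_marks is a hypothetical helper name, and the sample input is copied from the listing above.

```shell
#!/usr/bin/env bash
# Reads `stap -L ... .mark("*")` output on stdin and prints each mark name once.
list_marks() {
    sed -n 's/.*\.mark("\([^"]*\)").*/\1/p' | LC_ALL=C sort -u
}

list_marks <<'EOF'
process("/usr/lib64/libpython3.6dm.so.1.0").mark("function__entry") $arg1:long $arg2:long $arg3:long
process("/usr/lib64/libpython3.6dm.so.1.0").mark("gc__start") $arg1:long
process("/usr/lib64/libpython3.6m.so.1.0").mark("function__entry") $arg1:long $arg2:long $arg3:long
process("/usr/lib64/libpython3.6m.so.1.0").mark("line") $arg1:long $arg2:long $arg3:long
EOF
# prints:
# function__entry
# gc__start
# line
```

On a real host the same function can be fed directly, for example by piping the stap -L command above into list_marks.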
- Taking ceph-osd as an example, list its available probe points
~/go/src/ceph (test ✗) stap -L 'process("ceph-osd").function("OSD::*")'
process("/root/go/src/ceph/build/bin/ceph-osd").function("OSD::OSD@/root/go/src/ceph/src/osd/OSD.cc:2285") $this:class OSD* const $cct_:class CephContext* $store_:class ObjectStore* $id:int $internal_messenger:class Messenger* $external_messenger:class Messenger* $hb_client_front:class Messenger* $hb_client_back:class Messenger* $hb_front_serverm:class Messenger* $hb_back_serverm:class Messenger* $osdc_messenger:class Messenger* $mc:class MonClient* $dev:string const& $jdev:string const& $poolctx:class io_context_pool&
process("/root/go/src/ceph/build/bin/ceph-osd").function("OSD::_add_heartbeat_peer@/root/go/src/ceph/src/osd/OSD.cc:5226") $this:class OSD* const $p:int $hi:struct HeartbeatInfo* $i:iterator $__PRETTY_FUNCTION__:char const[] const
process("/root/go/src/ceph/build/bin/ceph-osd").function("OSD::_collect_metadata@/root/go/src/ceph/src/osd/OSD.cc:6707") $this:class OSD* const $pm:class map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >* $osdspec_affinity:string $r:int $ceph_version_when_created:string $created_at:string $devnames:class set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > $errs:class map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > $__func__:char const[] const
process("/root/go/src/ceph/build/bin/ceph-osd").function("OSD::_committed_osd_maps@/root/go/src/ceph/src/osd/OSD.cc:8375") $this:class OSD* const $first:epoch_t $last:epoch_t $m:class MOSDMap* $__func__:char const[] const $l:class lock_guard<ceph::mutex_debug_detail::mutex_debug_impl<false> > $__PRETTY_FUNCTION__:char const[] const $do_shutdown:bool $do_restart:bool $network_error:bool $osdmap:OSDMapRef $_bind_epoch:epoch_t
process("/root/go/src/ceph/build/bin/ceph-osd").function("OSD::_dispatch@/root/go/src/ceph/src/osd/OSD.cc:7405") $this:class OSD* const $m:class Message* $__PRETTY_FUNCTION__:char const[] const
process("/root/go/src/ceph/build/bin/ceph-osd").function("OSD::_finish_splits@/root/go/src/ceph/src/osd/OSD.cc:8718") $this:class OSD* const $pgs:class set<boost::intrusive_ptr<PG>, std::less<boost::intrusive_ptr<PG> >, std::allocator<boost::intrusive_ptr<PG> > >& $__func__:char const[] const
process("/root/go/src/ceph/build/bin/ceph-osd").function("OSD::_get_pgids@/root/go/src/ceph/src/osd/OSD.cc:4747") $this:class OSD* const $v:class vector<spg_t, std::allocator<spg_t> >*
...
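Because each line of this output carries the full parameter list, it is hard to scan. One possible way to reduce it to the function name and source location is sketched below; list_functions is a hypothetical helper name, and the sample line is taken from the output above.

```shell
#!/usr/bin/env bash
# Reads `stap -L ... .function("...")` output on stdin and prints
# "<function> <file>:<line>" for every probe point, dropping the parameters.
list_functions() {
    sed -n 's/.*\.function("\([^@"]*\)@\([^"]*\)").*/\1 \2/p'
}

list_functions <<'EOF'
process("/root/go/src/ceph/build/bin/ceph-osd").function("OSD::_dispatch@/root/go/src/ceph/src/osd/OSD.cc:7405") $this:class OSD* const $m:class Message* $__PRETTY_FUNCTION__:char const[] const
EOF
# prints: OSD::_dispatch /root/go/src/ceph/src/osd/OSD.cc:7405
```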
- For a client that writes data through librados, run ldd to see which shared libraries it links against
~/go/src/ceph/build (test ✗) ldd rados_write
linux-vdso.so.1 (0x00007ffced1e5000)
librados.so.2 => /root/go/src/ceph/build/lib/librados.so.2 (0x00007f0cc7900000)
libc.so.6 => /lib64/libc.so.6 (0x00007f0cc753b000)
libceph-common.so.2 => /root/go/src/ceph/build/lib/libceph-common.so.2 (0x00007f0cbca5d000)
libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f0cbc80a000)
libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007f0cbc320000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f0cbc11c000)
librt.so.1 => /lib64/librt.so.1 (0x00007f0cbbf14000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f0cbbcfc000)
libfmt.so.6 => /lib64/libfmt.so.6 (0x00007f0cbbab7000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0cbb897000)
libudev.so.1 => /lib64/libudev.so.1 (0x00007f0cbb5fb000)
libz.so.1 => /lib64/libz.so.1 (0x00007f0cbb3e3000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f0cbb04e000)
libm.so.6 => /lib64/libm.so.6 (0x00007f0cbaccc000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f0cbaab4000)
/lib64/ld-linux-x86-64.so.2 (0x00007f0cc7e79000)
libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f0cba8ac000)
libmount.so.1 => /lib64/libmount.so.1 (0x00007f0cba652000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f0cba427000)
libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f0cba1a3000)
You can then list the function probe points in the shared libraries you use:
stap -L 'process("/root/go/src/ceph/build/lib/libceph-common.so*").function("*")'
stap -L 'process("/root/go/src/ceph/build/lib/librados.so*").function("*")'
In the same way, you can list every function probe point in the OSD:
stap -L 'process("/root/go/src/ceph/build/bin/ceph-osd").function("*")'
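To connect the ldd step with the stap -L queries above, the libraries that come from the local build tree can be filtered out of the ldd output. The sketch below assumes the usual "name => path (address)" layout shown above; libs_under is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Reads ldd output on stdin and prints the resolved paths that start with the given prefix.
libs_under() {
    local prefix="$1"
    awk -v prefix="$prefix" '$2 == "=>" && index($3, prefix) == 1 { print $3 }'
}

libs_under /root/go/src/ceph/build/lib <<'EOF'
librados.so.2 => /root/go/src/ceph/build/lib/librados.so.2 (0x00007f0cc7900000)
libc.so.6 => /lib64/libc.so.6 (0x00007f0cc753b000)
libceph-common.so.2 => /root/go/src/ceph/build/lib/libceph-common.so.2 (0x00007f0cbca5d000)
EOF
# prints:
# /root/go/src/ceph/build/lib/librados.so.2
# /root/go/src/ceph/build/lib/libceph-common.so.2
```

Each printed path can then be used as the process("...") argument of a stap -L query.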
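Tracing the function calls made while rados writes data
The client-callgraph.stp script below is based on the official Call Graph Tracing example. It is deliberately simple and only prints the functions involved in sending and receiving messages; see the official examples for more elaborate scripts.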
#! /usr/bin/env stap

# Print one trace line. thread_indent() supplies the timestamp, execname(tid)
# and indentation; entry_p > 0 marks a function entry, entry_p < 0 a return.
function trace(entry_p, extra) {
  printf("%s%s%s %s\n",
         thread_indent (entry_p),
         (entry_p>0?"->":"<-"),
         ppfunc (),
         extra)
}

# librados: connection setup
probe process("/root/go/src/ceph/build/lib/librados.so*").function("connect*").call {
  trace(2, $$parms)
}
probe process("/root/go/src/ceph/build/lib/librados.so*").function("connect*").return {
  trace(-2, $$return)
}
# Call and return use the same pattern so the indentation stays balanced.
probe process("/root/go/src/ceph/build/lib/librados.so*").function("set_messenger").call {
  trace(2, $$parms)
}
probe process("/root/go/src/ceph/build/lib/librados.so*").function("set_messenger").return {
  trace(-2, $$return)
}
probe process("/root/go/src/ceph/build/lib/librados.so*").function("add_dispatch*").call {
  trace(2, $$parms)
}
probe process("/root/go/src/ceph/build/lib/librados.so*").function("add_dispatch*").return {
  trace(-2, $$return)
}

# libceph-common: messenger connections and protocol I/O
probe process("/root/go/src/ceph/build/lib/libceph-common.so*").function("AsyncConnection::*").call {
  trace(2, $$parms)
}
probe process("/root/go/src/ceph/build/lib/libceph-common.so*").function("AsyncConnection::*").return {
  trace(-2, $$return)
}
probe process("/root/go/src/ceph/build/lib/libceph-common.so*").function("ProtocolV2::read*").call {
  trace(2, $$parms)
}
probe process("/root/go/src/ceph/build/lib/libceph-common.so*").function("ProtocolV2::read*").return {
  trace(-2, $$return)
}
probe process("/root/go/src/ceph/build/lib/libceph-common.so*").function("ProtocolV2::send_message").call {
  trace(2, $$parms)
}
probe process("/root/go/src/ceph/build/lib/libceph-common.so*").function("ProtocolV2::send_message").return {
  trace(-2, $$return)
}
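Every probe pair in the script has the same shape, so the pairs could also be generated instead of written by hand. The sketch below is one way to do that in shell; emit_probe_pair is a hypothetical helper, and the generated text still needs the trace() function from the script above in the same file. Emitting both probes from a single pattern keeps the call and return sides in step.

```shell
#!/usr/bin/env bash
# Prints a matching .call/.return probe pair for one library and function pattern.
emit_probe_pair() {
    local lib="$1" pattern="$2"
    # \$\$ keeps SystemTap's $$parms/$$return literal instead of expanding the shell PID.
    cat <<EOF
probe process("${lib}").function("${pattern}").call {
  trace(2, \$\$parms)
}
probe process("${lib}").function("${pattern}").return {
  trace(-2, \$\$return)
}
EOF
}

emit_probe_pair '/root/go/src/ceph/build/lib/librados.so*' 'connect*'
emit_probe_pair '/root/go/src/ceph/build/lib/libceph-common.so*' 'ProtocolV2::send_message'
```

Appending this output to a file that starts with the shebang line and the trace() definition yields a script equivalent in structure to the one above.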
Running it produces output like the following (partial):
~/go/src/ceph (test ✗) stap client-callgraph.stp -v > client.calltrace 2>&1
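The output below is filtered to the rados_write thread (tid: 3091414). The first column is the elapsed time in microseconds, as reported by thread_indent(); the second column is execname(tid), followed by the function being entered (->) or left (<-).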
~/go/src/ceph (test ✗) cat client.calltrace
0 rados_write(3091414): ->connect this=0x16ff9b0
4111 rados_write(3091414): ->AsyncConnection::AsyncConnection this=0x1704110 cct=0x169e980 m=0x17021e0 q=0x1702570 w=0x1703560 m2=0x1 local=0x1
4259 rados_write(3091414): <-AsyncConnection::AsyncConnection
8137 rados_write(3091414): ->AsyncConnection::AsyncConnection this=0x17aca40 cct=0x169e980 m=0x17021e0 q=0x1702570 w=0x1707e00 m2=0x1 local=0x0
8196 rados_write(3091414): <-AsyncConnection::AsyncConnection
8205 rados_write(3091414): ->AsyncConnection::connect this=0x17aca40 addrs=0x7ffd575e4210 type=0x1 target=0x7ffd575e4150
8235 rados_write(3091414): ->AsyncConnection::_connect this=0x17aca40
8264 rados_write(3091414): <-AsyncConnection::_connect
8267 rados_write(3091414): <-AsyncConnection::connect
8322 rados_write(3091414): ->AsyncConnection::AsyncConnection this=0x1706f90 cct=0x169e980 m=0x17021e0 q=0x1702570 w=0x1708f20 m2=0x1 local=0x0
8345 rados_write(3091414): <-AsyncConnection::AsyncConnection
8350 rados_write(3091414): ->AsyncConnection::connect this=0x1706f90 addrs=0x7ffd575e4210 type=0x1 target=0x7ffd575e4150
8362 rados_write(3091414): ->AsyncConnection::_connect this=0x1706f90
8383 rados_write(3091414): <-AsyncConnection::_connect
8385 rados_write(3091414): <-AsyncConnection::connect
8418 rados_write(3091414): ->AsyncConnection::AsyncConnection this=0x17b9450 cct=0x169e980 m=0x17021e0 q=0x1702570 w=0x1703560 m2=0x1 local=0x0
8437 rados_write(3091414): <-AsyncConnection::AsyncConnection
8441 rados_write(3091414): ->AsyncConnection::connect this=0x17b9450 addrs=0x7ffd575e4210 type=0x1 target=0x7ffd575e4150
8464 rados_write(3091414): ->AsyncConnection::_connect this=0x17b9450
8481 rados_write(3091414): <-AsyncConnection::_connect
8493 rados_write(3091414): <-AsyncConnection::connect
8539 rados_write(3091414): ->AsyncConnection::send_message this=0x17b9450 m=0x16cb5c0
8562 rados_write(3091414): ->ProtocolV2::send_message this=0x17bb870 m=0x16cb5c0
8576 rados_write(3091414): <-ProtocolV2::send_message
8579 rados_write(3091414): <-AsyncConnection::send_message return=0x0
8595 rados_write(3091414): ->AsyncConnection::send_message this=0x1706f90 m=0x1690ba0
8625 rados_write(3091414): ->ProtocolV2::send_message this=0x17073a0 m=0x1690ba0
8845 rados_write(3091414): <-ProtocolV2::send_message
8848 rados_write(3091414): <-AsyncConnection::send_message return=0x0
8868 rados_write(3091414): ->AsyncConnection::send_message this=0x17aca40 m=0x16959f0
8878 rados_write(3091414): ->ProtocolV2::send_message this=0x1706aa0 m=0x16959f0
8897 rados_write(3091414): <-ProtocolV2::send_message
8900 rados_write(3091414): <-AsyncConnection::send_message return=0x0
20265 rados_write(3091414): ->AsyncConnection::mark_down this=0x17b9450
20317 rados_write(3091414): ->AsyncConnection::_stop this=0x17b9450
20340 rados_write(3091414): ->AsyncConnection::unregister this=0x7ffd575e6b60
20345 rados_write(3091414): <-AsyncConnection::unregister
20379 rados_write(3091414): <-AsyncConnection::_stop
20390 rados_write(3091414): <-AsyncConnection::mark_down
20915 rados_write(3091414): ->AsyncConnection::stop this=0x17b9450 queue_reset=0x1
20929 rados_write(3091414): <-AsyncConnection::stop
20935 rados_write(3091414): ->AsyncConnection::stop this=0x17aca40 queue_reset=0x1
20943 rados_write(3091414): <-AsyncConnection::stop
20948 rados_write(3091414): ->AsyncConnection::stop this=0x1706f90 queue_reset=0x1
20956 rados_write(3091414): <-AsyncConnection::stop
20970 rados_write(3091414): ->AsyncConnection::get_perf_counter this=0x1706f90
20989 rados_write(3091414): <-AsyncConnection::get_perf_counter return=0x16ab700
20996 rados_write(3091414): ->AsyncConnection::get_perf_counter this=0x17aca40
21003 rados_write(3091414): <-AsyncConnection::get_perf_counter return=0x16ac350
21008 rados_write(3091414): ->AsyncConnection::get_perf_counter this=0x17b9450
21014 rados_write(3091414): <-AsyncConnection::get_perf_counter return=0x16b96d0
21022 rados_write(3091414): ->AsyncConnection::~AsyncConnection this=0x17b9450
21031 rados_write(3091414): ->AsyncConnection::~AsyncConnection this=0x17b9450
21078 rados_write(3091414): <-AsyncConnection::~AsyncConnection
21082 rados_write(3091414): <-AsyncConnection::~AsyncConnection
21088 rados_write(3091414): ->AsyncConnection::~AsyncConnection this=0x17aca40
21097 rados_write(3091414): ->AsyncConnection::~AsyncConnection this=0x17aca40
21123 rados_write(3091414): <-AsyncConnection::~AsyncConnection
21126 rados_write(3091414): <-AsyncConnection::~AsyncConnection
21131 rados_write(3091414): ->AsyncConnection::~AsyncConnection this=0x1706f90
21139 rados_write(3091414): ->AsyncConnection::~AsyncConnection this=0x1706f90
21166 rados_write(3091414): <-AsyncConnection::~AsyncConnection
21169 rados_write(3091414): <-AsyncConnection::~AsyncConnection
21183 rados_write(3091414): ->AsyncConnection::mark_down this=0x1704110
21224 rados_write(3091414): ->AsyncConnection::_stop this=0x1704110
21240 rados_write(3091414): ->AsyncConnection::unregister this=0x7ffd575e6b50
21244 rados_write(3091414): <-AsyncConnection::unregister
21275 rados_write(3091414): <-AsyncConnection::_stop
21278 rados_write(3091414): <-AsyncConnection::mark_down
21919 rados_write(3091414): ->AsyncConnection::get_perf_counter this=0x1704110
21931 rados_write(3091414): <-AsyncConnection::get_perf_counter return=0x16b96d0
22057 rados_write(3091414): ->AsyncConnection::~AsyncConnection this=0x1704110
22065 rados_write(3091414): ->AsyncConnection::~AsyncConnection this=0x1704110
22085 rados_write(3091414): <-AsyncConnection::~AsyncConnection
22088 rados_write(3091414): <-AsyncConnection::~AsyncConnection
27678 rados_write(3091414): ->AsyncConnection::AsyncConnection this=0x1704110 cct=0x169e980 m=0x170a5b0 q=0x170a940 w=0x1703560 m2=0x1 local=0x1
27718 rados_write(3091414): <-AsyncConnection::AsyncConnection
28434 rados_write(3091414): ->set_messenger this=0x16ffaa0 m=0x170a5b0
28447 rados_write(3091414): <-set_messenger
28454 rados_write(3091414): ->set_messenger this=0x1700210 msgr_=0x170a5b0
28461 rados_write(3091414): <-set_messenger
28791 rados_write(3091414): ->add_dispatcher_head this=0x93cc8258fa01ca00 d=0x16ffa10
29074 rados_write(3091414): <-add_dispatcher_head
29081 rados_write(3091414): ->add_dispatcher_tail this=0x170a5b0 d=0x1700210
29087 rados_write(3091414): <-add_dispatcher_tail
29091 rados_write(3091414): ->add_dispatcher_tail this=0x170a5b0 d=0x170d6c8
29094 rados_write(3091414): <-add_dispatcher_tail
33512 rados_write(3091414): ->AsyncConnection::AsyncConnection this=0x1706f90 cct=0x169e980 m=0x170a5b0 q=0x170a940 w=0x1707e00 m2=0x1 local=0x0
33569 rados_write(3091414): <-AsyncConnection::AsyncConnection
33579 rados_write(3091414): ->AsyncConnection::connect this=0x1706f90 addrs=0x7ffd575e4350 type=0x1 target=0x7ffd575e4290
33601 rados_write(3091414): ->AsyncConnection::_connect this=0x1706f90
33629 rados_write(3091414): <-AsyncConnection::_connect
33633 rados_write(3091414): <-AsyncConnection::connect
33691 rados_write(3091414): ->AsyncConnection::AsyncConnection this=0x17aca40 cct=0x169e980 m=0x170a5b0 q=0x170a940 w=0x1708f20 m2=0x1 local=0x0
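Since every entry line has a matching return line with its own timestamp, the time spent inside each call can be estimated by pairing them. The following is a minimal sketch using awk; call_durations is a hypothetical helper, it assumes the three-column layout shown above, and it reports differences in the same unit as the first column. Entries whose return line is missing from the capture are skipped.

```shell
#!/usr/bin/env bash
# Reads call-graph trace lines on stdin and prints "<function> <elapsed>" for each
# completed call, using one stack of open calls per execname(tid).
call_durations() {
    awk '
    {
        ts = $1; who = $2
        dir = substr($3, 1, 2); fn = substr($3, 3)
        if (dir == "->") {
            depth[who]++
            name[who, depth[who]]  = fn
            start[who, depth[who]] = ts
        } else if (dir == "<-") {
            # Discard open entries that never returned in this capture.
            while (depth[who] > 0 && name[who, depth[who]] != fn)
                depth[who]--
            if (depth[who] > 0) {
                printf "%s %d\n", fn, ts - start[who, depth[who]]
                depth[who]--
            }
        }
    }'
}

call_durations <<'EOF'
8205 rados_write(3091414): ->AsyncConnection::connect this=0x17aca40 addrs=0x7ffd575e4210 type=0x1 target=0x7ffd575e4150
8235 rados_write(3091414): ->AsyncConnection::_connect this=0x17aca40
8264 rados_write(3091414): <-AsyncConnection::_connect
8267 rados_write(3091414): <-AsyncConnection::connect
EOF
# prints:
# AsyncConnection::_connect 29
# AsyncConnection::connect 62
```

Lines that are not trace lines, such as the Pass messages written by stap -v, are ignored because their third field starts with neither -> nor <-.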
References
Chapter 2. Using SystemTap
kojihub.stream.centos.org
Chapter 4. User-space Probing