Kdump

Author: 偷油考拉 | Published: 2021-08-19 14:12

    References

    Red Hat RHEL 6 - The kdump Crash Recovery Service
    Red Hat RHEL 7 - Kernel crash dump guide
    Ubuntu - Kernel Crash Dump
    SUSE - Kexec and Kdump
    NXP - kdump/kexec User Manual
    Kernel documentation - kdump
    Wikipedia - kdump (Linux)

    Capturing a memory dump on CentOS 7

    1. Install kdump
    yum install kexec-tools
    
    2. Set the kernel command-line parameters

    Edit /etc/default/grub and add crashkernel=auto to GRUB_CMDLINE_LINUX. Two example values, for volume groups named rhel and centos:

    GRUB_CMDLINE_LINUX="rd.lvm.lv=rhel/swap crashkernel=auto rd.lvm.lv=rhel/root rhgb quiet"
    GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet"
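
The edit to /etc/default/grub can also be scripted. The following is a minimal sketch, not the method used in the original steps: `add_crashkernel` is a hypothetical helper name, and it assumes the GRUB_CMDLINE_LINUX value is wrapped in double quotes as in the examples above. It takes the file path as an argument so it can be tried on a copy first.

```shell
#!/bin/sh
# Sketch: add crashkernel=auto to GRUB_CMDLINE_LINUX unless a crashkernel=
# parameter is already present. add_crashkernel is a hypothetical helper.
add_crashkernel() {
    grub_file="$1"

    # Nothing to do when the parameter is already configured.
    if grep -q '^GRUB_CMDLINE_LINUX=.*crashkernel=' "$grub_file"; then
        return 0
    fi

    # Keep a backup, then rewrite the file from the backup, inserting the
    # parameter right after the opening double quote.
    cp "$grub_file" "$grub_file.bak" &&
    sed 's/^GRUB_CMDLINE_LINUX="/&crashkernel=auto /' "$grub_file.bak" > "$grub_file"
}
```

A possible call would be `add_crashkernel /etc/default/grub`, run as root. The function leaves the file untouched on a second run, and grub.cfg still has to be regenerated afterwards.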
    

    Generate a new grub.cfg file

    [root@prod-proxy grub2]# grub2-mkconfig -o /boot/grub2/grub.cfg
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-3.10.0-693.el7.x86_64
    Found initrd image: /boot/initramfs-3.10.0-693.el7.x86_64.img
    Found linux image: /boot/vmlinuz-0-rescue-f45cdfd78e8d45b4b2ae3e0154762931
    Found initrd image: /boot/initramfs-0-rescue-f45cdfd78e8d45b4b2ae3e0154762931.img
    done
    
    
    [root@prod-proxy grub2]# diff grub.cfg grub.cfg.bak   --suppress-common-lines
    100c100
    <       linux16 /vmlinuz-3.10.0-693.el7.x86_64 root=/dev/mapper/centos-root ro crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet
    ---
    >       linux16 /vmlinuz-3.10.0-693.el7.x86_64 root=/dev/mapper/centos-root ro rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet LANG=zh_CN.UTF-8
    114c114
    <       linux16 /vmlinuz-0-rescue-f45cdfd78e8d45b4b2ae3e0154762931 root=/dev/mapper/centos-root ro crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet
    ---
    >       linux16 /vmlinuz-0-rescue-f45cdfd78e8d45b4b2ae3e0154762931 root=/dev/mapper/centos-root ro rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet
    117c117
    < if [ "x$default" = 'CentOS Linux (3.10.0-693.el7.x86_64) 7 (Core)' ]; then default='Advanced options for CentOS Linux>CentOS Linux (3.10.0-693.el7.x86_64) 7 (Core)'; fi;
    ---
    >
    

    Reboot so that the new kernel command line takes effect.
    Check that the kernel was started with crashkernel:

    [sysadmin@prod-proxy ~]$ cat /proc/cmdline
    BOOT_IMAGE=/vmlinuz-3.10.0-693.el7.x86_64 root=/dev/mapper/centos-root ro crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet
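
For use in a script, the same check can return just the parameter value. This is a small sketch; `crashkernel_value` is a hypothetical helper name, and the optional file argument exists only so the function can be tried against a saved command line.

```shell
#!/bin/sh
# Sketch: print the value of crashkernel= from a kernel command line.
# Reads /proc/cmdline unless another file is given. Prints nothing when the
# parameter is absent.
crashkernel_value() {
    cmdline_file="${1:-/proc/cmdline}"

    # Put each parameter on its own line, then strip the crashkernel= prefix
    # from the matching one.
    tr -s ' ' '\n' < "$cmdline_file" | sed -n 's/^crashkernel=//p' | head -n 1
}
```

On the system shown above, `crashkernel_value` would print `auto`.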
    

    Start the service

    systemctl enable kdump
    systemctl start kdump
    

    Check that memory has been reserved for the crash kernel

    [sysadmin@prod-proxy ~]$ cat /proc/iomem |grep Crash
      2b000000-350fffff : Crash kernel
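
The address range can be turned into a size to see how much memory crashkernel=auto actually reserved. A minimal sketch, assuming the line format shown above; `crash_reservation_mib` is a hypothetical helper name, and if several "Crash kernel" lines exist only the first is reported.

```shell
#!/bin/sh
# Sketch: print the size in MiB of the "Crash kernel" region in /proc/iomem.
# Returns non-zero when no such region is listed.
crash_reservation_mib() {
    iomem_file="${1:-/proc/iomem}"

    # A matching line looks like "  2b000000-350fffff : Crash kernel".
    range=$(sed -n 's/^ *\([0-9a-f]*\)-\([0-9a-f]*\) : Crash kernel$/\1 \2/p' "$iomem_file" | head -n 1)
    [ -n "$range" ] || return 1

    start=${range% *}
    end=${range#* }

    # The range is inclusive, hence the +1. 1048576 bytes = 1 MiB.
    echo $(( (0x$end - 0x$start + 1) / 1048576 ))
}
```

For the range 2b000000-350fffff shown above this prints 161, that is, 161 MiB reserved.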
    
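    3. Dump memory

    Warning
    Testing the crash dump mechanism reboots the system. On a heavily loaded system this can cause data loss. If you do decide to test, make sure the system is idle or under light load.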

    echo 1 > /proc/sys/kernel/sysrq
    echo c > /proc/sysrq-trigger

    Normally, this mechanism is triggered automatically when the kernel crashes.
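
Before forcing a crash it is worth confirming that a crash kernel is really loaded, since otherwise the system simply panics without writing a dump. The sketch below reads /sys/kernel/kexec_crash_loaded, which contains 1 when a crash kernel is loaded. `crash_kernel_loaded` is a hypothetical helper name, and the optional argument is only there so the check can be exercised against a test file.

```shell
#!/bin/sh
# Sketch: succeed only if the kernel reports that a crash kernel is loaded.
crash_kernel_loaded() {
    flag_file="${1:-/sys/kernel/kexec_crash_loaded}"

    # The file holds a single digit: 1 = loaded, 0 = not loaded.
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}

# Example guard (commented out because it crashes the machine):
#   crash_kernel_loaded && echo c > /proc/sysrq-trigger
```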

    Analyzing the dump with the crash utility

    1. Install the tools
      Install crash

      yum install crash
      

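      Besides crash, the kernel-debuginfo package is also required. As root, install it with debuginfo-install.

      # Install debuginfo-install (part of yum-utils)
      yum install yum-utils -y
      # Install kernel-debuginfo. On this system three packages were installed: kernel-debuginfo, yum-plugin-auto-update-debug-info, and kernel-debuginfo-common-x86_64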
      debuginfo-install kernel
      

      The /usr/lib/debug/lib/modules/ directory appears only after this installation completes; it is needed in a later step.

    2. Confirm the dump files

      [root@localhost 127.0.0.1-2021-08-18-05:30:37]# ls
      vmcore  vmcore-dmesg.txt
      
    3. Confirm the kernel version

      [root@localhost 127.0.0.1-2021-08-18-05:30:37]# uname -r
      3.10.0-862.el7.x86_64
      [root@localhost 127.0.0.1-2021-08-18-05:30:37]# ls /usr/lib/debug/lib/modules/
      3.10.0-862.el7.x86_64
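
`uname -r` reports the running kernel, which may differ from the kernel that produced the dump if the system was updated in between. As a hedged sketch, the release can also be read from vmcore-dmesg.txt, assuming the "Linux version" boot banner is still present in the saved log. Both function names are hypothetical helpers.

```shell
#!/bin/sh
# Sketch: print the kernel release recorded in a vmcore-dmesg.txt file.
vmcore_kernel_release() {
    dmesg_file="${1:-vmcore-dmesg.txt}"

    # The boot banner looks like:
    # "[    0.000000] Linux version 3.10.0-862.el7.x86_64 (builder@...) ..."
    sed -n 's/^.*Linux version \([^ ]*\) .*$/\1/p' "$dmesg_file" | head -n 1
}

# Succeed only if a debug vmlinux for the given release is installed.
has_debuginfo() {
    release="$1"
    debug_root="${2:-/usr/lib/debug/lib/modules}"

    [ -f "$debug_root/$release/vmlinux" ]
}
```

Run from the dump directory, `has_debuginfo "$(vmcore_kernel_release)"` succeeds when the matching kernel-debuginfo package is in place.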
      
    4. Run the crash utility
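
The original post leaves this step empty. As a sketch only: crash is normally started with the debug vmlinux followed by the vmcore file. The helper below merely prints that command line for a given kernel release and dump path, so it can be reviewed before running. `crash_command` is a hypothetical name, and the vmlinux location assumes the kernel-debuginfo layout confirmed in step 3.

```shell
#!/bin/sh
# Sketch: print the crash invocation for a kernel release and a vmcore file.
crash_command() {
    release="$1"
    vmcore="$2"

    # kernel-debuginfo places the uncompressed kernel image here.
    vmlinux="/usr/lib/debug/lib/modules/$release/vmlinux"

    printf 'crash %s %s\n' "$vmlinux" "$vmcore"
}
```

For the dump shown above, `crash_command 3.10.0-862.el7.x86_64 vmcore` prints `crash /usr/lib/debug/lib/modules/3.10.0-862.el7.x86_64/vmlinux vmcore`. Running that command in the dump directory should open the interactive crash prompt.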

      
      

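    Appendix 1

    KDUMP is enabled by default during CentOS 7 installation, as the installer screenshots show.

    [Figures: two screenshots from the CentOS 7 installer (images not included)]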

    Appendix 2

    KDUMP enabled vs. KDUMP disabled

    1. Check /proc/cmdline for a crashkernel parameter
    #Disable
    [root@localhost ~]# cat /proc/cmdline
    BOOT_IMAGE=/vmlinuz-3.10.0-862.el7.x86_64 root=/dev/mapper/centos-root ro rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet LANG=en_US.UTF-8
    
    #Enable
    [root@localhost ~]# cat /proc/cmdline
    BOOT_IMAGE=/vmlinuz-3.10.0-862.el7.x86_64 root=/dev/mapper/centos-root ro crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet LANG=en_US.UTF-8
    
    
    2. Check /proc/iomem to see whether memory was reserved for the crash kernel
    #Disable
    [root@localhost ~]# cat /proc/iomem |grep Crash
    
    #Enable
    [root@localhost ~]# cat /proc/iomem |grep Crash
      2b000000-350fffff : Crash kernel
    
    3. Install the kdump packages, including the crash and kexec components
    yum install kexec-tools crash -y
    

    /usr/lib/systemd/system/kdump.service is installed by the kexec-tools package.

    Install the graphical configuration tool

    yum install system-config-kdump
    
    4. Start the kdump service (with systemctl on CentOS 7; on older releases with the service command or the init script under /etc)
    #Disable
    [root@localhost ~]# systemctl list-unit-files |grep kdump
    kdump.service                                 disabled
    [root@localhost ~]# 
    [root@localhost ~]# systemctl is-active kdump
    unknown
    
    #Enable
    [root@localhost ~]# systemctl list-unit-files |grep kdump
    kdump.service                                 enabled 
    [root@localhost ~]# 
    [root@localhost ~]# systemctl is-active kdump
    active
    
    

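    Appendix 3

    Testing the kdump configuration
    Enable kdump, reboot the system, and confirm that the service is running. Then enter the following commands at a shell prompt: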
    echo 1 > /proc/sys/kernel/sysrq
    echo c > /proc/sysrq-trigger

    [root@localhost ~]# free 
                  total        used        free      shared  buff/cache   available
    Mem:        1883092      125272     1372480        9012      385340     1574260
    Swap:       1679356           0     1679356
    [root@localhost ~]# 
    [root@localhost ~]# echo 1 > /proc/sys/kernel/sysrq
    [root@localhost ~]# echo c > /proc/sysrq-trigger
    Socket error Event: 32 Error: 10053.
    Connection closing...Socket close.
    
    Connection closed by foreign host.
    
    Disconnected from remote host(t2) at 17:30:53.
    
    Type `help' to learn how to use Xshell prompt.
    
    

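    This forces the kernel to crash and creates an address-YYYY-MM-DD-HH:MM:SS/vmcore file, by default under /var/crash/.

    NOTE
    Besides validating the configuration, this procedure can be used to measure how long a crash dump takes under a representative test load.

    After a short while, reconnect to the server and look in /var/crash for the dump files.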

    [root@localhost ~]# date
    Wed Aug 18 05:32:15 EDT 2021
    
    [root@localhost ~]# ls /var/crash
    127.0.0.1-2021-08-18-05:30:37
    
    [root@localhost 127.0.0.1-2021-08-18-05:30:37]# ll -h
    total 39M
    -rw-------. 1 root root  39M Aug 18 05:30 vmcore
    -rw-r--r--. 1 root root 107K Aug 18 05:30 vmcore-dmesg.txt
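
When several dumps have accumulated, the most recent one can be located by modification time. A minimal sketch: `latest_vmcore` is a hypothetical helper name, and it assumes the default layout of one subdirectory per crash, each containing a vmcore file.

```shell
#!/bin/sh
# Sketch: print the path of the most recently written vmcore under a crash
# directory (default /var/crash). Prints nothing when no dump exists.
latest_vmcore() {
    crash_dir="${1:-/var/crash}"

    # ls -t sorts newest first; errors from an unmatched glob are discarded.
    ls -1t "$crash_dir"/*/vmcore 2>/dev/null | head -n 1
}
```

On the system above, `latest_vmcore` would print /var/crash/127.0.0.1-2021-08-18-05:30:37/vmcore.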
    
    
