Reading the Mono Source: The Crash Mechanism

Author: 骆驼骑士 | Published: 2020-05-23 15:17

    # Introduction

    This article reads through and studies the parts of the Mono source code that handle crash signals. The source files involved are:

    • mini.c

    • mini-posix.c

    • mini-exceptions.c

    • exceptions-arm.c

    Install Signal Handler

    add_signal_handler

    Every signal handler is registered through add_signal_handler.
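
To make the registration step concrete, here is a minimal sketch of what an add_signal_handler-style wrapper could look like on POSIX. It assumes the wrapper is built on sigaction with SA_SIGINFO; every name starting with sketch_ is hypothetical and this is not Mono's actual implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical names: this only illustrates the general pattern. */
typedef void (*sketch_sigaction_fn) (int signo, siginfo_t *info, void *context);

static volatile sig_atomic_t sketch_last_signal = 0;

/* Example handler: only records which signal arrived. */
void
sketch_record_signal (int signo, siginfo_t *info, void *context)
{
    (void) info;
    (void) context;
    sketch_last_signal = signo;
}

/* Register `handler` for `signo`; the old disposition is returned in `previous`. */
void
sketch_add_signal_handler (int signo, sketch_sigaction_fn handler, struct sigaction *previous)
{
    struct sigaction sa;

    memset (&sa, 0, sizeof (sa));
    sa.sa_sigaction = handler;
    sigemptyset (&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;   /* three-argument handler: siginfo and context are passed */

    int rc = sigaction (signo, &sa, previous);
    assert (rc == 0);
    (void) rc;
}
```

Calling sketch_add_signal_handler (SIGUSR1, sketch_record_signal, &previous) followed by raise (SIGUSR1) leaves SIGUSR1 in sketch_last_signal. Returning the previous disposition is what makes handler chaining possible later.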

    The functions in the Mono code base that call add_signal_handler to register signals are:

    interp.c

    • mono_runtime_install_handlers

    mini-posix.c

    • mono_runtime_posix_install_handlers
    • mono_runtime_setup_stat_profiler (SIGPROF)

    mono_runtime_posix_install_handlers

    The focus here is on the signal registration under the mini directory.

    Signals caught by Mono:

    • SIGINT (if handle sigint)
    • SIGFPE
    • SIGQUIT
    • SIGILL
    • SIGBUS
    • SIGUSR2 (if mono_jit_trace_calls != NULL)
    • SIGUSR1 -> mono_thread_get_abort_signal ()
    • SIGABRT
    • SIGSEGV
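
    Constant  Meaning
    SIGSEGV   Invalid memory access (segmentation fault): the process tried to access memory not allocated to it, or to write to an address it has no write permission for.
    SIGINT    External interrupt, usually initiated by the user. It is sent when the user types the INTR character (usually Ctrl-C) and tells the foreground process group to terminate.
    SIGILL    Invalid program image, such as an illegal instruction. Usually caused by a corrupted executable or by an attempt to execute the data segment; a stack overflow can also produce this signal.
    SIGABRT   Abnormal termination condition, for example one initiated by abort().
    SIGFPE    Sent on a fatal arithmetic error. This covers not only floating-point errors but all other arithmetic errors as well, such as overflow and division by zero.
    SIGQUIT   Similar to SIGINT, but controlled by the QUIT character (usually Ctrl-\). A process that exits on SIGQUIT produces a core file, so in that sense it resembles a program-error signal.
    SIGBUS    Invalid address, including memory alignment errors, for example accessing a four-byte integer at an address that is not a multiple of 4. It differs from SIGSEGV in that SIGSEGV is triggered by an illegal access to a valid storage address (such as storage that does not belong to the process, or read-only storage).
    SIGUSR1   Reserved for user-defined purposes.
    SIGUSR2   Reserved for user-defined purposes.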

    The code that registers the signals:

    void
    mono_runtime_posix_install_handlers (void)
    {
        sigset_t signal_set;

        if (mini_get_debug_options ()->handle_sigint)
            add_signal_handler (SIGINT, mono_sigint_signal_handler);

        add_signal_handler (SIGFPE, mono_sigfpe_signal_handler);
        add_signal_handler (SIGQUIT, sigquit_signal_handler);
        add_signal_handler (SIGILL, mono_sigill_signal_handler);
        add_signal_handler (SIGBUS, mono_sigsegv_signal_handler);
        if (mono_jit_trace_calls != NULL)
            add_signal_handler (SIGUSR2, sigusr2_signal_handler);

        add_signal_handler (mono_thread_get_abort_signal (), sigusr1_signal_handler);
        /* it seems to have become a common bug for some programs that run as parents
         * of many processes to block signal delivery for real time signals.
         * We try to detect and work around their breakage here.
         */
        sigemptyset (&signal_set);
        sigaddset (&signal_set, mono_thread_get_abort_signal ());
        sigprocmask (SIG_UNBLOCK, &signal_set, NULL);

        signal (SIGPIPE, SIG_IGN);

    #ifndef MONO_CROSS_COMPILE
        add_signal_handler (SIGABRT, sigabrt_signal_handler);

        /* catch SIGSEGV */
        add_signal_handler (SIGSEGV, mono_sigsegv_signal_handler);
    #endif
    }
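
The sigprocmask call in the function above works around parent processes that leave a signal blocked, since the blocked mask is inherited across fork and exec. A minimal sketch of that unblock step in isolation, with hypothetical helper names (sketch_unblock_signal, sketch_is_blocked) that do not exist in Mono:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: make sure `signo` can be delivered to the calling
 * thread even if the process inherited a mask that blocks it. */
void
sketch_unblock_signal (int signo)
{
    sigset_t set;

    sigemptyset (&set);
    sigaddset (&set, signo);

    int rc = sigprocmask (SIG_UNBLOCK, &set, NULL);
    assert (rc == 0);
    (void) rc;
}

/* Returns 1 if `signo` is currently blocked, 0 otherwise. */
int
sketch_is_blocked (int signo)
{
    sigset_t current;

    sigemptyset (&current);
    /* a NULL new set means "only report the current mask" */
    sigprocmask (SIG_BLOCK, NULL, &current);
    return sigismember (&current, signo) == 1;
}
```

If a signal was blocked beforehand, sketch_is_blocked reports 1 before the call to sketch_unblock_signal and 0 after it. In a multi-threaded program pthread_sigmask would be the safer choice; sigprocmask is used here only because the function above uses it.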
    

    Signal Handler

    All signal handlers are defined with the SIG_HANDLER_SIGNATURE macro:

    mini-posix.c

    • sigabrt_signal_handler
    • sigprof_signal_handler
    • sigquit_signal_handler
    • sigusr1_signal_handler
    • sigusr2_signal_handler

    mini.c

    • mono_sigfpe_signal_handler
    • mono_sigill_signal_handler
    • mono_sigsegv_signal_handler
    • mono_sigint_signal_handler
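
To show why a signature macro is convenient, here is a sketch of how such macros might be defined for the sigaction-style three-argument handler. The SKETCH_ macros are hypothetical stand-ins modelled on the idea; the real definitions in Mono depend on the platform and should be checked in the source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical macros: one place decides the handler prototype, so every
 * handler and every forwarding call stays consistent across platforms. */
#define SKETCH_SIG_HANDLER_SIGNATURE(ftn) ftn (int signo, siginfo_t *info, void *context)
#define SKETCH_SIG_HANDLER_PARAMS signo, info, context
#define SKETCH_GET_CONTEXT void *ctx = context

static volatile sig_atomic_t sketch_seen_context = 0;

void
SKETCH_SIG_HANDLER_SIGNATURE (sketch_sigusr2_handler)
{
    SKETCH_GET_CONTEXT;
    (void) info;
    /* with SA_SIGINFO the third argument points at the interrupted context */
    sketch_seen_context = (signo == SIGUSR2 && ctx != NULL);
}

void
sketch_install_sigusr2 (void)
{
    struct sigaction sa;

    memset (&sa, 0, sizeof (sa));
    sa.sa_sigaction = sketch_sigusr2_handler;
    sigemptyset (&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;

    int rc = sigaction (SIGUSR2, &sa, NULL);
    assert (rc == 0);
    (void) rc;
}
```

A handler that needs to forward the signal can pass SKETCH_SIG_HANDLER_PARAMS to another function without repeating the parameter names, which is how SIG_HANDLER_PARAMS is used with mono_chain_signal in the handlers quoted in this article.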

    SIGINT

    void
    SIG_HANDLER_SIGNATURE (mono_sigint_signal_handler)
    {
        MonoException *exc;
        GET_CONTEXT;

        exc = mono_get_exception_execution_engine ("Interrupted (SIGINT).");

        mono_arch_handle_exception (ctx, exc, FALSE);
    }
    

    mono_arch_handle_exception

    /*
    * This is the function called from the signal handler
    */
    gboolean
    mono_arch_handle_exception (void *ctx, gpointer obj, gboolean test_only)
    {
        MonoContext mctx;
        gboolean result;

        mono_arch_sigctx_to_monoctx (ctx, &mctx);

        result = mono_handle_exception (&mctx, obj, (gpointer)mctx.eip, test_only);
        /* restore the context so that returning from the signal handler will invoke
         * the catch clause
         */
        mono_arch_monoctx_to_sigctx (&mctx, ctx);
        return result;
    }
    

    SEGV

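    If mono_domain_get() returns NULL or jit_tls is missing, the thread can be treated as an unmanaged thread. In that case mono_chain_signal is called so that the registered chained signal handler can deal with the signal. If that handler returns true, Mono simply returns and does nothing further; otherwise Mono calls mono_handle_native_sigsegv, which prints the stack and finally calls abort().
    If the thread is a managed thread, the logic is the same as throwing an exception in C#: mono_handle_exception is called to handle the C# exception.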

    The chained signal handler here is the previous signal handler (saved_handler), which Mono saved in advance when it registered its own handlers.
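
The chaining idea can be sketched on its own: remember the disposition that sigaction hands back, and forward to it from the new handler. This is a simplified illustration with hypothetical sketch_ names and a single saved slot; it is not the code of mono_chain_signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static struct sigaction sketch_saved_handler;   /* previous disposition */
static volatile sig_atomic_t sketch_old_ran = 0;
static volatile sig_atomic_t sketch_new_ran = 0;

/* Stand-in for a handler the host application installed before the runtime. */
void
sketch_old_handler (int signo)
{
    (void) signo;
    sketch_old_ran = 1;
}

/* Forward to the saved handler. Returns 1 if one was called, 0 if there
 * was nothing to chain to (default or ignored disposition). */
int
sketch_chain_signal (int signo, siginfo_t *info, void *context)
{
    if (sketch_saved_handler.sa_flags & SA_SIGINFO) {
        sketch_saved_handler.sa_sigaction (signo, info, context);
        return 1;
    }
    if (sketch_saved_handler.sa_handler == SIG_DFL ||
        sketch_saved_handler.sa_handler == SIG_IGN)
        return 0;
    sketch_saved_handler.sa_handler (signo);
    return 1;
}

void
sketch_new_handler (int signo, siginfo_t *info, void *context)
{
    sketch_new_ran = 1;
    if (sketch_chain_signal (signo, info, context))
        return;
    /* a real runtime would report the crash here */
}

void
sketch_install_chained (int signo)
{
    struct sigaction sa;

    memset (&sa, 0, sizeof (sa));
    sa.sa_sigaction = sketch_new_handler;
    sigemptyset (&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;

    int rc = sigaction (signo, &sa, &sketch_saved_handler);
    assert (rc == 0);
    (void) rc;
}
```

If the application first calls signal (SIGUSR1, sketch_old_handler) and then sketch_install_chained (SIGUSR1), a later raise (SIGUSR1) runs the new handler and, through it, the old one.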

    void
    SIG_HANDLER_SIGNATURE (mono_sigsegv_signal_handler)
    {
        MonoJitInfo *ji;
        MonoJitTlsData *jit_tls = TlsGetValue (mono_jit_tls_id);
        gpointer ip;

        GET_CONTEXT;

    #if defined(MONO_ARCH_SOFT_DEBUG_SUPPORTED) && defined(HAVE_SIG_INFO)
        if (mono_arch_is_single_step_event (info, ctx)) {
            mono_debugger_agent_single_step_event (ctx);
            return;
        } else if (mono_arch_is_breakpoint_event (info, ctx)) {
            mono_debugger_agent_breakpoint_hit (ctx);
            return;
        }
    #endif

    #if !defined(PLATFORM_WIN32) && defined(HAVE_SIG_INFO)
        if (mono_aot_is_pagefault (info->si_addr)) {
            mono_aot_handle_pagefault (info->si_addr);
            return;
        }
    #endif

        /* The thread might not be registered with the runtime */
        if (!mono_domain_get () || !jit_tls) {
            if (mono_chain_signal (SIG_HANDLER_PARAMS))
                return;
            mono_handle_native_sigsegv (SIGSEGV, ctx);
        }

        ip = mono_arch_ip_from_context (ctx);
    #ifdef _WIN64
        /* Sometimes on win64 we get null IP, but the previous frame is a valid managed frame */
        /* So pop and try again */
        if (!ip && ctx)
        {
            MonoContext *context = (MonoContext*)ctx;
            gpointer *sp = context->rsp;
            if (sp)
            {
                ip = context->rip = *sp;
                context->rsp += sizeof(gpointer);
            }
        }
    #endif
        ji = mono_jit_info_table_find (mono_domain_get (), ip);

    #ifdef MONO_ARCH_SIGSEGV_ON_ALTSTACK
        if (mono_handle_soft_stack_ovf (jit_tls, ji, ctx, (guint8*)info->si_addr))
            return;

        /* The hard-guard page has been hit: there is not much we can do anymore
         * Print a hopefully clear message and abort.
         */
        if (jit_tls->stack_size &&
                ABS ((guint8*)info->si_addr - ((guint8*)jit_tls->end_of_stack - jit_tls->stack_size)) < 32768) {
            const char *method;
            /* we don't do much now, but we can warn the user with a useful message */
            fprintf (stderr, "Stack overflow: IP: %p, fault addr: %p\n", mono_arch_ip_from_context (ctx), (gpointer)info->si_addr);
            if (ji && ji->method)
                method = mono_method_full_name (ji->method, TRUE);
            else
                method = "Unmanaged";
            fprintf (stderr, "At %s\n", method);
            _exit (1);
        } else {
            /* The original handler might not like that it is executed on an altstack... */
            if (!ji && mono_chain_signal (SIG_HANDLER_PARAMS))
                return;

            mono_arch_handle_altstack_exception (ctx, info->si_addr, FALSE);
        }
    #else

        if (!ji) {
            if (mono_chain_signal (SIG_HANDLER_PARAMS))
                return;

            mono_handle_native_sigsegv (SIGSEGV, ctx);
        }

        mono_arch_handle_exception (ctx, NULL, FALSE);
    #endif
    }
    

    mono_handle_native_sigsegv

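    A few key points:

    • mono_backtrace_from_context (OS X): converts the signal context into a MonoContext and walks the stack, collecting the PC of every frame
    • backtrace (non-OS X): likewise walks the stack and collects the PC of every frame
    • backtrace_symbols: converts each PC value into a function name (symbol name)

    The stack is then printed to stderr.

    Next, GDB is used to obtain more detailed debugging information, which is also printed to stderr.

    Finally, the handler for the ABRT signal is removed and abort() is called to exit the program.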
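
The non-OS X path described above can be approximated with the backtrace and backtrace_symbols functions from execinfo.h (available in glibc and on macOS, not part of ISO C). The sketch below uses hypothetical function names and leaves out the GDB step; it is not the body of mono_handle_native_sigsegv.

```c
#include <assert.h>
#include <execinfo.h>   /* backtrace, backtrace_symbols */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_MAX_FRAMES 256

/* Print the native stack of the calling thread to `out`.
 * Returns the number of frames that were captured. */
int
sketch_print_native_backtrace (FILE *out)
{
    void *frames [SKETCH_MAX_FRAMES];

    assert (out != NULL);

    int count = backtrace (frames, SKETCH_MAX_FRAMES);      /* one PC per frame */
    char **names = backtrace_symbols (frames, count);       /* PC -> symbol text */

    fprintf (out, "Stacktrace:\n");
    for (int i = 0; i < count; ++i) {
        if (names)
            fprintf (out, "\t%s\n", names [i]);
        else
            fprintf (out, "\t%p\n", frames [i]);
    }
    free (names);   /* backtrace_symbols returns one malloc'ed block */
    return count;
}

/* Last step of the crash path: restore the default SIGABRT disposition so
 * that abort() really terminates the process instead of re-entering a handler. */
void
sketch_abort_after_crash (void)
{
    sketch_print_native_backtrace (stderr);
    signal (SIGABRT, SIG_DFL);
    abort ();
}
```

Note that backtrace_symbols allocates memory, so calling it from a signal handler is not async-signal-safe; a crash handler that does this accepts that risk because the process is about to die anyway.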

    Throw Exception

    mono_arm_throw_exception

    exceptions-arm.c

    The code that throws the exception:

    void
    mono_arm_throw_exception (MonoObject *exc, unsigned long eip, unsigned long esp, gulong *int_regs, gdouble *fp_regs)
    {
        static void (*restore_context) (MonoContext *);
        MonoContext ctx;
        gboolean rethrow = eip & 1;

        if (!restore_context)
            restore_context = mono_get_restore_context ();

        eip &= ~1; /* clear the optional rethrow bit */
        /* adjust eip so that it point into the call instruction */
        eip -= 4;

        /*printf ("stack in throw: %p\n", esp);*/
        MONO_CONTEXT_SET_BP (&ctx, int_regs [ARMREG_FP - 4]);
        MONO_CONTEXT_SET_SP (&ctx, esp);
        MONO_CONTEXT_SET_IP (&ctx, eip);
        memcpy (((guint8*)&ctx.regs) + (4 * 4), int_regs, sizeof (gulong) * 8);
        /* memcpy (&ctx.fregs, fp_regs, sizeof (double) * MONO_SAVED_FREGS); */

        if (mono_object_isinst (exc, mono_defaults.exception_class)) {
            MonoException *mono_ex = (MonoException*)exc;
            if (!rethrow)
                mono_ex->stack_trace = NULL;
        }
        mono_handle_exception (&ctx, exc, (gpointer)(eip + 4), FALSE);
        restore_context (&ctx);
        g_assert_not_reached ();
    }
    

    The function works in three steps: save the context, call mono_handle_exception, then restore the context.
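
One detail in mono_arm_throw_exception is easy to miss: the rethrow flag travels in the lowest bit of eip, which is free because ARM-mode instructions are 4-byte aligned. A small sketch of that tagging trick, using hypothetical helper names and assuming fixed 4-byte instructions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for the decoded value. */
typedef struct {
    uintptr_t call_site;   /* address of the call instruction */
    int rethrow;           /* 1 if the low bit was set */
} SketchThrowInfo;

/* Pack a flag into bit 0 of an aligned return address. */
uintptr_t
sketch_encode_throw_ip (uintptr_t return_address, int rethrow)
{
    assert ((return_address & 1) == 0);   /* aligned, so bit 0 is unused */
    return return_address | (rethrow ? (uintptr_t) 1 : (uintptr_t) 0);
}

/* Undo the packing and step back to the call instruction. */
SketchThrowInfo
sketch_decode_throw_ip (uintptr_t tagged_ip)
{
    SketchThrowInfo info;

    info.rethrow = (int) (tagged_ip & 1);
    tagged_ip &= ~(uintptr_t) 1;          /* clear the optional rethrow bit */
    info.call_site = tagged_ip - 4;       /* one 4-byte instruction back */
    return info;
}
```

Stepping back by 4 matters because the return address points at the instruction after the call; exception clause lookup needs an address that lies inside the protected region that contains the call itself.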

    mono_handle_exception

    mini-exceptions.c

    • mono_handle_exception
      • mono_handle_exception_internal

    MonoContext

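    For cross-platform compatibility, Mono unifies every platform's signal context into the MonoContext struct, which mainly holds register values. On ARM, for example, it stores PC, FP, SP and R0-R15: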

    typedef struct {
        gulong eip;          // pc
        gulong ebp;          // fp
        gulong esp;          // sp
        gulong regs [16];
        double fregs [MONO_SAVED_FREGS];
    } MonoContext;
    

    eip -> sigctx.arm_pc (R15)
    esp -> sigctx.arm_sp (R13)
    ebp -> sigctx.arm_fp (R11)
    regs -> sigctx.arm_r0, sizeof(gulong) * 16 (R0 ~ R15)
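
The mapping above can be sketched as a pair of conversion functions. Both struct types here are hypothetical simplifications: the real kernel signal context has individually named fields (arm_r0, arm_sp, arm_pc and so on) plus extra members, and the real MonoContext also carries floating point registers.

```c
#include <assert.h>
#include <string.h>

enum { SKETCH_ARMREG_FP = 11, SKETCH_ARMREG_SP = 13, SKETCH_ARMREG_PC = 15 };

/* Hypothetical stand-in for the register part of an ARM signal context. */
typedef struct {
    unsigned long arm_r [16];   /* R0 .. R15 */
} SketchArmSigContext;

/* Hypothetical stand-in for the portable context. */
typedef struct {
    unsigned long eip;          /* pc */
    unsigned long ebp;          /* fp */
    unsigned long esp;          /* sp */
    unsigned long regs [16];
} SketchMonoContext;

void
sketch_sigctx_to_monoctx (const SketchArmSigContext *sigctx, SketchMonoContext *mctx)
{
    assert (sigctx != NULL && mctx != NULL);
    mctx->eip = sigctx->arm_r [SKETCH_ARMREG_PC];
    mctx->esp = sigctx->arm_r [SKETCH_ARMREG_SP];
    mctx->ebp = sigctx->arm_r [SKETCH_ARMREG_FP];
    memcpy (mctx->regs, sigctx->arm_r, sizeof (mctx->regs));
}

void
sketch_monoctx_to_sigctx (const SketchMonoContext *mctx, SketchArmSigContext *sigctx)
{
    assert (sigctx != NULL && mctx != NULL);
    memcpy (sigctx->arm_r, mctx->regs, sizeof (sigctx->arm_r));
    /* the dedicated fields are written last, so a handler that changes
     * eip/esp/ebp decides where execution resumes */
    sigctx->arm_r [SKETCH_ARMREG_PC] = mctx->eip;
    sigctx->arm_r [SKETCH_ARMREG_SP] = mctx->esp;
    sigctx->arm_r [SKETCH_ARMREG_FP] = mctx->ebp;
}
```

Writing a modified context back before the signal handler returns is what lets execution continue in a catch clause rather than at the faulting instruction, as the comment in mono_arch_handle_exception explains.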


    http://www.mono-project.com/docs/debug+profile/debug/
    http://www.mono-project.com/docs/advanced/embedding/

    define mono_backtrace
     select-frame 0
     set $i = 0
     while ($i < $arg0)
       set $foo = (char*) mono_pmip ($pc)
       if ($foo)
         printf "#%d %p in %s\n", $i, $pc, $foo
       else
         frame
       end
       up-silently
       set $i = $i + 1
     end
    end

    define mono_stack
     set $mono_thread = mono_thread_current ()
     if ($mono_thread == 0x00)
       printf "No mono thread associated with this thread\n"
     else
       set $ucp = malloc (sizeof (ucontext_t))
       call (void) getcontext ($ucp)
       call (void) mono_print_thread_dump ($ucp)
       call (void) free ($ucp)
     end
    end
    

    mono_chain_signal: calls the original handler
    mono_handle_native_sigsegv: prints the stack and calls abort()
    mono_arch_handle_exception // exceptions-arm.c
    mono_handle_exception_internal // mini-exceptions.c

        if (!ji) {
            if (mono_chain_signal (SIG_HANDLER_PARAMS))
                return;

            mono_handle_native_sigsegv (SIGSEGV, ctx);
        }

        mono_arch_handle_exception (ctx, exc, FALSE);
    

    mini.c
    SIG_HANDLER_SIGNATURE (mono_sigfpe_signal_handler)
    SIG_HANDLER_SIGNATURE (mono_sigill_signal_handler)
    SIG_HANDLER_SIGNATURE (mono_sigsegv_signal_handler)
    SIG_HANDLER_SIGNATURE (mono_sigint_signal_handler)

    mini-posix.c
    SIG_HANDLER_SIGNATURE (sigabrt_signal_handler)
    SIG_HANDLER_SIGNATURE (sigusr1_signal_handler)
    SIG_HANDLER_SIGNATURE (sigprof_signal_handler)
    SIG_HANDLER_SIGNATURE (sigquit_signal_handler)
    SIG_HANDLER_SIGNATURE (sigusr2_signal_handler)


    On an unmanaged thread the TLS data cannot be obtained, mainly these two items:

    • mono_domain_get()
    • jit_tls

    Even if the TLS could be read through ptrace, walking the stack still requires calling the following functions:

    • mono_jit_walk_stack_from_ctx
    • mono_walk_stack

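    Both functions read the TLS internally, so the values cannot be passed in from outside. Reimplementing them is not an option either, because many of the structs they rely on are not accessible, so writing a custom stack walker is not feasible.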

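    Suppose the goal is only to take the last PC and resolve the C# function name. All C# function information is stored in the jitInfoTable of the domain, so if the current domain object can be obtained, mono_jit_info_table_find(domain, addr) returns the MonoJitInfo, mono_jit_info_get_method then returns the MonoMethod, and mono_method_full_name finally yields the function name.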
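
The lookup chain above boils down to mapping a code address to the method whose compiled code contains it. The toy model below illustrates that idea with a sorted range table and a binary search. All types and names are hypothetical, and Mono's real jit info table is a different, more elaborate structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one compiled method and the code range it occupies. */
typedef struct {
    uintptr_t code_start;
    size_t code_size;
    const char *method_name;
} SketchJitInfo;

/* Binary search over a table sorted by code_start with non-overlapping
 * ranges. Returns the entry containing `addr`, or NULL if there is none. */
const SketchJitInfo *
sketch_jit_info_table_find (const SketchJitInfo *table, size_t count, uintptr_t addr)
{
    size_t lo = 0, hi = count;

    assert (table != NULL || count == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const SketchJitInfo *ji = &table [mid];

        if (addr < ji->code_start)
            hi = mid;                               /* addr lies before this method */
        else if (addr - ji->code_start >= ji->code_size)
            lo = mid + 1;                           /* addr lies after this method */
        else
            return ji;                              /* addr is inside this method */
    }
    return NULL;
}

/* Resolve a PC to a printable name, falling back to "Unmanaged". */
const char *
sketch_method_name_for_pc (const SketchJitInfo *table, size_t count, uintptr_t pc)
{
    const SketchJitInfo *ji = sketch_jit_info_table_find (table, count, pc);
    return ji ? ji->method_name : "Unmanaged";
}
```

The "Unmanaged" fallback mirrors what the SIGSEGV handler quoted earlier prints when no MonoJitInfo is found for the faulting IP.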


    NOTE ATTRIBUTES

    Created Date: 2018-05-25 00:55:50
    Last Evernote Update Date: 2020-05-23 07:15:03
