Original article
http://kakaroto.homelinux.net/2017/11/introduction-to-reverse-engineering-and-assembly/
Translation of selected passages
1. Stack
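[Figure: data stored on the stack]
The figure above shows the LIFO (last in, first out) nature of the stack. In RAM the stack grows downward, toward lower addresses, which is the opposite of the upward growth drawn in the figure. The stack stores local variables and function return addresses (the address of the instruction that follows the call in the calling function).
A stack holds multiple stack frames. The current frame contains all of the current function's variables and the return address of the function that called it. Above it sits the caller's frame, which likewise contains its own variables and the return address of its own caller, and so on, with main's frame at the very top of the stack (the highest addresses).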
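To make the downward growth concrete, here is a minimal C sketch of a toy stack. The names `ToyStack`, `toy_init`, `toy_push`, and `toy_pop` are hypothetical, the 64-byte array merely stands in for RAM, and there is no bounds checking.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical toy machine: a small array stands in for RAM, and `sp`
   plays the role of ESP as a byte offset into that array. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t sp;
} ToyStack;

/* An empty stack has sp just past the highest address. */
void toy_init(ToyStack *s) {
    s->sp = STACK_WORDS * 4;
}

/* push: move sp DOWN by 4 bytes, then store the value there. */
void toy_push(ToyStack *s, uint32_t value) {
    s->sp -= 4;
    s->mem[s->sp / 4] = value;
}

/* pop: read the value at sp, then move sp back UP by 4 bytes. */
uint32_t toy_pop(ToyStack *s) {
    uint32_t value = s->mem[s->sp / 4];
    s->sp += 4;
    return value;
}
```

Pushing 3 and then 2 lowers `sp` twice, and the values come back out in reverse order, which is the LIFO behavior described above.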
2. Registers
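[Figure: introduction to registers]
You can think of a register as a variable. By the original article's count, an x86 processor has only 9 registers in total, and only 7 of them are freely usable.
The registers on x86 are: EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, EIP.
Two of them are special:
EIP: the instruction pointer, which holds the address of the instruction currently being executed;
ESP: the stack pointer, which holds the current address of the top of the stack.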
3. Instructions
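Common instructions
[Figure: common instructions]
The call instruction
Calls a function.
It is equivalent to two steps:
- PUSH %EIP+4 (save the address of the next instruction, assuming 4-byte instructions)
- JMP (to the called function)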
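The two steps can be sketched in C as follows. This is only an illustration under stated assumptions: `ToyCpu`, `toy_call`, and `toy_ret` are hypothetical names, every instruction is assumed to be 4 bytes long, and `esp` is modeled as a slot index into a tiny array rather than a real address.

```c
#include <stdint.h>

#define INSN_SIZE 4u   /* assumption: every instruction is 4 bytes long */

/* Hypothetical toy CPU: only the two special registers plus a tiny stack. */
typedef struct {
    uint32_t eip;        /* address of the instruction being executed */
    uint32_t esp;        /* index of the top slot in stack[], counts down */
    uint32_t stack[8];
} ToyCpu;

/* call target  ==  PUSH (eip + 4), then JMP target */
void toy_call(ToyCpu *cpu, uint32_t target) {
    cpu->esp -= 1;                                /* PUSH: make room ...          */
    cpu->stack[cpu->esp] = cpu->eip + INSN_SIZE;  /* ... and save the return addr */
    cpu->eip = target;                            /* JMP: continue at the target  */
}

/* ret  ==  pop the saved return address back into eip */
void toy_ret(ToyCpu *cpu) {
    cpu->eip = cpu->stack[cpu->esp];
    cpu->esp += 1;
}
```

Calling `toy_call` and then `toy_ret` leaves `eip` pointing at the instruction right after the call, with `esp` back where it started.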
Sample
int add_a_and_b(int a, int b);  /* declared before use so main compiles as standard C */

int main() {
    return add_a_and_b(2, 3);
}

int add_a_and_b(int a, int b) {
    return a + b;
}
After compilation
[Figure: compiled assembly listing]
Explanation of the instructions in main:

| _main | Explanation |
|---|---|
| push 3 | Push the second argument, 3, onto the stack; ESP is lowered. |
| push 2 | Push the first argument, 2, onto the stack. |
| call _add_a_and_b | Call the _add_a_and_b function. This puts the address of the next instruction (add) onto the stack, then jumps into _add_a_and_b by putting the address of the first instruction under the _add_a_and_b label (push %ebx) into the EIP register. |
| add %esp, 8 | Add 8 to ESP, which effectively pops the two values we just pushed and releases their space. |
| ret | Return to the parent function... |
For the purposes of this exercise, we’re going to assume that the _main function is located in memory at the address 0xFFFF0000, and that each instruction is 4 bytes long (the size of each instruction can vary depending on the instruction and on its operands).
So you can see, we first pushed 3 into the stack, %esp was lowered, then we pushed 2 into the stack, %esp was lowered, then we did a ‘call _add_a_and_b’, which stored the address of the next instruction (4 instructions into the main, so ‘_main+16’) into the stack and esp was lowered, then we pushed %ebx, which I assumed here contained a value of 0, and the %esp was lowered again. If we now wanted to access the first argument to the function (2), we need to access %esp+8, which will let us skip the saved %ebx and the ‘Return address’ that are in the stack (since we’re working with 32 bits, each value is 4 bytes). And in order to access the second argument (3), we need to access %esp+12.
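In other words, step by step:
Assume main starts at address 0xFFFF0000 in memory and that each instruction occupies 4 bytes (real instruction sizes vary with the instruction and its operands).
After push 3, ESP is lowered (by 4);
then push 2 lowers ESP again (by 4).
Next comes call _add_a_and_b.
What it does: it stores on the stack the address of the instruction that follows the call in main.
The instruction after the call is "add %esp, 8", so that instruction's address is what gets saved on the stack. The original counts it as the fourth instruction in main and therefore gives main + 16 bytes (at 4 bytes per instruction).
- Translator's note: this figure is questionable. A function's address is the address of its first instruction, so with the five instructions listed above, "add %esp, 8" sits at main+12 (0xFFFF000C) rather than main+16.
After jumping into the add function, the original value of the ebx register is pushed first (assumed here to be 0), which lowers ESP once more (by 4).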
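See the figure below:
[Figure: stack contents]
Now look at these two instructions in the add function:
[Figure: fetching the values]
mov %eax, [%esp+8]
mov %ebx, [%esp+12]
Starting from where ESP points after the original value of ebx has been saved, +8 reaches the value 2 and +12 reaches the value 3.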
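The offsets can be checked with a small replay of the walkthrough in C. This is a sketch under the text's assumptions (main at 0xFFFF0000, 4-byte instructions, ebx initially 0). `Frame`, `push32`, `load32`, and `replay_add_a_and_b` are hypothetical names, the 64-byte array stands in for stack memory, and the return address is taken as main+12 (the fourth instruction in the listing) rather than the main+16 quoted above.

```c
#include <stdint.h>

#define MAIN_ADDR 0xFFFF0000u  /* assumed address of _main, as in the text */
#define INSN_SIZE 4u           /* assumed size of every instruction */

typedef struct {
    uint32_t words[16];  /* 64 bytes of toy stack memory */
    uint32_t esp;        /* byte offset into words[], 64 when empty */
} Frame;

/* push: lower esp by 4, then store the value at the new top. */
void push32(Frame *f, uint32_t v) {
    f->esp -= 4;
    f->words[f->esp / 4] = v;
}

/* Read the 32-bit value stored at a byte offset. */
uint32_t load32(const Frame *f, uint32_t addr) {
    return f->words[addr / 4];
}

/* Replays the walkthrough and returns the sum of the two arguments
   as read through [esp+8] and [esp+12]. */
uint32_t replay_add_a_and_b(Frame *f) {
    f->esp = sizeof f->words;               /* empty stack                    */
    push32(f, 3);                           /* push 3 (second argument)       */
    push32(f, 2);                           /* push 2 (first argument)        */
    push32(f, MAIN_ADDR + 3 * INSN_SIZE);   /* call: save the return address  */
    push32(f, 0);                           /* push %ebx (assumed to hold 0)  */
    uint32_t eax = load32(f, f->esp + 8);   /* mov %eax, [%esp+8]             */
    uint32_t ebx = load32(f, f->esp + 12);  /* mov %ebx, [%esp+12]            */
    return eax + ebx;                       /* a + b, as in the C source      */
}
```

After the four pushes, `[esp]` holds the saved ebx and `[esp+4]` holds the return address, so the two arguments are the next two slots up.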