Source version: php-7.1.0
Environment: Deepin 15.11
GDB version: 8.3.1
GCC version: 6.3.0
We use the following short PHP script to walk through the whole process.
<?php
$a=[];
$a[]='124';
$a[]='234';
$a[]='345';
With gdb we can follow how this script is executed:
(gdb) b zend_compile
Breakpoint 1 at 0x4becb2: file Zend/zend_language_scanner.l, line 578.
(gdb) r arr.php
Breakpoint 1, zend_compile (type=2) at Zend/zend_language_scanner.l:578
578 zend_op_array *op_array = NULL;
(gdb) b _zend_hash_init
Breakpoint 2 at 0x555555a7cfbd: file /program/php-7.1.0/Zend/zend_hash.c, line 173.
(gdb) c   # continue straight to the _zend_hash_init breakpoint
Continuing.
Breakpoint 2, _zend_hash_init (ht=0x7ffff3858720, nSize=0,
pDestructor=0x555555a68a40 <_zval_ptr_dtor_wrapper>, persistent=0 '\000',
__zend_filename=0x555555fa6128 "/program/php-7.1.0/Zend/zend_compile.c", __zend_lineno=6597)
at /program/php-7.1.0/Zend/zend_hash.c:173
173 GC_REFCOUNT(ht) = 1;
(gdb) n
174 GC_TYPE_INFO(ht) = IS_ARRAY;
(gdb)
175 ht->u.flags = (persistent ? HASH_FLAG_PERSISTENT : 0) | HASH_FLAG_APPLY_PROTECTION | HASH_FLAG_STATIC_KEYS;
(gdb)
176 ht->nTableSize = zend_hash_check_size(nSize);
(gdb)
177 ht->nTableMask = HT_MIN_MASK;
(gdb)
178 HT_SET_DATA_ADDR(ht, &uninitialized_bucket);
(gdb)
179 ht->nNumUsed = 0;
(gdb)
180 ht->nNumOfElements = 0;
(gdb)
181 ht->nInternalPointer = HT_INVALID_IDX;
(gdb)
182 ht->nNextFreeElement = 0;
(gdb)
183 ht->pDestructor = pDestructor;
(gdb)
184 }
(gdb) p ht
$1 = (HashTable *) 0x7ffff3858720
(gdb) p *ht
$2 = {gc = {refcount = 1, u = {v = {type = 7 '\a', flags = 0 '\000', gc_info = 0}, type_info = 7}},
u = {v = {flags = 18 '\022', nApplyCount = 0 '\000', nIteratorsCount = 0 '\000',
consistency = 0 '\000'}, flags = 18}, nTableMask = 4294967294, arData = 0x555555fadb10,
nNumUsed = 0, nNumOfElements = 0, nTableSize = 8, nInternalPointer = 4294967295,
nNextFreeElement = 0, pDestructor = 0x555555a68a40 <_zval_ptr_dtor_wrapper>}
(gdb)
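As the dump shows, _zend_hash_init leaves us with an empty array whose nTableSize is 8: the requested nSize=0 is raised to the minimum table size by zend_hash_check_size.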
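The numbers in this dump can be reproduced by hand. The sketch below is a simplified model, not the Zend source. The constant values are my reading of the php-7.1.0 headers (minimum size 8, minimum mask (uint32_t)-2, invalid index (uint32_t)-1, apply-protection flag 1<<1, static-keys flag 1<<4) and should be checked against your own tree. All sketch_ names are hypothetical, and sketch_check_size leaves out the upper-bound check that the real zend_hash_check_size performs.

```c
#include <assert.h>
#include <stdint.h>

/* Values assumed from the php-7.1.0 headers; verify against your tree. */
#define SKETCH_HT_MIN_SIZE            8u
#define SKETCH_HT_MIN_MASK            ((uint32_t)-2)
#define SKETCH_HT_INVALID_IDX         ((uint32_t)-1)
#define SKETCH_FLAG_PERSISTENT        (1u << 0)
#define SKETCH_FLAG_APPLY_PROTECTION  (1u << 1)
#define SKETCH_FLAG_STATIC_KEYS       (1u << 4)

/* Hypothetical stand-in for zend_hash_check_size: clamp to the minimum,
 * then round up to the next power of two (no upper-bound check here). */
static uint32_t sketch_check_size(uint32_t n)
{
    if (n < SKETCH_HT_MIN_SIZE) {
        return SKETCH_HT_MIN_SIZE;
    }
    n -= 1;
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    return n + 1;
}

/* Mirrors the expression on line 175 of the gdb session. */
static uint32_t sketch_initial_flags(int persistent)
{
    return (persistent ? SKETCH_FLAG_PERSISTENT : 0u)
         | SKETCH_FLAG_APPLY_PROTECTION
         | SKETCH_FLAG_STATIC_KEYS;
}
```

With nSize=0 and persistent=0 this gives a table size of 8 and flags of 18, and the two sentinel constants print as 4294967294 and 4294967295 when shown as unsigned 32-bit values, which matches nTableMask and nInternalPointer in the dump.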
Next we set a breakpoint at zend_execute, which is where PHP actually starts executing the code.
(gdb) b zend_execute
Breakpoint 3 at 0x555555aca6f3: file /program/php-7.1.0/Zend/zend_vm_execute.h, line 461.
(gdb) c
Continuing.
Breakpoint 3, zend_execute (op_array=0x7ffff3882000, return_value=0x0)
at /program/php-7.1.0/Zend/zend_vm_execute.h:461
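The argument passed in is an op_array. PHP source is compiled into opcodes before the Zend virtual machine runs it, and zend_execute is the function that executes those opcodes.
Stepping further, we enter the execute_ex function and reach this line: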
((opcode_handler_t)OPLINE->handler)(ZEND_OPCODE_HANDLER_ARGS_PASSTHRU);
The handler being executed at this point is:
ZEND_ASSIGN_SPEC_CV_CONST_RETVAL_UNUSED_HANDLER ()
at /program/php-7.1.0/Zend/zend_vm_execute.h:39440
39440 SAVE_OPLINE();
(gdb) n
39441 value = EX_CONSTANT(opline->op2);
(gdb) n
39442 variable_ptr = _get_zval_ptr_cv_undef_BP_VAR_W(execute_data, opline->op1.var);
(gdb) p value
$4 = (zval *) 0x555555ac29c9 <zend_vm_stack_push_call_frame+80>
(gdb) p *value
$6 = {value = {lval = 140737279002400, dval = 6.9533454644260462e-310, counted = 0x7ffff3858720,
str = 0x7ffff3858720, arr = 0x7ffff3858720, obj = 0x7ffff3858720, res = 0x7ffff3858720,
ref = 0x7ffff3858720, ast = 0x7ffff3858720, zv = 0x7ffff3858720, ptr = 0x7ffff3858720,
ce = 0x7ffff3858720, func = 0x7ffff3858720, ww = {w1 = 4085614368, w2 = 32767}}, u1 = {v = {
type = 7 '\a', type_flags = 28 '\034', const_flags = 0 '\000', reserved = 0 '\000'},
type_info = 7175}, u2 = {next = 4294967295, cache_slot = 4294967295, lineno = 4294967295,
num_args = 4294967295, fe_pos = 4294967295, fe_iter_idx = 4294967295,
access_flags = 4294967295, property_guard = 4294967295}}
(gdb)
We can see that $6.value.arr = 0x7ffff3858720, the same address as the HashTable ht that was initialized earlier.
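The other fields of this zval dump can be decoded the same way. The sketch below assumes that type_info stores the type in its low byte and type_flags in the next byte, and that the flag bits are refcounted 1<<2, collectable 1<<3 and copyable 1<<4. Those positions are my reading of Zend/zend_types.h in php-7.1.0, not something the session itself shows, and the sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from php-7.1.0 Zend/zend_types.h; verify against your tree. */
#define SKETCH_IS_ARRAY          7u
#define SKETCH_TYPE_REFCOUNTED   (1u << 2)
#define SKETCH_TYPE_COLLECTABLE  (1u << 3)
#define SKETCH_TYPE_COPYABLE     (1u << 4)

/* Low byte of type_info: the zval type. */
static uint32_t sketch_type(uint32_t type_info)
{
    return type_info & 0xffu;
}

/* Second byte of type_info: the type flags. */
static uint32_t sketch_type_flags(uint32_t type_info)
{
    return (type_info >> 8) & 0xffu;
}

/* The value union is one 8-byte slot, so lval, arr, str and the rest
 * all print the same bits. ww.w1 and ww.w2 are its two 32-bit halves
 * (w1 is the low half on a little-endian machine such as x86-64). */
static uint32_t sketch_low32(uint64_t bits)
{
    return (uint32_t)(bits & 0xffffffffu);
}

static uint32_t sketch_high32(uint64_t bits)
{
    return (uint32_t)(bits >> 32);
}
```

Applied to the dump, 7175 splits into type 7 and type_flags 28, and 0x7ffff3858720 splits into w1 = 4085614368 and w2 = 32767, so every member of the union is simply another view of the same pointer.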
Back in execute_ex, stepping through every handler call shows that the following opcode handlers are executed, in this order:
ZEND_ASSIGN_SPEC_CV_CONST_RETVAL_UNUSED_HANDLER () at /program/php-7.1.0/Zend/zend_vm_execute.h:39440
ZEND_ASSIGN_DIM_SPEC_CV_UNUSED_OP_DATA_CONST_HANDLER () at /program/php-7.1.0/Zend/zend_vm_execute.h:42221
ZEND_ASSIGN_DIM_SPEC_CV_UNUSED_OP_DATA_CONST_HANDLER () at /program/php-7.1.0/Zend/zend_vm_execute.h:42221
ZEND_ASSIGN_DIM_SPEC_CV_UNUSED_OP_DATA_CONST_HANDLER () at /program/php-7.1.0/Zend/zend_vm_execute.h:42221
ZEND_RETURN_SPEC_CONST_HANDLER () at /program/php-7.1.0/Zend/zend_vm_execute.h:2858
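The handler sequence above comes from a loop that repeatedly calls the function pointer stored in the current opline. The following is a toy model of that dispatch pattern only, under my own assumptions: the types, the return-value convention and all sketch_ names are hypothetical and much simpler than the real execute_ex.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sketch_op sketch_op;

/* Toy execution frame: the current instruction and a counter. */
typedef struct {
    const sketch_op *opline;
    int executed;
} sketch_frame;

/* A handler returns 0 to continue and non-zero to leave the loop. */
typedef int (*sketch_handler)(sketch_frame *frame);

struct sketch_op {
    sketch_handler handler;
};

static int sketch_assign(sketch_frame *frame)
{
    frame->executed++;
    frame->opline++;        /* advance to the next instruction */
    return 0;
}

static int sketch_assign_dim(sketch_frame *frame)
{
    frame->executed++;
    frame->opline++;
    return 0;
}

static int sketch_return(sketch_frame *frame)
{
    frame->executed++;
    return 1;               /* stop the dispatch loop */
}

/* Dispatch loop: call whatever handler the current opline points to. */
static int sketch_execute(const sketch_op *ops)
{
    sketch_frame frame = { ops, 0 };
    while (frame.opline->handler(&frame) == 0) {
        /* the handler has already moved frame.opline forward */
    }
    return frame.executed;
}

/* The five instructions observed for the sample script. */
static int sketch_run_example(void)
{
    const sketch_op ops[] = {
        { sketch_assign },
        { sketch_assign_dim },
        { sketch_assign_dim },
        { sketch_assign_dim },
        { sketch_return },
    };
    return sketch_execute(ops);
}
```

Calling sketch_run_example() runs one assign, three dimension assigns and one return, the same shape as the handler list for the four-line script.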
After execution finishes, we look again at the address that value pointed to earlier, (zval *) 0x555555ac29c9:
(gdb) p *(zval*)0x555555ac29c9
$18 = {value = {lval = 140737279002496, dval = 6.9533454644307892e-310, counted = 0x7ffff3858780, str = 0x7ffff3858780, arr = 0x7ffff3858780, obj = 0x7ffff3858780, res = 0x7ffff3858780, ref = 0x7ffff3858780,
ast = 0x7ffff3858780, zv = 0x7ffff3858780, ptr = 0x7ffff3858780, ce = 0x7ffff3858780, func = 0x7ffff3858780, ww = {w1 = 4085614464, w2 = 32767}}, u1 = {v = {type = 0 '\000', type_flags = 0 '\000',
const_flags = 0 '\000', reserved = 0 '\000'}, type_info = 0}, u2 = {next = 0, cache_slot = 0, lineno = 0, num_args = 0, fe_pos = 0, fe_iter_idx = 0, access_flags = 0, property_guard = 0}}
(gdb)
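We can see that the arr address has changed. The reason is that copy-on-write separation took place during the write. Initializing the empty array and assigning it to $a are two separate steps, and along the way the reference count of our HashTable structure became 2. Under PHP's reference counting, a count greater than 1 means the value is shared by more than one holder. A write to such a value triggers separation: a new block of memory is copied first, and the write is then applied to that copy.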
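The separation step can be sketched with a tiny refcounted array. This is a minimal model under my own assumptions, not the Zend implementation: the fixed capacity of 8, the struct layout and every sketch_ name are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy refcounted array with a fixed capacity of 8 string slots. */
typedef struct {
    unsigned refcount;
    size_t count;
    const char *items[8];
} sketch_array;

static sketch_array *sketch_array_new(void)
{
    sketch_array *a = calloc(1, sizeof *a);
    if (a == NULL) {
        abort();
    }
    a->refcount = 1;
    return a;
}

/* A second holder takes a reference: no copy, only a counter bump. */
static sketch_array *sketch_share(sketch_array *a)
{
    a->refcount++;
    return a;
}

/* Separate before a write: copy only when someone else also holds it. */
static sketch_array *sketch_separate(sketch_array *a)
{
    sketch_array *copy;

    if (a->refcount == 1) {
        return a;               /* sole owner, write in place */
    }
    copy = malloc(sizeof *copy);
    if (copy == NULL) {
        abort();
    }
    memcpy(copy, a, sizeof *copy);
    copy->refcount = 1;
    a->refcount--;              /* the writer gives up its shared reference */
    return copy;
}

/* Append returns the array the caller should use from now on. */
static sketch_array *sketch_append(sketch_array *a, const char *s)
{
    a = sketch_separate(a);
    assert(a->count < 8);
    a->items[a->count++] = s;
    return a;
}
```

If a literal array is created and then shared with a variable, the first append through the variable returns a different address and leaves the literal untouched with a count of 1. Later appends stay in place, which matches the single address change seen in the session.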
(gdb) p *$20.arData[0].val.value.str.val@3
$27 = "124"
(gdb) p *$20.arData[1].val.value.str.val@3
$28 = "234"
(gdb) p *$20.arData[2].val.value.str.val@3
$29 = "345"
(gdb) p *$20.arData[3].val.value.str.val@3
Cannot access memory at address 0x18
(gdb)
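The address 0x18 in the last error is what gdb computes when the str pointer of the fourth slot is NULL, which fits an array holding only three elements. The sketch below models the string header as I understand it from php-7.1.0 (an 8-byte refcount header, an 8-byte hash, an 8-byte length, then the characters). The struct and its sketch_ names are hypothetical simplifications, with fixed-width types standing in for zend_ulong and size_t on a 64-bit build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified refcount header: two 32-bit fields, 8 bytes in total. */
typedef struct {
    uint32_t refcount;
    uint32_t type_info;
} sketch_refcounted_h;

/* Simplified string layout for a 64-bit build. */
typedef struct {
    sketch_refcounted_h gc;  /* offset 0  */
    uint64_t h;              /* offset 8:  cached hash value */
    uint64_t len;            /* offset 16: string length */
    char val[1];             /* offset 24: first character */
} sketch_string;

/* Address gdb must read for str->val, given the numeric value of str. */
static uintptr_t sketch_val_address(uintptr_t str)
{
    return str + offsetof(sketch_string, val);
}
```

Here offsetof(sketch_string, val) is 24, that is 0x18, so sketch_val_address(0) yields exactly the address gdb refused to read.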
The analysis shows that our data ends up in the arData of the zend_array structure pointed to by the zval at 0x555555ac29c9.