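III. User-defined Classes
3. Accessing attributes of an instance object
- In Python, an expression of the form x.y or x.y() is called an attribute reference.
- Here x is an object and y is an attribute of that object.
- An attribute can be plain data or a member function (method).
3.1 Accessing an attribute value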
demo.py
class A():
    v = 1
a = A()
a.v
  4          20 LOAD_NAME                1 (a)
             22 LOAD_ATTR                2 (v)
             24 POP_TOP
             26 LOAD_CONST               2 (None)
             28 RETURN_VALUE
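A listing like this can be reproduced with the standard dis module. The sketch below compiles only the attribute access and collects the opcode names. Offsets and the surrounding instructions depend on the CPython version, so the snippet looks only for the two instructions discussed here.

```python
import dis

# Compile only the attribute access; `a` does not need to exist at compile time.
code = compile("a.v", "<demo>", "exec")

# Collect the opcode names in order. Offsets and neighbouring
# instructions vary between CPython versions.
opnames = [instr.opname for instr in dis.get_instructions(code)]
print(opnames)
```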
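- To read the attribute value, the virtual machine first executes LOAD_NAME, which pushes the instance object bound to a onto the runtime stack.
- It then executes LOAD_ATTR to access the attribute: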
ceval.c
TARGET(LOAD_ATTR) {
    PyObject *name = GETITEM(names, oparg);
    PyObject *owner = TOP();
    PyObject *res = PyObject_GetAttr(owner, name);
    Py_DECREF(owner);
    SET_TOP(res);
    if (res == NULL)
        goto error;
    DISPATCH();
}
- The core of this instruction is the call to PyObject_GetAttr(owner, name):
Objects/object.c
PyObject *
PyObject_GetAttr(PyObject *v, PyObject *name)
{
    PyTypeObject *tp = Py_TYPE(v);
    if (!PyUnicode_Check(name)) {
        PyErr_Format(PyExc_TypeError,
                     "attribute name must be string, not '%.200s'",
                     name->ob_type->tp_name);
        return NULL;
    }
    if (tp->tp_getattro != NULL)
        return (*tp->tp_getattro)(v, name);
    if (tp->tp_getattr != NULL) {
        const char *name_str = PyUnicode_AsUTF8(name);
        if (name_str == NULL)
            return NULL;
        return (*tp->tp_getattr)(v, (char *)name_str);
    }
    PyErr_Format(PyExc_AttributeError,
                 "'%.50s' object has no attribute '%U'",
                 tp->tp_name, name);
    return NULL;
}
- This code accesses the attribute through the tp_getattro slot of the object's PyTypeObject (tp_getattr is deprecated and is only used as a fallback).
Include\object.h
... ...
typedef PyObject *(*getattrofunc)(PyObject *, PyObject *);
... ...
typedef struct _typeobject {
    PyObject_VAR_HEAD
    ... ...
    getattrfunc tp_getattr;
    ... ...
    getattrofunc tp_getattro;
    ... ...
} PyTypeObject;
- When the Python virtual machine creates a class, the class inherits its tp_getattro from PyBaseObject_Type, namely PyObject_GenericGetAttr.
Objects\object.c
PyObject *
PyObject_GenericGetAttr(PyObject *obj, PyObject *name)
{
    return _PyObject_GenericGetAttrWithDict(obj, name, NULL, 0);
}
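This inheritance can be observed from Python code. The check below is only a sketch from the interpreter's point of view: a class that does not define __getattribute__ itself resolves it to the one provided by object, which is the Python-visible wrapper around the generic lookup.

```python
class A:
    v = 1


a = A()

# A does not define __getattribute__ in its own namespace ...
defined_locally = "__getattribute__" in A.__dict__

# ... so looking it up on A finds the one inherited from object.
inherited = A.__getattribute__ is object.__getattribute__

# Calling the inherited implementation directly gives the same result as a.v.
value = object.__getattribute__(a, "v")
print(defined_locally, inherited, value)
```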
- _PyObject_GenericGetAttrWithDict implements the actual lookup, which follows a fairly involved algorithm:
Objects\object.c
PyObject *
_PyObject_GenericGetAttrWithDict(PyObject *obj, PyObject *name,
                                 PyObject *dict, int suppress)
{
    /* Make sure the logic of _PyObject_GetMethod is in sync with
       this method.
       When suppress=1, this function suppress AttributeError.
    */
    PyTypeObject *tp = Py_TYPE(obj);
    PyObject *descr = NULL;
    PyObject *res = NULL;
    descrgetfunc f;
    Py_ssize_t dictoffset;
    PyObject **dictptr;
    if (!PyUnicode_Check(name)){
        PyErr_Format(PyExc_TypeError,
                     "attribute name must be string, not '%.200s'",
                     name->ob_type->tp_name);
        return NULL;
    }
    Py_INCREF(name);
    if (tp->tp_dict == NULL) {
        if (PyType_Ready(tp) < 0)
            goto done;
    }
    descr = _PyType_Lookup(tp, name);
    f = NULL;
    if (descr != NULL) {
        Py_INCREF(descr);
        f = descr->ob_type->tp_descr_get;
        if (f != NULL && PyDescr_IsData(descr)) {
            res = f(descr, obj, (PyObject *)obj->ob_type);
            if (res == NULL && suppress &&
                    PyErr_ExceptionMatches(PyExc_AttributeError)) {
                PyErr_Clear();
            }
            goto done;
        }
    }
    if (dict == NULL) {
        /* Inline _PyObject_GetDictPtr */
        dictoffset = tp->tp_dictoffset;
        if (dictoffset != 0) {
            if (dictoffset < 0) {
                Py_ssize_t tsize;
                size_t size;
                tsize = ((PyVarObject *)obj)->ob_size;
                if (tsize < 0)
                    tsize = -tsize;
                size = _PyObject_VAR_SIZE(tp, tsize);
                assert(size <= PY_SSIZE_T_MAX);
                dictoffset += (Py_ssize_t)size;
                assert(dictoffset > 0);
                assert(dictoffset % SIZEOF_VOID_P == 0);
            }
            dictptr = (PyObject **) ((char *)obj + dictoffset);
            dict = *dictptr;
        }
    }
    if (dict != NULL) {
        Py_INCREF(dict);
        res = PyDict_GetItem(dict, name);
        if (res != NULL) {
            Py_INCREF(res);
            Py_DECREF(dict);
            goto done;
        }
        Py_DECREF(dict);
    }
    if (f != NULL) {
        res = f(descr, obj, (PyObject *)Py_TYPE(obj));
        if (res == NULL && suppress &&
                PyErr_ExceptionMatches(PyExc_AttributeError)) {
            PyErr_Clear();
        }
        goto done;
    }
    if (descr != NULL) {
        res = descr;
        descr = NULL;
        goto done;
    }
    if (!suppress) {
        PyErr_Format(PyExc_AttributeError,
                     "'%.50s' object has no attribute '%U'",
                     tp->tp_name, name);
    }
  done:
    Py_XDECREF(descr);
    Py_DECREF(name);
    return res;
}
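- The function first looks up name along the MRO of the instance's type with _PyType_Lookup, which yields descr. If descr is a data descriptor (its type provides tp_descr_get and PyDescr_IsData is true), the result of calling tp_descr_get is returned immediately.
- Otherwise the attribute is looked up in the instance object's own __dict__.
- If that also fails, the function falls back to descr: it calls tp_descr_get for a non-data descriptor, returns descr unchanged when it is an ordinary class attribute, and raises AttributeError when nothing was found (unless suppress is set).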
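The lookup order can be sketched in pure Python. This is an illustrative approximation, not CPython's implementation: find_in_mro and generic_getattr are hypothetical names, and "data descriptor" is approximated as "the descriptor's type defines __set__ or __delete__", which is what a non-NULL tp_descr_set corresponds to at the Python level.

```python
_MISSING = object()  # sentinel: "name not found on the type"


def find_in_mro(tp, name):
    """Rough stand-in for _PyType_Lookup: scan each class dict in MRO order."""
    for klass in tp.__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    return _MISSING


def generic_getattr(obj, name):
    """Hypothetical Python rendering of the generic attribute lookup order."""
    tp = type(obj)
    descr = find_in_mro(tp, name)
    descr_get = None
    if descr is not _MISSING:
        descr_type = type(descr)
        descr_get = getattr(descr_type, "__get__", None)
        is_data = hasattr(descr_type, "__set__") or hasattr(descr_type, "__delete__")
        # 1. A data descriptor on the type wins over everything else.
        if descr_get is not None and is_data:
            return descr_get(descr, obj, tp)
    # 2. Next comes the instance's own __dict__, if it has one.
    inst_dict = getattr(obj, "__dict__", None)
    if inst_dict is not None and name in inst_dict:
        return inst_dict[name]
    # 3. A non-data descriptor on the type (for example a function).
    if descr_get is not None:
        return descr_get(descr, obj, tp)
    # 4. A plain class attribute is returned unchanged.
    if descr is not _MISSING:
        return descr
    raise AttributeError(f"{tp.__name__!r} object has no attribute {name!r}")


class Demo:
    kind = "class attribute"

    @property
    def locked(self):
        return "from property"

    def method(self):
        return "called"


d = Demo()
d.__dict__["locked"] = "from instance"  # cannot shadow the property
d.__dict__["kind"] = "instance attribute"  # shadows the plain class attribute
print(generic_getattr(d, "locked"), generic_getattr(d, "kind"), generic_getattr(d, "method")())
```

The property is a data descriptor, so the entry planted in the instance's __dict__ is ignored, while the plain class attribute kind is shadowed by the instance entry.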
3.2 Accessing a member function
demo.py
class A():
    def f(self):
        print("in function")
a = A()
a.f()
  5          20 LOAD_NAME                1 (a)
             22 LOAD_METHOD              2 (f)
             24 CALL_METHOD              0
             26 POP_TOP
             28 LOAD_CONST               2 (None)
             30 RETURN_VALUE
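- Accessing a member function is similar to reading an attribute value: LOAD_NAME first pushes the instance object bound to a onto the runtime stack.
- The virtual machine then executes LOAD_METHOD to look up the member function: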
ceval.c
TARGET(LOAD_METHOD) {
    /* Designed to work in tamdem with CALL_METHOD. */
    PyObject *name = GETITEM(names, oparg);
    PyObject *obj = TOP();
    PyObject *meth = NULL;
    int meth_found = _PyObject_GetMethod(obj, name, &meth);
    if (meth == NULL) {
        /* Most likely attribute wasn't found. */
        goto error;
    }
    if (meth_found) {
        /* We can bypass temporary bound method object.
           meth is unbound method and obj is self.
           meth | self | arg1 | ... | argN
         */
        SET_TOP(meth);
        PUSH(obj);  // self
    }
    else {
        /* meth is not an unbound method (but a regular attr, or
           something was returned by a descriptor protocol).  Set
           the second element of the stack to NULL, to signal
           CALL_METHOD that it's not a method call.
           NULL | meth | arg1 | ... | argN
        */
        SET_TOP(NULL);
        Py_DECREF(obj);
        PUSH(meth);
    }
    DISPATCH();
}
- The key step is the call to _PyObject_GetMethod, which reports whether a member function was found:
Objects\object.c
/* Specialized version of _PyObject_GenericGetAttrWithDict
   specifically for the LOAD_METHOD opcode.
   Return 1 if a method is found, 0 if it's a regular attribute
   from __dict__ or something returned by using a descriptor
   protocol.
   `method` will point to the resolved attribute or NULL.  In the
   latter case, an error will be set.
*/
int
_PyObject_GetMethod(PyObject *obj, PyObject *name, PyObject **method)
{
    PyTypeObject *tp = Py_TYPE(obj);
    PyObject *descr;
    descrgetfunc f = NULL;
    PyObject **dictptr, *dict;
    PyObject *attr;
    int meth_found = 0;
    assert(*method == NULL);
    if (Py_TYPE(obj)->tp_getattro != PyObject_GenericGetAttr
            || !PyUnicode_Check(name)) {
        *method = PyObject_GetAttr(obj, name);
        return 0;
    }
    if (tp->tp_dict == NULL && PyType_Ready(tp) < 0)
        return 0;
    descr = _PyType_Lookup(tp, name);
    if (descr != NULL) {
        Py_INCREF(descr);
        if (PyFunction_Check(descr) ||
                (Py_TYPE(descr) == &PyMethodDescr_Type)) {
            meth_found = 1;
        } else {
            f = descr->ob_type->tp_descr_get;
            if (f != NULL && PyDescr_IsData(descr)) {
                *method = f(descr, obj, (PyObject *)obj->ob_type);
                Py_DECREF(descr);
                return 0;
            }
        }
    }
    dictptr = _PyObject_GetDictPtr(obj);
    if (dictptr != NULL && (dict = *dictptr) != NULL) {
        Py_INCREF(dict);
        attr = PyDict_GetItem(dict, name);
        if (attr != NULL) {
            Py_INCREF(attr);
            *method = attr;
            Py_DECREF(dict);
            Py_XDECREF(descr);
            return 0;
        }
        Py_DECREF(dict);
    }
    if (meth_found) {
        *method = descr;
        return 1;
    }
    if (f != NULL) {
        *method = f(descr, obj, (PyObject *)Py_TYPE(obj));
        Py_DECREF(descr);
        return 0;
    }
    if (descr != NULL) {
        *method = descr;
        return 0;
    }
    PyErr_Format(PyExc_AttributeError,
                 "'%.50s' object has no attribute '%U'",
                 tp->tp_name, name);
    return 0;
}
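- As before, the lookup goes through descriptors and __dict__. If name resolves on the type to a plain function or a method descriptor, and no entry of the same name in the instance's __dict__ shadows it, the function returns 1. LOAD_METHOD then leaves the unbound function and self on the stack, and the CALL_METHOD instruction that follows performs the call: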
ceval.c
TARGET(CALL_METHOD) {
    /* Designed to work in tamdem with LOAD_METHOD. */
    PyObject **sp, *res, *meth;
    sp = stack_pointer;
    meth = PEEK(oparg + 2);
    if (meth == NULL) {
        /* `meth` is NULL when LOAD_METHOD thinks that it's not
           a method call.
           Stack layout:
               ... | NULL | callable | arg1 | ... | argN
                                                    ^- TOP()
                                       ^- (-oparg)
                            ^- (-oparg-1)
                     ^- (-oparg-2)
           `callable` will be POPed by call_function.
           NULL will will be POPed manually later.
        */
        res = call_function(&sp, oparg, NULL);
        stack_pointer = sp;
        (void)POP(); /* POP the NULL. */
    }
    else {
        /* This is a method call.  Stack layout:
               ... | method | self | arg1 | ... | argN
                                                  ^- TOP()
                                     ^- (-oparg)
                              ^- (-oparg-1)
                     ^- (-oparg-2)
           `self` and `method` will be POPed by call_function.
           We'll be passing `oparg + 1` to call_function, to
           make it accept the `self` as a first argument.
        */
        res = call_function(&sp, oparg + 1, NULL);
        stack_pointer = sp;
    }
    PUSH(res);
    if (res == NULL)
        goto error;
    DISPATCH();
}
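The two branches of CALL_METHOD correspond to behaviour that is visible from Python. The sketch below does not inspect the stack; it only shows that a function found on the class receives self as its first argument, while a callable stored in the instance's __dict__ is treated as a regular attribute and is called without self. The lambda is an assumption made for illustration.

```python
class A:
    def f(self):
        return "in function"


a = A()

# Plain attribute access builds a temporary bound method that pairs
# the function stored on the class with the instance.
bound = a.f
same_function = bound.__func__ is A.__dict__["f"]
same_self = bound.__self__ is a

# Method-call path: the function is found on the type, self is passed.
via_instance = a.f()
via_class = A.f(a)

# Regular-attribute path: an entry in the instance __dict__ shadows the
# function (a non-data descriptor) and is called without self.
a.f = lambda: "from instance __dict__"
shadowed = a.f()
print(same_function, same_self, via_instance, via_class, shadowed)
```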
3.3 __dict__ in instance objects
- When a class is created, the virtual machine sets a field named tp_dictoffset on it.
- tp_dictoffset is the offset of __dict__ within the instance object.
Objects\object.c
PyObject *
_PyObject_GenericGetAttrWithDict(PyObject *obj, PyObject *name,
                                 PyObject *dict, int suppress)
{
    ... ...
    Py_ssize_t dictoffset;
    ... ...
    if (dict == NULL) {
        /* Inline _PyObject_GetDictPtr */
        dictoffset = tp->tp_dictoffset;
        if (dictoffset != 0) {
            if (dictoffset < 0) {
                Py_ssize_t tsize;
                size_t size;
                tsize = ((PyVarObject *)obj)->ob_size;
                if (tsize < 0)
                    tsize = -tsize;
                size = _PyObject_VAR_SIZE(tp, tsize);
                assert(size <= PY_SSIZE_T_MAX);
                dictoffset += (Py_ssize_t)size;
                assert(dictoffset > 0);
                assert(dictoffset % SIZEOF_VOID_P == 0);
            }
            dictptr = (PyObject **) ((char *)obj + dictoffset);
            dict = *dictptr;
        }
    }
    ... ...
}
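- A negative dictoffset means that the class inherits from a variable-size object. In that case the virtual machine adds the size of the object to dictoffset, so that the result points to the extra space allocated for the dict pointer in the instance's memory.
- _PyObject_GenericGetAttrWithDict then uses this offset to obtain the instance's dict object.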
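The field is exposed to Python code as the read-only type attribute __dictoffset__. The snippet below is a CPython-specific sketch: the concrete non-zero value differs between versions and platforms, so it only distinguishes "has a dict slot" from "has none". The class names are chosen for illustration.

```python
class WithDict:
    pass


class WithoutDict:
    __slots__ = ()  # suppresses the per-instance __dict__


# A non-zero offset means instances carry a __dict__ pointer.
has_slot = WithDict.__dictoffset__ != 0
no_slot = WithoutDict.__dictoffset__ == 0

w = WithDict()
w.v = 1  # stored in the instance's own dict
instance_dict = dict(w.__dict__)

# Instances of the slotted class have no __dict__ at all.
slotted_has_dict = hasattr(WithoutDict(), "__dict__")
print(has_slot, no_slot, instance_dict, slotted_has_dict)
```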
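3.4 The role of descriptors
- In PyType_Ready, the virtual machine fills tp_dict; the entries that correspond to operation names are descriptors.
- Descriptors fall into two kinds:
  - data descriptor: a descriptor whose type defines both __get__ and __set__.
  - non-data descriptor: a descriptor whose type defines only __get__.
- When the virtual machine accesses an attribute of an instance object, one role of descriptors is to influence which attribute is chosen:
  - An attribute found in the instance object's __dict__ is called an instance attribute.
  - An attribute found along the MRO of the instance's class object is called a class attribute.
  - The virtual machine checks instance attributes first and class attributes second, so an instance attribute takes precedence over a class attribute.
  - If the class attributes contain a data descriptor of the same name, that descriptor takes precedence over the instance attribute.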
demo.py
>>> class A(list):
...     def __get__(self, instance, owner):
...         return A.__get__
...     def __set__(self, instance, value):
...         print("A.__set__")
...
>>> class B(object):
...     value = A()
...
>>> b = B()
>>> b.value = 1
A.__set__
>>> print(b.value)
<function A.__get__ at 0x000002C33C8D7F70>
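For contrast, the following sketch places a data descriptor and a non-data descriptor side by side and plants entries of the same names directly in the instance's __dict__. The class names DataDesc, NonDataDesc and Holder are hypothetical.

```python
class DataDesc:
    """Defines __get__ and __set__, so it is a data descriptor."""

    def __get__(self, instance, owner):
        return "data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class NonDataDesc:
    """Defines only __get__, so it is a non-data descriptor."""

    def __get__(self, instance, owner):
        return "non-data descriptor"


class Holder:
    data = DataDesc()
    non_data = NonDataDesc()


h = Holder()
# Write to the instance dict directly, bypassing DataDesc.__set__.
h.__dict__["data"] = "instance attribute"
h.__dict__["non_data"] = "instance attribute"

print(h.data)      # the data descriptor wins over the instance attribute
print(h.non_data)  # the instance attribute wins over the non-data descriptor
```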
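- The second role of descriptors is to influence the result of an access:
  - As the example above shows, when the attribute finally obtained is a descriptor, the virtual machine does not simply return the descriptor; it returns the result of calling descriptor.__get__.
  - There is one exception to this rule: in PyObject_GenericGetAttr, if the descriptor that was found is an instance attribute, its __get__ method is not called.
  - In other words, when the attribute being accessed is a descriptor, __get__ is called if the descriptor lives in the class object's tp_dict, and is not called if it lives in the instance object's __dict__.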
demo.py
>>> class A(object):
...     def __get__(self, instance, owner):
...         return "python"
...
>>> class B(object):
...     desc_in_class = A()
...
>>> b = B()
>>> b.desc_in_instance = A()
>>> print("desc_in_class in B:", B.desc_in_class)
desc_in_class in B: python
>>> print("desc_in_class in b:", b.desc_in_class)
desc_in_class in b: python
>>> print("desc_in_instance in b:", b.desc_in_instance)
desc_in_instance in b: <__main__.A object at 0x000001A493879220>