Dashixiong's Python Source Code Study Notes (27): The Class Mechanism in the Virtual Machine (

Author: superkmi | Published: 2021-08-06 13:41

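Dashixiong's Python Source Code Study Notes (26): The Class Mechanism in the Virtual Machine (5)
Dashixiong's Python Source Code Study Notes (28): The Class Mechanism in the Virtual Machine (7)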

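III. User-Defined Classes

3. Accessing Attributes of Instance Objects
  • In Python, an expression of the form x.y or x.y() is called an attribute reference.
  • Here x is an object and y is an attribute of that object.
  • An attribute can be a simple piece of data or a more complex member function.
3.1 Accessing Attribute Values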
demo.py

class A():
    v = 1
a = A()
a.v
  4          20 LOAD_NAME                1 (a)
             22 LOAD_ATTR                2 (v)
             24 POP_TOP
             26 LOAD_CONST               2 (None)
             28 RETURN_VALUE
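  • To access the attribute value, the virtual machine first uses LOAD_NAME to push the instance object bound to a onto the runtime stack.
  • It then executes LOAD_ATTR to access the attribute: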
ceval.c

TARGET(LOAD_ATTR) {
            PyObject *name = GETITEM(names, oparg);
            PyObject *owner = TOP();
            PyObject *res = PyObject_GetAttr(owner, name);
            Py_DECREF(owner);
            SET_TOP(res);
            if (res == NULL)
                goto error;
            DISPATCH();
        }
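The bytecode listing above can be reproduced with the standard dis module. The following is a minimal sketch; the exact offsets and the surrounding instructions depend on the CPython version, so only the presence of LOAD_NAME and LOAD_ATTR is relied on here.

```python
import dis

source = """
class A():
    v = 1
a = A()
a.v
"""

# Compile the demo as a module and collect the opcode names it uses.
code = compile(source, "demo.py", "exec")
opnames = [ins.opname for ins in dis.get_instructions(code)]

# Print the full listing; offsets and neighbouring opcodes vary by version.
dis.dis(code)
```

The last statement, a.v, is the one that produces the LOAD_NAME / LOAD_ATTR pair discussed here.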
  • The core of this instruction is the call to PyObject_GetAttr(owner, name):
Objects/object.c

PyObject *
PyObject_GetAttr(PyObject *v, PyObject *name)
{
    PyTypeObject *tp = Py_TYPE(v);

    if (!PyUnicode_Check(name)) {
        PyErr_Format(PyExc_TypeError,
                     "attribute name must be string, not '%.200s'",
                     name->ob_type->tp_name);
        return NULL;
    }
    if (tp->tp_getattro != NULL)
        return (*tp->tp_getattro)(v, name);
    if (tp->tp_getattr != NULL) {
        const char *name_str = PyUnicode_AsUTF8(name);
        if (name_str == NULL)
            return NULL;
        return (*tp->tp_getattr)(v, (char *)name_str);
    }
    PyErr_Format(PyExc_AttributeError,
                 "'%.50s' object has no attribute '%U'",
                 tp->tp_name, name);
    return NULL;
}
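  • This code accesses the attribute through the tp_getattro slot of the PyTypeObject. The older tp_getattr slot is deprecated and is only used as a fallback when tp_getattro is NULL.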
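Seen from Python, this dispatch can be observed without touching C. The sketch below assumes that the Python-level `__getattribute__` of a type corresponds to its tp_getattro slot, which holds for ordinary classes in CPython; the class A is the one from demo.py.

```python
class A:
    v = 1

a = A()

# a.v, getattr(a, "v") and an explicit call through the type all agree.
via_dot = a.v
via_getattr = getattr(a, "v")
via_slot = type(a).__getattribute__(a, "v")

# A non-string name is rejected before any lookup happens.
try:
    getattr(a, 1)
except TypeError as exc:
    error = exc
else:
    error = None

print(via_dot, via_getattr, via_slot, type(error).__name__)
```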
Include\object.h

... ...
typedef PyObject *(*getattrofunc)(PyObject *, PyObject *);
... ...
typedef struct _typeobject {
    PyObject_VAR_HEAD
    ... ...
    getattrfunc tp_getattr;
    ... ...
    getattrofunc tp_getattro;
    ... ...
} PyTypeObject;
  • When the Python virtual machine creates a class, the class inherits its tp_getattro from PyBaseObject_Type, namely PyObject_GenericGetAttr:
Objects\object.c

PyObject *
PyObject_GenericGetAttr(PyObject *obj, PyObject *name)
{
    return _PyObject_GenericGetAttrWithDict(obj, name, NULL, 0);
}
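This inheritance is visible from Python as well. A small sketch, assuming that `object.__getattribute__` is the Python-level wrapper around PyObject_GenericGetAttr:

```python
class A:
    pass

# A does not define __getattribute__ itself...
defined_on_A = "__getattribute__" in vars(A)

# ...so the lookup resolves to the one provided by object.
inherited = A.__getattribute__ is object.__getattribute__

print(defined_on_A, inherited)
```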
  • _PyObject_GenericGetAttrWithDict implements a fairly involved lookup algorithm:
Objects\object.c

PyObject *
_PyObject_GenericGetAttrWithDict(PyObject *obj, PyObject *name,
                                 PyObject *dict, int suppress)
{
    /* Make sure the logic of _PyObject_GetMethod is in sync with
       this method.

       When suppress=1, this function suppress AttributeError.
    */

    PyTypeObject *tp = Py_TYPE(obj);
    PyObject *descr = NULL;
    PyObject *res = NULL;
    descrgetfunc f;
    Py_ssize_t dictoffset;
    PyObject **dictptr;

    if (!PyUnicode_Check(name)){
        PyErr_Format(PyExc_TypeError,
                     "attribute name must be string, not '%.200s'",
                     name->ob_type->tp_name);
        return NULL;
    }
    Py_INCREF(name);

    if (tp->tp_dict == NULL) {
        if (PyType_Ready(tp) < 0)
            goto done;
    }

    descr = _PyType_Lookup(tp, name);

    f = NULL;
    if (descr != NULL) {
        Py_INCREF(descr);
        f = descr->ob_type->tp_descr_get;
        if (f != NULL && PyDescr_IsData(descr)) {
            res = f(descr, obj, (PyObject *)obj->ob_type);
            if (res == NULL && suppress &&
                    PyErr_ExceptionMatches(PyExc_AttributeError)) {
                PyErr_Clear();
            }
            goto done;
        }
    }

    if (dict == NULL) {
        /* Inline _PyObject_GetDictPtr */
        dictoffset = tp->tp_dictoffset;
        if (dictoffset != 0) {
            if (dictoffset < 0) {
                Py_ssize_t tsize;
                size_t size;

                tsize = ((PyVarObject *)obj)->ob_size;
                if (tsize < 0)
                    tsize = -tsize;
                size = _PyObject_VAR_SIZE(tp, tsize);
                assert(size <= PY_SSIZE_T_MAX);

                dictoffset += (Py_ssize_t)size;
                assert(dictoffset > 0);
                assert(dictoffset % SIZEOF_VOID_P == 0);
            }
            dictptr = (PyObject **) ((char *)obj + dictoffset);
            dict = *dictptr;
        }
    }
    if (dict != NULL) {
        Py_INCREF(dict);
        res = PyDict_GetItem(dict, name);
        if (res != NULL) {
            Py_INCREF(res);
            Py_DECREF(dict);
            goto done;
        }
        Py_DECREF(dict);
    }

    if (f != NULL) {
        res = f(descr, obj, (PyObject *)Py_TYPE(obj));
        if (res == NULL && suppress &&
                PyErr_ExceptionMatches(PyExc_AttributeError)) {
            PyErr_Clear();
        }
        goto done;
    }

    if (descr != NULL) {
        res = descr;
        descr = NULL;
        goto done;
    }

    if (!suppress) {
        PyErr_Format(PyExc_AttributeError,
                     "'%.50s' object has no attribute '%U'",
                     tp->tp_name, name);
    }
  done:
    Py_XDECREF(descr);
    Py_DECREF(name);
    return res;
}
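  • The function first calls _PyType_Lookup to search the type's MRO for name. If the result is a data descriptor, its tp_descr_get function (f) is called and the result is returned immediately.
  • Otherwise, it looks for the attribute in the instance object's own __dict__.
  • If that also fails, it falls back to calling f for a non-data descriptor, then to returning the plain class attribute, and finally raises AttributeError (unless suppress is set).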
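The lookup order in the C code above can be modelled in pure Python. This is only a simplified sketch, not CPython's implementation: the names type_lookup and generic_getattr are hypothetical helpers introduced for illustration, the suppress flag and reference counting are left out, and a descriptor is treated as a data descriptor when its type defines `__set__` or `__delete__`.

```python
_MISSING = object()


def type_lookup(tp, name):
    """Rough stand-in for _PyType_Lookup: scan the class dicts along the MRO."""
    for klass in tp.__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    return _MISSING


def generic_getattr(obj, name):
    """Simplified model of _PyObject_GenericGetAttrWithDict (suppress=0)."""
    if not isinstance(name, str):
        raise TypeError(
            f"attribute name must be string, not '{type(name).__name__}'")

    tp = type(obj)
    descr = type_lookup(tp, name)
    getter = None

    # 1. A data descriptor found on the type wins over everything else.
    if descr is not _MISSING:
        descr_type = type(descr)
        getter = getattr(descr_type, "__get__", None)
        is_data = (hasattr(descr_type, "__set__")
                   or hasattr(descr_type, "__delete__"))
        if getter is not None and is_data:
            return getter(descr, obj, tp)

    # 2. Next comes the instance's own __dict__, if it has one.
    try:
        inst_dict = object.__getattribute__(obj, "__dict__")
    except AttributeError:
        inst_dict = None
    if inst_dict is not None and name in inst_dict:
        return inst_dict[name]

    # 3. Then a non-data descriptor, for example a plain function.
    if getter is not None:
        return getter(descr, obj, tp)

    # 4. Then an ordinary class attribute, returned as-is.
    if descr is not _MISSING:
        return descr

    raise AttributeError(
        f"'{tp.__name__}' object has no attribute '{name}'")


class A:
    v = 1

    def f(self):
        return "in function"

    @property
    def p(self):
        return "from property"


a = A()
a.__dict__["p"] = "shadow attempt"  # ignored: property is a data descriptor
a.__dict__["v"] = 2                 # wins over the plain class attribute

print(generic_getattr(a, "p"), generic_getattr(a, "v"),
      generic_getattr(a, "f")())
```

For these cases the model returns the same values as the built-in attribute access a.p, a.v and a.f().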
3.2 Accessing Member Functions
demo.py 

class A():
    def f(self):
        print("in function")
a = A()
a.f()
  5          20 LOAD_NAME                1 (a)
             22 LOAD_METHOD              2 (f)
             24 CALL_METHOD              0
             26 POP_TOP
             28 LOAD_CONST               2 (None)
             30 RETURN_VALUE
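  • Accessing a member function is similar to accessing an attribute value: LOAD_NAME first pushes the instance object bound to a onto the runtime stack.
  • The virtual machine then executes LOAD_METHOD to access the member function: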
ceval.c

        TARGET(LOAD_METHOD) {
            /* Designed to work in tamdem with CALL_METHOD. */
            PyObject *name = GETITEM(names, oparg);
            PyObject *obj = TOP();
            PyObject *meth = NULL;

            int meth_found = _PyObject_GetMethod(obj, name, &meth);

            if (meth == NULL) {
                /* Most likely attribute wasn't found. */
                goto error;
            }

            if (meth_found) {
                /* We can bypass temporary bound method object.
                   meth is unbound method and obj is self.

                   meth | self | arg1 | ... | argN
                 */
                SET_TOP(meth);
                PUSH(obj);  // self
            }
            else {
                /* meth is not an unbound method (but a regular attr, or
                   something was returned by a descriptor protocol).  Set
                   the second element of the stack to NULL, to signal
                   CALL_METHOD that it's not a method call.

                   NULL | meth | arg1 | ... | argN
                */
                SET_TOP(NULL);
                Py_DECREF(obj);
                PUSH(meth);
            }
            DISPATCH();
        }
  • The core of this instruction is _PyObject_GetMethod, whose return value tells whether a member function was found:
Objects\object.c

/* Specialized version of _PyObject_GenericGetAttrWithDict
   specifically for the LOAD_METHOD opcode.

   Return 1 if a method is found, 0 if it's a regular attribute
   from __dict__ or something returned by using a descriptor
   protocol.

   `method` will point to the resolved attribute or NULL.  In the
   latter case, an error will be set.
*/
int
_PyObject_GetMethod(PyObject *obj, PyObject *name, PyObject **method)
{
    PyTypeObject *tp = Py_TYPE(obj);
    PyObject *descr;
    descrgetfunc f = NULL;
    PyObject **dictptr, *dict;
    PyObject *attr;
    int meth_found = 0;

    assert(*method == NULL);

    if (Py_TYPE(obj)->tp_getattro != PyObject_GenericGetAttr
            || !PyUnicode_Check(name)) {
        *method = PyObject_GetAttr(obj, name);
        return 0;
    }

    if (tp->tp_dict == NULL && PyType_Ready(tp) < 0)
        return 0;

    descr = _PyType_Lookup(tp, name);
    if (descr != NULL) {
        Py_INCREF(descr);
        if (PyFunction_Check(descr) ||
                (Py_TYPE(descr) == &PyMethodDescr_Type)) {
            meth_found = 1;
        } else {
            f = descr->ob_type->tp_descr_get;
            if (f != NULL && PyDescr_IsData(descr)) {
                *method = f(descr, obj, (PyObject *)obj->ob_type);
                Py_DECREF(descr);
                return 0;
            }
        }
    }

    dictptr = _PyObject_GetDictPtr(obj);
    if (dictptr != NULL && (dict = *dictptr) != NULL) {
        Py_INCREF(dict);
        attr = PyDict_GetItem(dict, name);
        if (attr != NULL) {
            Py_INCREF(attr);
            *method = attr;
            Py_DECREF(dict);
            Py_XDECREF(descr);
            return 0;
        }
        Py_DECREF(dict);
    }

    if (meth_found) {
        *method = descr;
        return 1;
    }

    if (f != NULL) {
        *method = f(descr, obj, (PyObject *)Py_TYPE(obj));
        Py_DECREF(descr);
        return 0;
    }

    if (descr != NULL) {
        *method = descr;
        return 0;
    }

    PyErr_Format(PyExc_AttributeError,
                 "'%.50s' object has no attribute '%U'",
                 tp->tp_name, name);
    return 0;
}
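  • As before, the member function is searched for through descriptors and __dict__. If a method is found, the function returns 1, LOAD_METHOD places the corresponding PyObject (together with self) on the stack, and the bytecode instruction that follows, CALL_METHOD, calls it.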
ceval.c

TARGET(CALL_METHOD) {
            /* Designed to work in tamdem with LOAD_METHOD. */
            PyObject **sp, *res, *meth;

            sp = stack_pointer;

            meth = PEEK(oparg + 2);
            if (meth == NULL) {
                /* `meth` is NULL when LOAD_METHOD thinks that it's not
                   a method call.

                   Stack layout:

                       ... | NULL | callable | arg1 | ... | argN
                                                            ^- TOP()
                                               ^- (-oparg)
                                    ^- (-oparg-1)
                             ^- (-oparg-2)

                   `callable` will be POPed by call_function.
                   NULL will will be POPed manually later.
                */
                res = call_function(&sp, oparg, NULL);
                stack_pointer = sp;
                (void)POP(); /* POP the NULL. */
            }
            else {
                /* This is a method call.  Stack layout:

                     ... | method | self | arg1 | ... | argN
                                                        ^- TOP()
                                           ^- (-oparg)
                                    ^- (-oparg-1)
                           ^- (-oparg-2)

                  `self` and `method` will be POPed by call_function.
                  We'll be passing `oparg + 1` to call_function, to
                  make it accept the `self` as a first argument.
                */
                res = call_function(&sp, oparg + 1, NULL);
                stack_pointer = sp;
            }

            PUSH(res);
            if (res == NULL)
                goto error;
            DISPATCH();
        }
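The two stack layouts handled by CALL_METHOD correspond to two situations that are easy to reproduce from Python. The sketch below is an illustration under the assumption that the class-level function is what LOAD_METHOD treats as an "unbound method"; the attribute g is a hypothetical addition to the demo class.

```python
class A:
    def f(self):
        return "method on " + type(self).__name__

a = A()

# The function stored on the class is the raw, unbound function...
raw = vars(A)["f"]
# ...and an ordinary a.f access builds a bound-method wrapper around it.
bound = a.f
wraps_raw = bound.__func__ is raw and bound.__self__ is a

# Calling raw(a) mirrors the "method | self | args" layout: no wrapper needed.
fast_path_result = raw(a)

# A callable stored in the instance __dict__ is a regular attribute:
# it is called without self (the "NULL | callable | args" layout).
a.g = lambda: "plain callable"
slow_path_result = a.g()

print(wraps_raw, fast_path_result, slow_path_result)
```

Avoiding the temporary bound-method object in the first case is exactly what the comment "We can bypass temporary bound method object" in LOAD_METHOD refers to.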
3.3 The __dict__ of Instance Objects
  • When a class is created, the virtual machine sets a field named tp_dictoffset.
  • tp_dictoffset is the offset of __dict__ within the instance object.
Objects\object.c

PyObject *
_PyObject_GenericGetAttrWithDict(PyObject *obj, PyObject *name,
                                 PyObject *dict, int suppress)
{
    ... ...
    Py_ssize_t dictoffset;
    ... ...
    if (dict == NULL) {
        /* Inline _PyObject_GetDictPtr */
        dictoffset = tp->tp_dictoffset;
        if (dictoffset != 0) {
            if (dictoffset < 0) {
                Py_ssize_t tsize;
                size_t size;

                tsize = ((PyVarObject *)obj)->ob_size;
                if (tsize < 0)
                    tsize = -tsize;
                size = _PyObject_VAR_SIZE(tp, tsize);
                assert(size <= PY_SSIZE_T_MAX);

                dictoffset += (Py_ssize_t)size;
                assert(dictoffset > 0);
                assert(dictoffset % SIZEOF_VOID_P == 0);
            }
            dictptr = (PyObject **) ((char *)obj + dictoffset);
            dict = *dictptr;
        }
    }
    ... ...
}
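  • If dictoffset is less than 0, the class inherits from a variable-size object. The virtual machine then adjusts dictoffset by the actual size of the object so that it points to the extra memory the class reserved for __dict__.
  • _PyObject_GenericGetAttrWithDict uses this dictoffset to obtain the dict object.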
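The tp_dictoffset field is exposed to Python as the `__dictoffset__` attribute of a type. The following sketch only prints the values, because they are CPython implementation details that differ between versions and platforms; the class T is a hypothetical example of a class derived from a variable-size object.

```python
class A:
    pass

class T(tuple):  # tuple is a variable-size object
    pass

a = A()
a.x = 1

# Types whose instances carry a __dict__ report a nonzero offset;
# int instances have no __dict__, so the offset is 0.
print("A.__dictoffset__   =", A.__dictoffset__)
print("T.__dictoffset__   =", T.__dictoffset__)
print("int.__dictoffset__ =", int.__dictoffset__)

# The dict found through that offset is what vars() returns.
instance_dict = vars(a)
print(instance_dict)
```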
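3.4 The Role of Descriptors
  • In PyType_Ready, the virtual machine fills tp_dict, and each operation name in it maps to a descriptor.
  • Descriptors fall into two kinds:

data descriptor: a descriptor whose type defines both __get__ and __set__.
non data descriptor: a descriptor whose type defines only __get__.

  • When the virtual machine accesses an attribute of an instance object, the first role of a descriptor is to influence which attribute the virtual machine selects:
  • An attribute that the virtual machine finds in the instance object's __dict__ is called an instance attribute.
  • An attribute that the virtual machine finds along the mro list of the instance's class object is called a class attribute.
  • The Python virtual machine selects attributes in the order instance attribute, then class attribute, so an instance attribute takes precedence over a class attribute.
  • If a data descriptor with the same name is found among the class attributes, that descriptor takes precedence over the instance attribute.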
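The standard library's inspect.isdatadescriptor can be used to check which kind a given object is. A short sketch with two hypothetical descriptor classes, DataDesc and NonDataDesc; note that inspect also treats a type that defines `__delete__` as a data descriptor, which is slightly broader than the definition above.

```python
import inspect

class DataDesc:
    def __get__(self, instance, owner):
        return "data"
    def __set__(self, instance, value):
        pass

class NonDataDesc:
    def __get__(self, instance, owner):
        return "non-data"

def f():
    pass

kinds = {
    "DataDesc()": inspect.isdatadescriptor(DataDesc()),
    "NonDataDesc()": inspect.isdatadescriptor(NonDataDesc()),
    "property()": inspect.isdatadescriptor(property()),
    "function": inspect.isdatadescriptor(f),
}
print(kinds)
```

Plain functions define only `__get__`, which is why methods are non data descriptors and can be shadowed by instance attributes.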
demo.py 

class A(list):
    def __get__(self, instance, owner):
        return A.__get__
    def __set__(self, instance, value):
        print("A.__set__")
class B(object):
    value = A()
b = B()
b.value = 1
print(b.value)
A.__set__
<function A.__get__ at 0x000002C33C8D7F70>
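  • The second role of a descriptor is to affect the result of the access:
  • As the example above shows, when the attribute finally obtained is a descriptor, the virtual machine does not simply return the descriptor itself; it returns the result of calling descriptor.__get__.
  • There is one exception to this rule: in PyObject_GenericGetAttr, if the descriptor that was found is an instance attribute, its __get__ method is not called.
  • In other words, when the attribute being accessed is a descriptor, its __get__ method is called if it lives in the class object's tp_dict, and is not called if it lives in the instance object's __dict__.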
demo.py 

class A(object):
    def __get__(self, instance, owner):
        return "python"

class B(object):
    desc_in_class = A()

b = B()
b.desc_in_instance = A()

print("desc_in_class in B:",B.desc_in_class)
print("desc_in_class in b:",b.desc_in_class)
print("desc_in_instance in b:",b.desc_in_instance)
desc_in_class in B: python
desc_in_class in b: python
desc_in_instance in b: <__main__.A object at 0x000001A493879220>
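The selection rules of this section can be checked side by side. The sketch below uses two hypothetical descriptor classes and writes directly into the instance __dict__ so that `__set__` is bypassed; it is an illustration of the precedence rules, not part of the original demo.

```python
class DataDesc:
    def __get__(self, instance, owner):
        return "data descriptor"
    def __set__(self, instance, value):
        raise AttributeError("read-only")

class NonDataDesc:
    def __get__(self, instance, owner):
        return "non-data descriptor"

class B:
    d = DataDesc()
    n = NonDataDesc()

b = B()
# Write straight into the instance __dict__, bypassing __set__.
vars(b)["d"] = "instance value"
vars(b)["n"] = "instance value"

# d: the data descriptor on the class beats the instance attribute.
# n: the instance attribute beats the non data descriptor.
results = (b.d, b.n)
print(results)
```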
