Dashixiong's Python Source Code Study Notes (30): Runtime Environment Initialization (Part 2)


Author: superkmi | Published 2021-08-27 09:16

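    Dashixiong's Python Source Code Study Notes (29): Runtime Environment Initialization (Part 1)
    Dashixiong's Python Source Code Study Notes (31): Runtime Environment Initialization (Part 3)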

    II. Initializing the system modules

    • In Python's interactive mode, entering dir() prints a list:
    >>> dir()
    ['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__']
    
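    • For Python to execute dir(), it must have found a callable object bound to the name "dir" in some namespace.
    • This means that once Python has started, such a namespace already exists and already contains the name "dir".
    • That namespace and its values come from the system modules, which are set up in Py_InitializeEx. The first one Python creates is the builtins module (called __builtin__ in Python 2).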
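
As a rough illustration of that lookup, the Python sketch below resolves a name first in a globals dictionary and then in the builtins namespace. The helper `resolve` is hypothetical and exists only for this note; the real lookup is performed inside the interpreter when it evaluates bytecode.

```python
import builtins


def resolve(name, global_ns):
    """Hypothetical helper: look a name up in globals, then in builtins."""
    if name in global_ns:
        return global_ns[name]
    builtin_ns = vars(builtins)  # the dict behind the builtins module
    if name in builtin_ns:
        return builtin_ns[name]
    raise NameError(f"name {name!r} is not defined")


# "dir" is absent from an empty globals dict, so it is found in builtins.
found = resolve("dir", {})
print(found is dir)
print(callable(found))
```

A name defined in the globals dictionary shadows the built-in one, which is why the globals are checked first.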
    1. Creating the builtins module
    • In Py_InitializeEx -> _Py_InitializeCore_impl, once the PyInterpreterState and PyThreadState objects have been created, _PyBuiltin_Init is called to set up the builtins module:
    Python\pylifecycle.c
    
    _PyInitError
    _Py_InitializeCore_impl(PyInterpreterState **interp_p,
                            const _PyCoreConfig *core_config)
    {
        PyInterpreterState *interp;
        _PyInitError err;
    
        ... ...
        PyObject *modules = PyDict_New();
        if (modules == NULL)
            return _Py_INIT_ERR("can't make modules dictionary");
        interp->modules = modules;
        ... ...
        PyObject *bimod = _PyBuiltin_Init();
        ... ...
    }
    
    • Before calling _PyBuiltin_Init(), Python creates interp->modules as a PyDictObject that keeps track of every module, as the PyDict_New() call above shows. _PyBuiltin_Init() itself is defined as follows:
    Python\bltinmodule.c
    
    PyObject *
    _PyBuiltin_Init(void)
    {
        PyObject *mod, *dict, *debug;
    
        if (PyType_Ready(&PyFilter_Type) < 0 ||
            PyType_Ready(&PyMap_Type) < 0 ||
            PyType_Ready(&PyZip_Type) < 0)
            return NULL;
    
        mod = _PyModule_CreateInitialized(&builtinsmodule, PYTHON_API_VERSION);
        if (mod == NULL)
            return NULL;
        dict = PyModule_GetDict(mod);
    
    #ifdef Py_TRACE_REFS
        /* "builtins" exposes a number of statically allocated objects
         * that, before this code was added in 2.3, never showed up in
         * the list of "all objects" maintained by Py_TRACE_REFS.  As a
         * result, programs leaking references to None and False (etc)
         * couldn't be diagnosed by examining sys.getobjects(0).
         */
    #define ADD_TO_ALL(OBJECT) _Py_AddToAllObjects((PyObject *)(OBJECT), 0)
    #else
    #define ADD_TO_ALL(OBJECT) (void)0
    #endif
    
    #define SETBUILTIN(NAME, OBJECT) \
        if (PyDict_SetItemString(dict, NAME, (PyObject *)OBJECT) < 0)       \
            return NULL;                                                    \
        ADD_TO_ALL(OBJECT)
    
        SETBUILTIN("None",                  Py_None);
        SETBUILTIN("Ellipsis",              Py_Ellipsis);
        SETBUILTIN("NotImplemented",        Py_NotImplemented);
        SETBUILTIN("False",                 Py_False);
        SETBUILTIN("True",                  Py_True);
        SETBUILTIN("bool",                  &PyBool_Type);
        SETBUILTIN("memoryview",        &PyMemoryView_Type);
        SETBUILTIN("bytearray",             &PyByteArray_Type);
        SETBUILTIN("bytes",                 &PyBytes_Type);
        SETBUILTIN("classmethod",           &PyClassMethod_Type);
        SETBUILTIN("complex",               &PyComplex_Type);
        SETBUILTIN("dict",                  &PyDict_Type);
        SETBUILTIN("enumerate",             &PyEnum_Type);
        SETBUILTIN("filter",                &PyFilter_Type);
        SETBUILTIN("float",                 &PyFloat_Type);
        SETBUILTIN("frozenset",             &PyFrozenSet_Type);
        SETBUILTIN("property",              &PyProperty_Type);
        SETBUILTIN("int",                   &PyLong_Type);
        SETBUILTIN("list",                  &PyList_Type);
        SETBUILTIN("map",                   &PyMap_Type);
        SETBUILTIN("object",                &PyBaseObject_Type);
        SETBUILTIN("range",                 &PyRange_Type);
        SETBUILTIN("reversed",              &PyReversed_Type);
        SETBUILTIN("set",                   &PySet_Type);
        SETBUILTIN("slice",                 &PySlice_Type);
        SETBUILTIN("staticmethod",          &PyStaticMethod_Type);
        SETBUILTIN("str",                   &PyUnicode_Type);
        SETBUILTIN("super",                 &PySuper_Type);
        SETBUILTIN("tuple",                 &PyTuple_Type);
        SETBUILTIN("type",                  &PyType_Type);
        SETBUILTIN("zip",                   &PyZip_Type);
        debug = PyBool_FromLong(Py_OptimizeFlag == 0);
        if (PyDict_SetItemString(dict, "__debug__", debug) < 0) {
            Py_DECREF(debug);
            return NULL;
        }
        Py_DECREF(debug);
    
        return mod;
    #undef ADD_TO_ALL
    #undef SETBUILTIN
    }
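
The effect of the SETBUILTIN lines can be observed from Python. This is a small sketch that assumes `vars(builtins)` exposes the same dictionary that `PyModule_GetDict` returns in the C code; the names checked are a handful taken from the listing above.

```python
import builtins

namespace = vars(builtins)  # Python-level view of the module's dict

# Type objects registered by the SETBUILTIN lines
for name in ("bool", "int", "str", "dict", "zip"):
    print(name, isinstance(namespace[name], type))

# The singletons are stored as ordinary dict entries as well
print(namespace["None"] is None, namespace["True"] is True)

# __debug__ is derived from Py_OptimizeFlag (False when running with -O)
print(namespace["__debug__"] == __debug__)
```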
    
    • _PyBuiltin_Init sets up the builtins module in two steps:
    1. Create the PyModuleObject object.
    2. Populate the module by inserting Python's built-in type objects into it.
    • In fact, the first step already does most of the work for the builtins module, through _PyModule_CreateInitialized:
    Objects\moduleobject.c
    
    PyObject *
    _PyModule_CreateInitialized(struct PyModuleDef* module, int module_api_version)
    {
        const char* name;
        PyModuleObject *m;
    
        if (!PyModuleDef_Init(module))
            return NULL;
        name = module->m_name;
        if (!check_api_version(name, module_api_version)) {
            return NULL;
        }
        if (module->m_slots) {
            PyErr_Format(
                PyExc_SystemError,
                "module %s: PyModule_Create is incompatible with m_slots", name);
            return NULL;
        }
        /* Make sure name is fully qualified.
    
           This is a bit of a hack: when the shared library is loaded,
           the module name is "package.module", but the module calls
           PyModule_Create*() with just "module" for the name.  The shared
           library loader squirrels away the true name of the module in
           _Py_PackageContext, and PyModule_Create*() will substitute this
           (if the name actually matches).
        */
        if (_Py_PackageContext != NULL) {
            const char *p = strrchr(_Py_PackageContext, '.');
            if (p != NULL && strcmp(module->m_name, p+1) == 0) {
                name = _Py_PackageContext;
                _Py_PackageContext = NULL;
            }
        }
        if ((m = (PyModuleObject*)PyModule_New(name)) == NULL)
            return NULL;
    
        if (module->m_size > 0) {
            m->md_state = PyMem_MALLOC(module->m_size);
            if (!m->md_state) {
                PyErr_NoMemory();
                Py_DECREF(m);
                return NULL;
            }
            memset(m->md_state, 0, module->m_size);
        }
    
        if (module->m_methods != NULL) {
            if (PyModule_AddFunctions((PyObject *) m, module->m_methods) != 0) {
                Py_DECREF(m);
                return NULL;
            }
        }
        if (module->m_doc != NULL) {
            if (PyModule_SetDocString((PyObject *) m, module->m_doc) != 0) {
                Py_DECREF(m);
                return NULL;
            }
        }
        m->md_def = module;
        return (PyObject*)m;
    }
    
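    • Of the function's parameters, module is the PyModuleDef that describes the module, and module_api_version is Python's internal API version value, used for a version check.
    1.1 Creating the module object
    • Inside _PyModule_CreateInitialized, PyModule_New creates the module object itself: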
    Objects\moduleobject.c
    
    PyObject *
    PyModule_New(const char *name)
    {
        PyObject *nameobj, *module;
        nameobj = PyUnicode_FromString(name);
        if (nameobj == NULL)
            return NULL;
        module = PyModule_NewObject(nameobj);
        Py_DECREF(nameobj);
        return module;
    }
    
    • Python keeps a collection of every module loaded into memory, interp->modules, which is a PyDictObject.
    • interp->modules stores (module name, module object) pairs.
    • At the Python level, interp->modules corresponds to sys.modules.
    • In practice, a PyModuleObject is a thin wrapper around a PyDictObject:
    Objects\moduleobject.c
    
    PyObject *
    PyModule_NewObject(PyObject *name)
    {
        PyModuleObject *m;
        m = PyObject_GC_New(PyModuleObject, &PyModule_Type);
        if (m == NULL)
            return NULL;
        m->md_def = NULL;
        m->md_state = NULL;
        m->md_weaklist = NULL;
        m->md_name = NULL;
        m->md_dict = PyDict_New();
        if (module_init_dict(m, m->md_dict, name, NULL) != 0)
            goto fail;
        PyObject_GC_Track(m);
        return (PyObject *)m;
    
     fail:
        Py_DECREF(m);
        return NULL;
    }
    
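    • In the end, PyModule_New only creates an empty module: a PyModuleObject whose md_dict holds the defaults written by module_init_dict. In the code shown here it does not add the module to interp->modules itself; that registration happens later, in the _PyImport_FixupBuiltin call in _Py_InitializeCore_impl.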
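
A comparable empty module can be created from Python with `types.ModuleType`. This is only a sketch: `types.ModuleType` goes through the module type's own constructor rather than `PyModule_New`, and the module name used here is an arbitrary example.

```python
import sys
import types

demo = types.ModuleType("demo_empty_module")  # arbitrary example name

# The fresh module's dict holds only default attributes such as __name__
print(sorted(vars(demo)))

# Creating a module object does not register it in sys.modules
print("demo_empty_module" in sys.modules)

# sys.modules is an ordinary dict of (module name, module object) pairs
print(isinstance(sys.modules, dict))
print(sys.modules["builtins"].__name__)
```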
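    1.2 Populating the module object
    • After PyModule_New returns, control goes back to _PyModule_CreateInitialized, which sets almost all of the builtins module's attributes.
    • This step depends on the m_methods field of the module parameter, which here is builtin_methods. _PyModule_CreateInitialized passes it to PyModule_AddFunctions, which walks the table and processes each entry: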
    Include\methodobject.h
    
    typedef PyObject *(*PyCFunction)(PyObject *, PyObject *);
    ... ...
    struct PyMethodDef {
        const char  *ml_name;   /* The name of the built-in function/method */
        PyCFunction ml_meth;    /* The C function that implements it */
        int         ml_flags;   /* Combination of METH_xxx flags, which mostly
                                   describe the args expected by the C func */
        const char  *ml_doc;    /* The __doc__ attribute, or NULL */
    };
    typedef struct PyMethodDef PyMethodDef;
    
    Python\bltinmodule.c
    
    static PyMethodDef builtin_methods[] = {
        {"__build_class__", (PyCFunction)builtin___build_class__,
         METH_FASTCALL | METH_KEYWORDS, build_class_doc},
        {"__import__",      (PyCFunction)builtin___import__, METH_VARARGS | METH_KEYWORDS, import_doc},
        BUILTIN_ABS_METHODDEF
        BUILTIN_ALL_METHODDEF
        BUILTIN_ANY_METHODDEF
        BUILTIN_ASCII_METHODDEF
        BUILTIN_BIN_METHODDEF
        {"breakpoint",      (PyCFunction)builtin_breakpoint, METH_FASTCALL | METH_KEYWORDS, breakpoint_doc},
        BUILTIN_CALLABLE_METHODDEF
        BUILTIN_CHR_METHODDEF
        BUILTIN_COMPILE_METHODDEF
        BUILTIN_DELATTR_METHODDEF
        {"dir",             builtin_dir,        METH_VARARGS, dir_doc},
        BUILTIN_DIVMOD_METHODDEF
        BUILTIN_EVAL_METHODDEF
        BUILTIN_EXEC_METHODDEF
        BUILTIN_FORMAT_METHODDEF
        {"getattr",         (PyCFunction)builtin_getattr, METH_FASTCALL, getattr_doc},
        BUILTIN_GLOBALS_METHODDEF
        BUILTIN_HASATTR_METHODDEF
        BUILTIN_HASH_METHODDEF
        BUILTIN_HEX_METHODDEF
        BUILTIN_ID_METHODDEF
        BUILTIN_INPUT_METHODDEF
        BUILTIN_ISINSTANCE_METHODDEF
        BUILTIN_ISSUBCLASS_METHODDEF
        {"iter",            builtin_iter,       METH_VARARGS, iter_doc},
        BUILTIN_LEN_METHODDEF
        BUILTIN_LOCALS_METHODDEF
        {"max",             (PyCFunction)builtin_max,        METH_VARARGS | METH_KEYWORDS, max_doc},
        {"min",             (PyCFunction)builtin_min,        METH_VARARGS | METH_KEYWORDS, min_doc},
        {"next",            (PyCFunction)builtin_next,       METH_FASTCALL, next_doc},
        BUILTIN_OCT_METHODDEF
        BUILTIN_ORD_METHODDEF
        BUILTIN_POW_METHODDEF
        {"print",           (PyCFunction)builtin_print,      METH_FASTCALL | METH_KEYWORDS, print_doc},
        BUILTIN_REPR_METHODDEF
        BUILTIN_ROUND_METHODDEF
        BUILTIN_SETATTR_METHODDEF
        BUILTIN_SORTED_METHODDEF
        BUILTIN_SUM_METHODDEF
        {"vars",            builtin_vars,       METH_VARARGS, vars_doc},
        {NULL,              NULL},
    };
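
The table-driven registration can be mimicked in Python. This is an analogy only, not CPython's implementation: `MethodDef`, `add_functions`, and the two demo functions are hypothetical names invented for this sketch, and the `None` entry plays the role of the `{NULL, NULL}` sentinel.

```python
import types
from collections import namedtuple

# Hypothetical stand-in for PyMethodDef: name, implementation, docstring
MethodDef = namedtuple("MethodDef", ["ml_name", "ml_meth", "ml_doc"])


def demo_abs(x):
    return x if x >= 0 else -x


def demo_ord_sum(text):
    return sum(ord(ch) for ch in text)


# The table ends with a sentinel entry, like {NULL, NULL} in C
demo_methods = [
    MethodDef("demo_abs", demo_abs, "Return the absolute value."),
    MethodDef("demo_ord_sum", demo_ord_sum, "Sum the code points."),
    MethodDef(None, None, None),
]


def add_functions(module, table):
    """Walk the table up to the sentinel and store each entry in the module dict."""
    for entry in table:
        if entry.ml_name is None:
            break
        entry.ml_meth.__doc__ = entry.ml_doc
        vars(module)[entry.ml_name] = entry.ml_meth


demo_module = types.ModuleType("demo_builtins")
add_functions(demo_module, demo_methods)
print(demo_module.demo_abs(-3))
```

The point of the table is that adding a new function only needs a new row; the registration loop stays the same.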
    
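    • As builtin_methods shows, familiar functions such as __import__, dir, and getattr all appear here, each described by a PyMethodDef structure as defined above.
    • For every PyMethodDef in builtin_methods, _PyModule_CreateInitialized creates a PyCFunctionObject based on it.
    • This object is Python's wrapper around a function pointer, tying the pointer together with other information: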
    Include\methodobject.h
    
    typedef struct {
        PyObject_HEAD
        PyMethodDef *m_ml; /* Description of the C function to call */
        PyObject    *m_self; /* Passed as 'self' arg to the C func, can be NULL */
        PyObject    *m_module; /* The __module__ attribute, can be anything */
        PyObject    *m_weakreflist; /* List of weak references */
    } PyCFunctionObject;
    
    Objects\methodobject.c
    
    static PyCFunctionObject *free_list = NULL;
    ... ...
    PyObject *
    PyCFunction_NewEx(PyMethodDef *ml, PyObject *self, PyObject *module)
    {
        PyCFunctionObject *op;
        op = free_list;
        if (op != NULL) {
            free_list = (PyCFunctionObject *)(op->m_self);
            (void)PyObject_INIT(op, &PyCFunction_Type);
            numfree--;
        }
        else {
            op = PyObject_GC_New(PyCFunctionObject, &PyCFunction_Type);
            if (op == NULL)
                return NULL;
        }
        op->m_weakreflist = NULL;
        op->m_ml = ml;
        Py_XINCREF(self);
        op->m_self = self;
        Py_XINCREF(module);
        op->m_module = module;
        _PyObject_GC_TRACK(op);
        return (PyObject *)op;
    }
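
Several of these fields are visible from Python on any built-in function. The sketch below uses `len` as an example; the mapping of attributes to C fields in the comments reflects the structures shown above, under the assumption that module-level built-ins are created with the module as `self` and the module name as `module`.

```python
import builtins

func = builtins.len

print(type(func).__name__)        # Python-level name of PyCFunction_Type
print(func.__name__)              # taken from m_ml->ml_name
print(func.__self__ is builtins)  # m_self: the module the function belongs to
print(func.__module__)            # m_module: the owning module's name
```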
    
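    • The presence of free_list shows that PyCFunctionObject uses an object-pool strategy.
    • Also note m_module: here it holds a str object (PyUnicodeObject; PyStringObject in Python 2) containing the name of the PyModuleObject.
    • At this point the builtins module is fully assembled: a PyModuleObject whose md_dict maps each built-in name to a type object or a PyCFunctionObject.
    • After _PyBuiltin_Init() returns, Python extracts the PyDictObject held by the PyModuleObject and assigns it to interp->builtins: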
    Python\pylifecycle.c
    
    _PyInitError
    _Py_InitializeCore_impl(PyInterpreterState **interp_p,
                            const _PyCoreConfig *core_config)
    {
        PyInterpreterState *interp;
        _PyInitError err;
    
        ... ...
    
        PyObject *bimod = _PyBuiltin_Init();
        if (bimod == NULL)
            return _Py_INIT_ERR("can't initialize builtins modules");
        _PyImport_FixupBuiltin(bimod, "builtins", modules);
        interp->builtins = PyModule_GetDict(bimod);
        ... ...
    }
    
    Objects\moduleobject.c
    
    PyObject *
    PyModule_GetDict(PyObject *m)
    {
        PyObject *d;
        if (!PyModule_Check(m)) {
            PyErr_BadInternalCall();
            return NULL;
        }
        d = ((PyModuleObject *)m) -> md_dict;
        assert(d != NULL);
        return d;
    }
    
    • From then on, whenever Python needs the builtins module it can go straight to interp->builtins, which serves as a shortcut that speeds up access.
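
The cached builtins dictionary can be glimpsed from Python through a frame object. This sketch relies on `sys._getframe`, which is a CPython implementation detail, and the helper `current_builtins` is a hypothetical name used only here.

```python
import builtins
import sys


def current_builtins():
    """Return the builtins dict the running frame uses for name lookup."""
    return sys._getframe().f_builtins


ns = current_builtins()
print(isinstance(ns, dict))
print(ns["dir"] is builtins.dir)
# In an ordinary interpreter session this is normally the module's own dict
print(ns is vars(builtins))
```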


          Article link: https://www.haomeiwen.com/subject/jsdsiltx.html