II. Initializing the system modules
- In Python's interactive mode, entering dir() displays a list:
>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__']
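- For Python to execute dir(), it must have found a callable object bound to the symbol "dir" in some namespace.
- This means that by the time Python has started, some namespace has already been created and already contains the symbol "dir".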
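- That namespace and its contents come from the system modules, which are set up during Py_InitializeEx. The first one Python creates is the builtins module (called __builtin__ in Python 2).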
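The lookup described above can be imitated from Python. This is a minimal sketch, not the interpreter's actual code path: the helper `resolve` is a hypothetical name introduced here, and it only mimics the locals, globals, builtins search order for a single name.

```python
import builtins


def resolve(name, local_ns, global_ns):
    """Search locals, then globals, then the builtins module's dict."""
    for label, namespace in (("locals", local_ns),
                             ("globals", global_ns),
                             ("builtins", vars(builtins))):
        if name in namespace:
            return label, namespace[name]
    raise NameError(name)


# Neither namespace defines "dir", so the search falls through to builtins.
where, obj = resolve("dir", {}, {"__name__": "__main__"})
print(where, callable(obj), obj is builtins.dir)
```

With empty local and global namespaces, the only place left to find "dir" is the dict of the builtins module, which is why that module has to exist before any user code runs.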
1. Creating the builtins module
- In Py_InitializeEx -> _Py_InitializeCore_impl, once the PyInterpreterState and PyThreadState objects have been created, _PyBuiltin_Init is called to set up the builtins module:
Python\pylifecycle.c
_PyInitError
_Py_InitializeCore_impl(PyInterpreterState **interp_p,
                        const _PyCoreConfig *core_config)
{
    PyInterpreterState *interp;
    _PyInitError err;
    ... ...
    PyObject *modules = PyDict_New();
    if (modules == NULL)
        return _Py_INIT_ERR("can't make modules dictionary");
    interp->modules = modules;
    ... ...
    PyObject *bimod = _PyBuiltin_Init();
    ... ...
}
- Before calling _PyBuiltin_Init(), Python creates interp->modules as a PyDictObject that keeps track of every module. _PyBuiltin_Init() itself is defined as follows:
Python\bltinmodule.c
PyObject *
_PyBuiltin_Init(void)
{
    PyObject *mod, *dict, *debug;

    if (PyType_Ready(&PyFilter_Type) < 0 ||
        PyType_Ready(&PyMap_Type) < 0 ||
        PyType_Ready(&PyZip_Type) < 0)
        return NULL;

    mod = _PyModule_CreateInitialized(&builtinsmodule, PYTHON_API_VERSION);
    if (mod == NULL)
        return NULL;
    dict = PyModule_GetDict(mod);

#ifdef Py_TRACE_REFS
    /* "builtins" exposes a number of statically allocated objects
     * that, before this code was added in 2.3, never showed up in
     * the list of "all objects" maintained by Py_TRACE_REFS.  As a
     * result, programs leaking references to None and False (etc)
     * couldn't be diagnosed by examining sys.getobjects(0).
     */
#define ADD_TO_ALL(OBJECT) _Py_AddToAllObjects((PyObject *)(OBJECT), 0)
#else
#define ADD_TO_ALL(OBJECT) (void)0
#endif

#define SETBUILTIN(NAME, OBJECT) \
    if (PyDict_SetItemString(dict, NAME, (PyObject *)OBJECT) < 0) \
        return NULL; \
    ADD_TO_ALL(OBJECT)

    SETBUILTIN("None", Py_None);
    SETBUILTIN("Ellipsis", Py_Ellipsis);
    SETBUILTIN("NotImplemented", Py_NotImplemented);
    SETBUILTIN("False", Py_False);
    SETBUILTIN("True", Py_True);
    SETBUILTIN("bool", &PyBool_Type);
    SETBUILTIN("memoryview", &PyMemoryView_Type);
    SETBUILTIN("bytearray", &PyByteArray_Type);
    SETBUILTIN("bytes", &PyBytes_Type);
    SETBUILTIN("classmethod", &PyClassMethod_Type);
    SETBUILTIN("complex", &PyComplex_Type);
    SETBUILTIN("dict", &PyDict_Type);
    SETBUILTIN("enumerate", &PyEnum_Type);
    SETBUILTIN("filter", &PyFilter_Type);
    SETBUILTIN("float", &PyFloat_Type);
    SETBUILTIN("frozenset", &PyFrozenSet_Type);
    SETBUILTIN("property", &PyProperty_Type);
    SETBUILTIN("int", &PyLong_Type);
    SETBUILTIN("list", &PyList_Type);
    SETBUILTIN("map", &PyMap_Type);
    SETBUILTIN("object", &PyBaseObject_Type);
    SETBUILTIN("range", &PyRange_Type);
    SETBUILTIN("reversed", &PyReversed_Type);
    SETBUILTIN("set", &PySet_Type);
    SETBUILTIN("slice", &PySlice_Type);
    SETBUILTIN("staticmethod", &PyStaticMethod_Type);
    SETBUILTIN("str", &PyUnicode_Type);
    SETBUILTIN("super", &PySuper_Type);
    SETBUILTIN("tuple", &PyTuple_Type);
    SETBUILTIN("type", &PyType_Type);
    SETBUILTIN("zip", &PyZip_Type);

    debug = PyBool_FromLong(Py_OptimizeFlag == 0);
    if (PyDict_SetItemString(dict, "__debug__", debug) < 0) {
        Py_DECREF(debug);
        return NULL;
    }
    Py_DECREF(debug);

    return mod;
#undef ADD_TO_ALL
#undef SETBUILTIN
}
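The effect of the SETBUILTIN calls above can be checked from the Python side. The sketch below only inspects the finished module. It assumes that `vars(builtins)` is the same dict that `PyModule_GetDict` returns in C, and the variable names are chosen for illustration.

```python
import builtins

ns = vars(builtins)  # the module's dict

# Singletons installed by SETBUILTIN are the very same objects.
singletons = {"None": None, "Ellipsis": Ellipsis,
              "NotImplemented": NotImplemented,
              "True": True, "False": False}
singletons_match = all(ns[name] is obj for name, obj in singletons.items())

# The remaining SETBUILTIN entries are type objects.
type_names = ["bool", "bytes", "dict", "int", "list",
              "str", "tuple", "type", "zip"]
all_types = all(isinstance(ns[name], type) for name in type_names)

# __debug__ is derived from the optimize flag (it is False under -O).
debug_matches = ns["__debug__"] is __debug__

print(singletons_match, all_types, debug_matches)
```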
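- The job of _PyBuiltin_Init is to set up the builtins module, which it does in two steps:
- Create the PyModuleObject.
- Populate the module by inserting Python's built-in singletons and type objects into its dict.
- The first step, carried out by _PyModule_CreateInitialized, already does most of the work of building the builtins module: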
Objects\moduleobject.c
PyObject *
_PyModule_CreateInitialized(struct PyModuleDef* module, int module_api_version)
{
    const char* name;
    PyModuleObject *m;

    if (!PyModuleDef_Init(module))
        return NULL;
    name = module->m_name;
    if (!check_api_version(name, module_api_version)) {
        return NULL;
    }
    if (module->m_slots) {
        PyErr_Format(
            PyExc_SystemError,
            "module %s: PyModule_Create is incompatible with m_slots", name);
        return NULL;
    }
    /* Make sure name is fully qualified.

       This is a bit of a hack: when the shared library is loaded,
       the module name is "package.module", but the module calls
       PyModule_Create*() with just "module" for the name.  The shared
       library loader squirrels away the true name of the module in
       _Py_PackageContext, and PyModule_Create*() will substitute this
       (if the name actually matches).
    */
    if (_Py_PackageContext != NULL) {
        const char *p = strrchr(_Py_PackageContext, '.');
        if (p != NULL && strcmp(module->m_name, p+1) == 0) {
            name = _Py_PackageContext;
            _Py_PackageContext = NULL;
        }
    }
    if ((m = (PyModuleObject*)PyModule_New(name)) == NULL)
        return NULL;

    if (module->m_size > 0) {
        m->md_state = PyMem_MALLOC(module->m_size);
        if (!m->md_state) {
            PyErr_NoMemory();
            Py_DECREF(m);
            return NULL;
        }
        memset(m->md_state, 0, module->m_size);
    }

    if (module->m_methods != NULL) {
        if (PyModule_AddFunctions((PyObject *) m, module->m_methods) != 0) {
            Py_DECREF(m);
            return NULL;
        }
    }
    if (module->m_doc != NULL) {
        if (PyModule_SetDocString((PyObject *) m, module->m_doc) != 0) {
            Py_DECREF(m);
            return NULL;
        }
    }
    m->md_def = module;
    return (PyObject*)m;
}
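- The parameter module is the module's definition (a PyModuleDef), not a module object. module_api_version is the API version the caller was built against, and check_api_version compares it with the interpreter's own.
1.1 Creating the module object
- Inside _PyModule_CreateInitialized, PyModule_New creates the module object itself: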
Objects\moduleobject.c
PyObject *
PyModule_New(const char *name)
{
    PyObject *nameobj, *module;
    nameobj = PyUnicode_FromString(name);
    if (nameobj == NULL)
        return NULL;
    module = PyModule_NewObject(nameobj);
    Py_DECREF(nameobj);
    return module;
}
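- Internally, Python keeps every module that has been loaded into memory in one collection, interp->modules, which is a PyDictObject.
- interp->modules stores the (module name, module object) pairs.
- At the Python level, interp->modules is exposed as sys.modules.
- A PyModuleObject is in fact little more than a thin wrapper around a PyDictObject: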
Objects\moduleobject.c
PyObject *
PyModule_NewObject(PyObject *name)
{
    PyModuleObject *m;
    m = PyObject_GC_New(PyModuleObject, &PyModule_Type);
    if (m == NULL)
        return NULL;
    m->md_def = NULL;
    m->md_state = NULL;
    m->md_weaklist = NULL;
    m->md_name = NULL;
    m->md_dict = PyDict_New();
    if (module_init_dict(m, m->md_dict, name, NULL) != 0)
        goto fail;
    PyObject_GC_Track(m);
    return (PyObject *)m;

 fail:
    Py_DECREF(m);
    return NULL;
}
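- In the end, PyModule_New only creates an empty module and returns it. In the code shown here it does not touch interp->modules; the module is registered there later, when _Py_InitializeCore_impl calls _PyImport_FixupBuiltin(bimod, "builtins", modules).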
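Both points, that a module is a wrapper around a dict and that sys.modules is a plain name-to-module dict, can be observed from Python. The sketch below uses `types.ModuleType` as an analogue of PyModule_New. It reaches the module type through a different entry point, so treat it as an approximation. The name "demo_module" is hypothetical and is removed again at the end.

```python
import importlib
import sys
import types

mod = types.ModuleType("demo_module")

# A new module carries only bookkeeping entries such as __name__ and __doc__.
initial_keys = sorted(vars(mod))

# Attribute access on the module is access to its dict.
mod.answer = 42
same_storage = vars(mod)["answer"] == 42 and mod.__dict__ is vars(mod)

# Creating a module object does not register it in sys.modules.
registered_on_creation = "demo_module" in sys.modules

# Registration is a separate step: a plain dict assignment.
sys.modules["demo_module"] = mod
try:
    found = importlib.import_module("demo_module") is mod
finally:
    del sys.modules["demo_module"]

print(initial_keys, same_storage, registered_on_creation, found)
```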
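1.2 Setting up the module object
- Once PyModule_New returns, control goes back to _PyModule_CreateInitialized, which sets nearly all of the builtins module's attributes.
- This step is driven by module->m_methods, which here is builtin_methods. _PyModule_CreateInitialized passes it to PyModule_AddFunctions, which walks the table and processes each entry: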
Include\methodobject.h
typedef PyObject *(*PyCFunction)(PyObject *, PyObject *);
... ...
struct PyMethodDef {
    const char  *ml_name;   /* The name of the built-in function/method */
    PyCFunction ml_meth;    /* The C function that implements it */
    int         ml_flags;   /* Combination of METH_xxx flags, which mostly
                               describe the args expected by the C func */
    const char  *ml_doc;    /* The __doc__ attribute, or NULL */
};
typedef struct PyMethodDef PyMethodDef;
Python\bltinmodule.c
static PyMethodDef builtin_methods[] = {
    {"__build_class__", (PyCFunction)builtin___build_class__,
     METH_FASTCALL | METH_KEYWORDS, build_class_doc},
    {"__import__",      (PyCFunction)builtin___import__, METH_VARARGS | METH_KEYWORDS, import_doc},
    BUILTIN_ABS_METHODDEF
    BUILTIN_ALL_METHODDEF
    BUILTIN_ANY_METHODDEF
    BUILTIN_ASCII_METHODDEF
    BUILTIN_BIN_METHODDEF
    {"breakpoint",      (PyCFunction)builtin_breakpoint, METH_FASTCALL | METH_KEYWORDS, breakpoint_doc},
    BUILTIN_CALLABLE_METHODDEF
    BUILTIN_CHR_METHODDEF
    BUILTIN_COMPILE_METHODDEF
    BUILTIN_DELATTR_METHODDEF
    {"dir",             builtin_dir,        METH_VARARGS, dir_doc},
    BUILTIN_DIVMOD_METHODDEF
    BUILTIN_EVAL_METHODDEF
    BUILTIN_EXEC_METHODDEF
    BUILTIN_FORMAT_METHODDEF
    {"getattr",         (PyCFunction)builtin_getattr, METH_FASTCALL, getattr_doc},
    BUILTIN_GLOBALS_METHODDEF
    BUILTIN_HASATTR_METHODDEF
    BUILTIN_HASH_METHODDEF
    BUILTIN_HEX_METHODDEF
    BUILTIN_ID_METHODDEF
    BUILTIN_INPUT_METHODDEF
    BUILTIN_ISINSTANCE_METHODDEF
    BUILTIN_ISSUBCLASS_METHODDEF
    {"iter",            builtin_iter,       METH_VARARGS, iter_doc},
    BUILTIN_LEN_METHODDEF
    BUILTIN_LOCALS_METHODDEF
    {"max",             (PyCFunction)builtin_max, METH_VARARGS | METH_KEYWORDS, max_doc},
    {"min",             (PyCFunction)builtin_min, METH_VARARGS | METH_KEYWORDS, min_doc},
    {"next",            (PyCFunction)builtin_next, METH_FASTCALL, next_doc},
    BUILTIN_OCT_METHODDEF
    BUILTIN_ORD_METHODDEF
    BUILTIN_POW_METHODDEF
    {"print",           (PyCFunction)builtin_print, METH_FASTCALL | METH_KEYWORDS, print_doc},
    BUILTIN_REPR_METHODDEF
    BUILTIN_ROUND_METHODDEF
    BUILTIN_SETATTR_METHODDEF
    BUILTIN_SORTED_METHODDEF
    BUILTIN_SUM_METHODDEF
    {"vars",            builtin_vars,       METH_VARARGS, vars_doc},
    {NULL,              NULL},
};
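The idea of walking a sentinel-terminated method table and installing one wrapped function per entry can be sketched in Python. This is a rough analogue under stated assumptions, not CPython's implementation: `MethodDef`, `wrap` and `add_functions` are hypothetical names, `None` stands in for the {NULL, NULL} terminator, and the flags field is left out.

```python
import types
from typing import Callable, NamedTuple, Optional


class MethodDef(NamedTuple):
    """Hypothetical stand-in for PyMethodDef (name, function, doc)."""
    name: str
    meth: Callable
    doc: Optional[str]


def wrap(entry, module):
    """Bundle the raw callable with its name, doc and owning module's name."""
    def wrapper(*args, **kwargs):
        return entry.meth(*args, **kwargs)
    wrapper.__name__ = entry.name
    wrapper.__doc__ = entry.doc
    wrapper.__module__ = module.__name__
    return wrapper


def add_functions(module, table):
    """Install every entry of the table into the module's dict."""
    for entry in table:
        if entry is None:  # sentinel: end of table
            break
        vars(module)[entry.name] = wrap(entry, module)


table = [
    MethodDef("double", lambda x: 2 * x, "Return x doubled."),
    MethodDef("negate", lambda x: -x, "Return -x."),
    None,
]

demo = types.ModuleType("demo_builtins")  # hypothetical module name
add_functions(demo, table)
print(demo.double(21), demo.negate(5), demo.double.__module__)
```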
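- The table contains the familiar names __import__, dir, getattr and so on. Each entry is a PyMethodDef, whose structure is shown above.
- For every PyMethodDef in builtin_methods, _PyModule_CreateInitialized creates a PyCFunctionObject based on it.
- This object is Python's wrapper around a C function pointer; it ties the pointer to the other information that goes with it: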
Include\methodobject.h
typedef struct {
    PyObject_HEAD
    PyMethodDef *m_ml;        /* Description of the C function to call */
    PyObject    *m_self;      /* Passed as 'self' arg to the C func, can be NULL */
    PyObject    *m_module;    /* The __module__ attribute, can be anything */
    PyObject    *m_weakreflist; /* List of weak references */
} PyCFunctionObject;
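The fields of this struct surface in Python as attributes of built-in function objects. The sketch below inspects `dir`. The mapping from attributes to struct fields given in the comments is my reading of the struct comments above, not something this snippet can prove.

```python
import builtins

func = builtins.dir

kind = type(func).__name__     # Python-level name of the PyCFunction type
owner = func.__self__          # m_self: the object passed as 'self'
module_name = func.__module__  # m_module: here the module's name, a str
name = func.__name__           # taken from ml_name in the PyMethodDef
doc = func.__doc__             # taken from ml_doc in the PyMethodDef

print(kind, owner is builtins, module_name, name)
```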
Objects\methodobject.c
static PyCFunctionObject *free_list = NULL;
... ...
PyObject *
PyCFunction_NewEx(PyMethodDef *ml, PyObject *self, PyObject *module)
{
    PyCFunctionObject *op;
    op = free_list;
    if (op != NULL) {
        free_list = (PyCFunctionObject *)(op->m_self);
        (void)PyObject_INIT(op, &PyCFunction_Type);
        numfree--;
    }
    else {
        op = PyObject_GC_New(PyCFunctionObject, &PyCFunction_Type);
        if (op == NULL)
            return NULL;
    }
    op->m_weakreflist = NULL;
    op->m_ml = ml;
    Py_XINCREF(self);
    op->m_self = self;
    Py_XINCREF(module);
    op->m_module = module;
    _PyObject_GC_TRACK(op);
    return (PyObject *)op;
}
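The allocation path above pops a recycled object from free_list before falling back to the allocator, reusing a field of the dead object as the link to the next free one. A minimal Python sketch of that pattern follows. The classes `Node` and `FreeList`, the `release` side and the `max_free` cap are assumptions for illustration, since the listing above shows only the allocation half.

```python
class Node:
    """Hypothetical pooled object; `link` doubles as the free-list pointer."""
    __slots__ = ("payload", "link")

    def __init__(self):
        self.payload = None
        self.link = None


class FreeList:
    def __init__(self, max_free=256):  # assumed cap, for illustration only
        self.head = None
        self.numfree = 0
        self.max_free = max_free

    def acquire(self, payload):
        node = self.head
        if node is not None:        # reuse a recycled object
            self.head = node.link
            self.numfree -= 1
        else:                       # pool is empty: allocate a new one
            node = Node()
        node.payload = payload
        node.link = None
        return node

    def release(self, node):
        if self.numfree < self.max_free:   # keep it for later reuse
            node.payload = None
            node.link = self.head
            self.head = node
            self.numfree += 1
        # otherwise the object is simply dropped


pool = FreeList(max_free=2)
first = pool.acquire("a")
pool.release(first)
second = pool.acquire("b")
reused = second is first
print(reused, pool.numfree)
```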
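- The presence of free_list shows that PyCFunctionObject objects are managed with a free-list (object pool) strategy.
- Note also m_module: here it holds a string object (a PyUnicodeObject in Python 3, a PyStringObject in Python 2) that is the name of the PyModuleObject.
- Putting the pieces together, the builtins module is a PyModuleObject whose md_dict maps each built-in name either to a PyCFunctionObject (for functions such as dir) or to a type object (such as int).
- After _PyBuiltin_Init() returns, Python takes the PyDictObject held by that PyModuleObject and assigns it to interp->builtins: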
Python\pylifecycle.c
_PyInitError
_Py_InitializeCore_impl(PyInterpreterState **interp_p,
                        const _PyCoreConfig *core_config)
{
    PyInterpreterState *interp;
    _PyInitError err;
    ... ...
    PyObject *bimod = _PyBuiltin_Init();
    if (bimod == NULL)
        return _Py_INIT_ERR("can't initialize builtins modules");
    _PyImport_FixupBuiltin(bimod, "builtins", modules);
    interp->builtins = PyModule_GetDict(bimod);
    ... ...
}
Objects\moduleobject.c
PyObject *
PyModule_GetDict(PyObject *m)
{
    PyObject *d;
    if (!PyModule_Check(m)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    d = ((PyModuleObject *)m) -> md_dict;
    assert(d != NULL);
    return d;
}
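- From then on, whenever Python needs the builtins module's namespace, it can go straight to interp->builtins instead of looking the module up first. This is a shortcut for speed.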
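interp->builtins is not directly visible from Python, but the dict it points to is. The sketch below assumes that `vars(builtins)` is that dict and shows that a name placed in it becomes resolvable from a namespace that defines nothing itself. The name `demo_answer` is hypothetical and is removed again afterwards.

```python
import builtins

builtins_dict = vars(builtins)

# Add a name to the builtins dict, then evaluate it with empty globals.
builtins_dict["demo_answer"] = 42
try:
    seen = eval("demo_answer", {})
finally:
    del builtins_dict["demo_answer"]

# Once removed from the builtins dict, the name can no longer be resolved.
try:
    eval("demo_answer", {})
    gone = False
except NameError:
    gone = True

print(seen, gone)
```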