Python's Dynamic Module Loading Mechanism


This article walks through how Python's dynamic module loading mechanism works.

 

The import instruction

Let's look at the bytecode generated for import sys:

 

co_consts : (0, None)
co_names  : ('sys',)

0 LOAD_CONST               0 (0)
2 LOAD_CONST               1 (None)
4 IMPORT_NAME              0 (sys)
6 STORE_NAME               0 (sys)
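The listing above can be reproduced with the standard dis module. This is a minimal sketch assuming CPython; exact offsets and the surrounding opcodes vary between versions (newer releases add extra instructions), so only opcode names are relied on here.

```python
import dis

# Compile the statement without executing it, then list its instructions.
code = compile("import sys", "<example>", "exec")

print("co_names :", code.co_names)
for instr in dis.get_instructions(code):
    print(instr.offset, instr.opname, instr.argrepr)

# Keep just the opcode names; the store right after IMPORT_NAME binds `sys`.
opnames = [instr.opname for instr in dis.get_instructions(code)]
```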

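As shown, the result of the import is assigned to the name sys and stored in the current frame's local namespace, so a later reference such as sys.path can find the symbol quickly. Let's look at what the IMPORT_NAME instruction actually does: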

 

TARGET(IMPORT_NAME) {
    PyObject *name = GETITEM(names, oparg); // the PyUnicodeObject for 'sys'
    PyObject *fromlist = POP();         // None
    PyObject *level = TOP();            // 0
    PyObject *res;
    res = import_name(f, name, fromlist, level);
    Py_DECREF(level);
    Py_DECREF(fromlist);
    SET_TOP(res);
    if (res == NULL)
        goto error;
    DISPATCH();
}

This part gathers the information the import needs and then calls import_name:

 

[ceval.c]
static PyObject * import_name(PyFrameObject *f, PyObject *name, PyObject *fromlist, PyObject *level)
{
    _Py_IDENTIFIER(__import__);
    PyObject *import_func, *res;
    PyObject* stack[5];

    // look up the built-in function __import__
    import_func = _PyDict_GetItemId(f->f_builtins, &PyId___import__);

    /* Fast path for not overloaded __import__. */
    if (import_func == PyThreadState_GET()->interp->import_func) {
        int ilevel = _PyLong_AsInt(level);
        if (ilevel == -1 && PyErr_Occurred()) {
            return NULL;
        }
        res = PyImport_ImportModuleLevelObject(
                        name,
                        f->f_globals,
                        f->f_locals == NULL ? Py_None : f->f_locals,
                        fromlist,
                        ilevel);
        return res;
    }
    ...
}

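The arguments passed in are, in order: the PyFrameObject for the current frame, the PyUnicodeObject representing 'sys', the Py_None object, and the PyLongObject representing 0. The function first fetches __import__ from the built-ins dict f->f_builtins. By this point it is already a wrapped PyCFunctionObject, because every method was wrapped when builtins was initialized, as described in the previous article.

The test if (import_func == PyThreadState_GET()->interp->import_func) checks whether __import__ has been overridden by the programmer; the overridden case is not considered here. PyImport_ImportModuleLevelObject is fairly involved because it also has to handle forms such as import xml.sax. It may look as if this call never uses import_func, the built-in import function, at all, but the two paths end up in the same place. Recall the built-in method table builtin_methods: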

 

[bltinmodule.c]
static PyMethodDef builtin_methods[] = {
    ...
    {"__import__",      (PyCFunction)builtin___import__, METH_VARARGS | METH_KEYWORDS, import_doc},
    ...
};

static PyObject * builtin___import__(PyObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"name", "globals", "locals", "fromlist",
                             "level", 0};
    PyObject *name, *globals = NULL, *locals = NULL, *fromlist = NULL;
    int level = 0;

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "U|OOOi:__import__",
                    kwlist, &name, &globals, &locals, &fromlist, &level))
        return NULL;
    return PyImport_ImportModuleLevelObject(name, globals, locals,
                                            fromlist, level);
}

Both paths end up calling the same function, PyImport_ImportModuleLevelObject.
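That the import statement really looks __import__ up in builtins, and that the statement and the function give the same result, can be checked from Python. The sketch below temporarily installs a wrapper; traced_import and seen are names made up for this example, and replacing builtins.__import__ is only appropriate for experiments like this one.

```python
import builtins
import sys

original_import = builtins.__import__
seen = []  # requests that reached __import__ while the wrapper was installed

def traced_import(name, globals=None, locals=None, fromlist=(), level=0):
    # Record the request, then defer to the real implementation.
    seen.append((name, fromlist, level))
    return original_import(name, globals, locals, fromlist, level)

builtins.__import__ = traced_import
try:
    import json                         # statement: takes the "overridden" branch
    via_function = __import__("json")   # explicit call through the builtins name
finally:
    builtins.__import__ = original_import  # always restore the original

print(seen[0])
print(via_function is json, json is sys.modules["json"])
```

The first recorded tuple carries the same name, fromlist and level values that IMPORT_NAME takes from the bytecode.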

 

That is a lot to keep track of, so let's go back over how import is implemented.

 

The import mechanism

 

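For any import, the Python virtual machine has to:

Maintain sys.modules, the global pool of modules at runtime

Resolve and search the module path

Dynamically load modules from different kinds of files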

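import comes in many forms. The simplest is import os; slightly more complex is import x.y.z. Forms that combine from, as and import are analyzed and go through the same path as import x.y.z, so the analysis of the implementation below uses import x.y.z as the instruction.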
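How the different spellings reach the same instruction can be checked by compiling them without running anything. This is a sketch for CPython; x, y, z and w are placeholder names that never need to exist because the code is only compiled, and import_name_args is a helper invented for this example.

```python
import dis

def import_name_args(source):
    """Return the names given to IMPORT_NAME and the names stored afterwards."""
    code = compile(source, "<example>", "exec")
    instrs = list(dis.get_instructions(code))
    imported = [i.argval for i in instrs if i.opname == "IMPORT_NAME"]
    stored = [i.argval for i in instrs if i.opname == "STORE_NAME"]
    return imported, stored

forms = [
    "import x.y.z",
    "import x.y.z as w",
    "from x.y import z",
    "from x.y import z as w",
]
results = {src: import_name_args(src) for src in forms}
for src, (imported, stored) in results.items():
    print(f"{src:24} IMPORT_NAME {imported}  STORE_NAME {stored}")
```

Every form produces exactly one IMPORT_NAME; what differs is the dotted name handed to it, the fromlist, and which name is finally bound.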

 

For import x.y.z, the call is made with these arguments:

res = PyImport_ImportModuleLevelObject(
            name,                                       // PyUnicodeObject for 'x.y.z'
            f->f_globals,                               // the frame's global namespace
            f->f_locals == NULL ? Py_None : f->f_locals,// the frame's local namespace
            fromlist,                                   // None
            ilevel);                                    // 0

Looking inside this function:

 

PyObject * PyImport_ImportModuleLevelObject(PyObject *name, PyObject *globals,
                                 PyObject *locals, PyObject *fromlist,
                                 int level)
{
    _Py_IDENTIFIER(_find_and_load);
    _Py_IDENTIFIER(_handle_fromlist);
    PyObject *abs_name = NULL;
    PyObject *final_mod = NULL;
    PyObject *mod = NULL;
    PyObject *package = NULL;
    PyInterpreterState *interp = PyThreadState_GET()->interp;
    int has_from;

    abs_name = name;

    mod = PyDict_GetItem(interp->modules, abs_name);
    if (mod != NULL && mod != Py_None) {    // already in the global modules dict: a repeated import
        ...
    }
    else { // first import of this module
        mod = _PyObject_CallMethodIdObjArgs(interp->importlib,
                                            &PyId__find_and_load, abs_name,
                                            interp->import_func, NULL);
    }

    // handle "from xxx import xxx" statements
    has_from = 0;
    if (fromlist != NULL && fromlist != Py_None) {
        has_from = PyObject_IsTrue(fromlist);
        if (has_from < 0)
            goto error;
    }
    if (!has_from) {    // not a "from xxx import ..." form
        Py_ssize_t len = PyUnicode_GET_LENGTH(name);
        if (level == 0 || len > 0) {
            Py_ssize_t dot;

            // look for '.' in the module name: -1 if absent, otherwise its index
            dot = PyUnicode_FindChar(name, '.', 0, len, 1);

            if (dot == -1) {
                /* No dot in module name, simple exit */
                final_mod = mod;
                Py_INCREF(mod);
                goto error;
            }

            if (level == 0) {
                PyObject *front = PyUnicode_Substring(name, 0, dot);
                if (front == NULL) {
                    goto error;
                }

                final_mod = PyImport_ImportModuleLevelObject(front, NULL, NULL, NULL, 0);
                Py_DECREF(front);
            }
            else {
                ...
            }
        }
        else {
            final_mod = mod;
            Py_INCREF(mod);
        }
    }
    else {
        final_mod = _PyObject_CallMethodIdObjArgs(interp->importlib,
                                                  &PyId__handle_fromlist, mod,
                                                  fromlist, interp->import_func,
                                                  NULL);
    }

  error:
    return final_mod;
}

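This is where dynamic loading shows itself. The function first checks the global interp->modules to see whether the module has already been loaded; if it has, it is not loaded again. It then handles import statements whose name contains a "." character. As the code shows, for import x.y.z the whole dotted module is imported, but what is returned and bound is x, the first component.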
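The same behaviour is visible from Python through the built-in __import__ function. A small sketch using the standard-library package xml.sax mentioned earlier:

```python
import sys

# Without a fromlist, the dotted name is fully imported, but the
# top-level package is what comes back (and what `import` would bind).
top = __import__("xml.sax")
print(top.__name__)    # xml

# With a non-empty fromlist, the module named by the full path is returned.
leaf = __import__("xml.sax", fromlist=["handler"])
print(leaf.__name__)   # xml.sax

# Every level of the dotted path is registered in sys.modules.
print(all(name in sys.modules for name in ("xml", "xml.sax")))
```

This mirrors the two branches in the C code: the front substring is re-imported when has_from is false, and _handle_fromlist is called otherwise.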

 

So what is interp->importlib? One of the last steps of Python's initialization is setting up import:

 

[pylifecycle.c]
void _Py_InitializeCore(const _PyCoreConfig *config)
{
    ...
    _PyImport_Init();
    _PyImportHooks_Init();
    _PyWarnings_Init();

    /* This call sets up builtin and frozen import support */
    if (!interp->core_config._disable_importlib) {
        // debug output, not part of upstream CPython
        printf("interp->core_config._disable_importlib\n");
        initimport(interp, sysmod);
    }

    _Py_CoreInitialized = 1;
}

This is the part of the initialization function that sets up the import mechanism. interp->importlib is assigned inside initimport, but the analysis should start from _PyImport_Init():

 

[import.c]
static PyObject *initstr = NULL;
void _PyImport_Init(void)
{
    PyInterpreterState *interp = PyThreadState_Get()->interp;
    initstr = PyUnicode_InternFromString("__init__");
    interp->builtins_copy = PyDict_Copy(interp->builtins);
}

This part simply creates the PyUnicodeObject for "__init__" and makes a copy of the built-ins dict.

 

[import.c]
void _PyImportHooks_Init(void)
{
    PyObject *v, *path_hooks = NULL;
    int err = 0;

    /* adding sys.path_hooks and sys.path_importer_cache */
    v = PyList_New(0);
    PySys_SetObject("meta_path", v);

    v = PyDict_New();
    PySys_SetObject("path_importer_cache", v);

    path_hooks = PyList_New(0);
    PySys_SetObject("path_hooks", path_hooks);

    Py_DECREF(path_hooks);
}

This sets up the hook points on the sys module: sys.meta_path, sys.path_importer_cache and sys.path_hooks.
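These three attributes can be inspected from a running interpreter. The sketch below only reads them; the exact finders listed depend on the Python version and on anything the environment has installed.

```python
import sys

# The hook points created empty by _PyImportHooks_Init.
print(type(sys.meta_path).__name__, len(sys.meta_path))
print(type(sys.path_hooks).__name__, len(sys.path_hooks))
print(type(sys.path_importer_cache).__name__)

# By the time user code runs, importlib has filled meta_path with finders.
finder_names = [getattr(f, "__name__", type(f).__name__) for f in sys.meta_path]
print(finder_names)
```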

 

[import.c]
static void initimport(PyInterpreterState *interp, PyObject *sysmod)
{
    PyObject *importlib;
    PyObject *impmod;
    PyObject *sys_modules;
    PyObject *value;

    ...

    importlib = PyImport_AddModule("_frozen_importlib");

    interp->importlib = importlib;
    interp->import_func = PyDict_GetItemString(interp->builtins, "__import__");

    impmod = PyInit_imp();
    PyDict_SetItemString(sys_modules, "_imp", impmod);
    /* Install importlib as the implementation of import */
    value = PyObject_CallMethod(importlib, "_install", "OO", sysmod, impmod);
    ...
}

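So interp->importlib is the _frozen_importlib module. impmod is handed to that module's _install function so that importlib serves as the implementation of import. In Python terms, what happens to impmod is roughly equivalent to import _imp: the module object is created and registered in sys.modules under the name _imp.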
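Both modules can be found in sys.modules once the interpreter is up. The following sketch relies on CPython implementation details (the names _frozen_importlib and _imp, and the private attribute importlib._bootstrap), so treat it as an illustration rather than a supported API.

```python
import importlib
import sys

# Both modules are registered during interpreter start-up.
frozen = sys.modules["_frozen_importlib"]
imp_builtin = sys.modules["_imp"]
print(frozen)
print(imp_builtin)

# The two functions called from PyImport_ImportModuleLevelObject live here.
print(hasattr(frozen, "_find_and_load"), hasattr(frozen, "_handle_fromlist"))

# The public importlib package reuses the frozen module as importlib._bootstrap.
print(importlib._bootstrap is frozen)
```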

 

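You may wonder why there is an imp when there is already an importlib. This is a case of the new replacing the old: the imp approach has been discouraged since Python 3.4 and is gradually being replaced by importlib (the imp module was removed entirely in Python 3.12). Awkwardly, the importlib implementation is embedded as the bytecode of a code object: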

 

[Python/importlib.h]
const unsigned char _Py_M__importlib[] = {
    99,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,
    0,64,0,0,0,115,210,1,0,0,100,0,90,0,100,1,
    97,1,100,2,100,3,132,0,90,2,100,4,100,5,132,0,
    ...
};

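In other words, the code that finally runs cannot be read in this C file. (The bytes are frozen from Lib/importlib/_bootstrap.py, which is where the readable Python source lives.) It is a little frustrating that imp is being replaced this way, but a replacement takes time, so here is some Python pseudocode, written against imp, that expresses the idea: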

 

import imp  # deprecated since Python 3.4, removed in 3.12
import sys

def __import__(name, globals=None, locals=None, fromlist=None):
    # Fast path: see if the module has already been imported.
    try:
        return sys.modules[name]
    except KeyError:
        pass

    # If any of the following calls raises an exception,
    # there's a problem we can't handle -- let the caller handle it.
    fp, pathname, description = imp.find_module(name)

    try:
        return imp.load_module(name, fp, pathname, description)
    finally:
        # Since we may exit via an exception, close fp explicitly.
        if fp:
            fp.close()

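sys.modules, which is global, records the modules that have already been imported; when a module is found there, it does not need to be imported again.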
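The pseudocode above depends on imp, which is not available in Python 3.12 and later. The same two steps, a cache lookup followed by find and load, can be sketched with importlib.util instead. The function import_module_sketch and the throwaway module demo_cached_mod are invented for this example, and the sketch only handles top-level modules; it is not how importlib itself is implemented.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile

def import_module_sketch(name):
    """Hypothetical helper mirroring the pseudocode: cache first, then load."""
    # Fast path: the module has already been imported.
    try:
        return sys.modules[name]
    except KeyError:
        pass

    # Locate the module; this plays the role of imp.find_module.
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)

    # Create and execute the module; this plays the role of imp.load_module.
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module       # register before executing the body
    try:
        spec.loader.exec_module(module)
    except BaseException:
        sys.modules.pop(name, None)  # do not leave a half-initialised entry
        raise
    return module

with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "demo_cached_mod.py").write_text("VALUE = 42\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        first = import_module_sketch("demo_cached_mod")
        second = import_module_sketch("demo_cached_mod")
    finally:
        sys.path.remove(tmp)

print(first.VALUE, first is second)
```

The second call never touches the file system: it returns the object already stored in sys.modules.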

 

Effects of the import mechanism

 

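Python's import mechanism affects the current local namespace. sys.modules holds every module imported anywhere in the process. Some modules, os for example, are loaded into memory by default, yet dir() shows that they are not present in the current local namespace.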

 

And if an imported file itself imports other files, those names do not affect the namespace one level up.
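A short experiment makes this concrete. The sketch below writes two throwaway modules, demo_outer_ns and demo_inner_ns (names invented for the example), and imports the outer one into a fresh dict that stands in for the importing file's namespace.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Two throwaway modules: the outer one imports the inner one.
    pathlib.Path(tmp, "demo_inner_ns.py").write_text("VALUE = 1\n")
    pathlib.Path(tmp, "demo_outer_ns.py").write_text("import demo_inner_ns\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        namespace = {}  # stands in for the importing file's namespace
        exec("import demo_outer_ns", namespace)
    finally:
        sys.path.remove(tmp)

outer = namespace["demo_outer_ns"]
print("demo_outer_ns" in namespace)    # bound by the import statement
print("demo_inner_ns" in namespace)    # the nested import did not leak upward
print("demo_inner_ns" in vars(outer))  # it was bound inside the outer module
print("demo_inner_ns" in sys.modules)  # and registered in the global pool
```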

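In the current local namespace, "__builtins__" is a module object, whereas in the imported module test, __builtins__ is a dict. As mentioned in the previous article, a Python process has only one builtins, shared by all threads, which is why the dict maintained behind the module object is the very same one. The __builtins__ name in the test module refers to exactly the dict maintained by builtins in the current namespace; both are backed by the dict of the builtins module created during Python's initialization. That module was loaded into memory long ago and is kept in sys.modules.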
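The dict form can be observed without a separate file. In the sketch below, a fresh module object named test stands in for the imported module from the text; it is never registered in sys.modules. That exec() inserts the builtins dict under "__builtins__" is a CPython implementation detail.

```python
import builtins
import sys
import types

# Run some code in the namespace of a fresh module object. exec() adds
# "__builtins__" to the namespace because it is missing.
test = types.ModuleType("test")
exec("x = len('abc')", test.__dict__)

module_level = test.__dict__["__builtins__"]
print(type(module_level).__name__)
print(module_level is vars(builtins))

# The single builtins module is itself kept in sys.modules.
print(sys.modules["builtins"] is builtins)
```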

 

In fact, every import, whenever and wherever it happens, affects the global collection of modules, sys.modules.

 

 

 

Source: 栖迟於一丘

