venv
.venv/bin
里面的python是软链,但是.venv/bin
下面的其他二进制文件(如pip)以及site-packages
里面安装的第三方库都是实实在在的源码文件
(.venv) [root@VM-165-116-centos /]# ls -l .venv/bin/python
lrwxrwxrwx 1 root root 15 Mar 6 10:35 .venv/bin/python -> /usr/bin/python
python不像go。python通过虚拟环境管理包的方式类似小仓模式,做的是物理隔离;而go通过go.mod进行包管理更像是大仓模式做逻辑隔离:
- 每个python的虚拟环境都存储了第三方pkg的源码,同一个pkg可能在不同的虚拟环境中被fork了多份;
- 而go的第三方pkg源码全局一份,不同的module通过go.mod对源码进行引用。当然,go vendor 也提供了依赖包物理隔离的管理方式
个人感觉,逻辑隔离相对优于物理隔离,因为不同工程可以复用同一份第三方pkg,包管理更加轻量和灵活
module and pkg in python
先说结论:
- python import的单位是从来都是module or items defined in module,而不是pkg。为什么?因为python是脚本语言,它是一个一个的.py文件。就像在shell脚本中引用其他shell脚本用的是
source /path/to/import.sh
- pkg是特殊的module,import pkg其实是import了一个
__init__.py
文件。module的核心作用是提供和隔离namespace
module
A module is a single file containing Python definitions and statements. The file name is the module name with the suffix .py
appended. Within a module, the module’s name (as a string) is available as the value of the global variable __name__
.
>>> import math
>>> math.__name__
'math'
Modules can import other modules. The imported module names, if placed at the top level of a module (outside any functions or classes), are added to the module’s global namespace.——
import其实是把被import的命名空间attach到当前module的命名空间上
(Note: For efficiency reasons, each module is only imported once per interpreter session.)
refers to https://docs.python.org/3/tutorial/modules.html
pkg
什么是package
Packages are a way of structuring Python’s module namespace with attribute __path__
. Packages have __path__
set to the directory or directories they were created from.
>>> import urllib
>>> urllib.__name__
'urllib'
>>> urllib.__path__
['/usr/lib64/python3.9/urllib']
>>>
__path__
= pkg的绝对路径
__path__
,顾名思义就是它所在包在当前系统的绝对路径,也就是说不同路径下即使同名的包它们的__path__
值也是不同的
包的 __path__
属性会在导入其子包期间被使用
如果一个module具有 __path__
属性(module.__path__
),它就被视为包,pkg也只是特殊的module
__path__
is initialized to be a list containing the name of the directory holding the package’s __init__.py
before the code in that file is executed(__path__
是预先初始化好的用于找__init__.py
的,在init.py被执行前__path__
就已经被赋值). This variable can be modified; doing so affects future searches for modules and subpackages contained in the package.
如何成为package
regular package
A regular package is typically implemented as a directory containing an __init__.py
file.
When a regular package is imported, this __init__.py
file is implicitly executed, and the objects it defines are bound to names in the package’s namespace. 当一个常规包被导入时,这个 __init__.py
文件会隐式地被执行;在 __init__.py
中定义的对象,会被绑定到import 指定的命名空间里面来。如果__init__.py
啥也没有,那就只是把这pkg一个空的namespace attach到当前module的namespace,这意味着导入包会创建命名空间但没有其他效果
namespace package
一些定义
"portion" refers to a set of files in a single directory (possibly stored in a zip file) that contribute to a namespace package.
"legacy portion" refers to a portion that uses __path__
manipulation in order to implement namespace packages.
A namespace package is a virtual package whose portions can be distributed in various places along Python's lookup path.
它将物理上的分布式包构成一个逻辑包,each portion contributes a subpackage to the parent namespace package。例如,这种使用场景经常出现在大的应用框架中,框架开发者希望鼓励用户发布插件或附加包
how to packaging a namespace pkg
native-namespace-package
Package directory without __init__.py
is "native namespace package". refer to https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages
resource_plan_backend/
├ tests/ # the name of this dir doesn't matter
│ └ ns1/ # namespace package name
│ └ test1.py # a = 1
│
.
.
.
└ ns1/ # namespace package name
│ └ test2.py # b = 2
>>> import sys
>>> sys.path.extend(['tests']) # must extend sys.path to find all subpackages
>>>
>>> import ns1 # ns1 is a namespace pkg
>>> ns1.__path__
_NamespacePath(['/root/workspace/resource_plan_backend/ns1', '/root/workspace/resource_plan_backend/tests/ns1'])
>>> ns1
<module 'ns1' (namespace)>
>>> from ns1 import test1
>>> test1.a
1
>>> from ns1 import test2
>>> test2.b
2
pkgutil-style-namespace-package
refer to https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#pkgutil-style-namespace-packages
- 所有subpkg保持相同namespace
- 所有subpkg的
__init__.py
仅写入__path__ = __import__('pkgutil').extend_path(__path__, __name__)
. the legacy behavior is to add any other regular packages on the searched path to its__path__
. But in Python 3.3 and later, it also adds namespace packages.
resource_plan_backend/
├ tests/ # the name of this dir doesn't matter
│ └ ns/ # namespace package name
│ └ ns1.py # a = 1
│ ├ __init__.py # Sub-package __init__.py
.
.
.
└ ns/ # namespace package name
│ └ ns2.py # b = 2
│ ├ __init__.py # Sub-package __init__.py
.
>>> import sys
>>> sys.path.extend(['tests']) # must extend sys.path to find all subpackages
>>>
>>> import ns
>>> ns.__path__
['/root/workspace/resource_plan_backend/ns', '/root/workspace/resource_plan_backend/./ns', '/root/workspace/resource_plan_backend/tests/ns'] # attention: ns is not a native namespace pkg here
>>> ns
<module 'ns' from '/root/workspace/resource_plan_backend/ns/__init__.py'>
>>> from ns import ns1, ns2
>>> ns1.a
1
>>> ns2.b
2
我们发现,pkgutil-style-namespace-package并不是纯正原生的namespace-package,因为__init__.py
仍然存在,它仍然是regular package。 只是这种方式通过extend_path
修改了顶层的公共namespace的search path,支持在多个路径下查找subpkg,从而实现了namespace-package的效果
importing module/pkg
With import <module_name>
, a module is imported as an object of the module
type. You can check which file is imported with print()
>>> import sys
>>> print(sys)
<module 'sys' (built-in)>
>>> print(type(sys))
<class 'module'>
>>>
>>> from math import * # 将math内的items attach到当前namespace,而math这个namespace本身并没有被attached
>>> print(math)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined
>>> print(pi)
3.141592653589793
python如何查询module/pkg
分2步
- first, search
sys.modules
- second, search from
sys.meta_path
>>> import sys
>>> sys.meta_path
[<class '_frozen_importlib.BuiltinImporter'>, <class '_frozen_importlib.FrozenImporter'>, <class '_frozen_importlib_external.PathFinder'>]
sys.meta_path
contains a list of meta path finder objects. These finders are queried in order to see if they know how to handle the named module. Meta path finders must implement a method called find_spec() which takes three arguments: a name, an import path, and (optionally) a target module(所以你当然可以自己实现finder插入到meta_path中去).
The meta path may be traversed multiple times for a single import request.
For example, assuming none of the modules involved has already been cached.
importing
foo.bar.baz
will first perform a top level import(top level is in one of the paths on your sys.path), callingmpf.find_spec("foo", None, None)
on each meta path finder (mpf).After
foo
has been imported,foo.bar
will be imported by traversing the meta path a second time, callingmpf.find_spec("foo.bar", foo.__path__, None)
.Once
foo.bar
has been imported, the final traversal will callmpf.find_spec("foo.bar.baz", foo.bar.__path__, None)
.
总结来说,通过多个mpf进行广度搜索,找到顶层包;然后再把顶层包的__path__
作为参数,再进行一轮广度搜索。整个过程就是递归执行多次广度遍历
import pkg,就是import pkg的__init__.py
当python import一个pkg时,实际上只是import了pkg的__init__.py
这一个文件,而不像go那样import了整个包。例如:
Files (modules) are stored in the urllib directory as follows. __init__.py
is empty.
urllib/
├── __init__.py
├── error.py
├── parse.py
├── request.py
├── response.py
└── robotparser.py
If you write import urllib
, you cannot use the modules under it. For example, urllib.error
raises an error AttributeError.
import urllib
print(type(urllib))
# <class 'module'>
print(urllib)
# <module 'urllib' from '/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/__init__.py'>
# print(urllib.error)
# AttributeError: module 'urllib' has no attribute 'error'
import urllib
引入的仅仅是xxx/urllib/__init__.py
绝对import和相对import
绝对导入的格式为 import A.B
或 from A import B
relative import (使用. ..的import,如from . import moduleA
)
相对导入不是相对于目录,而是相对于package
开放给其他开发者使用的pkg,包内module互相import显然需要使用相对引入,这是内聚
python import的单位是从来都是module or items defined in module,而不是pkg
Python的“包”其实不是包。它只是简单地把文件放在一起,提供了一个统一的namespace,仅此而已。进入“包”内我们发现,甚至包内引用都要import,这算哪门子的包?包内还是一个个独立的single-file module
所以,python的能力暴露单元本质上是module;而作为对比,go 则是将pkg作为一个整体能力对外暴露(go pkg 不存在包内互相import)
网友评论