This post was written on April 3, 2022; keep its age in mind when reading.
Problem scenario
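In a conda environment with python==3.9.11 and tensorflow-gpu==2.4.1, pip install pkuseg failed. I ran into roughly three kinds of errors, and one of them I could never reproduce afterwards, so this post only records the procedure for installing pkuseg from source.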
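In my tests, pip install pkuseg succeeds in environments with python==3.7.13 or python==3.8.13. Whether tensorflow-gpu is present in the environment makes no difference to the installation. If numpy or cython is missing, installing them is enough.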
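Before installing, it can help to confirm which interpreter is active and whether the two build dependencies are importable. The following is a minimal sketch, not part of pkuseg; the helper name missing_build_deps is made up for illustration.

```python
import importlib.util
import sys


def missing_build_deps(required=("numpy", "Cython")):
    """Return the names in `required` that cannot be found in this environment.

    Hypothetical helper for illustration; "Cython" is the import name of the
    cython package.
    """
    return [name for name in required if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Show the interpreter version, since the install result depended on it.
    print("Python", sys.version.split()[0])
    missing = missing_build_deps()
    if missing:
        print("Missing build dependencies:", ", ".join(missing))
    else:
        print("numpy and Cython are available")
```

If anything is reported missing, pip install numpy cython in the same environment should resolve it.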
- Download the source zip from the GitHub page. The archive is named pkuseg-python-master.zip.
- Install the numpy and cython dependencies:
  pip install numpy cython
- Install pkuseg:
  pip install pkuseg-python-master.zip
The terminal output is as follows:
raner@testnode:~$ conda activate test
(test) raner@testnode:~$ pip install numpy cython
Looking in indexes: http://pypi.tuna.tsinghua.edu.cn/simple
Collecting numpy
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/15/87/4d6bc4e2053a4b517b022746f8e2dae328155a4c723bcad4c7d536febf51/numpy-1.22.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.8 MB)
|████████████████████████████████| 16.8 MB 919 kB/s
Collecting cython
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/94/6a/0d66e2d9cf405c87c74d1d29439c4910d3d1895fb122667920a4012d0bda/Cython-0.29.28-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.9 MB)
|████████████████████████████████| 1.9 MB 1.3 MB/s
Installing collected packages: numpy, cython
Successfully installed cython-0.29.28 numpy-1.22.3
(test) raner@testnode:~$ pip install pkuseg-python-master.zip
Looking in indexes: http://pypi.tuna.tsinghua.edu.cn/simple
Processing ./pkuseg-python-master.zip
Requirement already satisfied: cython in ./anaconda3/envs/test/lib/python3.10/site-packages (from pkuseg==0.0.25) (0.29.28)
Requirement already satisfied: numpy>=1.16.0 in ./anaconda3/envs/test/lib/python3.10/site-packages (from pkuseg==0.0.25) (1.22.3)
Building wheels for collected packages: pkuseg
Building wheel for pkuseg (setup.py) ... done
Created wheel for pkuseg: filename=pkuseg-0.0.25-cp310-cp310-linux_x86_64.whl size=3321624 sha256=6be6d05f53319aac298f9535a6f4a0cd0ed41dd3c91548b07300bc51639d6889
Stored in directory: /public/home/raner/.cache/pip/wheels/73/db/1c/9a992085963288025e05fe3229efeb59db87a06903c5f4fa7f
Successfully built pkuseg
Installing collected packages: pkuseg
Successfully installed pkuseg-0.0.25
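The two pip commands from the session above can also be driven from Python, which guarantees that pip belongs to the currently active interpreter. This is a rough sketch under that assumption. The function names are hypothetical, and dry_run defaults to True so that nothing is installed unless you ask for it.

```python
import subprocess
import sys


def pip_install_commands(archive="pkuseg-python-master.zip"):
    """Build the two pip invocations: dependencies first, then the source zip."""
    pip = [sys.executable, "-m", "pip", "install"]
    return [pip + ["numpy", "cython"], pip + [archive]]


def install_from_source(archive="pkuseg-python-master.zip", dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in pip_install_commands(archive):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if pip exits with an error.
            subprocess.run(cmd, check=True)
```

Calling install_from_source(dry_run=False) from the directory that contains the zip performs the actual installation.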
Where to put the model files
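After installation, find the model files with the .zip suffix under Releases on GitHub, put them in the ~/.pkuseg folder, and extract each archive into a folder of the same name. The resulting directory structure is: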
raner@testnode:~$ tree .pkuseg/
.pkuseg/
├── medicine
│ ├── features.pkl
│ ├── medicine_dict.pkl
│ └── weights.npz
├── medicine.zip
├── mixed
│ ├── features.pkl
│ └── weights.npz
├── mixed.zip
├── news
│ ├── features.pkl
│ └── weights.npz
├── news.zip
├── postag
│ ├── features.pkl
│ └── weights.npz
├── postag.zip
├── tourism
│ ├── features.pkl
│ ├── tourism_dict.pkl
│ └── weights.npz
├── tourism.zip
├── web
│ ├── features.pkl
│ └── weights.npz
└── web.zip
6 directories, 20 files
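Extracting every archive by hand gets tedious, so the step can be scripted. The sketch below assumes each release zip holds its files (features.pkl, weights.npz and so on) at the archive root, which matches the layout shown in the tree; extract_models is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def extract_models(model_root="~/.pkuseg"):
    """Extract every *.zip in model_root into a folder named after the archive.

    Assumes the files sit at the root of each archive; if an archive already
    contains a top-level folder, the result will be nested one level deeper.
    """
    root = Path(model_root).expanduser()
    extracted = []
    for archive in sorted(root.glob("*.zip")):
        target = root / archive.stem  # e.g. mixed.zip -> mixed/
        target.mkdir(exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted
```

Running extract_models() with no arguments processes the archives in ~/.pkuseg and leaves the zip files in place, as in the listing.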
Usage
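Using the mixed model, which is the default model, differs slightly from the documentation here, because a source install does not ship with the default model files. Other models such as news are used exactly as documented. (With a pip install, the model files go in the same location.)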
import pkuseg
# Using the default model requires an absolute path.
# A path containing ~ does not work, and neither 'default' nor 'mixed' works as the value.
seg = pkuseg.pkuseg(model_name='/public/home/raner/.pkuseg/mixed', postag=True)
text = seg.cut('我昨天忘记签到了。')
print(text)
# Other models are used as described in the official README.
seg = pkuseg.pkuseg(model_name='news', postag=True)
text = seg.cut('我昨天忘记签到了。')
print(text)
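Since a path containing ~ is not accepted, one way to avoid hard-coding the home directory is to expand it in Python before passing it on. The following is a minimal sketch. model_dir is a hypothetical helper, and the two file names it checks are taken from the directory listing above.

```python
from pathlib import Path


def model_dir(name="mixed", root="~/.pkuseg"):
    """Return the absolute path of a model folder, expanding ~ first.

    Raises FileNotFoundError if the files every model folder in the listing
    contains (features.pkl, weights.npz) are not present.
    """
    path = (Path(root).expanduser() / name).resolve()
    missing = [f for f in ("features.pkl", "weights.npz") if not (path / f).is_file()]
    if missing:
        raise FileNotFoundError(f"{path} is missing {', '.join(missing)}")
    return str(path)
```

The result can then be passed as model_name, for example pkuseg.pkuseg(model_name=model_dir("mixed"), postag=True), which should behave the same as writing the absolute path by hand.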