1. Installing pdfminer on CentOS 8
[root@VM-99-12-centos ~]# pip install pdfminer
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip install --user` instead.
Collecting pdfminer
Downloading http://mirrors.tencentyun.com/pypi/packages/71/a3/155c5cde5f9c0b1069043b2946a93f54a41fd72cc19c6c100f6f2f5bdc15/pdfminer-20191125.tar.gz (4.2MB)
100% |████████████████████████████████| 4.2MB 135.0MB/s
Collecting pycryptodome (from pdfminer)
Downloading http://mirrors.tencentyun.com/pypi/packages/af/ef/bedde9b7a1f237b743eb307e6c247369c2ae5ca6a79b61c064698cfd78cd/pycryptodome-3.10.1-cp35-abi3-manylinux1_x86_64.whl (1.9MB)
100% |████████████████████████████████| 1.9MB 3.3MB/s
Installing collected packages: pycryptodome, pdfminer
Running setup.py install for pdfminer ... done
Successfully installed pdfminer-20191125 pycryptodome-3.10.1
2. Testing pdf2txt.py
Convert page 3 of the PDF to plain text and save the result as pdf-page3.txt:
pdf2txt.py -t text -p 3 -o pdf-page3.txt 62100400348.pdf
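To repeat the same conversion for other pages or files, the command can be wrapped in a small Python script. The following is only a minimal sketch: the helper names build_pdf2txt_command and convert_page are hypothetical, and it assumes pdf2txt.py is on the PATH after the installation above. It uses only the flags that appear in the command above (-t, -p, -o).

```python
import shutil
import subprocess
from pathlib import Path


def build_pdf2txt_command(pdf_path, page, out_path):
    """Build the pdf2txt.py argument list (hypothetical helper).

    Uses only the flags from the command above:
    -t output type, -p page number, -o output file.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    return [
        "pdf2txt.py",
        "-t", "text",
        "-p", str(page),
        "-o", str(out_path),
        str(pdf_path),
    ]


def convert_page(pdf_path, page, out_path):
    """Run pdf2txt.py for one page (hypothetical helper).

    Returns False without running anything if pdf2txt.py is not on the PATH
    or the input file does not exist; returns True after a successful run.
    """
    if shutil.which("pdf2txt.py") is None or not Path(pdf_path).is_file():
        return False
    subprocess.run(build_pdf2txt_command(pdf_path, page, out_path), check=True)
    return True
```

Calling convert_page("62100400348.pdf", 3, "pdf-page3.txt") would then run the same command as above. Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.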