[Question] Using python pdfminer3k to read a PDF, and p

Author: 知识收集社 | Published 2019-01-08 17:13


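How can I rename downloaded PDFs whose file names are garbled, so that each file is named after the title of the paper it contains? The environment is Windows with Python 3.6, using the pdfminer library.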

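I followed the program from https://www.jianshu.com/p/742a28decc58. With the original author's code, step 2 (read the PDF and get the title) works. After running step 3 (rename the file), however, the file name is unchanged and no error is reported. What is the cause?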

The program is as follows:

#encoding:utf-8
from urllib.request import urlopen
from pdfminer.pdfinterp import PDFResourceManager,process_pdf
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from io import StringIO #StringIO reads and writes str in memory
from io import open
import os
from os import walk

#Step 2 (read the PDF and get the title)
def readPDF(pdffile):
    rsrcmgr=PDFResourceManager()
    retstr=StringIO()
    laparams=LAParams()
    device=TextConverter(rsrcmgr,retstr,laparams=laparams)
    process_pdf(rsrcmgr,device,pdffile)
    device.close()
    content=retstr.getvalue()
    retstr.close()
    strs = str(content).split("\n")
    #Use lines 8, 9 and 10 as the title; this may not suit other PDF documents
    title = strs[8]+strs[9]+strs[10]
    return title

pdffile=open('D:\\pdfjiexi\\3.pdf',"rb")
title =readPDF(pdffile)
print(title)
pdffile.close()

#Step 3 (rename the file)
def rename():
    walk = os.walk('/pdfjiexi/')
    i = 0
    for root, dirs, files in walk:
        #Build the full path of each file
        for name in files:
            pdffile=open(os.path.join(root, name),"rb")
            title = readPDF(pdffile)
            print(title)
            os.rename(os.path.join(root, name), os.path.join(root, title+".pdf"))
            i += 1
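
Going only by the listing as posted, several things could explain why nothing happens and nothing is reported. First, rename() is defined but never called, so step 3 never runs. Second, os.walk('/pdfjiexi/') has no drive letter, so on Windows it resolves against the current drive; if that folder does not exist there, os.walk simply yields nothing without raising. Third, each PDF is still open when os.rename runs, which Windows normally rejects. Fourth, text extracted from a PDF can contain characters that are not allowed in Windows file names. The sketch below is one way to address these points, not the original author's method. The names safe_filename, rename_pdfs, get_title and max_len are hypothetical and are not part of pdfminer3k; get_title stands for any function that takes an open binary file and returns the title string, such as readPDF above.

```python
import os
import re


def safe_filename(title, max_len=150):
    """Turn extracted text into something Windows accepts as a file name."""
    # Replace characters Windows forbids in file names, plus control characters
    # (this also covers newlines left over from text extraction).
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', " ", title)
    # Collapse runs of whitespace, then trim leading/trailing spaces and dots.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    # max_len is an arbitrary, assumed cap to keep paths reasonably short.
    return cleaned[:max_len].rstrip(" .")


def rename_pdfs(folder, get_title):
    """Rename every .pdf under folder to '<title>.pdf'.

    get_title is any callable that takes an open binary file and returns
    a string. Returns a list of (old_path, new_path) pairs that were renamed.
    """
    renamed = []
    for root, dirs, files in os.walk(folder):
        for name in files:
            if not name.lower().endswith(".pdf"):
                continue  # skip anything that is not a PDF
            old_path = os.path.join(root, name)
            # Close the file before renaming; Windows normally refuses to
            # rename a file that is still open.
            with open(old_path, "rb") as pdffile:
                title = get_title(pdffile)
            new_name = safe_filename(title)
            if not new_name:
                continue  # nothing usable was extracted; leave the file alone
            new_path = os.path.join(root, new_name + ".pdf")
            if os.path.exists(new_path):
                continue  # do not overwrite an existing file
            os.rename(old_path, new_path)
            renamed.append((old_path, new_path))
    return renamed
```

With the readPDF function from the post, the call would be rename_pdfs('D:\\pdfjiexi', readPDF), using the same absolute path that step 2 already opens successfully. Printing the returned list shows which files were actually renamed, which makes a silent no-op easier to spot.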


Link to this post: https://www.haomeiwen.com/subject/gujbrqtx.html
