Processing PDF files with Python


Author: 微雨旧时歌丶 | Published: 2018-10-30 15:03

Extract the text of a PDF from the command line with pdftotext, preserving the page layout:

    pdftotext -layout '/home/pengfei/桌面/51.pdf'
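The same command can be driven from Python. The sketch below is only one way to do it, assuming the `pdftotext` binary (from poppler-utils or xpdf) is on the PATH; the helper names `build_pdftotext_cmd` and `pdf_to_text` are made up for illustration and are not part of any library.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_cmd(pdf_path, txt_path=None, layout=True):
    """Build the argument list for pdftotext (hypothetical helper)."""
    cmd = ["pdftotext"]
    if layout:
        # -layout keeps the physical layout of the page in the text output
        cmd.append("-layout")
    cmd.append(str(pdf_path))
    if txt_path is not None:
        # Without an explicit output file, pdftotext writes <name>.txt
        # next to the input PDF.
        cmd.append(str(txt_path))
    return cmd


def pdf_to_text(pdf_path, layout=True):
    """Run pdftotext and return the extracted text (hypothetical helper)."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    txt_path = Path(pdf_path).with_suffix(".txt")
    subprocess.run(build_pdftotext_cmd(pdf_path, txt_path, layout), check=True)
    return txt_path.read_text(encoding="utf-8", errors="replace")
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces or non-ASCII characters, such as the 桌面 directory above.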
    

The tabula library (installed as tabula-py)

Example

    import tabula
    
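    # Read pdf into DataFrame; pages="all" stands in for the undefined
    # `options` placeholder (by default only page 1 is read)
    df = tabula.read_pdf("test.pdf", pages="all")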
    
    # Read remote pdf into DataFrame
    df2 = tabula.read_pdf("https://github.com/tabulapdf/tabula-java/raw/master/src/test/resources/technology/tabula/arabic.pdf")
    
    # convert PDF into CSV
    tabula.convert_into("test.pdf", "output.csv", output_format="csv")
    
    # convert all PDFs in a directory
    tabula.convert_into_by_batch("input_directory", output_format='csv')
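One thing to watch for: depending on the tabula-py version and options, `read_pdf` may return a single DataFrame or a list of DataFrames (one per table found). The sketch below normalizes both cases into a list. The helper names `as_table_list` and `read_tables` are hypothetical, and the code assumes tabula-py and a Java runtime are installed.

```python
def as_table_list(result):
    """Normalize a read_pdf result into a list of tables (hypothetical helper)."""
    if result is None:
        # Treat "nothing returned" as no tables found
        return []
    if isinstance(result, (list, tuple)):
        return list(result)
    # A single DataFrame: wrap it so callers can always iterate
    return [result]


def read_tables(pdf_path, pages="all"):
    """Read every table from a PDF as a list of DataFrames (hypothetical helper)."""
    import tabula  # imported lazily; requires tabula-py and Java

    return as_table_list(tabula.read_pdf(pdf_path, pages=pages))
```

With this in place, `for table in read_tables("test.pdf"): print(table.head())` works the same way whichever return type the installed version produces.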
    


Article link: https://www.haomeiwen.com/subject/vwegtqtx.html