Getting started with the pdfplumber module

Posted 2020-11-10 日天达人


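import pdfplumber


def pdf_read(path, page_index=11):
    # Open the PDF at the given file path; the with-block closes it afterwards
    with pdfplumber.open(path) as pdf:
        page0 = pdf.pages[page_index]    # pick a page (0-based index)
        tables = page0.extract_tables()  # tables found on that page
        texts = page0.extract_text()     # plain text of that page
    return tables, texts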
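
extract_tables() returns a list of tables, where each table is a list of rows and each row is a list of cell values (strings, or None for empty cells). The cells often carry stray line breaks, so a small cleanup step is usually useful before further processing. The sketch below is one possible way to do it; the helper name clean_table is hypothetical and not part of pdfplumber.

```python
def clean_table(table):
    """Return a copy of the table with None cells replaced by "" and
    runs of whitespace (including line breaks) collapsed to one space."""
    return [
        [" ".join(cell.split()) if cell is not None else "" for cell in row]
        for row in table
    ]


# Hand-written sample in the same shape that extract_tables() yields
# for a single table; it is illustrative data, not real PDF output.
sample = [["Name", "Qty"], ["Widget\nA", None]]
print(clean_table(sample))
```

The helper works on plain lists, so it can be applied to each table in the returned list without touching the PDF again.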

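By default, pdfplumber uses a table's ruling lines to tell rows and columns apart, so it cannot extract a table in the following cases:
* The table is an image. You can check this by trying to select its text in a PDF viewer.
* The table is not delimited by lines, or only partly, e.g. the columns have lines but the rows do not.
When the lines are missing or incomplete, try passing custom table settings: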
page0.extract_tables(table_settings={})
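
An empty table_settings dict leaves the defaults unchanged, so something has to go inside it. A minimal sketch, assuming the table has no ruling lines at all: switch both strategies to "text", so that pdfplumber infers cell boundaries from the alignment of the words. The keys vertical_strategy and horizontal_strategy are pdfplumber settings; the helper name extract_tables_with_fallback and the constant TEXT_SETTINGS are hypothetical.

```python
# Assumed fallback settings: infer cell boundaries from text alignment
# instead of ruling lines.
TEXT_SETTINGS = {
    "vertical_strategy": "text",
    "horizontal_strategy": "text",
}


def extract_tables_with_fallback(page, fallback_settings=None):
    """Try the default line-based extraction first; if it finds no
    tables, retry with the fallback settings (text strategy by default)."""
    tables = page.extract_tables()
    if tables:
        return tables
    settings = dict(TEXT_SETTINGS if fallback_settings is None else fallback_settings)
    return page.extract_tables(table_settings=settings)


# Usage with a page object obtained from pdfplumber, e.g.:
#     with pdfplumber.open(path) as pdf:
#         tables = extract_tables_with_fallback(pdf.pages[11])
```

For the mixed case mentioned above (columns ruled, rows not), one could pass fallback_settings={"vertical_strategy": "lines", "horizontal_strategy": "text"} instead. The text strategy is a heuristic, so the result is worth checking by eye.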
