Problem in extracting words by Font Style (face) #957
Unanswered
ProxyAyush
asked this question in
Q&A
Replies: 1 comment
-
Hi @ProxyAyush, and thanks for your interest in this library. Without the PDF, this will be difficult to diagnose. Can you provide it? And are you sure that the Also, if you could, please edit your message to fix the formatting issues, which currently are making it a bit difficult to read the code.
-
I cannot extract words from a PDF by filtering on the `fontname` property of its objects. My code is below:
```python
import pdfplumber


def get_filtered_text(file_to_parse: str) -> str:
    page_texts = []
    with pdfplumber.open(file_to_parse) as pdf:
        # Loop over the pages that exist rather than a hard-coded range(0, 286).
        for page in pdf.pages:
            # Keep only character objects whose font size is exactly 33.
            clean_page = page.filter(lambda obj: obj["object_type"] == "char" and obj["size"] == 33)
            text = clean_page.extract_text()
            if text:
                page_texts.append(text)
    return "\n".join(page_texts)


print(get_filtered_text("/Users/User/Desktop/kundu modified.pdf"))
```
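Since the question is about `fontname` but the code above filters on `size`, here is a minimal sketch of one way to check which fonts a page actually contains and then filter on them. It is not a confirmed fix for this PDF. It assumes that font names may carry a subset prefix (for example `ABCDEF+Times-Bold`), so it matches on a substring, and that sizes may not be exact integers, so it compares with a tolerance. The helper names `font_summary`, `make_font_filter`, and `extract_by_font`, and the default tolerance of 0.1, are made up for illustration.

```python
from collections import Counter


def font_summary(chars):
    """Count (fontname, rounded size) pairs across a list of char dicts."""
    return Counter((c["fontname"], round(c["size"], 1)) for c in chars)


def make_font_filter(name_part, size=None, tol=0.1):
    """Build a predicate for page.filter() that keeps matching chars only."""
    def keep(obj):
        if obj.get("object_type") != "char":
            return False
        # Substring match, so a subset prefix such as "ABCDEF+" does not matter.
        if name_part not in (obj.get("fontname") or ""):
            return False
        # Compare sizes with a tolerance instead of exact equality.
        return size is None or abs(obj["size"] - size) <= tol
    return keep


def extract_by_font(path, name_part, size=None):
    """Return text from all pages, restricted to chars in the given font."""
    import pdfplumber  # imported here so the helpers above work without it

    with pdfplumber.open(path) as pdf:
        texts = [
            page.filter(make_font_filter(name_part, size)).extract_text()
            for page in pdf.pages
        ]
    return "\n".join(t for t in texts if t)
```

Printing `font_summary(pdf.pages[0].chars)` first shows which font names and sizes occur on a page. A call such as `extract_by_font(path, "Bold", size=33)` can then use a name that is really present. Here `"Bold"` is only a placeholder.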