-
Description of the bug: Monaleesa_full.pdf
How to reproduce the bug:
import pymupdf
page_num = 0
PyMuPDF version: 1.24.11
Operating system: macOS
Python version: 3.12
-
Except for page 7 (0-based), none of the pages contains an image.
-
Vector graphics cannot be extracted. All you can do is make a "photo" of the respective page area ...
-
The Acrobat API can extract vector graphics and save them as PNG or SVG. How does it do this? Would it be hard to support in PyMuPDF? Thanks!
-
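You can try this script. Or do this:

import pymupdf

doc = pymupdf.open("input.pdf")
for page in doc:
    # cluster_drawings() returns one bounding box per group of nearby vector drawings
    for i, bbox in enumerate(page.cluster_drawings()):
        # render only that area of the page to a raster image
        pix = page.get_pixmap(clip=bbox, dpi=150)
        pix.save(f"{doc.name}-{page.number}-{i}.png")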
-
Thanks very much, @JorjMcKie. It works and extracts the image I want. But it also extracts the tables in this PDF as drawings. Is there any field that can differentiate tables from other drawings? Thanks!
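One possible way to approach this, offered only as a sketch and not as a confirmed answer from the maintainers: detect tables separately with `page.find_tables()` and skip every drawing cluster that mostly overlaps a detected table. The helper names `overlap_ratio`, `drop_table_clusters` and `export_non_table_drawings`, as well as the 0.5 threshold, are assumptions made for illustration. Whether the table finder recognizes the tables in this particular PDF is untested.

```python
def overlap_ratio(box, other):
    """Fraction of box's area covered by other; boxes are (x0, y0, x1, y1)."""
    x0, y0 = max(box[0], other[0]), max(box[1], other[1])
    x1, y1 = min(box[2], other[2]), min(box[3], other[3])
    if x1 <= x0 or y1 <= y0:
        return 0.0  # the boxes do not intersect
    area = (box[2] - box[0]) * (box[3] - box[1])
    return (x1 - x0) * (y1 - y0) / area if area > 0 else 0.0


def drop_table_clusters(clusters, table_boxes, threshold=0.5):
    """Keep clusters whose overlap with every table box is below threshold.

    The threshold is an arbitrary illustrative value, not a library default.
    """
    return [
        c for c in clusters
        if all(overlap_ratio(c, t) < threshold for t in table_boxes)
    ]


def export_non_table_drawings(path, dpi=150):
    """Hypothetical helper: save drawing clusters that are not tables as PNGs."""
    import pymupdf  # imported here so the helpers above work without it

    doc = pymupdf.open(path)
    for page in doc:
        # Bounding boxes of whatever the table finder detects on this page
        table_boxes = [tuple(t.bbox) for t in page.find_tables().tables]
        clusters = [tuple(r) for r in page.cluster_drawings()]
        for i, bbox in enumerate(drop_table_clusters(clusters, table_boxes)):
            pix = page.get_pixmap(clip=bbox, dpi=dpi)
            pix.save(f"{doc.name}-{page.number}-{i}.png")
```

Calling `export_non_table_drawings("input.pdf")` would then write only the clusters that were not matched to a table. Lowering the threshold filters more aggressively and may also drop figures that sit close to a table.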