Description of the bug
I am trying to get the coordinates of each piece of text in a PDF file. MuPDF places the coordinate origin at the top left, whereas PDF files place it at the bottom left (or possibly somewhere else). I only want coordinates with a top-left origin. Is there a way to always get a top-left origin?

How to reproduce the bug
None.

PyMuPDF version
1.24.10

Operating system
Linux

Python version
3.10
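For reference, the difference between the two conventions can be sketched in a few lines. This is only an illustration that assumes an unrotated page whose height in points is known; the helper names `pdf_to_top_left` and `rect_pdf_to_top_left` are hypothetical and not part of PyMuPDF.

```python
def pdf_to_top_left(x, y, page_height):
    """Convert a point from a bottom-left origin to a top-left origin.

    Assumes an unrotated page; page_height is in the same units as y.
    """
    return (x, page_height - y)


def rect_pdf_to_top_left(rect, page_height):
    """Convert a rectangle (x0, y0, x1, y1) with y0 <= y1, bottom-left origin."""
    x0, y0, x1, y1 = rect
    # Flipping the y axis swaps which edge is the top one: the old upper
    # edge (larger y) becomes the smaller y value in the new system.
    return (x0, page_height - y1, x1, page_height - y0)
```

On a US Letter page (792 pt high), a point 720 pt above the bottom edge ends up 72 pt below the top edge. A rotated page needs an additional rotation step, which this sketch does not cover.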
Answered by
anhalu
Sep 26, 2024
@JorjMcKie please help me.
Update on the problem
I think I misunderstood something: PyMuPDF reads the PDF's coordinates and converts them into MuPDF coordinates. The original PDF page is rotated, so the MuPDF coordinates are rotated as well. Calling page.remove_rotation() solved that problem, but it also produces many more shape points (the x, y coordinate pairs in each shape's items), which greatly increases the computation. I planned to use a convex hull to reduce them, but that is not efficient enough.
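If only the extent of each shape is needed rather than its exact outline, an axis-aligned bounding box is a cheaper substitute for a convex hull: it takes a single pass over the points and always returns four numbers. The sketch below is an assumption about what might be sufficient here, not a PyMuPDF feature; `bounding_box` is a hypothetical helper that takes plain (x, y) tuples.

```python
def bounding_box(points):
    """Return (x0, y0, x1, y1) enclosing all (x, y) points.

    Runs in O(n) time, compared with O(n log n) for a convex hull.
    """
    points = list(points)
    if not points:
        raise ValueError("bounding_box() needs at least one point")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

This loses the outline of the shape, so it only helps when a rectangle is an acceptable approximation. As far as I know, each path dictionary returned by `page.get_drawings()` also carries a `rect` entry, which may make even this step unnecessary.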