Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make python based exportNotebook script (also add support for mac and potentially windows) #22

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

mseri
Copy link

@mseri mseri commented Jun 14, 2018

This allows to drop the dependency on pdfinfo and makes both pdftk and ghostscript optional, at the price of introducing a dependency on PyPDF2 (completely python based) and paramiko (almost purely python based).

I also believe that the python3-based script is easier to read and maintain. I think it could even allow to run this script on Windows, but I have not tested it.

If you are happy with this PR and such a drastic change, I can update the README.
We could also add a setup.py file with an entrypoint for exportNotebook and rM2svg, and package the whole thing as a standalone pip-installable python3 package. I am happy to add it if you believe this is worthwhile.

There is space for further cleanup and unit-testing. Currently I have tested this on multiple notebooks using the latest osx and ununtu, and I am very happy with the results.

Closes #21

mseri added 10 commits June 14, 2018 14:30
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
It implements a first version of core functions and drops the pdfinfo
dependency but introduces the python3 PyPDF2 and paramiko dependencies.

Pdftk is now optional but the merge function using PyPDF2 is not yet
implemented.

Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
When used, this removes the dependency of pdftk.

Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
@mseri mseri changed the title Make python based exportNotebook script Make python based exportNotebook script (also add support for mac and potentially windows) Jun 20, 2018
@mseri
Copy link
Author

mseri commented Jun 20, 2018

I have been testing this on multiple PDFs, I have some examples in which the offsets are coming out incorrect (happens also with the old pdftk-based implementation though). Unfortunately I cannot publish the PDFs where this is happening. Interestingly enough the issue does not show up with remt, so there may be some incorrect computation here (I wonder if it has to do with the fact that we comput the x offset but we never touch the y offset).

It is not a regression but I will try to make some tests to see if it can be fixed.

I still believe this PR is useful as remt is problematic to use on windows due to its dependency on cairo and poppler, while this new version does not have any unusual dependency

@Lspencer
Copy link

I have tweaked the exportNotebook script to work with US letter paper size (a scale factor and vertical offset are needed for this, whereas a horizontal offset is needed for A4 paper size). I have some pdf files that it does funny things on, so it is not quite ready to share. I'll hopefully update soon.

@mseri
Copy link
Author

mseri commented Jul 26, 2018

@edupont is there any interest in this python conversion PR?

@mseri
Copy link
Author

mseri commented Jul 26, 2018

@Lspencer neat! We should probably make exportNotebook generic on the page format!

@florian-wagner
Copy link
Contributor

@mseri I like the idea of packaging it into a Python application. Thank you very much for your efforts and sorry for the late reply. @Ameb @edupont What do you think? Can we do this within here or should we work on a seperate repository?

@LDSpencer
Copy link

An update on my script to handle A4 and USletter paper with ExportNotebook: I was testing with a student thesis chapter and needed a vertical offset (plus a scaling factor) instead of a horizontal offset. I got this working and then tested on 2 other files. Both did not work out of the box. I found a work-around for one (us gs to scale by 1 and offset by 0, and then it would scale and offset properly), but the other one would not scale, so I had to scale the rM lines to the original page size. The latter is OK, but edge comments are clipped sometimes. I am not the author of any of these pdfs, so I do not know all of the differences, and haven't been able to duplicate these results on pdf files I create. With the pdf file that would not scale, the gs routine I was using modified the mediaBox, but not the artBox, clipBox, etc. I'll try to clean up my script and then post it, but I would like to better understand why this is failing on some pdfs. I can try to post a MWE but need these files to be my own (not student work), so I still am huntign around to recreate this problem.

@trou
Copy link
Contributor

trou commented Aug 1, 2018

It may be useful to create a repository with a set of pairs (lines, PDF) to test, maybe even be included as a test suite ?

@mseri
Copy link
Author

mseri commented Aug 7, 2018

@trou that would be great!

@mseri mseri mentioned this pull request Aug 27, 2018
2 tasks
@ericsfraga
Copy link

I've hacked the rM2svg script to work for most (all?) PDF annotations with one minor caveat: annotations near the right or bottom edges of the tablet may be lost depending on the aspect ratio of the original PDF document (due to how the reMarkable maps the PDF to the tablet's 4:3 aspect ratio). The script now takes --width and --height arguments which should correspond to the original PDF document's geometry and the annotations are mapped accordingly.
I've attached a test file I used for playing around with the updated script. Note loss of arrowhead on the right end of the top horizontal line.
screendump-20180928143204

@niklasb
Copy link

niklasb commented Oct 22, 2018

@ericsfraga where can I find your updated script and the test file? I'm willing to play around with it

@mseri
Copy link
Author

mseri commented Oct 22, 2018

@ericsfraga would you mind making a PR to my make-python-based with the fix?

@mseri mseri closed this Oct 22, 2018
@mseri mseri reopened this Oct 22, 2018
@ericsfraga
Copy link

ericsfraga commented Oct 22, 2018

The modified python script is here along with the test file here. The script now expects a width and a height which you can extract from the PDF using pdfinfo. I have a shell script that uses rM2svg to create annotated PDF files.

The shell script expects a .lines file as the argument. It then looks for a .pdf with the same base name. If it finds the file, it extracts width and height from that file and then creates an annotated file. If no PDF is found, it assumes that the .lines file refers to a notebook on the remarkable and so simply creates a PDF from the lines file directly.

I've been using my shell script with the updated rM2svg script in anger for a few weeks now and it's working very well.

Also, my shell script will also incorporate any template that was used for notebook documents. It does assume the templates have been copied over from the remarkable tablet itself.

@niklasb
Copy link

niklasb commented Oct 22, 2018

@ericsfraga I tried your rM2svg script and it worked well with my letter-format test file. I'm not sure what the purpose of your hardcoded 2.3 constant is. I would suggest a slightly simpler version that just wraps everything in a <g transform> tag: https://gist.github.com/niklasb/8d4da7933b4eace54cd2c836ef62117e I've never tried to use it with multi-page documents though. It's honestly a mystery to me how the upstream PDF export can still be this buggy. Thank you @edupont and others for making these scripts.

@ericsfraga
Copy link

ericsfraga commented Oct 22, 2018

Ah, the 2.3 was a holdover from when I was manually playing with A4 size paper and comes from the ratio of 1404 to the width of an A4 page. It should probably be calculated directly as the 1404 divided by actual target width. This does assume that the height ratio is similar... bit of a hack, I guess.

Regarding your suggestion about wrapping within an appropriate tag, I'm sure that's probably the right way to go about it but I sure do not have the expertise in SVG...

In any case, I'm glad it worked with your letter format page as well!

And why the default export mechanism is so bad... ummm, no idea really.

@niklasb
Copy link

niklasb commented Oct 22, 2018

The correct "algorithm" seems to be to scale down the page to the correct width keeping the aspect ratio, and then truncate/extend the SVG so that it also has the correct target height. I tried to do this in my code.

@niklasb
Copy link

niklasb commented Oct 22, 2018 via email

Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
This should resolve the scaling issues for the PDFs that have all the pages of the same size

Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
@niklasb
Copy link

niklasb commented Oct 22, 2018

@mseri Not sure if mixing this with the prepare_background functionality you have makes sense. In fact the '{translate -offset 0}' ghostscript command does not work at all for the documents I was trying to export, hence I switched over to generating the appropriate foreground image in the first place.

@mseri
Copy link
Author

mseri commented Oct 22, 2018

I like the idea of generating the "right" foreground so that there are as few rescaling as possible in the document merge. It does not seem that this will be merged anytime soon, I'll play around with it and see if it resolves some of the issues I've found

@mseri
Copy link
Author

mseri commented Apr 11, 2019

This has gotten stale. I am starting working on the version supporting the new format, integrating @ericsfraga changes as well. Would it make sense to just fork this. It seems to have become unmaintained

@ericsfraga
Copy link

I cannot speak for the original author but do let us know if you do fork this. I am waiting impatiently for the latest upgrade to the system to see if things break again; I don't expect they will but you never know!

@mseri
Copy link
Author

mseri commented Apr 11, 2019

I will update the code to use the new rm2svg for .rm files, generate the appropriate foreground, and move everything to a separate repo. My hopes are for all this functionality to be integrated directly in rmapi (as it seems to be happening), but I want to have working colorful export for myself in the meanwhile. I'll ping here in some weeks, as soon as I have done it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants