Incorrect handling of unicode ANSI skipping #30

Open
ztravis opened this issue Apr 5, 2024 · 1 comment

Comments


ztravis commented Apr 5, 2024

The current handling of the ANSI fallback characters that are skipped after a Unicode escape is slightly wrong. According to the RTF 1.5 spec (https://www.biblioscape.com/rtf15_spec.htm):

  • A scope delimiter (i.e. "{" or "}") should end the current skippable data
  • A control word or control symbol should count as a single skipped character (in my testing with MS Office, they are ignored)
  • Any binary data also counts as a single skipped character
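
For illustration, the three rules above could be sketched roughly as follows. This is a minimal, self-contained sketch and not rtfparserkit's actual code: the class `UnicodeSkipSketch` and the methods `skipFallback` and `skipControl` are hypothetical names, the input is assumed to be a plain `String` positioned just after the `\uN` control word, and `uc` is assumed to hold the current `\ucN` value. Line breaks inside the RTF source are not treated specially here.

```java
// Hypothetical sketch: counts skippable "characters" after a \uN escape.
final class UnicodeSkipSketch {
    private UnicodeSkipSketch() {}

    /** Returns the index of the first character after the skipped fallback data. */
    static int skipFallback(String rtf, int pos, int uc) {
        int remaining = uc;
        while (remaining > 0 && pos < rtf.length()) {
            char c = rtf.charAt(pos);
            if (c == '{' || c == '}') {
                break; // rule 1: a scope delimiter ends the skippable data
            }
            if (c == '\\') {
                pos = skipControl(rtf, pos); // rules 2 and 3: one skipped character
            } else {
                pos++; // ordinary fallback character
            }
            remaining--;
        }
        return pos;
    }

    /** Skips one control word or symbol starting at the backslash at pos. */
    private static int skipControl(String rtf, int pos) {
        int n = rtf.length();
        int i = pos + 1; // move past the backslash
        if (i >= n) {
            return n;
        }
        char c = rtf.charAt(i);
        if (!isLetter(c)) {
            // Control symbol: \'hh carries two hex digits, others are one char.
            return (c == '\'') ? Math.min(i + 3, n) : i + 1;
        }
        int start = i;
        while (i < n && isLetter(rtf.charAt(i))) {
            i++;
        }
        String word = rtf.substring(start, i);

        // Optional numeric parameter, possibly negative.
        int j = i;
        boolean negative = false;
        if (j < n && rtf.charAt(j) == '-') {
            negative = true;
            j++;
        }
        int digitsStart = j;
        long param = 0;
        while (j < n && rtf.charAt(j) >= '0' && rtf.charAt(j) <= '9') {
            param = Math.min(param * 10 + (rtf.charAt(j) - '0'), Integer.MAX_VALUE);
            j++;
        }
        boolean hasParam = j > digitsStart;
        if (hasParam) {
            i = j;
        }
        // A single trailing space is part of the control word, not data.
        if (i < n && rtf.charAt(i) == ' ') {
            i++;
        }
        // \binN: the N data units that follow belong to the same skipped character.
        if (word.equals("bin") && hasParam && !negative) {
            i = (int) Math.min((long) i + param, n);
        }
        return i;
    }

    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

With `uc = 1`, a fallback of `?` advances one position, a fallback of `\'e9` advances past the whole control symbol, and a `}` immediately after the escape is left untouched so the group still closes correctly.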

I plan on opening a PR for these, just wanted to open an issue first in case that takes a while.

@Gurushesh-Metapercept

@ztravis, could you please tell me how to run rtfparserkit? I created a Maven project and added rtfparserkit as a dependency, but I'm still getting errors. Have you run it successfully? If so, could you share any guidelines or solutions?
