Converting Jira lists with CRLF line-breaks adds erroneous whitespace to subsequent text #25

Open
arctus-io opened this issue Aug 19, 2024 · 4 comments · May be fixed by #28

arctus-io commented Aug 19, 2024

When jira2markdown's convert() function is applied to a Jira list that uses CRLF (carriage return + line feed, \r\n) line breaks, the resulting Markdown has erroneous leading whitespace on the text that follows the last list item.

See below for a visual example of the conversion issue.

from jira2markdown import convert
jira_text = 'Line Before List: Sample text words words words:\r\n * Bulleted Item 1: Sample text words words words\r\n * Bulleted Item 2: Sample text words words words\r\n\r\nLine After List: Sample text words words words\r\nLine After List: Sample text words words words'
print(jira_text)

Input (jira_text printed):

Line Before List: Sample text words words words:
 * Bulleted Item 1: Sample text words words words
 * Bulleted Item 2: Sample text words words words

Line After List: Sample text words words words
Line After List: Sample text words words words

Input (jira_text with line-breaks visualized):

Line Before List: Sample text words words words:\r\n
 * Bulleted Item 1: Sample text words words words\r\n
 * Bulleted Item 2: Sample text words words words\r\n
\r\n
Line After List: Sample text words words words\r\n
Line After List: Sample text words words words
md_text = convert(jira_text)

Expected Output (md_text printed):

Line Before List: Sample text words words words:
- Bulleted Item 1: Sample text words words words
- Bulleted Item 2: Sample text words words words

Line After List: Sample text words words words
Line After List: Sample text words words words

Expected Output (md_text with line-breaks visualized):

Line Before List: Sample text words words words:\r\n
- Bulleted Item 1: Sample text words words words\r\n
- Bulleted Item 2: Sample text words words words\r\n
\r\n
Line After List: Sample text words words words\r\n
Line After List: Sample text words words words
print(md_text)

Actual Output (md_text printed):

Line Before List: Sample text words words words:
- Bulleted Item 1: Sample text words words words
- Bulleted Item 2: Sample text words words words
  
  Line After List: Sample text words words words
  Line After List: Sample text words words words

Actual Output (md_text with line-breaks visualized):

Line Before List: Sample text words words words:\r\n
- Bulleted Item 1: Sample text words words words\n
- Bulleted Item 2: Sample text words words words\n
  \n
  Line After List: Sample text words words words\n
  Line After List: Sample text words words words

As shown, the conversion replaces:

  • \r\n within the list with \n
  • \r\n\r\n at the end of the list with \n, two spaces, and \n
  • \r\n after the list with \n

Each line after the list is also indented by two spaces.
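The line-break-visualized listings in this report can be reproduced with a small helper along these lines. This is only a sketch; `show_line_endings` is a hypothetical name and not part of jira2markdown:

```python
def show_line_endings(text: str) -> str:
    """Return text with each line's terminator spelled out as an escape."""
    rendered = []
    for line in text.splitlines(keepends=True):
        # Separate the visible part of the line from its CR/LF terminator.
        body = line.rstrip("\r\n")
        ending = line[len(body):]
        # Write CR and LF literally so CRLF and LF endings can be told apart.
        visible = ending.replace("\r", "\\r").replace("\n", "\\n")
        rendered.append(body + visible)
    return "\n".join(rendered)
```

Calling `print(show_line_endings(jira_text))` and `print(show_line_endings(md_text))` gives a side-by-side view of which terminators changed.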

Copy-and-paste snippet to reproduce the issue:

from jira2markdown import convert

# Input with CRLF line-breaks
jira_text = 'Line Before List: Sample text words words words:\r\n * Bulleted Item 1: Sample text words words words\r\n * Bulleted Item 2: Sample text words words words\r\n\r\nLine After List: Sample text words words words\r\nLine After List: Sample text words words words'

# Print input with line-breaks rendered
print("\njira_text:\n" + jira_text)

# Print input with line-breaks represented, not rendered
print("\nrepr(jira_text):\n" + repr(jira_text))

md_text = convert(jira_text)

# Print output with line-breaks rendered
print("\nmd_text:\n" + md_text)

# Print output with line-breaks represented, not rendered
print("\nrepr(md_text):\n" + repr(md_text))
@arctus-io
Author

Happy to try and find/fix the issue if it is not a trivial fix on your end.

catcombo linked a pull request on Sep 7, 2024 that will close this issue
@catcombo
Owner

catcombo commented Sep 7, 2024

Hi @arctus-io!

Thank you very much for the detailed issue with the reproducer, and sorry for the late response. I prepared a fix in #28. Could you please test it?

@arctus-io
Author

arctus-io commented Sep 18, 2024

@catcombo: Thanks for the fix! I can confirm #28 fixes this issue.

However, while testing this fix I noticed that the same kind of problem with \r\n line endings also affects table conversions.

I can submit another issue with more details if needed, but I believe an across-the-board change that treats \r\n as equivalent to \n would resolve both cases.

My current workaround is to convert all \r\n line endings to \n before running the text through jira2markdown.
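A minimal sketch of that workaround. The converter is passed in as a callable so the helper does not depend on the package being importable; `normalize_newlines` and `convert_with_lf` are hypothetical names, not part of jira2markdown:

```python
from typing import Callable


def normalize_newlines(text: str) -> str:
    """Replace every CRLF pair with a single LF."""
    return text.replace("\r\n", "\n")


def convert_with_lf(jira_text: str, converter: Callable[[str], str]) -> str:
    """Normalize line endings first, then hand the text to the converter."""
    return converter(normalize_newlines(jira_text))
```

With the library installed, this would be called as `convert_with_lf(jira_text, convert)` after `from jira2markdown import convert`. The result then uses \n throughout, so the \r\n endings of the input are not preserved.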

@catcombo
Owner

@arctus-io Thanks for the feedback! The easiest way to solve this problem is to replace \r\n with \n in the convert function before applying the markup conversion. However, I would like to find a way to fix it at the pyparsing level, which may take some time. Could you give me an example of the table conversion so I have more test cases?
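One possible way to generate such test cases, sketched under the assumption that CRLF input should convert to the same result as LF input apart from the line-ending style. `crlf_matches_lf` is a hypothetical helper, and the converter is passed in as a callable rather than imported:

```python
from typing import Callable


def crlf_matches_lf(converter: Callable[[str], str], lf_text: str) -> bool:
    """Check that the CRLF variant of lf_text converts like the LF original.

    lf_text is assumed to contain only LF line endings.
    """
    crlf_text = lf_text.replace("\n", "\r\n")
    lf_result = converter(lf_text)
    # Ignore the line-ending style of the output; compare content only.
    crlf_result = converter(crlf_text).replace("\r\n", "\n")
    return crlf_result == lf_result
```

Each list or table sample written with \n endings can then be checked with `crlf_matches_lf(convert, sample)`; a `False` result points to a construct that still treats \r\n differently.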
