Converting Jira lists with CRLF line-breaks adds erroneous whitespace to subsequent text #25

arctus-io · 2024-08-19T20:29:23Z

When jira2markdown's convert() function is applied to Jira lists with CRLF (carriage return + line feed) line-breaks, the resulting Markdown adds erroneous leading whitespace to the text that follows the last list item.

See below for a visual example of the conversion issue.

from jira2markdown import convert
jira_text = 'Line Before List: Sample text words words words:\r\n * Bulleted Item 1: Sample text words words words\r\n * Bulleted Item 2: Sample text words words words\r\n\r\nLine After List: Sample text words words words\r\nLine After List: Sample text words words words'

print(jira_text)

Input (jira_text printed):

Line Before List: Sample text words words words:
 * Bulleted Item 1: Sample text words words words
 * Bulleted Item 2: Sample text words words words

Line After List: Sample text words words words
Line After List: Sample text words words words

Input (jira_text with line-breaks visualized):

Line Before List: Sample text words words words:\r\n
 * Bulleted Item 1: Sample text words words words\r\n
 * Bulleted Item 2: Sample text words words words\r\n
\r\n
Line After List: Sample text words words words\r\n
Line After List: Sample text words words words

md_text = convert(jira_text)

Expected Output (md_text printed):

Line Before List: Sample text words words words:
- Bulleted Item 1: Sample text words words words
- Bulleted Item 2: Sample text words words words

Line After List: Sample text words words words
Line After List: Sample text words words words

Expected Output (md_text with line-breaks visualized):

Line Before List: Sample text words words words:\r\n
- Bulleted Item 1: Sample text words words words\r\n
- Bulleted Item 2: Sample text words words words\r\n
\r\n
Line After List: Sample text words words words\r\n
Line After List: Sample text words words words\r\n

print(md_text)

Actual Output (md_text printed):

Line Before List: Sample text words words words:
- Bulleted Item 1: Sample text words words words
- Bulleted Item 2: Sample text words words words
  
  Line After List: Sample text words words words
  Line After List: Sample text words words words

Actual Output (md_text with line-breaks visualized):

Line Before List: Sample text words words words:\r\n
- Bulleted Item 1: Sample text words words words\n
- Bulleted Item 2: Sample text words words words\n
  \n
  Line After List: Sample text words words words\n
  Line After List: Sample text words words words

As shown, the conversion ends up replacing:

\r\n in the list with \n
\r\n\r\n at the end of the list with \n \n
\r\n after the list with \n
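One way to confirm the mix of line endings summarized above is to count each style in a string. The sketch below is only an illustration: `count_line_endings` is a hypothetical helper (not part of jira2markdown), and the sample string is an abbreviated transcription of the actual output shown earlier, not fresh output from the library.

```python
def count_line_endings(text: str) -> dict:
    """Count CRLF and bare LF line breaks in text (hypothetical helper)."""
    crlf = text.count("\r\n")
    # Every CRLF also contains an LF, so subtract those to get bare LFs.
    lf = text.count("\n") - crlf
    return {"crlf": crlf, "lf": lf}


# Abbreviated version of the reported actual output: the first line keeps
# its CRLF, while every later line break has become a bare LF.
actual_output = (
    "Line Before List:\r\n"
    "- Bulleted Item 1\n"
    "- Bulleted Item 2\n"
    "  \n"
    "  Line After List\n"
    "  Line After List"
)

counts = count_line_endings(actual_output)
print(counts)  # {'crlf': 1, 'lf': 4}
```

A result with both counts non-zero indicates mixed line endings, which matches the behaviour described in this report.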

Copy-and-Pasteable Snippet to replicate the issue:

from jira2markdown import convert

# Input with CRLF line-breaks 
jira_text = 'Line Before List: Sample text words words words:\r\n * Bulleted Item 1: Sample text words words words\r\n * Bulleted Item 2: Sample text words words words\r\n\r\nLine After List: Sample text words words words\r\nLine After List: Sample text words words words'

# Print input with line-breaks rendered
print("\njira_text:\n" + jira_text)

# Print input with line-breaks represented, not rendered
print("\nrepr(jira_text):\n" + repr(jira_text))

md_text = convert(jira_text)

# Print output with line-breaks rendered
print("\nmd_text:\n" + md_text)

# Print output with line-breaks represented, not rendered
print("\nrepr(md_text):\n" + repr(md_text))

arctus-io · 2024-08-19T20:31:12Z

Happy to try and find/fix the issue if it is not a trivial fix on your end.

catcombo · 2024-09-07T12:34:37Z

Hi @arctus-io!

Thank you very much for the detailed issue with the reproducer. Sorry for the late response. I prepared a fix in #28. Could you please test it?

arctus-io · 2024-09-18T00:26:03Z

@catcombo: Thanks for the fix! I can confirm #28 fixes this issue.

However... while testing this fix I noticed that the same type of issue regarding \r\n line endings is causing problems with table conversions as well.

I can submit another issue with more details if needed, but I believe an across-the-board change that treats \r\n as equivalent to \n would fix this.

My current workaround is to convert all \r\n line endings to \n before passing the text to jira2markdown.
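A minimal sketch of that workaround, assuming a plain string replacement is sufficient. `normalize_line_endings` is a hypothetical helper name, not something jira2markdown provides, and the sample input is shortened for illustration.

```python
def normalize_line_endings(text: str) -> str:
    """Replace every CRLF line break with a bare LF (hypothetical helper)."""
    return text.replace("\r\n", "\n")


# Shortened sample input with CRLF line breaks, similar to the reproducer.
jira_text = "Intro:\r\n * Item 1\r\n * Item 2\r\n\r\nAfter list"
normalized = normalize_line_endings(jira_text)
print(repr(normalized))

# With jira2markdown installed, the normalized text would then be converted:
# from jira2markdown import convert
# md_text = convert(normalized)
```

Note that this discards the original CRLF style, so the converted Markdown will use LF line endings throughout.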

catcombo · 2024-09-18T11:08:15Z

@arctus-io Thanks for the feedback! The easiest way to solve this problem is to replace \r\n with \n in the convert function before applying markup conversion. But I would like to find a way to fix it at the pyparsing level. It may take some time. Could you give me an example of the table conversion so I have more test cases?

catcombo linked a pull request Sep 7, 2024 that will close this issue

Feature/fix list line endings #28

Open
