Serializing mdast to markdown #64

Open
Enoumy opened this issue Apr 29, 2023 · 9 comments

Comments

@Enoumy

Enoumy commented Apr 29, 2023

(Whoops, I accidentally hit enter before drafting any content for this question. My apologies for the noise!)

Hi! I have a perhaps newbie question. I can use markdown::to_mdast to go from &str -> Node. Is there a function to go back to a string (Node -> &str) in a way that roundtrips?

I came across Node::to_string, and it does seem to convert nodes into a string, but it also drops links, titles, and most other AST nodes, so re-parsing its output yields a different AST. I'm unsure whether this question is within the scope of this crate, but is there an alternative function elsewhere that roundtrips between &str <-> Node? I am also happy to take a stab at implementing this "roundtrippable" unparser function myself, but was wondering whether one already exists.

To clarify, by "roundtripping" I mean that I would write a property-based test asserting that markdown::to_mdast(to_string(node)) == node holds for every node.

Thanks!

@Enoumy Enoumy changed the title Is it possible to roundtrip parse? Is it possible to roundtrip parse between Node <-> mdast? Apr 29, 2023
@Enoumy Enoumy changed the title Is it possible to roundtrip parse between Node <-> mdast? Is it possible to roundtrip parse between mdast nodes <-> strings? Apr 29, 2023
@wooorm
Owner

wooorm commented Apr 29, 2023

No, this is not yet possible, as mdast-util-to-markdown has not been implemented in Rust yet.

You can work on this. Though, it is involved work that takes a while. The good part is that everything has already been implemented in JavaScript.

Finally, “complete” roundtripping (toString(fromString(x)) == x) is impossible with ASTs. ASTs are abstract. They lose information. That is intentional. So the results will never be exact, but the results will be equivalent.
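To illustrate the point about lost information, here is a minimal sketch using a toy inline AST. Everything in it (`Inline`, `parse`, `serialize`) is hypothetical and is not part of the markdown crate or of mdast-util-to-markdown. It assumes only that `*x*` and `_x_` both mean emphasis, and it does no escaping, so it is only safe for text that contains no marker characters.

```rust
/// A toy inline AST (hypothetical, not the crate's `Node`).
#[derive(Debug, Clone, PartialEq)]
enum Inline {
    Text(String),
    Emphasis(String),
}

/// Parse a string in which `*x*` and `_x_` both mean emphasis.
/// The tree does not record which marker was used.
fn parse(input: &str) -> Vec<Inline> {
    let mut nodes = Vec::new();
    let mut text = String::new();
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c == '*' || c == '_' {
            // Collect everything up to the next occurrence of the same marker.
            let mut inner = String::new();
            let mut closed = false;
            for d in chars.by_ref() {
                if d == c {
                    closed = true;
                    break;
                }
                inner.push(d);
            }
            if closed {
                if !text.is_empty() {
                    nodes.push(Inline::Text(std::mem::take(&mut text)));
                }
                nodes.push(Inline::Emphasis(inner));
            } else {
                // No closing marker: keep the characters as literal text.
                text.push(c);
                text.push_str(&inner);
            }
        } else {
            text.push(c);
        }
    }
    if !text.is_empty() {
        nodes.push(Inline::Text(text));
    }
    nodes
}

/// Serialize the tree, always choosing `*` for emphasis.
/// No escaping is done, so this is only a sketch.
fn serialize(nodes: &[Inline]) -> String {
    nodes
        .iter()
        .map(|node| match node {
            Inline::Text(t) => t.clone(),
            Inline::Emphasis(t) => format!("*{}*", t),
        })
        .collect()
}
```

With these definitions, `parse("_hi_ there")` and `parse("*hi* there")` give the same tree, and serializing that tree gives `*hi* there` in both cases. The original string is not recovered when it used `_`, but re-parsing the output yields an equal tree, which is the "equivalent, not exact" result described above.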

@wooorm wooorm changed the title Is it possible to roundtrip parse between mdast nodes <-> strings? Serializing mdast to markdown Apr 29, 2023
@h7kanna

h7kanna commented May 5, 2023

Would it work to pass the serde_json-serialized tree on to mdast-util-to-markdown?

@wooorm
Owner

wooorm commented May 9, 2023

perhaps

@a-viv-a

a-viv-a commented Aug 18, 2023

I wrote a likely crummy implementation of this for a personal project here, would something like this make sense as a PR or a new crate?

It passes a (much) weaker version of the proptest @Enoumy proposes, where string -> mdast -> string2 -> mdast -> string3 produces an equivalent string2 and string3 (assuming I understand how proptest works 😁 )
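The weaker property can be sketched as a small generic helper. This is only an illustration under assumptions: `second_pass_is_stable`, `toy_parse`, and `toy_serialize` are hypothetical names, and the toy pair below (split on whitespace, join with single spaces) stands in for a real markdown parser and serializer.

```rust
/// Check the weaker roundtrip property:
/// string -> tree -> string2 -> tree -> string3, with string2 == string3.
/// `parse` and `serialize` are supplied by the caller.
fn second_pass_is_stable<T>(
    input: &str,
    parse: impl Fn(&str) -> T,
    serialize: impl Fn(&T) -> String,
) -> bool {
    let string2 = serialize(&parse(input));
    let string3 = serialize(&parse(&string2));
    string2 == string3
}

/// Toy stand-in for a parser: the "tree" is just a list of words.
fn toy_parse(input: &str) -> Vec<String> {
    input.split_whitespace().map(String::from).collect()
}

/// Toy stand-in for a serializer with opinionated formatting:
/// words are always joined by a single space.
fn toy_serialize(words: &Vec<String>) -> String {
    words.join(" ")
}
```

Calling `second_pass_is_stable("a   b\n c", toy_parse, toy_serialize)` checks that the second and third strings agree, even though the first output differs from the input because the original whitespace is normalized away.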

I don't think it covers all the possible nodes mdasts can include, and it applies some opinionated formatting. I also suspect this recursive approach is bad for performance. (I'm learning rust through this project, so I wouldn't be surprised to learn something about this code is very far from best practices)

@wooorm
Owner

wooorm commented Aug 18, 2023

Nice start and welcome to rust :)

  • this project is no_std, and it looks like you’re using a bunch of std?
  • some potential bugs are fine, but it should be good from the start I think, have you looked at mdast-util-to-markdown? it’s battle tested and supports everything. Being mostly compatible across JS and Rust is also important to me!

@a-viv-a

a-viv-a commented Aug 18, 2023

I'll leave this code in my own project then. I found this issue when I was already mostly done with this implementation, so I couldn't look at mdast-util-to-markdown until it was too late. I'll take a look now, but I don't plan to write something new when I have something that works for me.

Edit: if nothing else I need to copy the unsafe character support...

@moy2010

moy2010 commented Nov 16, 2023

@wooorm, do you know why leveraging the ToString implementation for this wouldn't be a good idea? Or is the intention to have a separate method for this?

@wooorm
Owner

wooorm commented Nov 17, 2023

“to string” is already a thing in the mdast world, getting just the text out.
Formatting markdown is complex. And not always needed.
Yes, separate methods. See the first comment. https://github.com/syntax-tree/mdast-util-to-markdown
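A rough sketch of why the two are separate operations, using a simplified stand-in tree. The `Node` enum here is hypothetical and much smaller than the crate's real mdast types, and `to_plain_text`, `to_markdown`, and `sample` are illustrative names. The markdown function does no escaping, which a real serializer would need.

```rust
/// Simplified stand-in for an mdast tree (hypothetical).
enum Node {
    Text(String),
    Link { url: String, children: Vec<Node> },
    Paragraph(Vec<Node>),
}

/// "To string" in the mdast sense: keep only the text content.
fn to_plain_text(node: &Node) -> String {
    match node {
        Node::Text(value) => value.clone(),
        Node::Link { children, .. } | Node::Paragraph(children) => {
            children.iter().map(to_plain_text).collect()
        }
    }
}

/// Serialize back to markdown syntax, keeping link destinations.
/// No escaping of unsafe characters is attempted in this sketch.
fn to_markdown(node: &Node) -> String {
    match node {
        Node::Text(value) => value.clone(),
        Node::Link { url, children } => {
            let label: String = children.iter().map(to_markdown).collect();
            format!("[{}]({})", label, url)
        }
        Node::Paragraph(children) => children.iter().map(to_markdown).collect(),
    }
}

/// Build a small example tree; the URL is a placeholder.
fn sample() -> Node {
    Node::Paragraph(vec![
        Node::Text("See ".to_string()),
        Node::Link {
            url: "https://example.com".to_string(),
            children: vec![Node::Text("the docs".to_string())],
        },
        Node::Text(".".to_string()),
    ])
}
```

For the sample tree, `to_plain_text` gives `See the docs.` while `to_markdown` gives `See [the docs](https://example.com).`, so only the second can be parsed back into a tree that still contains the link.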

@moy2010

moy2010 commented Nov 17, 2023

I see. I will try to work on a PR then 🙂.

5 participants