Clarify how parse_in_place will mutate the buffer #346

Open
huangqinjin opened this issue Dec 18, 2022 · 2 comments

Comments

@huangqinjin

First of all, thank you for making such an excellent C++ YAML library!

From Quick start,

// Parse YAML code in place, potentially mutating the buffer.

Could you please explain more on this?

  • Will the parsing action itself mutate the buffer?
  • Or it only means you can modify existing nodes through operator<< and operator=, and the modification will be written to the buffer?
  • Does it mean that using const ryml::Tree tree will never modify the buffer, even if it was parsed by parse_in_place?

My use case: I have a large YAML document memory-mapped as read-only, and I want to parse it without copying it.

@biojppm
Owner

biojppm commented Dec 18, 2022

> First of all, thank you for making such an excellent C++ YAML library!

Thanks, this is really appreciated.

Those are very good questions, and the documentation should be explicit about these points.

> Will the parsing action mutate the buffer?

Yes, generally.

parse_in_place() requires mutable memory. Currently, any scalar that requires filtering (plain, block, flow, double-quoted or single-quoted) is filtered in place. There is one case where nothing is modified: when no scalar has characters to be filtered, the original buffer is left unchanged throughout the parse. However, there is no way to know in advance whether that will be the case, so the API always asks for a mutable buffer. That's why parse_in_place() only accepts the mutable substr.

Also, parse_in_arena() is just a wrapper over parse_in_place(): it receives the immutable csubstr, copies it to the tree's arena, and then calls parse_in_place() on that copy.
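To make "filtered in place" concrete, here is a minimal toy sketch. It is not ryml's code and does not use ryml's API: the function names are hypothetical, and it only handles one filtering rule, the '' escape inside the body of a single-quoted scalar. It shows the two behaviours described above: the buffer is overwritten only when something has to be filtered, and the arena variant copies the immutable input before running the same in-place pass.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Toy model only: NOT ryml's implementation. It handles a single rule,
// the '' escape inside the body of a single-quoted YAML scalar.

// Collapse each '' pair into one ' by overwriting the buffer itself.
// Returns the filtered length; bytes past that length keep their old values.
std::size_t filter_single_quoted_in_place(char *buf, std::size_t len)
{
    std::size_t w = 0; // write position, never ahead of the read position
    for (std::size_t r = 0; r < len; ++r)
    {
        const char c = buf[r];
        if (c == '\'' && r + 1 < len && buf[r + 1] == '\'')
            ++r; // skip the first quote of the pair
        if (w != r)
            buf[w] = c; // write only once something has actually shifted
        ++w;
    }
    return w;
}

// Hypothetical stand-in for a "copy to the arena, then filter in place"
// wrapper. The immutable source is never written to.
// The returned view points into the arena, so it is only valid until the
// arena is modified again.
std::string_view filter_in_arena(std::string_view src, std::string &arena)
{
    const std::size_t start = arena.size();
    arena.append(src.data(), src.size()); // the copy
    const std::size_t n = filter_single_quoted_in_place(&arena[start], src.size());
    return std::string_view(arena.data() + start, n);
}

// Small self-check of both paths.
void demo_filtering()
{
    char needs_filter[] = "it''s";
    const std::size_t n = filter_single_quoted_in_place(needs_filter, 5);
    assert(std::string_view(needs_filter, n) == "it's"); // buffer was rewritten

    char untouched[] = "plain";
    filter_single_quoted_in_place(untouched, 5);
    assert(std::string_view(untouched, 5) == "plain"); // nothing to filter
}
```

In this toy, a scalar without escapes never triggers a write, which mirrors the "no characters to be filtered" case. A real parser cannot promise that up front, which is why the mutable buffer is required regardless.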

Now, if all you have is immutable memory and cannot afford the copy to the arena, and you are 100% certain that no scalar filtering is required, you could hard-cast the pointer from immutable to mutable and just call parse_in_place(). But that would be very dangerous territory. You know, there be dragons and so on.

> Or it only means you can modify existing nodes through operator<< and operator=, and the modification will be written to the buffer?

No, it is not an either/or: the previous point is unrelated to any later use of operator<< or operator=. operator= points the node at the given memory, while operator<< first serializes to the tree's arena (allocating as needed, unless capacity was reserved beforehand). The source buffer may or may not be in the tree's arena. If it is, operator<< does not write into the buffer; it appends at the end of the arena. That may relocate the buffer if the arena capacity has to grow, but even then the contents of the buffer do not change.
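The append-only behaviour can be pictured with a small toy arena. This is a sketch under assumptions, not ryml's arena: ToyArena, Span, append and view are hypothetical names, and std::string stands in for the arena storage. It illustrates that appending may move the storage to a new address, while bytes that were already there keep their values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Toy model only: NOT ryml's arena. Entries are tracked as (offset, length)
// so that they remain meaningful if the storage is reallocated.
struct ToyArena
{
    struct Span
    {
        std::size_t offset;
        std::size_t length;
    };

    std::string storage;

    // Append-only: bytes already in the arena are never overwritten.
    Span append(std::string_view text)
    {
        const Span s{storage.size(), text.size()};
        storage.append(text.data(), text.size());
        return s;
    }

    // Build a view on demand; a view taken before an append may dangle
    // afterwards, because the storage may have been relocated.
    std::string_view view(Span s) const
    {
        return std::string_view(storage.data() + s.offset, s.length);
    }
};

// Returns true if the earlier entry reads back unchanged after a large append.
bool demo_arena_append()
{
    ToyArena arena;
    // Stands in for a source buffer that lives in the arena.
    const ToyArena::Span source = arena.append("foo: 1");
    const std::string before(arena.view(source));

    // Stands in for serializing a new value; large enough to force growth.
    const ToyArena::Span added = arena.append(std::string(4096, 'x'));

    return arena.view(source) == before        // old contents unchanged
        && added.offset == source.length       // new data went to the end
        && arena.view(added).size() == 4096;
}
```

Storing offsets rather than pointers is a design choice of this toy so that relocation is harmless; it says nothing about how ryml tracks its own scalars.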

> Does it mean that using const ryml::Tree tree will never modify the buffer even if it is parsed by parse_in_place?

Yes, pretty much - but only after parsing (during parsing the Tree must not be const, for obvious reasons). If a const Tree can modify the buffer, then that's an API oversight or a bug, and I'd like to hear about it.


Feel free to ask for further clarification if the above does not quite answer your questions. I will improve the quickstart to explain these points in more detail, and I will keep this issue open to track that.

@huangqinjin
Author

Thanks for your very clear explanation! I understand now.
