How to avoid Nokogiri::HTML.parse behavior #2998
but
How do I work around this bug? without
Replies: 1 comment 1 reply
@YusukeSuzuki Thanks for asking this question. What you're seeing is how libxml2 (the underlying HTML4 parser) constructs a document around this fragment.

I'm curious why you don't want to use DocumentFragment, as this is exactly the use case it addresses. Neither <div>text</div> nor text is a Document.

You may also want to use Nokogiri::HTML5, which uses libgumbo instead of libxml2, and that library follows the precise rules in the HTML5 spec around document structure:

Nokogiri::HTML5.parse('<div>text</div>').to_html
# => "<html><head></head><body><div>text</div></body></html>"
Nokogiri::HTML5.parse('text').to_html
# => "<html><head></head><body>text</body></html>"

But really, again, I suggest you consider using fragment parsing:

Nokogiri::HTML5.fragment('<div>text</div>').to_html
# => "<div>text</div>"
Nokogiri::HTML5.fragment('text').to_html
# => "text"