XML parser error in PHP7 #69

Open
cburschka opened this issue May 24, 2016 · 2 comments

Comments


cburschka commented May 24, 2016

I traced a mysterious hang-up during authentication to the following problem:

  1. The client successfully negotiates TLS.
  2. On reading <proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>, the client resets the parser.
  3. The next string that is supposed to hit the newly initialized parser is <?xml version='1.0'?><stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' id='...' from='...' version='1.0' xml:lang='en'>
  4. Even though nothing else should have been parsed yet (and xml_get_current_byte_index returns 0), the parser reports Reserved XML Name. That error usually means that <?xml version='1.0'?> was preceded by some other input, which implies that something is "polluting" the parser between the reset and the arrival of that string.
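
To illustrate step 4 in isolation, here is a minimal sketch that feeds the same kind of input to a fresh parser twice, once cleanly and once after a single stray byte. The helper parse_chunks() and the shortened <stream> tag are made up for this example and are not part of JAXL. The exact error code and message depend on the XML library PHP was built against; on libxml2-backed builds the message is likely the one quoted above.

```php
<?php
// Hypothetical helper (not part of JAXL): feed chunks to a fresh parser
// without finalizing, and return the parser's error code (0 means no error).
function parse_chunks(array $chunks): int {
    $parser = xml_parser_create();
    foreach ($chunks as $chunk) {
        // is_final = false, as in JAXLXmlStream::parse()
        xml_parse($parser, $chunk, false);
    }
    return xml_get_error_code($parser);
}

// Shortened stand-in for the real stream header.
$decl = "<?xml version='1.0'?><stream>";

// Declaration as the very first input.
$clean = parse_chunks([$decl]);

// One stray whitespace byte fed first: the declaration is no longer at the start.
$polluted = parse_chunks([" ", $decl]);

printf("clean: %d, polluted: %d (%s)\n", $clean, $polluted, xml_error_string($polluted));
```

If the second call reports an error while the first does not, that matches the theory that some earlier input reaches the parser after the reset. It does not explain where that input comes from.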

Unfortunately I couldn't find out what that is, but the following hacky workaround makes sure that any string starting with <? resets the parser again:

diff --git a/core/jaxl_xml_stream.php b/core/jaxl_xml_stream.php
index 1c2a70d..42f2f88 100644
--- a/core/jaxl_xml_stream.php
+++ b/core/jaxl_xml_stream.php
@@ -89,6 +89,7 @@ class JAXLXmlStream {
        }

        public function parse($str) {
+               if (strlen($str) > 2 && $str[0] == '<' && $str[1] == '?') $this->reset_parser();
                xml_parse($this->parser, $str, false);
        }

semoriil commented Oct 30, 2016

Could that "other input" be a byte order mark (BOM)? It's unnecessary for UTF-8, but it does occur. And it can give strange results, since you can't see it in a viewer but the parser sure finds it...

cburschka (Author) commented

I haven't examined it deeply enough to rule that out completely. But I don't think any bytes could be hiding in front of the <?xml string itself: as far as I know, $str[0] == '<' operates on the byte level, so if a BOM came first the check would fail and the fix wouldn't work.

I suppose it'd have to be in a separate call to xml_parse(), which I haven't been able to find.
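
The byte-level point can be checked with a small sketch. Both helper functions are hypothetical names invented for this example. The second one just mirrors the condition from the workaround diff.

```php
<?php
// Hypothetical helper: true if the chunk starts with the UTF-8 BOM bytes EF BB BF.
function starts_with_utf8_bom(string $str): bool {
    return strncmp($str, "\xEF\xBB\xBF", 3) === 0;
}

// Mirrors the condition used in the workaround diff.
function looks_like_xml_decl(string $str): bool {
    return strlen($str) > 2 && $str[0] == '<' && $str[1] == '?';
}

$plain = "<?xml version='1.0'?>";
$withBom = "\xEF\xBB\xBF" . $plain;

var_dump(looks_like_xml_decl($plain));    // bool(true)
var_dump(looks_like_xml_decl($withBom));  // bool(false): $str[0] is the byte 0xEF
var_dump(starts_with_utf8_bom($withBom)); // bool(true)
```

Since the workaround does trigger on the stream header, the chunk passed to parse() must begin with the < byte itself, so a BOM could only have arrived in an earlier, separate chunk.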
