Issue #2212 Enhances validation of HTTP header names #2213
@carryel Do you know whether the validation should also be applied to the header value? I tested including the invalid characters on the value side, and those requests are processed. Example:
Cases that are currently prevented:
In all cases, when the header is added with a value containing the invalid characters, the request is processed. |
@breakponchito As you know, header names and header values are validated differently: the rules for the header value are slightly less strict than those for the header name. The header name requires token validation, while the header value requires field-content validation. However, I made only the minimum patch because I was worried about backward compatibility and possible side effects. There are many related RFCs with slight differences between them, but a few extracts give roughly the following.
|
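To make the name/value distinction concrete, here is a minimal sketch of the two character classes as RFC 9110 defines them (`token` for field names, `VCHAR` / `obs-text` plus SP and HTAB for field values). The class and method names are hypothetical and this is not the code in the patch; it also ignores the rule that a value must not begin or end with whitespace, which parsers usually handle by trimming.

```java
// Hypothetical sketch, not Grizzly code: RFC 9110 character classes for
// field names (token) and field values (field-content).
final class HeaderSyntaxSketch {
    // tchar symbols from RFC 9110 section 5.6.2 (DIGIT and ALPHA are checked separately).
    private static final String TCHAR_SYMBOLS = "!#$%&'*+-.^_`|~";

    static boolean isTokenChar(int c) {
        return (c >= '0' && c <= '9')
                || (c >= 'a' && c <= 'z')
                || (c >= 'A' && c <= 'Z')
                || TCHAR_SYMBOLS.indexOf(c) >= 0;
    }

    // A field name is a non-empty token.
    static boolean isValidName(String name) {
        if (name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            if (!isTokenChar(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // SP, HTAB, VCHAR (0x21-0x7E) or obs-text (0x80-0xFF); CR, LF, NUL and other CTLs are excluded.
    static boolean isFieldValueChar(int c) {
        return c == ' ' || c == '\t' || (c >= 0x21 && c <= 0x7E) || (c >= 0x80 && c <= 0xFF);
    }

    static boolean isValidValue(String value) {
        for (int i = 0; i < value.length(); i++) {
            if (!isFieldValueChar(value.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Under these rules a space or colon is rejected in a name but a space is accepted inside a value, which is the sense in which the value check is "less strict".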
…alidation against invalid characters RFC-9110
Adding properties
+ Added the validation of HTTP header values
+ Name and value validation are provided as options.
(Improved based on #1 @breakponchito)
Some tests fail; it seems to be something related to the network connection. @carryel Can you try running the tests again? |
+ trivial) updated license and re-trigger status checks(eclipse-ee4j#2213).
@breakponchito It passed all checks except for the reviews. :( Thanks for the notification. :) |
I ran a manual test with an application, with the new Grizzly property enabled to validate header values, but without success: the request resolves the header and continues on to get the resource. Response:
probably I'm missing something to reject the response |
@breakponchito The examples below may help you analyze your testcases.
For the first command, it appears that it was treated as a string of '\' char(0x5c) and 'r' char(0x72), not '\r' char(0x0d). |
To explain a little more, the header value can have the following specifications, which are slightly different from the header name.
I interpreted that '\' char(0x5c) and 'r' char(0x72) belong to the above VCHAR. |
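The difference between the two-character sequence and the control character can be shown with a small sketch. The `hex` helper below is hypothetical and only prints the code units of a string; it says nothing about how curl or the shell quoted the original commands.

```java
// Hypothetical helper: dump each character of a string as a hex code.
final class EscapeSketch {
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex("\\r")); // backslash followed by 'r': two VCHARs
        System.out.println(hex("\r"));  // a single carriage return
    }
}
```

The first string is two visible characters and is a legal field value; only the second contains the CR that the validation is meant to catch.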
Yep, I see how to test correctly now, instead of just sending the textual representation of each character. With the following curl commands, each character is escaped and validated against the new Grizzly validations:
|
Hey @carryel, based on what I reviewed with my team: what is the reason for not protecting the header value content against a standalone newline character?
"Field values containing CR, LF, or NUL characters are invalid and dangerous, due to the varying ways that implementations might parse and interpret those characters; a recipient of CR, LF, or NUL within a field value MUST either reject the message or replace each of those characters with SP before further processing or forwarding of that message. Field values containing other CTL characters are also invalid; however, recipients MAY retain such characters for the sake of robustness when they appear within a safe context (e.g., an application-specific quoted string that will not be processed by any downstream HTTP parser)."
As we understand it, following the RFC-9110 definition, we need to protect against that character. |
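The two options the quoted text gives a recipient (reject, or replace with SP) could look roughly like this. It is only a sketch with hypothetical names, not what the patch does.

```java
// Hypothetical sketch of the two recipient options for CR, LF, or NUL in a field value.
final class FieldValueCtlSketch {
    // Option 1: detect the characters so the caller can reject the message.
    static boolean hasCrLfOrNul(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\r' || c == '\n' || c == '\0') {
                return true;
            }
        }
        return false;
    }

    // Option 2: replace each CR, LF, or NUL with SP before further processing.
    static String replaceCrLfNulWithSpace(String value) {
        return value.replace('\r', ' ').replace('\n', ' ').replace('\0', ' ');
    }
}
```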
@breakponchito Hi. Actually, LF (x0A, '\n') is a bit special. LF itself is not included in the header value content, nor is this char used as a header value. The line terminator of an HTTP header is CRLF (x0D x0A) according to the official specification. However, I think most HTTP servers treat a standalone LF as a line terminator as well. The link below may help explain a little.
When I simply tested LF with NGINX, it was like this.
Patched Grizzly may also appear to allow LF, but the Header Value content does not include LF, and is processed as empty and "h" respectively. I think there would be an issue if Grizzly returned or allowed "\nhello" or "h\nello" as the value of "X-MyHeader". In conclusion, it is not that LF is not prevented, but LF is processed as CRLF during the parsing process, and the final field values to be utilized do not include LF. |
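A simplified sketch of that behaviour, assuming a parser that accepts either CRLF or a bare LF as the line terminator (RFC 9112 section 2.2 permits a recipient to do this). The class is hypothetical and far simpler than Grizzly's actual parser; it only shows why the LF never ends up inside the value.

```java
// Hypothetical sketch: a lenient parser that ends a header line at CRLF or at a bare LF.
final class LenientLineSplitSketch {
    // Returns the value of the first header line in the raw input.
    static String firstHeaderValue(String raw) {
        int lf = raw.indexOf('\n');
        String line = lf < 0 ? raw : raw.substring(0, lf);
        if (line.endsWith("\r")) {
            line = line.substring(0, line.length() - 1); // drop the CR of a CRLF pair
        }
        int colon = line.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("no colon in first line");
        }
        return line.substring(colon + 1).trim();
    }
}
```

With this sketch, `"X-MyHeader: \nhello\r\n"` yields an empty value and `"X-MyHeader: h\nello\r\n"` yields `h`; the text after the LF is seen as the start of the next line rather than as part of the value.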
@arjantijms @carryel Do you know who can review and approve? |
@breakponchito Unfortunately, I am not sure who among the current Grizzly project members can review. Looking at the latest commits and activities, @arjantijms seems the most appropriate, but I am curious if he can review or what the current grizzly project operation status is. Related issue: #2211 |
@carryel Hi again, to continue with the discussion for the RFC-9110 header validations for reference:
@carryel @dmatej @arjantijms let me know your thoughts |
// Fast for correct values, slower for incorrect ones
try {
    return isText[c];
} catch (ArrayIndexOutOfBoundsException ex) {
Isn't it better to check if you are in the 0 - isText.length range than "exception driven code"?
Yes. I agree that checking the range is better than throwing an exception. The reason for doing this is to maintain similarity with other existing code.
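For comparison, the two styles can be sketched side by side. The lookup table contents and the `false` fallback in the catch block are assumptions made for illustration; the real `isText` table and catch body are not shown in this thread.

```java
// Hypothetical sketch comparing exception-driven lookup with an explicit range check.
final class CharTableSketch {
    // Assumed table: HTAB and the printable ASCII range are "text".
    private static final boolean[] IS_TEXT = new boolean[128];

    static {
        for (int c = 0x20; c < 0x7F; c++) {
            IS_TEXT[c] = true;
        }
        IS_TEXT['\t'] = true;
    }

    // Style under review: rely on the exception for out-of-range input.
    static boolean isTextExceptionDriven(int c) {
        try {
            return IS_TEXT[c];
        } catch (ArrayIndexOutOfBoundsException ex) {
            return false; // assumed fallback
        }
    }

    // Suggested style: check the index against the table bounds first.
    static boolean isTextRangeChecked(int c) {
        return c >= 0 && c < IS_TEXT.length && IS_TEXT[c];
    }
}
```

Both methods return the same result for every `int`; the range-checked version simply avoids constructing an exception for out-of-range characters.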
Great opportunity to stop this madness.
- @dmatej Strict RFC6265 cookie handling and Servlet 6 attributes support #2164 😁 (I hoped it was since initial contribution)
But for the comment:
grizzly/modules/http/src/main/java/org/glassfish/grizzly/http/util/CookieHeaderParser.java
Lines 33 to 34 in 4316ca1
* <p>Implementation note:<br>
* This class has been carefully tuned. </p>
need to dig at origin
grizzly/modules/http/src/main/java/org/glassfish/grizzly/http/util/CookieHeaderParser.java
Line 36 in 4316ca1
* @author The Tomcat team
The reason for doing this is to maintain similarity with other existing code.
👍
+ to investigate the change in both places later (perhaps never)
@breakponchito Thanks for reviewing so many RFCs. I'll try to reflect the correction about restricting single LF. |
I don't currently have the knowledge or the time to learn all those specifications, so I will trust you for now. Just one note: obsoleted specs are not always completely abandoned by the industry, and if we simply forbid some features, we may run into compatibility issues. On the other hand, if compatibility means compromising security, we should at least push in this direction. |
+ Field values cannot have a single LF as per RFC-9110(https://www.rfc-editor.org/rfc/rfc9110.html#section-5.5).
+ Additionally, this patch automatically removes support for multiple lines of http headers as mentioned in RFC-2616(https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2).
Note) The features related to this issue eclipse-ee4j#2212 only work when the STRICT_HEADER_NAME_VALIDATION_RFC_9110 and STRICT_HEADER_VALUE_VALIDATION_RFC_9110 options are enabled, and the existing code base behavior is maintained when options are not enabled.
+ Instead of throwing ArrayIndexOutOfBoundsException when judging Token and Text char, it ensures checking within the array range.
+ Since several existing http testcases do not respect CRLF in header values, I modified them to comply with the spec where it was not intentional.
This patch passed all grizzly existing testcases locally, depending on the presence of options.
> mvn clean test
All passed.
> mvn clean test -Dorg.glassfish.grizzly.http.STRICT_HEADER_NAME_VALIDATION_RFC_9110=true -Dorg.glassfish.grizzly.http.STRICT_HEADER_VALUE_VALIDATION_RFC_9110=true
All passed.
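As an illustration of how opt-in flags like these are commonly read in Java, the sketch below uses `Boolean.getBoolean`, which returns `true` only when the named system property is set to "true". The property names come from the commands above; the class itself is hypothetical, and how Grizzly actually reads the options is not shown in this thread.

```java
// Hypothetical sketch: reading the two opt-in flags as JVM system properties.
final class StrictValidationFlagsSketch {
    static final String NAME_FLAG =
            "org.glassfish.grizzly.http.STRICT_HEADER_NAME_VALIDATION_RFC_9110";
    static final String VALUE_FLAG =
            "org.glassfish.grizzly.http.STRICT_HEADER_VALUE_VALIDATION_RFC_9110";

    // Both default to false, so the existing behavior is kept unless a flag is set.
    static boolean strictNames() {
        return Boolean.getBoolean(NAME_FLAG);
    }

    static boolean strictValues() {
        return Boolean.getBoolean(VALUE_FLAG);
    }
}
```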
@breakponchito The content of header value has been modified to no longer allow LF. More details are left in the commit log of d258c8b. |
Unclear target branch - should it be |
@pzygielo When I look at the source tree, it looks like main is correct. I'm wondering if I should clean up the PR based on main again, or if I should merge it to master first and then move it to main. |
Moved to #2219 |
This is a PR for Issue #2212.
The patch contains two changes.
\r\n and it is an incomplete packet, ignore it and proceed with parsing. At this time, I was wondering whether to allow not only \r\n but also a single \n, so for now, I decided to allow only \r\n.
Overall, I patched it so that there would be no major changes while maintaining the existing logic, and added related test cases.