-
Notifications
You must be signed in to change notification settings - Fork 355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Addresses Content-Type mismatch on HTTP 301 Moved Permanently with FileGetContents #709
Conversation
Based on our extensive test suite, I can confirm that this patch solves the reported issue 👍 Although it also exposes the performance bottleneck that comes with external references when they have to be fetched synchronously in a single-threaded fashion. 👎 Without patch with 3.0.0-without-%24id.json
With patch with 3.0.0.json.
So it is good that the problem gets fixed but for us, 💯 but it is better if we stick with https://github.com/asyncapi/spec-json-schemas/blob/master/schemas/3.0.0-without-%24id.json instead of https://github.com/asyncapi/spec-json-schemas/blob/master/schemas/3.0.0.json. |
So if I understand your comments correctly everything works but in your test suite you need to shift to another specification due to time constraints? I've tried to understand the phpunit outputs and I can see the patch adds a 47 seconds time increase, but that can also be becuase of the different schema you're loading. WHat would be the In addition I've played around with the schema that is being fetched in the added test but there seems to be no impact on the timing of the test itself no matter which schema I pick. |
I also believe the different schema causes/caused the delay, precisely that external references have to be resolved in the https://github.com/asyncapi/spec-json-schemas/blob/master/schemas/3.0.0.json spec synchrinously, in a blocking fashion. I have not ran a profiling on the code, but this is my strong feeling. From this PR point of view, even if this is true, this works as designed because fixing this problem would require some refactoring - I assume. Also, admittedly, my test suite is stress testing this code because it parses several API specs with multiple data providers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand the performance problem being flagged in the comments, but it seems to me it is a separate issue and should not block this. I would recommend re-thinking the tests for this though.
IMHO this is enough of an edge case to not block 6.0 release.
{ | ||
$res = new FileGetContents(); | ||
|
||
$res->retrieve('http://asyncapi.com/definitions/2.0.0/asyncapi.json'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this is, I think, to be avoided. We don't have control over that website or server configuration and it could easily break. Can we perform the same test by mocking a response and not requiring a real external network call? Tests should ideally run without an internet connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree with your point to some extend. In order to test we actually need a schema which is available using an HTTP 301 redirect. My point for using this specific URL is because it is relative safe as it belongs to an organisation and we point to their definition which would not change quickly.
// See http://php.net/manual/en/reserved.variables.httpresponseheader.php for more info. | ||
$this->fetchContentType($http_response_header); // @codeCoverageIgnore | ||
} else { // @codeCoverageIgnore | ||
$headers = strpos($uri, 'http') === 0 ? get_headers($uri) : array(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems bad to me to switch to get_headers as, as far as I understand, this does a second HTTP request just to fetch the headers. Whereas $http_response_header is populated by file_get_contents in a single HTTP request.
So I'm not exactly sure what this fixes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok reading the issue again, I see what needs solving is fetchContentType needs to stop doing the early return. You need to iterate through all the headers and get the last content-type header's value, as that will be the final response. See https://github.com/composer/composer/blob/main/src/Composer/Util/RemoteFilesystem.php#L173-L177 for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your points and only realise now that indeed a secondary call is made. This triggered me to look further into the issue and found an override for file_get_contest
was made in the test causing the $http_response_headers
to not be set in the tests. Further looking the test that depended on the override were having different behaviour between the test and runtime as file_get_contents
can return false without invoking the defined error handler. Removing the test and the override seemed like to correct way forward.
Overriding file_get_contents introduces different behaviour from the native function, such as the $http_response_headers missing. Removing the override revealed two test (testing a specific return path of the file_get_contents method rather then testing the behaviour or result of the subject under test) which had different behaviour between the test and runtime, therefor these tests have been removed.
…g matches on HTTP 301 redirect headers which are listed first
This can also be the result of using the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't look at tests but the fix itself lgtm
This PR attempts to resolve issue #694, in order to do so three commits where added
$http_response_header
[link] withget_headers()
[link]Content-Type
header.