Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure body is consumed only once (alternative to #847) #851

Merged
merged 1 commit into from
Aug 2, 2024

Conversation

sathieu
Copy link
Contributor

@sathieu sathieu commented Jul 7, 2024

Fixes: #846

@sathieu
Copy link
Contributor Author

sathieu commented Jul 7, 2024

@hartwork Here is another way to fix #846. Probably better, because at the same place as streams handling.

@codecov-commenter
Copy link

codecov-commenter commented Jul 13, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 78.94737% with 4 lines in your changes missing coverage. Please review.

Project coverage is 91.96%. Comparing base (298a693) to head (ffb8c07).
Report is 14 commits behind head on master.

Files Patch % Lines
vcr/util.py 63.63% 2 Missing and 2 partials ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #851      +/-   ##
==========================================
- Coverage   92.32%   91.96%   -0.37%     
==========================================
  Files          27       27              
  Lines        1811     1829      +18     
  Branches      338      293      -45     
==========================================
+ Hits         1672     1682      +10     
- Misses         92       96       +4     
- Partials       47       51       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sathieu thanks for this new approach. Please see the details I found during review, below:

tests/unit/test_util.py Outdated Show resolved Hide resolved
tests/unit/test_stubs.py Show resolved Hide resolved
data2,
data3,
):
testpath = str(tmpdir.join(testfile))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason why we want control over the precise filename here? If not, with NamedTemporaryFile(...) would work:

  • suffix=".yml" allows to control the file extension if needed
  • dir=tmpdir allows using the directory we have been given.

It would resolve one no-longer-necessary parameter. I think it could help simplicity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to use TemporaryDirectory instead, because vcrpy fails when cassette is an empty file.

Copy link
Collaborator

@hartwork hartwork Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sathieu interesting! I got curious, played with it and can confirm that the file cannot exits already when we start. Here's what I did to make it work still:

# git --no-pager diff
diff --git a/tests/unit/test_stubs.py b/tests/unit/test_stubs.py
index c177061..8373923 100644
--- a/tests/unit/test_stubs.py
+++ b/tests/unit/test_stubs.py
@@ -1,7 +1,7 @@
 import contextlib
 import http.client as httplib
 from io import BytesIO
-from tempfile import TemporaryDirectory
+from tempfile import NamedTemporaryFile
 from unittest import mock
 
 from pytest import mark
@@ -52,8 +52,12 @@ class TestVCRConnection:
         data2,
         data3,
     ):
-        with TemporaryDirectory(dir=tmpdir) as cassete_dir:
-            testpath = str(cassete_dir.join("cass.yml"))
+        with NamedTemporaryFile(dir=tmpdir, suffix='.yml') as f:
+            testpath = f.name
+            # NOTE: ``use_cassette`` is not okay with the file existing
+            #       already.  So we are using ``.close()`` to not only
+            #       close but also delete the empty file, before we start.
+            f.close()
             host, port = httpbin.host, httpbin.port
             match_on = ["method", "uri", "body"]
             with use_cassette(testpath, match_on=match_on):

I'm not saying that's better, maybe it's not 🤷 How about adding a sentence of comment why TemporaryDirectory is prefered over NamedTemporaryFile in our case? Something like that would maybe be best?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, changed to your suggestion. thanks!

tests/unit/test_stubs.py Outdated Show resolved Hide resolved
tests/unit/test_util.py Outdated Show resolved Hide resolved
vcr/util.py Outdated
@@ -92,6 +92,21 @@ def composed(incoming):
def read_body(request):
if hasattr(request.body, "read"):
return request.body.read()
if hasattr(request.body, "__iter__") and not isinstance(
request.body,
(bytearray, bytes, dict, list, str),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This very 5-types list exists in two places in the code base now, so there is risk of the two going out of sync. Could this be (a) extracted to a non-public constant-like symbol somewhere or (b) commented about keeping in sync on both sides? (a) would likely be preferable. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a new is_iterator function. WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sathieu the function idea and the function body I like, but the name is a bit misleading and I'm worried that this de-facto internal helper function is becoming part of the public API this way. How about _is_nonsequence_iterator? The naming idea is based on collections.abc.Sequence and it makes me realize as a side effect that we could make replace isinstance(request.body, (bytearray, bytes, dict, list, str)) by isinstance(request.body, Sequence). What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to _is_nonsequence_iterator.

Unfortunately, isinstance(request.body, Sequence) is breaking a few tests...

vcr/util.py Show resolved Hide resolved
vcr/util.py Outdated Show resolved Hide resolved
vcr/util.py Outdated Show resolved Hide resolved
tests/unit/test_stubs.py Show resolved Hide resolved
@hartwork hartwork changed the title Ensure body is consumed only once Ensure body is consumed only once (alternative to #847) Jul 13, 2024
Copy link
Contributor Author

@sathieu sathieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hartwork for your review. I've implemented your suggestions. Back to you!

data2,
data3,
):
testpath = str(tmpdir.join(testfile))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to use TemporaryDirectory instead, because vcrpy fails when cassette is an empty file.

tests/unit/test_stubs.py Outdated Show resolved Hide resolved
tests/unit/test_stubs.py Show resolved Hide resolved
tests/unit/test_stubs.py Show resolved Hide resolved
tests/unit/test_util.py Outdated Show resolved Hide resolved
tests/unit/test_util.py Outdated Show resolved Hide resolved
vcr/util.py Outdated
@@ -92,6 +92,21 @@ def composed(incoming):
def read_body(request):
if hasattr(request.body, "read"):
return request.body.read()
if hasattr(request.body, "__iter__") and not isinstance(
request.body,
(bytearray, bytes, dict, list, str),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a new is_iterator function. WDYT?

vcr/util.py Outdated Show resolved Hide resolved
vcr/util.py Show resolved Hide resolved
vcr/util.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sathieu we're getting there, please see my new comments below above for the remaining small semi-lose ends 🍻

Fixes: kevin1024#846
Signed-off-by: Mathieu Parent <math.parent@gmail.com>
@sathieu
Copy link
Contributor Author

sathieu commented Jul 21, 2024

@hartwork Sorry for the delay, the mail was lost in my inbox 😉 ...

Back to you, see my above comments ☝️.

Copy link
Collaborator

@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sathieu thanks for the adjustements! This looks ready-to-merge to me. Any concerns from @kevin1024 or @jairhenrique with regard to merging?

@kevin1024
Copy link
Owner

The only contributions I have are (1) Thank you for your work, this looks great! and (2) this PR title sounds like some kind of strange horror movie.

@hartwork hartwork merged commit 71eb624 into kevin1024:master Aug 2, 2024
13 checks passed
@sathieu sathieu deleted the consume-body-once2 branch August 5, 2024 10:12
@sathieu
Copy link
Contributor Author

sathieu commented Aug 5, 2024

@hartwork When will this change be released?

NB: I currently a package removed from Debian testing because of it (see bugs.debian.org#1073420)

@hartwork
Copy link
Collaborator

hartwork commented Aug 6, 2024

@hartwork When will this change be released?

@sathieu I cannot do releases but @kevin1024 can.

NB: I currently a package removed from Debian testing because of it (see bugs.debian.org#1073420)

From a quick look, I'm not sure it's the same issue. Is it?

@sathieu
Copy link
Contributor Author

sathieu commented Aug 7, 2024

@hartwork

From a quick look, I'm not sure it's the same issue. Is it?

Yes. There are two bugs :

  • The consumed only once body🧟 (i.e. this merged PR)
  • A change in urllib3 wordings (Cannot connect to proxy. ➡️ Unable to connect to proxy, see aa73e725)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Chunked body as an iterator doesn't work when body is a streamed iterator
4 participants