Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC3986 compatible URL.join honoring empty segments #1039

Merged
merged 36 commits into from
Aug 30, 2024

Conversation

commonism
Copy link
Contributor

@commonism commonism commented Jul 18, 2024

What do these changes do?

The changes re-implement URL.join within yarl.URL instead of using urllib.parse.urljoin to avoid known issues when joining paths with empty segments (python/cpython#84774)

Are there changes in behavior for the user?

Due to the unit tests for URL.join it was required to align the query parameter rendering to match the unit tests, empty parameters are rendered without value therefore instead of an empty value.

Related issue number

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes

urllib.parse.urljoin does not honor empty segments,
python/cpython#84774

implement using joinpath

change query string generation to not emit empty values as ""
Copy link

codecov bot commented Jul 18, 2024

Codecov Report

Attention: Patch coverage is 73.49398% with 22 lines in your changes missing coverage. Please review.

Project coverage is 62.60%. Comparing base (84e1c7d) to head (f94589c).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
tests/test_url.py 59.52% 17 Missing ⚠️
yarl/_url.py 87.80% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1039      +/-   ##
==========================================
+ Coverage   62.37%   62.60%   +0.23%     
==========================================
  Files          38       38              
  Lines        6626     6721      +95     
  Branches      356      368      +12     
==========================================
+ Hits         4133     4208      +75     
- Misses       2466     2486      +20     
  Partials       27       27              
Flag Coverage Δ
CI-GHA 62.57% <73.49%> (+0.23%) ⬆️
MyPy 25.91% <39.02%> (+0.30%) ⬆️
OS-Linux 99.26% <100.00%> (+0.01%) ⬆️
OS-Windows 99.59% <100.00%> (+<0.01%) ⬆️
OS-macOS 99.03% <100.00%> (+0.01%) ⬆️
Py-3.10.11 98.94% <100.00%> (+0.04%) ⬆️
Py-3.10.14 99.15% <100.00%> (+0.04%) ⬆️
Py-3.11.9 99.15% <100.00%> (+0.04%) ⬆️
Py-3.12.5 99.15% <100.00%> (+0.04%) ⬆️
Py-3.13.0-rc.1 99.15% <100.00%> (+0.04%) ⬆️
Py-3.8.10 98.88% <100.00%> (+0.01%) ⬆️
Py-3.8.18 99.09% <100.00%> (+0.01%) ⬆️
Py-3.9.13 98.88% <100.00%> (+0.01%) ⬆️
Py-3.9.19 99.09% <100.00%> (+0.01%) ⬆️
Py-pypy7.3.11 99.40% <100.00%> (+<0.01%) ⬆️
Py-pypy7.3.16 99.43% <100.00%> (+<0.01%) ⬆️
VM-macos-latest 99.03% <100.00%> (+0.01%) ⬆️
VM-ubuntu-latest 99.26% <100.00%> (+0.01%) ⬆️
VM-windows-latest 99.59% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@commonism
Copy link
Contributor Author

Towncrier ain't happy with my intersphinx link to python docs:

/home/runner/work/yarl/yarl/docs/[towncrier-fragments]:45: WARNING: py:meth reference target not found: python.urllib.parse.urljoin

@commonism
Copy link
Contributor Author

As

:py:function:`urllib.parse.urljoin`

does not work either

/home/runner/work/yarl/yarl/docs/changes.rst:45: ERROR: Unknown interpreted text role "py:function".

But according to

python3 -m sphinx.ext.intersphinx https://docs.python.org/3/objects.inv | grep -E "(^py:.*|urllib.parse.urljoin)"
…
py:function
	urllib.parse.urljoin                     library/urllib.parse.html#urllib.parse.urljoin
…

this should work and the intersphinx is set up in the job

loading intersphinx inventory 'python' from https://docs.python.org/3/objects.inv...
loading intersphinx inventory 'multidict' from https://multidict.aio-libs.org/en/stable/objects.inv...

@Dreamsorcerer @webknjaz any ideas?

@webknjaz
Copy link
Member

@commonism use :func:. That table shows object types with differs from the role names sometimes.

In my helper, I tried replacing them with proper reference hints: https://webknjaz.github.io/intersphinx-untangled

@webknjaz
Copy link
Member

Towncrier ain't happy with my intersphinx link to python docs:

Technically, it's not Towncrier that's unhappy. Towncrier is just a fancy template renderer. But when that's injected into Sphinx (through my sphinxcontrib-towncrier extension), it can emit warnings and we turn them into errors.

@commonism
Copy link
Contributor Author

The python domain maps function to func, therefore it is :py:func:
https://github.com/sphinx-doc/sphinx/blob/685e3fdb49c42b464e09ec955e1033e2a8729fff/sphinx/domains/python.py#L845-L881

To many docs for intersphinx show :py:function: as example.

That said, mastered this as well.

yarl/_url.py Outdated Show resolved Hide resolved
yarl/_url.py Outdated Show resolved Hide resolved
always encode parameters with an assignment
adjust rfc3986 tests to include an assignment in the expectation
Copy link
Member

@Dreamsorcerer Dreamsorcerer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks right to me. @webknjaz Can you check it over too?

CHANGES/1039.bugfix.rst Outdated Show resolved Hide resolved
CHANGES/1039.bugfix.rst Outdated Show resolved Hide resolved
CHANGES/1039.bugfix.rst Outdated Show resolved Hide resolved
CHANGES/1039.bugfix.rst Outdated Show resolved Hide resolved
tests/test_url.py Outdated Show resolved Hide resolved
tests/test_url.py Outdated Show resolved Hide resolved
tests/test_url.py Outdated Show resolved Hide resolved
tests/test_url.py Outdated Show resolved Hide resolved
@bdraco
Copy link
Member

bdraco commented Aug 30, 2024

Ran a profile. Only thing I noticed is schema comes up a lot in yarl. Probably would make sense to put that one in the dict to and avoid the property call.. but thats another issue

bdraco added a commit that referenced this pull request Aug 30, 2024
Noticed this one comes up a lot while profiling #1039
@bdraco
Copy link
Member

bdraco commented Aug 30, 2024

Tested one production. No adverse effect observed

@bdraco bdraco merged commit e3dd736 into aio-libs:master Aug 30, 2024
45 of 48 checks passed
@bdraco
Copy link
Member

bdraco commented Aug 30, 2024

Thanks @commonism

bdraco added a commit that referenced this pull request Aug 31, 2024
#1039 introduced a regression in decoding query
string params
@bdraco
Copy link
Member

bdraco commented Aug 31, 2024

This PR introduced a regression in decoding query string params in joined urls.

I attempted to fix it in #1066 but the fix looks more involved so I'm going to revert this PR and do another release

I think this needs to be reworked to use URL(self._val._replace(.....), encoded=True) instead like some of the other methods but more analysis is needed as there is some complex re-quoting logic for query strings already.

I would like to encourage another attempt at solving this problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot:chronographer:skip This PR does not need to include a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants