Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unecessary upper bounds #468

Closed
wants to merge 2 commits into from
Closed

Remove unecessary upper bounds #468

wants to merge 2 commits into from

Conversation

jhkennedy
Copy link
Collaborator

@jhkennedy jhkennedy commented Feb 21, 2024

Upperbounds are considered harmful for python, which has a flat dependency tree, even for SemVer projects. Long form discussion here:
https://iscinumpy.dev/post/bound-version-constraints/


📚 Documentation preview 📚: https://earthaccess--468.org.readthedocs.build/en/468/

@mfisher87
Copy link
Collaborator

That's a lot to read! I feel we should cut a release now to resolve the known breakage from #465 while we give the maintenance team time to learn more. What do you think?

@jhkennedy
Copy link
Collaborator Author

That's a lot to read! I feel we should cut a release now to resolve the known breakage from #465 while we give the maintenance team time to learn more. What do you think?

Yes, that's fine. Python 4.0 isn't going to come anytime soon and Requests 3.0 is also a long ways out. We don't want to punt it forever though

@mfisher87
Copy link
Collaborator

Agreed! I don't have time to read the whole post right now, but did a little bit of skimming from the ToC and have some doubts about the premise. Of course my mind could be changed in light of new information, but this just doesn't agree with my understanding of semver. Would love to learn more, and eventually I'll find time to dig in to the article more too :) I don't agree with this, for example:

Basically, a “perfect” SemVer library would pretty much always bump the major version, since you almost always could possibly break a user with any change (and if you have enough users, by Hyrum’s law, you will do this). This makes “true” SemVer pointless. Minor releases are impossible, and patch releases are nearly impossible.

I think where most folks (including me) don't walk the walk with semver is specification item 1:

Software using Semantic Versioning MUST declare a public API. This API could be declared in the code itself or exist strictly in documentation. However it is done, it SHOULD be precise and comprehensive.

This sets the stage for the rest of the semver rules applying only to declared API changes, not behavior changes. It can be argued that "the behavior is the API", and there is merit to that way of thinking, but the semver specification rules that approach out with specification item 1. I feel that approach does lead to the onerous situation described in the article, and I don't think that's sustainable. I feel calling that "perfect" semver is not a good description; it's more like "radicalized" semver to me :)

If we reason about semver only in the context of the declared API, most bugfixes are changes to undocumented behavior, i.e. don't change the declared API, so are patch changes. Users should be making semver judgements based on the declared API as well, so it should be expected that a patch version can break their code if they rely on undocumented behavior, and they should pin accordingly.

That said, how accessible is that way of thinking to our userbase? The less users have to think about that, the better, but also we have to be able to make breaking changes to make progress. I think we could improve accessibility by adding documentation that explicitly confirms how we use and interpret semver and what we consider to be the public API.

I don't have any judgements yet on the goodness/badness of upper bounds yet, just thinking out loud :)

@jhkennedy
Copy link
Collaborator Author

@mfisher87 Yes, I agree that's a bit of hyperbole, and hides the real problem.

In SemVer, you're supposed to declare a public API -- in Python we do that with _ prefixes and __all__s, and if the library was developed perfectly, all methods would be appropriately declared. The reality is that almost never happens; or at least I've never worked on a code base that as done that perfectly. So, there's often a miss-match between developer and user expectations around public vs private methods.

To top it off, a major version bump happens for any change to the public API. But, we might only use a subset of that API wich remains entirely uneffected.

By providing an upper bound, we guarantee that our package will break on every major release and require a re-release. By not capping, you only have to address actual issues as they arise. This is demonstrated exactly in our s3fs upper pin causing 2 PRs and a release to fix something that is not broken:

@rsignell
Copy link
Contributor

rsignell commented Feb 21, 2024

While we are pondering aspects of SemVer, check this out this recent post:
https://jacobtomlinson.dev/effver/

And shortly thereafter matplotlib adopted effver!

@mfisher87
Copy link
Collaborator

@rsignell TIL EffVer! Thanks for sharing that. I really like that the approach acknowledges the uncertainty and humanness of this problem and I think I want to start trying it :) One thing I do like about semver though is how it explicitly encourages clearly defining and thinking more about the API. That's also important for effectively implementing EffVer, but it's more implicit. I admittedly mostly skimmed the article and loved the pretty pictures 😆

image

Also, TIL of the colloquialism "YOLO version" 😆

In SemVer the 0.x.x version has become known as the YOLO version because anything goes.


In SemVer, you're supposed to declare a public API -- in Python we do that with _ prefixes and __all__s, and if the library was developed perfectly, all methods would be appropriately declared. The reality is that almost never happens; or at least I've never worked on a code base that as done that perfectly. So, there's often a miss-match between developer and user expectations around public vs private methods.

I think of those as separate things. semver or not, that's the Pythonic convention for indicating publicness, but separately, semver allows for us to, for example, declare this is our public API for the purposes of versioning and for ensuring developers and users are on the same page about what "the public API" actually means.

Whatever we decide, I'd really like for us to have a page on our versioning policy that's aimed at users. Right now, I think the farthest we go is saying "The versioning scheme we use is SemVer" in CONTRIBUTING.md.

I need to think and read more about upper bounds in regards to semver. I definitely agree with you using an upper bound with a calver project is a poor practice, but I'm not yet sure what my opinion is on "never use upper bounds (for a library)" :)

@jhkennedy
Copy link
Collaborator Author

jhkennedy commented Feb 21, 2024

To be clear, "never use upper bounds (for a library)" is not my position; my position is: "upper-bounds are guaranteed to break this package in the future, so they should be avoided."

There are times when upper bounds are appropriate, but we should know they are, not guess. That is, if we rely on nearly all of a public API, or a dependency has a history of breaking us, or we see a change coming that we know is going to break us, then sure, upper bound.

@jhkennedy
Copy link
Collaborator Author

Notably, Poetry-relax is a plugin built to handle this for poetry projects. It's readme discusses the issue a bit and has a good set of references about this issue at the bottom of the README as well:
https://github.com/zanieb/poetry-relax

pyproject.toml Outdated
@@ -33,10 +33,10 @@ classifiers = [
# `ci/environment-mindeps.yaml` conda environment. When updating minimum dependencies
# here, make sure to also update `ci/environment-mindeps.yaml`.
[tool.poetry.dependencies]
python = ">=3.8,<4.0"
python = ">=3.8"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
python = ">=3.8"
python = ">=3.8,<4.0"

Looks like this is actually problematic with poetry and cannot be relaxed 🙄

@jhkennedy
Copy link
Collaborator Author

looks like this was superseded by da71210; closeing as already done.

@jhkennedy jhkennedy closed this Mar 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants