Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] NotFoundException is internal #17344

Open
snickl opened this issue Nov 19, 2024 · 6 comments
Open

[bug] NotFoundException is internal #17344

snickl opened this issue Nov 19, 2024 · 6 comments
Assignees

Comments

@snickl
Copy link

snickl commented Nov 19, 2024

Describe the bug

I have a use case where I've been using the conan.tools.files.download() function on an optional url/file, i.e. the file may not be there and that's not an error.
When I wrote the recipe I noticed that download() throws NotFoundException if the source file is not found, and so I started catching that in my code to handle the absence case.
Now #17126 has moved this exception to an internal namespace (and broke my code), so I take it I'm not supposed to reference it from recipes.
However as this is an exception that a public tool function can throw, I posit that this exception (and maybe others) should also be available to recipes.
I could probably catch it as a more generic type of exception, but I find being able to distinguish expecially 404 from other types of error useful.
Thanks!

How to reproduce it

No response

@memsharded memsharded self-assigned this Nov 19, 2024
@memsharded
Copy link
Member

Hi @snickl

Thanks for your report.

have a use case where I've been using the conan.tools.files.download() function on an optional url/file, i.e. the file may not be there and that's not an error.
When I wrote the recipe I noticed that download() throws NotFoundException if the source file is not found, and so I started catching that in my code to handle the absence case.
Now #17126 has moved this exception to an internal namespace (and broke my code), so I take it I'm not supposed to reference it from recipes.

According to the stability commitment (check https://docs.conan.io/2/introduction.html#stable)

Only documented features in https://docs.conan.io/ are considered part of the public interface of Conan. Private implementation details, and everything not included in the documentation is subject to change.

Only documented things can be used in recipes. This wouldn't be a bug, but expected. As a hint, is it possible that you were important that exception from from conans? Recall that namespace doesn't exist in the public interface of Conan 2, it was the first thing defined in the migration guide (https://docs.conan.io/en/1.65/migrating_to_2.0/recipes.html#python-import-statements).

I am not sure what would be the use case to catch a NotFoundException in a recipe. If some sources fail to download, then it is necessary to fail, as the package build will not be able to succeed, isn't it?

@snickl
Copy link
Author

snickl commented Nov 19, 2024

Hi @memsharded, thanks for the quick reply!
I think I understand your position well and I'm not aware of any general OOP guidelines about this, but I feel exceptions are somehow part of a function's contract, in the sense that if a function may throw a particular exception, the caller should be able to catch it.

But I'll admit my use case is a bit off-the-wall and I'm curious about your insight anyway.
I have several big-ish binary tar.gz and I'm more or less doing what you've described here #5059 (comment) among other similar things.

There's CI pipeline A, building that big binary "semifinish" .tar.gz package and uploading it into an internal Artifactory generic type repository, and there's CI pipeline B that works on a repo a bit like Conan-Center. B turns A's semifinishes into Conan binary packages by having recipes fetch those archives back from Artifactory, packaging and uploading the results into a Conan type repo hosted on the same Artifactory instance.

Now since these files are big and up- and downloads are somewhat of a slog, I want to leverage the Conan source cache. However that only works if I can supply sha256sums for those files. To (semi-)automate this, I first have pipeline A generate a checksums.txt file with sha256 of the semifinishes, which it uploads alongside the semifinishes themselves.

Then when it comes time to update the "Conan-Center" repo, I start by invoking conan create locally, and it will first try to download() that checksum file. If the checksum file is present on the server (i.e. no NotFoundException) and there is no entry for the semifinish in conandata.yml yet, it will create that entry with the sha256 from the checksum file and re-write the local conandata.yml (don't worry, I'm not using Conan internal functions for that, just the yaml module ;-)
Then the recipe will proceed to download() the actual semifinish, and assuming the above went well, it will be placed in the source cache.
Now I just git add the updated conandata.yml and commit, allowing me to tweak the recipes and rebuild the package as much as I like without having to reach out to the server to fetch the big file each time.

Conversely, if the checksum file is absent (not all A pipelines create it and some semifinishes are provided by third parties), the caching will simply not take, or I have to update conandata.yml manually, depending on the type of semifinish.

I also saw that Artifactory supplies a http header with the sha256 sum, but asking for http HEAD request support in Conan may be a bit much, unless auto-updating conandata.yml can be turned into a proper feature.

@memsharded
Copy link
Member

I think I understand your position well and I'm not aware of any general OOP guidelines about this, but I feel exceptions are somehow part of a function's contract, in the sense that if a function may throw a particular exception, the caller should be able to catch it.

The thing is that Conan recipes are not really a pure Python program or library, it is actually a DSL on top of Python with its own rules and logic. A ConanFile isn't a normal class or object, and it doesn't keep state in the way a regular object does through a Conan invocation. Its variables and methods even have rules regarding which names can be used and not: https://docs.conan.io/2/reference/conanfile.html. The Python ConanAPI at this moment returns some objects that are mostly opaque, they can be passed to other APIs, but it is not intended to be used as regular Python object.

As many of Conan users are not that experts in Python, Conan recipes syntax has tried as much as possible to avoid exceptions in user recipe code. In other words, I think there is not a single try-except in the whole Conan docs (and there are many hundreds of pages of docs). If there is an exception it really means something is quite bad and should stop. Otherwise Conan has provided safe alternatives like self.settings.rm_safe() to avoid having to try-except.

Then when it comes time to update the "Conan-Center" repo, I start by invoking conan create locally, and it will first try to download() that checksum file. If the checksum file is present on the server (i.e. no NotFoundException) and there is no entry for the semifinish in conandata.yml yet, it will create that entry with the sha256 from the checksum file and re-write the local conandata.yml (don't worry, I'm not using Conan internal functions for that, just the yaml module ;-)

Regarding what you are trying to do, I think I'd approach this the other way round, making it more imperative and decoupled.
For example, I'd build a Conan custom command that assuming the file is there, will download the file, get the checksums, define the respective checksums in the conandata.yml of the respective packages and does the commit. So when the packages are conan create they will use the checksum and everything will work transparently. This approach sounds to me a bit more "divide and conquer", with more cohesive and decoupled responsibilities, which is typically easier to maintain and understand, that is, the mapping to checksums.txt to the conandata.yml files is pretty straightforward, and done only once for all recipes. It is easier to track those changes in source control if they leave explicit source changes in the commit. And recipes are completely straightforward.

Just a very rough idea, I probably don't know enough about all the constraints, etc., so this might not be possible. But we tackle a lot of quite complex problems in our ConanCenter CI and this is kind of the approach we usually follow.

@snickl
Copy link
Author

snickl commented Nov 20, 2024

As many of Conan users are not that experts in Python, Conan recipes syntax has tried as much as possible to avoid exceptions in user recipe code. In other words, I think there is not a single try-except in the whole Conan docs (and there are many hundreds of pages of docs). If there is an exception it really means something is quite bad and should stop. Otherwise Conan has provided safe alternatives like self.settings.rm_safe() to avoid having to try-except.

The obvious alternative to exceptions would be to return a status code, but it seems that would also be a first.

Regarding what you are trying to do, I think I'd approach this the other way round, making it more imperative and decoupled. For example, I'd build a Conan custom command that assuming the file is there, will download the file, get the checksums, define the respective checksums in the conandata.yml of the respective packages and does the commit. So when the packages are conan create they will use the checksum and everything will work transparently. This approach sounds to me a bit more "divide and conquer", with more cohesive and decoupled responsibilities, which is typically easier to maintain and understand, that is, the mapping to checksums.txt to the conandata.yml files is pretty straightforward, and done only once for all recipes. It is easier to track those changes in source control if they leave explicit source changes in the commit. And recipes are completely straightforward.

Yes I thought about making it into a separate tool too, but the advantage of doing it from recipe context is that in my case, the source URL paths and filenames are also derived in the recipe methods, mostly from the version string passed via conan create --version=.
So the quickest way then it seems would be to do a conan create dummy run, snatch the source URL from its console output and pass it on to the hash update command manually.

Question: In a Conan custom command implementation, would it be ok to call SourcesCachingDownloader.download() and catch NotFoundException, or would I have to spin up my own http requester?

@memsharded
Copy link
Member

The obvious alternative to exceptions would be to return a status code, but it seems that would also be a first.

Yes, so far, things like download() or cmake.build() are always "must work or fail", Conan does not consider this kind of "try this, and if it doesn't work, then fallback to this other behavior". It is not really a matter of try-except with exceptions or returning error codes, it is intended to be as "declarative" and "imperative" as possible. Implementing business logic as the one you are proposing via a fallback is not recommended, except in a very few exceptions.

This it the same rationale of why when a remote is not available or fail to connect, Conan will immediately stop. It will not continue to the next remote. Some users have requested this behavior in the past, but our experience is that it is better to fail, it is safer. If you have a remote defined, that remote should be there and work, and otherwise, stop everything and check. If you want to skip that remote, then you can conan remote disable it.

There is an important thing that I might be missing here.
All this download of this semi-finished thing, is being done at build() time?
The modification of conandata.yml can only happen at export time. Any other modification of conandata.yml is changing the checksums and will be corrupting the packages. A conan cache check-integrity or a conan upload .... --check will fail with an error reporting the recipe is corrupted.

@memsharded
Copy link
Member

Question: In a Conan custom command implementation, would it be ok to call SourcesCachingDownloader.download() and catch NotFoundException, or would I have to spin up my own http requester?

No, I am afraid that that isn't public API either, so it cannot be used. The idea wouldn't be that the command download anything, the command would only download that checksums.txt file, then analyze what is necessary and update the checksums in the conandata.yml of the recipes (even before the recipes are created/exported), so that becomes part of the "source inputs" information before creating the package, leaving conandata.yml immutable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants