Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Print version number #236

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

guicho271828
Copy link

@guicho271828 guicho271828 commented Jan 12, 2025

This PR adds a version number (the Git hash at the build time) to the output log
This prevent several accidents like forgetting to recompile it before the experiments & not noticing it, ending up getting a wrong result

The first line of the log output look like this

[t=0.000191s, 24400 KB] Fast Downward build 4ff052903dee4939138383c0c780cc7305ee057d
[t=0.000315s, 24400 KB] reading input...
[t=0.046780s, 27600 KB] done reading input!
...

Copy link
Member

@FlorianPommerening FlorianPommerening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think including the git revision in the log is a good idea but I have some practical concerns about the implementation.

  1. We already have a command ./fast-downward.py --version which prints the release (e.g., 23.06) on a release version and something like 23.06+ for other versions. We want to keep the release number and not introduce two different ways of referring to one build. One way to report this would be 23.06 for a release and 23.06+ (rev f39b35b2) for others. The trouble is that --version is an option of the driver, not the search component.
  2. Baking the git revision into the binary with CMake only works if we build from a repository but there are cases where we build from a tarball, or a git export of the code. In that case, there is no surrounding git repo, and accessing the revision number would fail.

.gitignore Outdated
@@ -5,3 +5,5 @@ __pycache__/
/builds/
/misc/.tox/
/misc/autodoc/downward-xmlrpc.secret
*~
#*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#*

Copy link
Author

@guicho271828 guicho271828 Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#* is also a pattern for emacs backup file.
For example, if I am editing a.c, it creates #a.c while I am still editing the file, and a.c~ when the file is saved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both of these patterns should rather be in a user's global git configuration, not each project-specific repository. That way, users can configure these once for all projects, and the projects don't need to deal with all possible editor setups.

For what it's worth, as a fellow emacs user: I've configured emacs to put the *~ backups in a global place rather than alongside the original files because I think it's much less messy. And for the #* files, I think it's a feature not to have git ignore them because their presence indicates that there are unsaved edited files currently open, and having this shown clearly when doing a git status or similar prior to a commit is good. It's rarely a good idea to commit when there are unsaved changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it would be better to leave this up to the user configuration then. But isn't # a comment in gitignore? Does this work as a pattern without escaping #?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put the *~ backups in a global place

This indeed works. I am fine with dropping this commit, it is just my habit from experience that plenty of students don't know these and commit absolute everything, and I got annoyed often enough removing .DS_store etc commited in a shared repo...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the gitignore commit.

@guicho271828
Copy link
Author

there are cases where we build from a tarball

Thanks, this is a legit concern! Though the release version and my PR have certainly different purposes (the latter is to confirm that the binary version actually matches the version of the source code it was built from).
I added additional code that reads from driver.__version__ .

Copy link
Member

@FlorianPommerening FlorianPommerening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree that the two numbers have slightly different purposes. But what I would like to avoid is that there are two different answers to the question "Which version did you use?". So I suggest to report both numbers together. If we are on a release, the revision number is not necessary (the release itself is a tag that works like a revision number). For other revisions, it would still be good to see which release this code is based on. Reporting 23.06+ (rev f39b35b2) would be better than just f39b35b2.

We could do this in CMake: ask the driver for a version, then try to get the revision from git and use something like unknown revision in case we build from a tarball (doesn't have to be a critical failure, since we can still build the code). The problem I have with this approach is that if we now ask the driver about the version, it will report 23.06+, not 23.06+ (rev f39b35b2). The problem here is what revision should the driver specify here: The revision of its own code (current git revision), or the revision of the binary it is about to run (the search component, which might be compiled with a different version of the code). Also, we might want to cover the case that the repository has uncommitted changes. In that case, the revision number is also not accurately representing the version of the code.

We could also shift this functionality into the driver and make it responsible for getting the git version and building the string 23.06+ (rev f39b35b2). Since the new patch calls the driver from CMake anyway, we might as well implement this in Python, rather than in CMake. One suggestion would be to have the following functionality in the driver

  • get_release() -> 23.06 or 23.06+
  • get_code_revision() -> f39b35b2 (current repo revision), or f39b35b2+ (if there are uncommitted changes), orunknown revision (if run from a tarball)
  • get_code_version() -> Something like f"{get_release()} ({get_code_revision()})" called by CMake and built into the binary
  • get_binary_version(binary) -> retrieves the value from the binary and reports it (called by ./fast-downward.py --version so it reports what version is actually run). That would mean the binary itself needs an option --version to report its internally stored value.

Am I overthinking this? I definitely see the problem of the driver and binary going out of sync (if you build, then switch git branches, then forget to rebuild), and that the source code and repository going out of sync (if you build with uncommitted changes). Maybe instead of passing this information back and forth, we could just store in independently in both the driver and the binary and have each one report its version. But then I would still store both 23.06+ and f39b35b2+ as separate fields and combine them in the output. The driver could actually report both. Something like: 23.06+ (driver code: f39b35b2+, binary: 35b26510).

.gitignore Outdated
@@ -5,3 +5,5 @@ __pycache__/
/builds/
/misc/.tox/
/misc/autodoc/downward-xmlrpc.secret
*~
#*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it would be better to leave this up to the user configuration then. But isn't # a comment in gitignore? Does this work as a pattern without escaping #?

@guicho271828
Copy link
Author

I agree that it would be better to leave this up to the user configuration then. But isn't # a comment in gitignore? Does this work as a pattern without escaping #?

perhaps I was wrong all along...

@guicho271828
Copy link
Author

guicho271828 commented Jan 13, 2025

Am I overthinking this?

Perhaps one way to address this is to make it less obvious to the user so that it does not mess up the bug report and increase the effort by the maintainers.

  • driver version : nothing changed. 23.06, or 23.06+ or whatever.
  • downward binary : change this line to utils::g_log << "build hash " << GIT_REVISION << endl; . It is not obvious that the hash value refers to the version number, therefore there is less chance of users reporting confused messages to you.

@guicho271828
Copy link
Author

guicho271828 commented Jan 13, 2025

To be clear, I just think it is a good idea to be able to check the output log retroactively for forensic purpose and be absolutely certain about the version of the C++ source code of the binary that produced the log.

  • You may forget to run ./build.py before running the experiment after chainging the branch.
  • You did not forget ./build.py , but forgot to run git pull on the login node of a cluster before running it, thus the experiment was ran with an older version.
  • You did git pull and build.py on the login node, but you pushed the latest to a wrong remote on your laptop.
  • One may have ran so many experiments with incremental low-level optimization that you forget which output directory corresponds to which version (if you were managing them manually).
  • One may stop working on a repository for a while because of other duties, and months later, coming back to the cluster, having no idea which result corresponds to which version, throwing away all the compute time.
  • Many other examples that I myself was bitten before.

Just emitting the build sha1 hash provides a large QoL improvement.

Having --version option is also a good idea, and is easily doable, but the fact that it must be explicitly invoked will defeat the purpose (because one will forget). I also often run downward without driver to cache SAS+ files.

@guicho271828
Copy link
Author

Besides, another idea is that if there are uncommitted changes, the build hash may contain "-dirty" suffix to indicate something weird is happening.

@FlorianPommerening
Copy link
Member

I agree with all of those points as potential problems where you want a version number in the output. The idea with having this as a CLI option was to change the experimental setup so it always prints the version to the log but I don't have a strong opinion on this point. I also think a suffix for dirty repos makes sense but I would call it + instead of -dirty to be consistent with the release versions: 23.06 is the release exactly, 23.06+ is the release version with some additional unreleased changes, so analogously f39b35b2 should be the revision exactly and f39b35b2+ the revision with some additional uncommitted changes.

What I dislike about the current patch is that it accesses the python version as a fallback in cases were we build from a tarball. I think the reason for this is that if we don't have exact information about the revision, we at least want to know which release it is based on. But I think this information is either always interesting (also if we have the revision) or never (even if we don't have the revision). So I'd either run it unconditional and make it part of the print statement, or remove it altogether and print "unknown revision" or something like it in such cases.

@FlorianPommerening
Copy link
Member

We usually treat discussions in Github pull requests as ephemeral, so I created an issue in our tracker summarizing the discussion so far: https://issues.fast-downward.org/issue1163

@guicho271828
Copy link
Author

guicho271828 commented Jan 14, 2025

I added the feature that was requested.

  1. When HEAD is a release commit, the revision string instead uses the tag name (this makes sense; SHA1 and tag names are both interpreted by git.)
[masataro downward]$ git tag release-test HEAD

[masataro downward]$ git log -n 1 --oneline
84b723a02 (HEAD -> print-version-number, tag: release-test, public/print-version-number) detect non-release commits and modify the string

[masataro downward]$ ./build.py
Building configuration release.
Executing command "cmake -S /home/masataro/repos/oss/downward/./src -B /home/masataro/repos/oss/downward/./builds/release -G Unix Makefiles -DCMAKE_BUILD_TYPE=Release"
-- Building for 64-bit.
Building from a release commit.
revision string: "release-test"
...
Built configuration release successfully.

[masataro downward]$ builds/release/bin/downward --version
release-test
Peak memory: 10000 KB

[masataro downward]$ builds/release/bin/downward | head -n 1
[t=0.000212s, 10136 KB] Fast Downward build release-test
  1. If the current commit does not have a release tag, it prints the SHA1 and the nearest tag.
[masataro downward]$ git tag -d release-test # remove the tag
Deleted tag 'release-test' (was 84b723a02)

[masataro downward]$ git log -n 1 --oneline
84b723a02 (HEAD -> print-version-number, public/print-version-number) detect non-release commits and modify the string

[masataro downward]$ ./build.py
Building configuration release.
Executing command "cmake -S /home/masataro/repos/oss/downward/./src -B /home/masataro/repos/oss/downward/./builds/release -G Unix Makefiles -DCMAKE_BUILD_TYPE=Release"
-- Building for 64-bit.
Building from a non-release commit.
revision string: "84b723a02 (based on release-24.06.0)"
...
Built configuration release successfully.

[masataro downward]$ builds/release/bin/downward --version
84b723a02 (based on release-24.06.0)
Peak memory: 10000 KB

[masataro downward]$ builds/release/bin/downward | head -n 1
[t=0.000198s, 10136 KB] Fast Downward build 84b723a02 (based on release-24.06.0)
  1. If there are uncommited changes, it appends "-dirty" to the revision. The additional message here is printed as a WARNING.
[masataro downward]$ echo "" >> README.md # make an uncommited change

[masataro downward]$ git log -n 1 --oneline
84b723a02 (HEAD -> print-version-number, public/print-version-number) detect non-release commits and modify the string

[masataro downward]$ git status
On branch print-version-number
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   README.md

no changes added to commit (use "git add" and/or "git commit -a")

[masataro downward]$ ./build.py
Building configuration release.
Executing command "cmake -S /home/masataro/repos/oss/downward/./src -B /home/masataro/repos/oss/downward/./builds/release -G Unix Makefiles -DCMAKE_BUILD_TYPE=Release"
-- Building for 64-bit.
Building from a non-release commit.
CMake Warning at search/CMakeLists.txt:90 (message):
  Building from a repository with uncommited changes.


revision string: "84b723a02-dirty (based on release-24.06.0)"
...
Built configuration release successfully.

[masataro downward]$ builds/release/bin/downward --version
84b723a02-dirty (based on release-24.06.0)
Peak memory: 10000 KB

[masataro downward]$ builds/release/bin/downward | head -n 1
[t=0.000196s, 10136 KB] Fast Downward build 84b723a02-dirty (based on release-24.06.0)
  1. Same thing happens to the release commits.
[masataro downward]$ git tag release-test HEAD

[masataro downward]$ git status
On branch print-version-number
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   README.md

no changes added to commit (use "git add" and/or "git commit -a")

[masataro downward]$ git log -n 1 --oneline
84b723a02 (HEAD -> print-version-number, tag: release-test, public/print-version-number) detect non-release commits and modify the string

[masataro downward]$ ./build.py
Building configuration release.
Executing command "cmake -S /home/masataro/repos/oss/downward/./src -B /home/masataro/repos/oss/downward/./builds/release -G Unix Makefiles -DCMAKE_BUILD_TYPE=Release"
-- Building for 64-bit.
Building from a release commit.
CMake Warning at search/CMakeLists.txt:82 (message):
  Building from a repository with uncommited changes.


revision string: "release-test-dirty"
...
Built configuration release successfully.

[masataro downward]$ builds/release/bin/downward --version
release-test-dirty
Peak memory: 10000 KB

[masataro downward]$ builds/release/bin/downward | head -n 1
[t=0.000224s, 10136 KB] Fast Downward build release-test-dirty

@guicho271828
Copy link
Author

guicho271828 commented Jan 14, 2025

For tarballs, it still reads from the python file. But this is not my commits' fault! At least I fixed it so that my change (which I am responsible) does not break the build, and I have no responsibility to the prior design (that version.py was updated manually after each release). Someone can improve it, but I see no point of holding it back.

If I do it, I would move my CMAKE code to the src/CMakefile, and autogenerate builds/release/translate/version.py from src/translate/version.py.in OOps, it was the driver code and it is not managed by CMake, so this approach does not work. Regardless, I am not responsible for it.

@FlorianPommerening
Copy link
Member

No, the previous design is certainly not your commits' fault, I take responsibility for that. But I think I didn't express my concern in the right way, My issue was not that the python files are accessed. it was that this is done only as a fallback.

Given what Malte said in the issue, I would suggest this workflow:

  1. Get a value for the git revision.
    a. Try to access the git revision from git log.
    b. If there are uncommitted changes, add -dirty.
    c. If accessing the revision failes, use unknown revision instead.
  2. Get a value for the release this code is based on
    a. read the version from the driver
  3. Store both values in the header and continue the build.
  4. When printing the version, print it as {release} ({revision}).
  5. Somehow make the driver report this as when using ./fast-downward.py --version

The last step is still unclear to me and we should discuss this in the issue tracker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants