-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up the Git for Windows SDK initialization again #1841
Speed up the Git for Windows SDK initialization again #1841
Conversation
…K again It used to be the case that initializing the minimal SDK (i.e. a radically slimmed-down subset of Git for Windows' development environment intended to perform the CI builds and little else) took a bit over one minute, would then be cached, and subsequent jobs would take at most half a dozen seconds to initialize said minimal SDK. It is important that this step is fast because we have to run the test suite in parallel, in a set of matrix jobs, to offset the slowness of the shell-based test suite, and each and every job has to initialize the very same minimal SDK. While it may sound as if parallelizing the jobs might only waste the generously-provided build minutes but at least the _wallclock_ time would pass quick, in reality it matters a lot: Frequently Git for Windows' or GitGitGadget PRs get stuck waiting for quite a while before CI builds start because other PRs' builds still spend substantial amounts of time to run, blocking due to the concurrency limit being reached. Since 91839a8 (ci: create script to set up Git for Windows SDK, 2024-10-09), the situation has worsened: every job that requires the minimal Git for Windows SDK spends roughly two-and-a-half minutes doing so. With the switch away from the GitHub Action `setup-git-for-windows-sdk`, we incurred more downsides: - It is no longer possible for said Action to fix problems independently from the Git repository, e.g. when new rules about GitHub Actions require changes in the way the minimal SDK is initialized. - The minimal SDK was installed specifically outside of the worktree so as not to clutter it nor incur an additional cost to verify that the worktree is clean. Therefore, even if it would be nice to have a shared process between GitHub and GitLab based CI builds, let's switch the GitHub-based CI back to the tried-and-tested `setup-git-for-windows-sdk` Action. This commit partially reverts 91839a8 (ci: create script to set up Git for Windows SDK, 2024-10-09). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
/submit |
Submitted as pull.1841.git.1734447458896.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this): "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> This commit partially reverts 91839a88277 (ci: create script to set up
> Git for Windows SDK, 2024-10-09).
Thanks, will queue.
> diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
> index 9301a1edd6d..916a64b6736 100644
> --- a/.github/workflows/main.yml
> +++ b/.github/workflows/main.yml
> @@ -113,15 +113,13 @@ jobs:
> cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }}
> steps:
> - uses: actions/checkout@v4
> - - name: setup SDK
> - shell: powershell
> - run: ci/install-sdk.ps1
> + - uses: git-for-windows/setup-git-for-windows-sdk@v1
> - name: build
> - shell: powershell
> + shell: bash
> env:
> HOME: ${{runner.workspace}}
> NO_PERL: 1
> - run: git-sdk/usr/bin/bash.exe -l -c 'ci/make-test-artifacts.sh artifacts'
> + run: . /etc/profile && ci/make-test-artifacts.sh artifacts
> - name: zip up tracked files
> run: git archive -o artifacts/tracked.tar.gz HEAD
> - name: upload tracked files and build artifacts
> @@ -149,12 +147,10 @@ jobs:
> - name: extract tracked files and build artifacts
> shell: bash
> run: tar xf artifacts.tar.gz && tar xf tracked.tar.gz
> - - name: setup SDK
> - shell: powershell
> - run: ci/install-sdk.ps1
> + - uses: git-for-windows/setup-git-for-windows-sdk@v1
> - name: test
> - shell: powershell
> - run: git-sdk/usr/bin/bash.exe -l -c 'ci/run-test-slice.sh ${{matrix.nr}} 10'
> + shell: bash
> + run: . /etc/profile && ci/run-test-slice.sh ${{matrix.nr}} 10
> - name: print test failures
> if: failure() && env.FAILED_TEST_ARTIFACTS != ''
> shell: bash
>
> base-commit: 631ddbbcbd912530e1b78e5d782e72879f7f1fb2 |
This patch series was integrated into seen via git@6a94bd7. |
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Tue, Dec 17, 2024 at 12:33:10PM -0800, Junio C Hamano wrote:
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > This commit partially reverts 91839a88277 (ci: create script to set up
> > Git for Windows SDK, 2024-10-09).
>
> Thanks, will queue.
Okay. Too bad that things regressed this badly with the change. I agree
that reverting is the right thing to do for now. I may revisit this
again in the next release cycle to investigate whether we can get it up
to par with the GitHub Actions. It would be great if the build infra was
shared between our CIs, so I think there's some value in it. But if the
answer is "no" then I guess that's ultimately fine, as well.
Thanks!
Patrick |
This branch is now known as |
This patch series was integrated into seen via git@972174e. |
This patch series was integrated into next via git@6d59340. |
On the Git mailing list, Johannes Schindelin wrote (reply to this): Hi Patrick,
On Wed, 18 Dec 2024, Patrick Steinhardt wrote:
> On Tue, Dec 17, 2024 at 12:33:10PM -0800, Junio C Hamano wrote:
> > "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> > writes:
> >
> > > This commit partially reverts 91839a88277 (ci: create script to set up
> > > Git for Windows SDK, 2024-10-09).
> >
> > Thanks, will queue.
>
> Okay. Too bad that things regressed this badly with the change. I agree
> that reverting is the right thing to do for now. I may revisit this
> again in the next release cycle to investigate whether we can get it up
> to par with the GitHub Actions.
The way I implemented it should be eminently possible in PowerShell, too.
Something like `Start-Process`, launching not an `Invoke-WebRequest` but
instead `C:\Windows\system32\curl.exe` [*1*] to download the `.tar.gz`
file (not the `.zip` file, more about that below). Another `Start-Process`
should then be opened that executes `tar.exe` [*2*], and then the `stdout`
of the former should be piped to the latter [*3*].
I do think that it makes a lot of sense to start extracting as soon as the
first byte arrives instead of storing the archive as a temporary file and
extracting it only once it has arrived completely.
Now, why not use the `.zip` file instead of the `.tar.gz` file? In my
analysis [*4*], I pointed out that the `.zip` file is about 10MB smaller,
after all, and BSD tar (at least the version in `C:\Windows\system32`) is
able to handle those, too, right? Not so fast. In my experiments, when
streaming the `.zip` file to the `tar.exe -xf -` process, the `etc/`
and `.sparse/` directories were consistently dropped. A bug, I guess. I
ran out of time to investigate this in more depth.
Since the artifacts are now hosted in a GitHub Release and updated
regularly, and since those updates cannot be atomic (you can only upload a
release asset if no asset of the same name exists already, read: the
automation has to _delete_ that asset before uploading a new version), it
would also be good to kind of expect that the file may be intermittently
absent and add a back-off strategy [*5*].
> It would be great if the build infra was shared between our CIs, so I
> think there's some value in it. But if the answer is "no" then I guess
> that's ultimately fine, as well.
It _could_ be done. But the advantages of having it versioned outside of
the Git repository outweigh the benefits of that shared infrastructure, I
believe.
Ciao,
Johannes
Footnote *1*:
https://github.com/git-for-windows/setup-git-for-windows-sdk/pull/965/commits/6db65223de699c4f75ab083f82f43947a53ad6ff#diff-6855ef61b94227f9264adab3ff9f2de95c2d7b4e451019cc0105896d32550eb0R58-R73
Footnote *2*:
https://github.com/git-for-windows/setup-git-for-windows-sdk/pull/965/commits/6db65223de699c4f75ab083f82f43947a53ad6ff#diff-6855ef61b94227f9264adab3ff9f2de95c2d7b4e451019cc0105896d32550eb0R77-R86
Footnote *3*:
https://github.com/git-for-windows/setup-git-for-windows-sdk/pull/965/commits/6db65223de699c4f75ab083f82f43947a53ad6ff#diff-6855ef61b94227f9264adab3ff9f2de95c2d7b4e451019cc0105896d32550eb0R88
Footnote *4*:
https://github.com/git-for-windows/git-sdk-64/pull/87/commits/fdb0cea373893ce7d40bcfcfbeb7fd091a3c4020
Footnote *5*:
https://github.com/git-for-windows/setup-git-for-windows-sdk/pull/965/commits/3d4ea07041d0740b21160a9d9a4181f569e706d8 |
This patch series was integrated into seen via git@819a28a. |
There was a status update in the "New Topics" section about the branch Revert recent changes to the way windows environment is set up for GitHub CI. Will merge to 'master'. source: <pull.1841.git.1734447458896.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@fc27f9a. |
This patch series was integrated into seen via git@5f6765d. |
This patch series was integrated into seen via git@6c7681b. |
This patch series was integrated into seen via git@bad5d1a. |
This patch series was integrated into master via git@bad5d1a. |
This patch series was integrated into next via git@bad5d1a. |
Closed via bad5d1a. |
While waiting for way too many builds in https://github.com/gitgitgadget/git/actions to finish, I noticed that the minimal Git for Windows SDK initialization now takes a whopping 2.5 minutes. That's way too much. It used to take a little over a minute when uncached, and 2-5 seconds when cached.
Let's fix this regression by reverting to using the
setup-git-for-windows-sdk
GitHub Action (also because that Action will soon see another dramatic speed-up, see git-for-windows/setup-git-for-windows-sdk#965).Cc: Patrick Steinhardt ps@pks.im