gc: add --expire-to option #1843
base: master
/submit
Submitted as pull.1843.git.1735041177817.gitgitgadget@gmail.com
There are issues in commit 4254269:
4254269 to 5797579
Submitted as pull.1843.v2.git.1735611513.gitgitgadget@gmail.com
On the Git mailing list, ZheNing Hu wrote (reply to this):
ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com> wrote on Tue, Dec 31, 2024 at 10:18:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> This commit extends the functionality of `git gc`
> by adding a new option, `--expire-to=<dir>`. Previously,
> this feature was implemented in `git repack` (see 91badeb),
> allowing users to specify a directory where unreachable and
> expired cruft packs are stored during garbage collection.
> However, users had to run `git repack --cruft --expire-to=<dir>`
> followed by `git prune` to achieve similar results within `git gc`.
>
> By introducing `--expire-to=<dir>` directly into `git gc`,
> we simplify the process for users who wish to manage their
> repository's cleanup more efficiently. This change involves
> passing the `--expire-to=<dir>` parameter through to `git repack`,
> making it easier for users to set up a backup location for cruft
> packs that will be pruned.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
> Documentation/git-gc.txt | 6 ++++++
> builtin/gc.c | 6 +++++-
> t/t6500-gc.sh | 6 ++++++
> 3 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 370e22faaeb..b4c0cf02972 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -69,6 +69,12 @@ be performed as well.
> the `--max-cruft-size` option of linkgit:git-repack[1] for
> more.
>
> +--expire-to=<dir>::
> + When packing unreachable objects into a cruft pack, write a cruft
> + pack containing pruned objects (if any) to the directory `<dir>`.
> + See the `--expire-to` option of linkgit:git-repack[1] for
> + more.
> +
> --prune=<date>::
> Prune loose objects older than date (default is 2 weeks ago,
> overridable by the config variable `gc.pruneExpire`).
> diff --git a/builtin/gc.c b/builtin/gc.c
> index d52735354c9..77904694c9f 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -136,6 +136,7 @@ struct gc_config {
> char *prune_worktrees_expire;
> char *repack_filter;
> char *repack_filter_to;
> + char *repack_expire_to;
> unsigned long big_pack_threshold;
> unsigned long max_delta_cache_size;
> };
> @@ -441,6 +442,8 @@ static void add_repack_all_option(struct gc_config *cfg,
> if (cfg->max_cruft_size)
> strvec_pushf(&repack, "--max-cruft-size=%lu",
> cfg->max_cruft_size);
> + if (cfg->repack_expire_to)
> + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
> } else {
> strvec_push(&repack, "-A");
> if (cfg->prune_expire)
> @@ -675,7 +678,6 @@ struct repository *repo UNUSED)
> const char *prune_expire_sentinel = "sentinel";
> const char *prune_expire_arg = prune_expire_sentinel;
> int ret;
> -
> struct option builtin_gc_options[] = {
> OPT__QUIET(&quiet, N_("suppress progress reporting")),
> { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
> @@ -694,6 +696,8 @@ struct repository *repo UNUSED)
> PARSE_OPT_NOCOMPLETE),
> OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
> N_("repack all other packs except the largest pack")),
> + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> + N_("pack prefix to store a pack containing pruned objects")),
> OPT_END()
> };
>
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index ee074b99b70..d4b0653a9b7 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
> test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
> '
>
> +test_expect_success '--expire-to sets appropriate repack options' '
> + mkdir expired &&
> + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
> + test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
> +'
> +
> run_and_wait_for_gc () {
> # We read stdout from gc for the side effect of waiting until the
> # background gc process exits, closing its fd 9. Furthermore, the
> --
> gitgitgadget
>
Hi Jeff King, could you take a look at this patch?
I would be very grateful if you have time!
ZheNing Hu
On the Git mailing list, ZheNing Hu wrote (reply to this):
This patch has been sitting for weeks with no review. Does anyone want
to help take a look?
On the Git mailing list, Jeff King wrote (reply to this):
On Tue, Dec 31, 2024 at 02:18:33AM +0000, ZheNing Hu via GitGitGadget wrote:
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 77904694c9f..8656e1caff0 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -433,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> static void add_repack_all_option(struct gc_config *cfg,
> struct string_list *keep_pack)
> {
> - if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> + if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> + && !(cfg->cruft_packs && cfg->repack_expire_to))
> strvec_push(&repack, "-a");

I expected to see a mention of repack_expire_to here, but not
cfg->cruft_packs. These two are AND-ed together so we are only disabling
"repack -a" when both options ("--expire-to" and "--cruft") are passed.
Can we --expire-to without cruft? I.e., what should happen with:

  git gc --expire-to=some-path --prune=now --no-cruft

Looking at the underlying git-repack, it seems that we only respect
--expire-to at all when used with "--cruft", and don't otherwise
consider it. Which is what the manpage says ("Only useful with --cruft
-d").

But if we look at this proposed patch for example:

  https://lore.kernel.org/git/48438876fb42a889110e100a6c42ca84e93aac49.1733011259.git.me@ttaylorr.com/

then it is expanding how --expire-to is used during the pruning step.
OTOH, I think the way your patch 1 is structured means that we'd always
pass --expire-to to git-repack anyway, and I _think_ even with the patch
linked above that "repack -a -d --expire-to=whatever" would do the right
thing.

In which case the problem really is the combination of cruft packs and
expire-to. Just cruft packs by themselves do not need to override using
"-a" for "--prune=now" because we know that any such cruft pack would be
empty.

So I think this logic is correct. Taylor might have more thoughts,
though (and ideas on whether he intends to revisit that earlier patch).

I do think this change should probably be done as part of patch 1,
rather than introducing a buggy state and then fixing it in patch 2.

-Peff
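
For readers who want to check the condition discussed in this message, here is a minimal sketch in shell. It is not Git code: it only models the `if` condition from the quoted hunk. The function name `uses_repack_a` and its positional arguments are made up for this illustration.

```sh
#!/bin/sh
# Model of the condition in the quoted hunk; not Git code.
# Succeeds (exit 0) when "git gc" would pass "-a" to "git repack".
# $1: prune expiry (e.g. "now")
# $2: 1 if cruft packs are enabled, 0 for --no-cruft
# $3: value given to --expire-to, or "" if the option is absent
uses_repack_a () {
	test "$1" = now && ! { test "$2" = 1 && test -n "$3"; }
}

# Walk the four combinations for --prune=now.
for cruft in 1 0
do
	for expire_to in some-path ""
	do
		if uses_repack_a now "$cruft" "$expire_to"
		then
			mode="-a"
		else
			mode="cruft branch"
		fi
		echo "cruft=$cruft expire-to='$expire_to' -> $mode"
	done
done
```

Under these assumptions only the combination of cruft packs and `--expire-to` leaves the `-a` path. The other three combinations, including `--no-cruft` with `--expire-to`, still use `-a`.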
On the Git mailing list, ZheNing Hu wrote (reply to this):
Jeff King <peff@peff.net> wrote on Mon, Jan 13, 2025 at 17:17:
>
> On Tue, Dec 31, 2024 at 02:18:33AM +0000, ZheNing Hu via GitGitGadget wrote:
>
> > diff --git a/builtin/gc.c b/builtin/gc.c
> > index 77904694c9f..8656e1caff0 100644
> > --- a/builtin/gc.c
> > +++ b/builtin/gc.c
> > @@ -433,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> > static void add_repack_all_option(struct gc_config *cfg,
> > struct string_list *keep_pack)
> > {
> > - if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> > + if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> > + && !(cfg->cruft_packs && cfg->repack_expire_to))
> > strvec_push(&repack, "-a");
>
> I expected to see a mention of repack_expire_to here, but not
> cfg->cruft_packs. These two are AND-ed together so we are only disabling
> "repack -a" when both options ("--expire-to" and "--cruft") are passed.
> Can we --expire-to without cruft? I.e., what should happen with:
>
> git gc --expire-to=some-path --prune=now --no-cruft
>
> Looking at the underlying git-repack, it seems that we only respect
> --expire-to at all when used with "--cruft", and don't otherwise
> consider it. Which is what the manpage says ("Only useful with --cruft
> -d").
>
Yes, this is the current state of git-repack. The --expire-to option can
only be used with --cruft, which is why I check
cruft_packs && repack_expire_to as a double safeguard.
When --no-cruft is used, --expire-to becomes irrelevant, so it seems
reasonable to leave `git gc --prune=now` as it is in that case and keep
passing -a to repack.
> But if we look at this proposed patch for example:
>
> https://lore.kernel.org/git/48438876fb42a889110e100a6c42ca84e93aac49.1733011259.git.me@ttaylorr.com/
>
> then it is expanding how --expire-to is used during the pruning step.
> OTOH, I think the way your patch 1 is structured means that we'd always
> pass --expire-to to git-repack anyway, and I _think_ even with the patch
> linked above that "repack -a -d --expire-to=whatever" would do the right
> thing.
>
I've taken a look at the patch, and I believe Taylor's changes are primarily
aimed at extending the --expire-to functionality within the --cruft feature,
rather than expecting --expire-to to be used on its own.
> In which case the problem really is the combination of cruft packs and
> expire-to. Just cruft packs by themselves do not need to override using
> "-a" for "--prune=now" because we know that any such cruft pack would be
> empty.
>
> So I think this logic is correct. Taylor might have more thoughts,
> though (and ideas on whether he intends to revisit that earlier patch).
>
> I do think this change should probably be done as part of patch 1,
> rather than introducing a buggy state and then fixing it in patch 2.
>
Yes, I agree with that, and perhaps a single patch will suffice.
> -Peff
- ZheNing Hu
5797579 to 0842ec3
/submit
Submitted as pull.1843.v3.git.1736994932003.gitgitgadget@gmail.com
On the Git mailing list, Junio C Hamano wrote (reply to this):
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: ZheNing Hu <adlternative@gmail.com>
>
> This commit extends the functionality of `git gc`
> by adding a new option, `--expire-to=<dir>`. Previously,
> this feature was implemented in `git repack` (see 91badeb),
> allowing users to specify a directory where unreachable and
> expired cruft packs are stored during garbage collection.
> However, users had to run `git repack --cruft --expire-to=<dir>`
> followed by `git prune` to achieve similar results within `git gc`.
>
> By introducing `--expire-to=<dir>` directly into `git gc`,
> we simplify the process for users who wish to manage their
> repository's cleanup more efficiently. This change involves
> passing the `--expire-to=<dir>` parameter through to `git repack`,
> making it easier for users to set up a backup location for cruft
> packs that will be pruned.
Today I do not have enough time to do my usual commit log message
critique. Please use "git show -s --format=reference" when
referring to an earlier commit.
> Note: When git-gc is used with both `--cruft` and `--expire-to`,
> it does not pass `-a` to git-repack to delete all unreachable
> objects as `git gc --prune=now` originally did. Instead, it
> generates a cruft pack in the directory specified by expire-to.
Is this less important than "we added --expire-to to gc that is
passed down to underlying repack" in the previous paragraph?
Not removing the unreachables too early with "repack -a" is an
essential part of the design of this new feature to allow us not to
lose the cruft objects, so I was a bit surprised that this was
described as a "Note:".
> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 370e22faaeb..b4c0cf02972 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -69,6 +69,12 @@ be performed as well.
> the `--max-cruft-size` option of linkgit:git-repack[1] for
> more.
>
> +--expire-to=<dir>::
> + When packing unreachable objects into a cruft pack, write a cruft
> + pack containing pruned objects (if any) to the directory `<dir>`.
> + See the `--expire-to` option of linkgit:git-repack[1] for
> + more.
Does "When packing unreachable objects into a cruft pack" mean that
this option is only meaningful with "--cruft"? As "--cruft" is on
by default, is it an error to pass "--no-cruft" when you use this
option?
"for more" -> "for more information" or something?
> diff --git a/builtin/gc.c b/builtin/gc.c
> index d52735354c9..8656e1caff0 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -136,6 +136,7 @@ struct gc_config {
> char *prune_worktrees_expire;
> char *repack_filter;
> char *repack_filter_to;
> + char *repack_expire_to;
> unsigned long big_pack_threshold;
> unsigned long max_delta_cache_size;
> };
> @@ -432,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> static void add_repack_all_option(struct gc_config *cfg,
> struct string_list *keep_pack)
> {
> - if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> + if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> + && !(cfg->cruft_packs && cfg->repack_expire_to))
> strvec_push(&repack, "-a");
Hmph. When "--expire-to=<there>" is given, we are dropping these
unreachable objects right away, but we said "--no-cruft", then we
say "repack -a". If we have both "--cruft" and "--expire-to=<there>",
then ...
> else if (cfg->cruft_packs) {
> strvec_push(&repack, "--cruft");
> @@ -441,6 +443,8 @@ static void add_repack_all_option(struct gc_config *cfg,
> if (cfg->max_cruft_size)
> strvec_pushf(&repack, "--max-cruft-size=%lu",
> cfg->max_cruft_size);
> + if (cfg->repack_expire_to)
> + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
... we do the usual "repack --cruft --expire-to=<there>" in the next
block.
> @@ -675,7 +679,6 @@ struct repository *repo UNUSED)
> const char *prune_expire_sentinel = "sentinel";
> const char *prune_expire_arg = prune_expire_sentinel;
> int ret;
> -
> struct option builtin_gc_options[] = {
> OPT__QUIET(&quiet, N_("suppress progress reporting")),
> { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
OK.
> @@ -694,6 +697,8 @@ struct repository *repo UNUSED)
> PARSE_OPT_NOCOMPLETE),
> OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
> N_("repack all other packs except the largest pack")),
> + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> + N_("pack prefix to store a pack containing pruned objects")),
> OPT_END()
> };
OK.
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index ee074b99b70..d4b0653a9b7 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
> test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
> '
>
> +test_expect_success '--expire-to sets appropriate repack options' '
> + mkdir expired &&
> + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
> + test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
> +'
As "--cruft" is on by default, the command line does not have to
have it, but being explicit is good.
Should we also see what happens when "--no-cruft" is given?
Thanks.
This patch series was integrated into seen via git@aa1682c.
This branch is now known as
This patch series was integrated into seen via git@9984f53.
This patch series was integrated into seen via git@4ae0c92.
There was a status update in the "New Topics" section about the branch "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". Needs review. source: <pull.1843.v3.git.1736994932003.gitgitgadget@gmail.com>
This patch series was integrated into seen via git@d36a896.
This patch series was integrated into seen via git@c716f47.
This patch series was integrated into seen via git@7f3e7d1.
There was a status update in the "Cooking" section about the branch "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". Needs review. source: <pull.1843.v3.git.1736994932003.gitgitgadget@gmail.com>
On the Git mailing list, ZheNing Hu wrote (reply to this):
Junio C Hamano <gitster@pobox.com> wrote on Fri, Jan 17, 2025 at 02:23:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > This commit extends the functionality of `git gc`
> > by adding a new option, `--expire-to=<dir>`. Previously,
> > this feature was implemented in `git repack` (see 91badeb),
> > allowing users to specify a directory where unreachable and
> > expired cruft packs are stored during garbage collection.
> > However, users had to run `git repack --cruft --expire-to=<dir>`
> > followed by `git prune` to achieve similar results within `git gc`.
> >
> > By introducing `--expire-to=<dir>` directly into `git gc`,
> > we simplify the process for users who wish to manage their
> > repository's cleanup more efficiently. This change involves
> > passing the `--expire-to=<dir>` parameter through to `git repack`,
> > making it easier for users to set up a backup location for cruft
> > packs that will be pruned.
>
> Today I do not have enough time to do my usual commit log message
> critique. Please use "git show -s --format=reference" when
> referring to an earlier commit.
>
Okay, I will change to using this format.
> > Note: When git-gc is used with both `--cruft` and `--expire-to`,
> > it does not pass `-a` to git-repack to delete all unreachable
> > objects as `git gc --prune=now` originally did. Instead, it
> > generates a cruft pack in the directory specified by expire-to.
>
> Is this less important than "we added --expire-to to gc that is
> passed down to underlying repack" in the previous paragraph?
>
I had thought that adding --expire-to to gc was key in this patch,
but the change to the implementation of --prune=now should
indeed be mentioned more.
> Not removing the unreachables too early with "repack -a" is an
> essential part of the design of this new feature to allow us not to
> lose the cruft objects, so I was a bit surprised that this was
> described as a "Note:".
>
You're right. This section shouldn't use a note; it should provide
a more detailed explanation instead.
> > diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> > index 370e22faaeb..b4c0cf02972 100644
> > --- a/Documentation/git-gc.txt
> > +++ b/Documentation/git-gc.txt
> > @@ -69,6 +69,12 @@ be performed as well.
> > the `--max-cruft-size` option of linkgit:git-repack[1] for
> > more.
> >
> > +--expire-to=<dir>::
> > + When packing unreachable objects into a cruft pack, write a cruft
> > + pack containing pruned objects (if any) to the directory `<dir>`.
> > + See the `--expire-to` option of linkgit:git-repack[1] for
> > + more.
>
> Does "When packing unreachable objects into a cruft pack" mean that
> this option is only meaningful with "--cruft"? As "--cruft" is on
> by default, is it an error to pass "--no-cruft" when you use this
> option?
>
It (--expire-to) can currently only be used together with --cruft.
Using --no-cruft together with --expire-to will not result in an error,
but --expire-to will not take effect either.
I should mention in the documentation that --expire-to and --cruft
need to be used together; otherwise --expire-to will not
have any effect.
> "for more" -> "for more information" or something?
>
OK, "for more information".
> > diff --git a/builtin/gc.c b/builtin/gc.c
> > index d52735354c9..8656e1caff0 100644
> > --- a/builtin/gc.c
> > +++ b/builtin/gc.c
> > @@ -136,6 +136,7 @@ struct gc_config {
> > char *prune_worktrees_expire;
> > char *repack_filter;
> > char *repack_filter_to;
> > + char *repack_expire_to;
> > unsigned long big_pack_threshold;
> > unsigned long max_delta_cache_size;
> > };
> > @@ -432,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> > static void add_repack_all_option(struct gc_config *cfg,
> > struct string_list *keep_pack)
> > {
> > - if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> > + if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> > + && !(cfg->cruft_packs && cfg->repack_expire_to))
> > strvec_push(&repack, "-a");
>
> Hmph. When "--expire-to=<there>" is given, we are dropping these
> unreachable objects right away, but we said "--no-cruft", then we
> say "repack -a". If we have both "--cruft" and "--expire-to=<there>",
> then ...
>
> > else if (cfg->cruft_packs) {
> > strvec_push(&repack, "--cruft");
> > @@ -441,6 +443,8 @@ static void add_repack_all_option(struct gc_config *cfg,
> > if (cfg->max_cruft_size)
> > strvec_pushf(&repack, "--max-cruft-size=%lu",
> > cfg->max_cruft_size);
> > + if (cfg->repack_expire_to)
> > + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
>
> ... we do the usual "repack --cruft --expire-to=<there>" in the next
> block.
>
> > @@ -675,7 +679,6 @@ struct repository *repo UNUSED)
> > const char *prune_expire_sentinel = "sentinel";
> > const char *prune_expire_arg = prune_expire_sentinel;
> > int ret;
> > -
> > struct option builtin_gc_options[] = {
> > OPT__QUIET(&quiet, N_("suppress progress reporting")),
> > { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
>
> OK.
>
> > @@ -694,6 +697,8 @@ struct repository *repo UNUSED)
> > PARSE_OPT_NOCOMPLETE),
> > OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
> > N_("repack all other packs except the largest pack")),
> > + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> > + N_("pack prefix to store a pack containing pruned objects")),
> > OPT_END()
> > };
>
> OK.
>
> > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> > index ee074b99b70..d4b0653a9b7 100755
> > --- a/t/t6500-gc.sh
> > +++ b/t/t6500-gc.sh
> > @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
> > test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
> > '
> >
> > +test_expect_success '--expire-to sets appropriate repack options' '
> > + mkdir expired &&
> > + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
> > + test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
> > +'
>
> As "--cruft" is on by default, the command line does not have to
> have it, but being explicit is good.
>
> Should we also see what happens when "--no-cruft" is given?
>
--expire-to with --no-cruft will still run repack -a; I will add
corresponding tests.
> Thanks.
Thanks.
This patch series was integrated into seen via git@77d4d83.
This commit extends the functionality of `git gc` by adding a new
option, `--expire-to=<dir>`. Previously, this feature was implemented
in 91badeb (builtin/repack.c: implement `--expire-to` for storing
pruned objects, 2022-10-24), which allows users to specify a directory
where unreachable and expired cruft packs are stored during garbage
collection. However, users had to run
`git repack --cruft --expire-to=<dir>` followed by `git prune` to
achieve results similar to `git gc`.

By introducing `--expire-to=<dir>` directly into `git gc`, we simplify
the process for users who wish to manage their repository's cleanup
more efficiently. This change involves passing the `--expire-to=<dir>`
parameter through to `git repack`, making it easier for users to set up
a backup location for cruft packs that will be pruned.

The original `git gc --prune=now` deletes all unreachable objects by
passing the `-a` parameter to `git repack`. With the addition of the
`--cruft` and `--expire-to` options, this default behavior has to
change: instead of being deleted, the unreachable objects should be
merged into a cruft pack and collected in the specified directory.
Therefore, in that case we do not pass `-a` to the repack command but
instead pass `--cruft`, `--expire-to`, and `--cruft-expiration=now`.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
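
The option selection this commit message describes can be modelled outside Git. This is a rough sketch, assuming the branch structure shown in the quoted `add_repack_all_option()` hunks. The function name `repack_all_options` is hypothetical. `--max-cruft-size` and whatever else the real code appends in the `-A` branch are deliberately left out. The `--cruft-expiration` option is taken from the wording of the commit message rather than from a quoted diff line.

```sh
#!/bin/sh
# Simplified model, not Git code: prints the options "git gc" would
# add to the "git repack" command line for a given configuration.
# $1: prune expiry ("now", another date, or "" if unset)
# $2: 1 if cruft packs are enabled, 0 for --no-cruft
# $3: value given to --expire-to, or "" if the option is absent
repack_all_options () {
	prune_expire=$1 cruft_packs=$2 expire_to=$3

	if test "$prune_expire" = now &&
	   ! { test "$cruft_packs" = 1 && test -n "$expire_to"; }
	then
		# Unreachable objects are dropped right away.
		echo "-a"
	elif test "$cruft_packs" = 1
	then
		opts="--cruft"
		if test -n "$prune_expire"
		then
			opts="$opts --cruft-expiration=$prune_expire"
		fi
		if test -n "$expire_to"
		then
			opts="$opts --expire-to=$expire_to"
		fi
		echo "$opts"
	else
		# Further options of the real code are omitted here.
		echo "-A"
	fi
}

repack_all_options now 1 ./expired/pack
repack_all_options now 0 ./expired/pack
```

In this model the first call prints `--cruft --cruft-expiration=now --expire-to=./expired/pack`, while the second, standing in for `--no-cruft`, prints `-a`.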
0842ec3 to 6946ccd
/submit
Submitted as pull.1843.v4.git.1737704954987.gitgitgadget@gmail.com
This patch series was integrated into seen via git@099b60c.
I want to perform a "safe" garbage collection for the Git repository
on the server, which avoids data corruption issues caused by
concurrent pushes during git gc. To achieve this, I currently need to
use `git repack --cruft --expire-to=<dir>` and `git prune` in
combination. However, it would be simpler if we could directly use
`--expire-to=<dir>` with the git-gc command.

v1: add --expire-to option to gc
v1 -> v2: fix git gc --prune=now with --expire-to
v2 -> v3: squash the two patches into one
v3 -> v4: modify docs and commit message, and add more tests
cc: gitster@pobox.com
cc: me@ttaylorr.com
cc: peff@peff.net