-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
diff-tree: fix crash when used with --remerge-diff #1771
diff-tree: fix crash when used with --remerge-diff #1771
Conversation
/preview |
Preview email sent as pull.1771.git.1723122774848.gitgitgadget@gmail.com |
/submit |
Submitted as pull.1771.git.1723123250958.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Elijah Newren wrote (reply to this): On Thu, Aug 8, 2024 at 6:20 AM blanet via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Xing Xin <xingxin.xx@bytedance.com>
>
> When using "git-diff-tree" to get the tree diff for merge commits with
> the diff format set to `remerge`, a bug is triggered as shown below:
>
> $ git diff-tree -r --remerge-diff 363337e6eb
> 363337e6eb812d0c0d785ed4261544f35559ff8b
> BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!?
Wow, this bug is around for 2.5 years, and then we both independently
notice and fix it within 3 weeks of each other:
https://github.com/git/git/commit/e5890667c7598e813edee0ac4e76d6e3cdd525ec
My patch is incomplete as it's missing a testcase, and you submitted
first, so let's stick with your fix, though.
> This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check
> added in commit 7b90ab467a (log: clean unneeded objects during log
> --remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when
> attempting to clean up temporary objects generated during the remerge
> process.
>
> After some further digging, I find that the remerge-related diff options
> were introduced in db757e8b8d (show, log: provide a --remerge-diff
> capability, 2022-02-02), which also affect the setup of `rev_info` for
> "git-diff-tree", but were not accounted for in the original
> implementation (inferred from the commit message).
>
> This commit fixes the bug by adding initialization logic for
> `remerge_objdir` in `builtin/diff-tree.c`, mirroring the logic in
> `builtin/log.c:cmd_log_walk_no_free`. A final cleanup for
> `remerge_objdir` is also included.
The commit message from my patch also included an explanation for why
diff-tree was the only caller that was missing the necessary logic
(see the last paragraph, which kind of references the one before it as
well).
> Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
> ---
> diff-tree: fix crash when used with --remerge-diff
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1771%2Fblanet%2Fxx%2Ffix-diff-tree-crash-on-remerge-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1771/blanet/xx/fix-diff-tree-crash-on-remerge-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1771
>
> builtin/diff-tree.c | 13 +++++++++++++
> t/t4069-remerge-diff.sh | 35 +++++++++++++++++++++++++++++++++++
> 2 files changed, 48 insertions(+)
>
> diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
> index 0d3c611aac0..813be486dad 100644
> --- a/builtin/diff-tree.c
> +++ b/builtin/diff-tree.c
> @@ -9,6 +9,7 @@
> #include "repository.h"
> #include "revision.h"
> #include "tree.h"
> +#include "tmp-objdir.h"
The includes other than this one are in alphabetical order; can you
move this a line before?
Also, as an aside, folks in this project often just put includes at
the end, but I think it's a bad practice. Whenever someone needs to
backport fixes or merge separate patch topics into seen/next/etc. or
even merge not-yet-upstream topics with newer upstream versions, this
practice increases the odds of unnecessary conflicts. And it makes it
harder for the next person who comes along to spot whether a header is
already included (and sometimes leaves us including headers twice).
While each case is a small amount of toil so we tend to overlook it,
it's totally unnecessary toil in many cases. Putting includes in
alphabetical order (other than the one include required to be first,
git-compat-util.h or its documented stand-ins) can often remove this
unnecessary toil. Anyway, thanks for letting me vent. :-)
> static struct rev_info log_tree_opt;
>
> @@ -166,6 +167,13 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
>
> opt->diffopt.rotate_to_strict = 1;
>
> + if (opt->remerge_diff) {
> + opt->remerge_objdir = tmp_objdir_create("remerge-diff");
> + if (!opt->remerge_objdir)
> + die(_("unable to create temporary object directory"));
> + tmp_objdir_replace_primary_odb(opt->remerge_objdir, 1);
> + }
> +
> /*
> * NOTE! We expect "a..b" to expand to "^a b" but it is
> * perfectly valid for revision range parser to yield "b ^a",
> @@ -230,5 +238,10 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
> diff_free(&opt->diffopt);
> }
>
> + if (opt->remerge_diff) {
> + tmp_objdir_destroy(opt->remerge_objdir);
> + opt->remerge_objdir = NULL;
> + }
> +
> return diff_result_code(&opt->diffopt);
> }
Your fix exactly matches mine, other than the header include location.
> diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
> index 07323ebafe0..ca8f999caba 100755
> --- a/t/t4069-remerge-diff.sh
> +++ b/t/t4069-remerge-diff.sh
> @@ -110,6 +110,41 @@ test_expect_success 'can filter out additional headers with pickaxe' '
> test_must_be_empty actual
> '
>
> +test_expect_success 'remerge-diff also works for git-diff-tree' '
> + # With a clean merge
> + git diff-tree -r -p --remerge-diff --no-commit-id bc_resolution >actual &&
> + test_must_be_empty actual &&
> +
> + # With both a resolved conflict and an unrelated change
> + cat <<-EOF >tmp &&
> + diff --git a/numbers b/numbers
> + remerge CONFLICT (content): Merge conflict in numbers
> + index a1fb731..6875544 100644
> + --- a/numbers
> + +++ b/numbers
> + @@ -1,13 +1,9 @@
> + 1
> + 2
> + -<<<<<<< b0ed5cb (change_a)
> + -three
> + -=======
> + -tres
> + ->>>>>>> 6cd3f82 (change_b)
> + +drei
> + 4
> + 5
> + 6
> + 7
> + -eight
> + +acht
> + 9
> + EOF
> + sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
> + git diff-tree -r -p --remerge-diff --no-commit-id ab_resolution >tmp &&
> + sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
> + test_cmp expect actual
> +'
> +
> test_expect_success 'setup non-content conflicts' '
> git switch --orphan base &&
Test looks good too.
I'll be happy to add my Reviewed-by if you fix the header include order. |
On the Git mailing list, Junio C Hamano wrote (reply to this): Elijah Newren <newren@gmail.com> writes:
> The commit message from my patch also included an explanation for why
> diff-tree was the only caller that was missing the necessary logic
> (see the last paragraph, which kind of references the one before it as
> well).
... which we may want to resurrect.
> Test looks good too.
>
> I'll be happy to add my Reviewed-by if you fix the header include order.
Thanks for a review. |
f0b86fa
to
02e63b0
Compare
When using "git-diff-tree" to get the tree diff for merge commits with the diff format set to `remerge`, a bug is triggered as shown below: $ git diff-tree -r --remerge-diff 363337e 363337e BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!? This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check added in commit 7b90ab4 (log: clean unneeded objects during log --remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when attempting to clean up temporary objects generated during the remerge process. After some further digging, I find that the remerge-related diff options were introduced in db757e8 (show, log: provide a --remerge-diff capability, 2022-02-02), which also affect the setup of `rev_info` for "git-diff-tree", but were not accounted for in the original implementation (inferred from the commit message). Elijah Newren, the author of the remerge diff feature, notes that other callers of `log-tree.c:log_tree_commit` (the only caller of `log-tree.c:do_remerge_diff`) also exist, but: `builtin/am.c`: manually sets all flags; remerge_diff is not among them `sequencer.c`: manually sets all flags; remerge_diff is not among them so `builtin/diff-tree.c` really is the only caller that was overlooked when remerge-diff functionality was added. This commit resolves the crash by adding `remerge_objdir` setup logic to `builtin/diff-tree.c`, mirroring `builtin/log.c:cmd_log_walk_no_free`. It also includes the necessary cleanup for `remerge_objdir`. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
02e63b0
to
57f0b12
Compare
/submit |
Submitted as pull.1771.v2.git.1723188292498.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, "Xing Xin" wrote (reply to this): At 2024-08-09 00:03:53, "Elijah Newren" <newren@gmail.com> wrote:
>On Thu, Aug 8, 2024 at 6:20 AM blanet via GitGitGadget
><gitgitgadget@gmail.com> wrote:
>>
>> From: Xing Xin <xingxin.xx@bytedance.com>
>>
>> When using "git-diff-tree" to get the tree diff for merge commits with
>> the diff format set to `remerge`, a bug is triggered as shown below:
>>
>> $ git diff-tree -r --remerge-diff 363337e6eb
>> 363337e6eb812d0c0d785ed4261544f35559ff8b
>> BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!?
>
>Wow, this bug is around for 2.5 years, and then we both independently
>notice and fix it within 3 weeks of each other:
>https://github.com/git/git/commit/e5890667c7598e813edee0ac4e76d6e3cdd525ec
>
>My patch is incomplete as it's missing a testcase, and you submitted
>first, so let's stick with your fix, though.
Wow, such an interesting coincidence! And thanks for your quick reply!
>> This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check
>> added in commit 7b90ab467a (log: clean unneeded objects during log
>> --remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when
>> attempting to clean up temporary objects generated during the remerge
>> process.
>>
>> After some further digging, I find that the remerge-related diff options
>> were introduced in db757e8b8d (show, log: provide a --remerge-diff
>> capability, 2022-02-02), which also affect the setup of `rev_info` for
>> "git-diff-tree", but were not accounted for in the original
>> implementation (inferred from the commit message).
>>
>> This commit fixes the bug by adding initialization logic for
>> `remerge_objdir` in `builtin/diff-tree.c`, mirroring the logic in
>> `builtin/log.c:cmd_log_walk_no_free`. A final cleanup for
>> `remerge_objdir` is also included.
>
>The commit message from my patch also included an explanation for why
>diff-tree was the only caller that was missing the necessary logic
>(see the last paragraph, which kind of references the one before it as
>well).
Your explanations better illustrate the impact of this bug, I've quoted them
in the new patch commit message.
[snip]
>> diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
>> index 0d3c611aac0..813be486dad 100644
>> --- a/builtin/diff-tree.c
>> +++ b/builtin/diff-tree.c
>> @@ -9,6 +9,7 @@
>> #include "repository.h"
>> #include "revision.h"
>> #include "tree.h"
>> +#include "tmp-objdir.h"
>
>The includes other than this one are in alphabetical order; can you
>move this a line before?
>
>Also, as an aside, folks in this project often just put includes at
>the end, but I think it's a bad practice. Whenever someone needs to
>backport fixes or merge separate patch topics into seen/next/etc. or
>even merge not-yet-upstream topics with newer upstream versions, this
>practice increases the odds of unnecessary conflicts. And it makes it
>harder for the next person who comes along to spot whether a header is
>already included (and sometimes leaves us including headers twice).
>While each case is a small amount of toil so we tend to overlook it,
>it's totally unnecessary toil in many cases. Putting includes in
>alphabetical order (other than the one include required to be first,
>git-compat-util.h or its documented stand-ins) can often remove this
>unnecessary toil. Anyway, thanks for letting me vent. :-)
Noted. I'll move the new include to the correct position. Thanks for the
guidance!
>> static struct rev_info log_tree_opt;
>>
>> @@ -166,6 +167,13 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
>>
>> opt->diffopt.rotate_to_strict = 1;
>>
>> + if (opt->remerge_diff) {
>> + opt->remerge_objdir = tmp_objdir_create("remerge-diff");
>> + if (!opt->remerge_objdir)
>> + die(_("unable to create temporary object directory"));
>> + tmp_objdir_replace_primary_odb(opt->remerge_objdir, 1);
>> + }
>> +
>> /*
>> * NOTE! We expect "a..b" to expand to "^a b" but it is
>> * perfectly valid for revision range parser to yield "b ^a",
>> @@ -230,5 +238,10 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
>> diff_free(&opt->diffopt);
>> }
>>
>> + if (opt->remerge_diff) {
>> + tmp_objdir_destroy(opt->remerge_objdir);
>> + opt->remerge_objdir = NULL;
>> + }
>> +
>> return diff_result_code(&opt->diffopt);
>> }
>
>Your fix exactly matches mine, other than the header include location.
High five! :-)
[snip]
Xing Xin |
On the Git mailing list, Junio C Hamano wrote (reply to this): "blanet via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Elijah Newren, the author of the remerge diff feature, notes that other
> callers of `log-tree.c:log_tree_commit` (the only caller of
> `log-tree.c:do_remerge_diff`) also exist, but:
>
> `builtin/am.c`: manually sets all flags; remerge_diff is not among them
> `sequencer.c`: manually sets all flags; remerge_diff is not among them
>
> so `builtin/diff-tree.c` really is the only caller that was overlooked
> when remerge-diff functionality was added.
That is more than OK as a band-aid, and I'll take the patch as-is,
but I have to wonder if we do even better in a future follow-up
patch.
Any time do_remerge_diff() is entered, we know that either the end
user (from the command line) or the hard-coded caller (like
am/sequencer cited above) wants us to do the remerge-diff, which in
turn requires us to have the temporary object directory rotated into
the status of the primary object store. And there is nothing in
that object directory rotation code that requires caller-specific
customization---it is the same "create remerge-diff directory as
tmp-objdir, rotate it into the alt object store chain as the
primary" regardless of the actual caller).
So wouldn't it work well if we
(1) at the beginning of do_remerge_diff(), only once for a rev_info
structure:
(1-a) lazily do the "object directory rotation"
(1-b) set up an atexit handler to clear the temporary object
store
(2) remove all the "ah, we need to prepare and tear down the
temporary object store for _this_ operation" we have sprinkled
in different code paths (including the one added by the fix we
are looking at).
That way, we won't have to worry about adding future remerge_diff
users, including existing hard-coded callers.
ANyway, thanks for the fix. It is very pleasing to see contributors
working well together. |
On the Git mailing list, Junio C Hamano wrote (reply to this): Junio C Hamano <gitster@pobox.com> writes:
> but I have to wonder if we do even better in a future follow-up
> patch.
"if we do" -> "if we can do".
> So wouldn't it work well if we
>
> (1) at the beginning of do_remerge_diff(), only once for a rev_info
> structure:
> (1-a) lazily do the "object directory rotation"
> (1-b) set up an atexit handler to clear the temporary object
> store
An atexit handler may not be enough, when a program wants to start
creating a real object after we did a remerge-diff but before
exiting. So we'd probably need to allow an explicit "ok, we are
done" clean-up call for such a program, too.
And the atexit handler can call the same clean-up function if the
program hasn't called it explicitly. For logically read-only
operations like diff-tree, they do not have to worry about rotating
the real object store back to the primary status as soon as
possible.
|
This patch series was integrated into seen via git@b52560d. |
This patch series was integrated into next via git@cabe67c. |
This branch is now known as |
This patch series was integrated into seen via git@17311b7. |
There was a status update in the "New Topics" section about the branch "git rev-list ... | git diff-tree -p --remerge-diff --stdin" should behave more or less like "git log -p --remerge-diff" but instead it crashed, forgetting to prepare a temporary object store needed. Will merge to 'master'. source: <pull.1771.v2.git.1723188292498.gitgitgadget@gmail.com> |
There was a status update in the "Cooking" section about the branch "git rev-list ... | git diff-tree -p --remerge-diff --stdin" should behave more or less like "git log -p --remerge-diff" but instead it crashed, forgetting to prepare a temporary object store needed. Will merge to 'master'. source: <pull.1771.v2.git.1723188292498.gitgitgadget@gmail.com> |
There was a status update in the "Graduated to 'master'" section about the branch "git rev-list ... | git diff-tree -p --remerge-diff --stdin" should behave more or less like "git log -p --remerge-diff" but instead it crashed, forgetting to prepare a temporary object store needed. source: <pull.1771.v2.git.1723188292498.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@0da7673. |
This patch series was integrated into master via git@0da7673. |
This patch series was integrated into next via git@0da7673. |
Closed via 0da7673. |
cc: Elijah Newren newren@gmail.com