Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functionality to detect build 'conflicts' while merging #64

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
125 changes: 121 additions & 4 deletions git-imerge
Original file line number Diff line number Diff line change
Expand Up @@ -553,6 +553,14 @@ class AutomaticMergeFailed(Exception):
self.commit1, self.commit2 = commit1, commit2


class AutomaticBuildFailed(Exception):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be appropriate for AutomaticMergeFailed and AutomaticBuildFailed to inherit from a common base class, as suggested here. After all, they both represent a failure of an automatic merge, albeit for different reasons. The base class could also manage the commit arguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

def __init__(self, commit1, commit2):
Exception.__init__(
self, 'Automatic build of %s and %s failed' % (commit1, commit2,)
)
self.commit1, self.commit2 = commit1, commit2


def automerge(commit1, commit2, msg=None):
"""Attempt an automatic merge of commit1 and commit2.

Expand All @@ -572,8 +580,18 @@ def automerge(commit1, commit2, msg=None):
# added in git version 1.7.4.
call_silently(['git', 'reset', '--merge'])
raise AutomaticMergeFailed(commit1, commit2)
else:
return get_commit_sha1('HEAD')

build_script = MergeState.get_default_test_command()
if build_script:
try:
check_call(['/bin/sh', build_script])
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way you are invoking the test is unnecessarily restrictive. It requires the test to be embodied in a single shell script, which in many cases the user would have to write just for this purpose.

If instead you would invoke it using

check_call(['/bin/sh', '-c', build_script])

then build_script could be an arbitrary shell command (or even sequence of commands). Then the user could specifiy something like

git imerge --build-command='make' [...]

or even

git imerge --build-command='make test' [...]

Please note that git bisect --run does things differently. It treats all of the git bisect arguments following the --run option as individual words of the command that it will run. This is more flexible still, and maybe a bit less confusing with respect to shell quoting. But it makes it harder to parse the git bisect command itself, because the command to be invoked is "the rest of the line", so no other options or arguments can follow it on the command line. Moreover, we will need to serialize the command somewhere, and it is easier to serialize a string than a list of strings. So I think git bisect's approach is not necessarily a model that git imerge should follow.

By the way, deviation from git bisect's style of handling the command is a good reason that our option shouldn't be called --run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

except CalledProcessError as e:
if e.returncode == 125:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. So "the source at this commit cannot be built or cannot be tested" is treated as success as far as git imerge is concerned? This is different than git bisect: there, return value 125 is handled as a third, "indeterminate" state and it is treated differently in subsequent logic.

We could also handle this return code specially during bisection. It could be treated as "something is fishy. Don't treat it as a failure, as long as the status of other tests are definitive". So if an earlier merge is found to FAIL definitively, then we would continue to the left of that merge. And if a later merge is found to SUCCEED definitively, then we would continue to the right of that merge. And if an indeterminate merge is found at the boundary between SUCCEED and FAIL, then we would treat the indeterminate status the same as a FAIL.

But without some kind of special handling, I don't see the value of treating return code 125 specially. The test script could just as well return 0 because it would have the same end result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's treated as a success... I guess it was a placeholder for doing the right thing.

I agree with the middle paragraph. I'll see if acting on an indeterminate state during the bisect stage is possible.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely something that doesn't have to be implemented in the first version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed for another version

return get_commit_sha1('HEAD')
else:
raise AutomaticBuildFailed(commit1, commit2)

return get_commit_sha1('HEAD')


class MergeRecord(object):
Expand Down Expand Up @@ -1474,11 +1492,15 @@ class Block(object):
'Attempting automerge of %d-%d...' % self.get_original_indexes(i1, i2)
)
try:
print("Automerging from is_mergeable")
automerge(self[i1, 0].sha1, self[0, i2].sha1)
sys.stderr.write('success.\n')
return True
except AutomaticMergeFailed:
sys.stderr.write('failure.\n')
sys.stderr.write('Merge failure.\n')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old message was lower case because it was a continuation of the line started above, like:

Attempting automerge of 3-12...failure.

I think lower-case looks better here because it is punctuated like a single sentence. The same comment applies a few lines down, too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return False
except AutomaticBuildFailed:
sys.stderr.write('Build failure.\n')
return False

def auto_outline(self):
Expand All @@ -1497,6 +1519,7 @@ class Block(object):
sys.stderr.write(msg % (i1orig, i2orig))
logmsg = 'imerge \'%s\': automatic merge %d-%d' % (self.name, i1orig, i2orig)
try:
print("Automerging from auto_outline")
merge = automerge(commit1, commit2, msg=logmsg)
sys.stderr.write('success.\n')
except AutomaticMergeFailed as e:
Expand Down Expand Up @@ -1571,6 +1594,7 @@ class Block(object):
sys.stderr.write('Attempting to merge %d-%d...' % (i1orig, i2orig))
logmsg = 'imerge \'%s\': automatic merge %d-%d' % (self.name, i1orig, i2orig)
try:
print("Automerging from auto_fill_micromerge")
merge = automerge(
self[i1, i2 - 1].sha1,
self[i1 - 1, i2].sha1,
Expand Down Expand Up @@ -1778,6 +1802,8 @@ class MergeState(Block):
re.VERBOSE,
)

DEFAULT_BUILD_COMMAND = None
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually ALL_CAPS indicates a constant whose value won't change over time. But this one is initialized later, so it should probably be written in lower-case. I'm also not sure why you call it default_build_command as opposed to build_command (or test_command). In what sense is it a "default"?


@staticmethod
def iter_existing_names():
"""Iterate over the names of existing MergeStates in this repo."""
Expand Down Expand Up @@ -1860,6 +1886,38 @@ class MergeState(Block):
except CalledProcessError:
return None

@staticmethod
def set_default_test_command(name):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this method also set/unset DEFAULT_BUILD_COMMAND?

"""Set the default test command to the specified one.

name can be None to cause the default to be cleared."""

if name is None:
try:
check_call(['git', 'config', '--unset', 'imerge.command'])
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be --unset-all, though I see I've made the same mistake elsewhere in the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, and fixed the other place where this was used. I actually think this helps with some confusion that I was having with set_default_name.

except CalledProcessError as e:
if e.returncode == 5:
# Value was not set
pass
else:
raise
else:
check_call(['git', 'config', 'imerge.command', name])
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name of this config variable should be more reminiscent of the name of the option; maybe buildcommand (or testcommand).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


@staticmethod
def get_default_test_command():
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, why is "default" in the name? But I like that the name includes "test" rather than "build" :trollface:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I was following get_default_name() for getting the imerge name. There doesn't seem to be any calls to 'git config' that are not in functions with "default" in the name. How do you record different imerge names for multiple imerges?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing that imerge.default is used for is to record the name of the incremental merge that is "active"; that is, if the user runs a git imerge without the --name parameter, then this is the name of the incremental merge that is affected by the command. So there is only one imerge.default setting at any time, not one per imerge.

Otherwise, nothing has been stored in the configuration for imerges; instead, the rest of the state was recorded in the object database as commits and blobs and references. The main information for an imerge is stored in a blob referred to by refs/imerge/$NAME/state.

If you want to store a command in the configuration as a string for the imerge named name, you need to do something like

check_call(['git', 'config', 'imerge.%s.command' % (name,), command])

and to retrieve it again,

command = check_output(['git', 'config', 'imerge.%s.command' % (name,)]).rstrip()

Hope that helps!

"""Get the name of the test command, or None if none is currently set."""

if MergeState.DEFAULT_BUILD_COMMAND:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please compare explicitly against None instead of testing its boolean value:

if MergeState.DEFAULT_BUILD_COMMAND is not None:

Also, you can be more idiomatic (and save a line or two of code) by inverting the if statement:

if MergeState.DEFAULT_BUILD_COMMAND is None:
    try:
        MergeState.DEFAULT_BUILD_COMMAND = \
            check_output(['git', 'config', 'imerge.command']).rstrip()
        return MergeState.DEFAULT_BUILD_COMMAND
    except CalledProcessError:
        return None

return MergeState.DEFAULT_BUILD_COMMAND

Finally, please note that in the (common) case that there is no such config setting, you will call the same git config command every time this method is called. You can avoid that by distinguishing somehow between "not yet initialized" and "no command configured".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid the case of calling git config every time by using the static variable DEFAULT_BUILD_COMMAND. The variable is initialized to None and assigned a non None value by check_output.

My confusing is coming from 'set_default_name'. I was attempting to set the test command in git config the same way the imerge name is set. I couldn't figure out the right way to set the test command to be anything other than a default value. How are other values of the imerge name set in the git config file?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment above where I show how to store the command in the git config.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid the case of calling git config every time by using the static variable DEFAULT_BUILD_COMMAND. The variable is initialized to None and assigned a non None value by check_output.

The way I read the code is that if the configuration value is unset, then git config fails, and the exception handler in get_default_test_command() causes it to return None. So DEFAULT_BUILD_COMMAND is set to None again, and every time it is queried the whole sequence is repeated.

I think you need a way to distinguish "command hasn't been checked yet; we don't know what it is" from "command has been checked but it was not set"; either an additional boolean variable that keeps track of whether you have tried to read the command already, or maybe store another value (False?) to the variable to distinguish it from None.

return MergeState.DEFAULT_BUILD_COMMAND
else:
try:
MergeState.DEFAULT_BUILD_COMMAND = \
check_output(['git', 'config', 'imerge.command']).rstrip()
return MergeState.DEFAULT_BUILD_COMMAND
except CalledProcessError:
return None

@staticmethod
def _check_no_merges(commits):
multiparent_commits = [
Expand Down Expand Up @@ -1891,7 +1949,7 @@ class MergeState(Block):
name, merge_base,
tip1, commits1,
tip2, commits2,
goal=DEFAULT_GOAL, manual=False, branch=None,
goal=DEFAULT_GOAL, manual=False, branch=None, build_command=None,
):
"""Create and return a new MergeState object."""

Expand All @@ -1915,6 +1973,7 @@ class MergeState(Block):
goal=goal,
manual=manual,
branch=branch,
build_command=build_command,
)

@staticmethod
Expand Down Expand Up @@ -2109,13 +2168,15 @@ class MergeState(Block):
goal=DEFAULT_GOAL,
manual=False,
branch=None,
build_command=None,
):
Block.__init__(self, name, len(commits1) + 1, len(commits2) + 1)
self.tip1 = tip1
self.tip2 = tip2
self.goal = goal
self.manual = bool(manual)
self.branch = branch or name
self.build_command = build_command
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where this attribute is ever used.

...which probably goes some way towards answering why the class field has "default" in its name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this (although I feel like it might have to be re-added and the DEFAULT_TEST_COMMAND variable removed)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my feeling, too.


# A simulated 2D array. Values are None or MergeRecord instances.
self._data = [[None] * self.len2 for i1 in range(self.len1)]
Expand Down Expand Up @@ -2675,6 +2736,18 @@ def main(args):
action='store', default=None,
help='the name of the branch to which the result will be stored',
)
subparser.add_argument(
'--build-command',
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of duplication here among subparser options. It would be great to have a function to add this option. The same function could also handle other options that are common to the subcommands that initiate incremental merges, like --manual.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's what I thought too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fallen in love with the really awesome 'docopt' package for help messages and options. I understand that we don't want to add any dependencies however.

Added a function for this.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docopt looks interesting, but you're right that I'd prefer not to add another dependency.

I do like about argparse that it validates the arguments and converts them into the types that you want. It appears that docopt returns all arguments as strings, which I consider a disadvantage.

action='store', default=None,
help=(
'in addition to identifying for textual conflicts, run the build '
'or test script specified by TEST to identify where logical '
'conflicts are introduced. The test script is expected to return 0 '
'if the source is good, exit with code 1-127 if the source is bad, '
'except for exit code 125 which indicates the source code can not '
'be built or tested.'
),
)
subparser.add_argument(
'--manual',
action='store_true', default=False,
Expand Down Expand Up @@ -2716,6 +2789,18 @@ def main(args):
action='store', default=None,
help='the name of the branch to which the result will be stored',
)
subparser.add_argument(
'--build-command',
action='store', default=None,
help=(
'in addition to identifying for textual conflicts, run the build '
'or test script specified by TEST to identify where logical '
'conflicts are introduced. The test script is expected to return 0 '
'if the source is good, exit with code 1-127 if the source is bad, '
'except for exit code 125 which indicates the source code can not '
'be built or tested.'
),
)
subparser.add_argument(
'--manual',
action='store_true', default=False,
Expand Down Expand Up @@ -2754,6 +2839,18 @@ def main(args):
action='store', default=None,
help='the name of the branch to which the result will be stored',
)
subparser.add_argument(
'--build-command',
action='store', default=None,
help=(
'in addition to identifying for textual conflicts, run the build '
'or test script specified by TEST to identify where logical '
'conflicts are introduced. The test script is expected to return 0 '
'if the source is good, exit with code 1-127 if the source is bad, '
'except for exit code 125 which indicates the source code can not '
'be built or tested.'
),
)
subparser.add_argument(
'--manual',
action='store_true', default=False,
Expand Down Expand Up @@ -2894,6 +2991,18 @@ def main(args):
action='store', default=None,
help='the name of the branch to which the result will be stored',
)
subparser.add_argument(
'--build-command',
action='store', default=None,
help=(
'in addition to identifying for textual conflicts, run the build '
'or test script specified by TEST to identify where logical '
'conflicts are introduced. The test script is expected to return 0 '
'if the source is good, exit with code 1-127 if the source is bad, '
'except for exit code 125 which indicates the source code can not '
'be built or tested.'
),
)
subparser.add_argument(
'--manual',
action='store_true', default=False,
Expand Down Expand Up @@ -3040,9 +3149,11 @@ def main(args):
tip2, commits2,
goal=options.goal, manual=options.manual,
branch=(options.branch or options.name),
build_command=options.build_command,
)
merge_state.save()
MergeState.set_default_name(options.name)
MergeState.set_default_test_command(options.build_command)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there is only one build command shared by all git imerges (yes, believe it or not, I have had multiple imerges going on in parallel in a single repo!). And yet by default the value is not carried over from one completed imerge to a newly-started one, but nor is the old value deleted when a new imerge is finished. It is a bit inconsistent.

It seem to me that there are two possible self-consistent policies:

  • Each imerge has its own test command, maybe stored in imerge.<name>.testcommand. Possibly there is also a default value for when the user doesn't specify a command explicitly, maybe stored under imerge.defaulttestcommand. When an imerge is finished, its imerge.<name>.testcommand setting should be erased.
  • All imerges use the same command, all the time. Then it could be stored in imerge.testcommand, but then it should be carried over from one run to the next (i.e., if the user starts git imerge without setting a new command, then the old command should continue to be used).

I prefer the former.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the former makes sense. I'll see what I can do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your above comment true for the imerge name? Where is the git config value set for different imerge branches? This will help me understand how to accomplish this for different test commands.

elif options.subcommand == 'start':
require_clean_work_tree('proceed')

Expand All @@ -3068,9 +3179,11 @@ def main(args):
tip2, commits2,
goal=options.goal, manual=options.manual,
branch=(options.branch or options.name),
build_command=options.build_command,
)
merge_state.save()
MergeState.set_default_name(options.name)
MergeState.set_default_test_command(options.build_command)

try:
merge_state.auto_complete_frontier()
Expand Down Expand Up @@ -3126,9 +3239,11 @@ def main(args):
tip2, commits2,
goal=options.goal, manual=options.manual,
branch=options.branch,
build_command=options.build_command,
)
merge_state.save()
MergeState.set_default_name(name)
MergeState.set_default_test_command(options.build_command)

try:
merge_state.auto_complete_frontier()
Expand Down Expand Up @@ -3188,9 +3303,11 @@ def main(args):
tip2, commits2,
goal=options.goal, manual=options.manual,
branch=options.branch,
build_command=options.build_command,
)
merge_state.save()
MergeState.set_default_name(options.name)
MergeState.set_default_test_command(options.build_command)

try:
merge_state.auto_complete_frontier()
Expand Down
51 changes: 51 additions & 0 deletions t/create-test-repo
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
#! /bin/sh

set -e
set -x

DESCRIPTION="git-imerge test repository"

Expand All @@ -11,6 +12,20 @@ modify() {
git add "$filename"
}

modify_buildable() {
filename="$1"
text="$2"
cat > $filename << EOF
#include<stdio.h>

int main()
{
$text
}
EOF
git add "$filename"
}

BASE="$(dirname "$(cd $(dirname "$0") && pwd)")"
TMP="$BASE/t/tmp"

Expand Down Expand Up @@ -79,6 +94,42 @@ do
git commit -m "d$i"
done

###############
# Build setup #
###############
git checkout -b build master --
modify_buildable hello.c printf\(\"0\\n\"\)\;
cat > make_script << EOF
#/bin/bash -ex
gcc -o hello hello.c
EOF
chmod +x make_script
git add make_script
git commit -m "Hello World"

git checkout -b build-left build --
for i in $(seq 10)
do
modify_buildable hello.c 'printf("'$i' top\n");
printf("'$i' bot\n");'
git commit -m "build $i"
done

git checkout -b build-right build --
modify_buildable hello.c printf\(\"2\\n\"\)\;
git commit -m "Conflicts"
modify_buildable hello.c printf\(\"2\\n\"\;
git commit -m "Conflicts, breaks build"
modify_buildable hello.c 'printf("'2' top\n");
printf("'2' bot\n");
typo'
git commit -m "No conflict, breaks build"
for i in $(seq 4 6)
do
modify_buildable hello.c printf\(\"$i\\n\"\)\;
git commit -m "build $i"
done

git checkout master --

ln -s ../../imerge.css
5 changes: 5 additions & 0 deletions t/reset-test-repo
Original file line number Diff line number Diff line change
Expand Up @@ -20,3 +20,8 @@ do
git update-ref -d refs/heads/$b
done

"$GIT_IMERGE" remove --name=build-left-right
for b in build-left-right
do
git update-ref -d refs/heads/$b
done
Loading