Version Control with Git, 2nd Edition

Bibliographic Information

Version Control with Git, 2nd Edition

By: Jon Loeliger; Matthew McCullough

Publisher: O'Reilly Media, Inc.

Pub. Date: August 17, 2012

Print ISBN-13: 978-1-4493-1638-9

Pages in Print Edition: 456

The .gitignore File

Earlier in this chapter, you saw how to use the .gitignore file to pass over main.o, an irrelevant file. As in that example, you can skip any file by adding its name to .gitignore in the same directory. Additionally, you can ignore the file everywhere by adding it to the .gitignore file in the topmost directory of your repository.

But Git also supports a much richer mechanism. A .gitignore file can contain a list of filename patterns that specify what files to ignore. The format of .gitignore is as follows:

Blank lines are ignored, and lines starting with a pound sign (#) can be used for comments. However, the # does not represent a comment if it follows other text on the line.
A simple, literal filename matches a file in any directory with that name.
A directory name is marked by a trailing slash character (/). This matches the named directory and any subdirectory but does not match a file or a symbolic link.
A pattern containing shell globbing characters, such as an asterisk (*), is expanded as a shell glob pattern. Just as in standard shell globbing, the match cannot extend across directories and so an asterisk can match only a single file or directory name. But an asterisk can still be part of a pattern that includes slashes to specify directory names (e.g., debug/32bit/*.o).
An initial exclamation point (!) inverts the sense of the pattern on the rest of the line. Additionally, any file excluded by an earlier pattern but matching an inversion rule is included. An inverted pattern overrides lower precedence rules.

Furthermore, Git allows you to have a .gitignore file in any directory within your repository. Each file affects its directory and all subdirectories. The .gitignore rules also cascade: you can override the rules in a higher directory by including an inverted pattern (using the initial !) in one of the subdirectories.

To resolve a hierarchy with multiple .gitignore directories, and to allow command-line addenda to the list of ignored files, Git honours the following precedence, from highest to lowest:

Patterns are specified on the command line.
Patterns read from .gitignore in the same directory.
Patterns in parent directories, proceeding upward. Hence, the current directory’s patterns overrule the parents’ patterns, and the parents close to the current directory take precedence over higher parents.
Patterns from the .git/info/exclude file.
Patterns from the file specified by the configuration variable core.excludefile.

Because a .gitignore is treated as a regular file within your repository, it is copied during clone operations and applies to all copies of your repository. In general, you should place entries into your version-controlled .gitignore files only if the patterns apply to all derived repositories universally.

If the exclusion pattern is somehow specific to your one repository and should not (or might not) be applicable to anyone else’s clone of your repository, then the patterns should instead go into the .git/info/exclude file, because it is not propagated during clone operations. Its pattern format and treatment are the same as .gitignore files.

Here’s another scenario. It’s typical to exclude .o files, which are generated from the source by the compiler. To ignore .o files, place *.o in your top-level .gitignore. But what if you also had a particular *.o file that was, say, supplied by someone else and for which you couldn’t generate a replacement yourself? You’d likely want to explicitly track that particular file. You might then have a configuration like this:

$ cd my_package
$ cat .gitignore
*.o
$ cd my_package/vendor_files
$ cat .gitignore
!driver.o

The combination of rules means that Git will ignore all .o files within the repository but will track one exception, the file driver.o within the vendor_files subdirectory.

The Stash

Do you ever feel overwhelmed in your daily development cycle when the constant interruptions, demands for bug fixes, and requests from coworkers or managers all pile up and clutter the real work you are trying to do? If so, the stash was designed to help you!

The stash is a mechanism for capturing your work in progress, allowing you to save it and return to it later when convenient. Sure, you can already do that using the existing branch and commit mechanisms within Git, but the stash is a quick convenience mechanism that allows a complete and thorough capturing of your index and working directory in one simple command. It leaves your repository clean, uncluttered, and ready for an alternate development direction. Another single command restores that index and working directory state completely, allowing you to resume where you left off.

Let’s see how the stash works with the canonical use case: the so-called “interrupted work flow.”

In this scenario, you are happily working in your Git repository and have changed several files and maybe even staged a few in the index. Then, some interruption happens. Perhaps a critical bug is discovered and lands on your plate and must be fixed immediately. Perhaps your team lead has suddenly prioritized a new feature over everything else and insists you drop everything to work on it. Whatever the circumstance, you realize you must stash everything, clean your slate and work tree, and start afresh. This is a perfect opportunity for git stash!

$ cd the-git-project
# edit a lot, in the middle of something
# High-Priority Work-flow Interrupt!
# Must drop everything and do Something Else now!
$ git stash save
# edit high-priority change
$ git commit -a -m "Fixed High-Priority issue"
$ git stash pop

And resume where you were!

The default and optional operation to git stash is save. Git also supplies a default log message when saving a stash, but you can supply your own to better remind you what you were doing. Just supply it in the command after the then-required save argument:

$ git stash save "WIP: Doing real work on my stuff"

The acronym WIP is a common abbreviation used in these situations meaning “work in progress.”

To achieve the same effect with other, more basic Git commands requires manual creation of a new branch on which you commit all of your modifications, re-establishing your previous branch to continue your work, and then later recovering your saved branch state on top of your new working directory. For the curious, that process is roughly this sequence:

# ... normal development process interrupted ...
# Create new branch on which current state is stored.
$ git checkout -b saved_state
$ git commit -a -m "Saved state"
# Back to previous branch for immediate update.
$ git checkout master
# edit emergency fix
$ git commit -a -m "Fix something."
# Recover saved state on top of working directory.
$ git checkout saved_state
$ git reset --soft HEAD^
# ... resume working where we left off above ...

That process is sensitive to completeness and attention to detail. All of your changes have to be captured when you save your state, and the restoration process can be disrupted if you forget to move your HEAD back as well.

The git stash save command will save your current index and working directory state and clear them out so that they again match the head of your current branch. Although this operation gives the appearance that your modified files and any files updated into the index using, for example, git add or git rm, have been lost, they have not. Instead, the contents of your index and working directory are actually stored as independent, regular commits and are accessible through the ref refs/stash.

$ git show-branch stash
[stash] WIP on master: 3889def Some initial files.

As you might surmise by the use of pop to restore your state, the two basic stash commands, git stash save and git stash pop, implement a stack of stash states. That allows your interrupted work flow to be interrupted yet again! Each stashed context on the stack can be managed independently of your regular commit process.

The git stash pop command restores the context saved by a previous save operation on top of your current working directory and index. And by restore here, I mean that the pop operation takes the stash content and merges those changes into the current state rather than just overwriting or replacing files. Nice, huh?

You can only git stash pop into a clean working directory. Even then, the command may or may not fully succeed in recreating the full state you originally had at the time it was saved. Because the application of the saved context can be performed on top of a different commit, merging may be required, complete with possible user resolution of any conflicts.

After a successful pop operation, Git will automatically remove your saved state from the stack of saved states. That is, once applied, the stash state will be “dropped.” However, when conflict resolution is needed, Git will not automatically drop the state, just in case you want to try a different approach or want to restore it onto a different commit. Once you clear the merge conflicts and want to proceed, you should use the git stash drop to remove it from the stash stack. Otherwise, Git will maintain an ever growing^[23] stack of contexts.

If you just want to recreate the context you have saved in a stash state without dropping it from the stack, use git stash apply. Thus, a pop command is a successful apply followed by a drop.

Tip

In fact, you can use git stash apply to apply the same saved stashed context onto several different commits prior to dropping it from the stack.

However, you should consider carefully if you want to use git stash apply or git stash pop to regain the contents of a stash. Will you ever need it again? If not, pop it. Clean the stashed content and referents out of your object store.

The git stash list command lists the stack of saved contexts from most to least recent.

$ cd my-repo
$ ls
file1 file2
$ echo "some foo" >> file1
$ git status
# On branch master
# Changes not staged for commit:
# (use "git add ..." to update what will be committed)
# (use "git checkout -- ..." to discard changes in working directory)
#
# modified: file1
#
no changes added to commit (use "git add" and/or "git commit -a")
$ git stash save "Tinkered file1"
Saved working directory and index state On master: Tinkered file1
HEAD is now at 3889def Add some files
$ git commit --dry-run
# On branch master
nothing to commit (working directory clean)
$ echo "some bar" >> file2
$ git stash save "Messed with file2"
Saved working directory and index state On master: Messed with file2
HEAD is now at 3889def Add some files
$ git stash list
stash@{0}: On master: Messed with file2
stash@{1}: On master: Tinkered file1

Git always numbers the stash entries with the most recent entry being zero. As entries get older, they increase in numerical order. And yes, the different stash entry names are stash@{0} and stash@{1}, as explained in The Reflog.

The git stash show command shows the index and file changes recorded for a given stash entry, relative to its parent commit.

$ git stash show
file2 | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

That summary may or may not be the extent of the information you sought. If not, adding -p to see the diffs might be more useful. Note that by default the git stash show command shows the most recent stash entry, stash@{0}.

Because the changes that contribute to making a stash state are relative to a particular commit, showing the state is a state-to-state comparison suitable for git diff, rather than a sequence of commit states suitable for git log. Thus, all the options for git diff may also be supplied to git stash show as well. As we saw previously, --stat is the default, but other options are valid, too. Here, -p is used to obtain the patch differences for a given stash state.

$ git stash show -p stash@{1}
diff --git a/file1 b/file1
index 257cc56..f9e62e5 100644
--- a/file1
+++ b/file1
@@ -1 +1,2 @@
foo
+some foo

Another classic use case for git stash is the so-called “pull into a dirty tree” scenario.

Until you are familiar with the use of remote repositories and pulling changes (see Getting Repository Updates), this might not make sense yet. But it goes like this. You’re developing in your local repository and have made several commits. You still have some modified files that haven’t been committed yet, but you realize there are upstream changes that you want. If you have conflicting modifications, a simple git pull will fail, refusing to overwrite your local changes. One quick way to work around this problem uses git stash.

$ git pull
# ... pull fails due to merge conflicts ...
$ git stash save
$ git pull
$ git stash pop

At this point you may or may not need to resolve conflicts created by the pop.

In case you have new, uncommitted (and hence “untracked”) files as part of your local development, it is possible that a git pull that would also introduce a file of the same name might fail, thus not wanting to overwrite your version of the new file. In this case, add the --include-untracked option on your git stash so that it also stashes your new, untracked files along with the rest of your modifications. That will ensure a completely clean working directory for the pull.

The --all option will gather up the untracked files as well as the explicitly ignored files from the .gitignore and exclude files.

Finally, for more complex stashing operations where you wish to selectively choose which hunks should be stashed, use the -p or --patch option.

In another similar scenario, git stash can be used when you want to move modified work out of the way, enabling a clean pull --rebase. This would happen typically just prior to pushing your local commits upstream.

# ... edit and commit ...
# ... more editing and working...
$ git commit --dry-run
# On branch master
# Your branch is ahead of 'origin/master' by 2 commits.
#
# Changed but not updated:
# (use "git add ..." to update what will be committed)
# (use "git checkout -- ..." to discard changes in working directory)
#
# modified: file1.h
# modified: file1.c
#
no changes added to commit (use "git add" and/or "git commit -a")

At this point you may decide the commits you have already made should go upstream, but you also want to leave the modified files here in your work directory. However, git refuses to pull:

$ git pull --rebase
file1.h: needs update
file1.c: needs update
refusing to pull with rebase: your working tree is not up-to-date

This scenario isn’t as contrived as it might seem at first. For example, I frequently work in a repository where I want to have modifications to a Makefile, perhaps to enable debugging, or I need to modify some configuration options for a build. I don’t want to commit those changes, and I don’t want to lose them between updates from a remote repository. I just want them to linger here in my working directory.

Again, this is where git stash helps:

$ git stash save
Saved working directory and index state WIP on master: 5955d14 Some commit log.
HEAD is now at 5955d14 Some commit log.
$ git pull --rebase
remote: Counting objects: 63, done.
remote: Compressing objects: 100% (43/43), done.
remote: Total 43 (delta 36), reused 0 (delta 0)
Unpacking objects: 100% (43/43), done.
From ssh://git/var/git/my_repo
871746b..6687d58 master -> origin/master
First, rewinding head to replay your work on top of it...
Applying: A fix for a bug.
Applying: The fix for something else.

After you pull in upstream commits and rebase your local commits on top of them, your repository is in good shape to send your work upstream. If desired, you can readily push them now:

# Push upstream now if desired!
$ git push

or after restoring your previous working directory state:

$ git stash pop
Auto-merging file1.h
# On branch master
# Your branch is ahead of 'origin/master' by 2 commits.
#
# Changed but not updated:
# (use "git add ..." to update what will be committed)
# (use "git checkout -- ..." to discard changes in working directory)
#
# modified: file1.h
# modified: file1.c
#
no changes added to commit (use "git add" and/or "git commit -a")
Dropped refs/stash@{0} (7e2546f5808a95a2e6934fcffb5548651badf00d)
$ git push

If you decide to git push after popping your stash, remember that only completed, committed work will be pushed. There’s no need to worry about pushing your partial, uncommitted work. There is also no need to worry about pushing your stashed content: the stash is purely a local notion.

Sometimes stashing your changes leads to a whole sequence of development on your branch and, ultimately, restoring your stashed state on top of all those changes may not make direct sense. In addition, merge conflicts might make popping hard to do. Nonetheless, you may still want to recover the work you stashed. In situations like this, git offers the git stash branch command to help you. This command converts the contents of a saved stash into a new branch based on the commit that was current at the time the stash entry was made.

Let’s see how that works on a repository with a bit of history in it.

$ git log --pretty=one --abbrev-commit
d5ef6c9 Some commit.
efe990c Initial commit.

Now, some files are modified and subsequently stashed:

$ git stash
Saved working directory and index state WIP on master: d5ef6c9 Some commit.
HEAD is now at d5ef6c9 Some commit.

Note that the stash was made against commit d5ef6c9.

Due to other development reasons, more commits are made and the branch drifts away from the d5ef6c9 state.

$ git log --pretty=one --abbrev-commit
2c2af13 Another mod
1d1e905 Drifting file state.
d5ef6c9 Some commit.
efe990c Initial commit.
$ git show-branch -a
[master] Another mod

And although the stashed work is available, it doesn’t apply cleanly to the current master branch.

$ git stash list
stash@{0}: WIP on master: d5ef6c9 Some commit.
$ git stash pop
Auto-merging foo
CONFLICT (content): Merge conflict in foo
Auto-merging bar
CONFLICT (content): Merge conflict in bar

Say it with me: “Ugh.”

So reset some state and take a different approach, creating a new branch called mod that contains the stashed changes.

$ git reset --hard master
HEAD is now at 2c2af13 Another mod
$ git stash branch mod
Switched to a new branch 'mod'
# On branch mod
# Changes not staged for commit:
# (use "git add ..." to update what will be committed)
# (use "git checkout -- ..." to discard changes in working directory)
#
# modified: bar
# modified: foo
#
no changes added to commit (use "git add" and/or "git commit -a")
Dropped refs/stash@{0} (96e53da61f7e5031ef04d68bf60a34bd4f13bd9f)

There are several important points to notice here. First, notice that the branch is based on the original commit d5ef6c9, and not the current head commit 2c2af13.

$ git show-branch -a
! [master] Another mod
* [mod] Some commit.
--
+ [master] Another mod
+ [master^] Drifting file state.
+* [mod] Some commit.

Second, because the stash is always reconstituted against the original commit, it will always succeed and hence will be dropped from the stash stack.

Finally, reconstituting the stash state doesn’t automatically commit any of your changes onto the new branch. All the stashed file modifications (and index changes, if desired) are still left in your working directory on the newly created and checked out branch.

$ git commit --dry-run
# On branch mod
# Changes not staged for commit:
# (use "git add ..." to update what will be committed)
# (use "git checkout -- ..." to discard changes in working directory)
#
# modified: bar
# modified: foo
#
no changes added to commit (use "git add" and/or "git commit -a")

At this point you are of course welcome to commit the changes onto the new branch, presumably as a precursor to further development or merging as you deem necessary. No, this isn’t a magic bullet to avoid resolving merge conflicts. If there were merge conflicts when you tried to pop the stash directly onto the master branch earlier, trying to merge the new branch with the master will yield the same effects and the same merge conflicts.

$ git commit -a -m "Stuff from the stash"
[mod 42c104f] Stuff from the stash
2 files changed, 2 insertions(+), 0 deletions(-)
$ git show-branch
! [master] Another mod
* [mod] Stuff from the stash
--
* [mod] Stuff from the stash
+ [master] Another mod
+ [master^] Drifting file state.
+* [mod^] Some commit.
$ git checkout master
Switched to branch 'master'
$ git merge mod
Auto-merging foo
CONFLICT (content): Merge conflict in foo
Auto-merging bar
CONFLICT (content): Merge conflict in bar
Automatic merge failed; fix conflicts and then commit the result.

As some parting advice on the git stash command, let me leave you with this analogy: you name your pets and you number your livestock. So branches are named and stashes are numbered. The ability to create stashes might be appealing, but be careful not to overuse it and create too many stashes. And don’t just convert them to named branches to make them linger!

Forms of the git diff Command

If you pick two different root-level tree objects for comparison, git diff yields all deviations between the two project states. That’s powerful. You could use such a diff to convert wholesale from one project state to another. For example, if you and a co-worker are developing code for the same project, a root-level diff could effectively sync the repositories at any time.

There are three basic sources for tree or treelike objects to use with git diff:

Any tree object anywhere within the entire commit graph
Your working directory
The index

Typically, the trees compared in a git diff command are named via commits, branch names, or tags, but any commit name discussed in Identifying Commits of Chapter 6 suffices. Also, both the file and directory hierarchy of your working directory, as well as the complete hierarchy of files staged in the index, can be treated as trees.

The git diff command can perform four fundamental comparisons using various combinations of those three sources.

git diff

git diff shows the difference between your working directory and the index. It exposes what is dirty in your working directory and is thus a candidate to stage for your next commit. This command does not reveal differences between what’s in your index and what’s permanently stored in the repository (not to mention remote repositories you might be working with).

git diff commit

This form summarizes the differences between your working directory and the given commit. Common variants of this command name HEAD or a particular branch name as the commit.

git diff --cached commit

This command shows the differences between the staged changes in the index and the given commit. A common commit for the comparison—and the default if no commit is specified—is HEAD. With HEAD, this command shows you how your next commit will alter the current branch.

If the option --cached doesn’t make sense to you, perhaps the synonym --staged will. It is available in Git version 1.6.1 and later.

git diff commit1 commit2

If you specify two arbitrary commits, the command displays the differences between the two. This command ignores the index and working directory, and it is the workhorse for arbitrary comparisons between two trees that are already in your object store.

The number of parameters on the command line determines what fundamental form is used and what is compared. You can compare any two commits or trees. What’s being compared need not have a direct or even an indirect parent–child relationship. If you don’t supply a tree object or two, then git diff compares implied sources, such as your index or working directory.

Let’s examine how these different forms apply to Git’s object model. The example in Figure 8-1 shows a project directory with two files. The file file1 has been modified in the working directory, changing its content from “foo” to “quux.” That change has been staged in the index using git add file1, but it is not yet committed.

Figure 8-1. Various file versions that can be compared

A version of the file file1 from each of your working directory, the index, and the HEAD have been identified. Even though the version of file1 that is in the index, bd71363, is actually stored as a blob object in the object store, it is indirectly referenced through the virtual tree object that is the index. Similarly, the HEAD version of the file, a23bf, is also indirectly referenced through several steps.

This example nominally demonstrates the changes within file1. The bold arrows in the figure point to the tree or virtual tree objects to remind you that the comparison is actually based on complete trees and not just on individual files.

From Figure 8-1, you can see how using git diff without arguments is a good technique for verifying the readiness of your next commit. As long as that command emits output, you have edits or changes in your working directory that are not yet staged. Check the edits on each file. If you are satisfied with your work, use git add to stage the file. Once you stage a changed file, the next git diff no longer yields diff output for that file. In this way, you can step progressively through each dirty file in your working directory until the differences disappear, meaning that all files are staged in your index. Don’t forget to check for new or deleted files, too. At any time during the staging process, the command git diff --cached shows the complementary changes, or those changes already staged in the index that will be present in your next commit. When you’re finished, git commit captures all changes in your index into a new commit.

You are not required to stage all the changes from your working directory for a single commit. In fact, if you find you have conceptually different changes in your working directory that should be made in different commits, you can stage one set at a time, leaving the other edits in your working directory. A commit captures only your staged changes. Repeat the process, staging the next set of files appropriate for a subsequent commit.

The astute reader might have noticed that, although there are four fundamental forms of the git diff command, only three are highlighted with bold arrows in Figure 8-1. So, what is the fourth? There is only one tree object represented by your working directory, and there is only one tree object represented by the index. In the example, there is one commit in the object store along with its tree. However, the object store is likely to have many commits named by different branches and tags, all of which have trees that can be compared with git diff. Thus, the fourth form of git diff simply compares any two arbitrary commits (trees) already stored within the object store.

In addition to the four basic forms of git diff, there are myriad options as well. Here are a few of the more useful ones.

-M

The -M option detects renames and generates a simplified output that simply records the file rename rather than the complete removal and subsequent addition of the source file. If the rename is not a pure rename but also has some additional content changes, Git calls those out.

-w or --ignore-all-space

Both -w and --ignore-all-space compare lines without considering changes in whitespace as significant.

--stat

The --stat option adds statistics about the difference between any two tree states. It reports in a compact syntax how many lines changed, how many were added, and how many were elided.

--color

The --color option colorizes the output; a unique color represents each of the different types of changes present in the diff.

Finally, the git diff may be limited to show diffs for a specific set of files or directories.

Warning

The -a option for git diff does nothing even remotely like the -a option for git commit. To get both staged and unstaged changes, use git diff HEAD. The lack of symmetry is unfortunate and counterintuitive.

Working with Merge Conflicts

As demonstrated by the previous example, there are instances when conflicting changes can’t be merged automatically.

Let’s create another scenario with a merge conflict to explore the tools Git provides to help resolve disparities. Starting with a common hello with just the contents “hello,” let’s create two different branches with two different variants of the file.

$ git init
Initialized empty Git repository in /tmp/conflict/.git/
$ echo hello > hello
$ git add hello
$ git commit -m "Initial hello file"
Created initial commit b8725ac: Initial hello file
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 hello
$ git checkout -b alt
Switched to a new branch "alt"
$ echo world >> hello
$ echo 'Yay!' >> hello
$ git commit -a -m "One world"
Created commit d03e77f: One world
1 files changed, 2 insertions(+), 0 deletions(-)
$ git checkout master
$ echo worlds >> hello
$ echo 'Yay!' >> hello
$ git commit -a -m "All worlds"
Created commit eddcb7d: All worlds
1 files changed, 2 insertions(+), 0 deletions(-)

One branch says world, whereas the other says worlds—a deliberate difference.

As in the earlier example, if you check out master and try to merge the alt branch into it, a conflict arises.

$ git merge alt
Auto-merged hello
CONFLICT (content): Merge conflict in hello
Automatic merge failed; fix conflicts and then commit the result.

As expected, Git warns you about the conflict found in the hello file.

Locating Conflicted Files

But what if Git’s helpful directions scrolled off the screen or if there were many files with conflicts? Luckily, Git keeps track of problematic files by marking each one in the index as conflicted, or unmerged.

You can also use either the git status command or the git ls-files -u command to show the set of files that remain unmerged in your working tree.

$ git status
hello: needs merge
# On branch master
# Changed but not updated:
# (use "git add ..." to update what will be committed)
#
# unmerged: hello
#
no changes added to commit (use "git add" and/or "git commit -a")
$ git ls-files -u
100644 ce013625030ba8dba906f756967f9e9ca394464a 1 hello
100644 e63164d9518b1e6caf28f455ac86c8246f78ab70 2 hello
100644 562080a4c6518e1bf67a9f58a32a67bff72d4f00 3 hello

You can use git diff to show what’s not yet merged, but it will show all of the gory details, too!

Inspecting Conflicts

When a conflict appears, the working directory copy of each conflicted file is enhanced with three-way diff or merge markers. Continuing from where the example left off, the resulting conflicted file now looks like this:

$ cat hello
hello
<<<<<<< HEAD:hello
worlds
=======
world
>>>>>>> 6ab5ed10d942878015e38e4bab333daff614b46e:hello
Yay!

The merge markers delineate the two possible versions of the conflicting chunk of the file. In the first version, the chunk says “worlds”; in the other version, it says “world.” You could simply choose one phrase or the other, remove the conflict markers, and then run git add and git commit, but let’s explore some of the other features Git offers to help resolve conflicts.

Tip

The three-way merge marker lines (<<<<<<<<, ========, and >>>>>>>>) are automatically generated, but they’re just meant to be read by you, not (necessarily) a program. You should delete them with your text editor once you resolve the conflict.

git diff with conflicts

Git has a special, merge-specific variant of git diff to display the changes made against both parents simultaneously. In the example, it looks like this:

$ git diff
diff --cc hello
index e63164d,562080a..0000000
--- a/hello
+++ b/hello
@@@ -1,3 -1,3 +1,7 @@@
hello
++<<<<<<< HEAD:hello
+worlds
++=======
+ world
++>>>>>>> alt:hello
Yay!

What does it all mean? It’s the simple combination of two diffs: one versus the first parent, called HEAD, and one against the second parent, or alt. (Don’t be surprised if the second parent is an absolute SHA1 name representing some unnamed commit from some other repository!) To make things easier, Git also gives the second parent the special name MERGE_HEAD.

You can compare both the HEAD and MERGE_HEAD versions against the working directory (“merged”) version:

$ git diff HEAD
diff --git a/hello b/hello
index e63164d..4e4bc4e 100644
--- a/hello
+++ b/hello
@@ -1,3 +1,7 @@
hello
+<<<<<<< HEAD:hello
worlds
+=======
+world
+>>>>>>> alt:hello
Yay!

And then this:

$ git diff MERGE_HEAD
diff --git a/hello b/hello
index 562080a..4e4bc4e 100644
--- a/hello
+++ b/hello
@@ -1,3 +1,7 @@
hello
+<<<<<<< HEAD:hello
+worlds
+=======
world
+>>>>>>> alt:hello
Yay!

Tip

In newer versions of Git, git diff --ours is a synonym for git diff HEAD, because it shows the differences between “our” version and the merged version. Similarly, git diff MERGE_HEAD can be written as git diff --theirs. You can use git diff --base to see the combined set of changes since the merge base, which would otherwise be rather awkwardly written as:

$ git diff $(git merge-base HEAD MERGE_HEAD)

If you line up the two diffs side by side, all the text except the + columns are the same, so Git prints the main text only once and prints the + columns next to each other.

The conflict found by git diff has two columns of information prepended to each line of output. A plus sign in a column indicates a line addition, a minus sign indicates a line removal, and a blank indicates a line with no change. The first column shows what’s changing versus your version, and the second column shows what’s changing versus the other version. The conflict marker lines are new in both versions, so they get a ++. The world and worlds lines are new only in one version or the other, so they have just a single + in the corresponding column.

Suppose you edit the file to pick a third option, like this:

$ cat hello
hello
worldly ones
Yay!

Then the new git diff output is

$ git diff
diff --cc hello
index e63164d,562080a..0000000
--- a/hello
+++ b/hello
@@@ -1,3 -1,3 +1,3 @@@
hello
- worlds
-world
++worldly ones
Yay!

Alternatively, you could choose one or the other original version, like this:

$ cat hello
hello
world
Yay!

The git diff output would then be:

$ git diff
diff --cc hello
index e63164d,562080a..0000000
--- a/hello
+++ b/hello

Wait! Something strange happened there. Where does it show where the world line was added to the base version? Where does it show that the worlds line was removed from the HEAD version? As you have resolved the conflict in favor of the MERGE_HEAD version, Git deliberately omits the diff because it thinks you probably don’t care about that section anymore.

Running git diff on a conflicted file only shows you the sections that really have a conflict. In a large file with numerous changes scattered throughout, most of those changes don’t have a conflict; either one side of the merge changed a particular section or the other side did. When you’re trying to resolve a conflict, you rarely care about those sections, so git diff trims out uninteresting sections using a simple heuristic: if a section has changes versus only one side, that section isn’t shown.

This optimization has a slightly confusing side effect: once you resolve something that used to be a conflict by simply picking one side or the other, it stops showing up. That’s because you modified the section so that it only changes one side or the other (i.e., the side that you didn’t choose), so to Git it looks just like a section that was never conflicted at all.

This is really more a side effect of the implementation than an intentional feature, but you might consider it useful anyway: git diff shows you only those sections of the file that are still conflicted, so you can use it to keep track of the conflicts you haven’t fixed yet.

git log with conflicts

While you’re in the process of resolving a conflict, you can use some special git log options to help you figure out exactly where the changes came from and why. Try this:

$ git log --merge --left-right -p
commit <eddcb7dfe63258ae4695eb38d2bc22e726791227
Author: Jon Loeliger <jdl@example.com>
Date: Wed Oct 22 21:29:08 2008 -0500
All worlds
diff --git a/hello b/hello
index ce01362..e63164d 100644
--- a/hello
+++ b/hello
@@ -1 +1,3 @@
hello
+worlds
+Yay!
commit >d03e77f7183cde5659bbaeef4cb51281a9ecfc79
Author: Jon Loeliger <example@example.com>
Date: Wed Oct 22 21:27:38 2008 -0500
One world
diff --git a/hello b/hello
index ce01362..562080a 100644
--- a/hello
+++ b/hello
@@ -1 +1,3 @@
hello
+world
+Yay!

This command shows all the commits in both parts of the history that affect conflicted files in your merge, along with the actual changes each commit introduced. If you wondered when, why, how, and by whom the line worlds came to be added to the file, you can see exactly which set of changes introduced it.

The options provided to git log are as follows:

--merge shows only commits related to files that produced a conflict
--left-right displays < if the commit was from the “left” side of the merge (“our” version, the one you started with), or > if the commit was from the “right” side of the merge (“their” version, the one you’re merging in)
-p shows the commit message and the patch associated with each commit

If your repository were more complicated and several files had conflicts, you could also provide the exact filename(s) you’re interested in as a command line option, like this:

$ git log --merge --left-right -p hello

The examples here have been kept small for demonstration purposes. Of course, real-life situations are likely to be significantly larger and more complex. One technique to mitigate the pain of large merges with nasty, extended conflicts is to use several small commits with well-defined effects contained to individual concepts. Git handles small commits well, so there is no need to wait until the last minute to commit large, widespread changes. Smaller commits and more frequent merge cycles reduce the pain of conflict resolution.

How Git Keeps Track of Conflicts

How exactly does Git keep track of all the information about a conflicted merge? There are several parts:

.git/MERGE_HEAD contains the SHA1 of the commit you’re merging in. You don’t really have to use the SHA1 yourself; Git knows to look in that file whenever you talk about MERGE_HEAD.
.git/MERGE_MSG contains the default merge message used when you git commit after resolving the conflicts.
The Git index contains three copies of each conflicted file: the merge base, “our” version, and “their” version. These three copies are assigned respective stage numbers 1, 2, and 3.
The conflicted version (merge markers and all) is not stored in the index. Instead, it is stored in a file in your working directory. When you run git diff without any parameters, the comparison is always between what’s in the index with what’s in your working directory.

To see how the index entries are stored, you can use the git ls-files plumbing command as follows:

$ git ls-files -s
100644 ce013625030ba8dba906f756967f9e9ca394464a 1 hello
100644 e63164d9518b1e6caf28f455ac86c8246f78ab70 2 hello
100644 562080a4c6518e1bf67a9f58a32a67bff72d4f00 3 hello

The -s option to git ls-files shows all the files with all stages. If you want to see only the conflicted files, use the -u option instead.

In other words, the hello file is stored three times, and each has a different hash corresponding to the three different versions. You can look at a specific variant by using git cat-file:

$ git cat-file -p e63164d951
hello
worlds
Yay!

You can also use some special syntax with git diff to compare different versions of the file. For example, if you want to see what changed between the merge base and the version you’re merging in, you can do this:

$ git diff :1:hello :3:hello
diff --git a/:1:hello b/:3:hello
index ce01362..562080a 100644
--- a/:1:hello
+++ b/:3:hello
@@ -1 +1,3 @@
hello
+world
+Yay!

Tip

Starting with Git version 1.6.1, the git checkout command accepts the --ours or --theirs option as shorthand for simply checking out (a file from) one side or the other of a conflicted merge; your choice resolves the conflict. These two options can only be used during a conflict resolution.

Using the stage numbers to name a version is different from git diff --theirs, which shows the differences between their version and the resulting, merged (or still conflicted) version in your working directory. The merged version is not yet in the index, so it doesn’t even have a number.

Because you fully edited and resolved the working copy version in favor of their version, there should be no difference now:

$ cat hello
hello
world
Yay!
$ git diff --theirs
* Unmerged path hello

All that remains is an unmerged path reminder to add it to the index.

Finishing Up a Conflict Resolution

Let’s make one last change to the hello file before declaring it merged:

$ cat hello
hello
everyone
Yay!

Now that the file is fully merged and resolved, git add reduces the index to just a single copy of the hello file again:

$ git add hello
$ git ls-files -s
100644 ebc56522386c504db37db907882c9dbd0d05a0f0 0 hello

That lone 0 between the SHA1 and the path name tells you that the stage number for a nonconflicted file is zero.

You must work through all the conflicted files as recorded in the index. You cannot commit as long as there is an unresolved conflict. Therefore, as you fix the conflicts in a file, run git add (or git rm, git update-index, etc.) on the file to clear its conflict status.

Warning

Be careful not to git add files with lingering conflict markers. Although that will clear the conflict in the index and allow you to commit, your file won’t be correct.

Finally, you can git commit the end result and use git show to see the merge commit:

$ cat .git/MERGE_MSG
Merge branch 'alt'
Conflicts:
hello
$ git commit
$ git show
commit a274b3003fc705ad22445308bdfb172ff583f8ad
Merge: eddcb7d... d03e77f...
Author: Jon Loeliger <@example.com>
Date: Wed Oct 22 23:04:18 2008 -0500
Merge branch 'alt'
Conflicts:
hello
diff --cc hello
index e63164d,562080a..ebc5652
--- a/hello
+++ b/hello
@@@ -1,3 -1,3 +1,3 @@@
hello
- worlds
-world
++everyone
Yay!

You should notice three interesting things when you look at a merge commit:

There is a new, second line in the header that says Merge:. Normally there’s no need to show the parent of a commit in git log or git show, since there is only one parent and it’s typically the one that comes right after it in the log. But merge commits typically have two (and sometimes more) parents, and those parents are important to understanding the merge. Hence, git log and git show always print the SHA1 of each ancestor.
The automatically generated commit log message helpfully notes the list of files that conflicted. This can be useful later if it turns out a particular problem was caused by your merge. Usually, problems caused by a merge are caused by the files that had to be merged by hand.
The diff of a merge commit is not a normal diff. It is always in the combined diff or “conflicted merge” format. A successful merge in Git is considered to be no change at all; it is simply the combination of other changes that already appeared in the history. Thus, showing the contents of a merge commit shows only the parts that are different from one of the merged branches, not the entire set of changes.

Aborting or Restarting a Merge

If you start a merge operation but then decide for some reason that you don’t want to complete it, Git provides an easy way to abort the operation. Prior to executing the final git commit on the merge commit, use:

$ git reset --hard HEAD

This command restores both your working directory and the index to the state immediately prior to the git merge command.

If you want to abort or discard the merge after it has finished (that is, after it’s introduced a new merge commit), use the command:

$ git reset --hard ORIG_HEAD

Prior to beginning the merge operation, Git saves your original branch HEAD in the ORIG_HEAD ref for just this sort of purpose.

You should be very careful here, though. If you did not start the merge with a clean working directory and index, you could get in trouble and lose any uncommitted changes you have in your directory.

You can initiate a git merge request with a dirty working directory, but if you execute git reset --hard then your dirty state prior to the merge is not fully restored. Instead, the reset loses your dirty state in the working directory area. In other words, you requested a --hard reset to the HEAD state! (See Using git reset.)

Starting with Git version 1.6.1, you have another choice. If you have botched a conflict resolution and want to return to the original conflict state before trying to resolve it again, you can use the command git checkout -m.

Merge Examples

To merge other_branch into branch, you should check out the target branch and merge the other branches into it, like this:

$ git checkout branch
$ git merge other_branch

Let’s work through a pair of example merges, one without conflicts and one with substantial overlaps. To simplify the examples in this chapter, we’ll use multiple branches per the techniques presented in Chapter 7.

Preparing for a Merge

Before you begin a merge, it’s best to tidy up your working directory. During a normal merge, Git creates new versions of files and places them in your working directory when it is finished. Furthermore, Git also uses the index to store temporary and intermediate versions of files during the operation.

If you have modified files in your working directory or if you’ve modified the index via git add or git rm, then your repository has a dirty working directory or index. If you start a merge in a dirty state, Git may be unable to combine the changes from all the branches and from those in your working directory or index in one pass.

Tip

You don’t have to start with a clean directory. Git performs the merge, for example, if the files affected by the merge operation and the dirty files in your working directory are disjoint. However, as a general rule, your Git life will be much easier if you start each merge with a clean working directory and index.

Merging Two Branches

For the simplest scenario, let’s set up a repository with a single file, create two branches, and then merge the pair of branches together again.

$ git init
Initialized empty Git repository in /tmp/conflict/.git/
$ git config user.email "jdl@example.com"
$ git config user.name "Jon Loeliger"
$ cat > file
Line 1 stuff
Line 2 stuff
Line 3 stuff
^D
$ git add file
$ git commit -m "Initial 3 line file"
Created initial commit 8f4d2d5: Initial 3 line file
1 files changed, 3 insertions(+), 0 deletions(-)
create mode 100644 file

Let’s create another commit on the master branch:

$ cat > other_file
Here is stuff on another file!
^D
$ git add other_file
$ git commit -m "Another file"
Created commit 761d917: Another file
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 other_file

So far, the repository has one branch with two commits, where each commit introduced a new file. Next, let’s change to a different branch and modify the first file.

$ git checkout -b alternate master^
Switched to a new branch "alternate"
$ git show-branch
* [alternate] Initial 3 line file
! [master] Another file
--
+ [master] Another file
*+ [alternate] Initial 3 line file

Here, the alternate branch is initially forked from the master^ commit, one commit behind the current head.

Make a trivial change to the file so you have something to merge, and then commit it. Remember, it’s best to commit outstanding changes and start a merge with a clean working directory.

$ cat >> file
Line 4 alternate stuff
^D
$ git commit -a -m "Add alternate's line 4"
Created commit b384721: Add alternate's line 4
1 files changed, 1 insertions(+), 0 deletions(-)

Now there are two branches and each has different development work. A second file has been added to the master branch, and a modification has been made to alternate the branch. Because the two changes do not affect the same parts of a common file, a merge should proceed smoothly and without incident.

The git merge operation is context sensitive. Your current branch is always the target branch, and the other branch or branches are merged into the current branch. In this case, the alternate branch should be merged into the master branch, so the latter must be checked out before you continue:

$ git checkout master
Switched to branch "master"
$ git status
# On branch master
nothing to commit (working directory clean)
# Yep, ready for a merge!
$ git merge alternate
Merge made by recursive.
file | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

You can use another commit graph viewing tool, a part of git log, to see what what’s been done:

$ git log --graph --pretty=oneline --abbrev-commit
* 1d51b93... Merge branch 'alternate'
|\
| * b384721... Add alternate's line 4
* | 761d917... Another file
|/
* 8f4d2d5... Initial 3 line file

That is conceptually the commit graph described earlier in the section Commit Graphs (Chapter 6), except that this graph is turned sideways, with the most recent commits at the top rather than the right. The two branches have split at the initial commit, 8f4d2d5; each branch shows one commit each (761d917 and b384721); and the two branches merge again at commit 1d51b93.

Tip

Using git log --graph is an excellent alternative to graphical tools such as gitk. The visualization provided by git log --graph is well-suited to dumb terminals.

Technically, Git performs each merge symmetrically to produce one identical, combined commit that is added to your current branch. The other branch is not affected by the merge. Because the merge commit is added only to your current branch, you can say, “I merged some other branch into this one.”

A Merge with a Conflict

The merge operation is inherently problematic because it necessarily brings together potentially varying and conflicting changes from different lines of development. The changes on one branch may be similar to or radically different from the changes on a different branch. Modifications may alter the same files or a disjoint set of files. Git can handle all these varied possibilities, but often it requires guidance from you to resolve conflicts.

Let’s work through a scenario in which a merge leads to a conflict. We begin with the results of the merge from the previous section and introduce independent and conflicting changes on the master and alternate branches. We then merge the alternate branch into the master branch, face the conflict, resolve it, and commit the final result.

On the master branch, create a new version of file with a few additional lines in it and then commit the changes:

$ git checkout master
$ cat >> file
Line 5 stuff
Line 6 stuff
^D
$ git commit -a -m "Add line 5 and 6"
Created commit 4d8b599: Add line 5 and 6
1 files changed, 2 insertions(+), 0 deletions(-)

Now, on the alternate branch, modify the same file differently. Whereas you made new commits to the master branch, the alternate branch has not progressed yet.

$ git checkout alternate
Switched branch "alternate"
$ git show-branch
* [alternate] Add alternate's line 4
! [master] Add line 5 and 6
--
+ [master] Add line 5 and 6
*+ [alternate] Add alternate's line 4
# In this branch, "file" left off with "Line 4 alternate stuff"
$ cat >> file
Line 5 alternate stuff
Line 6 alternate stuff
^D
$ cat file
Line 1 stuff
Line 2 stuff
Line 3 stuff
Line 4 alternate stuff
Line 5 alternate stuff
Line 6 alternate stuff
$ git diff
diff --git a/file b/file
index a29c52b..802acf8 100644
--- a/file
+++ b/file
@@ -2,3 +2,5 @@ Line 1 stuff
Line 2 stuff
Line 3 stuff
Line 4 alternate stuff
+Line 5 alternate stuff
+Line 6 alternate stuff
$ git commit -a -m "Add alternate line 5 and 6"
Created commit e306e1d: Add alternate line 5 and 6
1 files changed, 2 insertions(+), 0 deletions(-)

Let’s review the scenario. The current branch history looks like this:

$ git show-branch
* [alternate] Add alternate line 5 and 6
! [master] Add line 5 and 6
--
* [alternate] Add alternate line 5 and 6
+ [master] Add line 5 and 6
*+ [alternate^] Add alternate's line 4

To continue, check out the master branch and try to perform the merge:

$ git checkout master
Switched to branch "master"
$ git merge alternate
Auto-merged file
CONFLICT (content): Merge conflict in file
Automatic merge failed; fix conflicts and then commit the result.

When a merge conflict like this occurs, you should almost invariably investigate the extent of the conflict using the git diff command. Here, the single file named file has a conflict in its content:

$ git diff
diff --cc file
index 4d77dd1,802acf8..0000000
--- a/file
+++ b/file
@@@ -2,5 -2,5 +2,10 @@@ Line 1 stuf
Line 2 stuff
Line 3 stuff
Line 4 alternate stuff
++<<<<<<< HEAD:file
+Line 5 stuff
+Line 6 stuff
++=======
+ Line 5 alternate stuff
+ Line 6 alternate stuff
++>>>>>>> alternate:file

The git diff command shows the differences between the file in your working directory and the index. In the traditional diff command output style, the changed content is presented between <<<<<<< and =======, with an alternate between ======= and >>>>>>>. However, additional plus and minus signs are used in the combined diff format to indicate changes from multiple sources relative to the final resulting version.

The previous output shows that the conflict covers lines 5 and 6, where deliberately different changes were made in the two branches. It’s then up to you to resolve the conflict. When resolving a merge conflict, you are free to choose any resolution you would like for the file. That includes picking lines from only one side or the other, or a mix from both sides, or even making up something completely new and different. Although that last option might be confusing, it is a valid choice.

In this case, I chose a line from each branch as the makeup of my resolved version. The edited file now has this content:

$ cat file
Line 1 stuff
Line 2 stuff
Line 3 stuff
Line 4 alternate stuff
Line 5 stuff
Line 6 alternate stuff

If you are happy with the conflict resolution, you should git add the file to the index and stage it for the merge commit:

$ git add file

After you have resolved conflicts and staged final versions of each file in the index using git add, it is finally time to commit the merge using git commit. Git places you in your favorite editor with a template message that looks like this:

Merge branch 'alternate'
Conflicts:
file
#
# It looks like you may be committing a MERGE.
# If this is not correct, please remove the file
# .git/MERGE_HEAD
# and try again.
#
# Please enter the commit message for your changes.
# (Comment lines starting with '#' will not be included)
# On branch master
# Changes to be committed:
# (use "git reset HEAD ..." to unstage)
#
# modified: file
#

As usual, the lines beginning with the octothorp (#) are comments and meant solely for your information while you write a message. All comment lines are ultimately elided from the final commit log message. Feel free to alter or augment the commit message as you see fit, perhaps adding a note about how the conflict was resolved.

When you exit the editor, Git should indicate the successful creation of a new merge commit:

$ git commit
# Edit merge commit message
Created commit 7015896: Merge branch 'alternate'
$ git show-branch
! [alternate] Add alternate line 5 and 6
* [master] Merge branch 'alternate'
--
- [master] Merge branch 'alternate'
+* [alternate] Add alternate line 5 and 6

You can see the resulting merge commit using:

$ git log

These are notes I made after reading this book. See more book notes

Just to let you know, this page was last updated Sunday, Apr 28 24

Version Control with Git, 2nd Edition

Book Notes by Julaine

Bibliographic Information