What are Git Merge and Git Branch

Introduction to Git (Part 2): Branching and Merging

I recently wrote a little about the basics of versioning with Git. In this article I would like to continue that theme, focusing mainly on working with branches.

  1. What is a commit anyway?
  2. And what is a branch now?
  3. How do I bring branches back together?
  4. What else is there?
  5. What to do in the event of a conflict
  6. Summary

An interesting aside: in books on "classic" SCM systems such as CVS or Subversion, the section on "Branching and Merging" can usually be found, if at all, only at the very end. Most Git books, on the other hand, cover the same topic right at the beginning, among the basics. This shows that working with branches is much more central to Git than to other SCM systems.

What is a commit anyway?

Before you get down to the nitty-gritty, a few necessary basics. First, I'd like to take a closer look at what a Git commit is. As already explained, a Git commit is a snapshot of a specific version of your project.

The figure opposite shows the exact structure of a Git commit. First of all, every commit has an author and a committer (in contrast to SVN, Git distinguishes between the person who wrote a change and the person who created the commit).

Git represents directories and files as so-called trees and blobs (short for binary large objects). Each blob is identified by the SHA-1 checksum of its contents and stored in the repository under this checksum. This has a pleasant side effect: if the same file appears in unchanged form in many commits, Git still saves this file only once, as a single blob, since the checksum always remains the same.

For Git, a tree object is just a list of other tree objects and blobs. A tree is also identified by a SHA-1 checksum. A commit in turn contains a pointer to a tree object, which represents the top-level directory of the repository.

If you feel like it, you can even try this out with a few Git console commands. git hash-object gives you the SHA-1 ID of a file, and git cat-file can then output a stored file again.
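To try this out, a sketch along the following lines should work in an empty scratch directory. The file names greeting.txt and copy.txt and their content are assumptions made for illustration. The -w option asks git hash-object to store the blob in addition to printing its ID, and git cat-file -p prints the content of an object.

```shell
# Scratch repository in a temporary directory (illustrative setup).
cd "$(mktemp -d)"
git init -q .

# Hypothetical example file.
printf 'hello\n' > greeting.txt

# Print the blob ID and, because of -w, store the blob in the repository.
blob_id=$(git hash-object -w greeting.txt)
echo "$blob_id"

# A second file with identical content gets exactly the same ID.
cp greeting.txt copy.txt
copy_id=$(git hash-object copy.txt)
echo "$copy_id"

# Print the content of the stored blob again.
git cat-file -p "$blob_id"
```

Both echo lines print the same ID, which is the deduplication effect described above: identical content always maps to one blob.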

If the repository already has a commit and you then create more commits, each new commit additionally has a pointer to its predecessor commit. Basically, every commit has exactly one predecessor commit. The exceptions are the first commit (which logically cannot have a predecessor) and merge commits (which can have several predecessors; more on that later).
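The structure of a commit can be inspected in the same way. The sketch below assumes a scratch repository with two small example commits; the file names, commit messages, and author identity are made up. git cat-file -p HEAD prints the raw commit object with its tree, parent, author, and committer lines.

```shell
# Scratch repository; the identity below is a placeholder.
cd "$(mktemp -d)"
git init -q .
git config user.name "Example Author"
git config user.email "author@example.com"

# First commit (hypothetical file and message).
echo "first" > a.txt
git add a.txt
git commit -q -m "A"
first=$(git rev-parse HEAD)

# Second commit.
echo "second" > b.txt
git add b.txt
git commit -q -m "B"

# Raw commit object: tree, parent, author, committer, then the message.
git cat-file -p HEAD

# The tree of that commit lists the blobs of the top-level directory.
git cat-file -p 'HEAD^{tree}'

# The predecessor of the second commit is the first commit.
parent=$(git rev-parse 'HEAD^')
echo "$parent"
```

Running git cat-file -p on the first commit shows no parent line at all, which matches the exception mentioned above.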

Ultimately, a Git history is nothing more than a series of commits that reference one another (computer scientists call this structure a directed acyclic graph, or DAG for short).

And what is a branch now?

In Git, a branch is nothing more than a “pointer” that points to a specific commit in this graph. On the first commit, Git automatically creates a new branch: the "master" branch. The "master" branch automatically moves along with you when new commits are created (in the examples below I use single letters to describe the commit IDs; in practice these are of course longer).

        (master)                            (master)
            |                                   |
    A - B - C   - [git commit] ->   A - B - C - D

A new branch can now be created with the git branch command. Internally, Git simply sets up a new pointer that points to the current commit:

        (master)                                (master)
            |                                       |
    A - B - C   - [git branch testing] ->   A - B - C
                                                    |
                                                (testing)
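That a new branch really is just a second pointer can be checked with git rev-parse, which resolves a branch name to a commit ID. A minimal sketch, assuming a scratch repository with three empty example commits named A, B and C (the identity is a placeholder):

```shell
# Scratch repository; the identity below is a placeholder.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"

# Three empty commits are enough to illustrate the pointers.
for msg in A B C; do git commit -q --allow-empty -m "$msg"; done

# Create the new branch; no commit is created and nothing is copied.
git branch testing

# Both names resolve to the same commit ID.
git rev-parse master testing

# List the branches; the asterisk marks the one that is checked out.
git branch
```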

How does Git know which branch you are working with right now? Git manages another pointer for this: the so-called HEAD pointer. The HEAD pointer points to a specific branch and marks the branch that is currently being worked with. Initially, HEAD always points to the master branch (and therefore the branch created with git branch points to the same commit as the master branch).

You can now switch to another branch with the git checkout command:

         (HEAD)
            |
        (master)                                  (master)
            |                                         |
    A - B - C   - [git checkout testing] ->   A - B - C
            |                                         |
        (testing)                                 (testing)
                                                      |
                                                   (HEAD)
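The HEAD pointer can be made visible with git symbolic-ref, which prints the branch that HEAD currently refers to. The following sketch again assumes a scratch repository with three empty example commits and a placeholder identity:

```shell
# Scratch repository; the identity below is a placeholder.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"
for msg in A B C; do git commit -q --allow-empty -m "$msg"; done
git branch testing

# HEAD is a symbolic reference to the branch that is checked out.
before=$(git symbolic-ref HEAD)
echo "$before"                 # refs/heads/master

# Switch branches: only HEAD changes, the branch pointers stay where they are.
git checkout -q testing

after=$(git symbolic-ref HEAD)
echo "$after"                  # refs/heads/testing
```

Newer Git versions also offer git switch testing for the same step.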

You can see what all of this is good for when you now create a new commit in the testing branch:

        (master)                        (master)
            |                               |
    A - B - C   - [git commit] ->   A - B - C - D
            |                                   |
        (testing)                           (testing)
            |                                   |
         (HEAD)                              (HEAD)

As you can see, the testing pointer has now moved on with the new commit, while the master pointer is still pointing to the same commit.
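This can be verified by comparing the commit IDs before and after the commit. A minimal sketch, assuming a scratch repository with empty example commits and a placeholder identity:

```shell
# Scratch repository; the identity below is a placeholder.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"
for msg in A B C; do git commit -q --allow-empty -m "$msg"; done

git branch testing
git checkout -q testing

# Remember where master points before the new commit.
master_before=$(git rev-parse master)

# New commit D while testing is checked out.
git commit -q --allow-empty -m "D"

# master has not moved, testing is one commit ahead.
git rev-parse master
git rev-parse testing
git log --oneline --decorate
```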

Let's make things a little more exciting: what happens if you switch back to the master branch with git checkout master and create another commit?

                                               (HEAD)
                                                  |
         (HEAD)                               (master)
            |                                     |
        (master)                                  E
            |                                    /
    A - B - C - D   - [git commit] ->   A - B - C - D
                |                                   |
            (testing)                           (testing)

At this point a real fork in the history has emerged (in Git parlance, the two branches have diverged).
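Whether and how far two branches have diverged can be checked on the command line. The sketch below rebuilds the situation from the diagram with empty example commits (repository, identity, and commit messages are assumptions) and counts the commits that exist on only one of the two branches:

```shell
# Scratch repository; the identity below is a placeholder.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"
for msg in A B C; do git commit -q --allow-empty -m "$msg"; done

# Commit D on testing.
git branch testing
git checkout -q testing
git commit -q --allow-empty -m "D"

# Commit E on master.
git checkout -q master
git commit -q --allow-empty -m "E"

# Draw the commit graph of all branches.
git log --graph --oneline --all

# Commits only on master (left) and only on testing (right), tab-separated.
divergence=$(git rev-list --left-right --count master...testing)
echo "$divergence"
```

Here each side has exactly one commit the other lacks, so the last command prints 1 and 1.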

How do I bring branches back together?

To bring both branches back together, you can now use the git merge command:

           (HEAD)                                         (HEAD)
              |                                              |
          (master)                                       (master)
              |                                              |
              E                                          E - F
             /                                          /   /
    A - B - C - D   - [git merge testing] ->   A - B - C - D
                |                                          |
            (testing)                                  (testing)

Oops, what happened now? When merging the branches, Git created a new commit (commit F in this example), which has both commit E (where the master pointer previously pointed) and commit D (where the testing pointer points) as predecessor commits.

Git proceeds as follows when merging: first it looks at the commit that the current HEAD points to (here commit E) and the commit that the branch to be merged points to (here commit D). Then Git searches for the last common ancestor of these two commits (that is, viewed from the branch tips, the first commit that can be reached from both branches; here this is commit C).
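The common ancestor that Git determines in this step can also be queried directly with git merge-base. A minimal sketch, assuming the same diverged example history built from empty commits (repository, identity, and messages are made up):

```shell
# Scratch repository; the identity below is a placeholder.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"
for msg in A B C; do git commit -q --allow-empty -m "$msg"; done
c_id=$(git rev-parse HEAD)                # commit C

# Commit D on testing, commit E on master.
git branch testing
git checkout -q testing
git commit -q --allow-empty -m "D"
git checkout -q master
git commit -q --allow-empty -m "E"

# Last common ancestor of the two branch tips.
base=$(git merge-base master testing)
echo "$base"
git log -1 --format=%s "$base"            # prints the message of that commit: C
```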

From these three commits, Git tries to determine the changes made in each branch and combines these changes in a new commit, a so-called merge commit. The branch that is currently checked out is then moved forward to the new merge commit. Because Git takes exactly three commits into account for this merge, this procedure is also known as a three-way merge.
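The result of such a merge can be examined afterwards. In the following sketch the two branches change different files, so the merge succeeds without conflicts; the file names, contents, and identity are assumptions for illustration, and --no-edit merely accepts the default merge message.

```shell
# Scratch repository; the identity below is a placeholder.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"

# Common ancestor C.
echo "base" > base.txt
git add base.txt
git commit -q -m "C"

# Commit D on testing adds one file.
git branch testing
git checkout -q testing
echo "from testing" > testing.txt
git add testing.txt
git commit -q -m "D"
d_id=$(git rev-parse HEAD)

# Commit E on master adds a different file.
git checkout -q master
echo "from master" > master.txt
git add master.txt
git commit -q -m "E"
e_id=$(git rev-parse HEAD)

# Three-way merge: creates merge commit F and moves master forward to it.
git merge -q --no-edit testing

# Prints the ID of F followed by the IDs of its two predecessors, E and D.
git rev-list --parents -n 1 HEAD

# The working tree now contains the changes of both branches.
ls
```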

What else is there?

Suppose you want to merge two branches that have not yet diverged. Let's take one of the previous examples again:

         (HEAD)                                         (HEAD)
            |                                              |
        (master)                                       (master)
            |                                              |
    A - B - C - D   - [git merge testing] ->   A - B - C - D
                |                                          |
            (testing)                                  (testing)

In this case, Git has evidently moved the master branch as well, but has not created a new commit. This is because the commit that the target branch (here master) points to is already an ancestor of the branch to be merged (in this example, commit C is the direct predecessor of commit D).

In this case, Git is content with simply “advancing” the pointer by a few commits. This behavior is called a fast-forward merge.
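A fast-forward merge can be recognized by the fact that no merge commit exists afterwards. A minimal sketch, assuming a scratch repository with empty example commits and a placeholder identity:

```shell
# Scratch repository; the identity below is a placeholder.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"
for msg in A B C; do git commit -q --allow-empty -m "$msg"; done

# Commit D exists only on testing; master stays at C.
git branch testing
git checkout -q testing
git commit -q --allow-empty -m "D"
git checkout -q master

# master is an ancestor of testing, so Git only advances the pointer.
git merge testing

# Both branches now point to D, and the history contains no merge commit.
git rev-parse master testing
merge_commits=$(git rev-list --count --merges master)
echo "$merge_commits"
```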

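What to do in the event of a conflict

Now what if two branches contain changes that collide with each other? Usually this only happens if exactly the same lines, or lines that are close together, were edited in both branches. First of all, Git points out these conflicts when you merge: for each affected file it prints a line of the form "CONFLICT (content): Merge conflict in <file>" and then stops with the message "Automatic merge failed; fix conflicts and then commit the result."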

The current conflict status can also be seen in the git status output, which lists the affected files as "both modified" under the heading "Unmerged paths".

In this case, the conflict must be resolved manually. The command git mergetool starts an interactive merge tool that you can use to resolve the conflict (often vimdiff is started for this purpose, but this can be changed via the merge.tool setting in the Git configuration). Alternatively, you can simply edit the file in a text editor. Within the file, Git marks the conflict with marker lines: the text between <<<<<<< and ======= is the version from the current branch, and the text between ======= and >>>>>>> is the version from the branch being merged.

After you have resolved the conflict, you can add the file to the index using git add. You then have to create the merge commit manually; the necessary changes are already in the staging area, so all you have to do is run git commit.
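The whole sequence, from the conflict to the finished merge commit, can be replayed in a scratch repository. The following is only a sketch: the file greeting.txt, its contents, the commit messages, and the identity are assumptions, and the "manual" resolution is simulated by overwriting the file with the desired result.

```shell
# Scratch repository; the identity below is a placeholder.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch master regardless of local defaults
git config user.name "Example Author"
git config user.email "author@example.com"

# Common ancestor C.
printf 'Hello World\n' > greeting.txt
git add greeting.txt
git commit -q -m "C"

# Commit D on testing changes the line ...
git branch testing
git checkout -q testing
printf 'Hello Testing\n' > greeting.txt
git commit -q -a -m "D"

# ... and commit E on master changes the same line differently.
git checkout -q master
printf 'Hello Master\n' > greeting.txt
git commit -q -a -m "E"

# The merge stops with a conflict and a non-zero exit status.
git merge testing || echo "merge stopped with conflicts"

# Short status: UU marks a file that was modified on both sides.
status_during=$(git status --short)
echo "$status_during"

# The file now contains the conflict markers.
marker_lines=$(grep -c '^<<<<<<< ' greeting.txt)
cat greeting.txt

# Resolve the conflict (here simply by writing the desired content),
# add the file to the index and create the merge commit.
printf 'Hello Master and Testing\n' > greeting.txt
git add greeting.txt
git commit -q -m "F: merge testing into master"

git log --graph --oneline
```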

Summary

"Whoa, that sounds very complicated," some of you might think. From my own experience, however, I can say that workflows with frequent branching and merging, unfamiliar as they may be at first, become second nature faster than you think. In a later article I will go into detail on how you can best integrate working with Git branches into your development workflow.

Have fun merging! ;)

Martin is a software architect and is enthusiastic about current topics in software and web development. He is also co-author of the books "Praxiswissen TYPO3" and "Zukunftssichere TYPO3-Extensions mit Extbase & Fluid" (both published by O'Reilly Verlag).