Gitflow branching model
Branches in git are very effective for developing without affecting the work of others. Yet without an agreed organisation within your team, branches soon become a nightmare: you have plenty of them, you don’t know what is where, you developed something really useful but can’t find how to merge it into the published version of your repository without breaking everything, etc.
Branching models are here to help you avoid that. They define a set of branch patterns, each with a specific function, so that, depending on what you want to do, you know how to do it in git. There are plenty of git workflows: some are suited to really small teams with simple development patterns, others to big teams or open-source projects where many people contribute, development is long, and having at least one version of the software that always works is vital.
I present here what is called git-flow. It was first presented by Vincent Driessen in 2010 in “A successful Git branching model”. Git flow works really well in a vast majority of development setups: when you’re developing a package on your own, when you’re working as a team of three people on a small data analysis project, or when you’re developing software with frequent updates with more than a dozen people.
It is admittedly rather heavy compared to some other workflows, but thanks to that it is able to handle most development problems. After reading this section, you might think that it is far too heavy for your needs. That’s perfectly fine! Even if you don’t apply it, consider it a theoretical reference from which you can cherry-pick ideas and concepts to find your perfect development workflow. All the images you’ll find in this section are taken from this excellent tutorial from Atlassian.
Dedicated branches for a functional development
The core idea behind git-flow is to maintain a single, tested and working version of your software, with frequent releases, while allowing the development of new features without compromising the fact that, well, it’s working. It is thus not really suited to software that maintains and distributes several versions simultaneously. Conversely, it works really well as part of an agile team organisation.
As you will encounter bugs, even on the main version that has already been released, it is also necessary to know how to handle those bugs without affecting development, and without having to fix them again and again because said fix was lost in the development history.
Let’s dive into the subject. Git flow defines a set of branch patterns that every branch you create will fall into. For each type of branch, there are rules as to when and where to create them from, when and where to merge them to, what you should do and not do with them, etc. I present them by family here, but you can find a shorter synthesis by branch pattern in the recap page Basic Git commands.
Main branches
The git-flow defines two core branches: master and develop. These are branches that receive tested and verified code, and are later used to release working versions of the software. By definition, they are quite passive branches: you don’t commit directly on them (except on develop in some rare cases), you rather just merge other branches onto them, often after a whole process of code reviewing. Because of that, they are also the most collective branches of the repo, receiving new code from various people, which makes them both fragile and prone to conflicts.
master holds the working and distributed version of your software, so it is by far the most important and sensitive part of your repo’s history, as its name indicates. One could say that the whole goal of git flow is to allow master to be (almost) always working, without bugs, so that it can be distributed or used at any time.
You never work directly on master. I mean it, never. Any modification, even the smallest one, could break whole parts of your code, and you really don’t want that to happen, especially if people outside your team are using the repository (or the software that is deployed from it).
There are ways, which we’ll see below, to fix even urgent bugs on master without directly working on it, allowing proper review and verification before changing it. Again, do not work directly on master. This branch must be composed of merge commits only, and as few of them as possible compared to other branches. Having a lot of commits on master is a good sign that you don’t review and test enough before merging.
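As a rough way to check the "merge commits only" rule, one can list the commits that were made directly on master. The sketch below builds a throwaway repository (the file name, commit messages and identity are made up for illustration) and then asks git for the non-merge commits on master's own line of history. In a healthy git-flow repository, only the initial commit should show up.

```shell
set -eu

# Throwaway repository, so the sketch is safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is master whatever the local default is
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"

# Some work lands on develop...
git checkout -q -b develop
echo "hello" > app.txt
git add app.txt
git commit -q -m "Add app"

# ...and reaches master through a merge commit only.
git checkout -q master
git merge -q --no-ff -m "Release 0.1" develop

# Commits made directly on master: follow master's own line of history
# (--first-parent) and hide merge commits (--no-merges).
git log --first-parent --no-merges --format='%h %s' master
```

Any line other than the initial commit in that listing is a change that bypassed the merge process.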
develop is the receiving branch for every new feature. It is the most active branch of the repository and, like master, it is also a collective branch by definition. Because it is on develop that you aggregate all the work that has been done here and there, it must be reviewed thoroughly, to be sure that a new feature does not break existing code or introduce regressions.
It is from develop that you trigger a merge onto master (a release), which actually makes it the backbone of your repository, where value is added to the project. Unlike master, it is not supposed to be accessible to users, so it does not have to be working at all times, even if it should. That is why committing directly on it is allowed when the change is simple and safe. In any case, the content of develop will be reviewed before merging it onto master.
It is a good idea not to allow rewriting history on develop, because it would mess up everyone’s git history and, first and foremost, because it makes it harder to monitor the development history of your code and to find the origin of a bug.
Feature branches
A feature branch is a branch dedicated to the development of… new features! It is where, as a developer, you will spend most of your time. The idea of a feature branch is to provide a safe space to develop a whole new feature. When you start working on it, there is nothing. When you’re finished, the feature branch contains new code that is supposed to work and can be merged onto the main code. It can also modify existing code, but again it is to be merged only when the modifications result in something that works.
Naturally, a feature branch starts from develop and will be merged back onto develop. Several people can develop on the same feature branch, or you can work on it alone, even without synchronising it with the remote until it’s ready to be merged (which is not advised, for obvious back-up reasons).
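The life cycle of a feature branch can be sketched as follows, in a throwaway repository. The branch name feature/greeting, the file and the commit messages are hypothetical. The --no-ff option is a common choice with git-flow (not something this section imposes) to keep an explicit merge commit on develop.

```shell
set -eu

# Throwaway repository, so the sketch is safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is master whatever the local default is
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop

# 1. Start the feature from develop.
git checkout -q -b feature/greeting develop

# 2. Work on it, in as many commits as needed.
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# 3. Merge it back onto develop, keeping a merge commit for traceability.
git checkout -q develop
git merge -q --no-ff -m "Merge feature/greeting into develop" feature/greeting

# 4. The feature branch is no longer needed once merged.
git branch -q -d feature/greeting
```

In a team setting, step 3 would typically happen through a reviewed pull or merge request on the hosting service rather than a local merge.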
You can do virtually anything on a feature branch: willingly break things to see how they work, rewrite history again and again, make a thousand commits (we’ll see how to fix that afterwards). The only thing is, since it is meant to be merged onto develop to add value to the repository, it is a good (if not necessary) idea to clean and review it before merging, so that what is added to develop is not only clean and working code, but also comes with a clean git history to ease future bug tracking.
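One possible way to clean up a messy feature history before merging, among several (interactive rebase being another), is to collapse all the work-in-progress commits into a single one with a soft reset. This is only an illustrative sketch with hypothetical branch and file names, not necessarily the approach presented later in this guide. It rewrites history, so it is only appropriate on a feature branch, and a branch that was already pushed would need a forced push afterwards.

```shell
set -eu

# Throwaway repository, so the sketch is safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is master whatever the local default is
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop

# A feature branch with three untidy work-in-progress commits.
git checkout -q -b feature/report develop
for step in 1 2 3; do
  echo "draft $step" >> report.txt
  git add report.txt
  git commit -q -m "wip $step"
done

# Move the branch pointer back to where it left develop, keeping every
# change staged, then record the whole feature as one clean commit.
base="$(git merge-base develop HEAD)"
git reset -q --soft "$base"
git commit -q -m "Add report"

# The feature is now a single commit on top of develop.
git log --oneline develop..HEAD
```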
In general, a feature branch’s scope is a single feature. That means you don’t develop various features on the same branch: create another branch instead, don’t mix things. This allows an easy review and a better understanding of what has changed in the code, and thus makes the whole development process safer.
Conversely, for some really big features that need a lot of development (both in terms of time and lines of code), it is a good idea to split the work into several feature branches, provided that each feature branch adds coherent value on its own. It doesn’t have to yield fully working new code, but it should at least encompass changes that make sense together.
Another possibility for big features is to create subfeature branches, which start from and are merged back onto the feature branch itself. In that case, there will be only one final, heavy merge of the full feature branch onto develop. I don’t really advise that, since it is prone to missing changes from develop while developing, making the final merge tricky and possibly introducing unforeseen bugs. Testing and merging every subfeature onto develop on the go is much safer in my view.
For teams, several feature branches are likely to coexist. That is not at all a problem but requires some precautions as to how and when to merge, and how to update an existing feature branch with work from already merged feature branches. We’ll explain thoroughly how to handle such situations in the Collaborating with git section.
Releases
There will be a moment when develop has received enough new features from feature branches, and you want those new features to be available on the main version of your project (hello master). One could simply merge develop onto master, but that’s a bad idea for two reasons:
First, even if all the work that was added to develop was reviewed, it is important to be sure that the whole bunch of new features work well together. It is hard, but possible, to test that for every feature merge, but it is in any case safer to thoroughly test what will be released on master, and to have a dedicated environment (a branch and possibly other testing infrastructures that we don’t talk about here) to do that.
Second, it is very likely that you’ll need to change little things when testing the release, at least the version number, documentation and so on, which usually get forgotten during development. These are neither feature nor bug changes, but they need to be done before a release. Since you’re about to commit some stuff, you don’t want to interfere with the development process on develop, which may well continue in the meantime.
Enter the release branch. It is a simple, short-lived branch that is created from develop, on the last commit that is to be included in the release. This has the benefit of fixing once and for all the version of the code that will be released, no matter what continues to happen on develop (that’s one of the reasons you don’t rewrite history on develop).
Here, you might review all the new code once more, but at that point every new line should already have been thoroughly reviewed during feature development. This is also when you test that your software is working as expected for users, or that you’re happy with the results you have if it is an analysis project, etc.
Once you’re certain that everything is fine (no bugs, no typos, expected behaviour), you release the version, which means merging the release branch onto master. That way, master is incremented in a single commit with all the new features that were on develop, allowing easy tracking of version updates on the main version of the project.
It is not compulsory, but it is good practice to tag each release on the master branch. A git tag is simply a label that is added to an existing commit, making it easier to track versions or to trigger some specific behaviour (for instance deployment to a webserver) from the git hosting service. As with commits, git tags must be pushed to the remote:
git tag -a v0.1 -m "Version kitty cat" # tag the release commit on master
git push --tags # push the new tag to the remote
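Note that git push --tags sends every local tag to the remote. When only the release tag should be published, the tag can be pushed by name instead. The sketch below tries this against a local bare repository standing in for the remote; the directory names and identity are made up for illustration.

```shell
set -eu

# A throwaway workspace with a bare repository playing the role of the remote.
work="$(mktemp -d)"
cd "$work"
git init -q --bare remote.git
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master   # first branch is master whatever the local default is
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Release 0.1"
git remote add origin "$work/remote.git"
git push -q origin master

# Tag the release commit on master, then push that single tag by name.
git tag -a v0.1 -m "Version kitty cat"
git push -q origin v0.1

# The tag is now visible on the remote.
git ls-remote --tags origin
```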
A release branch has to be merged twice: once onto master, once onto develop, since you may have committed stuff onto it that develop needs to get too. Otherwise you’ll end up developing on a branch that is not up-to-date with the main, distributed version of your project, which is bad. I mean, really bad, since further releases will be hindered by the fact that develop and master have diverged.
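Put together, a release could look like the following sketch, run in a throwaway repository. The branch name release/0.1, the VERSION file and the commit messages are hypothetical conventions chosen for the example; what matters is the double merge, onto master and then back onto develop.

```shell
set -eu

# Throwaway repository, so the sketch is safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is master whatever the local default is
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"

# develop has received some features.
git checkout -q -b develop
echo "feature A" > app.txt
git add app.txt
git commit -q -m "Add feature A"

# 1. Freeze the release content by branching from develop.
git checkout -q -b release/0.1 develop

# 2. Release-only changes (version number, documentation...).
echo "0.1" > VERSION
git add VERSION
git commit -q -m "Bump version to 0.1"

# 3. First merge: onto master, then tag the release commit.
git checkout -q master
git merge -q --no-ff -m "Release 0.1" release/0.1
git tag -a v0.1 -m "Version 0.1"

# 4. Second merge: back onto develop, so that it gets the release commits too.
git checkout -q develop
git merge -q --no-ff -m "Merge release/0.1 back into develop" release/0.1
git branch -q -d release/0.1
```

After step 4, master and develop hold the same content, which can be checked with git diff master develop (it prints nothing).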
Fix branches
In an ideal development workflow, you don’t have to manage bugs in your code. You develop features, merge them onto develop, then release regularly to master and deploy your software or publish the updated version of your package.
However, since bugs will occur, one needs a way to handle them that, first, fixes the bug without preventing new features from being developed in the meantime and, second, makes sure that the fix gets applied to the current main version and to all future ones.
Git flow defines hotfix branches for that. Actually, most git hosting services also define bugfix branches on top of hotfix. In both cases, these are branches dedicated to bugs, which are not feature development. The difference between them lies in whether the bug was found on master or on develop.
hotfix branches are for urgent bugs that are found on master and need to be fixed as soon as possible, without being able to wait for the next release. A hotfix branch starts from master and merges back directly onto master, bypassing the usual feature-then-release development workflow.
For that reason, they are quite delicate to handle and they really need to be limited to the bug fix only. Any other change can wait, it will be published with the next release. This allows precise review and testing of the bug fix, and makes it easier to merge onto master having the confidence that the changes will not break other things.
As with release branches, any hotfix branch that is merged onto master must also be merged onto develop. Without this additional merge, the bug will occur again with the next release, since the buggy part of the code will not have been fixed on develop, and the next release might erase the bug fix that was done only on master.
That’s another reason why it’s critical not to allow master and develop to diverge. Diverging master and develop branches make it really hard to carry changes from one onto the other, triggering merge conflicts every time. We’ll talk later about merge conflicts. They are common and healthy between features, but dangerous when found between develop and master because, depending on how you handle them, you can lose information on the choices made and reintroduce bugs that had been fixed. So for now remember one thing:
merge back hotfix branches onto develop.
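A hotfix can be sketched as follows, in a throwaway repository where master carries a bug and develop has already moved on with an unrelated feature. The branch name hotfix/total-sign, the file contents and the tag v0.1.1 are hypothetical.

```shell
set -eu

# Throwaway repository, so the sketch is safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is master whatever the local default is
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"

# The released version on master contains a bug.
echo "total = price - tax" > app.txt
git add app.txt
git commit -q -m "Initial release"

# Meanwhile, development continues on develop.
git checkout -q -b develop
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add new feature"

# 1. Start the hotfix from master and limit it to the bug fix.
git checkout -q -b hotfix/total-sign master
echo "total = price + tax" > app.txt
git commit -q -am "Fix sign in total"

# 2. Merge it onto master and tag the patched version.
git checkout -q master
git merge -q --no-ff -m "Hotfix: fix sign in total" hotfix/total-sign
git tag -a v0.1.1 -m "Version 0.1.1"

# 3. Merge it onto develop as well, so the next release keeps the fix.
git checkout -q develop
git merge -q --no-ff -m "Merge hotfix/total-sign into develop" hotfix/total-sign
git branch -q -d hotfix/total-sign
```

After this, both branches carry the fix, while the unfinished feature stays on develop only.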
bugfix branches are simpler to handle, since they are here to fix bugs that are found on develop during the development process, that is, before a release. That can occur when two new features, despite working well on their own, turn out to introduce an unexpected behaviour once merged onto develop.
In that case, a bugfix branch is opened from develop then merged back onto it, waiting to be published with the next release.
All in all, a bugfix branch behaves exactly like a feature branch. It is just named differently to allow more precise product management. That process also shows the robustness of a well-done git flow: it is really not a problem if unexpected bugs are introduced while developing, provided you detect them before the next release. If so, the main distributed version will not be affected and you don’t have to risk modifying it on the fly. That also underlines the importance of peer-reviewing and functional testing for every new feature, and before every release.
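For comparison with a hotfix, here is a minimal sketch of a bugfix, where the bug only exists on develop. The branch name bugfix/calc-operator and the file contents are hypothetical. The final command shows that the fix waits on develop and that master is left untouched until the next release.

```shell
set -eu

# Throwaway repository, so the sketch is safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is master whatever the local default is
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"

# A bug slipped onto develop during development.
git checkout -q -b develop
echo "a + b" > calc.txt
git add calc.txt
git commit -q -m "Add calc"

# Same life cycle as a feature branch: from develop, back onto develop.
git checkout -q -b bugfix/calc-operator develop
echo "a * b" > calc.txt
git commit -q -am "Fix operator in calc"
git checkout -q develop
git merge -q --no-ff -m "Merge bugfix/calc-operator into develop" bugfix/calc-operator
git branch -q -d bugfix/calc-operator

# Commits waiting on develop for the next release; master has none of them.
git log --oneline master..develop
```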