git has seen such huge adoption over the past few years that it’s no wonder a
number of suggested workflows have emerged for it. What git lacks in
user-friendliness, it certainly makes up for in flexibility, and it’s this
flexibility that has resulted in such a wide range of possible use cases.
I’ve spent a while working with git myself, and during that time I’ve seen
a few workflows. Different workflows clearly suit different situations. The
recommendations for working within the git source itself, for example, are
based on the fact that multiple versions need to be maintained, and thus there
is a “graduation” system where changes are first merged into the oldest branch
requiring support, and then gradually merged upwards to later branches, until a
commit finally makes it onto the master branch.
This flow doesn’t really apply to most projects, and it’s not entirely
unreasonable to say that a lot of the features of git are unnecessary for the
typical developer. That’s the wonder of git, though: you can customize it to
suit your needs.
Workflow
At Causes, we use git in a perhaps unconventional, but simple, way.
Whenever we start on a task, we create a new branch to encapsulate all the
changes that are part of that task (you might have heard this referred to as a
“topic” or “feature” branch). As you work on it, the canonical source
(origin/master) will be updated by other developers, so we rebase the feature
branch on top of origin/master often, which surfaces conflicts earlier rather
than later (this is in contrast to the merge option, where you use a merge
commit to address any conflicts, dealing with them only at that point in time).
After commits are approved via our code review tool Gerrit, the commits are
cherry-picked serially onto origin/master.
Conceptually, this means our commit history is perfectly linear, as any chunk
of work is simply a series of commits that is appended to the history when it
is merged onto master. There are no “bubbles” or branches of any kind in the
output of git log --graph.
Experts might find this to be a great injustice, as you lose information about where branches started and when they were merged back in, but we have not found this to be a significant loss. We believe the true value of a repository history lies in well-written commit messages. We also believe that long-lived topic branches are to be avoided, as we aim to push something to master by the end of every day.
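The workflow described in this section can be sketched in a throwaway repository. This is only an approximation under stated assumptions: the branch name my-feature, the file names, and the demo identity are made up for illustration, the teammate's push is simulated locally, and a local git cherry-pick stands in for whatever Gerrit does on the server once a change is approved.

```shell
#!/bin/sh
# Sketch of the topic-branch / rebase / cherry-pick flow in a scratch repo.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/origin.git"      # stands in for the shared remote
git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/master    # use "master" whatever the default is
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "Initial commit"
git push -q origin master

# Start a task on its own topic branch (hypothetical name).
git checkout -q -b my-feature origin/master
echo one > feature.txt
git add feature.txt
git commit -q -m "Add feature file"

# Meanwhile a teammate lands a commit on origin/master (simulated here).
git checkout -q master
echo other > other.txt
git add other.txt
git commit -q -m "Unrelated change from a teammate"
git push -q origin master

# Rebase the topic branch onto the updated origin/master, early and often.
git checkout -q my-feature
git fetch -q origin
git rebase -q origin/master

# After review, append the topic commits to master one by one.
git checkout -q master
git cherry-pick origin/master..my-feature >/dev/null
git push -q origin master

# The graph is a single straight line with no merge commits.
git log --graph --oneline master
merges=$(git rev-list --merges --count master)
echo "merge commits: $merges"
```

Because every topic commit is replayed on top of the current tip, git rev-list --merges finds nothing to report, which is the "no bubbles" property described above.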
Topic Branch Dependency
A situation that can sometimes arise is when you are working on a topic
branch that builds on top of another topic branch which has not yet been merged
onto origin/master (due to pending code review, for example).
Call the branch awaiting review feature-1 and the branch built on top of it
feature-2. To stay productive, you want to keep working on feature-2, but it’s
likely that you’ll have to make some changes to feature-1 after feedback from
code review, or you’ll have to rebase feature-1 onto a new master after
fetching origin/master, or both.
This requires you to rebase each branch serially in order to preserve the
references of each branch. To see why, consider what would happen if you were
to just rebase feature-2 onto an updated master using
git rebase master feature-2 (notice in the figure below that the colours of the
commits are modified to emphasize that their hashes have changed from the
rebase):
Notice that the commit feature-1 is pointing to is no longer the predecessor
of feature-2, which during the rebase had its parents updated to the same
commits as feature-1 but with different hashes due to the rebase. To fix
this, you would need to manually set feature-1 to point to the new commit
(git checkout feature-1 && git reset --hard feature-2~2).
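The problem and the manual fix can be reproduced in a scratch repository. This is a sketch under assumptions: the commit helper and the file names are hypothetical, and feature-2 is given exactly two commits of its own so that feature-2~2 is the rebased copy of the tip of feature-1.

```shell
#!/bin/sh
# Sketch: rebasing only feature-2 leaves feature-1 on the old commits.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one commit that adds a file named after its message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature-1; commit f1-a; commit f1-b
git checkout -q -b feature-2; commit f2-a; commit f2-b
git checkout -q master;       commit upstream-change

# Rebase only feature-2: its copies of f1-a and f1-b get new hashes.
git rebase -q master feature-2

# feature-1 still points at the old f1-b, which feature-2 no longer contains.
if git merge-base --is-ancestor feature-1 feature-2; then
  stale=no
else
  stale=yes
fi
echo "feature-1 left behind: $stale"

# Manual fix from the text: move feature-1 to the rebased copy of f1-b.
git checkout -q feature-1
git reset -q --hard feature-2~2
```

The ~2 in the reset is tied to the number of commits feature-2 adds on top of feature-1, so it has to be adjusted by hand for every chain, which is part of what makes this fix tedious.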
Alternatively, you could avoid this problem by rebasing each branch in serial,
starting from master.
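A minimal sketch of that serial rebase follows, using the same hypothetical scratch setup (the commit helper and file names are assumptions for illustration). It relies on documented git rebase behaviour: commits that introduce the same textual change as a commit already in the upstream are skipped, so the stale copies of the feature-1 commits are not replayed a second time when feature-2 is rebased.

```shell
#!/bin/sh
# Sketch: rebase each branch in order, starting from the one closest to master.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one commit that adds a file named after its message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature-1; commit f1-a; commit f1-b
git checkout -q -b feature-2; commit f2-a; commit f2-b
git checkout -q master;       commit upstream-change

# Step 1: feature-1 onto the updated master.
git rebase -q master feature-1
# Step 2: feature-2 onto the rebased feature-1.
git rebase -q feature-1 feature-2

git log --graph --oneline feature-2
```

If feature-1 was amended after review, its patches no longer match the old copies inside feature-2, and the second step may stop on conflicts. In that case git rebase --onto feature-1 followed by the old feature-1 commit and feature-2 names the commits to replay explicitly.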
Managing Large Dependency Chains
This becomes annoying in the case where you have very large dependency chains, as you have to traverse the chain up to master and then recursively rebase each branch onto its rebased parent in serial.
Although this is a relatively rare occurrence, I was intrigued by the idea of automating this task. I call it a recursive rebase, and I use it regularly as part of my workflow. It’s powered by the following recursive script:
In a nutshell, the script starts on the current branch and recursively follows
each branch’s tracking (i.e. upstream) branch until it finds one whose remote
is actually a remote repository (e.g. origin instead of “.”, which is used to
represent the local repository). It then works back down the chain, rebasing
each branch onto its freshly rebased parent.
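The description above can be sketched in POSIX shell. This is not the author's script, only one way to get the behaviour described. The function name recursive_rebase, the commit helper, and the demo repository are all assumptions, as is the choice to fetch from the remote once the top of the chain is reached.

```shell
#!/bin/sh
# Sketch of a recursive rebase driven by each branch's tracking configuration.
set -eu

# Rebase branch $1 after first bringing everything it tracks up to date.
recursive_rebase() {
  remote=$(git config --get "branch.$1.remote" || true)
  if [ -z "$remote" ]; then
    echo "branch $1 has no tracking branch" >&2
    return 1
  fi
  if [ "$remote" = "." ]; then
    # Tracks a local branch: bring that branch up to date first.
    recursive_rebase "$(git rev-parse --abbrev-ref "$1@{upstream}")"
  else
    # Tracks a real remote: refresh it, and the recursion stops here.
    git fetch -q "$remote"
  fi
  # The upstream is looked up again because POSIX sh has no local variables,
  # so anything stored before the recursive call may have been overwritten.
  git rebase -q "$(git rev-parse --abbrev-ref "$1@{upstream}")" "$1"
}

# Demo in a scratch repository (all names below are made up).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git push -q origin master
git checkout -q --track -b feature-1 origin/master; commit f1   # remote = origin
git checkout -q --track -b feature-2 feature-1;     commit f2   # remote = .
git checkout -q master; commit upstream-change
git push -q origin master                    # simulated teammate change

git checkout -q feature-2
recursive_rebase "$(git rev-parse --abbrev-ref HEAD)"
git log --graph --oneline feature-2
```

Each branch records what it builds on through its tracking configuration (set here with --track), so the chain never has to be spelled out on the command line. The sketch does no cleanup if a rebase stops on a conflict; the remaining branches would have to be rebased after resolving it.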
I’ve set the alias git rrb to execute this script, completely automating this
process for me. There are probably some edge cases I’ve missed, but after
having used it for a while it’s worked without any surprises. It may be
over-engineered, but it’s pretty darn cool.