How to git without hub

How did people use Git collaboratively before GitHub was a thing?
The two are often coupled, but some projects don’t rely on GitHub (or any centralized service, for that matter) for their Git hosting needs.
Not just any old projects, either: the Linux kernel, Debian, Apache, GNU coreutils and Go are some examples of well-known projects that handle their Git repositories on their own infrastructure.

Ever wondered how they manage?
The simplest and most rudimentary way of using Git collaboratively is by sending changes to the maintainer in a file via email.

Let’s go over what that workflow might look like.

Where are my changes?

We need a way to get a set of changes (often called a changeset) out of your local repository so that they can be sent elsewhere easily.

A commit is represented by a hash or an alias, which wouldn’t be very useful by itself in our case.
We can however get the underlying changes using diff.

Creating a diff

We can create a diff between two commits with something like git diff <hash-1> <hash-2> and send the results to a file:

git diff <hash-1> <hash-2> > mypatch.diff

You can also use a range of commits, or something like git diff HEAD~2 HEAD to get the diff for the last 2 commits.
If given only one commit, git diff compares it against your working directory (so git diff HEAD on a clean working directory prints nothing, and git diff HEAD~2 also picks up any uncommitted changes).

This gives us a neat file with the changes we want to send upstream.
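To make that concrete, here is a minimal sketch that builds a throwaway repository and exports its last commit as a diff. The scratch directory, the file name greeting.txt and the identity are made up for the demo.

```shell
#!/bin/sh
# Throwaway demo: build a tiny repo, then export the last commit as a plain diff.
set -e
repo=$(mktemp -d)                            # scratch location, assumed for the demo
cd "$repo"
git init -q .
git config user.name "Casey Contributor"     # made-up identity
git config user.email "casey@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

echo "hello, world" > greeting.txt
git commit -q -a -m "Extend greeting"

# Diff between the previous commit and the current one, saved to a file.
git diff HEAD~1 HEAD > mypatch.diff

cat mypatch.diff
```

The resulting file holds only the line-level changes. There is no author, date or commit message in it.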

Applying the diff

Just run git apply mypatch.diff!

This will apply the changes in the diff to the working directory, but they won’t be staged. The maintainer would have to stage them and create a commit to actually add the changes to the source tree.
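A hedged sketch of the maintainer's side, using two throwaway repositories with invented names and identities. It also shows git apply --stat and git apply --check, which let you look at a diff before touching anything.

```shell
#!/bin/sh
set -e
base=$(mktemp -d)                            # scratch location, assumed for the demo
cd "$base"

# Contributor side: a base commit plus one change, exported as a diff.
git init -q contributor
cd contributor
git config user.name "Casey Contributor"     # made-up identity
git config user.email "casey@example.com"
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"
echo "hello, world" > greeting.txt
git commit -q -a -m "Extend greeting"
git diff HEAD~1 HEAD > "$base/mypatch.diff"
cd "$base"

# Maintainer side: a clone rewound to the commit the diff was made against.
git clone -q contributor maintainer
cd maintainer
git reset -q --hard HEAD~1

git apply --stat  "$base/mypatch.diff"   # summary only, changes nothing
git apply --check "$base/mypatch.diff"   # exits non-zero if it would not apply
git apply "$base/mypatch.diff"

# " M" means: modified in the working tree, not staged.
status=$(git status --porcelain)
echo "$status"
```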

Shortcomings

So this is great, but there are a couple of glaring issues here:

The original author and metadata of the changes got lost
The original commits all got squashed into one (or re-organized however the maintainer decides)
A person that didn’t write the changes (maintainer) got to commit them and appear as the author in the log

This is fine for a quick POC or draft to share during development, but doesn’t really scale well.

We need a way to maintain the original commits and their metadata, so the original contributor ends up in the log and their work is merged as provided (assuming no changes are needed from the maintainer).

Creating a patch

A patch is similar to a diff but keeps all the relevant metadata.
Since diffs are sometimes colloquially called patches, these can be called formatted patches.

You can create one using git format-patch:

git format-patch -1 <commit-hash> --stdout > my_patch.patch

Here, -1 is the number of commits to include in the patch, counting back from (and including) <commit-hash>.
So for a series of commits like:

A -- B -- C -- D

git format-patch -1 C and git format-patch B..C would achieve the same thing: a patch of the changes between B and C.
Conversely, git format-patch C would produce a patch of the changes between C and HEAD, which in this case is D.

The --stdout > my_patch.patch bit is just to send the data to a file.

It would be nice if git diff and git format-patch had consistent interfaces, but consistency is not really a strong point of git’s UI…

Anyway, if you inspect this file you’ll see that it not only contains the changes but also a bunch of metadata about them.
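If you want to check the range behaviour for yourself, this sketch rebuilds the A -- B -- C -- D history in a scratch repository. The tags named after each commit, the file name and the identity are assumptions made for the demo.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)                            # scratch location, assumed for the demo
cd "$repo"
git init -q .
git config user.name "Casey Contributor"     # made-up identity
git config user.email "casey@example.com"

# Build the history A -- B -- C -- D, tagging each commit with its letter.
for name in A B C D; do
    echo "$name" >> history.txt
    git add history.txt
    git commit -q -m "Commit $name"
    git tag "$name"
done

out=$(mktemp -d)
git format-patch -1 C --stdout > "$out/last_one.patch"   # just C
git format-patch B..C --stdout > "$out/range.patch"      # also just C
git format-patch C    --stdout > "$out/since_c.patch"    # everything after C, i.e. D

# Collect the Subject headers to see which commit ended up in which file.
subject_last_one=$(grep '^Subject:' "$out/last_one.patch")
subject_range=$(grep '^Subject:' "$out/range.patch")
subject_since_c=$(grep '^Subject:' "$out/since_c.patch")
printf '%s\n%s\n%s\n' "$subject_last_one" "$subject_range" "$subject_since_c"

# The metadata a plain diff lacks: author, date and commit message.
grep -E '^(From|Date|Subject):' "$out/since_c.patch"
```

The first two files carry commit C and the third carries commit D. Each one starts with mail-style From, Date and Subject headers ahead of the actual diff.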

Let’s see how an actual formatted patch can be applied (because of course it’s not git apply like before…).

Applying the patch

Running git apply on a formatted patch “works” but leaves all that metadata out of the picture, just like before.

So instead, we use git am my_patch.patch (apply mailbox if you’re curious).

Now, all the original commits in the patch (with their timestamp and messages) will be added to the tree, with the original contributor/s as the author/s.
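This of course might create conflicts (although it shouldn’t if the patch was created with care, more on that later). Passing --3way lets git am fall back to a three-way merge, so conflicts can be fixed like any other merge conflicts; stage the resolved files and run git am --continue to resume the process, or git am --abort to back out.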

In this case, the maintainer applying these changes is recorded only as the committer, which git log doesn’t show by default; the author stays the original contributor.
This might be fine, but for later reference it might be useful to use the --signoff flag.
This way, the maintainer will be referenced at the end of each commit message with something like this:

Signed-off-by: Some One <someone@mail.yes>
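Here is a small sketch of that round trip, again with two scratch repositories. The names Casey Contributor and Morgan Maintainer are invented, and the maintainer's clone is rewound by hand to stand in for a tree that doesn't have the change yet.

```shell
#!/bin/sh
set -e
base=$(mktemp -d)                            # scratch location, assumed for the demo
cd "$base"

# Contributor: a base commit plus one new commit exported as a formatted patch.
git init -q contributor
cd contributor
git config user.name "Casey Contributor"     # made-up identity
git config user.email "casey@example.com"
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"
echo "hello, world" > greeting.txt
git commit -q -a -m "Extend greeting"
git format-patch -1 HEAD --stdout > "$base/my_patch.patch"
cd "$base"

# Maintainer: a clone that does not have the new commit yet.
git clone -q contributor maintainer
cd maintainer
git reset -q --hard HEAD~1
git config user.name "Morgan Maintainer"     # made-up identity
git config user.email "morgan@example.com"

git am -q --signoff "$base/my_patch.patch"

# Author comes from the patch, committer and sign-off from whoever ran git am.
author=$(git log -1 --format='%an')
committer=$(git log -1 --format='%cn')
trailer=$(git log -1 --format='%b' | grep '^Signed-off-by:')
echo "author:    $author"
echo "committer: $committer"
echo "$trailer"
```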

Where is my fork button?

There is none. You just clone the repository, work on your local copy and send the patch to the maintainer(s).

Here’s the thing: forks are not really a git thing, they are a GitHub ~~complication~~ abstraction.

When you click that fork button, what essentially happens is that GitHub creates a copy of that repository under your user, with a reference to the original for ease of integration.
You would then clone your copy of the repo, work on that, push to your remote and then handle the pull request to upstream through GitHub’s UI.

Here, you just clone the original (upstream) repo, work however you want, and send the patch to the maintainer.

GitHub-less Workflow

So here’s what the full workflow might look like, from both perspectives:

As contributor

Clone the project and create a branch.

git clone git@server.com:Upstream/Repo.git
cd Repo
git checkout -b cool_branch

Do and commit the work.

git commit -a -m "no idea what i'm doing"

Pull any new changes and rebase your branch onto master.
This makes sure your changes don’t cause conflicts and are up-to-date with the main branch. The maintainer will thank you if you do this and yell at you if you don’t.

git checkout master
git pull
git checkout cool_branch
git rebase master

Create the formatted patch:

git format-patch master --stdout > the_patch.patch

Like we saw before, git will produce a patch containing every commit that is on the current branch but not on master.
This is the data you would see in a GitHub Pull Request.
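As a side note, if you leave out --stdout, git format-patch writes one numbered file per commit instead of a single stream. A minimal sketch, assuming a scratch repository with a two-commit cool_branch and an output directory picked for the demo:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)                            # scratch location, assumed for the demo
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is "master"
git config user.name "Casey Contributor"     # made-up identity
git config user.email "casey@example.com"
echo "base" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

git checkout -q -b cool_branch
echo "first" >> notes.txt
git commit -q -a -m "Add first note"
echo "second" >> notes.txt
git commit -q -a -m "Add second note"

# One numbered .patch file per commit that is on cool_branch but not on master.
outgoing=$(mktemp -d)
git format-patch master -o "$outgoing" > /dev/null
ls "$outgoing"
patch_count=$(ls "$outgoing" | wc -l | tr -d ' ')
```

git am accepts several patch files at once, so the maintainer can apply the whole directory in order.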

Send the patch to the maintainer, and you’re done!

As maintainer

Get the patch somehow (email, curl, etc.) and apply it to a newly created branch.

git checkout -b dont_trust_that_guy
git am --signoff the_patch.patch

Hope the contributor did a rebase to avoid conflicts, yell at them if they didn’t.
Review the work and merge it if correct and up to standards.

git checkout master
git merge dont_trust_that_guy
git push

Done! Now the contribution is in the main branch, yay!
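The whole round trip can be rehearsed locally. This is a sketch under a few assumptions: a bare repository on disk stands in for git@server.com:Upstream/Repo.git, the patch travels through a shared scratch directory instead of email, and all names and identities are invented.

```shell
#!/bin/sh
set -e
base=$(mktemp -d)                            # scratch location, assumed for the demo
cd "$base"

# Seed a bare "upstream" repository with one commit on master.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Morgan Maintainer"     # made-up identity
git config user.email "morgan@example.com"
echo "v1" > README
git add README
git commit -q -m "Initial commit"
cd "$base"
git clone -q --bare seed upstream.git

# Contributor: clone, branch, commit, rebase, export.
git clone -q upstream.git contributor
cd contributor
git config user.name "Casey Contributor"     # made-up identity
git config user.email "casey@example.com"
git checkout -q -b cool_branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
git checkout -q master
git pull -q
git checkout -q cool_branch
git rebase -q master
git format-patch master --stdout > "$base/the_patch.patch"
cd "$base"

# Maintainer: apply on a review branch, then merge and push.
git clone -q upstream.git maintainer
cd maintainer
git config user.name "Morgan Maintainer"
git config user.email "morgan@example.com"
git checkout -q -b dont_trust_that_guy
git am -q --signoff "$base/the_patch.patch"
git checkout -q master
git merge -q dont_trust_that_guy
git push -q origin master
cd "$base"

# Upstream now lists the contributor as the author of the new commit.
upstream_log=$(git --git-dir=upstream.git log --format='%an: %s' master)
echo "$upstream_log"
```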

WTF do I care?

Why would anybody care to work like this when we have lovely, Microsoft-provided, green “Merge” buttons?

Well, for starters some projects started before GitHub was a thing. Some projects are so big and distributed in nature that the GitHub workflow isn’t really fit for purpose (such as the Linux kernel).
Some might argue that having the biggest repository of free and/or open source software hosted in Microsoft’s servers might not be the brightest idea…

In any case, as a contributor you might not really have a say in this. If you want/need to contribute changes to these kinds of projects you’ll have to adapt to how they work.

Apart from that, I just think it’s pretty cool to be able to send a quick diff or a patch to a co-worker or a maintainer/contributor without the usual rigmarole of creating a branch, pushing to remote, fighting with the pipeline, etc.
It’s just a file you can send via Matrix or Slack.

Simplicity has a charm all of its own.
