Happy little Trees and bumpy Roads
Your first contribution to an open source project can be a very rewarding experience. Once your feature, fix or enhancement is merged as part of a Github pull request (PR), your brain is likely going to start looking for the next problem to solve in open source land with your newly acquired skills.
If you’ve already gone through this exercise though, you also know that the git Swiss Army knife and related code collaboration platforms (Github, etc.) can feel overwhelming, like a black art. This is especially true when things go south, e.g. when the reviewer requests changes to your code.
But even in the case where your code contribution is fine, the reviewer(s) might ask you to perform additional steps on your commits before accepting (“merging”) them into the upstream (“target”) repository.
Here’s a short list of typical reviewer (or bot) comments you might see during a review:
- “You must sign all of your commits!”
- “Please squash your commits before merging.”
- “Can you please rebase your commits onto the latest changes?”
- “Please follow our contribution guidelines and add XYZ to the message title/body”
Let me tell you that I’ve been in the same situation several times, both as a contributor and reviewer, e.g. on the VMware Event Broker Appliance (VEBA) project.
I have witnessed many times how quickly an enthusiastic (first-time) contributor, often with little to no knowledge about git and software development, can fail desperately due to tooling, terminology or cryptic error messages.
Over time I established a set of patterns for my daily work with git that help me stay organized and protect me from common mistakes, e.g. pushing to the wrong branch/repository. This is even more important when I work on multiple PRs in parallel.
Don’t ask how I know 😎
Of course, I adjust these patterns here and there, depending on the contribution guidelines/workflows of a project or team I’m working with. I hope you find them useful, especially when you are blocked or frustrated 😄
This post aims to address some common challenges a git novice might face during a contribution, e.g. via a Github pull request. The advice and best practices given here are definitely opinionated, based on my own observations and the way I git things done, so take them with the typical grain of salt.
Basics or internals of git and related tooling and platforms, such as Github, won’t be covered. See the end of this post for some useful links with details on git(hub) concepts, workflows and internals.
Always make sure you read and understand the project’s contribution guidelines and follow the provided issue/pull request templates before you start coding and open a pull request.
If the project does not provide any of these, first search for related issues. If your idea/fix has not already been discussed, open an issue to avoid lengthy discussions during a review on why you filed a PR. Upfront, clear and friendly communication is key in an open source project.
git Things done
The nice thing about git when dealing with repositories, aka remotes, is that it does not distinguish between a remote URL and a local folder. We can use this to our advantage here to show some of the concepts in action without creating a repository on a hosting platform.
Github is not part of the git CLI. Thus the latter lacks Github-related primitives to work with issues, releases, pull requests, etc. I highly recommend installing the Github CLI gh as a productivity booster.
Let’s create a local demo repository awesome-project to understand the patterns explained in the subsequent sections.
We can verify that a commit was created with the powerful git log command.
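As a minimal sketch of these two steps, the following creates the demo repository in a temporary folder, adds one commit and lists it. The README content, the commit message and the throwaway identity are assumptions for illustration only.

```shell
set -eu
# Throwaway identity so the sketch runs without any global git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q "$demo/awesome-project"
cd "$demo/awesome-project"
git symbolic-ref HEAD refs/heads/main   # make sure the first branch is called main

echo "# Awesome Project" > README.md
git add README.md
git commit -q -s -m "docs: add README"

# One line per commit, with ref decorations, across all branches.
git log --oneline --decorate --graph --all
```

The log output should show a single commit decorated with HEAD -> main.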
To avoid having to remember all the different command line flags, I heavily rely on $SHELL aliases, e.g. gloga which is a shortcut to the command above.
$SHELL and plugin managers, e.g. ohmyzsh, provide useful pre-defined git aliases. For clarity, I will use the full commands in this post though.
Let’s create a couple more commits to make this post a bit more realistic. Ignore the text that is added to each commit with the -m flags for now. I’ll come back to them in the Commits section.
Forks in the Road
By default, you can’t make direct changes to a repository (unless you are an owner or maintainer). That’s why Github established a fork/pull request workflow for contributions.
A fork creates a point-in-time clone of the original ("upstream") repository in your own account. Your changes are always made in that forked repository. Then you can create (open) a pull request against the upstream repository.
The git CLI does not have a concept of forks. But we can mimic the workflow locally with a plain ol' clone. Even though we won’t be able to cover the full origin/fork/local_fork_clone lifecycle with the following examples, they help to keep the complexity at a minimum while focusing on useful patterns.
A nice thing is that git will remember where the clone was created from and automatically configure the clone’s main branch to track its counterpart in the source repository.
We can also quickly verify that the clone is identical to the source. In the git log output, HEAD -> main tells you your current position (commit) in the current repository (well, folder). But also note the additional origin/HEAD references. They show you the position of the main branch in the origin repository.
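Here is a small sketch of that clone-and-verify step, assuming a local awesome-project with a single commit and a clone folder called awesome-fork (a name picked for this example).

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d) && cd "$demo"
git init -q awesome-project
git -C awesome-project symbolic-ref HEAD refs/heads/main
git -C awesome-project commit -q --allow-empty -m "chore: initial commit"

# A plain local clone stands in for the fork.
git clone -q awesome-project awesome-fork
cd awesome-fork

# Both repositories should point at the same commit.
src_sha=$(git -C ../awesome-project rev-parse HEAD)
clone_sha=$(git rev-parse HEAD)
echo "source: $src_sha"
echo "clone:  $clone_sha"

# The clone's main branch tracks the branch it was cloned from.
tracking=$(git rev-parse --abbrev-ref 'main@{upstream}')
echo "main tracks: $tracking"
git log --oneline --decorate --all
```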
And there’s the first catch: in a real-world scenario, more commits might have already been added to the origin repository. Thus, it’s important to keep up with the remote’s changes to avoid issues at a later stage, e.g. merge conflicts due to concurrent changes on the same code path.
HEAD, branches or tags like v1.0.2 are nothing but human-readable references, aka refs, in a git repository pointing to a specific commit SHA.
The topic of synchronizing with a remote will be covered in keeping your fork in sync. But since one can easily get lost with all the different repositories, I’ll cover some important naming patterns first.
The first step I perform after creating a clone is renaming the remote to something more meaningful, e.g. “upstream”. To me, upstream/main reads more naturally. But feel free to pick whatever name keeps you from getting lost.
git clone provides a -o (--origin) flag to set a custom remote name during a clone operation.
In addition, to prevent us from accidentally pushing to the wrong remote, we can configure a dummy URL for git push. With this little trick, git will prevent us from pushing to the upstream.
You might be wondering why this is needed since typically you would not have push permissions to the upstream anyways. Well, over time you might actually become a collaborator or even maintainer of a project. That’s great, but if you’re not careful your commit(s) end up in the wrong repository…especially when you don’t follow an intuitive remote naming strategy.
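The renaming and the push guard could be sketched as follows. The dummy push URL no_push is a placeholder I picked for this example; any value that is not a valid repository should do.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d) && cd "$demo"
git init -q awesome-project
git -C awesome-project symbolic-ref HEAD refs/heads/main
git -C awesome-project commit -q --allow-empty -m "chore: initial commit"

git clone -q awesome-project awesome-fork   # or directly: git clone -o upstream ...
cd awesome-fork
git remote rename origin upstream

# Fetching keeps using the real URL, pushing goes to a dummy one.
git remote set-url --push upstream no_push
git remote -v

# A push to upstream now fails instead of reaching the real repository.
git commit -q --allow-empty -m "test: should never reach upstream"
if git push -q upstream main 2>/dev/null; then
  push_result="pushed"
else
  push_result="blocked"
fi
echo "push to upstream: $push_result"
```

Because only the push URL is changed, git fetch upstream keeps working as before.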
Here’s another tip: If we’d used a real remote repository, e.g. on Github, as an example here, you would have cloned a fork to your local machine. Depending on the platform and tooling, the fork’s remote might show up as origin, while the “real” source repository (“upstream”) might not show up at all when you run git remote show. As usual, there’s an easy fix.
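One possible version of that fix, sketched with local folders only: a bare repository my-fork.git plays the role of the fork on the platform, and local-clone is the clone on your machine. All three names, and the remote name fork, are assumptions for this example.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d) && cd "$demo"
git init -q awesome-project
git -C awesome-project symbolic-ref HEAD refs/heads/main
git -C awesome-project commit -q --allow-empty -m "chore: initial commit"
git clone -q --bare awesome-project my-fork.git   # stands in for the fork on the platform
git clone -q my-fork.git local-clone              # what you would clone to your machine
cd local-clone

# Give the fork an explicit name and add the real source as upstream.
git remote rename origin fork
git remote add upstream "$demo/awesome-project"
git remote set-url --push upstream no_push        # same push guard as before
git fetch -q upstream
git remote -v

remotes=$(git remote | sort | tr '\n' ' ')
```

With a real fork, the URL passed to git remote add would be the clone URL of the source repository.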
For branch names I settled on using Github issue numbers, which has a couple of benefits:
- I’m forced to create an issue upfront which is a good thing anyways
- I can directly tell which branch belongs to which issue(s)
- I can easily clean up merged branches by pattern matching
Once you work on multiple issues/branches in parallel, this approach has saved me several times from accidentally pushing the wrong code. Whatever naming pattern you pick, as long as it’s consistent (can be automated) you’re fine.
Here is an example using the gh CLI to create an issue and respective branch.
git checkout - brings you back to your previous branch.
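A sketch of the branch part of this workflow, using git only so it runs offline. The gh call is left as a comment because it needs an authenticated Github repository, and the issue number 31 is a placeholder for whatever number gh reports.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q "$demo/awesome-fork"
cd "$demo/awesome-fork"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"

# With the Github CLI the issue would be created first, for example:
#   gh issue create --title "Add feature X" --body "Details..."
issue=31

git checkout -q -b "issue-$issue"        # one branch per issue
echo "on: $(git symbolic-ref --short HEAD)"

git checkout -q -                        # back to the previous branch
back_on=$(git symbolic-ref --short HEAD)
echo "back on: $back_on"
```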
Once my branches are merged, I can easily clean them up too.
The -d flag tells git to only delete branches which have been fully merged. Note that this command might not work for you if the repository does not create merge commits for PRs. In that case (or if you really want to get rid of your branch for whatever reason), use the brute-force way git branch -D instead.
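The pattern-based cleanup might look like the sketch below. It assumes branches named issue-<number>, of which only issue-31 has been merged into main with a merge commit.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q "$demo/awesome-fork"
cd "$demo/awesome-fork"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"

for n in 31 32; do
  git checkout -q -b "issue-$n" main
  git commit -q --allow-empty -m "feat: work for issue $n"
done
git checkout -q main
git merge -q --no-ff -m "Merge issue-31" issue-31   # only issue-31 gets merged

# Delete every issue-* branch that is fully merged into main.
git branch --list 'issue-*' --merged main | xargs git branch -d

remaining=$(git branch --list 'issue-*' --format='%(refname:short)')
echo "remaining: $remaining"
```

The unmerged issue-32 survives because -d refuses to delete it; -D would remove it regardless.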
Keeping your Fork in Sync
Over time, especially in very active projects, the upstream repository will be ahead of your fork in terms of commits. Thus, you need to keep up with these upstream changes and synchronize them into your fork.
Let’s create a couple more commits in our example awesome-project to make this more realistic.
Now we need to switch back to the fork and bring it in sync with upstream.
In the above commands, I only fetched the changes: git downloaded the new commits and updated its record of upstream/main, but it did not integrate them into my local branch. You can see this because our current position, HEAD, is still pointing to main at the same commit as before.
My intention here is to train your muscle memory with an alternative to the usual git pull to get the job done.
If you’re wondering about the HEAD thing, think of it as the “you are here” marker on a map to indicate the current position. As you traverse the map (commits and branches), HEAD will move accordingly.
Rebasing is an amazing, but often misunderstood, concept in git. It allows you to move refs and even commits onto another commit, branch, or any other ref.
Concerning our example above, we want to move the current position of our forked main branch (HEAD) to match its upstream/main counterpart. Since we have a linear history without any conflicting changes between the two repositories, getting main in sync with upstream/main is a simple rebase of main onto upstream/main.
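The fetch-then-rebase sync described above can be sketched end to end like this. The two extra upstream commits and the folder names are made up for the example.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d) && cd "$demo"
git init -q awesome-project
git -C awesome-project symbolic-ref HEAD refs/heads/main
git -C awesome-project commit -q --allow-empty -m "chore: initial commit"
git clone -q -o upstream awesome-project awesome-fork

# upstream moves ahead after the clone was made
git -C awesome-project commit -q --allow-empty -m "feat: upstream change 1"
git -C awesome-project commit -q --allow-empty -m "feat: upstream change 2"

cd awesome-fork
before=$(git rev-parse HEAD)
git fetch -q upstream                 # updates upstream/main only
after_fetch=$(git rev-parse HEAD)     # local main has not moved yet
git rebase -q upstream/main           # linear history: main simply moves forward
after_rebase=$(git rev-parse HEAD)

git log --oneline --decorate --all
```

After the rebase, main and upstream/main decorate the same commit in the log output.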
Because you will perform code changes in dedicated branches, i.e. not in main (or the corresponding “primary” branch in a project), merge or rebase conflicts should never arise when syncing main. Due to the linear commit history, git rebase can simply move HEAD to the desired commit.
My recommendation: rebasing instead of git merge should become your default way of synchronizing (integrating) changes from a remote into local branches.
You will see the real strength of git rebase in the pull request scenarios later in this post.
What goes in a Message?
A commit represents an atomic change in the append-only history in git, just like an entry in a log. Thus, it’s important that a well-crafted commit includes a meaningful but brief description of the included changes.
Even though the tooling might not enforce a standard, e.g. character limit, etc., there are certain commit best practices you should know and follow. I usually point newcomers to this great post: How to write a commit message.
Read it? Great, let’s move on…
You might have also wondered about the -m flags and their text in the earlier examples when executing git commit. Let’s take this example:
git commit -s -m "feat: add feature X" -m "Closes: #34"
The -m flag can be used multiple times and replaces the interactive editor which otherwise comes up to create a commit title and message. Each additional -m starts a new paragraph. Even multi-line strings are supported (just keep typing after the opening quote).
Depending on the project you’re contributing to, it might use certain patterns in a commit title or body to craft a CHANGELOG or release notes. For example, a "feat:" prefix is recognized and the commit will be highlighted in a “Feature” section if the project uses this.
If you include a "Closes: #34" in the commit message body and open a pull request, Github will try to automatically link the specified issue (here #34) to the PR so it gets automatically closed after the PR is merged. Just like with the aforementioned prefixes, these keywords might be used in a CHANGELOG or release notes.
The -s flag will add a Signed-off-by footer, which is a good practice to trace a patch back to its author and is often required in projects. Don’t confuse this with digitally signing a commit.
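To see what the example command actually records, here is a sketch that runs it in a scratch repository and prints the resulting message. The file feature.txt and the identity are assumptions for the demo.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q "$demo/awesome-fork"
cd "$demo/awesome-fork"
git symbolic-ref HEAD refs/heads/main

echo "feature X" > feature.txt
git add feature.txt
git commit -q -s -m "feat: add feature X" -m "Closes: #34"

# %s is the title (first -m), %B the full message including the footer.
title=$(git log -1 --format=%s)
echo "title: $title"
git log -1 --format=%B
```

The first -m becomes the title, the second one the body, and -s appends the Signed-off-by line using the committer identity.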
When to commit?
Perhaps the biggest question is when to create a commit and how to break them up into individual chunks.
The latter really depends on the type of work and project guidelines (if any). In my opinion, everything that belongs together (“atomic”) goes into the same commit, including documentation updates. This makes it simpler to revert or cherry-pick changes.
If you work on a larger pull request, likely this can be broken down into sub-tasks which nicely map to individual commits. Alternatively, the PR itself might have to be broken up into multiple PRs with smaller (single) commits.
Creating commits sounds easy, right? Well, often you don’t know upfront the scope of your work and how (when) to break them up. It might also be that the maintainers ask you to do so after the fact (see the following section).
If you’re paranoid like me, you might actually want to commit often and push to a remote (fork) as a cheap backup in case you badly messed up on your local machine…or your disk dies suddenly and the backup (you do backups, don’t you?) is from two days ago 😱
Long story short, in most cases you simply don’t know in advance what the final commit history will be. Your (temporary) commit history might actually look like mine, ehhh this one:
Let me tell you that this is the norm and not bad at all! As usual with git, there are a couple of ways out of this mess. My preferred one is using git reset once my work is ready for a PR.
Here’s how to fix it based on the commit history above in our fork.
The reset command is super convenient to revert committed changes as if they’d never been committed, but without losing the actual changes. They simply end up unstaged in your working directory again. Think of reset as the reverse action of committing.
The reset command also comes in handy if you want to rearrange what goes into a commit. Say you want awesome_script.sh not in the first but in the second commit. Simply (soft) reset your commits to a previous state (commit) as shown above and create individual commits again with the desired contents.
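A sketch of this regrouping, assuming three messy "wip" commits on a branch issue-31 and the default (mixed) reset mode, which leaves the changes unstaged. With --soft they would stay staged instead. File names other than awesome_script.sh are invented.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q "$demo/awesome-fork"
cd "$demo/awesome-fork"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"
base=$(git rev-parse HEAD)

git checkout -q -b issue-31
for f in awesome_script.sh README.md notes.txt; do
  echo "content of $f" > "$f"
  git add "$f"
  git commit -q -m "wip: $f"
done

# Undo the three commits but keep all file changes in the working directory.
git reset -q "$base"
git status --short

# Regroup the changes into the commits we actually want.
git add README.md notes.txt
git commit -q -s -m "docs: add README and notes"
git add awesome_script.sh
git commit -q -s -m "feat: add awesome script"
git log --oneline
```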
git reset has a --hard option which will discard all changes. If this is not what you wanted, git reflog is there to help you get back to anything that had already been committed…
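The following sketch shows such a rescue for a commit that was thrown away with a hard reset. It relies on the reflog entry HEAD@{1}, i.e. where HEAD pointed right before the last move. Uncommitted changes cannot be recovered this way.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q "$demo/awesome-fork"
cd "$demo/awesome-fork"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"

echo "important" > work.txt
git add work.txt
git commit -q -m "feat: important work"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1        # the commit is gone from the branch...
git reflog -3                     # ...but the reflog still remembers it
git reset -q --hard 'HEAD@{1}'    # jump back to where HEAD was before the reset

git log --oneline
```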
Finally, I am going to cover some of the typical challenges you might run into during your first pull request…or in case you keep forgetting this stuff 😄
The following sections are broken up into typical conversations you might experience during a PR review.
“You forgot to sign your commit(s)”
Often you will be greeted by a friendly 🤖 stating that you forgot to sign one or more commits. This could also happen after a rebase/squash (see below), where you did not sign the resulting commit.
Scenario with only one commit
Problem: Commit 96c6889 in branch issue-31 has not been signed off and is part of a PR in the upstream repository.
Solution: Amend the commit and force-push to the fork (so it gets reflected in the associated PR).
Remember: you cannot change existing commits in git. Thus under the hood, amending creates a new commit (a2817f0) with the exact same content/message and replaces the previous one. This rewrites the history and thus would be rejected during a git push to a remote. A force flag tells git to ignore such errors and forcefully overwrite the remote branch.
Anything with --force in the name can be dangerous, as git won’t get in your way to protect you. Here we use the preferred option --force-with-lease, which refuses to overwrite the remote branch if someone else has pushed to it since your last fetch. I highly recommend using it instead of a plain --force.
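Here is how the amend and force-push could play out, sketched with a bare repository my-fork.git standing in for the fork and a remote named fork (both assumptions for this example). The commit SHAs will differ from the ones mentioned above.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q --bare "$demo/my-fork.git"      # stands in for the fork on the platform
git init -q "$demo/local-clone"
cd "$demo/local-clone"
git symbolic-ref HEAD refs/heads/main
git remote add fork "$demo/my-fork.git"
git commit -q --allow-empty -m "chore: initial commit"

git checkout -q -b issue-31
echo "feature X" > feature.txt
git add feature.txt
git commit -q -m "feat: add feature X"      # sign-off forgotten
git push -q fork issue-31
old=$(git rev-parse HEAD)

# Amending creates a new commit with the same content plus the sign-off.
git commit -q --amend --no-edit -s
new=$(git rev-parse HEAD)

# A plain push is rejected because the fork still holds the old commit.
if git push -q fork issue-31 2>/dev/null; then plain="accepted"; else plain="rejected"; fi
echo "plain push: $plain"

git push -q --force-with-lease fork issue-31
```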
Scenario with multiple commits
Problem: Multiple (or all) commits in a branch are not signed off.
Solution: Use rebase against the parent branch (main) to change all commits in the current branch, with --exec to execute arbitrary commands, such as signing off by amending.
Add the -i option to the rebase command above and an interactive dialog will open where you can skip certain commits (and do other fancy rebase stuff).
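A sketch of the --exec approach, assuming a branch issue-31 with three unsigned commits on top of main. The file names and messages are invented, and the exact command line in your project may differ.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q "$demo/awesome-fork"
cd "$demo/awesome-fork"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"

git checkout -q -b issue-31
for n in 1 2 3; do
  echo "change $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "feat: change $n"        # none of these is signed off
done

# Replay every commit since main and amend each one with a sign-off.
git rebase -q --exec 'git commit -q --amend --no-edit -s' main

signed=$(git log --format=%B main..HEAD | grep -c '^Signed-off-by:')
echo "signed-off commits: $signed"
```

All three commits get new SHAs, so a branch that was already pushed needs a push with --force-with-lease afterwards.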
“Please squash your Commits”
Problem: Some repositories might have been set up in a way that they can’t (or won’t) squash (“collapse”) multiple commits of a PR into a single one during merging. The contributor must then squash these commits and (force) push again before merging.
Continuing with our signed-off commits from the example above, there are two ways you can achieve this.
With git rebase you can interactively perform a squash operation. In the editor that opens, replace pick with s (short for squash) for all but the first commit, then save and exit the editor.
Upon exit, a new editor window will pop up allowing you to specify the title and message for the resulting (single) commit. Squashing preserves commit details which you can use to craft a proper message. If you don’t want to reuse the previous commit details, use f (short for fixup) instead of s in the step above.
Once you’re done, don’t forget to push to your fork/PR (with --force-with-lease, since the history was rewritten).
vim supports multi-line editing with CTRL-V so you don’t have to replace every single pick by hand.
The second way is git reset, as described in the section on commits. This is my preferred option in this case, as it’s usually the faster approach but your mileage may vary 😉
The only drawback I can see here is that you have to retype your commit title/message, unlike with squash which preserves these details.
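Both ways can be compared side by side in a scratch repository. To keep the sketch non-interactive, the editor steps are scripted via GIT_SEQUENCE_EDITOR and GIT_EDITOR (assuming GNU sed); normally you would edit the todo list by hand. Branch and file names are placeholders.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q "$demo/awesome-fork"
cd "$demo/awesome-fork"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"

git checkout -q -b issue-31
for n in 1 2 3; do
  echo "change $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -s -m "feat: change $n"
done
git branch issue-31-reset        # second copy for the reset variant

# Way 1: interactive rebase. The scripted sequence editor turns every "pick"
# except the first into "squash"; GIT_EDITOR=true accepts the combined message.
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -q -i main

# Way 2: reset to main (--soft keeps everything staged) and commit once.
git checkout -q issue-31-reset
git reset -q --soft main
git commit -q -s -m "feat: add changes 1 to 3"

git log --oneline --decorate --all
```

Both branches end up with a single commit on top of main and identical file contents; only the commit messages differ.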
“Your PR needs a Rebase”
Problem: The repository owners configured the target (“base”) branch to only
allow merges from up-to-date branches. In most cases though, this message will
come up when conflicting changes were introduced to the base branch (e.g.
main) while you were working on your PR.
Solution: The actual fix to this problem is similar to what we did in the
squash scenario above. We continue our example with
issue-31 and execute the following steps within that branch.
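The above history tells you that your HEAD is still originating from main (which is stale, i.e. behind upstream/main). We could first sync the recent changes into main, but this is not required for this operation.
Instead, we move (rebase) HEAD (commit) onto upstream/main so our branch is up to date again.
If anything goes wrong during a rebase, you can always return to the previous state with git rebase --abort.
After we verified that everything worked, we can push again. If HEAD only moved forward, i.e. no commit that was already pushed has been rewritten, a force option is not required. If the rebase replayed commits that are already part of the PR, they get new SHAs and the push needs --force-with-lease, as in the scenarios above.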
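As a sketch of the rebase step itself, the following builds a stale issue-31 branch, lets upstream move ahead with a non-conflicting change, and replays the branch on top of upstream/main. File names and messages are invented; the push is left out on purpose.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d) && cd "$demo"
git init -q awesome-project
git -C awesome-project symbolic-ref HEAD refs/heads/main
echo "# Awesome Project" > awesome-project/README.md
git -C awesome-project add README.md
git -C awesome-project commit -q -m "docs: add README"
git clone -q -o upstream awesome-project awesome-fork

# Work happens on issue-31 while upstream/main moves ahead.
cd awesome-fork
git checkout -q -b issue-31
echo "feature X" > feature.txt
git add feature.txt
git commit -q -s -m "feat: add feature X"
echo "upstream change" > ../awesome-project/CHANGES.md
git -C ../awesome-project add CHANGES.md
git -C ../awesome-project commit -q -m "docs: add CHANGES"

git fetch -q upstream
git rebase -q upstream/main      # replay issue-31 on top of the latest upstream/main
git log --oneline --decorate --graph --all
```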
“Please adjust your commit message”
Problem: You might have forgotten to include some details, like a prefix or issue reference in one or more commit messages.
Scenario with one commit
Solution: Amend your commit.
Scenario with multiple commits
Solution: Perform an interactive rebase. Let’s continue with our previous issue-31 example branch which for this exercise contains two commits that need to be changed.
An editor shows up again where you replace pick with r (short for reword) on the commits whose message you want to adjust.
After you exit this window, a new editor window opens for each selected commit where you can make your changes. Then perform a force-push as in the other scenarios.
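Both scenarios can be sketched in one go. The missing "feat:" prefix is just an example of a message fix, and the editors are scripted with GNU sed so the sketch runs unattended; interactively you would simply type the new message.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q "$demo/awesome-fork"
cd "$demo/awesome-fork"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "chore: initial commit"

git checkout -q -b issue-31
for n in 1 2; do
  echo "change $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -s -m "change $n"           # the prefix is missing
done

# One commit: replace the message of the latest commit only.
git commit -q --amend -s -m "feat: change 2"

# Several commits: mark them as "reword" and let a scripted editor add the
# prefix wherever the title still starts with "change".
GIT_SEQUENCE_EDITOR="sed -i -e 's/^pick/reword/'" \
GIT_EDITOR="sed -i -e '1s/^change/feat: change/'" \
  git rebase -q -i main

subjects=$(git log --format=%s main..HEAD | sort | tr '\n' '|')
git log --oneline main..HEAD
```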
I have seen so many enthusiastic newcomers getting stuck or frustrated in an open source project due to tooling or technical terms, and not infrequently dropping the ball on a contribution as a result. This has nothing to do with how smart you are, nor are these tools and platforms only for “real developers”.
I hope the real-world examples, my very opinionated best practices and the references below help you to easily navigate git and its huge ecosystem, e.g. the Github platform.
I want YOU to shine as a leading example and inspire even more people to contribute their ideas, documentation fixes or other forms of improvements to the world of open source. No matter how “big” your contribution is, every useful commit counts!
git your feet wet 😉
- How to Write a Git Commit Message
- Interactive Git Cheatsheet
- Official Github Guides
- The Pro Git Book (free)
- Mastering Git Tutorials
- The legendary Oh Shit, Git!?!
- On undoing, fixing, or removing commits in git
- Step by step Tutorial Contributing to the VEBA Project