-
Notifications
You must be signed in to change notification settings - Fork 30
Define our git workflow best practices #46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Suggested addition: I have found this diagram to be supremely helpful whenever I've gotten myself into a bind: http://justinhileman.info/article/git-pretty/ |
developing/vcs/README.md
Outdated
|
|
||
| ``` | ||
| git fetch origin | ||
| git rebase origin/master |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A while back we had a big discussion on this flow vs merging and it never resolved. There were good points on both sides and the meeting we took a poll and it was split 50/50. I think for this document it probably makes sense to encourage merging to update your branch, and have rebase be in a separate more advanced doc, since that can lead you down some tricky paths, especially for people new to git.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! I much prefer git merge and am happy to talk about why!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do y'all squash commits when you use merge to pull in changes from master?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not speaking for anyone else but I don't squash pushed commits, when merging or otherwise. In general I don't think rewriting shared history is necessary or helpful, and the risks are significant.
There may be distinct questions here though:
- Is it correct to rebase a branch after code review, prior to merging (and therefore force push after rebasing)?
- Regardless of No.1, is it correct to "squash and merge" when merging to master?
I tend to think the answer to both is 'No' but my conviction about the second one is much weaker than the first. It sounds like you're asking the second question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I believe squashing into meaningful commits, that pass tests and other checks results in a history that's easier to read, understand, and debug. Check out my latest commit, where I add the pros and cons for both, and a list of popular repos that enforce squash and rebase: 4da0ce1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My pattern is to only squash things before I've shared the PR for review. After that, I avoid it for commits that have been shared and only use merge to update my branch. Before my branch is in PR review I rebase with master.
developing/vcs/README.md
Outdated
| Force push the branch back up to GitHub: | ||
|
|
||
| ``` | ||
| git push --force-with-lease origin <branch-name> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the safer option, but you can still trick --force-with-lease into overwriting stuff especially if you are using git fetch vs git pull to update your branch. Here's a helpful post that we can link here for more details https://blog.developer.atlassian.com/force-with-lease/ though I still think this is probably better as part of an advanced usage section.
rpdelaney
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking on rebase / force push lightning rod for now.
developing/vcs/README.md
Outdated
|
|
||
| ### Cloning with SSH vs HTTPS | ||
|
|
||
| Do we have strong opinions on this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK https authentication requires use of a password so ssh is (1) for sure most convenient and (2) I think also more secure? The strength of our opinion would swing on the second point actually being true though.
developing/vcs/README.md
Outdated
|
|
||
| Ways to mitigate cons: | ||
|
|
||
| * Protect the master branch from force pushes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider recommending this be enabled no matter what else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should recommend that the master branch be protected from any pushes. The only way something goes to master is through CI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure. See the section above under GitHub repo settings where I made this recommendation. GitHub labels it as protection from force pushes, but I'm pretty sure the setting Require pull request reviews before merging protects it from any kind of merge as well.
developing/vcs/README.md
Outdated
| * Protect the master branch from force pushes | ||
| * Agree on a process where once a PR is under active review, no force pushes | ||
| until the PR is approved and ready to be merged. | ||
| * Original history can be viewed with `git reflog` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should only promise that the reflog will be available in the local repo. If you clone a new copy or run git gc you may lose refs.
developing/vcs/README.md
Outdated
|
|
||
| I'm trying to avoid confirmation bias, but I haven't found any yet. If you know | ||
| of any, please add them. I found some that don't enforce any specific workflow, | ||
| and most of them say they will squash your commits via the GitHub interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linux uses merge: Linus on Merge vs Rebase, kernel.org contributing docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. The second link doesn't say that they specifically prefer merge. They talk about the benefits of both. On GitHub, they point people to this document, which specifically asks people to either split their work into separate logical patches, or to squash commits using interactive rebase: https://www.kernel.org/doc/html/latest/process/submitting-patches.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is that thing again where we are fighting the battle on the wrong field. The contention isn't really about merging and rebasing. The contention, I think, is force pushing in a way that destroys published history, including (potentially) others' work.
I don't think anybody objects to rebasing work that hasn't been published anywhere. But it seems that rebasing published work requires use of a force push to a repo that multiple people have access to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely don't object to rebasing unpublished work. I use rebase from time to time, but I do think that it's a pretty sharp edged tool to be using as part of our recommended default flow, recommended to junior engineers who have a lot more important thinks to be worrying about than the arcane intricacies of git.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember the context is finding projects that use a merge workflow and eschew rebasing. I think Linux is the ultimate case of that, and should be mentioned if we will have a list.
From the kernel.org docs:
History that has been exposed to the world beyond your private system should usually not be changed...If work is in need of rebasing, that is usually a sign that it is not yet ready to be committed to a public repository.
On my reading, that uses a much more contrite tone but basically paraphrases and condenses a big part of Linus' email. An exception is mentioned but it's strictly confined to when git is being used to distribute work in progress, not for others to base their own work on (since destroying published history is bad).
Here's the clear statement of intent, declaring allegiance to merge:
Many projects require that branches in pull requests be based on the current trunk so that no merge commits appear in the history. The kernel is not such a project; any rebasing of branches to avoid merges will, most likely, lead to trouble.
The submitting patches doc where rebase -i is encouraged seems to be about creating pull requests in a local repository that are likely to be accepted, but not rewriting published history in a repository with many clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm realizing now that we might be talking about different things. There are two different concepts: one is rebasing in order to tidy up a PR before it is merged, and the other is how the PR ends up being merged into master. They are not the same thing. The way I read the second quote above deals with the second concept: how a branch/PR gets incorporated into the trunk (such as the master branch). GitHub provides 3 different ways to do that: merge commit, squash, and rebase.
When the quote above refers to no merge commits appearing in the history, I understand that to be the equivalent of using the rebase button on GitHub, which does not add a merge commit (the squash button as well I believe does not add a merge commit).
The workflow I am proposing is that we always use the merge option on GitHub to merge a PR (which is what I had all along in the GitHub repo settings section), but if the PR has commits with poor messages or that are just fixing linting errors, or that cause tests to fail, those should be cleaned up first.
When I started that list of popular repos, I did not frame it correctly. I was only looking at their policy for preparing PRs, not how the PR is merged. I think that most people agree that it's a good idea to clean up PRs before they get merged, including Linux.
I will create a more specific poll to see what people think about cleaning up PRs and which buttons they prefer on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm realizing now that we might be talking about different things. There are two different concepts: one is rebasing in order to tidy up a PR before it is merged, and the other is how the PR ends up being merged into master.
Yes. As far as I'm concerned this is about push --force[-with-lease].
I think that most people agree that it's a good idea to clean up PRs before they get merged, including Linux.
They do. In fact, they believe in it so much, they require that the PR should be cleaned up before it is even sent! But once it's sent, the history is published and should not be changed:
History that has been exposed to the world beyond your private system should usually not be changed...If work is in need of rebasing, that is usually a sign that it is not yet ready to be committed to a public repository.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I try to follow Charlie Munger's advice about fully understanding the other side's argument, but I don't think I do yet. Here is how I understand what you said. Please let me know where I misunderstood.
In the Linux example:
- You are equating submitting a patch with it being published, regardless of whether or not it is accepted. In other words, if someone submits a patch and the reviewer asks for changes, and if the submitter then creates several new commits locally to address the feedback, they can no longer rebase those commits to tidy them up before resubmitting.
- You are equating submitting a patch to Linux to submitting a PR to a repo in GitHub.
Is that what you are saying?
If so, that's not how I understand it, and I'd like to know where I might be wrong. The way I read the Linux contribution instructions here is that a patch is sent via git send-email, and if it requires changes, another revision of the patch is sent using the version and annotate flags, such as git send-email -v2 --annotate, and the contributor describes what changed between versions 1 and 2.
In the GitHub PR above, the Linux maintainer tells the contributor that it's fine to amend their commit. The 2 patches are documented here:
The second patch doesn't include the original commit. It is an amended version of the first patch.
To me, that is the equivalent of using force push once a PR is approved, since that is the only way of updating the PR in order to do a final cleanup after addressing changes requested during PR review.
While a Linux patch is still being reviewed via email, I don't consider it to be published publicly. Only once it is merged into the "subsystem tree" would I consider it "exposed to the world".
In the world of GitHub, I do consider a PR to be published publicly, and I fully understand the risk of force pushing to a GitHub PR that other folks might be actively working on. What isn't yet clear to me is why folks think it's also risky to force push even if it's only done once at the end of a PR's life, when no one else will be touching that branch anymore.
To me, there isn't much difference between 1) pairing with someone locally and pushing the final PR and 2) pushing to GitHub first to have someone review the code remotely, making changes, and then force pushing to update the final commit message to document any important points that came up during the review.
The end goal should be to have a detailed commit message that also captures any relevant comments that might have been made during review. As I mentioned in Slack, I don't think it's a good idea to rely on GitHub as our canonical source of documentation. If there is relevant information in the comments of a PR, it should be summarized and captured in the final commit message. One shouldn't have to visit GitHub and try to read through all the comments, some or most of which might not be valuable, in order to get the big picture. I realize that the PR's main body can be updated on GitHub to summarize everything, but I don't think one should have to visit a website to understand a commit. If I want to look up a commit locally, I want to be able to stay in my Terminal or editor and not have to switch to a browser.
If we all agree that it's a good idea to document the PR in the commit message as opposed to on GitHub, and that it's OK to force push when it's necessary to update the final commit message, then I think we have a middle ground:
- those who prefer rebase can use it consistently, both to incorporate changes from the main branch, and to edit commits
- those who prefer merge can use it consistently, both to incorporate changes from the main branch, and to edit the final commit message using
git merge --squash
In both cases, a force push is required at the end, and in both cases, we will use the "Merge" button on GitHub to incorporate the PR into the main branch. How does that sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@monfresh I interpret the PATCH and PATCHv2 as... not like rebase and force push; conceptually, if you add a commit (PATCHv2), you’re not rebasing, you’re pushing a new commit into your branch.
I see what appears to be a consistent misunderstanding of the concerns around rebase: the problem isn’t “is this safe when everything goes right”, it’s “what happens when this doesn’t go right”. This is the fundamental concern with git workflows that require force push anywhere in the flow; at some point, someone’s going to typo a branch and have a bad day.
I’m actually ok with us accepting that risk if we think the trade off of a clean history (which, TBH, I think is over-valued, however, that’s a belief, not a fact). I am not ok with us having two ways to do the basic, every day, 100% needed operation. That seems ridiculous.
So: here’s what I want: do the risk analysis, with a focus on correctness and safety, in a way that (in theory) should convince this merry band of engineers that it’s better to have one way to do this that we train and socialize, even if it’s not the way we currently believe to be best.
|
Here's a sketch of my git workflow:
for whatever that's worth. I do try and make my commits make sense and pass tests, and I don't like to rewrite history once someone else's work is involved. Truly, I think the main difference between the two styles are
|
|
My flow tends to be
I avoid rebasing when others have started reviewing my code or if I'm working on a branch with someone else |
developing/vcs/README.md
Outdated
| ## Repo maintenance and hygiene | ||
|
|
||
| * Avoid including files in source control that are specific to your development machine or process. | ||
| * Ignore files that are specific to your machine or process in your |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? Any time we tell people they need to do something, we need to explain very clearly why this is important.
developing/vcs/README.md
Outdated
| * Avoid including files in source control that are specific to your development machine or process. | ||
| * Ignore files that are specific to your machine or process in your | ||
| `~/.gitignore_global` as opposed to your repo's `.gitignore`. | ||
| * Delete local and remote feature branches after merging. This can be automated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again: why? Also, if this can be automated, then we should either fully automate it, explain why we should do this manual toil, or decide that it doesn’t matter.
|
|
||
| If it's a repo you have write and push access to, clone it. Otherwise, fork it. | ||
|
|
||
| ### Cloning with SSH vs HTTPS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have strong opinions. Use https for read-only repos (ie, where you’re only ever going to clone and pull) and use ssh for everything else. I believe the why is because ssh supports certificates on security keys, and https does not; therefor, we should not use https for anything authenticated. However, if you’re just pulling a public repo, https is way more convenient since you don’t need to authenticate anything (unlike ssh for read-only repos).
developing/vcs/README.md
Outdated
|
|
||
| #### TL;DR | ||
|
|
||
| If you care about keeping a clean commit history, where each commit has a great |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So: do we care? What are the trade offs of caring?
I see advantages; however, if we’re investing a ton of effort to maintain a history that nobody ever looks at, I’m not sure it’s worth the time to maintain. So: are we committed to an engineering practice that uses clean history, teaching everyone how to use git bisect correct, and demonstrating the value of this practice? Right now, I see a lot of belief and not a lot of facts.
developing/vcs/README.md
Outdated
| can cause confusion when reviewing the log | ||
| * Not all commits might pass tests, making debugging hard | ||
| * Cannot use interactive rebase to squash and/or reorder commits | ||
| * Increases risk of losing code when handling conflicts due to working with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure about this one? Aren’t conflicts about the current state of the source and destination branches and don’t involve the history at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was poorly worded by me and will be removed. See discussion above started by MacRae. Also, most of this will be rewritten. It was a first draft to start the discussion.
developing/vcs/README.md
Outdated
| Cons: | ||
|
|
||
| * Riskier since it requires a force push | ||
| * Rewrites history |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
History re-writing never seemed to be a thing that actually mattered in terms of value generated by this process; however, force push is a problem.
developing/vcs/README.md
Outdated
|
|
||
| I'm trying to avoid confirmation bias, but I haven't found any yet. If you know | ||
| of any, please add them. I found some that don't enforce any specific workflow, | ||
| and most of them say they will squash your commits via the GitHub interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@monfresh I interpret the PATCH and PATCHv2 as... not like rebase and force push; conceptually, if you add a commit (PATCHv2), you’re not rebasing, you’re pushing a new commit into your branch.
I see what appears to be a consistent misunderstanding of the concerns around rebase: the problem isn’t “is this safe when everything goes right”, it’s “what happens when this doesn’t go right”. This is the fundamental concern with git workflows that require force push anywhere in the flow; at some point, someone’s going to typo a branch and have a bad day.
I’m actually ok with us accepting that risk if we think the trade off of a clean history (which, TBH, I think is over-valued, however, that’s a belief, not a fact). I am not ok with us having two ways to do the basic, every day, 100% needed operation. That seems ridiculous.
So: here’s what I want: do the risk analysis, with a focus on correctness and safety, in a way that (in theory) should convince this merry band of engineers that it’s better to have one way to do this that we train and socialize, even if it’s not the way we currently believe to be best.
developing/vcs/README.md
Outdated
|
|
||
| Create a local feature branch based off master. Start the branch name with your | ||
| initials. For example: `mb-git-worfklow`. If the branch is related to a Pivotal | ||
| story, add the story's ID to the end of the branch: `mb-git-workflow-166068559` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the ceremony around initials and story ID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the story id in the branch name allows some convenient integration with the tracker; your branch (and any PRs sent from it) will show up in the story automagically.
Initials I think are meant to mark branch ownership... imho git itself is opinionated in that branches are not owned by anybody. But I do this usually since folks seem to like it and not much (anything?) is lost
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also helps to quickly identify who to ping if there are stale branches in GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
developing/vcs/README.md
Outdated
| `--autosquash` flag for interactive rebase, which you should enable globally: | ||
|
|
||
| ``` | ||
| git config --global rebase.autosquash true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again: explain why, and explain (if possible, this is git after all) what this is doing.
|
@rpdelaney This is ready for another review please. We are removing the contentious bits for now until we complete the risk analysis. Latest commit is here: 3a107cd |
ferlatte
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@monfresh I appreciate the reasons added and it looks good to me. I'm curious about the "Alias" vs. "fetch --prune" question for deleting local unused branches, but I'm otherwise happy with this being merged.
rpdelaney
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you
**Why**: To establish conventions, maintain consistency, and to have a single place where folks can get answers. For now, this PR only covers basic repo maintenance and hygiene, GitHub-specific settings, and commit message format. Originally, we also intended to settle on a full git workflow, but we have not reached a consensus yet. We will continue discussing and researching the contentious bits, and propose a single solution based on risk analysis and what other folks in the industry do.
3a107cd to
1ac529c
Compare

Why: To establish conventions, maintain consistency, and to have a
single place where folks can get answers.
This PR covers basic repo maintenance and hygiene, GitHub-specific
settings, git workflow, branch naming, and commit message format.