Contributors to the curl project on GitHub tend to notice the above sequence quite quickly: pull requests submitted do not generally appear as “merged” with its accompanying purple blob, instead they are said to be “closed”. This has been happening since 2015 and is probably not going to change anytime soon.
Let me explain why this happens.
I blame GitHub
GitHub’s UI does not allow us to review or comment on commit messages for pull requests. Therefore, it is hard to insist on contributors to provide the correct message, using the proper language in the correct format.
If you make a pull request based on a single commit, the initial PR message is based on the commit message but when follow-up fixes are done and perhaps force-pushed, the PR message is not updated accordingly with the commit message’s updates.
Commit messages with style
I believe having good commit messages following a fixed style and syntax helps the project. It makes the git history better and easier to browse. It allows us to write tools and scripts around git and the git history. Like how we for example generate release notes and project stat graphs based on git log basically.
We also like and use a strictly linear history in curl, meaning that all commits are rebased on the master branch. Lots of the scripting mentioned above depends on this fact.
Manual merges
In order to make sure the commit message is correct, and in fact that the entire commit looks correct, we merge pull requests manually. That means that we pull down the pull request into a local git repository, clean up the commit message to adhere to project standards.
And then we push the commit to git. One or more of the commit messages in such a push then typically contains lines like:
Fixes #[number]
and Closes #[number]
. Those are instructions to GitHub and we use them like this:
Fixes means that this commit fixed an issue that was reported in the GitHub issue with that id. When we push a commit with that instruction, GitHub closes that issue.
Closes means that we merged a pull request with this id. (GitHub has no way for us to tell it that we merged the pull request.) This instruction makes GitHub closes the corresponding pull request: “[committer] closed this in [commit hash]”.
We do not let GitHub dictate how we do git. We use git and expect GitHub to reflect our git activity.
We COULD but we won’t
We could in theory fix and cleanup the commits locally and manually exactly the way we do now and then force-push them to the remote branch and then use the merge button on the GitHub site and then they would appear as “merged”.
That is however a clunky, annoying and time-consuming extra-step that not only requires that we (always) push code to other people’s branches, it also triggers a whole new round of CI jobs. This, only to get a purple blob instead of a red one. Not worth it.
If GitHub would allow it, I would disable the merge button in the GitHub PR UI for curl since it basically cannot be used correctly in the project.
Squashing all the commits in the PR is also not something we want since in many cases the changes should be kept as more than one commit and they need their own dedicated and correct commit message.
What GitHub could do
GitHub could offer a Merged keyword in the exact same style as Fixed and Closes, that just tells the service that we took care of this PR and merged it as this commit. It’s on me. My responsibility. I merged it. It would help users and contributors to better understand that their closed PR was in fact merged as that commit.
It would also have saved me from having to write this blog post.
Discussion
Addendum
In some post-publish discussions I have seen people ask about credits. This method to merge commits does not break or change how the authors are credited for their work. The commit authors remain the commit authors, and the one doing the commits (which is I when I do them) is stored separately. Like git always do. Doing the pushes manually this way does in no way change this model. GitHub will even count the commits correctly for the committer – assuming they use an email address their GitHub account does (I think).
Codeberg has an extra function for manual merging. You can then enter the commit ID of the merge there.
What I always find unfortunate when editing is that the GPG/SSH signature of the commit is removed (and particularly annoying if the maintainer who then merges it does not even sign it with his key).
On the other hand, manual merging has the advantage that you can even sign the merge commit. And I think it’s nice that you make the style yourself and thus help contributors.
And thanks for the explanation!
Forgejo and Gitea, which codeberg runs on 🙂
Small note: In the workflow to force push the merge result, I believe you can replace clicking the merge button with pushing the merged commit to the main branch – and github will automatically detect that and make the PR purple, if I recall correctly. This doesn’t help with the extra CI job, but at least you don’t need to click a button.
@tbodt: ah yes, I forget about that feature. We actually sometimes get that show up, mostly by accident.
> That means that we pull down the pull request into a local git repository, clean up the commit message to adhere to project standards.
>
> And then we push the commit to git.
Sounds like the “squash and merge” button/option, which allows you to edit the commit message, would accomplish the same thing (except without leaving the interface) with less steps/friction. I would love to learn why this option does not work for this project; perhaps I’m missing that from the article – do you mind expanding why?
@Chris: two reasons:
I dread editing a commit message and stay within 72 columns in a web form. Especially if it is a large one.
But perhaps more: a PR can certainly end up more than a single commit, so just assuming that all commits would always be squashed is not the right thing. If there are separate changes they should remain separate commits, each with its own correct commit message.
You may recall a long time ago when we were working on the curl push guidelines a big problem with those buttons was GitHub would take committer attribution rather than the actual person committing it. Then we couldn’t track who committed what. I’m not sure if they still do it that way. Github messing with committer/author attribution is a no go.
I don’t think github is going to please everyone here but if there was a way for us to extend or automate one of their buttons with a script we could come in and do a transformation or something. Like if we had a custom button that could run our script server side to automatically do the attribution, squash, show a preview etc
@Ray: I believe that attribution thing was a flaw that no longer occurs. At least I have not seen it recent in other repositories where I have used the merged button.
You made a typo.
«[…] typically contains likes like» -> «[…] typically contains lines like».
@Alan: thanks, fixed!
If you use GitHub’s API, you can specify a custom commit message on the merge commit https://docs.github.com/en/rest/pulls/pulls?apiVersion=2022-11-28#merge-a-pull-request. If you are already using a custom CI flow to merge the commit, it’s not the worst to replace it with this API call. But it’s still limited and won’t work if you do the merge manually
I’m reminded of a saying my grandmother taught me, over 50 years ago.
A foolish consistency is the hobgoblin of little minds.
@Smit: that’s just a rude way of saying you disagree with the way. That’s fine. Not everyone needs to agree.
For something that’s supposed to make collaboration faster and easier, time and time again I’ve seen a lot of developers and projects fighting to get GitHub, and git in general, to do what they want.
So much time and effort is spent on figuring out workarounds, then actively using those workarounds day-to-day, then trying to get contributors to understand and follow them, then sorting out problems when those workarounds end up failing in some way, then doing things manually, and so forth.
It all makes self-hosting a VCS server, CI infrastructure, bug tracker, mailing lists, and so on look easier, cheaper, and more enjoyable for all involved.
@Dominik: I think I answered most of those questions back in my what if GitHub is the devil post from 2021.
Why isn’t “squash and merge” a de facto standard in the industry, I don’t know.
Each PR should represent a single change. As such, is reversible via reverts. Furthermore, not doing more than one change at a time.
Sometimes though you want a stack of commits, each doing one thing and building on the last
To squash that loses important context and modularity
Daniel explained it in another comment: https://daniel.haxx.se/blog/2024/06/11/why-curl-closes-prs-on-github/#comment-27011
> a PR can certainly end up more than a single commit, so just assuming that all commits would always be squashed is not the right thing. If there are separate changes they should remain separate commits, each with its own correct commit message.
“We do not let GitHub dictate how we do git. We use git and expect GitHub to reflect our git activity” – 100% this! A tool is for people, not people for a tool.
I would really like a Merged keyword or simply an API to associate PRs with commits, simply because I prefer rebasing locally. I use git from the command-line. I don’t want the github website/servers to perform git actions on my repos, it just needs to be a frontend to my repos. It also lets me sign the commits, which has to be done locally. Fortunately the “Allow edits by maintainers” works decently for me but sometimes you have to ask the contributor to enable it.
Stumbled over this:
“the commit messageS are correct” or “the commit message IS correct”
@Andreas: thanks, fixed!
We’re not accepting PRs on haproxy for exactly the same reasons you explained. Going back-and-forth with the contributor to ask for changing this and that takes ages and is not respectful of anyone’s time, it’s totally inefficient and irritating for both. It’s much simpler to take the patch(es), edit them locally and apply them, particularly when the required changes are just cosmetic and literally take seconds vs explaining what to change.
For this we have a bot that automatically sends a mail to our mailing-list with instructions about that PR (how to get the patch etc) and closes the PR. It’s not very friendly but on the other hand we receive extremely few PRs like this, they mostly come from those who didn’t read our contributing guidelines which explicitly ask not to send PRs.
Github is first and foremost a social network before being a code exchange platform, so they want you to adapt your workflow to match the way they can collect various stats to present on each user’s profile. We try as much to respect that for contributors who care but PRs are definitely a no-go, as there are always things to adjust, otherwise you have to agree to degrade your standards just to satisfy these statistics.
Something I would really like to see would be Github counting mentions of their users in commit messages, because those reporting bugs various ways that lead to a quick fix provide as much value (and sometimes even more) than those sending patches. We try to credit all of our reporters in commit messages, but this is not taken into account by the Github machinery.
At least the vast majority of our contributors don’t care so we’re probably on a fine balance, but sometimes for newcomers it would be nicer if they could see their mentions accounted just like PRs etc in their activity.