long term curl versions

In the curl project we ship new releases based on the master branch of our git repository, in a clean and linear commit history. We have never maintained an old branch for long term or stability etc. Instead we promise to not break user behavior nor the ABI or API. All users should be able to always upgrade to the latest.

A never-ending stream of releases you can always upgrade to; a new one every 8th week.

We build infrastructure you can lean on.

But

Sometimes reality does not match our intentions and we ship regressions.

Sometimes users are too scared that there might be a regression so they refrain from upgrading. Risk-averse is probably how they view themselves.

Sometimes users, organizations and Linux distros have policies that say they do not upgrade versions. These are usually based on how software in general works: there needs to be a single fixed policy for managing software versions, and then curl gets treated the same way.

For those situations and other related scenarios, repeating the top paragraphs does not help.

long term branches by others

In practice, just about every major Linux distributor maintains one or more stable curl branches. They backport security fixes to those versions to keep them secure for their users. Some vendors also merge selected bugfixes into their branches.

Every Linux distributor picks the particular curl version they stick to by themselves without coordination with other distros. They all do it at different times and they all have their own specific criteria and work processes for doing this. This, in combination with curl’s frequent releases, tends to make them all pick different versions for their different branches. And keep them alive for different lengths of time.

Some vendors maintain their stable branches for extended periods of time. Ten years and beyond happens.

These long-lived branches may eventually end up having literally hundreds of patches applied to them. The curl builds done from these branches still report as version x.y.z but in reality they are mutated versions that can be significantly different compared to the original x.y.z version that the curl project shipped in a tarball back in the day.

That’s the comfort you get for picking (and paying for?) a Linux distribution. (Yes, you also easily get stuck with an ancient version because of this.)

Some users also simply get stuck on older versions for other reasons and do not security-patch them (through ignorance or incompetence), making them more and more insecure over time.

Reality

At the time I write this post, the curl release with the largest number of known security vulnerabilities has 85 published CVEs.

By asking users and by looking at logs on various servers, we know that just about every curl version we have shipped in the last dozen years or so remains in use somewhere. We can only hope that most of them are security patched.

In reality, every release we do becomes a long term release for someone.

Long term support?

Every once in a while a discussion pops up in or close to the curl project about whether we should consider starting to maintain one or more LTS branches.

We have never completely dismissed those ideas. We are however acutely aware of the extra effort and energy such an endeavor requires, so we have so far shrugged it off. But should there come users and sponsors willing to help make it happen, we would not be shy of implementing something.

After all, the ones most interested in LTS branches are usually people and companies with an economic gain to be had; with businesses using and relying on a rock solid curl. They should then also be able to help pay for this.

If you or the company you work for would be interested in something like this, please reach out and we can get the conversation going. Maybe we can do something to improve the lives of people out there?

Until then, we stick to a single release branch.

Credits

Image by Julius Silver from Pixabay

Inside 22,734 Steam games

About a year ago I blogged about games that use curl. In that post I listed a bunch of well-known titles I knew use curl and there was a list of 136 additional games giving credit to curl.

Kind of amazing that over one hundred games decided to use curl!

At the time, lots of people told me that number was probably way low and while I kind of had that feeling as well it was just a feeling and nothing else. We cannot be absolutely certain unless there is data or evidence to actually back it up.

The speculation could stop this week when someone provided me with a link to a database of Steam titles (Steam, as in the video game service). SteamDB is a third-party site that among other things extracts data and figures out which “SDKs” are used by Steam games: see their list of game titles on Steam using curl.

Since that list is capped at 10,000 titles, I had to filter it and add up the number of titles based on release year. Out of the 91,559 titles they currently list in their database, 22,734 are identified to be using curl: 24.8%.

Not too shabby for a hobby.

Discussion

Hacker news

curl user survey 2024 analysis

As tradition dictates, I have spent many hours walking through the responses to the curl user survey of the year. I have sorted tables, rendered updated graphs and tried to wrap my head around what all these numbers might mean and what conclusions and lessons we should draw.

I present the results, the collected answers to the survey, mostly raw without a lot of analysis or decisions. This, to allow everyone who takes the time to read through to form their own opinion and thoughts. It also gives me more time to glance over the numbers many more times before I make up my mind about possible outcomes.

The 2024 user survey analysis document

If you find any mistakes or omissions in this document, let me know and I might fix them and publish a corrected version.

63 pages and 14,800 words. Enjoy!

Why curl closes PRs on GitHub

Contributors to the curl project on GitHub tend to notice the above sequence quite quickly: pull requests submitted do not generally appear as “merged” with the accompanying purple blob; instead they are said to be “closed”. This has been happening since 2015 and is probably not going to change anytime soon.

Let me explain why this happens.

I blame GitHub

GitHub’s UI does not allow us to review or comment on commit messages for pull requests. Therefore, it is hard to insist that contributors provide the correct message, using the proper language in the correct format.

If you make a pull request based on a single commit, the initial PR message is based on the commit message but when follow-up fixes are done and perhaps force-pushed, the PR message is not updated accordingly with the commit message’s updates.

Commit messages with style

I believe having good commit messages following a fixed style and syntax helps the project. It makes the git history better and easier to browse. It allows us to write tools and scripts around git and the git history. Like how we for example generate release notes and project stat graphs based on git log basically.

We also like and use a strictly linear history in curl, meaning that all commits are rebased on the master branch. Lots of the scripting mentioned above depends on this fact.
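As a rough illustration of why a strictly linear history is convenient to script against, here is a minimal sketch that checks a history for merge commits. It assumes the input is text in the shape produced by `git log --format='%H %P'` (a commit hash followed by its parent hashes). The function names and the shortened sample hashes are made up for this example and are not part of curl's own tooling.

```python
def find_merge_commits(log_text):
    """Return hashes of commits that have more than one parent.

    Expects one commit per line: the commit hash followed by its
    parent hashes, separated by spaces.
    """
    merges = []
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        commit, parents = fields[0], fields[1:]
        if len(parents) > 1:
            merges.append(commit)
    return merges


def is_linear(log_text):
    """A history is linear when no commit has more than one parent."""
    return not find_merge_commits(log_text)


# Hypothetical, shortened hashes for illustration only.
sample_log = """\
c3 b2
b2 a1
a1
"""
print(is_linear(sample_log))
```

In a linear history every commit has at most one parent, so a script can walk the log from top to bottom without having to reason about branches joining.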

Manual merges

In order to make sure the commit message is correct, and in fact that the entire commit looks correct, we merge pull requests manually. That means that we pull down the pull request into a local git repository and clean up the commit message to adhere to project standards.

And then we push the commit to git. One or more of the commit messages in such a push then typically contains lines like:

Fixes #[number] and Closes #[number]. Those are instructions to GitHub and we use them like this:

Fixes means that this commit fixed an issue that was reported in the GitHub issue with that id. When we push a commit with that instruction, GitHub closes that issue.

Closes means that we merged a pull request with this id. (GitHub has no way for us to tell it that we merged the pull request.) This instruction makes GitHub close the corresponding pull request: “[committer] closed this in [commit hash]”.
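Because these keywords follow a fixed syntax, they are also easy for scripts to pick up. The following is a minimal sketch of that idea, not the project's actual tooling: it assumes each keyword sits alone on its own line in the exact form `Fixes #N` or `Closes #N`. The function name, the sample commit message and the numbers in it are all made up for illustration.

```python
import re

# Matches lines such as "Fixes #1234" or "Closes #5678" on their own line.
KEYWORD_RE = re.compile(r"^(Fixes|Closes) #(\d+)\s*$", re.MULTILINE)


def extract_references(commit_message):
    """Return the issue and pull request numbers a commit message names."""
    refs = {"Fixes": [], "Closes": []}
    for keyword, number in KEYWORD_RE.findall(commit_message):
        refs[keyword].append(int(number))
    return refs


# Made-up commit message and numbers, for illustration only.
message = """\
example: fix a made-up problem

Fixes #1001
Closes #1002
"""
print(extract_references(message))
```

A script like this could, for example, collect the closed pull request numbers for a range of commits when putting together release notes.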

We do not let GitHub dictate how we do git. We use git and expect GitHub to reflect our git activity.

We COULD but we won’t

We could in theory fix and clean up the commits locally and manually exactly the way we do now and then force-push them to the remote branch and then use the merge button on the GitHub site and then they would appear as “merged”.

That is however a clunky, annoying and time-consuming extra step that not only requires that we (always) push code to other people’s branches, it also triggers a whole new round of CI jobs. This, only to get a purple blob instead of a red one. Not worth it.

If GitHub would allow it, I would disable the merge button in the GitHub PR UI for curl since it basically cannot be used correctly in the project.

Squashing all the commits in the PR is also not something we want since in many cases the changes should be kept as more than one commit and they need their own dedicated and correct commit message.

What GitHub could do

GitHub could offer a Merged keyword in the exact same style as Fixes and Closes, that just tells the service that we took care of this PR and merged it as this commit. It’s on me. My responsibility. I merged it. It would help users and contributors to better understand that their closed PR was in fact merged as that commit.

It would also have saved me from having to write this blog post.

Discussion

Hacker news

Addendum

In some post-publish discussions I have seen people ask about credits. This method to merge commits does not break or change how the authors are credited for their work. The commit authors remain the commit authors, and the one doing the commits (which is I when I do them) is stored separately. Like git always does. Doing the pushes manually this way does in no way change this model. GitHub will even count the commits correctly for the committer – assuming they use an email address associated with their GitHub account (I think).

HTTP/3 in curl mid 2024

Time for another checkup. Where are we right now with HTTP/3 support in curl for users?

I think curl’s situation is symptomatic of a lot of other HTTP tools and libraries. HTTP/3 has been and continues to be a much tougher deployment journey than HTTP/2 was.

curl supports four alternative HTTP/3 solutions

You can enable HTTP/3 for curl using one of these four different approaches. We provide multiple different ones to let “the market” decide and to allow different solutions to “compete” with each other so that users eventually can get the best one. The one they prefer. That saves us from the hard problem of trying to pick a winner early in the race.

More details about the four different approaches follow below.

Why is curl not using HTTP/3 already?

It already does if you build it yourself with the right set of third party libraries. Also, the curl for Windows binaries provided by the curl project support HTTP/3.

For Linux and other distributions and operating system packagers, a big challenge remains that the most widely used TLS library (OpenSSL) does not offer the widely accepted QUIC API that most other TLS libraries provide. (Remember that HTTP/3 uses QUIC which uses TLS 1.3 internally.) This lack of API prevents existing QUIC libraries from working with OpenSSL as their TLS solution, forcing everyone who wants to use a QUIC library to use another TLS library – because curl does not easily allow itself to get built using multiple TLS libraries. Using a different TLS library for QUIC than for other TLS-based protocols is not supported.
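To find out whether a particular curl build has HTTP/3 enabled, you can look for `HTTP3` in the `Features:` line printed by `curl --version`. Here is a minimal sketch of that check. The helper names are made up for this example, and the sample text is a shortened, made-up stand-in for real output.

```python
import subprocess


def supports_http3(version_output):
    """Return True if the 'Features:' line lists HTTP3."""
    for line in version_output.splitlines():
        if line.startswith("Features:"):
            features = line[len("Features:"):].split()
            return "HTTP3" in features
    return False


def installed_curl_supports_http3():
    """Run the curl found in PATH and inspect its feature list."""
    result = subprocess.run(
        ["curl", "--version"], capture_output=True, text=True, check=True
    )
    return supports_http3(result.stdout)


# Shortened, made-up stand-in for `curl --version` output.
sample = """\
curl x.y.z (example-platform) libcurl/x.y.z
Protocols: http https
Features: alt-svc HTTP2 HTTP3 SSL
"""
print(supports_http3(sample))
```

Only `supports_http3()` is exercised here. `installed_curl_supports_http3()` needs a curl binary in your PATH.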

Debian tries an experiment to enable HTTP/3 in their shipped version of curl by switching to GnuTLS (and building with ngtcp2 + nghttp3).

HTTP/3 backends

To get curl to speak HTTP/3 there are three different components that need to be provided, apart from the adjustments in the curl code itself:

  • TLS 1.3 support for QUIC
  • A QUIC protocol library
  • An HTTP/3 protocol library

Illustrated

Below, you can see the four different HTTP/3 solutions supported by curl in different columns. All except the right-most solution are considered experimental.

From left to right:

  1. the quiche library does both QUIC and HTTP/3 and it works with BoringSSL for TLS
  2. msh3 is an HTTP/3 library that uses msquic for QUIC and either a fork family or Schannel for TLS
  3. nghttp3 is an HTTP/3 library that in this setup uses OpenSSL's QUIC stack, which does both QUIC and TLS
  4. nghttp3 for HTTP/3 using ngtcp2 for QUIC can use a range of different TLS libraries: fork family, GnuTLS and wolfSSL. (picotls is supported too, but curl itself does not support picotls for other TLS use)
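The four columns can also be written down as a small lookup table, which makes it easy to ask which combinations work with a given TLS library. This is only a sketch of the list above. The dictionary layout and the function name are assumptions made for this example, "fork family" is kept as the text uses it, and picotls is left out because curl does not support it for other TLS use.

```python
# Each entry mirrors one column of the description above.
HTTP3_BACKENDS = [
    {"name": "quiche", "http3": "quiche", "quic": "quiche",
     "tls": ["BoringSSL"], "experimental": True},
    {"name": "msh3", "http3": "msh3", "quic": "msquic",
     "tls": ["fork family", "Schannel"], "experimental": True},
    {"name": "OpenSSL QUIC", "http3": "nghttp3", "quic": "OpenSSL",
     "tls": ["OpenSSL"], "experimental": True},
    {"name": "ngtcp2", "http3": "nghttp3", "quic": "ngtcp2",
     "tls": ["fork family", "GnuTLS", "wolfSSL"], "experimental": False},
]


def backends_for_tls(tls_library, include_experimental=True):
    """Return the names of the backends usable with a TLS library."""
    return [
        backend["name"]
        for backend in HTTP3_BACKENDS
        if tls_library in backend["tls"]
        and (include_experimental or not backend["experimental"])
    ]


print(backends_for_tls("GnuTLS"))
print(backends_for_tls("OpenSSL", include_experimental=False))
```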

ngtcp2 is ahead

ngtcp2 + nghttp3 was the first QUIC and HTTP/3 combination that shipped non-beta versions that work solidly with curl, and that is the primary reason it is the solution we recommend.

The flexibility in TLS solutions in that vertical is also attractive as this allows users a wide range of different libraries to select from. Unfortunately, OpenSSL has decided to not participate in that game so this setup needs another TLS library.

OpenSSL QUIC

OpenSSL 3.2 introduced a QUIC stack implementation that is not “beta”, making it the second such solution curl can use. In OpenSSL 3.3 they improved it further. Since early 2024 curl can get built to use this library for HTTP/3 as explained above.

However, the API OpenSSL provides for doing transfers is lacking. It lacks vital functionality that makes it inefficient and basically forces curl to sometimes busy-loop to figure out what to do next. This fact, and perhaps additional problems, make the OpenSSL QUIC implementation significantly slower than the competition. Another reason to advise users to maybe use another solution.

We keep communicating with the OpenSSL team about what we think needs to happen and what they need to provide in their API so that we can do QUIC efficiently. We hope they will improve their API going forward.

Stefan Eissing produced nice comparison graphs that I have borrowed from his Performance presentation from curl up 2024. (Stefan also blogged about h3 performance in curl earlier.) They compare three HTTP/3 curl backends against each other. (msh3 is not included because it does not work well enough in curl.)

As you can see below, in several test setups OpenSSL is only achieving roughly half the performance of the other backends in both requests per second and raw transfer speed. This is on a localhost, so basically CPU bound transfers.

I believe OpenSSL needs to work on their QUIC performance in addition to providing an improved API.

quiche and msh3

quiche is still labeled beta and is only using BoringSSL which makes it harder to use in a lot of situations.

msh3 does not work at all right now in curl after a refactor a while ago.

HTTP/3 is a CPU hog

This is not news to anyone following protocol development. I have been repeating this over and over in every HTTP/3 presentation I have done – and I have done a few by now, but I think it is worth repeating and I also think Stefan’s graphs for this show the situation in a crystal clear way.

HTTP/3 is slow in terms of transfer performance when you are CPU bound. In most cases of course, users are not CPU bound because typically networks are the bottlenecks and instead the limited bandwidth to the remote site is what limits the speed on a particular transfer.

HTTP/3 is typically faster at completing a handshake, thanks to QUIC, so an HTTP/3 transfer can often get the first byte transmitted sooner than any other HTTP version (over TLS) can.
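A back-of-the-envelope way to see the handshake difference is to count round trips. With TCP and TLS 1.3, a fresh connection needs one round trip for the TCP handshake and one for the TLS handshake before the request can be sent, while QUIC combines the transport and TLS 1.3 handshakes into a single round trip. The sketch below uses an assumed round-trip time of 50 ms and ignores DNS, 0-RTT resumption, packet loss and server processing time, so the numbers are illustrative rather than measurements.

```python
# Round trips needed before the client can send its first request
# on a fresh connection (no session resumption).
HANDSHAKE_ROUND_TRIPS = {
    "HTTP/2 over TCP + TLS 1.3": 2,  # 1 for TCP, 1 for TLS 1.3
    "HTTP/3 over QUIC": 1,           # transport and TLS 1.3 combined
}


def time_to_first_byte_ms(handshake_round_trips, rtt_ms):
    """Estimate the time until the first response byte arrives.

    One extra round trip covers sending the request and getting the
    first byte of the response back.
    """
    return (handshake_round_trips + 1) * rtt_ms


ASSUMED_RTT_MS = 50  # made-up value for illustration

for name, round_trips in HANDSHAKE_ROUND_TRIPS.items():
    estimate = time_to_first_byte_ms(round_trips, ASSUMED_RTT_MS)
    print(f"{name}: about {estimate} ms")
```

The absolute saving grows with the round-trip time, which is why the difference is most visible against distant servers.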

To show how this looks with more of Stefan’s pictures, let’s first show the faster handshakes from his machine somewhere in Germany. These tests were using a curl 8.8.0-DEV build, from a while before curl 8.8.0 was released.

Nope, we cannot explain why google.com actually turned out worse with HTTP/3. It can be added that curl.se is hosted by Fastly’s CDN, so this is really comparing curl against three different CDN vendors’ implementations.

Again: these are CPU bound transfers so what this image really shows is the enormous amounts of extra CPU work that is required to push these transfers through. As long as you are not CPU bound, your transfers should of course run at the same speeds as they do with the older HTTP versions.

These comparisons show curl’s treatment of these protocols; they are not generic protocol comparisons (if such are even possible). We cannot rule out that curl might have some issues or weird solutions in the code that could explain part of this. I personally suspect that while we certainly always have areas for improvement remaining, we do not have any significant performance blockers lurking. We cannot be sure though.

OpenSSL-QUIC stands out here as well, in the not so attractive end.

HTTP/3 deployments

w3techs, Mozilla and Cloudflare data all agree that somewhere around 28-30% of the web traffic is HTTP/3 right now. This is a higher rate than HTTP/1.1 for browser traffic.

An interesting detail about this 30% traffic share is that all the big players and CDNs (Google, Facebook, Cloudflare, Akamai, Fastly, Amazon etc) run HTTP/3, and I would guess that they combined normally have a much higher share of all the web traffic than 30%. Meaning that there is a significant amount of browser web traffic that could use HTTP/3 but still does not. Unfortunately I don’t have the means to figure out explanations for this.

HTTPS stack overview

In case you need a reminder, here is how an HTTPS stack works.

bye bye hosting c-ares web

At some point during 2003, my friend Bjørn Reese (from Dancer) and I were discussing back and forth and planning to maybe create our own asynchronous DNS/name resolver library. We felt that the synchronous APIs provided by gethostbyname() and getaddrinfo() were too limiting in, for example, curl. We could really use something that would not block the caller.

While thinking about this and researching what was already out there, I found the ares library written by Greg Hudson. It was an effort that was almost exactly what we had been looking for. I decided I would not make a new library but rather join the ares project and help polish that further to perfect it – for curl and for whoever else wants such functionality.

It was soon made clear to me that the original author of this library did not want the patches I deemed were necessary, including changes to make it more portable to Windows and beyond. I felt I had no choice but to fork the project and instead I created c-ares. It would show its roots but not be the same. The c could be for curl, but it also made it into an English word like “cares” which was enough for me.

The first c-ares release I did was called version 1.0.0, published in February 2004.

The ares project did not have a website, but I am of the opinion that a proper open source project needs one, to provide downloads and not least its documentation etc. A home. I created a basic c-ares website and since then I have hosted it on my server on behalf of the c-ares project.

The site was available as c-ares.haxx.se for many years but was recently moved over to c-ares.org. It has never been a traffic magnet so quite easy to manage.

In the backseat

In recent years, I have not participated much in the c-ares development. I have had my hands full with curl while the c-ares project has been in a pretty good shape and has been cared for in an excellent manner by Brad House and others.

I have mostly just done the occasional website admin stuff and releases.

Transition

Starting now, the c-ares website is no longer hosted by me. A twenty-year streak is over and the website is now instead hosted on GitHub. I own the domain name and I run the DNS for it, but that is all.

The plan is that Brad is also going to take over the release duty. Brad is awesome.