Tag Archives: cURL and libcurl

dropping hyper

The ride is coming to an end. The experiment is done. We tried, but we admit defeat.

Four years ago we started adding support for an alternative HTTP backend in curl. It would use a library written in rust, called hyper. The idea was to introduce an alternative implementation of HTTP internals that you could make curl/libcurl use instead of the native implementation.

This new backend that used a library written in rust would enable users to run a product where a larger piece of the total code than otherwise would be written in a memory-safe language: rust. Memory-safety being all the rage these days.

The initial work was generously sponsored by ISRG, the organization behind such excellent efforts such as Let’s Encrypt, which believes strongly in this concept. I cooperated intensely with Sean McArthur, the lead developer of hyper. We made it work.

We have shipped hyper support in curl labeled EXPERIMENTAL for several years by now, hoping to attract attention and trigger the experimental spirit in users out there. Seeing so many people seem to want more memory-safety, surely the users would come?

95% of the work is the easy part

I mean that we took it perhaps 95% of the way and almost the entire test suite ran identically independently of which backend we built curl to use. The final few percent would however turn out to be friction enough to now eventually make us admit defeat, give up and instead yank it all out again.

There simply were no users asking for it and there were almost no developers interested or knowledgeable enough to work on it. libcurl is written in C, hyper is written in rust and there is a C binding glue layer in between. It takes someone who is interested and good at both languages to dig in, understand the architectures, the challenges and the protocols to drive this all the way through.

But with no user demand, why do it?

It seems quite clear that rust users use hyper but few of them want to work on making it work for a C project like curl, and among existing curl users there is virtually no interest in hyper. The overlap in the Venn diagram of the two universes is not big enough.

With no expectation of seeing this work completed in the short to medium length term, the cost of keeping the hyper code is simply deemed too high. We gain code agility and reduce complexity by trimming this off.

We improved

While the experiment itself is deemed a failure, I think we learned from it and improved curl in the process. We had to rethink and reassess several implementation details when we aligned HTTP behavior with hyper. libcurl parses and handles HTTP stricter now. Better.

I also believe that hyper benefited from this journey and gained experiences and input from us that led to improvements in their end and in their HTTP library. Which then by extension have benefited the hyper users.

When we started this, even rust itself was not ready and over this time rust has improved and it is today a better language and it is better prepared to offer something like this. For us or for other projects.

Backends

With this amputation we are back to no separate HTTP/1 backends. We only provide the single native implementation.

I am not against revisiting the topic and the idea of providing alternative backends for HTTP/1 in the future, but I think we should proceed a little different next time. We also have a better internal architecture now to build on than what we had in 2020 when this attempt started.

Less rust (for now?)

Before this step, we supported three different backends backed up by libraries written in rust. Now we are down to two: rustls (for TLS) and quiche (for QUIC and HTTP/3). Both of them are still marked experimental.

These two backends use better internal APIs in curl and are hooked into libcurl in a cleaner way that makes them easier to support and less of burden to maintain over time.

Of course nothing prevents us from adding support for more and other rust libraries in the future. libcurl is a protocol engine using a plethora of different backends for many different protocols and services hooked into its core. Virtually all of those backends could be provided by a rust library.

Thanks

A big thank you go to Sean and all others who helped us take it as far as we did. You are great. Nothing of this should put any shade on you.

When

The hyper backend code has been removed in git as of December 21. There will be no traces left of it in the curl 8.12.0 release coming in February 2025.

A twenty-five years old curl bug

I have talked about old curl bugs before, but now we have a new curl record.

When we announced the security flaw CVE-2024-11053 on December 11, 2024 together with the release of curl 8.11.1 we fixed a security bug that was introduced in a curl release 9039 days ago. That is close to twenty-five years.

The previous record holder was CVE-2022-35252 at 8729 days.

Now at 161 reported CVEs, the median time a security problem has existed in curl until fixed is 2583 days, a little over seven years.

Age

We know the age of every single curl security problem because every time we have a confirmed one, I spend a significant time and effort digging through the source code history to figure out in which exact commit the problem was introduced.

(This is also how we know that almost every CVE we have ever announced was introduced by my mistakes.)

What’s Wrong?

I don’t think anyone is doing anything wrong here. I think it illustrates the difficulty and challenges involved. There are a lot of people looking at curl code all the time. We run tests and analyzers on the code, all the time. In fact, in November 2024 alone, we had CI jobs running on GitHub alone at 9.17 CPU days per day. Meaning that on average more than nine machines were running curl tests and builds to help us verify that it works as intended.

Apart from that, we of course have all the human individual testers, security researchers and the Google OSS-Fuzz project that is fuzzing curl non-stop and has been doing so for the last 6-7 years.

Security is hard. I mean really really hard.

I have no immediate ideas how to find the next such bug other than the plain old: add more test cases for scenarios and setups not previously tested. That is hard, difficult and quite frankly quite boring work that nobody in particular wants to do nor fund someone else to do.

Enough eyeballs

I think we all agree by now that not all bugs are shallow. Or perhaps we can’t ever truly get enough eyeballs. Or maybe the saying works, just that it needs an addendum

Given enough eyeballs and time, all bugs are shallow

Learn from each mistake

It is often said, and it is true, that you learn from mistakes. The question is only what exactly to learn from each and every reported security vulnerability. Each new one always feels like a unique stupid mistake that was a one-off that surely will not happen again because that situation is now gone and we have no other like that.

Not a C mistake

Let me also touch this subject while talking security problems. This bug, the oldest so far in curl history, was a plain logic error and would not have been avoided had we used another language than C.

Otherwise, about 40% of all security problems in curl can be blamed on us using C instead of a memory-safe language. 50% of the high/critical severity ones.

Almost all of those C mistakes were done before there even existed a viable alternative language – if that even exists now.

Graphs

I decided to not sprinkle graph images in the post this time. You can find data and graphs for all my claims in here in the curl dashboard.

curl 8.11.1

Welcome to another curl release. This time we do a bugfix only release, five weeks since the previous version shipped.

Release Presentation

Numbers

the 263rd release
0 changes
35 days (total: 9,763)

79 bugfixes (total: 11,173)
115 commits (total: 33,811)
0 new public libcurl function (total: 94)
0 new curl_easy_setopt() option (total: 306)

0 new curl command line option (total: 266)
51 contributors, 32 new (total: 3,299)
22 authors, 10 new (total: 1,323)
1 security fixes (total: 161)

Security

CVE-2024-11053: netrc and redirect credential leak. (Severity: Low) When asked to both use a .netrc file for credentials and to follow HTTP redirects, curl could leak the password used for the first host to the followed-to host under certain circumstances.

Bugfixes

As usual, here follows some bugfixes I figure could be worth highlighting. See the changelog on the curl site for the full list of changes.

curl

  • –continue-at is mutually exclusive with –no-clobber
  • –continue-at is mutually exclusive with –range
  • –continue-at is mutually exclusive with –remove-on-error
  • use real time in trace timestamps

scripts

  • dmaketgz: use –no-cache when building docker image

libcurl

  • duphandle: also init netrc
  • hostip: don’t use the resolver for FQDN localhost
  • mime: fix reader stall on small read lengths
  • mprintf: fix integer overflow checks
  • multi: fix callback for CURLMOPT_TIMERFUNCTION not being called again
  • netrc: address several netrc parser flaws
  • netrc: support large file, longer lines, longer tokens
  • socket: handle binding to “host!”

http related

  • http_negotiate: allow for a one byte larger channel binding buffer
  • digest: produce a shorter cnonce in Digest headers
  • cookie: treat cookie name case sensitively
  • nghttp2: use custom memory functions

protocols

  • libssh: use libssh sftp_aio to upload file
  • libssh: when using IPv6 numerical address, add brackets
  • OpenSSL: improved error message on expired certificate
  • rtsp: check EOS in the RTSP receive and return an error code
  • schannel: remove TLS 1.3 ciphersuite-list support
  • fixes for wolfSSL OPENSSL_COEXIST

No need to email me about Cisco AnyConnect

My name and email address can be found in the VPN client application made by Cisco called AnyConnect.

They are present there as part of the curl license, because this product – like thousands of others – uses libcurl. My name appears in many products.

Apparently, people often have problems finding an appropriate address to contact when they have issues with this app. This leads a disproportionate amount of them to send emails to me asking for solutions and fixes to their situations.

So far over the years, close to one hundred different persons have emailed me about problems with Cisco’s AnyConnect. I have not been able to help a single one of them because I know nothing about this application.

The reason my email address is shown there is because I am the lead developer of curl, which is but a small component in this application. I am not associated with Cisco nor this product.

This is the support email address you are looking for:

ac-mobile-feedback@cisco.com

See also: other funny emails I got and curl credit screenshots.

Rock-solid curl

I am thrilled to announce:

Rock-Solid curl: long term supported curl releases

Basics

We make long term support releases of curl that we call Rock-solid curl.

We support each release branch for at least five years.

We only merge security fixes and important stability bugfixes into these branches for updates. No new features. No surprises.

We offer Rock-solid curl downloads to existing support customers. It means that there is no free and open access to these releases. To get access, become a customer!

Rock-solid curl is released under the same license as normal curl (or optionally a commercial license). No funny business.

Rock-solid curl is meant to greatly reduce the risk of regressions and yet be a safe and secure solution with full support. For the companies who want this extra level of attention. An even smoother ride.

We plan to make new Rock-solid curl release branches roughly every 18-24 months.

The first Rock-solid curl release

Rock-solid curl 8.9.2 is the first long-term support curl version. As the version number implies, it is based on the curl 8.9.1 release that we shipped back in July, with two security fixes and a small number of stability patches applied.

Once you have a contract with us, you can get it.

Who is doing this?

I, Daniel Stenberg, will be the primary support person for Rock-solid curl and I will do the releases, and most of the patching and the back-porting of what is deemed necessary.

Customers sign contracts with wolfSSL for this. wolfSSL pays my salary. I have worked for and with wolfSSL with this business setup since 2019.

What about “the normal” curl?

Nothing changes with or happens to the curl project and the regular curl releases because of this. No one is going anywhere. The curl license remains the same. The curl releases and the release cadence remain intact.

Support customers help fund the project by allowing us to pay developers.

How do I become a customer?

Head over to rock-solid.curl.dev and contact us via the provided links.

Downloads and all Rock-solid curl information is hosted on the dedicated rock-solid.curl.dev site, separate from the open source project on curl.se.

On curl

Born in the late 1990s, curl is a client-side Internet transfer engine. Installed in over twenty billion instances it serves virtually everything that is internet connected: phones, tablets, cars, television sets, printers, medical devices, game consoles, helicopters on other planets, etc and it is an embedded component in a significant share of our most used and beloved apps, tools, games and services.

curl is the fruit and outcome from hard work by thousands of volunteers and is completely free and Open Source. The curl project is independent. It is not part of any umbrella organization or foundation and it is not owned nor controlled by any company.

curl is secure, fast and feature-rich. It is a defacto standard and key infrastructure.

curl 8.11.0

Numbers

the 262nd release
5 changes
49 days (total: 9,728)

266 bugfixes (total: 11,094)
435 commits (total: 33,694)
0 new public libcurl function (total: 94)
0 new curl_easy_setopt() option (total: 306)

1 new curl command line option (total: 266)
55 contributors, 22 new (total: 3,268)
25 authors, 10 new (total: 1,312)
1 security fixes (total: 160)

Release presentation

Security

CVE-2024-9681: HSTS subdomain overwrites parent cache entry. When curl is asked to use HSTS, the expiry time for a subdomain might overwrite a parent domain’s cache entry, making it end sooner or later than otherwise intended.

Changes

  • –create-dirs works for –dump-header as well
  • P12 format support added to GnuTLS backend
  • Added options to disable IPFS
  • TLSv1.3 earlydata support (with GnuTLS)
  • Official WebSocket support

Bugfixes

These are some of my favorite bugfixes in this release.

Build

  • cmake: document -D and env build options
  • configure: add support for ‘unity’ builds
  • configure: set linker flags to allow rustls build on macos

curl

  • detect ECH support dynamically, not at build time
  • support –show-headers AND –remote-header-name
  • make –skip-existing work for –parallel

libcurl

  • conncache: find bundle again in case it is removed
  • curl.h: remove the struct pointer for CURL/CURLSH/CURLM typedefs
  • ftp: fix 0-length last write on upload from stdin
  • hsts: support “implied LWS” properly around max-age
  • lib: remove function pointer typecasts for hmac/sha256/md5
  • mprintf: do not ignore length modifiers of %o, %x, %X
  • mprintf: treat %o as unsigned
  • multi: make curl_multi_cleanup invalidate magic latter
  • multi: make multi_handle_timeout use the connect timeout
  • netrc: cache the netrc file in memory
  • select: use poll() if existing, avoid poll() with no sockets
  • url: use same credentials on redirect
  • urlapi: normalize the IPv6 address

protocols

  • ngtcp2: set max window size to 10x of initial (128KB)
  • url: connection reuse on h3 connections
  • gnutls: use session cache for QUIC
  • mbedTLS: fix handling of TLSv1.3 sessions
  • schannel: ignore error on recv beyond close notify
  • schannel: reclassify extra-verbose schannel_recv messages
  • quic: use send/recvmmsg when available
  • quic: use the session cache with wolfSSL as well

tests

  • generate lib1521.c atomically
  • remove all valgrind disable instructions
  • remove debug requirement on 38 tests
  • use ‘-4’ where needed

Next

Unless we find a terrible regression, the next curl release is scheduled to ship on January 8, 2025.

curl source code age

In every software project that has been around for a while there is of course newer code and older code. A question that often pops up at least in my mind is then: How much of the old code has actually survived over the years and is still being in use today?

And how would you visualize that in a way that makes it possible to understand the data?

A challenge

This turned out to become my challenge of the week.

I started off writing a script that iterates over all release tags we have set in the curl git repository and for every such tag, it extracts all relevant source files and runs git blame on them. With the -t --line-porcelain options, the output is really easy to parse.

For every such release tag, we get a large number of lines with different timestamps. Then the script sorts all those timestamps, iterates over them, counts how many that were done within different intervals in time and outputs those counters in a formatted line.

Iterating over several hundred tags in a code base of curl size and running git blame like this is not a quick operation. On my decently fast machine, a full such round takes well over an hour. Admittedly there is probably ways the algorithm can be improved.

gnuplot

Once all the data is written, it is converted into a visualization using gnuplot. I needed to experiment. I had to experiment a bit to learn how to do filledcurves.

Take 1

My first take split up the age just as a percentage. How large share of the code has been changed within how many months.

Turned out rather hard to interpret and understand.

Take 2

The source code is always 100%, so how large share of the source code is written within which two-year segment?

I decided to split the time in two-year segments only to keep the number of segments down a little.

Also, I moved the labels to the right side as it is the side where you are most likely interested in reading them. I had to put the legend outside of the graph.

While I think this version turned out pretty cool, the actual number of lines of code and the growth of the code base is completely invisible in this version.

Take 3

What if I would do the take 2 version but do it based on actual number lines instead. I poked the script again, restarted it and let it run for another hour or two.

Better! This version shows the segments in a way that properly reflects the actual number of lines over time. It beats the weird percentage take from above.

Still, having the oldest code slide over on top of the graph like this and have newer code appear from below might not be the best way to illustrate this data. What if I instead swapped it around so that the graph would keep the oldest code at the bottom and add newer code over that?

Take 4

I think this shows perfectly fine how the exact same data can be experienced so much better if shown just slightly differently.

In this version below, I also experimented a bit on how to name the segments in the legend as someone pointed out that it may not be entirely obvious to everyone that I do two-year segments.

I could also move the legend into the graph again here.

Pedantic viewers of this graph will spot how the number of lines of code here is slightly different than the separate line of code graph shown in the curl dashboard. This, because git blame includes all the lines and the other graph is done using cloc which excludes blank lines – and probably some other minor differences as well.

Take 4 is the version of the scripts that starting now will be included in the curl dashboard.

Takeaways

More than 50% of existing curl code was written since 2020.

About 25% of existing code was written before 2014.

Almost 10% was written before 2008.

1254 lines (0.64%) are still left in the code that were written before the year 2000.

No, I don’t know how this compares to other projects of similar age.

The scripts

codeage.pl and codeage.plot

If you want to play with them against your own git repositories, you will notice that there are some curl-specific assumptions in there that you need to address, but that should not be difficult to patch.

Follow-up

Kees Cook wrote an adaptation of these scripts to generate the same graph for the Linux kernel. In Python and using multiple threads.

Eighteen years of ABI stability

Exactly eighteen years ago today, on October 30 2006, we shipped curl 7.16.0 that among a whole slew of new features and set of bugfixes bumped the libcurl SONAME number from 3 to 4.

ABI breakage

This bump meant that libcurl 7.16.0 was not binary compatible with the previous releases. Users could not just easily and transparently bump up to this version from the previous, but they had to check their use of libcurl and in some cases adjust source code.

This was not the first ABI breakage in the curl project, but at this time our use base was larger than at any of the previous bumps and this time people complained about the pains and agonies such a break brought them.

We took away FTP features

In the 7.16.0 release we removed a few FTP related features and their associated options. Before this release, you could use curl to do “third party” transfers over FTP, and in this release you could no longer do that. That is a feature when the client (curl) connects to server A and instructs that server to communicate with server B and do file transfers among themselves, without sending data to and from the client.

This is an FTP feature that was not implemented well in curl and it was poorly tested. It was also a feature that barely no FTP server allowed and subsequently this was not used by many users. We ripped it out.

A near pitchfork situation

Because so few people used the removed features, barely anyone actually noticed the ABI breakage. It remained theoretical to most users and I believe that detail only made people more upset over the SONAME bump because they did not even see the necessity: we just made their lives more complicated for no benefit (to them).

The Debian project even decided to override our decision “no, that is not an ABI breakage” and added a local patch in their build that lowered the SONAME number back to 3 again in their builds. A patch they would stick to for many years to come.

The obvious friction this bump caused, even when in reality it actually did not affect many users and the loud feedback we received, made a huge impact on me. It had not previously dawned on me exactly how important this was.

I decided there and then to do the utmost to never go through this again. To put ABI compatibility at the top of the priority list. Make it one of the most fundamental key properties of libcurl.

Do. Not. Break. The. ABI

(we don’t break the API either)

A never-breaking ABI

The decision was initially made to avoid the negativity the bump brought, but I have since over time much more come to appreciate the upsides.

Application authors everywhere can always and without risk keep upgrading to the latest libcurl.

It sounds easy and simple, but the impact is huge. The examples, the documentation, the applications, everything can just always upgrade and continue. As libcurl over time has become even more popular and compared to 2006, used in many magnitudes more installations, it has grown into an even more important aspect of the curl life. Possibly the single most important properly of curl.

There is a small caveat here and that is that we occasionally of course have bugs and regressions, so when I say that users can always upgrade, that is true in the sense that we have not broken the ABI since. We have however had a few regressions that sometimes have triggered some users to downgrade again or wait a little longer for the next release that has the bug fixed.

When we took that decision in 2006 we had less than 50,000 lines of product code. Today we are approaching 180,000 lines.

Effects of never breaking ABI

We know that once we adopt a change, we are stuck with it for decades to come. It makes us double-check every knot before we accept new changes.

Once accepted and shipped, we keep supporting code and features that we otherwise could have reconsidered and perhaps removed. Sometimes we think of a better way to do something after the initial merge, but by then it is too late to change. We can then always introduce new and better ways to do things, but we have to keep supporting the old way as well.

A most fundamental effect is that we can never shrink the list of options we support. We can never actually rename something. Doing new things and features consistently over this long time is hard if not impossible, as we learn new things and paradigms vary through the decades.

How

The primary way we maintain this is by manual code view and code inspection of every change. Followed of course by a large range of tests that make sure that assumptions remain.

Occasionally we have (long) discussions around subtle details when someone proposes a change that potentially might be considered an ABI break. Or not.

What exactly is covered by ABI compatibility is not always straight forward or easy to have carved in stone. In particular since the project can be built and run on such a wide range of systems and architectures.

Deprecating

We can still remove functionality if the conditions are right.

Some features and options are documented and work in a way so that something is requested or asked for and libcurl then tries to satisfy that ask. Like for example libcurl once supported HTTP/1 pipelining like that.

libcurl still provides the option to enable pipelining and applications can still ask for it so it is still ABI and API compatible, but a modern libcurl simply will never do it because that functionality has been removed.

Example two: we dropped support for NPN a few years back. NPN being a TLS extension called Next Protocol Negotiation that was used briefly in the early days of HTTP/2 development before ALPN was introduced and replaced NPN. Virtually nothing requires NPN anymore, and users can still set the option asking for it, but it will never actually happen over the wire.

Furthermore, a typical libcurl build involves multiple third party libraries that provide features it needs. For things like TLS, SSH, compression and binary HTTP protocol management. Over the years, we have removed support for several such libraries and introduced support for new, in ways that was never visible in the API or ABI. Some users just had to switch to building curl with different helper libraries.

In reality, libcurl is typically more stable than most existing servers and URLs. The libcurl examples you wrote in 2006 can still be built with the modern libcurl, but the servers and URLs you used back then most probably cannot be used anymore.

If no one can spot it, it did not happen

As blunt as it may sound, it has came down to this fundamental statement several times to judge if a change is an ABI breakage or not:

If no one can spot an ABI change, it is not an ABI change

Of course what makes it harder than it sounds is that it is extremely difficult to actually know if someone will notice something ahead of time. libcurl is used in so ridiculously many installations and different setups, second-guessing whatever everyone does and wants is darned close to impossible.

Adding to the challenge is the crazy long upgrade cycles some of our users seem to sport. It is not unusual to see questions appear on the mailing lists from users bumping from curl versions from eight or ten years ago. The fact that we have not heard users comment on a particular change might just mean that they are still stuck on ancient versions.

Getting frustrated comments from users today about a change we landed five years ago is hard to handle.

Forwards compatible

I should emphasize that all this means that users can always upgrade to a later release. It does not necessarily mean that they can switch back to an older version without problems. We do add new features over time and if you start using a new feature, the application of course will not work, or even still compile, if you would switch to a libcurl version from before that feature was added.

How long is never

What I have laid out here is our plan and ambition. We have managed to stick to this for eighteen years now and there is no particular known blockers in the known future either.

I cannot rule out that we might at some point in the future run into an obstacle so huge or complicated that we will be forced to do the unthinkable. To break the ABI. But until we see absolutely no other way forward, it is not going to happen.