Tag Archives: API

Eighteen years of ABI stability

Exactly eighteen years ago today, on October 30 2006, we shipped curl 7.16.0 that among a whole slew of new features and set of bugfixes bumped the libcurl SONAME number from 3 to 4.

ABI breakage

This bump meant that libcurl 7.16.0 was not binary compatible with the previous releases. Users could not just easily and transparently bump up to this version from the previous, but they had to check their use of libcurl and in some cases adjust source code.

This was not the first ABI breakage in the curl project, but at this time our use base was larger than at any of the previous bumps and this time people complained about the pains and agonies such a break brought them.

We took away FTP features

In the 7.16.0 release we removed a few FTP related features and their associated options. Before this release, you could use curl to do “third party” transfers over FTP, and in this release you could no longer do that. That is a feature when the client (curl) connects to server A and instructs that server to communicate with server B and do file transfers among themselves, without sending data to and from the client.

This is an FTP feature that was not implemented well in curl and it was poorly tested. It was also a feature that barely no FTP server allowed and subsequently this was not used by many users. We ripped it out.

A near pitchfork situation

Because so few people used the removed features, barely anyone actually noticed the ABI breakage. It remained theoretical to most users and I believe that detail only made people more upset over the SONAME bump because they did not even see the necessity: we just made their lives more complicated for no benefit (to them).

The Debian project even decided to override our decision “no, that is not an ABI breakage” and added a local patch in their build that lowered the SONAME number back to 3 again in their builds. A patch they would stick to for many years to come.

The obvious friction this bump caused, even when in reality it actually did not affect many users and the loud feedback we received, made a huge impact on me. It had not previously dawned on me exactly how important this was.

I decided there and then to do the utmost to never go through this again. To put ABI compatibility at the top of the priority list. Make it one of the most fundamental key properties of libcurl.

Do. Not. Break. The. ABI

(we don’t break the API either)

A never-breaking ABI

The decision was initially made to avoid the negativity the bump brought, but I have since over time much more come to appreciate the upsides.

Application authors everywhere can always and without risk keep upgrading to the latest libcurl.

It sounds easy and simple, but the impact is huge. The examples, the documentation, the applications, everything can just always upgrade and continue. As libcurl over time has become even more popular and compared to 2006, used in many magnitudes more installations, it has grown into an even more important aspect of the curl life. Possibly the single most important properly of curl.

There is a small caveat here and that is that we occasionally of course have bugs and regressions, so when I say that users can always upgrade, that is true in the sense that we have not broken the ABI since. We have however had a few regressions that sometimes have triggered some users to downgrade again or wait a little longer for the next release that has the bug fixed.

When we took that decision in 2006 we had less than 50,000 lines of product code. Today we are approaching 180,000 lines.

Effects of never breaking ABI

We know that once we adopt a change, we are stuck with it for decades to come. It makes us double-check every knot before we accept new changes.

Once accepted and shipped, we keep supporting code and features that we otherwise could have reconsidered and perhaps removed. Sometimes we think of a better way to do something after the initial merge, but by then it is too late to change. We can then always introduce new and better ways to do things, but we have to keep supporting the old way as well.

A most fundamental effect is that we can never shrink the list of options we support. We can never actually rename something. Doing new things and features consistently over this long time is hard if not impossible, as we learn new things and paradigms vary through the decades.

How

The primary way we maintain this is by manual code view and code inspection of every change. Followed of course by a large range of tests that make sure that assumptions remain.

Occasionally we have (long) discussions around subtle details when someone proposes a change that potentially might be considered an ABI break. Or not.

What exactly is covered by ABI compatibility is not always straight forward or easy to have carved in stone. In particular since the project can be built and run on such a wide range of systems and architectures.

Deprecating

We can still remove functionality if the conditions are right.

Some features and options are documented and work in a way so that something is requested or asked for and libcurl then tries to satisfy that ask. Like for example libcurl once supported HTTP/1 pipelining like that.

libcurl still provides the option to enable pipelining and applications can still ask for it so it is still ABI and API compatible, but a modern libcurl simply will never do it because that functionality has been removed.

Example two: we dropped support for NPN a few years back. NPN being a TLS extension called Next Protocol Negotiation that was used briefly in the early days of HTTP/2 development before ALPN was introduced and replaced NPN. Virtually nothing requires NPN anymore, and users can still set the option asking for it, but it will never actually happen over the wire.

Furthermore, a typical libcurl build involves multiple third party libraries that provide features it needs. For things like TLS, SSH, compression and binary HTTP protocol management. Over the years, we have removed support for several such libraries and introduced support for new, in ways that was never visible in the API or ABI. Some users just had to switch to building curl with different helper libraries.

In reality, libcurl is typically more stable than most existing servers and URLs. The libcurl examples you wrote in 2006 can still be built with the modern libcurl, but the servers and URLs you used back then most probably cannot be used anymore.

If no one can spot it, it did not happen

As blunt as it may sound, it has came down to this fundamental statement several times to judge if a change is an ABI breakage or not:

If no one can spot an ABI change, it is not an ABI change

Of course what makes it harder than it sounds is that it is extremely difficult to actually know if someone will notice something ahead of time. libcurl is used in so ridiculously many installations and different setups, second-guessing whatever everyone does and wants is darned close to impossible.

Adding to the challenge is the crazy long upgrade cycles some of our users seem to sport. It is not unusual to see questions appear on the mailing lists from users bumping from curl versions from eight or ten years ago. The fact that we have not heard users comment on a particular change might just mean that they are still stuck on ancient versions.

Getting frustrated comments from users today about a change we landed five years ago is hard to handle.

Forwards compatible

I should emphasize that all this means that users can always upgrade to a later release. It does not necessarily mean that they can switch back to an older version without problems. We do add new features over time and if you start using a new feature, the application of course will not work, or even still compile, if you would switch to a libcurl version from before that feature was added.

How long is never

What I have laid out here is our plan and ambition. We have managed to stick to this for eighteen years now and there is no particular known blockers in the known future either.

I cannot rule out that we might at some point in the future run into an obstacle so huge or complicated that we will be forced to do the unthinkable. To break the ABI. But until we see absolutely no other way forward, it is not going to happen.

xCurl

It is often said that Imitation is the Sincerest Form of Flattery.

Also, remember libcrurl? That was the name of the thing Google once planned to do: reimplement the libcurl API on top of their Chrome networking library. Flattery or not, it never went anywhere.

The other day I received an email asking me about details regarding something called xCurl. Not having a clue what that was, a quick search soon had me enlightened.

xCurl is, using their own words, a Microsoft Game Development Kit (GDK) compliant implementation of the libCurl API.

A Frankencurl

The article I link to above describes how xCurl differs from libcurl:

xCurl differs from libCurl in that xCurl is implemented on top of WinHttp and automatically follows all the extra Microsoft Game Development Kit (GDK) requirements and best practices. While libCurl itself doesn’t meet the security requirements for the Microsoft Game Development Kit (GDK) console platform, use xCurl to maintain your same libCurl HTTP implementation across all platforms while changing only one header include and library linkage.

I don’t know anything about WinHttp, but since that is an HTTP API I can only presume that making libcurl use that instead of plain old sockets has to mean a pretty large surgery and code change. I also have no idea what the mentioned “security requirements” might be. I’m not very familiar with Windows internals nor with their game development setups.

The article then goes on to describe with some detail exactly which libcurl options that work, and which don’t and what libcurl build options that were used when xCurl was built. No DoH, no proxy support, no cookies etc.

The provided functionality is certainly a very stripped down and limited version of the libcurl API. A fun detail is that they quite bluntly just link to the libcurl API documentation to describe how xCurl works. It is easy and convenient of course, and it will certainly make xCurl “forced” to stick to the libcurl behavior

With large invasive changes of this kind we can certainly hope that the team making it has invested time and spent serious effort on additional testing, both before release and ongoing.

Source code?

I have not been able to figure out how to download xCurl in any form, and since I can’t find the source code I cannot really get a grip of exactly how much and how invasive Microsoft has patched this. They have not been in touch or communicated about this work of theirs to anyone in the curl project.

Therefore, I also cannot say which libcurl version this is based on – as there is no telling of that on the page describing xCurl.

The email that triggered me to crawl down this rabbit hole included a copyright file that seems to originate from an xCurl package, and that includes the curl license. The curl license file has the specific detail that it shows the copyright year range at the top and this file said

Copyright (c) 1996 - 2020, Daniel Stenberg, daniel@haxx.se, and many contributors, see the THANKS file.

It might indicate that they use a libcurl from a few years back. Only might, because it is quite common among users of libcurl to “forget” (sometimes I’m sure on purpose) to update this copyright range even when they otherwise upgrade the source code. This makes the year range a rather weak evidence of the actual age of the libcurl code this is based on.

Updates

curl (including libcurl) ships a new version at least once every eight weeks. We merge bugfixes at a rate of around three bugfixes per day. Keeping a heavily modified libcurl version in sync with the latest curl releases is hard work.

Of course, since they deliberately limit the scope of the functionality of their clone, lots of upstream changes in curl will not affect xCurl users.

License

curl is licensed under… the curl license! It is an MIT license that I was unclever enough to slightly modify many years ago. The changes are enough for organizations such as SPDX to consider it a separate one: curl. I normally still say that curl is MIT licensed because the changes are minuscule and do not change the spirit of the license.

The curl license of course allows Microsoft or anyone else do to this kind of stunt and they don’t even have to provide the source code for their changes or the final product and they don’t have to ask or tell anyone:

Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

I once picked this license for curl exactly because it allows this. Sure it might sometimes then make people do things in secret that they never contribute back, and we miss out on possible improvements because of that, but I think the more important property is that no company feels scared or restricted as to when and where they can use this code. A license designed for maximum adoption.

I have always had the belief that it is our relentless update scheme and never-ending flood of bugfixes that is what will keep users wanting to use the real thing and avoid maintaining long-running external patches. There will of course always be exceptions to that.

Follow-up

Forensics done by users who installed this indicate that this xCurl is based on libcurl 7.69..x. We removed a define from the headers in 7.70.0 (CURL_VERSION_ESNI) that this package still has. It also has the CURLOPT_MAIL_RCPT_ALLLOWFAILS define, added in 7.69.0.

curl 7.69.1 was released on March 11, 2020. It has 40 known vulnerabilities, and we have logged 3,566 bugfixes since then. Of course not all of any of those affect xCurl.

Making libcurl init more thread-safe

Twenty-one years ago, in May 2001 we introduced the global initialization function to libcurl version 7.8 called curl_global_init().

The main reason we needed this separate function to get called before anything else was used in libcurl, was that several of libcurl’s dependencies at the time (including OpenSSL and GnuTLS) had themselves thread-unsafe initialization procedures.

This rather lame characteristic found in several third party dependencies made the libcurl function inherit that property: not thread-safe. A nasty “feature” in a library that otherwise prides itself for being thread-safe and in many ways working at “it should”. A function that is specifically marked as thread unsafe was not good. Is not good.

Still, we were victims of circumstances and if these were the dependencies we were going to use, this is what we needed to do.

Occasionally, this limitation has poked people in the eye and really hurt them since it makes some use cases really difficult to realize.

Dependencies improved

Over the following decades, the dependencies libcurl use have almost all shaped up and removed the thread-unsafe property of their initialization procedures.

We also slowly cleaned away other code that happened to also fall into the init function out of laziness and convenience because it was there and could be used (or perhaps abused).

Eventually, we were basically masters of our own faith again. The closet was all cleared out and the scrubby leftovers we had sloppily left in there had been cleaned up and gotten converted to proper thread-safe code.

The task of finally making curl_global_init() thread-safe was brought up and attempted a little half-assed a few times but was never pulled through all the way.

The challenges always included that we want to avoid relying on thread library and that we are supporting building libcurl with C89 compilers etc.

Finally, the spring cleaning of 2022

Thanks to work spear-headed by Thomas Guillem who came bursting in with a clear use-case in mind where he felt he really need this to work, and voila now the next libcurl release (7.84.0) features a thread-safe init.

If configure finds support for _Atomic (a C11 feature) or it runs on a new enough Windows version (this should cover a vast amount of platforms), libcurl can now do its own spinlock implementation that makes the init function thread-safe and independent of threading libraries.

A headers API for libcurl

For many years we’ve had this outstanding idea to add a new API to libcurl that would offer applications easy access to HTTP response headers.

Applications could already retrieve the headers using existing methods but that requires them to write a callback and to a certain amount of parsing and “understanding” HTTP that we always felt was a little unfortunate, a bit error-prone on the behalf of the applications and perhaps also a thing that forced a lot of applications out there having to write the same kind of extra function logic.

If libcurl provides this functionality, it would remove a lot of (duplicated) code from a lot of applications.

Designing the API

We started this process a while ago when I first wrote down a basic approach to an API for this and sent it off to the curl-library mailing list for feedback and critique.

/* first take */
char *curl_easy_header(CURL *easy,
                       const char *name);

The conversation that followed that first plea for help, made me realize that my first proposal had been far too basic and it wouldn’t at all work to satisfy the needs and use cases we could think of for this API.

Try again

I went back to mull over what I’ve learned and update my design proposal, trying to take the feedback into account in the best possible way. A few weeks later, I returned with a “proposal v2” and again I asked for comments and opinions on what I had put together.

/* second shot */
CURLHcode curl_easy_header(CURL *easy,
                           const char *name,
                           size_t index,
                           struct curl_header **h);

As I had already adjusted the API from feedback the first time around, the feedback this time was perhaps not calling for as big changes or radical differences as they did the first time around. I could adapt my proposal to what people asked and suggested. We arrived at something that seemed like a pretty solid API for offering HTTP headers to applications.

Let’s do this

As the API proposal feedback settled down and the interface felt good and sensible, I decided it was time for me to write up a first implementation so that we can offer code to people to give everyone a chance to try out the API in real life as well. There’s one thing to give feedback on a “paper product”, actually being able to use it and try it in an application is way better. I dove in.

The final take

When the code worked to the level that I started to be able to extract the first headers with the API, it proved to that we needed to adjust the API a little more, so I did. I then ran into more questions and thoughts about specifics that we hadn’t yet dealt with or nailed proper in the discussions up to that point and I took some questions back to the curl community. This became an iterative process and we smoothed out questions about how access different header “sources” as well as how to deal with multiple headers and “request sequences”. All supported now.

/* final version */
CURLHcode curl_easy_header(CURL *easy,
                           const char *name,
                           size_t index,
                           unsigned int origin,
                           int request,
                           struct curl_header **h);

Multiple headers

This API allows applications to extract all headers from a previous transfer. It can get one or many headers when there are duplicated ones, like Set-Cookie: commonly arrive as.

Sources

The application can ask for “normal” headers, for trailers (that arrive after the body), headers associated with the CONNECT request (if such a one was performed), pseudo headers (that might arrive when HTTP/2 and HTTP/3 is used) or headers associated with a HTTP 1xx “intermediate” response.

Multiple responses

The libcurl APIs typically work on transfers, which means that a single transfer may end up doing multiple transfers, multiple HTTP requests. Primarily when redirects are followed but it can also be due to other reasons. This header API therefore allows the caller to extract headers from the entire “chain” of requests a previous transfer was made with.

EXPERIMENTAL

This API is initially merged (in this commit) labeled “experimental” to be included in the upcoming 7.83.0 release. The experimental label means a few different things to us:

  • The API is disabled by default in the build and you need to explicitly ask for it with --enable-headers-api when you run configure
  • There are no ABI and API promises for these functions yet. We might change the functions based on feedback before we remove the label.
  • We strongly discourage anyone from shipping experimentally labeled functions in production.
  • We rely on people to enable and test this and provide feedback, to give us confidence enough to remove the experimental label as soon as possible.

We use the experimental “route” to lower the bar for merging new stuff, so that we get some extra chances to fix up mistakes before the rules and API are carved in stone and we are set to support that for a life time.

This setup relies on users actually trying out the experimental stuff as otherwise it isn’t method for improving the API, it will only delay the introduction of it to the general public. And it risks becoming be less good.

Documentation

The two new functions have detailed man pages: curl_easy_header and curl_easy_nextheader. If there is anything missing on unclear in there, let us know!

I have also created an initial example source snippet showing header API use. See headerapi.c.

This API deserves its own little section in the everything curl book, but I think I will wait for it to get landed “for real” before I work on adding that.

You wanted WebSockets?

WebSockets has been one of the most requested features and protocol to add to curl and libcurl in the annual user survey. Repeatedly, over the last few years.

WebSockets is not perfectly suitable to be done by libcurl since it’s not really an upload or download transfer protocol, but is more something like “a TCP for JavaScript”. It provides a bidirectional data stream over HTTP. (I was there when it was created, first mentioned on my blog here.)

Ignoring that technicality, WebSockets is often used more or less for a one-directional data stream. Commonly together with the use of other protocols that curl already supports. If libcurl would support it, there will be plenty of applications out there that could simplify their code.

Today, users use a mix of libcurl, custom code on top or “over” libcurl and other WebSockets libraries. There’s no single de-facto way or practice to do WebSockets with libcurl.

WebSockets for libcurl?

I took the topic of drafting a WebSockets API for libcurl to the libcurl mailing list a while ago and after a lot of back and forths and feedback from multiple people, we have a decent beginning of a WebSockets API that might work jotted down.

This is just a potential API described in a document. How it could be made to work. Nobody has actually implemented any of it.

Implement?

We know users ask for WebSockets, repeatedly and several people helped contributing to the tentative API design.

It’s just that this time I decided to pause and see if I couldn’t get some help in implementing this. To create a team of implementers willing to work before I dive in, alternatively to find someone who’d sponsor this work to allow me to spend more and dedicated time on it. I decided to do this, because I already have a lot of other things on my plate and I have to focus on my paying (curl) customers. I estimate that implementing WebSockets support is quite a lot of work.

If nobody is willing to put in the work or money to make it happen, then maybe that’s rather clear message that this is not a feature that is meant to be provided by curl. At least not now.

WebSockets future

WebSockets was created in the HTTP/1.1 era, and is probably still mostly done using that protocol as bootstrap. There are indications hinting that the future might hold less WebSockets.

It took a long time but eventually a way to do WebSockets over HTTP/2 was provided via RFC 8441, “Bootstrapping WebSockets with HTTP/2”, published in 2018. This allows a WebSockets connection to be done over a single HTTP/2 stream.

The next evolutionary step seems to rather be WebTransport. It is a new take and protocol and is meant to be used over HTTP/3 and QUIC. It is described to “send data to and receive data from servers. It can be used like WebSockets but with support for multiple streams, unidirectional streams, out-of-order delivery, and reliable as well as unreliable transport.”

Credits

Image by pisauikan from Pixabay

Heading towards curl eight

There’s a plan for version 8 being forged! Let me just take you back a bit in time first..

The early days

When we first created libcurl, we bumped the major version number of the project from the previous version 6 to version 7. In late summer of 2000 we shipped curl and libcurl 7.1 as the first ever release that featured a separate library for Internet transfer powers. Everything before version 7 was just the command line tool, curl. That was the moment in time we decided we should leave kindergarten and we were ready to take on some tougher loads.

I had the main approach to the API worked out already from the start. It would be transfer-oriented, it would be built up around URLs and shouldn’t necessarily require that the users are themselves protocol experts to use it. I used inspiration from ioctl and fcntl when I made curl_easy_setopt and curl_easy_getinfo. They’re fixed functions with flexible arguments. The idea was that by doing that, we wouldn’t have to add new functions for new features but we would “just” add options and new options would simply just not work with older libcurls.

The first API we shipped also only provided synchronous single transfers. The “easy” interface.

I had no anticipations or particular hopes on the library then. It would be cool if someone found good use for it and it would be even cooler if someone would help out to improve it further.

It grew, I learned

I had never before developed and shipped a library for the world to use. I hadn’t really fully grasped and considered the impact of APIs and ABI stability etc.

We gradually improved the library over time. We bumped the SONAME several times in the first few years as we modified internals. In the same time the library caught on a bigger and bigger audience and in September 2006 as I ripped out code for what is commonly referred to as “FTP third party transfers” I once again bumped the ABI number (to 4) since the older libcurl was no longer compatible with the new release.

People don’t like SONAME bumps

That bump was met with quite a lot of resistance and objections among users. Changing SONAME of a widely used library it turns out causes a lot of pain, agony and squeaking. Possibly this was one of the earlier signs that libcurl had grown up and I decided that we should try to avoid going through this again. We shall not break ABI compatibility again. Ever.

No bumps, no worries

In this world with no SONAME bumps I wanted to keep that solidity visible in the major number of the project so even though the version number of the releases aren’t strictly related to the SONAME we kept shipping curl version seven. In September 2021, we reach curl 7.79.0.

We’ve managed to stick to our goals and a binary libcurl using application built after September 2006 can run with the latest libcurl with no modification needed! This is one of the biggest edges and “selling points” we have in libcurl: We take compatibility and unmodified behavior very seriously.

In 2013 I even wrote blog post emphasizing this and in there I said there won’t be a curl version 8 “in a long time”. But read on, things have changed a little!

An ever-growing minor number

We bump the minor version number every time we “change” something in the project, or add features. If we only do bug-fixes we only bump the patch number. You know, in classic “major,minor.patch” style.

As we do curl releases at least every 8 weeks and most releases have changes added, we bump the minor number very frequently, up to 6 times a year or so.

We provide version number information for libcurl provided as a 24 bit number, using 8 bits for each field. This implies that none of the numbers can ever go above 255. We can’t ship a 7.256.0 for this practical reason.

But also, from what I’ve seen people do with and think about version numbers before, I’m concerned that increasing the minor number beyond 99 will cause confusions. Version 7.100.0 risks gettinged confused and mixed up with 7.1 or 7.10.0, two versions that are terribly old. And frankly that’s a very large minor number and it starts becoming many digits in that release version.

There exists a solution to this!

Reset minor, bump major, keep SONAME

The idea is simply to do “a Linux kernel”. We change to version 8.0.0 at a given point in time, but we stick to the same SONAME as before.

We don’t break any compatibility, there will no no API or functionality cleanups. There will just be a version number bump to lower the minor number and let us start over that journey. Reset the counter so to speak.

When do we do this? We have roughly 20 releases left before the minor counter can reach 100, 20 releases take at least 6*20 = 120 weeks. 120 weeks is 2.3 years.

Another event within this period

On March 20, 2023. About 18 months into the future, the curl project turns 25 years old. Here’s a golden opportunity! Let’s top off the 25 year celebrations with a major version number bump!

The plan

Independently of what version number we have reached to at that point, independently of if we add features or not in that release, independently of exactly how that date fits within the pre-determined release cycle and without changing any APIs, we ship curl version 8.0.0 on that day.

Turning 25 and bumping the major version number on the same day should be fun.

Version 8

I hope that a small side-effect of bumping the major number will make users still left on version 7 to slightly faster feel outdated and push for getting up to 8. It could work as a minor push to get users to catch up a bit.

The minor number will of course immediately start “climbing” again and in a worst-case scenario, we risk reaching minor number 100 again within another 17 years. Maybe we can plan another bump for the 40th birthday?

Imagining a thread-safe curl_global_init

libcurl is thread-safe

That’s the primary message that we push and that’s important to remember. You can write a multi-threaded application that does concurrent Internet transfers with libcurl in as many threads as you like and they fly just fine.

But

But there are nuances and details of course and the devil is always in those. The main obstacle that then and again causes problems for users is the curl_global_init() function. But how come?

curl_global_init

Back in the day, libcurl developers realized that when we work with in particular a lot of third party TLS libraries, they feature init functions that need to be called first, before any other function in those libraries are called. And they typically are all marked as not thread-safe, we have to call those functions knowing that no other thread calls them. This was the case for GnuTLS (before version 3.3.0) and it was the case for OpenSSL (until they shipped version 1.1.1) etc.

In order for libcurl to adhere to those restrictions that weren’t our own inventions, we added a function to libcurl called curl_global_init() that then in itself inherited those non-thread safe characteristics. We documented the function as not thread-safe.

Time passed, and as we now had a function that is a global initialization function that is also marked not thread-safe, it was an attractive point to add more and other functionality for the library. Other global initializations that then weren’t thread-safe either – as that wasn’t any point in doing anyway since the entire thing wasn’t thread-safe to begin with.

The problems

Having the global init function not being thread-safe has caused problems to users, mostly in use cases where for example they use libcurl in a plugin-like cases where you can’t know if you’re the only user in the process.

We’ve then mostly been longing for better days and blamed the third party libraries that forced us into this corner.

Third parties shaped up, we didn’t

One day in recent times when we looked at what third party libraries a typical libcurl user uses in a modern system, we see that they’ve all fixed their init functions! OpenSSL and GnuTLS that once were part of the original reasons for this function have fixed their issues. They no longer have thread-unsafe init functions.

But libcurl still does! 🙁

While we were initially pushed into this unfortunate corner because of limitations in third party libraries, we had added our own init functions into that function that aren’t thread-safe and now, even though the third party libraries had done the right thing over time, we found ourselves no longer able to put the blame on others. Now we need to clean up our own backyard!

Fix it!

In libcurl 7.69.0 we’ve started this journey with two distinct changes. The goal is to make the function thread-safe under the condition that libcurl is built with only thread-safe dependencies, and we should make configure etc check if that’s the case.

1: EINTR handling

Since libcurl 7.30.0, we’ve provided a flag in the curl_global_init() function to let libcurl users ask for EINTR to actually abort internal loops. Starting now, that flag has no meaning and this is now default behavior. No need to store this state globally anywhere.

2: Working IPv6

At least in the past, it has been common with systems that are IPv6-capable at build-times but that can’t actually create IPv6 sockets and therefor they can’t actually use IPv6. This was previously checked for, once, in the global init and then IPv6 is disabled for everyone. Without a global state, we’ve been forced to move this check and it is now instead done for every created multi handle. A minuscule performance hit for thread safety.

Left to do until completely thread-safe

The transition isn’t completed. The low hanging fruit has been picked, here are some remaining issues to solve:

When is it thread-safe?

Since curl can be built with a number of different third party libraries, including version old versions, we need to make the configure script know what versions of what libraries that are safe so that it can tell. But how are libcurl application authors supposed to know? Can we figure how a way to tell them?

curl_version*

Both curl_version() and curl_version_info() store information in static buffers and return information pointing to that memory. They’re currently setup in the global init so they work safely from multiple threads today, but we probably need to create new, alternative versions of them, that instead allocate heap memory to return the info in. Or possibly store the info in memory associated with a handle.

Update: Patrick Monnerat made me realize that a possibly even better way to fix them is to make sure they generate the same output in a way that repeated or concurrent invokes are fine.

Reference counter

There’s a counter counting calls to curl_global_init() so that the corresponding number of calls to curl_global_cleanup() is required before things are actually cleaned up.

This is a hard nut to crack without a global context and no mutex locks. I haven’t yet figured out how to solve this. If you have ideas, I’m listening!

When?

There’s no fixed time schedule for when these remaining nits are supposed to be fixed, but I hope to work on them going forward and I will appreciate all the help I can get and if things just progress, I would imagine we can end 2020 with a libcurl with these flaws fixed!

Oh, and we also really need to make sure that we don’t simultaneously come up with or think of new thread unsafe functionality for the init function..

Credits

Top image by Andreas Lischka from Pixabay

This is your wake up curl

curl_multi_wakeup() is a new friend in the libcurl API family. It will show up for the first time in the upcoming 7.68.0 release, scheduled to happen on January 8th 2020.

Sleeping is important

One of the core functionalities in libcurl is the ability to do multiple parallel transfers in the same thread. You then create and add a number of transfers to a multi handle. Anyway, I won’t explain the entire API here but the gist of where I’m going with this is that you’ll most likely sooner or later end up calling the curl_multi_poll() function which asks libcurl to wait for activity on any of the involved transfers – or sleep and don’t return for the next N milliseconds.

Calling this waiting function (or using the older curl_multi_wait() or even doing a select() or poll() call “manually”) is crucial for a well-behaving program. It is important to let the code go to sleep like this when there’s nothing to do and have the system wake up it up again when it needs to do work. Failing to do this correctly, risk having libcurl instead busy-loop somewhere and that can make your application use 100% CPU during periods. That’s terribly unnecessary and bad for multiple reasons.

Wakey wakey

When your application calls libcurl to say “sleep for a second or until something happens on these N transfers” and something happens and the application for example needs to shut down immediately, users have been asking for a way to do a wake up call.

– Hey libcurl, wake up and return early from the poll function!

You could achieve this previously as well, but then it required you to write quite a lot of extra code, plus it would have to be done carefully if you wanted it to work cross-platform etc. Now, libcurl will provide this utility function for you out of the box!

curl_multi_wakeup()

This function explicitly makes a curl_multi_poll() function return immediately. It is designed to be possible to use from a different thread. You will love it!

curl_multi_poll()

This is the only call that can be woken up like this. Old timers may recognize that this is also a fairly new function call. We introduced it in 7.66.0 back in September 2019. This function is very similar to the older curl_multi_wait() function but has some minor behavior differences that also allow us to introduce this new wakeup ability.

Credits

This function was brought to us by the awesome Gergely Nagy.

Top image by Wokandapix from Pixabay

The future of HTTP Symposium

This year’s version of curl up started a little differently: With an afternoon of HTTP presentations. The event took place the same week the IETF meeting has just ended here in Prague so we got the opportunity to invite people who possibly otherwise wouldn’t have been here… Of course this was only possible thanks to our awesome sponsors, visible in the image above!

Lukáš Linhart from Apiary started out with “Web APIs: The Past, The Present and The Future”. A journey trough XML-RPC, SOAP and more. One final conclusion might be that we’re not quite done yet…

James Fuller from MarkLogic talked about “The Defenestration of Hypermedia in HTTP”. How HTTP web technologies have changed over time while the HTTP paradigms have survived since a very long time.

I talked about DNS-over-HTTPS. A presentation similar to the one I did before at FOSDEM, but in a shorter time so I had to talk a little faster!

Mike Bishop from Akamai (editor of the HTTP/3 spec and a long time participant in the HTTPbis work) talked about “The evolution of HTTP (from HTTP/1 to HTTP/3)” from HTTP/0.9 to HTTP/3 and beyond.

Robin Marx then rounded off the series of presentations with his tongue in cheek “HTTP/3 (QUIC): too big to fail?!” where we provided a long list of challenges for QUIC and HTTP/3 to get deployed and become successful.

We ended this afternoon session with a casual Q&A session with all the presenters discussing various aspects of HTTP, the web, REST, APIs and the benefits and deployment challenges of QUIC.

I think most of us learned things this afternoon and we could leave the very elegant Charles University room enriched and with more food for thoughts about these technologies.

We ended the evening with snacks and drinks kindly provided by Apiary.

(This event was not streamed and not recorded on video, you had to be there in person to enjoy it.)


QUIC and missing APIs

I trust you’ve heard by now that HTTP/3 is coming. It is the next destined HTTP version, targeted to get published as an RFC in July 2019. Not very far off.

HTTP/3 will not be done over TCP. It will only be performed over QUIC, which is a transport protocol replacement for TCP that always is done encrypted. There’s no clear-text version of QUIC.

TLS 1.3

The encryption in QUIC is based on TLS 1.3 technologies which I believe everyone thinks is a good idea and generally the correct decision. We need to successively raise the bar as we move forward with protocols.

However, QUIC is not only a transport protocol that does encryption by itself while TLS is typically (and designed as) a protocol that is done on top of TCP, it was also designed by a team of engineers who came up with a design that requires APIs from the TLS layer that the traditional TLS over TCP use case doesn’t need!

New TLS APIs

A QUIC implementation needs to extract traffic secrets from the TLS connection and it needs to be able to read/write TLS messages directly – not using the TLS record layer. TLS records are what’s used when we send TLS over TCP. (This was discussed and decided back around the time for the QUIC interim in Kista.)

These operations need APIs that still are missing in for example the very popular OpenSSL library, but also in other commonly used ones like GnuTLS and libressl. And of course schannel and Secure Transport.

Libraries known to already have done the job and expose the necessary mechanisms include BoringSSL, NSS, quicly, PicoTLS and Minq. All of those are incidentally TLS libraries with a more limited number of application users and less mainstream. They’re also more or less developed by people who are also actively engaged in the QUIC protocol development.

The QUIC libraries in progress now are typically using either one of the TLS libraries that already are adapted or do what ngtcp2 does: it hosts a custom-patched version of OpenSSL that brings the needed functionality.

Matt Caswell of the OpenSSL development team acknowledged this situation already back in September 2017, but so far we haven’t seen this result in updated code shipped in a released version.

curl and QUIC

curl is TLS library agnostic and can get built with around 12 different TLS libraries – one or many actually, as you can build it to allow users to select TLS backend in run-time!

OpenSSL is without competition the most popular choice to build curl with outside of the proprietary operating systems like macOS and Windows 10. But even the vendor-build and provided mac and Windows versions are also built with libraries that lack APIs for this.

With our current keen interest in QUIC and HTTP/3 support for curl, we’re about to run into an interesting TLS situation. How exactly is someone going to build curl to simultaneously support both traditional TLS based protocols as well as QUIC going forward?

I don’t have a good answer to this yet. Right now (assuming we would have the code ready in our end, which we don’t), we can’t ship QUIC or HTTP/3 support enabled for curl built to use the most popular TLS libraries! Hopefully by the time we get our code in order, the situation has improved somewhat.

This will slow down QUIC deployment

I’m personally convinced that this little API problem will be friction enough when going forward that it will slow down and hinder QUIC deployment at least initially.

When the HTTP/2 spec shipped in May 2015, it introduced a dependency on the fairly new TLS extension called ALPN that for a long time caused head aches for server admins since ALPN wasn’t supported in the OpenSSL versions that was typically installed and used at the time, but you had to upgrade OpenSSL to version 1.0.2 to get that supported.

At that time, almost four years ago, OpenSSL 1.0.2 was already released and the problem was big enough to just upgrade to that. This time, the API we’re discussing here is not even in a beta version of OpenSSL and thus hasn’t been released in any version yet. That’s far worse than the HTTP/2 situation we had and that took a few years to ride out.

Will we get these APIs into an OpenSSL release to test before the QUIC specification is done? If the schedule sticks, there’s about six months left…