Category Archives: cURL and libcurl

curl and/or libcurl related

“So what exactly is curl?”

You know that question you can get asked casually by a person you’ve never met before or even by someone you’ve known for a long time but haven’t really talked to about this before. Perhaps at a social event. Perhaps at a family dinner.

– So what do you do?

The implication is of course what you work with. Or as. Perhaps a title.

Software Engineer

In my case I typically start out by saying I’m a software engineer. (And no, I don’t use a title.)

If the person who asked the question is a non-techie, this can then take off in basically any direction. From questions about the Internet, how their printer acts up sometimes to finicky details about Wifi installations or their parents’ problems installing anti-virus. In other words: into areas that have virtually nothing to do with software engineering but are related to computers.

If the person is somewhat knowledgeable or interested in technology or computers they know both what software and engineering are. Then the question can get deepened.

What kind of software?

Alternatively they ask what company I work for, but it usually ends up on the same point anyway, just via this extra step.

I work on curl. (Saying I work for wolfSSL rarely helps.)

Business cards of mine

So what is curl?

curl is a command line tool used by a small set of people (possibly several thousands or even millions), and the library libcurl that is installed in billions of places.

I often try to compare libcurl with how companies build for example cars out of many components from different manufacturers and companies. They use different pieces from many separate sources put together into a single machine to produce the end product.

libcurl is like one of those little components that a car manufacturer needs. It isn’t the only choice, but it is a well known, well tested and familiar one. It’s a safe choice.

Internet what?

Lots of people, even many with experience, knowledge or even jobs in the IT industry, don’t know what an Internet transfer is, I’ve realized. Describing curl as doing such transfers doesn’t really help in those cases.

An internet transfer is the bridge between “the cloud” and your devices or applications. curl is a bridge.

Everything wants Internet these days

In general, anything today that has power goes towards becoming networked. Everything that can, will connect to the Internet sooner or later. Maybe not always because it’s a good idea, but because it gives your thing a (perceived) advantage over your competitors.

Things that a while ago you wouldn’t dream would do that, now do Internet transfers. Tooth brushes, ovens, washing machines etc.

If you want to build a new device or application today and you want it to be successful and more popular than your competitors, you will probably have to make it Internet-connected.

You need a “bridge”.

Making things today is like doing a puzzle

Everyone who makes devices or applications today has a wide variety of different components and pieces of the big “puzzle” to select from.

You can opt to write many pieces yourself, but virtually nobody today creates anything digital entirely on their own. We lean on others. We stand on others’ shoulders. In particular open source software has grown up to become or maybe provide a vast ocean of puzzle pieces to use and leverage.

One of the little pieces in your device puzzle is probably Internet transfers, because you want your thing to get updates, upload telemetry and who knows what else.

The picture then needs a piece inserted in the right spot to get complete. The Internet transfers piece. That piece can be curl. We’ve made curl to be a good such piece.

This perfect picture is just missing one little piece…

Relying on pieces provided by others

A lot has been said about the fact that companies, organizations and entire ecosystems rely on pieces and components written, maintained and provided by someone else. Some of them are open source components written by developers in their spare time, but are still used by thousands of companies shipping commercial products.

curl is one such component. It’s not “just” a spare time project anymore of course, but the point remains. We estimate that curl runs in some ten billion installations these days, so quite a lot of current Internet infrastructure uses our little puzzle piece in their pictures.

Modified version of the original xkcd 2347 comic

So you’re rich

I rarely get to this point in any conversation because I would have already bored my company into a coma by now.

The concept of giving away a component like this as open source under a liberal license is a very weird concept to people in general. Maybe also because I say that I work on this and I created it, but I’m not at all the only contributor and we wouldn’t have gotten to this point without the help of several hundred other developers.

“- No, I give it away for free. Yes really, entirely and totally free for anyone and everyone to use. Correct, even the largest and richest mega-corporations of the world.”

The ten billion installations work as marketing for getting companies to understand that curl is a solid puzzle piece so that more will use it and some of those will end up discovering they need help or assistance and they purchase support for curl from me!

I’m not rich, but I do perfectly fine. I consider myself very lucky and fortunate to get to work on curl for a living.

A curl world

There are about 5 billion Internet using humans in the world. There are about 10 billion curl installations.

The puzzle piece curl is there in the middle.

This is how they’re connected. This is the curl world map 2021.

Or put briefly

libcurl is a library for doing transfers specified with a URL, using one of the supported protocols. It is fast, reliable, very portable, well documented and feature rich. A de-facto standard API available for everyone.

Credits

The original island image is by Julius Silver from Pixabay. xkcd strip edits were done by @tsjost.

Mars 2020 Helicopter Contributor

Friends of mine know that I’ve tried for a long time to get confirmation that curl is used in space. We’ve believed it to be likely but I’ve wanted to get a clear confirmation that this is indeed the fact.

Today GitHub posted their article about open source in the Mars mission, and they now provide a badge on their site for contributors of projects that are used in that mission.

I have one of those badges now. Only a few others of the current 879 recorded curl authors got it, which seems to be because the mission used a very old curl release (curl 7.19, released in September 2008) and GitHub couldn’t match all contributors with emails, or the authors didn’t have their emails verified on GitHub etc.

According to that GitHub blog post, we are “almost 12,000” developers who got it.

While this strictly speaking doesn’t say that curl is actually used in space, I think it can probably be assumed to be.

Here’s the interplanetary curl development displayed in a single graph:

See also: screenshotted curl credits and curl supports NASA.

Credits

Image by Aynur Zakirov from Pixabay

curl those funny IPv4 addresses

Everyone knows that on most systems you can specify IPv4 addresses as just 4 decimal numbers separated with periods (dots). Example:

192.168.0.1

Useful when you for example want to ping your local wifi router and similar. “ping 192.168.0.1”

Other bases

The IPv4 string is usually parsed by the inet_addr() function or at times it is passed straight into the name resolver function like getaddrinfo().

This address parser supports more ways to specify the address. You can for example specify each number using either octal or hexadecimal.

Write the numbers with zero-prefixes to have them interpreted as octal numbers:

0300.0250.0.01

Write them with 0x-prefixes to specify them in hexadecimal:

0xc0.0xa8.0x00.0x01

You will find that ping can deal with all of these.

As a 32 bit number

An IPv4 address is a 32 bit number that, when written as 4 separate numbers, is split in 4 parts with 8 bits represented in each number. Each separate number in “a.b.c.d” is 8 bits that combined make up the whole 32 bits. Sometimes the four parts are called quads.

The typical IPv4 address parser however handles more ways than just the 4-way split. It can also deal with the address when specified as one, two or three numbers (separated with dots unless it’s just one).

If given as a single number, it treats it as a single unsigned 32 bit number. The top-most eight bits store what we “normally” write as the first number and so on. The address shown above, if we keep it as hexadecimal, would then become:

0xc0a80001

And you can of course write it in octal as well:

030052000001

and plain old decimal:

3232235521
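As a rough illustration of that arithmetic (a sketch only, with the made-up helper name pack_quad, not how any particular parser is written), the four 8-bit parts can be packed into one 32 bit number with shifts:

```python
def pack_quad(a, b, c, d):
    """Pack the four 8-bit parts of "a.b.c.d" into one 32 bit number."""
    return (a << 24) | (b << 16) | (c << 8) | d

value = pack_quad(192, 168, 0, 1)
print(hex(value))  # 0xc0a80001
print(oct(value))  # 0o30052000001 (Python spells the octal prefix "0o")
print(value)       # 3232235521
```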

As two numbers

If you instead write the IP address as two numbers with a dot in between, the first number is assumed to be 8 bits and the next one a 24 bit one. And you can keep on mixing the bases as you like. The same address again, now in a hexadecimal + octal combo:

0xc0.052000001

This allows for some fun shortcuts when the 24 bit number contains a lot of zeroes. Like you can shorten “127.0.0.1” to just “127.1” and it still works and is perfectly legal.

As three numbers

Now the parts are supposed to be split up in bits like this: 8.8.16. Here’s the example address again, this time mixing hex, octal and decimal:

0xc0.0250.1
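To make the rules above concrete, here is a minimal Python sketch of a parser that accepts one to four numbers in mixed bases. It is an illustration under assumptions and not the code of inet_aton() or of curl: the names parse_number, parse_ipv4 and to_dotted are invented for this example, and real parsers may differ in corner cases.

```python
import re

# One part: hex with 0x prefix, octal with a leading zero, or plain decimal.
_PART = re.compile(r"0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*")

def parse_number(text):
    if not _PART.fullmatch(text):
        raise ValueError(f"bad number: {text!r}")
    if text[:2].lower() == "0x":
        return int(text, 16)
    if text.startswith("0"):
        return int(text, 8)
    return int(text, 10)

def parse_ipv4(address):
    """Return the address as a 32 bit integer, or raise ValueError."""
    parts = [parse_number(p) for p in address.split(".")]
    if len(parts) > 4:
        raise ValueError("expected one to four numbers")
    # Every part but the last is 8 bits; the last one fills the rest.
    *leading, last = parts
    if any(p > 0xFF for p in leading):
        raise ValueError("leading part out of range")
    last_bits = 32 - 8 * len(leading)
    if last >= (1 << last_bits):
        raise ValueError("last part out of range")
    value = 0
    for p in leading:
        value = (value << 8) | p
    return (value << last_bits) | last

def to_dotted(value):
    """Format a 32 bit integer as the usual four decimal numbers."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

for text in ("0xc0.0250.1", "0xc0.052000001", "3232235521", "127.1"):
    print(text, "->", to_dotted(parse_ipv4(text)))
```

The first three spellings come out as 192.168.0.1 and the last as 127.0.0.1.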

Bypassing filters

All of these versions shown above work with most tools that accept IPv4 addresses and sometimes you can bypass filters and protection systems by switching to another format so that you don’t match the filters. It has previously caused problems in node and perl packages and I’m guessing numerous others. It’s a feature that is often forgotten, ignored or just not known.
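A small Python sketch of how such a bypass can happen, and how normalizing first avoids it. The blocklist and function names are hypothetical, and the normalizing variant relies on Python’s socket.inet_aton(), which hands the string to the platform’s parser, so the exact set of accepted formats depends on the system:

```python
import socket

BLOCKED = {"127.0.0.1"}  # hypothetical blocklist, for illustration only

def naive_is_blocked(host):
    # Compares the text as given, so it only catches the exact spelling.
    return host in BLOCKED

def normalized_is_blocked(host):
    # Let the platform parser interpret the address first, then compare.
    try:
        canonical = socket.inet_ntoa(socket.inet_aton(host))
    except OSError:
        # Not accepted as an IPv4 address; a real filter would have to
        # resolve host names too, which this sketch does not attempt.
        return False
    return canonical in BLOCKED

for spelling in ("127.0.0.1", "127.1", "0x7f.1", "2130706433"):
    print(spelling, naive_is_blocked(spelling), normalized_is_blocked(spelling))
```

On a system with the liberal parser, only the first spelling is caught by the naive check, while the normalizing check catches all four.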

It begs the question why this very liberal support was once added and allowed but I’ve not been able to figure that out – maybe because of how it matches class A/B/C networks. The support for this syntax seems to have been introduced with the inet_aton() function in the 4.2BSD release in 1983.

IPv4 in URLs

URLs have a host name in them and it can be specified as an IPv4 address.

RFC 3986

The RFC 3986 URL specification’s section 3.2.2 says an IPv4 address must be specified as:

dec-octet "." dec-octet "." dec-octet "." dec-octet

… but in reality very few clients that accept such URLs actually restrict the addresses to that format. I believe mostly because many programs will pass on the host name to a name resolving function that itself will handle the other formats.
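For comparison, a strict check that only accepts the RFC 3986 form could look roughly like this in Python. It is a sketch: the regular expression follows the dec-octet rule from the RFC (0 to 255, no leading zeros), while the function name is invented for the example.

```python
import re

# dec-octet per RFC 3986: 0-9, 10-99, 100-199, 200-249 or 250-255
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"
STRICT_IPV4 = re.compile(rf"{DEC_OCTET}(?:\.{DEC_OCTET}){{3}}")

def is_rfc3986_ipv4(host):
    """True only for four plain decimal octets separated by dots."""
    return STRICT_IPV4.fullmatch(host) is not None

for host in ("192.168.0.1", "0300.0250.0.01", "0xc0.0xa8.0x00.0x01", "127.1"):
    print(host, is_rfc3986_ipv4(host))
```

Only the first of those four spellings passes the strict check.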

The WHATWG URL Spec

The Host Parsing section of this spec allows the many variations of IPv4 addresses. (If you’re anything like me, you might need to read that spec section about three times or so before that’s clear).

Since the browsers all obey this spec, it’s no surprise that they all allow these kinds of IP numbers in the URLs they handle.

curl before

curl has traditionally been in the camp that mostly accidentally somewhat supported the “flexible” IPv4 address formats. It did this because if you built curl to use the system resolver functions (which it does by default) those system functions will handle these formats for curl. If curl was built to use c-ares (which is one of curl’s optional name resolver backends), using such address formats just made the transfer fail.

The drawback with allowing the system resolver functions to deal with the formats is that curl itself then works with the original formatted host name so things like HTTPS server certificate verification and sending Host: headers in HTTP don’t really work the way you’d want.

curl now

Starting in curl 7.77.0 (since this commit ) curl will “natively” understand these IPv4 formats and normalize them itself.

There are several benefits of doing this ourselves:

  1. Applications using the URL API will get the normalized host name out.
  2. curl will work the same independently of selected name resolver backend
  3. HTTPS works fine even when the address is using other formats
  4. HTTP virtual hosts headers get the “correct” formatted host name

Fun example command line to see if it works:

curl -L 16843009

16843009 gets normalized to 1.1.1.1 which then gets used as http://1.1.1.1 (because curl will assume HTTP for this URL when no scheme is used) which returns a 301 redirect over to https://1.1.1.1/ which -L makes curl follow…
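The arithmetic behind that normalization can be checked with Python’s standard ipaddress module. This only reproduces the number-to-address conversion and says nothing about how curl implements it:

```python
import ipaddress

# 16843009 is 0x01010101, so each of the four bytes is 1.
normalized = str(ipaddress.IPv4Address(16843009))
print(hex(16843009))  # 0x1010101
print(normalized)     # 1.1.1.1
```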

Credits

Image by Thank you for your support Donations welcome to support from Pixabay

curl 7.76.1 – h2 works again

I’m happy to once again present a new curl release to the world. This time we decided to cut the release cycle short and do a quick patch release only two weeks since the previous release. The primary reason was the rather annoying and embarrassing HTTP/2 bug. See below for all the details.

Release presentation

Numbers

the 199th release
0 changes
14 days (total: 8,426)

21 bug-fixes (total: 6,833)
30 commits (total: 27,008)
0 new public libcurl function (total: 85)
0 new curl_easy_setopt() option (total: 288)

0 new curl command line option (total: 240)
23 contributors, 10 new (total: 2,366)
14 authors, 6 new (total: 878)
0 security fixes (total: 100)
0 USD paid in Bug Bounties (total: 5,200 USD)

Bug-fixes

This was a very short cycle but we still managed to merge a few interesting fixes. Here are some:

HTTP/2 selection over HTTPS

This regression is the main reason for this patch release. I fixed an issue before 7.76.0 was released and due to lack of covering tests with other TLS backends, nobody noticed that my fix also broke HTTP/2 selection over HTTPS when curl was built to use one of GnuTLS, BearSSL, mbedTLS, NSS, Schannel, Secure Transport or wolfSSL!

The problem I fixed for 7.76.0: I made sure that no internal code updates the HTTP version choice the user sets, but that it then updates only the internal “we want this version”. Without this fix, an application that reuses an easy handle could without specifically asking for it, get another HTTP version in subsequent requests if a previous transfer had been downgraded. Clearly the fix was only partial.
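The idea of keeping the user’s choice apart from the internal “wanted version” can be shown with a toy model in Python. This is not curl’s code or data structures; the class and attribute names are invented purely to illustrate the separation:

```python
class Handle:
    """Toy model of a reusable handle (hypothetical, not libcurl's)."""

    def __init__(self, requested_version):
        self.requested_version = requested_version  # what the user set
        self.wanted_version = requested_version     # per-transfer working copy

    def start_transfer(self):
        # Each transfer starts over from the user's choice.
        self.wanted_version = self.requested_version

    def downgrade(self, version):
        # Internal code only ever touches the working copy.
        self.wanted_version = version

handle = Handle("2")
handle.start_transfer()
handle.downgrade("1.1")         # this transfer ends up using HTTP/1.1
handle.start_transfer()         # the next transfer asks for HTTP/2 again
print(handle.wanted_version)    # 2
```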

The new fix should make HTTP/2 work and make sure the “wanted version” is used correctly. Fingers crossed!

Progress meter final update in parallel mode

When doing small and quick transfers in parallel mode with the command line tool, the logic could cause the last update call to get skipped!

file: support getting directories again

Another regression. A recent fix made curl not consider directories over FILE:// to show a size (if -I or -i is used). That did however also completely break “getting” such a directory…

HTTP proxy: only loop on 407 + close if we have credentials

When an HTTP(S) proxy returns a 407 response and closes the connection, curl would retry the request to it even if it had no credentials to use. If the proxy just consistently did the same 407 + close, curl would get stuck in a retry loop…

The fixed version now only retries the connection (with auth) if curl actually has credentials to use!
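The fixed behavior can be sketched as a tiny Python simulation. Everything here is an assumption made for illustration (the proxy is just a callable returning a status code and a “connection closed” flag); it is not how curl is structured internally:

```python
def request_via_proxy(proxy, credentials=None):
    """Return the final status code from a (simulated) proxy.

    `proxy` is a callable taking the credentials (or None) and
    returning a (status, connection_closed) tuple.
    """
    status, closed = proxy(None)  # first attempt, without auth
    if status == 407 and closed and credentials is not None:
        # Only worth reconnecting when there is something new to send.
        status, closed = proxy(credentials)
    return status

def always_407(credentials):
    """A proxy that answers 407 and closes, no matter what."""
    return 407, True

print(request_via_proxy(always_407))                         # 407, one attempt
print(request_via_proxy(always_407, credentials="user:pw"))  # 407, two attempts
```

Without credentials the request is attempted once and the 407 is returned, instead of looping.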

Next release cycle

The plan is to make the next cycle two weeks shorter, to get us back on the previously scheduled path. This means that if we open the feature window on Monday, it will be open for just a little over two weeks, then give us three weeks of only bug-fixes before we ship the next release on May 26.

The next one is expected to become 7.77.0. Due to the rather short feature window this coming cycle I also fear that we might not be able to merge all the new features that are waiting to get merged.

talking curl on changelog again

We have almost a tradition now, me and the duo Jerod and Adam of the Changelog podcast. We talk curl and related stuff every three years. Back in 2015 we started out in episode 153 and we did the second one in episode 299 in 2018.

Time flies and now we’re in 2021 and we did again “meet up” virtually and talked curl and related stuff for a while. curl is now 23 years old and I still run the project, a few things have changed since the last curl episode and I asked my twitter friends for what they wanted to know and I think we managed to get a whole bunch of such topics into the mix.

So, here’s the 2021 edition of Daniel on the Changelog podcast: episode 436.

The Changelog 436: Curl is a full-time job (and turns 23) – Listen on Changelog.com

Anyone want to bet if we’ll do it again in 2024?

steps to release curl

I have a lot of different hats and roles in the curl project. One of them is “release manager” and in this post I’ve tried to write down pretty much all the steps I do to prepare and ship a curl release at the end of every release cycle in the project.

I’ve handled every curl release so far. All 198 of them. While the process certainly wasn’t this formal or extensive in the beginning, we’ve established a set of steps that have worked fine for us, that have been mostly unchanged for maybe ten years by now.

There’s nothing strange or magic about it. Just a process.

Release cycle

A typical cycle between two releases starts on a Wednesday when we do a release. We always release on Wednesdays. A complete and undisturbed release cycle is always exactly 8 weeks (56 days).

The cycle starts with us taking the remainder of the release week to observe the incoming reports to judge if there’s a need for a follow-up patch release or if we can open up for merging features again.

If there were no significant enough problems found in the first few days, we open the “feature window” again on the Monday following the release. Having the feature window open means that we accept new changes and new features getting merged – if anyone submits such a pull-request in a shape ready for merge.

If there was an issue found to be important enough to warrant a patch release, we instead schedule a new release date and make the coming cycle really short and without opening the feature window. There aren’t any set rules or guidelines to help us judge this. We play this by ear and go with what feels like the right action for our users.

Closing the feature window

When there are exactly 4 weeks left to the pending release we close the feature window. This gives us a period where we only merge bug-fixes and all features are put on hold until the window opens again. 28 days to polish off all sharp corners and fix as many problems as we can for the coming release.
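The dates of an undisturbed cycle follow directly from these rules. Here is a small Python sketch of the date arithmetic; the function name and the example start date (a Wednesday picked for illustration) are assumptions of this example, not part of any curl tooling:

```python
from datetime import date, timedelta

def release_cycle(release_day):
    """Dates for one undisturbed cycle starting at a Wednesday release."""
    if release_day.weekday() != 2:  # Monday is 0, so Wednesday is 2
        raise ValueError("releases happen on Wednesdays")
    next_release = release_day + timedelta(weeks=8)  # 56 days later
    return {
        "feature window opens": release_day + timedelta(days=5),  # next Monday
        "feature window closes": next_release - timedelta(weeks=4),
        "next release": next_release,
    }

for label, day in release_cycle(date(2021, 3, 31)).items():
    print(f"{label}: {day.isoformat()}")
```

With that example start date, the window opens on 2021-04-05, closes on 2021-04-28, and the next release lands on 2021-05-26.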

Contributors can still submit pull-requests for new stuff and we can review them and polish them, but they will not be merged until the window is reopened. This period is for focusing on bug-fixes.

We have a web page that shows the feature window’s status and I email the mailing list when the status changes.

Slow down

A few days before the pending release we try to slow down and only merge important bug-fixes and maybe hold off the less important ones to reduce risk.

This is a good time to run our copyright.pl script that checks copyright ranges of all files in the git repository and makes sure they are in sync with recent changes. We only update the copyright year ranges of files that we actually changed this year.

Security fixes

If we have pending security fixes to announce in the coming release, those have been worked on in private by the curl security team. Since all our test infrastructure is public we merge our security fixes into the main source code and push them approximately 48 hours before the planned release.

These 48 hours are necessary for CI and automatic build jobs to verify the fixes and still give us time to react to problems this process reveals and the subsequent updates and rinse-repeats etc until everyone is happy. All this testing is done using public code and open infrastructure, which is why we need the code to be pushed for this to work.

At this time we also have detailed security advisories written for each vulnerability that are ready to get published. The advisories are stored in the website repository and have been polished by the curl security team and the reporters of the issues.

Release notes

The release notes for the pending release is a document that we keep in sync and updated at a regular interval so that users have a decent idea of what to expect in the coming release – at all times.

It is basically a matter of running the release-notes.pl script, cleaning up the list of bug-fixes, then running the contributors.sh script to update the list of contributors to the release so far, and then committing it with the proper commit message.

At release-time, the work on the release notes is no different than the regular maintenance of it. Make sure it reflects what’s been done in the code since the previous release.

Tag

When everything is committed to git for the release, I tag the repository. The name and format of the tag is set in stone for historical reasons to be curl-[version] where [version] is the version number with underscores instead of periods. Like curl-7_76_0 for curl 7.76.0. I sign and annotate the tag using git.
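The tag naming rule is simple enough to express in one line of Python. The helper name tag_name is made up for this illustration:

```python
def tag_name(version):
    """Turn a version like "7.76.0" into the tag name "curl-7_76_0"."""
    return "curl-" + version.replace(".", "_")

print(tag_name("7.76.0"))  # curl-7_76_0
```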

git push

Make sure everything is pushed. Git needs the --tags option to push the new tag.

maketgz

Our script that builds a full release tarball is called maketgz. This script is also used to produce the daily snapshots of curl that we provide and we verify that builds using such tarballs work in the CI.

The output from maketgz is four tarballs. They’re all the exact same content, just different compressions and archive formats: gzip, bz2, xz and zip.

The output from this script is the generated release at the point in time of the git tag. The tarballs’ contents are thus not all found (identically) in git (or GitHub). The release is the output of this script.

Upload

I GPG sign the four tarballs and upload them to the curl site’s download directory. Uploading them takes just a few seconds.

The upload of the packages doesn’t actually update anything on the site and they will not be published just because of this. It needs a little more on the website end.

Edit release on GitHub

Lots of users get their release off GitHub directly so I make sure to edit the tag there to make it a release and I upload the tarballs there. By providing the release tarballs there I hope that I lower the frequency of users downloading the state of the git repo from the tag assuming that’s the same thing as a release.

As mentioned above: a true curl release is a signed tarball made with maketgz.

Web site

The curl website at curl.se is managed with the curl-www git repository. The site automatically updates and syncs with the latest git contents.

To get a release done and appear on the website, I update three files on the site. They’re fairly easy to handle:

  1. Makefile contains the latest release version number, release date and the planned date for the next release.
  2. _changes.html is the changelog of changes done per release. The secret to updating this is to build the web site locally and use the generated file dev/release-notes.gen to insert into the changelog. It’s mostly a copy and paste. That generated file is built from the RELEASE-NOTES that’s present in the source code repo.
  3. _newslog.html is used for the “latest news” page on the site. Just mention the new release and link to details.

If there are security advisories for this release, they are also committed to the docs/ directory using their CVE names according to our established standard.

Tag

I tag the website repository as well, using the exact same tag name as I did in the source code repository, just to allow us to later get an idea of the shape of the site at the time of this particular release. Even if we don’t really “release” the website.

git push

Using the --tags option again I push the updates to the website with git.

The website, being automatically synced with the git repository, will then very soon get the news about the release and rebuild the necessary pages on the site and the new release is then out and shown to the world. At least those who saw the git activity and visitors of the website. See also the curl website infrastructure.

Now it’s time to share the news to the world via some more channels.

Post blog

I start working on the release blog post perhaps a week before the release. I then work on it on and off and when the release is getting closer I make sure to tie all loose ends and finalize it.

Recently I’ve also created a new “release image” for the particular curl release I do so if I feel inspired I do that too. I’m not really skilled or talented enough for that, but I like the idea of having a picture for this unique release – to use in the blog post and elsewhere when talking about this version. Even if that’s a very ephemeral thing as this specific version very soon appears in my rear view mirror only…

Email announcements

Perhaps the most important release announcement is done per email. I inform curl-users, curl-library and curl-announce about it.

If there are security advisories to announce in association with the release, those are also sent individually to the same mailing lists and the oss-security mailing list.

Tweet about it

I’m fortunate enough to have a lot of twitter friends and followers so I also make sure they get to know about the new release. Follow me there to get future tweets.

Video presentation

On the day of the release I do a live-streamed presentation of it on twitch.

I create a small slide set and go through basically the same things I mention in my release blog post: security issues, new features and a look at some bug-fixes we did for this release that I find interesting or note-worthy.

Once it is streamed, recorded and published on YouTube, I update my release blog post and embed the presentation there and I add a link to the presentation on the changelog page on the curl website.

A post-release relief

Immediately after having done all the steps for a release, when it’s uploaded, published, announced, discussed and presented, I can take a moment to lean back and enjoy the moment.

I then often experience a sense of calmness and relaxation. I get an extra cup of coffee, put my feet up and just go… aaaah. Before any new bugs have arrived, when the slate is still clean so to speak. That’s a mighty fine moment and I cherish it.

It never lasts very long. I finish that coffee, get my feet down again and get back to work. There are pull requests to review that might soon be ready for merge when the feature window opens and there are things left to fix that we didn’t get to in this past release that would be awesome to have done in the next!

Can we open the feature window again on the coming Monday?

Credits

Coffee Image by Karolina Grabowska from Pixabay

20,000 github stars

In September 2018 I celebrated 10,000 stars, up from 5,000 back in May 2017. We made 1,000 stars on August 12, 2014.

Today I’m cheering for the 20,000 stars curl has received on GitHub.

It is worth repeating that this is just a number without any particular meaning or importance. It just means 20,000 GitHub users clicked the star symbol for the curl project over at curl/curl.

At exactly 08:15:23 UTC today we reached this milestone. Checked with a curl command line like this:

$ curl -s https://api.github.com/repos/curl/curl | jq '.stargazers_count'
20000

(By the time I get around to finalizing this post, the count has already gone up to 20087…)

To celebrate this occasion, I decided I was worth a beer and this time I went with a hand-written note. The beer was a Swedish hazy IPA called Amazing Haze from the brewery Stigbergets. One of my current favorites.

Photos from previous GitHub-star celebrations:

Where is HTTP/3 right now?

tldr: the level of HTTP/3 support in servers is surprisingly high.

The specs

The specifications are all done. They’re now waiting in queues to get their final edits and approvals before they will get assigned RFC numbers and get published as such – they will not change any further. That’s a set of RFCs (six I believe) for various aspects of this new stack. The HTTP/3 spec is just one of those. Remember: HTTP/3 is the application protocol done over the new transport QUIC. (See http3 explained for a high-level description.)

The HTTP/3 spec was written to refer to, and thus depend on, two other HTTP specs that are in the works: httpbis-cache and httpbis-semantics. Those two are mostly clarifications and cleanups of older HTTP specs, but this forces the HTTP/3 spec to get published after the other two, which might introduce a small delay compared to the other QUIC documents.

The working group has started to take on work on new specifications for extensions and improvements beyond QUIC version 1.

HTTP/3 Usage

In early April 2021, the usage of QUIC and HTTP/3 in the world is measured by a few different companies.

QUIC support

netray.io scans the IPv4 address space weekly and checks how many hosts speak QUIC. Their latest scan found 2.1 million such hosts.

Arguably, the netray number doesn’t say much. Those two million hosts could be very well used or barely used machines.

HTTP/3 by w3techs

w3techs.com has been in the game of scanning web sites for stats purposes for a long time. They scan the top ten million sites and count how large a share of them runs or supports which technologies, and they also check for HTTP/3. In their data they call the old Google QUIC just “QUIC”, which is confusing, but that should be seen as the precursor to HTTP/3.

What stands out to me in this data, apart from the HTTP/3 usage seeming very high: the top one-million sites are claimed to have a higher share of HTTP/3 support (16.4%) than the top one-thousand (11.9%)! That’s the reverse of HTTP/2 and not how stats like this tend to look.

It has been suggested that the growth starting at Feb 2021 might be explained by Cloudflare’s enabling of HTTP/3 for users also in their free plan.

HTTP/3 by Cloudflare

On radar.cloudflare.com we can see Cloudflare’s view of a lot of Internet and protocol trends over the world.

The last 30 days according to radar.cloudflare.com

This HTTP/3 number is significantly lower than w3techs’. Presumably because of the differences in how they measure.

Clients

The browsers

All the major browsers have HTTP/3 implementations and most of them allow you to manually enable it if it isn’t already enabled. Chrome and Edge have it enabled by default and Firefox will do so very soon. The caniuse.com site shows it like this (updated on April 4):

(Earlier versions of this blog post showed the previous and inaccurate data from caniuse.com. Not anymore.)

curl

curl has supported HTTP/3 for a while, but you need to explicitly enable it at build-time. It needs to use third party libraries for the HTTP/3 layer and it needs a QUIC capable TLS library. The QUIC/h3 libraries are still beta versions. See below for the TLS library situation.

curl’s HTTP/3 support is not even complete. There are still unsupported areas and it’s not considered stable yet.

Other clients

Facebook has previously talked about how they use HTTP/3 in their app, and presumably others do as well. There are of course also other implementations available.

TLS libraries

curl supports 14 different TLS libraries at this time. Two of them have QUIC support landed: BoringSSL and GnuTLS. And a third would be the quictls OpenSSL fork. (There are also a few other smaller TLS libraries that support QUIC.)

OpenSSL

The by far most popular TLS library to use with curl, OpenSSL, has postponed their QUIC work:

“It is our expectation that once the 3.0 release is done, QUIC will become a significant focus of our effort.”

At the same time they have delayed the OpenSSL 3.0 release significantly. Their release schedule page still today speaks of a planned release of 3.0.0 in “early Q4 2020”. That plan expects a few months from the beta to final release and we have not yet seen a beta release, only alphas.

Realistically, this makes QUIC in OpenSSL many months off until it can appear even in a first alpha. Maybe even 2022 material?

BoringSSL

The Google powered OpenSSL fork BoringSSL has supported QUIC for a long time and provides the OpenSSL API, but they don’t do releases and mostly focus on getting a library done for Google. People outside the company are generally reluctant to use and depend on this library for those reasons.

The quiche QUIC/h3 library from Cloudflare uses BoringSSL and curl can be built to use quiche (as well as BoringSSL).

quictls

Microsoft and Akamai have made a fork of OpenSSL available that is based on OpenSSL 1.1.1 and has the QUIC pull-request applied in order to offer a QUIC capable OpenSSL flavor to the world before the official OpenSSL gets their act together. This fork is called quictls. This should be compatible with OpenSSL in all other regards and provide QUIC with an API that is similar to BoringSSL’s.

The ngtcp2 QUIC library uses quictls. curl can be built to use ngtcp2 as well as quictls.

Is HTTP/3 faster?

I realize I can’t blog about this topic without at least touching this question. The main reason for adding support for HTTP/3 on your site is probably that it makes it faster for users, so does it?

According to Cloudflare’s tests, it does, but the difference is not huge.

We’ve seen other numbers before that say h3 is faster, but it’s hard to find up-to-date performance measurements published for the current version of HTTP/3 vs HTTP/2 in real world scenarios. Partly of course because people have hesitated to compare before there are proper implementations to compare with, and not just development versions that were never made or tweaked to perform optimally.

I think there are reasons to expect h3 to be faster in several situations, but for people with high bandwidth low latency connections in the western world, maybe the difference won’t be noticeable?

Future

I’ve previously shown the slide below to illustrate what needs to be done for curl to ship with HTTP/3 support enabled in distros and “widely” and I think the same works for a lot of other projects and clients who don’t control their TLS implementation and don’t write their own QUIC/h3 layer code.

This house of cards of h3 is slowly getting some stable components, but there are still too many moving parts for most of us to ship.

I assume that the rest of the browsers will also enable HTTP/3 by default soon, and the specs will be released not too long into the future. That will make HTTP/3 traffic on the web increase significantly.

The QUIC and h3 libraries will ship their first non-beta versions once the specs are out.

The TLS library situation will continue to hamper wider adoption among non-browsers and smaller players.

The big players already deploy HTTP/3.

Updates

I’ve updated this post after the initial publication, and the biggest corrections are in the Chrome/Edge details. Thanks to immediate feedback from Eric Lawrence. Remaining errors are still all mine! Thanks also to Barry Pollard who filed the PR to update the previously flawed caniuse.com data.

curl 7.76.0 adds rustls

I’m happy to announce that we yet again completed a full eight week release cycle and as customary, we end it with a fresh release. Enjoy!

Release presentation

Numbers

the 198th release
6 changes
56 days (total: 8,412)

130 bug fixes (total: 6,812)
226 commits (total: 26,978)
0 new public libcurl functions (total: 85)
3 new curl_easy_setopt() options (total: 288)

3 new curl command line options (total: 240)
58 contributors, 34 new (total: 2,356)
24 authors, 11 new (total: 871)
2 security fixes (total: 100)
800 USD paid in Bug Bounties (total: 5,200 USD)

Security

Automatic referer leaks

CVE-2021-22876 is the first curl CVE of 2021.

libcurl did not strip off user credentials from the URL when automatically populating the Referer: HTTP request header field in outgoing HTTP requests, and therefore risks leaking sensitive data to the server that is the target of the second HTTP request.

libcurl automatically sets the Referer: HTTP request header field in outgoing HTTP requests if the CURLOPT_AUTOREFERER option is set. With the curl tool, it is enabled with --referer ";auto".
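To illustrate the kind of scrubbing that was missing, here is a minimal Python sketch. It is not libcurl’s actual C code, and the function name referer_from_url is made up for this example. It rebuilds a URL without the user and password before the URL is used as a Referer: value, and it also drops the fragment, since the HTTP specification says a Referer must carry neither.

```python
from urllib.parse import urlsplit, urlunsplit


def referer_from_url(url: str) -> str:
    """Return url without its user:password@ part and without its fragment.

    Hypothetical helper for illustration only; libcurl does this in C.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # urlsplit() strips the brackets from IPv6 literals, so put them back.
    if ":" in host:
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Rebuild the URL from scheme, host[:port], path and query only.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, ""))


print(referer_from_url("https://user:secret@example.com:8443/a/b?q=1#frag"))
# prints: https://example.com:8443/a/b?q=1
```

Rebuilding the URL from its parsed components, rather than searching for an "@" in the string, avoids being fooled by an "@" that appears in the path or query.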

Rewarded with 800 USD

TLS 1.3 session ticket proxy host mixup

CVE-2021-22890 is a flaw in curl’s OpenSSL backend that allows a malicious HTTPS proxy to trick curl with session tickets and subsequently allow the proxy to MITM the remote server. The problem only exists with OpenSSL and it needs to speak TLS 1.3 with the HTTPS proxy – and the client must accept the proxy’s certificate, which has to be specially crafted for the purpose.

Note that an HTTPS proxy is different than the more common HTTP proxy.

The reporter declined offered reward money.

Changes

We list 6 “changes” this time around. They are…

support multiple -b parameters

The command line option for setting cookies can now be used multiple times on the command line to specify multiple cookies. Either by setting cookies by name or by providing the name of a file to read cookie data from.

add --fail-with-body

The command line tool has had the --fail option for a very long time. This new option is very similar, but with a significant difference: this new option saves the response body first even if it returns an error due to HTTP response code that is 400 or larger.
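The difference between the two options can be modeled in a few lines. This is a simplified Python sketch of the behavior described above, not how the curl tool is implemented. The mode labels and the finish_transfer name are invented for the example, and it assumes the new option exits with the same code 22 that the manual documents for --fail.

```python
from typing import NamedTuple

# Exit code the curl manual documents for --fail; assumed here to be
# the same for --fail-with-body.
HTTP_ERROR_EXIT = 22


class Outcome(NamedTuple):
    exit_code: int
    saved_body: bytes


def finish_transfer(status: int, body: bytes, mode: str) -> Outcome:
    """Model what ends up saved and which exit code the tool returns.

    mode is one of the hypothetical labels "plain", "fail" or
    "fail-with-body".
    """
    if mode not in ("plain", "fail", "fail-with-body"):
        raise ValueError(f"unknown mode: {mode}")
    is_error = status >= 400
    if mode == "plain" or not is_error:
        # Without either option, or on success, the body is saved as usual.
        return Outcome(0, body)
    if mode == "fail":
        # --fail: report the error and save nothing.
        return Outcome(HTTP_ERROR_EXIT, b"")
    # --fail-with-body: report the error but keep the response body.
    return Outcome(HTTP_ERROR_EXIT, body)


print(finish_transfer(404, b"not found", "fail"))
print(finish_transfer(404, b"not found", "fail-with-body"))
```

The second call is the new case: the caller still gets a failing exit code but can also read whatever error details the server sent.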

add DoH options to disable TLS verification

When telling curl to use DoH to resolve host names, you can now specify that curl should ignore the TLS certificate verification for the DoH server only. Independently of how it treats other TLS servers that might be involved in the transfer.

read and store the HTTP referer header

This is done with the new CURLINFO_REFERER libcurl option and with the command line tool, --write-out '%{referer}'.

support SCRAM-SHA-1 and SCRAM-SHA-256 for mail auth

For SASL authentication done with mail-using protocols such as IMAP and SMTP.

A rustls backend

A new optional TLS backend. This is provided via crustls, a C API for the rustls TLS library.

Some Interesting bug-fixes

Again we’ve logged over a hundred fixes in a release, so here are some of my favorite corrections we did this time:

curl: set CURLOPT_NEW_FILE_PERMS if requested

Due to a silly mistake in the previous release, the new --create-file-mode didn’t actually work because it didn’t set the permissions with libcurl properly – but now it does.

share user’s resolve list with DOH handles

When resolving host names with DoH, the transfers done for that purpose now “inherit” the same --resolve info as used for the normal transfer, which I guess most users already just presumed it did…

bump the max HTTP request size to 1MB

Virtually all internal buffers have length restrictions for security and the maximum size we allowed for a single HTTP request was previously 128 KB. A user with a use-case sending a single 300 KB header turned up and now we allow HTTP requests to be up to 1 MB! I can’t recommend doing it, but now at least curl supports it.

allow SIZE to fail when doing (resumed) FTP upload

In a recent change I made SIZE failures get treated as a “file not found” error, but that introduced a regression for resumed uploads: when resuming a file upload and there’s nothing uploaded previously, SIZE is expected to fail and that is fine.

fix memory leak in ftp_done

The torture tests scored another victory when they proved that when the connection failed at just the correct moment after an FTP transfer is complete, curl could skip a free() and leak memory.

fail if HTTP/2 connection is terminated without END_STREAM

When an HTTP/2 connection is (prematurely) terminated, streams over that connection could return “closed” internally without noticing the premature part. As there was no previous END_STREAM message received for the stream(s), curl should consider that an error and now it does.

don’t set KEEP_SEND when there’s no more HTTP/2 data to be sent

A rare race condition in the HTTP/2 code could make libcurl remain expecting to send data when in reality it had already delivered the last chunk.

With HTTP, use credentials from transfer, not connection

Another cleanup in code that had the potential to go wrong in the future and mostly worked right now due to lucky circumstances. In HTTP each request done can use its own set of credentials, so it is vital to not use “connection bound” credentials but rather the “transfer oriented” set. That way streams and requests using different credentials will work fine over a single connection even when future changes alter code paths.

lib: remove ‘conn->data’ completely

A rather large internal refactor that shouldn’t be visible on the outside to anyone: transfer objects now link to the corresponding connection object like before, but now connection objects do not link to any transfer object. Many transfers can share the same connection.

adapt to OpenSSL v3’s new const for a few API calls

The seemingly never-ending work to make a version 3 of OpenSSL keeps changing the API and curl is adapting accordingly so that we are prepared and well functioning with this version once it ships “for real” in the future.

Close the connection when downgrading from HTTP/2 to HTTP/1

Otherwise libcurl is likely to reuse the same (wrong) connection again in the next transfer attempt since the connection reuse logic doesn’t take downgrades into account!

Cap initial HTTP body data amount during send speed limiting

The rate limiting logic was previously not correctly applied to the initial body chunk that libcurl sends. This showed, for example, if you told libcurl to send 50K of data with CURLOPT_POSTFIELDS and limited the sending rate to 5K/second.
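With those numbers, a correctly limited upload should take about ten seconds: 50K divided by 5K per second. The Python sketch below is a deliberately simplified model of that arithmetic, not libcurl’s real scheduling. The send_schedule function is hypothetical, and the assumption that the uncapped first chunk contains the whole body is mine, made only to show the effect.

```python
from typing import List, Tuple


def send_schedule(total_bytes: int, limit_per_sec: int,
                  cap_first_chunk: bool) -> List[Tuple[int, int]]:
    """Return (second, bytes_sent) pairs for a simplified sender.

    Hypothetical model: one chunk is sent per second, and when
    cap_first_chunk is False the first chunk ignores the limit.
    """
    schedule = []
    remaining = total_bytes
    second = 0
    while remaining > 0:
        if second == 0 and not cap_first_chunk:
            # Modeled old behavior: the initial chunk bypasses the limit.
            chunk = remaining
        else:
            # Never send more than the limit allows in one second.
            chunk = min(remaining, limit_per_sec)
        schedule.append((second, chunk))
        remaining -= chunk
        second += 1
    return schedule


K = 1024
print(len(send_schedule(50 * K, 5 * K, cap_first_chunk=True)))   # prints: 10
print(send_schedule(50 * K, 5 * K, cap_first_chunk=False))       # prints: [(0, 51200)]
```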

Celebratory drink

I’ll go for an extra fine cup of coffee today after I posted this. I think I’m worth it. I bet you are too. Go ahead and join me: Hooray for another release!

HOWTO backdoor curl

I’ve previously blogged about the possible backdoor threat to curl. This post might be a little repeat but also a refresh and renewed take on the subject several years later, in the shadow of the recent PHP backdoor commits of March 28, 2021. Nowadays, “supply chain attacks” is a hot topic.

Since you didn’t read that PHP link: an unknown project outsider managed to push a commit into the PHP master source code repository with a change (made to look as if done by two project regulars) that obviously inserted a backdoor that could execute custom code when a client tickled a modified server the right way.

Partial screenshot of a diff of the offending commit in question

The commits were apparently detected very quickly. I haven’t seen any proper analysis on exactly how they were performed, but to me that’s not the ultimate question. I rather talk and think about this threat in a curl perspective.

PHP is extremely widely used and so is curl, but where PHP is (mostly) server-side running code, curl is client-side.

How to get malicious code into curl

I’d like to think about this problem from an attacker’s point of view. There are but two things an attacker needs to do to get a backdoor in and a third adjacent step that needs to happen:

  1. Make a backdoor change that is hard to detect and appears innocent to a casual observer, while actually still being able to do its “job”
  2. Get that change landed in the master source code repository branch
  3. The code needs to be included in a curl release that is used by the victim/target

These are not simple steps. The third step, getting into a release, is not strictly always necessary because there are sometimes people and organizations that run code off the bleeding edge master repository (against our advice I should add).

Writing the backdoor code

As was seen in this PHP attack, it failed rather miserably at step 1, making the attack code look innocuous, although we can suspect that maybe that was done on purpose. In 2010 there was a lengthy discussion about an alleged backdoor in OpenBSD’s IPSEC stack that presumably had been in place for years. Even though that particular backdoor was never proven to be real, the idea that it can be done certainly is.

Every time we fix a security problem in curl there’s that latent nagging question in the back of our collective minds: was this flaw placed here deliberately? Historically, we’ve not seen any such attacks against curl. I can tell this with a high degree of certainty since almost all of the existing security problems detected and reported in curl were introduced by me…!

The best attack code would probably do something minor that would have a huge impact in a special context for which the attacker has planned to use it. I mean minor as in doing a NULL-pointer dereference or doing a use-after-free or something. This, because doing a full-fledged generic stack based buffer overflow is much harder to land undetected. Maybe going with a single-byte overwrite outside of a malloc could be the way, like it was back in 2016 when such a flaw in c-ares was used as the first step in a multi-flaw exploit sequence to execute remote code as root on ChromeOS…

Ideally, the commit should also include an actual bug-fix that would be the public facing motivation for it.

Get that code landed in the repo

Okay let’s imagine that you have produced code that actually is a useful bug-fix or feature addition but with an added evil twist, and you want that landed in curl. I can imagine several different theoretical ways to do it:

  1. A normal pull-request and land using the normal means
  2. Tricking or forcing a user with push rights to circumvent the review process
  3. Use a weakness somewhere and land the code directly without involving existing curl team members

The Pull Request method

I’ve never seen this attempted. Submit the pull-request to the project through the usual means and argue that the commit fixes a bug – which could be true.

This means the backdoor patch has to pass all testing and reviews with flying colors to get merged. I’m not saying this is impossible, but I will claim that it is very hard and also a very big gamble by an attacker. Presumably it is a fairly big job just to get the code for this attack to work, so maybe going with a less risky way to land the code is then preferable? But then which way is likely to have the most reliable outcome?

The tricking a user method

Social engineering is very powerful. I can’t claim that our team is immune to that so maybe there’s a way an outsider could sneak in behind our imaginary personal walls and make us take a shortcut for a made up reason that then would circumvent the project’s review process.

We can even include more forced “convincing” such as direct threats against persons or their families: “push this code or else…”. This way of course cannot be protected against using 2fa, better passwords or things like that. Forcing a user to do it is also likely to eventually get known and then immediately get the commit reverted.

Tricking a user doesn’t make the commit avoid testing and scrutinizing after the fact. When the code has landed, it will be scanned and tested in a hundred CI jobs that include a handful of static code analyzers and memory/address sanitizers.

Tricking a user could land the code, but it can’t make it stick unless the code is written as the perfect stealth change. The attack code really needs to be that good to work out. Additionally: circumventing the regular pull-request + review procedure is unusual so I believe it is likely that such a commit will be reviewed and commented on after the fact, and there might then be questions about it and even likely follow-up actions.

The exploiting a weakness method

A weakness in this context could be a security problem in the hosting software or even a rogue admin in the company that hosts the main source code git repo. Something that allows code to get pushed into the code repository without it being the work of one of the existing team members. This seems to be the method that the PHP attack was done through.

This is a hard method as well. Not only does it shortcut reviews, it is also done in the name of someone on the team who knows for sure that they didn’t do the commit, and again, the commit will be tested and poked at anyway.

For all of us who sign our git commits, detecting such a forged commit is easy and quickly done. In the curl project we don’t have mandatory signed commits so the lack of a signature won’t actually block it. And who knows, a weakness somewhere could even possibly find a way to bypass such a requirement.

The skip-git-altogether methods

As I’ve described above, it is really hard even for a skilled developer to write a backdoor and have that landed in the curl git repository and stick there for longer than just a very brief period.

If the attacker instead can just sneak the code directly into a release archive then it won’t appear in git, it won’t get tested and it won’t get easily noticed by team members!

curl release tarballs are made by me, locally on my machine. After I’ve built the tarballs I sign them with my GPG key and upload them to the curl.se origin server for the world to download. (Web users don’t actually hit my server when downloading curl. The user visible web site and downloads are hosted by Fastly servers.)

An attacker that would infect my release scripts (which btw are also in the git repository) or do something to my machine could get something into the tarball and then have me sign it and then create the “perfect backdoor” that isn’t detectable in git and requires someone to diff the release with git in order to detect – which usually isn’t done by anyone that I know of.

But such an attacker would not only have to breach my development machine, such an infection of the release scripts would also be awfully hard to pull off. Not impossible of course. I of course do my best to maintain proper login hygiene, updated operating systems and use of safe passwords and encrypted communications everywhere. But I’m also a human so I’m bound to make occasional mistakes.

Another way could be for the attacker to breach the origin download server and replace one of the tarballs there with an infected version, and hope that people skip verifying the signature when they download it or otherwise notice that the tarball has been modified. I do my best at maintaining server security to keep that risk to a minimum. Most people download the latest release, and then it’s enough if a subset checks the signature for the attack to get revealed sooner rather than later.
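The check mentioned above that hardly anyone does, diffing a release tarball against the git tree, could be sketched roughly like this in Python. All function names are made up for the example, and it assumes the tarball wraps everything in a single top-level directory. Release tarballs also contain generated files (such as configure) that are not in git, so files that only exist in the tarball need a human look rather than an automatic verdict.

```python
import hashlib
import tarfile
from pathlib import Path
from typing import Dict, List, Tuple


def sha256_bytes(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def tarball_hashes(tar_path: str) -> Dict[str, str]:
    """Map each regular file in the tarball to its SHA-256 digest."""
    hashes = {}
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Drop the assumed leading "curl-x.y.z/" directory component.
            rel = member.name.split("/", 1)[-1]
            hashes[rel] = sha256_bytes(tar.extractfile(member).read())
    return hashes


def tree_hashes(root: str) -> Dict[str, str]:
    """Map each file in a source checkout to its SHA-256, skipping .git."""
    root_path = Path(root)
    hashes = {}
    for path in root_path.rglob("*"):
        rel = path.relative_to(root_path)
        if path.is_file() and ".git" not in rel.parts:
            hashes[rel.as_posix()] = sha256_bytes(path.read_bytes())
    return hashes


def compare_release(tar_path: str, root: str) -> Tuple[List[str], List[str]]:
    """Return (files only in the tarball, files whose content differs)."""
    tar_h = tarball_hashes(tar_path)
    tree_h = tree_hashes(root)
    only_in_tarball = sorted(set(tar_h) - set(tree_h))
    modified = sorted(name for name in set(tar_h) & set(tree_h)
                      if tar_h[name] != tree_h[name])
    return only_in_tarball, modified
```

Calling compare_release() with the path to a downloaded tarball and the path to a checkout of the matching release tag returns two lists. Anything in the second list is a file that exists in both places but with different content, which is the interesting one.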

The further-down-the-chain method

As an attacker, get into the supply chain somewhere else: find a weaker link in the chain between the curl release tarball and the target system for your attack. If you can trick or social engineer someone else along the way to get your evil curl tarball used there instead of the actual upstream tarball, that might be easier and give you more bang for your buck. Perhaps you target your particular distribution’s or Operating System’s release engineers and pretend to be from the curl project, make up a story and send over a tarball to help them out…

Fake a security advisory and send out a bad patch directly to someone you know builds their own curl/libcurl binaries?

Better ways?

If you can think of other/better ways to get malicious code via curl code into a victim’s machine, let me know! If you find a security problem, we will reward you for it!

Similarly, if you can think of ways or practices on how we can improve the project to further increase our security I’ll be very interested. It is an ever-moving process.

Dependencies

Added after the initial post. Lots of people have mentioned that curl can get built with many dependencies and maybe one of those would be an easier or better target. Maybe they are, but they are products of their own individual projects and an attack on those projects/products would not be an attack on curl or backdoor in curl by my way of looking at it.

In the curl project we ship the source code for curl and libcurl, and the users, the ones that build the binaries from that source code, will get the dependencies too.

Credits

Image by SeppH from Pixabay