Tag Archives: cURL and libcurl

Copyright without years

Like so many other software projects the curl project has copyright mentions at the top of almost every file in the source code repository. Like

Copyright (C) 1998 - 2022, Daniel Stenberg ...

Over the years we have used a combination of scripts and manual edits to update the ending year in that copyright line to match the year of the latest update of that file.

As soon as we started a new year and someone updated a file, the copyright range needed update. Scripts and tools made it less uncomfortable, but it was always somewhat of a pain to remember and fix.

In 2023 this changed

When the year was again bumped and the first changes of the year were done to curl, we should then consequentially start updating years again to make ranges end with 2023.

Only this time someone asked me why? and it made me decide that what the heck, let’s completely rip them out instead! Doing it at the beginning of the year is also a very good moment.

Do we need the years?

The Berne Convention states that copyright “must be automatic; it is prohibited to require formal registration”.

The often-used copyright lines are not necessary to protect our rights. According to the Wikipedia page mentioned above, the Berne Convention has been ratified by 181 states out of 195 countries in the world.

They can still serve a purpose as they are informational and make the ownership question quite clear. The year ranges add questionable value though.

I have tried to find resources that argue for the importance of the copyright years to be stated and present, but I have not found any credible sources. Possibly because I haven’t figured out where to look.

Not alone

It turns out quite a few projects run by many different organizations or even huge companies have already dropped the years from their source code header copyright statements. Presumably at least some of those giant corporations have had their legal departments give a green light to the idea before they went ahead and published source code that way to the world.

Low risk

We own the copyrights no matter if the years are stated or not. The exact years the files were created or edited can still easily be figured out since we use version control, should anyone ever actually care about it. And we give away curl for free, under an extremely liberal license.

I don’t think we risk much by doing this move.

January 3, 2023

On this day I merged commit 2bc1d775f510, which updated 1856 files and removed copyright years from almost everywhere in the source code repository.

I decided to leave them in the main license file. Partly because this is a file that lots of companies include in their products and I have had some use of seeing the year ranges in there in the past!

Bliss

Now we can forget about copyright years in the project. It’s a relief!

An m1 for curl

A generous member of the wider curl community stepped up and donated an unused Mac mini m1 model to me to be used for curl development. Today it arrived at my home. An 8C CPU/16GB/1TB/8C GPU/1GbE model as per the sticker on the box.

The m1 mac mini, still wrapped in plastic.

Apple is not helping

Apple has shipped and used curl in their products for twenty years but they never assist, help or otherwise contribute to the development. They also don’t sponsor us in any way, like with hardware.

Yet, there are many curl users on the different Apple platforms and sometimes these users run into issues that are unique to those platforms and are challenging to address without direct access to such.

For curl

I decided to accept this gift as I believe it might help the project, but this is not a guarantee or promise that I will run around and become the mac support guy in the project. It will just allow me to sometimes get a better grip and ability to help out.

I will also offer other curl committers access to the machine in case of need. For development and debugging and whatnot. Talk to me about it.

A tiny speed comparison

My Intel-based development machine runs Linux, is ten years old and is equipped with an i7-3770K CPU at 3.5GHz. The source code is stored on an OCZ-VERTEX4 SSD on the Intel, the mac has SSD storage only.

Here’s a rough and not very scientific test of some of my most common build activities on the m1+macOS vs the old Intel+Linux machines. This is using the bleeding edge curl source code with roughly the same build config. Both used clang for compiling, a debug build.

Testm1Intel
configure19.8 s18.5 s
make -sj12.8 s14.2 s
autoreconf -fi7.9 s12.8 s
make -sj (in tests/)19.1 s33.9 s

I expected the differences to be bigger.

The first line of curl -V for the two builds:

curl 7.87.1-DEV (aarch64-apple-darwin22.2.0) libcurl/7.87.1-DEV OpenSSL/3.0.7 zlib/1.2.11 brotli/1.0.9 zstd/1.5.2 c-ares/1.18.1 libidn2/2.3.4 libpsl/0.21.2 (+libicu/71.1) libssh2/1.10.0 nghttp2/1.51.0 libgsasl/2.2.0
curl 7.87.1-DEV (x86_64-pc-linux-gnu) libcurl/7.87.1-DEV OpenSSL/3.0.7 zlib/1.2.13 brotli/1.0.9 zstd/1.5.2 c-ares/1.17.0 libidn2/2.3.3 libpsl/0.21.0 (+libidn2/2.3.0) libssh2/1.10.1_DEV nghttp2/1.50.0-DEV librtmp/2.3 libgsasl/2.2.0

Interestingly, there is no mention anywhere that I can find in the OS settings/config or in the box etc as to what CPU speed the m1 runs at.

Credits

This device was donated “to the cause” by “a member and supporter of the Network Time Foundation at nwtime.org” (real name withheld on request).

Discussed

Hacker news.

Short follow-up

People mention that the Intel CPU uses much more power, runs at higher temperature and that the m1 is “just first generation” and all sorts of other excuses for the results presented above. Others insist that the Makefiles must be bad or that I’m not using the mac to its best advantage etc.

None of those excuses change the fact that my ten year old machine builds curl and related code at roughly the same speed as this m1 box while I expected it to be a more noticeable speed difference in the m1’s favor. Yes, it was probably bad expectations.

curl -w certs

When a client connects to a TLS server it gets sent one or more certificates during the handshake.

Those certificates are verified by the client, to make sure that the server is indeed the right one: the server the client expects it to be; no impostor and no man in the middle etc.

When such a server certificate is signed by a Certificate Authority (CA), that CA’s certificate is normally not sent by the server but the client is expected to have it already in its CA store.

What certs?

Ever since the day SSL and TLS first showed up in the 1990s user have occasionally wanted to be able to save the certificates provided by the server in a TLS handshake.

The openssl tool has offered this ability since along time and is actually one of my higher ranked stackoverflow answers.

Export the certificates with the tool first, and then in subsequent transfers you can tell curl to use those certificates as a CA store:

$ echo quit | openssl s_client -showcerts -connect curl.se:443 > cacert.pem
$ curl --cacert cacert.pem https://curl.se/

This is of course most convenient when that server is using a self-signed certificate or something otherwise unusual.

(WARNING: The above shown example is an insecure way of reaching the host, as it does not detect if the host is already MITMed at the time when the first command runs. Trust On First Use.)

OpenSSL

A downside with the approach above is that it requires the openssl tool. Albeit, not a big downside for most people.

There are also alternative tools provided by wolfSSL and GnuTLS etc that offer the same functionality.

QUIC

Over the last few years we have seen a huge increase in number of servers that run QUIC and HTTP/3, and tools like curl and all the popular browsers can communicate using this modern set of protocols.

OpenSSL cannot. They decided to act against what everyone wanted, and as a result the openssl tool also does not support QUIC and therefore it cannot show the certificates used for a HTTP/3 site!

This is an inconvenience to users, including many curl users. I decided I could do something about it.

CURLOPT_CERTINFO

Already back in 2016 we added a feature to libcurl that enables it to return a list of certificate information back to the application, including the certificate themselves in PEM format. We call the option CURLOPT_CERTINFO.

We never exposed this feature in the command line tool and we did not really see the need as everyone could use the openssl tool etc fine already.

Until now.

curl -w is your friend

curl supports QUIC and HTTP/3 since a few years back, even if still marked as experimental. Because of this, the above mentioned CURLOPT_CERTINFO option works fine for that protocol version as well.

Using the –write-out (-w) option and the new variables %{certs} and %{num_certs} curl can now do what you want. Get the certificates from a server in PEM format:

$ curl https://curl.se -w "%{certs}" -o /dev/null > cacert.pem
$ curl --cacert cacert.pem https://curl.se/

You can of course also add --http3 to the command line if you want, and if you like to get the certificates from a server with a self-signed one you may want to use --insecure. You might consider adding --head to avoid the response body. This command line uses -o to write the content to /dev/null because it does not care about that data.

The %{num_certs} variable shows the number of certificates returned in the handshake. Typically one or two but can be more.

%{certs} outputs the certificates in PEM format together with a number of other details and meta data about the certificates in a “name: value” format.

Availability

These new -w variables are only supported if curl is built with a supported TLS backend: OpenSSL/libressl/BoringSSL/quictls, GnuTLS, Schannel, NSS, GSKit and Secure Transport.

Support for these new -w variables has been merged into curl’s master branch and is scheduled to be part of the coming release of curl version 7.88.0 on February 15th, 2023.

At 17000 curl commits

Today, another 1,000 commits have been recorded as done by me in the curl source code git repository since November 2021. Out of a total of 29,608 commits to the curl source code repository, I have made 17,001. 57.42%.

The most recent one was PR #10019.

In 2022, I have done 56% of all the commits in the curl source repository. I am also the only developer who works full time on curl all the time.

In 2022, 179 individuals authored commits that were merged into curl. 115 of them did that for the first time this year. Over curl’s life time, a total of 1104 persons have authored code merged into curl.

Do I ever get bored? Not yet. I will let you know if I do.

The curl fragment trick

curl supports globbing in the sense that you can provide ranges or lists in the URL that will make curl iterate, loop, over all the different variations and do a separate transfer for each.

For example, get ten images in a numeric range:

curl "https://example.com/image[1-10].jpg" -O

Or get them when named after some weekdays:

curl "https://example.com/{Monday,Tuesday,Friday}.jpg" -O

Naming the output

The examples above use -O which makes curl use the same name for the destination file as is used the effective URL. Convenient, but not always what you want.

curl also allows you to refer to the number or name from the range or list and use that when naming your output files, which helps you do better globbing.

For example, maybe the file name part of the URL is actually the same and you iterate over another difference in the URL. Like this:

curl "https://example.com/{Monday,Tuesday,Friday}/image" -o #1.jpg

The #1 part in the example is a reference back to the first list/range, as you can do multiple ones and even using mixed types and you can then use multiple #-references in the same command line. To illustrate, here is a simple example using two iterators to download three hundred images:

curl "https://{red,blue,green}.example.com/image[1-100].jpg" -o "#2-#1-stored.jpg"

There is actually no upper limit to how many transfers you can do like this with curl, other than that the numeric ranges only deal with up to 64 bit numbers.

Hundreds? Maybe go parallel

If you actually do come up with a command line that needs to transfer several hundred or more resources, then maybe consider adding -Z, --parallel to the mix so that curl performs many transfers simultaneously, in parallel. This can drastically reduce the total time needed for completing the task.

curl runs up to 50 transfers in parallel by default when this option is used, but you can also tweak this amount with --parallel-max.

A fragment trick

Okay, so now we finally arrive at the fragment and the trick mentioned in the title.

If you want to do several repeated transfers but not actually change the URL then the examples above do not satisfy you as they change the URL for every new transfer.

A neat trick is then to add a fragment part to the URL you use, and then do the globbing there. The fragment is the rightmost part of a URL that starts with a #-character and continues to the end of the URL.

A fragment can always be added to a URL, but the fragment is never actually transmitted over the network so the remote server is not aware of it.

Get the same URL ten times, saved in different target files:

curl "https://example.com/index.html#[1-10]" -o #1.html

If you rather name the outputs according some scheme, you can of course just list them in the glob:

curl "https://example.com/index.html#{mercury,venus,earth,mars}" -o #1.html

Maybe slower

In cases where you transfer the same URL many times, chances are you want to do this because the content changes at some interval. Perhaps you then do not want them all to be done as fast as possible as then the contents may not have updated.

To help you pace the transfers to get the same thing over and over in a more controlled manner, curl offers --rate. With this you can tell curl to not do it faster than N transfers per given period.

If the URL contents update every 5 minutes, then doing the transfer 12 times per hour seems suitable. Let’s do it 2016 times to have the operation run non-stop for a week:

curl "https://example.com/index.html#[1-2016]" --rate 12/h -o "#1.html"

The 2022 curl security audit

tldr: several hundred hours of dedicated scrutinizing of curl by a team of security experts resulted in two CVEs and a set of less serious remarks. The link to the reports is at the bottom of this article.

Thanks to an OpenSSF grant, OSTIF helped us set up a curl security audit, which the excellent Trail of Bits was selected to perform in September 2022. We are most grateful to OpenSSF for doing this for us, and I hope all users who use and rely on curl recognize this extraordinary gift. OSTIF and Trail of Bits both posted articles about this audit separately.

We previously had an audit performed on curl back in 2016 by Cure53 (sponsored by Mozilla) but I like to think that we (curl) have traveled quite far and matured a lot since those days. The fixes from the discoveries reported in that old previous audit were all merged and shipped in the 7.51.0 release, in November 2016. Now over six years ago.

Changes since previous audit

We have done a lot in the project that have improved our general security situation over the last six years. I believe we are in a much better place than the last time around. But we have also grown and developed a lot more features since then.

curl is now at150,000 lines of C code. This count is for “product code” only and excludes blank lines but includes 19% comments.

71 additional vulnerabilities have been reported and fixed since then. (42 of those even existed in the version that was audited in 2016 but were obviously not detected)

We have 30,000 additional lines of code today (+27%), and we have done over 8,000 commits since.

We have 50% more test cases (now 1550).

We have done 47 releases featuring more than 4,200 documented bugfixes and 150 changes/new features.

We have 25 times the number of CI jobs: up from 5 in 2016 to 127 today.

The OSS-Fuzz project started fuzzing curl in 2017, and it has been fuzzing curl non-stop since.

We introduced our “dynbuf” system internally in 2020 for managing growing buffers to maybe avoid common C mistakes around those.

Audit

The Trail of Bits team was assigned this as a three-part project:

  1. Create a Threat Model document
  2. Testing Analysis and Improvements
  3. Secure code Review

The project was setup to use a total of 380 man hours and most of the time two Trail of Bits engineers worked in parallel on the different tasks. The Trail of Bits team themselves eventually also voluntarily extended the program with about a week. They had no problems finding people who wanted to join in and look into curl. We can safely say that they spent a significant amount of time and effort scrutinizing curl.

The curl security team members had frequent status meetings and assisted with details and could help answer questions. We would also get updates and reports on how they progressed.

Two security vulnerabilities were confirmed

The first vulnerability they found ended up known as the CVE-2022-42915: HTTP proxy double-free issue.

The second vulnerability was found after Trail of Bits had actually ended their work and their report, while they were still running a fuzzer that triggered a separate flaw. This second vulnerability is not covered in the report but was disclosed earlier today in sync with the curl 7.87.0 release announcement: CVE-2022-43552: HTTP Proxy deny use-after-free.

Minor frictions detected

Discoveries and remarks highlighted through their work that were not consider security sensitive we could handle on the fly. Some examples include:

  • Using --ssl now outputs a warning saying it is unsafe and instead recommending --ssl-reqd to be used.
  • The Alt-svc: header parser did not deal with illegal port numbers correctly
  • The URL parser accepted “illegal” characters in the host name part.
  • Harmless memory leaks

You should of course read the full reports to learn about all the twenty something issues with all details, including feedback from the curl security team.

Actions

The curl team acted on all reported issues that we think we could act on. We disagree with the Trail of Bits team on a few issues and there are some that are “good ideas” that we should probably work on getting addressed going forward but that can’t be fixed immediately – but also don’t leave any immediate problem or danger in the code.

Conclusions

Security is not something that can be checked off as done once and for all nor can it ever be considered complete. It is a process that needs to blend in and affect everything we do when we develop software. Now and forever going forward.

This team of security professionals spent more time and effort in this security auditing and poking on curl with fuzzers than probably anyone else has ever done before. Personally, I am thrilled that they only managed to uncovered two actual security problems. I think this shows that a lot of curl code has been written the right way. The CVEs they found were not even that terrible.

Lessons

Twenty something issues were detected, and while the report includes advice from the auditors on how we should improve things going forward, they are of the kind we all already know we should do and paths we should follow. I could not really find any real lessons as in obvious things or patterns we should stop or new paradigms och styles to adapt.

I think we learned or more correctly we got these things reconfirmed:

  • we seem to be doing things mostly correct
  • we can and should do more and better fuzzing
  • adding more tests to increase coverage is good

Security is hard

To show how hard security can be, we received no less than three additional security reports to the project during the actual life-time when this audit was being done. Those additional security reports of course came from other people and identified security problems this team of experts did not find.

My comments on the reports

The term Unresolved is used for a few issues in the report and I have a minor qualm with the use of that particular word in this context for all cases. While it is correct that we in several cases did not act on the advice in the report, we saw some cases where we distinctly disagree with the recommendations and some issues that mentioned things we might work on and address in the future. They are all just marked as unresolved in the reports, but they are not all unresolved to us in the curl project.

In particular I am not overly pleased with how the issue called TOB-CURLTM-6 is labeled severity high and status unresolved as I believe this wrongly gives the impression that curl has issues with high severity left unresolved in the code.

If you want to read the specific responses for each and every reported issue from the curl project, they are stored in this separate GitHub gist.

The reports

You find the two reports linked to from the curl security page. A total of almost 100 pages in two PDF documents.

curl 7.87.0

Numbers

the 212th release
5 changes
56 days (total: 9,042)

155 bug-fixes (total: 8,492)
238 commits (total: 29,571)
0 new public libcurl function (total: 91)
2 new curl_easy_setopt() option (total: 302)

1 new curl command line option (total: 249)
83 contributors, 40 new (total: 2,771)
42 authors, 20 new (total: 1,101)
2 security fixes (total: 132)
Bug Bounties total: 48,580 USD

Release presentation

At 10:00 CET (9:00 UTC) on December 21, Daniel live-streams the release presentation on twitch. This paragraph will later be replaced by a link to the YouTube version of that video.

Security

Two security advisories this time around, severity low and medium.

CVE-2022-43551: Another HSTS bypass via IDN

The HSTS logic could be bypassed if the host name in the given URL first uses IDN characters that get replaced to ASCII counterparts as part of the IDN conversion. Like using the character UTF-8 U+3002 (IDEOGRAPHIC FULL STOP) instead of the common ASCII full stop (U+002E). Then in a subsequent request, it does not detect the HSTS state and makes a clear text transfer. Because it would store the info IDN encoded but look for it IDN decoded.

CVE-2022-43552: HTTP Proxy deny use-after-free

When an HTTP PROXY denied to tunnel SMB or TELNET, curl would use a heap-allocated struct after it had been freed in its transfer shutdown code path.

Changes

–url-query

curl’s 249th command line option adds data to the query part of the URL.

CURLOPT_QUICK_EXIT

Tell libcurl to not wait for any DNS threads on exit.

CURL_WRITEFUNC_ERROR

New and easier way to signal write callback errors.

CURLOPT_CA_CACHE_TIMEOUT

libcurl can now cache the CA store in memory, as I blogged about separately.

feature names added to curl_version_info_data

The struct returned by curl_version_info now returns all built-in features listed by name. This is a preparation to allow applications to adapt slowly and get ready for the future moment when the features can no longer fit in in the 32 bit fields previously used for this purpose.

Bugfixes

Better base64

The encoder now allocates the output using a more appropriate size, and both the encoder and decoder implementations are much faster.

hyper

We fixed a few issues in the hyper backend and are down to just 12 remaining disabled tests to address.

gen.pl: fix the linkifier

This script generates the curl.1 man page and make sure to properly mark references correctly, so that the man page can get rendered as we webpage with correct links etc on the website. This time we made it work better and therefore more cross-references in the man page is now linked correctly in the web version.

tool: override the numeric locale and set “C” by force

In previous curl versions it mistakenly used the locale when parsing floating point numbers, which then made the tool hard to use in scripts which would run in multiple locales. An example is the timeout option specified with -m / --max-time as number of seconds with a fraction. Now it requires the decimal separator to always be a dot/period independently of the user’s locale.

tool: timeout in the read callback

The command line tool can now timeout reading data better, for example when using telnet:// with a timeout option and the user does not press any key and nothing happens over the network.

curl_get_line: allow last line without newline char

Because of a somewhat lazy recent fix, the .netrc parsed and other users of the nternal curl_get_line() function would ignore the last line if it did not end with a newline. This is no more.

support growing FTP files with CURLOPT_IGNORE_CONTENT_LENGTH

If this option is set, also known as --ignore-content-length on the command line, curl will not complain if the size grows from the moment the FTP transfer starts until it ends. Thus allowing it to grow while being transferred.

do not send PROXY more than once

The HAproxy protocol line could get sent more than once and thus break stuff.

feature deprecation warnings in gcc

A number of outdated libcurl options and functions are now tagged as deprecated, which will cause compiler warnings when used in application code for users of gcc 6.1 or later. Deprecated here means that we recommend using other, more modern, alternatives.

parse numbers with fixed known base 10

In several places in curl and libcurl source code we would allow numbers to be specified using octal or hexadecimal while decimal was the only expected and documented base. In order to minimize surprises and for consistency, we now limited them as far as possible to only accepting decimal numbers.

rewind BEFORE request instead of AFTER previous

When curl is used to send a request, for example a POST, and there is reason for it to send it again, like if there is a redirect or an ongoing authentication process, it would previously rewind the stream at the end of that transfer first transfer in order to have it done when the next transfer is about to get done. Now, it instead does the rewind first in the second request. This, because there are times when the second request are not done, and the rewind may not work. So, such a failing rewind can be avoided by not doing it until it is strictly necessary.

noproxy

Several independent regressions were fixed – in spite of the new set of test cases added for testing this feature in the previous release. Noproxy is the support for the NO_PROXY environment variable and related options.

openssl: prefix errors with ‘[lib]/[version]:’

To help users understand errors and their origins a little better, libcurl will now prefix error messages originating from OpenSSL (and forks) with the name of the flavor and its version number.

RTSP auth works again

This functionality was broken a few versions back and now it has finally been fixed again.

runtests: –no-debuginfod now disables DEBUGINFOD_URLS

valgrind and gdb support downloading stuff at the moment of need if this environment variable is set. Previously the curl test running script would unset that variable unconditionally, but now it will not and instead offer an option that unsets it – for the cases where the environment variable causes problems (such as performance slowdowns).

HTTP/3 tests

We finally have the first infrastructure merged for doing and running HTTP/3 specific tests in the curl test suite. Now we can better avoid regressions going forward. This is only the beginning and I expect us to expand and grow these tests going forward.

determine the correct fopen option for -D

When saving response headers into a dedicated file with curl’s -D, –dump-header option, curl would be inconsistent about when to create a new file and when to append do it. Now it acts exactly as documented.

better error message for -G with bad URL

Several users figured out curl showed misleading error messages when -G was used in combination with a malformed URL. This is now improved.

repair IDN for proxies

A recent fix we landed for IDN for host names accidentally simultaneously broke it for proxies…

cmake: set the soname on the shared library

Using cmake to build libcurl as a shared library on Linux and several other systems, will now set the SONAME number correctly in the same style and with the same number that the autotools build uses.

WebSocket polish

  • fixes for partial frames and buffer updates
  • now returns CURLE_NOT_BUILT_IN when websockets support is not built in
  • returns error properly when the connection is closed

TLS goes connection filters => more HTTPS-proxy

As a direct result of the internal refactor and introduction of connection filters also for TLS, curl now supports HTTPS-proxy for a wider selection of TLS backends than previously.

Credits

Release image by @sny@mas.to

curl sighting: Tschugger

In the Swiss crime comedy TV series Tschugger, season two episode two at roughly 25:20, there is a shot with a curl command line in a terminal window using an unnecessary –request option.

Following the curl line is what looks like an interactive login procedure, which certainly is not something a real curl would present. Based on this, I think we need to give this use of curl a fairly low realism score: a 2 out 5.

Trying that displayed command line in a real terminal unfortunately only gives us Could not resolve host: secure.da-34-22.remote.com. I doubt that the TV company actually purchased this domain though. It seems a little too generic.

I have not seen it

I have not been able to view this episode so I cannot yet comment on the conditions and the surroundings for when this snapshot is taken. Once I do, I might be able to extend the description above somewhat.

Credits

First brought to my attention by Cybergossipgirl, who also took the snapshot seen above.

curl sighting: Silk Road

In the 2021 movie Silk Road, at around 19:23-19:26 into the film we can see Ross Ulbricht, the lead character, write a program on his laptop that uses curl. A few seconds we get a look at the screen as Ross types on the keyboard and explains to the female character who says I didn’t know you know how to code that he’s teaching himself to write code.

The code

Let’s take a look at the code on the screen. This is PHP code using the well known PHP/CURL binding. The URL on the screen on line two has really bad contrast, but I believe this is what it says:

<?php
  $ch = curl_init("http://silkroadvb5pzir.onion");
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  curl_setopt($ch, CURLOPT_PROXY,                
              "http://127.0.0.1:9050/");
  curl_setopt($ch, CURLOPT_PROXYTYPE, 7);
  $output = curl_exec($ch);
  $curl_error = curl_error($ch);
  curl_cl

.onion is a TLD for websites on Tor so this seems legit as it a URL for this purpose could look like this. But then Ross confuses matters a little. He uses two curl_init() calls, one that sets a URL and then again a call without a URL. He could just have removed line three and four. This doesn’t prohibit the code from working, it just wouldn’t have passed a review.

The code then sets a proxy to use for the transfer, specified as an HTTP URL which is a little odd since the proxy type he then sets on the line below is 7, the number corresponding to CURLPROXY_SOCKS5_HOSTNAME – so not a HTTP proxy at all but a SOCKS5 proxy. The typical way you access Tor: as a SOCKS5 proxy to which you pass the host name, as opposed to resolving the host name locally.

The last line is incomplete but should ultimately be curl_close($ch); to close the handle after use.

All in all a seemingly credible piece of code, especially if we consider it as a work in progress code. The minor mistakes would be soon be fixed.

Credits

Viktor Szakats spotted this and sent me the screenshot above. Thanks!

Faster base64 in curl

This adventure started with an issue where a user pointed out that the libcurl function for base64 encoding actually would allocate a few bytes too many at times.

That turned out to be true and we fixed it fairly quickly.

As I glanced at that base64 encoder function that was still loaded and showing in my editor window, it struck me that it really was not written in an optimal way.

Base64 encoding

This “encoding” converts 8 bit data into a 6 bit data, where each 6 bit combination has a dedicated ASCII character. It uses these 64 different characters: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/.

Three 8-bit bytes make up 24 bit of data, which can be represented by four 6-bit symbols. Like this: the example byte sequence 0x12 , 0x34 and 0x56 creates the 24 bit value 0x123456. Shown in binary it looks like: 000100100 0110100 01010110

That 24 bit number is split into 6-bit chunks: 000100, 100011, 010001 and 010110. Written in decimal, they are 4, 35, 17 and 22. Pick the corresponding symbols from those indexes in the base64 table shown above and they make the base64 encoded sequence: EjRW. And so on.

That’s base 64 encoding.

A realization

The base64 encoder function source code I looked at, was introduced in curl in the late 1990s and existed in the first commit we have saved. It has remained mostly intact since. Over twenty two years old.

This is how the code used to look. Fairly readable, but with a lot of conditions and perhaps most importantly, with calls to msnprintf() to output data. msnprintf() is our internal snprintf implementation,

The old base64 encoder

while(insize > 0) {
  for(i = inputparts = 0; i < 3; i++) {
    if(insize > 0 {
     inputparts++;
     ibuf[i] = (unsigned char) *indata;
     indata++;
     insize--;
   }
   else
     ibuf[i] = 0;
  }

  obuf[0] = ((ibuf[0] & 0xFC) >> 2);
  obuf[1] = (((ibuf[0] & 0x03) << 4) |
            ((ibuf[1] & 0xF0) >> 4));
  obuf[2] = (((ibuf[1] & 0x0F) << 2) |
            ((ibuf[2] & 0xC0) >> 6));
  obuf[3] = (ibuf[2] & 0x3F);

  switch(inputparts) {
  case 1: /* only one byte read */
    i = msnprintf(output, 5, "%c%c%s%s",
                  table64[obuf[0]],
                  table64[obuf[1]],
                  padstr,
                  padstr);
    break;

  case 2: /* two bytes read */
    i = msnprintf(output, 5, "%c%c%c%s",
                  table64[obuf[0]],
                  table64[obuf[1]],
                  table64[obuf[2]],
                  padstr);
    break;

  default:
    i = msnprintf(output, 5, "%c%c%c%c",
                  table64[obuf[0]],
                  table64[obuf[1]],
                  table64[obuf[2]],
                  table64[obuf[3]]);
    break;
  }
  output += i;

}

(The padstr variable in there is for handling the mode where it does not output any final = padding characters. )

Improving

I started out by writing a test program. I created a huge string, and made the test program base64-encode a part of that string. Starting from one byte string, then increasing the length one by one to iterating over all sizes until the final size which happened to be 106128 bytes. Maybe not the most realistic test in the world, but at least it does a lot of base64 encoding. The base64 encode algorithm is also content agnostic so it doesn’t matter what the exact content is, the size of it is the main thing.

My first casual attempt that only replaced the snprintf() calls with direct assigns into the target buffer first made me doubt my numbers or test program. My test program ran 14 times faster.

Motivated by the enormous performance gain seen with that minor change, I continued. I removed the use of the obuf array and it occurred to me I should deal with the encoding in two phases; one main one for all complete three-byte triplets and then do the padding final chunk separately – as then we can avoid conditions in the main loop.

The final result that I ended up merging showed an almost 29 times improvement. With the old code the test program took six minutes to complete, the new one finished in twelve seconds.

The new encoder

  while(insize >= 3) {
    *output++ = table64[ in[0] >> 2 ];
    *output++ = table64[ ((in[0] & 0x03) << 4) |
                         (in[1] >> 4) ];
    *output++ = table64[ ((in[1] & 0x0F) << 2) | 
                         ((in[2] & 0xC0) >> 6) ];
    *output++ = table64[ in[2] & 0x3F ];
    insize -= 3;
    in += 3;
  }
  if(insize) {
    /* this is only one or two bytes now */
    *output++ = table64[ in[0] >> 2 ];
    if(insize == 1) {
      *output++ = table64[(in[0] & 0x03) << 4];
      if(*padstr) {
        *output++ = *padstr;
        *output++ = *padstr;
      }
    }
    else {
      /* insize == 2 */
      *output++ = table64[((in[0] & 0x03) << 4) |
                          ((in[1] & 0xF0) >> 4)];
      *output++ = table64[(in[1] & 0x0F) << 2];
      if(*padstr)
        *output++ = *padstr;
    }
  }

I think the new version still is highly readable, and it actually is significantly smaller in size than the previous version!

Base64 decoding

Energized by that fascinating improvement I managed to do to the encoder function, I turned my eyes to the base64 decoder function. This function is slightly newer in curl than the encoder, but still traces back to 2001 and it too was never improved much after its initial merge.

I started again by writing another test program. This one creates a 2948 bytes base64 encoded string which the application then iterates over and decodes pieces of. From one byte up to the full size, and then it loops so that it repeats that procedure a thousand times.

When decoding base64, the core of the code needs to find out what binary number each particular input octet represents. ‘A’ is zero, ‘B’ is one etc. Then it needs to take four such consecutive (effectively 6 bit) letters at a time and output 3 eight bit bytes. Over and over. In a final round it might need to deal with =-padding to make it an even 4 bytes size.

The old decoder

This is how the old decodeQuantum function looked like. It decodes a 4-letter sequence into three output bytes.

  for(i = 0, s = src; i < 4; i++, s++) {
    if(*s == '=') {
      x <<= 6;
      padding++;
    }
    else {
      const char *p = strchr(base64, *s);
      if(p)
        x = (x << 6) + curlx_uztoul(p - base64);
      else
        return 0;
    }
  }

  if(padding < 1)
    dest[2] = curlx_ultouc(x & 0xFFUL);

  x >>= 8;
  if(padding < 2)
    dest[1] = curlx_ultouc(x & 0xFFUL);

  x >>= 8;
  dest[0] = curlx_ultouc(x & 0xFFUL);

The new decoder

What immediately sticks out in the old code is the use of strchr() to find the letters’ offset in the base64 table as a means to figure out its byte value. The larger the value, the longer the function needs to search through the string to find it. And it needs to search for every single byte used in the input string.

What if we could replace the search by a lookup table instead?

My updated code features a lookup table for each input byte, and in similar fashion to how the encoder logic works, it now separates the final padding step in a separate extra block in the end to avoid extra conditions in the main loop.

With the simplified quad decoder, I put the whole thing in the same loop. Unfortunately I could not think of a way to avoid the check for invalid characters in the loop.

It turns out this new base64 decoder is about 4.5 times faster than the previous code!

  for(i = 0; i < fullQuantums; i++) {
    unsigned char val;
    unsigned int x = 0;
    int j;

    for(j = 0; j < 4; j++) {
      val = lookup[(unsigned char)*src++];
      if(val == 0xff) /* bad symbol */
        goto bad;
      x = (x << 6) | val;
    }
    pos[2] = x & 0xff;
    pos[1] = (x >> 8) & 0xff;
    pos[0] = (x >> 16) & 0xff;
    pos += 3;
  }

Final words

I wrote my test applications to measure the impact of my changes. We could argue if they show real world use cases or not but I think they at still prove that my changes were good and made the functions faster. The exact numbers are not that important.

The exact performance numbers I mention in this blog post are based on results that I saw on my old Intel-based development machine as the best result of at least three consecutive runs. Other systems and architectures will for sure show different numbers.

Base64 encoding and decoding are not significant functions, and are not even very frequently called ones, in curl. These improvements are not likely to even be noticeable to curl or libcurl users. I still consider these optimizations worthwhile because why not do things as fast as you can if you are going to them anyway. As long as there’s no particular penalty involved.

The changes I did for the base64 handling were done entirely without changing the behavior or function prototypes in order to compartmentalize them and keep them constrained.

Credits

Image by DrZoltan from Pixabay

Discussed

On Hacker news, Reddit and lobste.rs