All posts by Daniel Stenberg

curl is REUSE compliant

The REUSE project is an effort to make Open Source projects provide copyright and license information (for all files) in a machine readable way.

When a project is fully REUSE compliant, you can easily figure out the copyright and license situation for every single file it holds.

The easiest way to accomplish this is to make sure that all files have the correct header with the appropriate copyright info and SPDX-License-Identifier specified, but it also has ways to provide that meta data in adjacent files – for files where prepending that info isn’t sensible.

What we needed to do

We were already in a fairly good place before this push. We have a script that verifies the presence of copyright header in files (including checking the end year vs the latest git commit), with a list of files that were deliberately skipped.

The biggest things we needed to do were

  1. Add the SPDX identifier all over
  2. Make sure that the skipped files also have copyright and licensing info provided
  3. Add a CI job that verifies that we remain compliant

I also ended up adjusting our own copyright scan script to use the REUSE metadata files instead of its own ignore filters which also made it even easier for us to make sure we are and remain compatible — that every single files in the curl git repository has a known and documented license and copyright situation.

As a bonus, the cleanup work helped us detect an example file that stood out which we got relicensed and we removed two older files that had their own unique licenses (without any good reason).

There are 3518 files in the curl git repository this exact moment.

Compliant!

Starting mid-June 2022, curl is 100% REUSE compliant. curl 7.84.0 will be the first release done in this status.

Motivation

I think it is a good idea to have perfect control over the copyright and license situation for every single file, and to make sure that the situation is documented enough and to a level that allows anyone and everyone to check it out and learn how things lie. No surprises.

Companies have obviously figured out this info before to a degree that they have been satisfied with since curl is widely used even commercial since a long time. But I believe that by providing the information in an even easier and more descriptive way makes things even better. For existing and future users.

I also think that the low threshold for us to reach this compliance was a factor. We were almost there already. We just need to polish up some small details and I think it made it worth it.

This cleanup also makes sure we have perfect control and knowledge of the license situation, now and going forward. I think this can be expected from a project aiming for gold standard.

The curl SPDX license identifier

Keen readers will notice that curl has its own license identifier. It is called the curl license. Not MIT, X or a BSD variation. curl.

The reason for this is good old stupidity. In January 2001 we adopted the MIT license for use in the project because we believed it better matches what we want compared to the previous license situation. We started out with a dual license situation together with the MPL license we used previously, but the MPL part was removed completely in October 2002.

For reasons that have since been forgotten, we thought it was a good idea to edit the license text. To trim it a little. Since August 2002, the license text that started out as an MIT/X license is no longer a perfect copy. It is a derivative . Very similar and almost identical. But it’s not the same.

When the SPDX project created their set of identifiers for well-used licenses out in the FOSS world they decided that the curl license is different enough from the MIT/X license to treat it separately and give it its own identifier. I know of no other project than curl that uses this particular edited version of the MIT license.

In hindsight, I believe the editing of the license text back in 2002 was dumb. I regret it, but I will not change it again. I think we can live with this situation pretty good.

Credits

Most of the heavy lifting necessary to make curl compliant was done by Max Mehl.

curl user survey 2022 analysis

Once again I’ve collected the numbers, generated graphs, scratched my head and tried to understand what users mean and how to best use this treasure trove of user feedback.

The curl user survey 2022 ran for two full weeks in the end of May. Here is the document with all the numbers, graphs and analysis from this year’s data.

You will learn what protocols curl users use (HTTPS and HTTP), which TLS backend is the most popular (OpenSSL) and which the top platform is (Linux). And a lot more.

Spoiler: the results are not terribly different than last year and the year before that!

The analysis is a 36-page PDF, available here:

curl-user-survey-2022-analysis

If you have specific feedback on the analysis itself, then I’m all ears. I’m not statistics scholar or anything, but I believe all the numbers, graphs and data I present in there are accurate barring my mistakes of course.

Making libcurl init more thread-safe

Twenty-one years ago, in May 2001 we introduced the global initialization function to libcurl version 7.8 called curl_global_init().

The main reason we needed this separate function to get called before anything else was used in libcurl, was that several of libcurl’s dependencies at the time (including OpenSSL and GnuTLS) had themselves thread-unsafe initialization procedures.

This rather lame characteristic found in several third party dependencies made the libcurl function inherit that property: not thread-safe. A nasty “feature” in a library that otherwise prides itself for being thread-safe and in many ways working at “it should”. A function that is specifically marked as thread unsafe was not good. Is not good.

Still, we were victims of circumstances and if these were the dependencies we were going to use, this is what we needed to do.

Occasionally, this limitation has poked people in the eye and really hurt them since it makes some use cases really difficult to realize.

Dependencies improved

Over the following decades, the dependencies libcurl use have almost all shaped up and removed the thread-unsafe property of their initialization procedures.

We also slowly cleaned away other code that happened to also fall into the init function out of laziness and convenience because it was there and could be used (or perhaps abused).

Eventually, we were basically masters of our own faith again. The closet was all cleared out and the scrubby leftovers we had sloppily left in there had been cleaned up and gotten converted to proper thread-safe code.

The task of finally making curl_global_init() thread-safe was brought up and attempted a little half-assed a few times but was never pulled through all the way.

The challenges always included that we want to avoid relying on thread library and that we are supporting building libcurl with C89 compilers etc.

Finally, the spring cleaning of 2022

Thanks to work spear-headed by Thomas Guillem who came bursting in with a clear use-case in mind where he felt he really need this to work, and voila now the next libcurl release (7.84.0) features a thread-safe init.

If configure finds support for _Atomic (a C11 feature) or it runs on a new enough Windows version (this should cover a vast amount of platforms), libcurl can now do its own spinlock implementation that makes the init function thread-safe and independent of threading libraries.

New HTTP core specs

Before this, the latest refreshed specification of HTTP/1.1 was done in the RFC 7230 series, published in June 2014. After that, HTTP/2 was done in the spring of 2015 and recently the HTTP/3 spec has been a work in progress.

To better reflect this new world of multiple HTTP versions and an HTTP protocol ecosystem that has some parts that are common for all versions and some other parts that are specific for each particular version, the team behind this refresh has been working on this updated series.

My favorite documents in this “cluster” are:

HTTP Semantics

RFC 9110 basically describes how HTTP works independently of and across versions.

HTTP/1.1

RFC 9112 replaces 7230.

HTTP/2

RFC 9113 replaces 7540.

HTTP/3

RFC 9114 is finally the version three of the protocol in a published specification.

Credits

Top image by Gerhard G. from Pixabay. The HTTP stack image is done by me, Daniel.

.netrc pains

The .netrc file is used to hold user names and passwords for specific host names and allows tools to login to those systems automatically without having to prompt the user for the credentials while avoiding having to use them in command lines. The .netrc file is typically set without group or world read permissions (0600) to reduce the risk of leaking those secrets.

History

Allegedly, the .netrc file format was invented and first used for Berknet in 1978 and it has been used continuously since by various tools and libraries. Incidentally, this was the same year Intel introduced the 8086 and DNS didn’t exist yet.

.netrc has been supported by curl (since the summer of 1998), wget, fetchmail, and a busload of other tools and networking libraries for decades. In many cases it is the only cross-tool way to provide credentials to remote systems.

The .netrc file use is perhaps most widely known from the “standard” ftp command line client. I remember learning to use this file when I wanted to do automatic transfers without any user interaction using the ftp command line tool on unix systems in the early 1990s.

Example

A .netrc file where we tell the tool to use the user name daniel and password 123456 for the host user.example.com is as simple as this:

machine user.example.com
login daniel
password 123456

Those different instructions can also be written on the same single line, they don’t need to be separated by newlines like above.

Specification

There is no and has never been any standard or specification for the file format. If you google .netrc now, the best you get is a few different takes on man pages describing the format in a high level. In general this covers our needs and for most simple use cases this is good enough, but as always the devil is in the details.

The lack of detailed descriptions on how long lines or fields to accept, how to handle special character or white space for example have left the implementers of the different code basis to decide by themselves how to handle those things.

The horse left the barn

Since numerous different implementations have been done and have been running in systems for several decades already, it might be too late to do a spec now.

This is also why you will find man pages out there with conflicting information about the support for space in passwords for example. Some of them explicitly say that the file format does not support space in passwords.

Passwords

Most fields in the .netrc work fine even when not supporting special characters or white space, but in this age we have hopefully learned that we need long and complicated passwords and thus having “special characters” in there is now probably more common than back in the 1970s.

Writing a .netrc file with for example a double-quote or a white space in the password unfortunately breaks tools and is not portable.

I have found at least three different ways existing tools do the parsing, and they are all incompatible with each other.

curl parser (before 7.84.0)

curl did not support spaces in passwords, period. The parser split all fields at the following space or newline and accepted whatever is in between. curl thus supported any characters you want, except space and newlines . It also did not “unquote” anything so if you wanted to provide a password like ""llo (with two leading double-quotes), you would use those five bytes verbatim in the file.

wget parser

This parser allows a space in the password if you provide it quoted within double-quotes and use a backslash in front of the space. To specify the same ""llo password mentioned above, you would have to write it as "\"\"llo".

fetchmail parser

Also supports spaces in passwords. Here the double-quote is a quote character itself so in order to provide a verbatim double-quote, it needs to be doubled. To specify the same ""llo password mentioned above, you would have to write it as """"llo – that is with four double-quotes.

What is the best way?

Changing any of these parsers in an effort to unify risk breaking existing use cases and scripts out in the wild with outraged users as a result. But a change could also generate a few happy users too who then could better share the same .netrc file between tools.

In my personal view, the wget parser approach seems to be the most user friendly one that works perhaps most closely to what I as a user would expect. So that’s how I went ahead and made curl work.

What to do

Users will of course be stuck with ancient versions for a long time and this incompatibility situation will remain for the foreseeable future. I can think of a few work-arounds users can do to cope:

  • Avoid space, tabs, newline and various quotes in passwords
  • Use separate .netrc files for separate tools
  • Provide passwords using other means than .netrc – with curl you can for example explore using –config instead

Future curl supports quoting

We are changing the curl parser somewhat in the name of compatibility with other tools (read wget) and curl will allow quoted strings in the way wget does it, starting in curl 7.84.0. While this change risks breaking a few command lines out there (for users who have leading double-quotes in their existing passwords), I think the change is worth doing in the name of compatibility and the new ability to use spaces in passwords.

A little polish after twenty-four years of not supporting spaces in user names or passwords.

Hopefully this will not hurt too many users.

Credits

Image by Anja-#pray for ukraine# #helping hands# stop the war from Pixabay

curl offers repeated transfers at slower pace

curl --rate is your new friend.

This option is for when you use curl to do many requests in a single command line, but you want curl to not do them as quickly as possible. You want curl to do them no more often than at a certain interval. This is a way to slow down the request frequency curl would otherwise possibly use. Tell curl to do the transfers no faster than…

This is a completely different and separate option from the transfer speed rate limit option --limit-rate that has existed for a long time.

A primary reason for using this option is when the server end has a certain capped acceptance rate or other cases where you know it makes no sense to do the requests faster than at a certain interval.

With this new option, you specify the maximum transfer frequency you allow curl to use – in number of transfer starts per time unit (sometimes called request rate) with the new --rate option.

Set the fastest allowed rate with --rate "N/U" where N is an integer number and U is a time unit. Supported units are ‘s’ (for second), ‘m’ (for minute), ‘h’ (for hour) and ‘d’ (for day, as in a 24 hour unit). U is the time unit. The default time unit, if no “/U” is provided, is number of transfers per hour.

For example, to make curl not do its requests faster than twice per minute, use --rate 2/m but if you rather want 25 per hour, then use --rate 25/h.

If curl is provided several URLs and a single transfer completes faster than the allowed rate, curl will wait before it kicks off the next transfer in order to maintain the requested rate and not go faster. If curl is told to allow 10 requests per minute, it will not start the next request until 6 seconds have elapsed since the previous transfer started.

This option has no effect when --parallel is used. Primarily because you then ask for the transfers to happen simultaneously and we have not figured out how this option should affect such transfers!

This functionality uses a millisecond resolution timer internally. If the allowed frequency is set to more than 1000 transfer starts per second, it will instead run unrestricted.

When retrying transfers, enabled with --retry, the separate retry delay logic is used and not this setting.

Rate-limiting response headers

There is ongoing work to standardize HTTP response headers for the purpose or rate-limiting. (See RateLimit Header Fields for HTTP.) Using these headers, a server can tell the client what the maximum allowed transfer rate is and it can adapt.

This new command line option however, does nothing with any such new headers, but I think it would make sense to make a future version able to check for the rate-limit headers and, if opted-in, adapt to those instead of the frequency set by the user.

A challenge with these ratelimit headers vs the --rate command line option is of course that the response headers for this will return the rules for a given site/API, and curl might have been told to talk to many different sites which might all have different (or no) rates in their headers. Also, the command line option works for all protocols curl supports, not just HTTP(S).

Ship it

This feature is due in the pending curl 7.84.0 release.

Credits

Image by kewl from Pixabay

case insensitive string comparisons in C

Back in 2008, I had a revelation when it dawned on me that the POSIX function called strcasecmp() compares strings case insensitively, but locale dependent. Because of this, “file” and “FILE” is not actually a case insensitive match in Turkish but is a match in most other locales. curl would sometimes fail in mysterious ways due to this. Mysterious to the users, now we know why.

Of course this behavior was no secret. The knowledge about this problem was widespread already then. It was just me who hadn’t realized this yet.

A custom replacement

To work around that problem for curl, we immediately implemented our own custom comparison replacement function that doesn’t care about locales. Internet protocols work the same way no matter which locale the user happens to prefer.

We did not go the POSIX route. The POSIX function for case insensitive string comparisons that ignores the locale is called strcasecmp_l() but that uses a special locale argument and also doesn’t exist on non-POSIX platforms.

curl has used its custom set of functions since 7.19.1, released in early November 2008.

OpenSSL 3.0.3

Fast forward to May 2022. OpenSSL released their version 3.0.3. In the change-log for this release we learned that they now offer public functions for case insensitive string comparisons. Whatdoyouknow! They too have learned about the Turkish locale. Apparently to the degree that they feel they need to offer those functions in their already super-huge API set. Oh well, that is certainly their choice.

I can relate since we too have such functions in libcurl, but I have always regretted that we added them to the API since comparing strings is not libcurl’s core business. We did wrong then and we still live with the consequences several decades later.

OpenSSL however took the POSIX route and based their implementation on strcasecmp_l() and use a global variable for the locale and an elaborate system to initialize that global and even a way to make things work if string comparisons are needed before that global variable is initialized etc.

This new system was complicated to the degree that it broke the library on several platforms, which curl users running Windows 7 figured out almost instantly. curl with OpenSSL 3.0.3 simply does not work on Windows 7 – at all.

Reasons for not exposing a string compare API

Libraries should only provide functions that are within their core objective. Not fluffy might be useful things. Reasons for this include:

  • It adds to the complexity to users. Yet another function in the ever expanding set of function calls in the API.
  • It increases the documentation size even more and makes the real things harder to find somewhere in there.
  • It adds “attack surface” and areas where you can make errors and introduce security problems.
  • You get more work since now you have additional functions to keep ABI and API stable for all eternity and you have to spend developer time and effort on making sure they remain so.

Do a custom one for OpenSSL?

I think there is a software law that goes something like this

eventually, all C libraries implement their own case insensitive string comparison functions

When I proposed they should implement their own custom function in discussions in one of the issues about this OpenSSL problem, the suggestion was shot down fairly quickly because of how hard it is to implement a such function that is as fast as the glibc version.

In my ears, that sounds like they prefer to stick with an overworked and complicated error-prone system, because an underlying function is faster, rather than going with simplicity and functionality at the price of sightly slower performance. In fairness, they say that case insensitive string comparisons are “6-7%” of the time spent in some (to me unknown) performance test they referred to. I have no way or intention to argue with that.

I think maybe they couldn’t really handle that idea from an outsider and they might just need a little more time to figure out the right way forward on their own. Then go with simple.

I am of course not in the best position to say how they should act on this. I’m just a spectator here. I may be completely wrong.

Update (May 23)

In a separate PR (4 days after this blog post went live), OpenSSL suddenly implemented their own and it was deemed that it would not hurt performance noticeably. Merged on May 23. Almost like they followed my recommendation!

OpenSSL’s current tolower() implementation used in the comparison function is similar to curl’s old one so I suspect curl’s current function is a tad bit faster.

Custom vs glibc performance

glibc truly has really fast string comparison implementations, with optimized assembly versions for the common architectures. Versions written in plain C tend to be slower.

However, the API and way to use those functions to make them locale independent is horrific because of the way it forces the caller to provide a locale argument (which could be the “C” locale – the equivalent of no locale).

The curl custom function

That talk about the slowness of custom string functions made us start discussing this topic a little in the curl IRC channel and we bounced around some ideas of what things the curl function does not already do and what it could do and how it compares against the glibc assembly version.

Also: the string comparisons in curl are certainly not that performance critical as they seem to be in OpenSSL and while used a lot in curl they are not used in the most important performance critical transfer-data code paths.

Optimizations

Frank Gevaerts took the lead and after some rounds and discussions backed up with tests, he ended up with an updated function that is 1.6 to 1.7 times faster than before his work. We dropped non-ASCII support in curl a while ago, which also made this task more straight-forward.

The two improvements:

  1. Use a lookup table for our own toupper() implementation instead of the previous simple condition + math.
  2. Better end of loop handling: return immediately on mismatch, and a minor touch-up of the final check when the loop goes all the way to the end.

Measurements

The glibc assembler versions are still faster than curl’s custom functions and the exact speed improvements the above mentioned changes provide will of course depend both on platform and the test set.

Ships in 7.84.0

The faster libcurl functions will ship in curl 7.84.0. I doubt anyone will notice the difference!

curl annual user survey 2022

For the eighth consecutive year, we run the curl user survey. We usually kick it off during this time of the year.

Tell us how you use curl!

This is the best and frankly the only way the curl project has to get real feedback from people as to what features that are used and which are not used as well as other details in the project that can help us navigate our future and what to do next. And what not to do next.

curl runs no ads, has no trackers, users don’t report anything back and the project has no website logs. We are in many aspects completely blind as to what users do with curl and what they think of it. Unless we ask. This is us asking.

How is curl working for you?

[Go to survey]

Please ask your curl-using friends to also stop by and tell us their views!

[The survey analysis]

Credits

Image by Andreas Breitling from Pixabay

A tale of a trailing dot

Trailing dots on host names in URLs is the gift that keeps on giving.

Let me take you through a dwindling story of how the dot is handled differently in different places through the stack of an Internet client. The evil trailing dot.

DNS

When a given host name is to be resolved to an IP address on a networked computer, there are dedicated functions to use. The host name example.com resolves to a number of IP addresses.

If you add a dot to the end of that host name, it does not change what is resolved. “example.com.” resolves to the same set of addresses as “example.com” does. (But putting two dots at the end will make it fail.)

The reason why it works like this is based on how DNS is built up with different “labels” (that when written in text are separated with dots) and then having a trailing dot is just an empty final label, just as with no dot. So, in the DNS protocol there are no trailing dots so to speak. When trying two dots at the end, it makes a zero-length label between them and that is not allowed.

People accustomed to fiddle with DNS are used to ending Fully Qualified Domain Names (FQDN) with a trailing dot.

Resolving names

In addition to the name actually being resolved (sent to a DNS resolver), native resolver functions usually puts a meaning and a semantic difference between resolving “hello” and “hello.” (with or without the trailing dot). The trailing dot then means the name is to be used actually exactly only like that, it is specified in full, while the name without a trailing dot can be tried with a domain name appended to it. Or even a list of domain names, until one resolves. This makes people want to use a trailing dot at times, to avoid that domain test.

HTTP names

HTTP clients that want to work with a given URL needs to extract the name part from the URL and use that name to:

  1. resolve the host name to a list of IP addresses to connect to
  2. pass that name in the Host: or :authority: request headers, so that the HTTP server knows which specific server the clients speaks to – as it may run multiple servers on the same IP address

The HTTP spec says the name in the Host header should be used verbatim from the URL; the trailing dot should be included if it was present in the URL. This allows a server to host different content for “example.com” and “example.com.”, even if many servers will by default treat them as the same. Some hosts will just redirect the dot version to the non-dot. Some hosts will return error.

The HTTP client certainly connects to the same set of addresses for both.

For a lot of HTTP traffic, having the trailing dot there or not makes no difference. But they can be made to make a difference. And boy, they can certainly make a difference internally…

Cookies

Cookies are passed back and forth over HTTP using dedicated request and response headers. When a server wants to pass a cookie to the client, it can specify for which particular domain it is valid for and the client will send back cookies to the server only when there is a match of the domain it speaks to and for which domain cookies are set to etc.

The cookie spec RFC 6265 section 5.1.2 defines the host name in a way that makes it ignore trailing dots. Cookies set for a domain with a dot are valid for the same domain without one and vice versa.

SNI

When speaking to a HTTPS server, a client passes on the name of the remote server during the TLS handshake, in the SNI (Server Name Indication) field, so that the server knows which instance the client wants to speak to. What about the trailing dot you think?

The hostname is represented as a byte string using ASCII encoding without a trailing dot.

Meaning, that a HTTPS server cannot – in the TLS layer – make a distinction between a server for “example.com.” and “example.com”. Different hosts for HTTP, the same host for HTTPS.

curl’s dotted past

In the curl project, we – as everyone else – have struggled with the trailing dot over time.

As-is

We started out being mostly oblivious about the effects of the trailing dot and most of the code just treated it as part of the host name and it would be in the host name everywhere. Until one day someone pointed out that the SNI field does not approve of it. We fixed that.

Strip it

In 2014, curl started to always just cut off trailing dots internally if one was provided in the URL. The dot rarely makes a difference, it made the host name work fine with SNI and for HTTPS it is practically difficult to make a difference between them.

Keep it

In 2022, someone found a web site that actually requires a trailing dot in the Host: header to respond correctly and reported it to the curl project.

Sigh. We back-pedaled on the eight years old decision and decided to internally keep the dot in the name, but strip it for the purpose of the SNI field. This seems to be how the browsers are doing it. We released curl 7.82.0 with this change. That site that needed the trailing dot kept in the Host: header could now be retrieved with curl. Yay.

As a bonus curl also lowercases the SNI name field now, because that is what the browsers do even if the spec says the field is supposed to be used case insensitively. That habit has made sure there are servers on the Internet that won’t work properly if the SNI name is not lowercase…

In your face

That “back-peddle” for 7.82.0 when we brought back the dot into the host name field, turned out to be incomplete, but it was not totally obvious nor immediately apparent.

When we brought back the trailing dot into the name field, we accidentally broke several internal name checks.

The checks broke in the cookie handling of domains even though cookies, as mentioned above, are supposed to not care about trailing dots.

To understand this, we have to back up a little bit and talk about how cookies and cookie domains work.

Public Suffixes

Cookies are strange beasts and because the server can tell the client for which domain the cookie applies to, a client needs to check so that the server does not try to set the cookies too broadly or for other domains. It does not stop there, but there is also the concept of something called “Public Suffix List” (PSL), which are known domains for which setting a cookie is not accepted. (This list is also used for limiting other things in browsers but they are out of scope here.) One widely known such domain to mention as an example is “co.uk”. A server should not be allowed to set a cookie for “co.uk” as then it would basically be sent back for every web site that exist in the UK.

The PSL is a maintained list with a huge number of domains in it. To manage those and to make sure tools like curl can check for them in a convenient way, a dedicated library was made for this several years ago: libpsl. curl has optionally used this since 2015.

I said optional

That public suffix list is huge, which is a primary reason why many users still opt to build curl without support for it. This means that curl needs to provide backup functionality for the builds where libpsl is not present. Typically in a lot of embedded systems.

Without knowledge of the PSL curl will not reject cookies for “co.uk” but it should reject cookies for “.uk” or “.com” as even without PSL knowledge it still knows that setting cookies for top-level domains is not okay.

How did the curl check used without PSL verify if the given domain is a TLD only?

It checked – if there is a dot present in the name, then it is not a TLD.

CVE-2022-27779

Axel Chong figured out that for a curl build without PSL knowledge, the server could set a cookie for a TLD if you just made sure to end the name with a dot.

With the 7.82.0 change in place, where curl keeps the trailing dot for the host name, combined with that cookie set for TLD domain with a trailing dot, they have matching tail ends. This means that curl would send cookies to servers that match the criteria. The broken TLD check was benign all those years until we let the trailing dot in. This is security vulnerability CVE-2022-27779.

CVE-2022-30115

It did not stop there. Axel did not stop there. Since curl now keeps the trailing dot in the name and did not do it before, there was a second important string comparison that broke in unexpected ways that Axel figured out and reported. A second vulnerability introduced by the same change.

HSTS is the concept that allows curl to store a “cache” of host names and keep it around, so that if you want to do a subsequent transfer to one of those host names again before they expire, curl will go directly to HTTPS even if HTTP is used in the URL. As a way to avoid the clear-text insecure redirect step some URLs use.

The new treatment of trailing dots, that basically allows users to provide the same host name in two different ways and yet resolve to the exact same addresses exposed that the HSTS code did take care of (ignored) the trailing dot properly. If you let curl store HSTS info for the host name without a trailing dot, you can then later bypass the HSTS by using the same host name with a trailing dot. Or vice versa. This is security vulnerability CVE-2022-30115.

alt-svc

The code for alt-svc also needed adjustment for the dot, but fortunately that was “just” a bug and had not security impact.

All these three separate areas in which trailing dots caused problems have been fixed in curl 7.83.1 and all of them are now tested and verified with an extended set of tests to make sure they keep handle the dots correctly.

Someone called it a dot release.

Is this the end of dot problems?

I don’t know but it seems unlikely. The trailing dots have kept on haunting us since a long time by now so I would say the chances are big that there are both some more flaws lingering and some future changes pending. That then can make the cycle take another loop or two.

I suppose we will find out. Stay tuned!

curl 7.83.1 it burns

Welcome to this patch release of curl, shipped only 14 days since the previous version. We decided to cut the release cycle short because of the several security vulnerabilities that were pointed out. See below for details. There are no new features added in this release.

It burns. Mostly in our egos.

Release video

Numbers

the 208th release
0 changes
14 days (total: 8,818)

41 bug-fixes (total: 7,857)
65 commits (total: 28,573)
0 new public libcurl function (total: 88)
0 new curl_easy_setopt() option (total: 295)

0 new curl command line option (total: 247)
20 contributors, 6 new (total: 2,632)
13 authors, 3 new (total: 1,030)
6 security fixes (total: 121)
Bug Bounties total: 22,660 USD

Security

Axel Chong reported three issues, Harry Sintonen two and Florian Kohnhäuser one. An avalanche of security reports. Let’s have a look.

curl removes wrong file on error

CVE-2022-27778 reported a way how the brand new command line options remove-on-error and no-clobber when used together could end up having curl removing the wrong file. The file that curl was told not to clobber actually.

cookie for trailing dot TLD

CVE-2022-27779 is the first of two issues this time that identified a problem with how curl handles trailing dots since the 7.82.0 version. This flaw lets a site set a cookie for a TLD with a trailing dot that then might have curl send it back for all sites under that TLD.

percent-encoded path separator in URL host

In CVE-2022-27780 the reporter figured out how to abuse curl URL parser and its recent addition to decode percent-encoded host names.

CERTINFO never-ending busy-loop

CVE-2022-27781 details how a malicious server can trick curl built with NSS to get stuck in a busy-loop when returning a carefully crafted certificate.

TLS and SSH connection too eager reuse

CVE-2022-27782 was reported and identifies a set of TLS and SSH config parameters that curl did not consider when reusing a connection, which could end up in an application getting a reused connection for a transfer that it really did not expected to.

HSTS bypass via trailing dot

CVE-2022-30115 is very similar to the cookie TLD one, CVE-2022-27779. A user can make curl first store HSTS info for a host name without a trailing dot, and then in subsequent requests bypass the HSTS treatment by adding the trailing dot to the host name in the URL.

Bug-fixes

The security fixes above took a lot of my efforts this cycle, but there were a few additional ones I could mention.

urlapi: address (harmless) UndefinedBehavior sanitizer warning

In our regular attempts to remove warnings and errors, we fixed this warning that was on the border of a false positive. We want to be able to run with sanitizers warning-free so that every real warning we get can be treated accordingly.

gskit: fixed bogus setsockopt calls

A set of setsockopt() calls in the gskit.c backend was fond to be defective and haven’t worked since their introduction several years ago.

define HAVE_SSL_CTX_SET_EC_CURVES for libressl

Users of the libressl backend can now set curves correctly as well. OpenSSL and BoringSSL users already could.

x509asn1: make do_pubkey handle EC public keys

The libcurl private asn1 parser (used for some TLS backends) did not have support for these before.