msh3 as the third h3 backend

With the newly merged support for the msh3 library, curl now supports no fewer than three different HTTP/3 backends. It was merged into curl’s git repository on April 10.

When you build curl, you have the option to build it with HTTP/3 support enabled. The HTTP/3 support in curl is still considered experimental so it is still not enabled by default.

The HTTP/3 support in curl depends on the presence and support from third party libraries. You need to select and enable a specific HTTP/3 backend when you build curl. It has previously been doing HTTP/3 using either quiche or ngtcp2 + nghttp3. Starting now, there is yet another option to consider: the msh3 library.

The msh3 library itself uses msquic for doing QUIC. This is a multi-platform library that uses Schannel for TLS on Windows and OpenSSL/quictls on other platforms. The Schannel part probably makes this solution particularly interesting for curl users on Windows.

More steel

I didn’t expect this, and this year I wasn’t asked ahead of time if I wanted to receive this gift. It is however something of a collector’s item that I find very enjoyable.

I received my GitHub contribution matrix printed in steel. This is my 2021 contribution skyline.

The thing is surprisingly heavy and sturdy. If I had papers lying around, this would be an awesome paper press.

You might remember that I got a similar gift last year, so it felt natural to do a comparison shot of 2021 and 2020.

For 2020, GitHub counted 2,466 contributions, while I reached 2,543 in 2021. Very similar numbers, but clearly distributed very differently. The two matrix images look like this.

2020

2021

Letter

Enclosed with this gift was also a friendly and encouraging letter.

Oh, and you can of course also see a rendered version in your browsers or download it in STL format so that you can print using your own 3d-printer.

Thank you, all my friends at GitHub!

Talked curl on software engineering radio

I was invited to the podcast and talked to host Gavin Henry for over an hour.

What it’s been like to look after the curl project for the past 25 years. We talked about the history of cURL, libcurl, whether C was the right choice, portability, some key events in those 25 years, implementing protocols, why HTTP is not so simple, rust libs, the Polhem Prize, security issues, feature requests, random support requests, code on Mars, Apple OS adoption, cars stuck in production lines, Android OS, 8 week release cycles, release cycle joy, breakdown of bug types, 1000 committers, 250 command line options, user bases, determination, json, libSSH2, c-ares, HTTPbis, HTTP/2, QUIC, Mozilla, OpenSSL, wolfSSL, DNS, FTP, the cURL book, testing, CI/CD, favorite command line options that you might not know about, and making sure that you don’t give up on that idea or project you are working on.

Listen to it

This busy-loop is not a security issue

One of the toughest jobs I have is to assess whether a reported security problem is indeed an actual security vulnerability or “just” a bug. Let me take you through a recent case to give you an insight…

Some background

curl is 24 years old and so far in our history we have registered 111 security vulnerabilities in curl. I’ve sided with the “security vulnerability” side in reported issues 111 times. I’ve taken the opposite stance many more times.

Over the last two years, we have received 129 reports about suspected security problems and less than 15% of them (17) were eventually deemed actual security vulnerabilities. In the other 112 cases, we ended up concluding that the report was not pointing out a curl security problem. In many of those 112 cases, it was far from easy to end up with that decision and in several instances the reporter disagreed with us. (But sure, in the majority of the cases we could fairly quickly conclude that the reports were completely bonkers.)

The reporter’s view

Many times, the reporter that reports a security bug over on Hackerone has spent a significant amount of time and effort to find it, research it, reproduce it and report it. The reporter thinks it is a security problem and there’s a promised not totally insignificant monetary reward for such problems. Not to mention that a found and reported vulnerability in curl might count as something of a feat and a “feather in the hat” for a security researcher. The reporter has an investment in this work and a strong desire to have their reported issue classified as a security vulnerability.

The project’s view

If the reported problem is a security problem then we must consider it as that and immediately work on fixing the issue to reduce the risk of users getting hurt, and to inform all users about the risk and ask them to upgrade or otherwise mitigate and take precautions against the risks.

Most reported security issues are not immediately obvious. At least not in my eyes. I usually need to object, discuss, question and massage the data for a while in order to land on how we should best view the issue. I’m a skeptic by nature and I need to be convinced before I accept it.

Labeling something a “security vulnerability” when it is not one hurts users and the entire community rather than helping them. We must not cry wolf for a problem that cannot hurt users or that in practical terms is impossible to trigger. Or maybe it is a problem that users are already expected to deal with. Or a result of an explicit or implicit application choice rather than a mistake made by us.

But we must not ignore actual security problems!

This latest MQTT problem

On March 24, 2022 we got a new report filed over on hackerone with the title “Denial of Service vulnerability in curl when parsing MQTT server response”.

Here’s (roughly) what the issue is about:

  1. A bug in current libcurl makes it misbehave under certain conditions. When the MQTT connection gets closed mid message, libcurl refuses to acknowledge that and thinks the connection is still alive. Easily triggered by a malicious server.
  2. libcurl considers the connection readable non-stop
  3. Reading from the connection brings no more data
  4. Busy-looping in the event-loop. Goto 2

The loop stops only when the transfer reaches the set timeout, when the progress callback stops it, or when the speed-limit options stop it because the right conditions are met.

By default, none of those options are set for a transfer and therefore, by default this makes an endless busy-loop.
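For the curl command line tool, the stop conditions mentioned above roughly correspond to the --max-time, --speed-limit and --speed-time options. Here is a minimal sketch of a transfer that cannot run forever. The wrapper name bounded_curl, the CURL_BIN override, the chosen limits and the broker URL are all illustrative assumptions, not anything the curl project ships:

```shell
# Hypothetical wrapper: run curl with an upper bound on the whole transfer
# and an abort condition for transfers that stall.
# CURL_BIN is an assumption of this sketch so the binary can be swapped out.
bounded_curl() {
  # --max-time 60: give up after 60 seconds in total
  # --speed-limit 1 --speed-time 30: give up if the average speed stays
  #   below 1 byte/second for 30 seconds
  "${CURL_BIN:-curl}" --max-time 60 --speed-limit 1 --speed-time 30 "$@"
}

# Example use (placeholder broker and topic):
#   bounded_curl mqtt://broker.example.com/some/topic
```

An application or script that always sets limits like these is not among the ones that allow an endless transfer.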

At the same time…

A transfer can always stall and take a very long time to complete. A server can basically always just stop delivering more data, making the transfer take an infinite amount of time to complete. Applications that have not set any options to stop such a transfer risk doing a transfer that never ends. An endless transfer.

Also: if libcurl makes a transfer over a really fast network, such as localhost or using a super fast local network, then it might also reach the same level of busy-loop due to never having to wait for data. Albeit for a limited amount of time – until the transfer is complete. This busy-loop is highly unlikely to actually starve out any important threads in a system.

Yes, a closed connection is a much “cheaper” attack from server’s point of view than maintaining a long-living connection, but the cost of the attack is not a factor here.

Where in this grey area do we land?

This is a difficult one.

I can see the point of the reporter, but I can also see how this flaw will basically not hurt any existing curl user. Where is our responsibility here?

I ended up concluding that this issue is not a security vulnerability. The reporter disagreed.

It is a terribly annoying bug for sure. But the only applications that are seriously affected by it, are the ones that already allow an endless transfer.

The bug-fix was instead submitted as a normal pull-request: PR 8644, targeted to be fixed and included in the pending curl 7.83.0 release.

We publicize the reports after the fact

We make all (non-rubbish) previously reported hackerone issues public, whether they ended up being a vulnerability or not. To give everyone involved time to object or redact sensitive details, the publication date is usually within a month after the issue was closed.

By making the reports public, we give everyone interested enough the chance to check out and follow past discussions and the deliberations behind the directions we took. The idea is primarily to be completely open about the reported issues and how we classify them, to show that we are not hiding anything. It also gives us a chance to get more feedback from the surrounding community and from security people who might disagree with previous analyses.

Security is hard.

What curl expects from dependencies

curl supports a large number of third party libraries. In a build, those libraries become “dependencies”. These components offer functionality and features that we don’t implement ourselves but still have been deemed interesting or even crucial to support to do Internet transfers the way we want.

A curl build done today can use one or more out of 35 different libraries. No build can actually use all of them at once as many are mutually exclusive and most of them only work on one or a subset of platforms.

The green boxes illustrate the third party dependencies curl can use as direct dependencies.

Keeping our backyard tidy

Every now and then we learn that one of the 3rd party libraries we can build curl to use has ceased development, or has in some other way started to decay into a state that we feel is no longer healthy, to the level that we can no longer recommend it to our users. When that happens, we consider dropping support for it.

We do this as a service to our users. If users build curl with a dependency we support, I think we should at least have some rudimentary knowledge that the dependencies we help users to use are not terrible. It’s not a guarantee, but we try. To help strengthen the ecosystem. To sweep our own backyard.

Also, getting rid of old code is good.

The different third party direct dependencies supported by curl by the time of the initial added support. A minus prefix means the support was dropped.

Indirect dependencies

There are of course also indirect dependencies in the form of libraries our direct dependencies use (or even libraries the indirect used libraries use), and we try to also include them in the “package” when we consider dependencies, but especially if they are optional we need to put less attention on them.

What is a healthy dependency?

We have no automatic checks or even fixed set of rules or conditions to help us make this distinction. It would of course be cool to have that, but we don’t.

Ideally, it would be awesome if all dependencies would be top-rated on bestpractices, as that would greatly help us figure this out. But unfortunately too many projects are still not even added to that effort so this doesn’t work – plus we also support a number of proprietary dependencies that can’t be rated there.

Instead, we need to rely on old-fashioned human checks and asking users and maintainers.

Maybe not add it to begin with

We have declined to add functionality to curl in the past just because the proposed 3rd party dependency it would use just didn’t live up to our standards. I don’t mean that we need to raise the bar to ridiculous levels, but if a casual browsing of the 3rd party library found issues and there were not satisfying answers in a reasonable time on how those should be addressed, then that library is probably not ready to be used by curl. There’s no need to “lure” curl users into a possibly bad situation if we can save them from it.

Abandoned

Sometimes work officially stops on a library we support. That’s a strong sign we should also stop.

curl users actually using it

Since curl is being developed, extended and bug-fixed at a fairly high pace, we can be fairly sure that if a dependency is actually being used, it needs to get fixed every now and then to keep up. If support code for a dependency hasn’t been updated or touched for many years, there’s a strong suspicion that there aren’t many users of it in modern curl.

Sometimes that can be verified to be the case when we notice a blatant bug that’s been present in the code for a good while without anyone noticing, but more often we need to ask users. Anyone using this anymore? (Which also is complicated because we often lack connections to users who don’t read any of our mailing lists and generally only upgrade curl once every decade so it might take a while until those users notice changes…)

Releases

A dependency that has stopped making new releases can be a signal that it is on its way downwards. It could also be a sign that it has matured and doesn’t need much more to be done to it.

How do we even know they stopped? Maybe they just take forever from the previous release…

Developer activity

A library that is used by curl is almost required to have some level of developer activity over time. Nobody writes bug-free code unless its scope is razor-sharp-narrow and the project spent a lot of time perfecting it. No commits or developer activity for a long time means that clearly nobody takes care of the bug reports.

Slowly deteriorating projects are probably the hardest to handle. Are they still good enough?

Maintainers ultimately decide

But we just ship source code.

At the end of the day, the people who package curl or libcurl decide which third party libraries actually get used. They are the ones who decide what dependencies users of their build rely on. In many cases this means the maintainers of the curl packages in Linux distros and other operating systems. Manufacturers of devices and tools that use libcurl often build their own and then they can decide and cherry-pick individually between all provided choices.

This makes it possible for such maintainers to add extra conditions and checks and only go with the dependencies they like.

The only binary packages the curl project itself provides, are the ones for Windows, and we try to go with only solid, reliable and conservative choices for those.
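To see which libraries a particular curl build ended up with, you can look at the first line of the curl --version output, which lists the libraries and their versions. A small sketch, where the helper name list_curl_libs and the canned version line are made up for illustration:

```shell
# Hypothetical helper: list the name/version tokens on the first line of
# "curl --version" output read from stdin, one per line.
list_curl_libs() {
  head -n 1 | tr ' ' '\n' | grep '/' | tr -d '()+'
}

# Canned first line; a real one comes from: curl --version | list_curl_libs
printf 'curl 7.82.0 (x86_64-pc-linux-gnu) libcurl/7.82.0 OpenSSL/1.1.1n zlib/1.2.11 nghttp2/1.47.0\n' |
  list_curl_libs
```

The output depends entirely on how the build in question was configured, which is the point: two curl installations with the same version number can rely on very different sets of dependencies.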

Easier header-picking with curl

Okay you might ask, what’s the news here? We’ve been able to get HTTP response headers with curl since virtually the stone age. Yes we have. Get the page and also show the headers:

curl -i https://example.com/

Make a HEAD request and see what headers we get back:

curl -I https://example.com/

Save the response headers in a separate file:

curl -D headers.txt https://example.com/

Get a specific header

This gets a little more complicated but you can always do

curl -I https://example.com/ | grep Date:

This will of course fail if the casing is different, so you need to check for it case insensitively. There might also be another header ending with “date:” that matches, so you need to make sure that this is an exact match:

curl -I https://example.com/ | grep -i ^Date:

Now this shows the entire header, but for most cases you only want the value. So get it with cut:

curl -I https://example.com/ | grep -i ^Date: | cut -d: -f2-

You have the header value extracted now, but the leading and trailing white spaces in the content are probably not what you want in there so let’s strip them as well:

curl -I https://example.com/ | grep -i ^Date: | cut -d: -f2- | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
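To see what such a pipeline has to cope with, here is a small self-contained sketch that runs the same steps on canned input instead of a live server. The function name header_value and the sample headers are made up for illustration. It also strips the carriage return that ends each HTTP/1.x header line, an easy detail to miss:

```shell
# Hypothetical helper: print the value of one header read from stdin.
# Usage: header_value NAME < headers
header_value() {
  tr -d '\r' |          # header lines end in CRLF; drop the CR
    grep -i "^$1:" |    # case-insensitive match anchored at line start
    cut -d: -f2- |      # everything after the first colon
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//'   # trim both ends
}

# Canned response headers standing in for the output of "curl -I"
printf 'HTTP/1.1 200 OK\r\ndate:   Tue, 22 Mar 2022 08:35:21 GMT  \r\nX-Update: no\r\n\r\n' |
  header_value Date
```

Note that the X-Update: line is not picked up, thanks to the anchored match, and that the colons inside the time stamp survive because cut keeps every field from the second one onwards.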

There are of course many different ways you can do this operation and some of them are more clever than the methods I’ve used here. They are still often more or less convoluted and error-prone.

If we imagine that this is a fairly common use case for curl users in the world, then this kind of operation is found duplicated in quite a few scripts, applications and devices in the world.

Maybe we could make this easier for curl users?

A headers API

The other day we introduced a new experimental headers API to libcurl. Using this API, an application using libcurl gets an easy to use API to extract individual or several headers and their content.

As curl is such a libcurl-using application, we have expanded it to make use of this new API and this brings some new fun features to the curl tool.

Let me emphasize that since this API is labeled experimental it is not enabled in a default build. You need to explicitly enable it!

Get a single header, the new way

I decided to extend the -w output feature for this.

To extract a single header, get the value with leading and trailing spaces trimmed, use %header{name}. To repeat the operation from above and get the Date: header

curl -I -w '%header{date}' https://example.com/

‘date’ in this example is a case insensitive header name without the trailing colon and you can of course use any header name you please there. If the given header did not actually arrive in the response, it outputs nothing.

If you want more headers output, just repeat the %header{name} construct as many times as you like. If the -w output string gets unwieldy and hard to manage on the command line, then make it into a text file instead and tell -w about it with -w @filename.

curl -I -w @filename https://example.com/
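As an illustration, such a format file could be created like this. The file name headers.fmt and the chosen header names are only examples. The final command assumes a curl build with the headers API enabled and is left commented out since it needs network access:

```shell
# Write a -w format file that prints two response headers, one per line.
# The quoted 'EOF' keeps the shell from expanding anything in the text.
cat > headers.fmt <<'EOF'
date: %header{date}
server: %header{server}
EOF

# Then point -w at the file (requires the experimental headers API):
#   curl -s -I -o /dev/null -w @headers.fmt https://example.com/
```

Keeping the format in a file also sidesteps shell quoting problems with the percent signs and braces.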

Which headers?

There are several different kinds of headers and there can be multiple requests used for a transfer, but this option outputs the “normal” server response headers from the most recent request done. The option only works for HTTP(S) responses.

All headers – as JSON

As dealing with formatted data in the form of JSON has become very popular, I want to help fertilize this by making curl able to output all response headers as a JSON object.

This way, you can move the header handling, parsing and perhaps filtering to your JSON aware tool.

Tell curl to output the received HTTP headers as a JSON object:

curl -o save -w "%{header_json}" https://example.com/

curl itself does not pretty-print this, but if you pass the JSON from curl to a beautifier such as jq, the output ends up looking like this:

{
  "age": [
    "269578"
  ],
  "cache-control": [
    "max-age=604800"
  ],
  "content-type": [
    "text/html; charset=UTF-8"
  ],
  "date": [
    "Tue, 22 Mar 2022 08:35:21 GMT"
  ],
  "etag": [
    "\"3147526947+ident\""
  ],
  "expires": [
    "Tue, 29 Mar 2022 08:35:21 GMT"
  ],
  "last-modified": [
    "Thu, 17 Oct 2019 07:18:26 GMT"
  ],
  "server": [
    "ECS (nyb/1D2E)"
  ],
  "vary": [
    "Accept-Encoding"
  ],
  "x-cache": [
    "HIT"
  ],
  "content-length": [
    "1256"
  ]
}

JSON details

The headers are presented in the same order as received over the wire. Except if there are duplicated header names, as then they are grouped on the first occurrence and all values are provided there as a JSON array.

All headers are arrays just because there can be multiple headers using the same name.

The casing for the header names is kept unmodified from what was received, but for duplicated headers the casing used for the first occurrence will be used in the output.

Update: we lowercase all header names in the JSON output.

The “status line” of an HTTP 1.x response, that first line that says “HTTP/1.1 200 OK” when everything is fine, is not counted as a header by this function and will therefore not be included in this output.
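Since every header is an array keyed by a lowercase name, picking out values with jq is short. A sketch under those assumptions, where the helper name json_header_values and the canned JSON are invented for illustration:

```shell
# Hypothetical helper: print every value of one header from curl's
# %{header_json} output read on stdin. Requires jq.
json_header_values() {
  # Names are lowercase in the JSON, so lowercase the requested name too.
  # A header that is not present yields no output.
  jq -r --arg name "$1" '.[$name | ascii_downcase] // [] | .[]'
}

# Canned JSON standing in for:
#   curl -s -o /dev/null -w '%{header_json}' https://example.com/
sample='{"date":["Tue, 22 Mar 2022 08:35:21 GMT"],"set-cookie":["a=1","b=2"]}'

# Example use:
#   printf '%s' "$sample" | json_header_values Set-Cookie
```

With the sample above, asking for Set-Cookie gives one line per duplicated header, which is exactly the case the array representation is there for.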

Ships in 7.83.0

This feature is present in source code that will ship in curl 7.83.0, scheduled to happen late April 2022. Run your own build with it enabled, or ask your packager to provide an experimental build for you.

With enough positive feedback we should be able to move this out of experimental state fairly quickly.

Anatomy of a ghost CVE

“The Lord giveth and the Lord taketh away.”

Job 1:21

On March 16 2022, the curl security team received an email in which the reporter highlighted an Apple web page and asked: “What can you tell us about this?”

I hadn’t seen it before. On this page with the title “About the security content of macOS Monterey 12.3”, said to have been published just two days prior, Apple mentions recent package upgrades and the page lists a bunch of products and what security fixes that were done for them in this update. Among the many products listed, curl is mentioned.

This is what the curl section of the page looked like:

Screenshot from March 17, 2022

In the curl project we always make all CVEs public with as much detail as we can possibly extract and provide about them. We take great pride in being the best in class in security flaw information and transparency.

Apple listed four fixed CVEs. The first three IDs we immediately recognized from the curl security page. The last one however, was a surprise. What was that?

CVE-2022-22623

This is not a CVE published by the curl project. The curl project has in fact not shipped any CVE at all in 2022 (yet) so that’s easy to spot. When we looked at the MITRE registration for the ID, it also didn’t disclose any clues really. Not that it was expected to. It did show it was created on January 5 though, so it wasn’t completely new.

Was it a typo?

I compared this number to other recent CVE numbers announced from curl and I laid eyes on CVE-2021-22923 which had just two digits changed. Did they perhaps mean that CVE?

The only “problem” with that CVE is that it was in regards to Metalink and I don’t think Apple ever shipped their curl package with metalink support so therefore they wouldn’t have fixed a Metalink problem. So probably not a typo for that number at least!

I reached out to a friend at Apple and also sent an email to Apple Product Security.

Security is our number one priority

In the curl project, we take security seriously. The news that there might be a security problem in curl that we haven’t been told about and that looks like it was about to get public sooner or later was of course somewhat alarming and something we just needed to get to the bottom of. It was also slightly disappointing that a large vendor and packager of curl for over 20 years would go about it this way and jab this into our back.

No source code

Apple has not made the source code for their macOS 12.3 version and the packages they use in there public, so there was no way for us to run diffs or anything to check for the exact modifications that this claimed fix would’ve resulted in.

Apple said so

Several “security websites” (the quotes are there to indicate that clearly these sites are more security in the name than in reality) immediately posted details about this “vulnerability”. Some of them with CVSS scores and CWE numbers, explaining how this problem can hurt users. Obviously completely made up since none of that info was made available by any first party sources anywhere. Not from Apple and not from the curl project. If you now did a web search on that CVE number, several of the top search results linked to such sites providing details – obviously made up from thin air.

As I think these sites don’t add much value to humanity, I won’t link to them here but instead I will show you a screenshot from such an article to show you what a made up CVE number posted by Apple can make people claim:

Screenshot from exploitone.com

At 23:28 (my time zone) on the 17th, my Apple friend responded saying they had forwarded the issue to “the right team”.

The Apple Product Security team I also emailed about this issue, answered at 00:23 (still my time) on the 18th saying “we are looking into this and will provide an update soon when we have more information.”

The MITRE page got more details

The MITRE CVE page from March 21st

After the weekend passed with no response, I looked back again on the MITRE page for the CVE in question and it had then gotten populated with additional curl details; mentioning Apple as CNA and now featuring links back to the Apple page! Now it really started to look like the CVE was something real that Apple (or someone) had registered but not told us about. It included real curl related snippets like this:

Multiple issues were addressed by updating to curl version 7.79.1. This issue is fixed in macOS Monterey 12.3. Multiple issues in curl.

Please tell us more details

On Monday the 21st, I continued to get questions about this CVE. Among others, from a member of a major European ISP’s CERT team curious about this CVE as they couldn’t find any specific information about this issue either and they were concerned they might have this vulnerability in the curl versions they run. They of course (rightfully) assumed that I would know about curl CVEs.

It turns out that when a major company randomly mentions a new CVE, it actually has an impact on the world!

Gone!

At around 20:30 on March 21st, someone on Twitter spotted that the ghost CVE had been removed from Apple’s web page and it only listed three issues (and a mention that the section had been updated). At 21:39 I got an email response from Apple Product Security:

Thank you for reaching out to us about the error with this CVE on our security advisory. We’ve updated our site and requested that MITRE reject CVE-2022-22623 on their end.

Please let us know if you have any questions.

Screenshot from March 21, 2022

The reject request to MITRE is expected to be slow so that page will keep showing the outdated data for a while longer.

Exploit one

When Apple had retracted the wrong CVE, I figured I should maybe try to get exploitone.com to remove their “article” to maybe at least stop one avenue of further misinformation about this curl “issue”. I tweeted (in perhaps a tad bit inflammatory manner):

I get the feeling they didn’t quite understand my point. They replied:

What happened?

As I had questions about Apple’s mishap, I replied (sent off 22:28 on the 21st, still only early afternoon on the US west coast), asking for details on what exactly had happened here. If it was a typo, then how come it got registered with MITRE? It’s just so puzzling and mysterious!

When I’m posting this article on my blog (36 hours after I sent the question), I still haven’t gotten any response or explanation. I don’t expect to get any either, but if I do, I will update this post accordingly.

Update March 26

exploitone.com updated their page at some point after my tweet to remove the mention of the imaginary CVE, but the wording remains very odd:

A headers API for libcurl

For many years we’ve had this outstanding idea to add a new API to libcurl that would offer applications easy access to HTTP response headers.

Applications could already retrieve the headers using existing methods, but that requires them to write a callback and to do a certain amount of parsing and “understanding” of HTTP. We always felt that was a little unfortunate, a bit error-prone on behalf of the applications, and perhaps also a thing that forced a lot of applications out there to write the same kind of extra function logic.

If libcurl provides this functionality, it would remove a lot of (duplicated) code from a lot of applications.

Designing the API

We started this process a while ago when I first wrote down a basic approach to an API for this and sent it off to the curl-library mailing list for feedback and critique.

/* first take */
char *curl_easy_header(CURL *easy,
                       const char *name);

The conversation that followed that first plea for help, made me realize that my first proposal had been far too basic and it wouldn’t at all work to satisfy the needs and use cases we could think of for this API.

Try again

I went back to mull over what I’ve learned and update my design proposal, trying to take the feedback into account in the best possible way. A few weeks later, I returned with a “proposal v2” and again I asked for comments and opinions on what I had put together.

/* second shot */
CURLHcode curl_easy_header(CURL *easy,
                           const char *name,
                           size_t index,
                           struct curl_header **h);

As I had already adjusted the API based on feedback the first time around, the feedback this time did not call for changes as big or radical as it did the first time. I could adapt my proposal to what people asked and suggested. We arrived at something that seemed like a pretty solid API for offering HTTP headers to applications.

Let’s do this

As the API proposal feedback settled down and the interface felt good and sensible, I decided it was time for me to write up a first implementation so that we could offer code to people and give everyone a chance to try out the API in real life as well. It’s one thing to give feedback on a “paper product”; actually being able to use it and try it in an application is way better. I dove in.

The final take

When the code worked to the level that I could start extracting the first headers with the API, it proved that we needed to adjust the API a little more, so I did. I then ran into more questions and thoughts about specifics that we hadn’t yet dealt with or nailed down properly in the discussions up to that point, and I took some questions back to the curl community. This became an iterative process and we smoothed out questions about how to access different header “sources” as well as how to deal with multiple headers and “request sequences”. All supported now.

/* final version */
CURLHcode curl_easy_header(CURL *easy,
                           const char *name,
                           size_t index,
                           unsigned int origin,
                           int request,
                           struct curl_header **h);

Multiple headers

This API allows applications to extract all headers from a previous transfer. It can get one or many headers when there are duplicates, which is how Set-Cookie: headers commonly arrive.

Sources

The application can ask for “normal” headers, for trailers (that arrive after the body), headers associated with the CONNECT request (if one was performed), pseudo headers (that might arrive when HTTP/2 or HTTP/3 is used) or headers associated with an HTTP 1xx “intermediate” response.

Multiple responses

The libcurl APIs typically work on transfers, and a single transfer may end up performing multiple HTTP requests. This happens primarily when redirects are followed, but it can also be due to other reasons. This header API therefore allows the caller to extract headers from the entire “chain” of requests a previous transfer was made with.

EXPERIMENTAL

This API is initially merged (in this commit) labeled “experimental” to be included in the upcoming 7.83.0 release. The experimental label means a few different things to us:

  • The API is disabled by default in the build and you need to explicitly ask for it with --enable-headers-api when you run configure
  • There are no ABI and API promises for these functions yet. We might change the functions based on feedback before we remove the label.
  • We strongly discourage anyone from shipping experimentally labeled functions in production.
  • We rely on people to enable and test this and provide feedback, to give us confidence enough to remove the experimental label as soon as possible.

We use the experimental “route” to lower the bar for merging new stuff, so that we get some extra chances to fix up mistakes before the rules and API are carved in stone and we are set to support that for a life time.

This setup relies on users actually trying out the experimental stuff, as otherwise it isn’t a method for improving the API; it will only delay its introduction to the general public. And it risks becoming less good.

Documentation

The two new functions have detailed man pages: curl_easy_header and curl_easy_nextheader. If there is anything missing or unclear in there, let us know!

I have also created an initial example source snippet showing header API use. See headerapi.c.

This API deserves its own little section in the everything curl book, but I think I will wait for it to get landed “for real” before I work on adding that.

Trevlig Mjukvara

It was a while since I last spoke Swedish on a podcast. I joined the friendly hosts Sebastian and Alex of the Trevlig Mjukvara (translates to something like “Nice Software”) podcast and we talked software development, open source, curl, Mozilla and a few other topics for an hour. I had a great time. (We had Jitsi act up on us more than once so we had to switch away from it mid-recording!)

Listen to Trevlig Mjukvara s10e04. In Swedish!

I’ve also participated in a lot of other podcasts over the years.
