On March 16 2022, the curl security team received an email in which the reporter highlighted an Apple web page. What can you tell us about this?
I hadn’t seen it before. On this page with the title “About the security content of macOS Monterey 12.3”, said to have been published just two days prior, Apple mentions recent package upgrades and the page lists a bunch of products and what security fixes that were done for them in this update. Among the many products listed, curl is mentioned.
This is what the curl section of the page looked like:
In the curl project we always make all CVEs public with as much detail as we can possibly extract and provide about them. We take great pride in being the best in class in security flaw information and transparency.
Apple listed four CVE fixed. The three first IDs we immediately recognized from the curl security page. The last one however, was a surprise. What was that?
This is not a CVE published by the curl project. The curl project has in fact not shipped any CVE at all in 2022 (yet) so that’s easy to spot. When we looked at the MITRE registration for the ID, it also didn’t disclose any clues really. Not that it was expected to. It did show it was created on January 5 though, so it wasn’t completely new.
Was it a typo?
I compared this number to other recent CVE numbers announced from curl and I laid eyes on CVE-2021-22923 which had just two digits changed. Did they perhaps mean that CVE?
The only “problem” with that CVE is that it was in regards to Metalink and I don’t think Apple ever shipped their curl package with metalink support so therefore they wouldn’t have fixed a Metalink problem. So probably not a typo for that number at least!
I reached out to a friend at Apple as well with an email to Apple Product Security.
Security is our number one priority
In the curl project, we take security seriously. The news that there might be a security problem in curl that we haven’t been told about and that looks like it was about to get public sooner or later was of course somewhat alarming and something we just needed to get to the bottom of. It was also slightly disappointing that a large vendor and packager of curl since over 20 years would go about it this way and jab this into our back.
No source code
Apple has not made the source code for their macOS 12.3 version and the packages they use in there public, so there was no way for us to run diffs or anything to check for the exact modifications that this claimed fix would’ve resulted in.
Apple said so
Several “security websites” (the quotes are there to indicate that clearly these sites are more security in the name than in reality) immediately posted details about this “vulnerability”. Some of them with CVSS scores and CWE numbers , explaining how this problem can hurt users. Obviously completely made up since none of that info was made available by any first party sources anywhere. Not from Apple and not from the curl project. If you now did a web search on that CVE number, several of the top search results linked to such sites providing details – obviously made up from thin air.
As I think these sites don’t add much value to humanity, I won’t link to them here but instead I will show you a screenshot from such an article to show you what a made up CVE number posted by Apple can make people claim:
At 23:28 (my time zone) on the 17th, my Apple friend responded saying they had forwarded the issue to “the right team”.
The Apple Product Security team I also emailed about this issue, answered at 00:23 (still my time) on the 18th saying “we are looking into this and will provide an update soon when we have more information.”
The MITRE page got more details
After the weekend passed with no response, I looked back again on the MITRE page for the CVE in question and it had then gotten populated with additional curl details; mentioning Apple as CNA and now featuring links back to the Apple page! Now it really started to look like the CVE was something real that Apple (or someone) had registered but not told us about. It included real curl related snippets like this:
Multiple issues were addressed by updating to curl version 7.79.1. This issue is fixed in macOS Monterey 12.3. Multiple issues in curl.
Please tell us more details
On Monday the 21st, I continued to get questions about this CVE. Among others, from a member of a major European ISP’s CERT team curious about this CVE as they couldn’t find any specific information about this issue either and they were concerned they might have this vulnerability in the curl versions they run. They of course (rightfully) assumed that I would know about curl CVEs.
It turns out that when a major company randomly mentions a new CVE, it actually has an impact on the world!
At around 20:30 on March 21st, someone on Twitter spotted that the ghost CVE had been removed from Apple’s web page and it only listed three issues (and a mention that the section had been updated). At 21:39 I get an email response from Apple Product Security:
Thank you for reaching out to us about the error with this CVE on our security advisory. We’ve updated our site and requested that MITRE reject CVE-2022-22623 on their end.
Please let us know if you have any questions.
The reject request to MITRE is expected to be slow so that page will remains showing the outdated data for a while longer.
When Apple had retracted the wrong CVE, I figured I should maybe try to get exploitone.com to remove their “article” to maybe at least stop one avenue of further misinformation about this curl “issue”. I tweeted (in perhaps a tad bit inflammatory manner):
I get the feeling they didn’t quite understand my point. They replied:
As I had questions about Apple’s mishap, I replied (sent off 22:28 on the 21st, still only early afternoon on the US west coast), asking for details on what exactly had happened here. If it was a typo, then how come it got registered with MITRE? It’s just so puzzling and mysterious!
When I’m posting this article on my blog (36 hours after I sent the question), I still haven’t gotten any response or explanation. I don’t expect to get any either, but if I do, I will update this post accordingly.
Update March 26
exploitone.com updated their page at some point after my tweet to remove the mention of the imaginary CVE, but the wording remains very odd:
I’ve talked on this topic before but I realized I never did a proper blog post on the topic. So here it is: how we develop curl to keep it safe. The topic of supply chain security is one that is discussed frequently these days and every so often there’s a very well used (open source) component that gets a terrible weakness revealed.
Don’t get me wrong. Proprietary packages have their share of issues as well, and probably even more so, but for obvious reasons we never get the same transparency, details and insight into those problems and solutions.
If we would find a critical vulnerability in curl, it could potentially exist in every internet-connected device on the globe. We don’t want that.
A critical security flaw in our products would be bad, but we also similarly need to make sure that we provide APIs and help users of our products to be safe and to use curl safely. To make sure users of libcurl don’t accidentally end up getting security problems, to the best of our ability.
In the curl project, we work hard to never have our own version of a “heartbleed moment“. How do we do this?
Our method is not strange, weird or innovative. We simply apply all best practices, tools and methods that are available to us. In all areas. As we go along, we tighten the screws and improve our procedures, learning from past mistakes.
There are no short cuts or silver bullets. Just hard work and running tools.
Not a coincidence
Getting safe and secure code into your product is not something that happens by chance. We need to work on it and we need to make a concerned effort. We must care about it.
We all know this and we all know how to do it, we just need to make sure that we also actually do it.
Write code following the rules
Review written code and make sure it is clear and easy to read.
Test the code. Before and after merge
Verify the products and APIs to find cracks
Bug-bounty to reward outside helpers
Act on mistakes – because they will happen
For users of libcurl we provide an API with safe and secure defaults as we understand the power of the default. We also document everything with details and take great pride in having world-class documentation. To reduce the risk of applications becoming unsafe just because our API was unclear.
We also document internal APIs and functions to help contributors write better code when improving and changing curl.
We don’t allow compiler warnings to remain – on any platform. This is sometimes quite onerous since we build on such a ridiculous amount of systems.
We encourage use of source code comments and assert()s to make assumptions obvious. (curl is primarily written in C.)
All code should be reviewed. Maintainers are however allowed to review and merge their own pull-requests for practical reasons.
Code should be easy to read and understand. Our code style must be followed and encourages that: for example, no assignments in conditions, one statement per line, no lines longer than 80 columns and more.
Strict compliance with the code style also means that the code gets a flow and a consistent look, which makes it easier to read and manage. We have a tool that verifies most aspects of the code style, which takes away most of that duty away from humans. I find that PR authors generally take code style remarks better when pointed out by a tool than when humans do it.
A source code change is accompanied with a git commit message that need to follow the template. A consistent commit message style makes it easier to later come back and understand it proper when viewing source code history.
We want everything tested.
Unit tests. We strive at writing more and more unit tests of internal functions to make sure they truly do what expected.
System tests. Do actual network transfers against test servers, and make sure different situations are handled.
Integration tests. Test libcurl and its APIs and verify that they handle what they are expected to.
Documentation tests. Check formats, check references and cross-reference with source code, check lists that they include all items, verify that all man pages have all sections, in the same order and that they all have examples.
“Fix a bug? Add a test!” is a mantra that we don’t always live up to, but we try.
curl runs on 80+ operating systems and 20+ CPU architectures, but we only run tests on a few platforms. This usually works out fine because most of the code is written to run on multiple platforms so if tested on one, it will also run fine on all the other.
curl has a flexible build system that offers many million different build combinations with over 30 different possible third-party libraries in countless version combinations. We cannot test all build combos, but we try to test all the popular ones and at least one for each config option enabled and disabled.
We have many tests, but there are unfortunately still gaps and details not tested by the test suite. For those things we simply have to rely on the code review and then that users report problems in the shipped products.
We run all the tests using valgrind to make sure nothing leaks memory or do bad memory accesses.
We build and run with address, undefined behavior and integer overflow sanitizers.
We are part of the OSS-Fuzz project which fuzzes curl code non-stop, and we run CIFuzz in CI builds, which runs “a little” fuzzing on the curl code in the normal pull-request process.
We do “torture testing“: run a test case once and count the number of “fallible” function calls it makes. Those are calls to memory allocation, file operations, socket read/write etc. Then re-run the test that many times, and for each new iteration we make another one of the fallible functions fail and return error. Verify that no memory leaks or crashes occur. Do this on all tests.
We use several different static code analyzers to scan the code checking for flaws and we always fix or otherwise handle every reported defect. Many of them for each pull-request and commit, some are run regularly outside of that process:
The exact set has varied and will continue to vary over time as services come and go.
No matter how hard we try, we still ship bugs and mistakes. Most of them of course benign and harmless but some are not. We run a bug-bounty program to reward security searchers real money for reported security vulnerabilities found in curl. Until today, we have paid almost 17,000 USD in total and we keep upping the amounts for new findings.
When we report security problems, we produce detailed and elaborate advisories to help users understand every subtle detail about the problem and we provide overview information that shows exactly what versions are vulnerable to which problems. The curl project aims to also be a world-leader in security advisories and related info.
Act on mistakes
We are not immune, no matter how hard we try. Bad things will happen. When they do, we:
Own the problem, responsibly
Fix it and announce it – as soon as possible
Learn from it
Make it harder to do the same or similar mistakes again
Does it work? Do we actually learn from our history of mistakes? Maybe. Having our product in ten billion installations is not a proof of this. There are some signs that might show we are doing things right:
We were reporting fewer CVEs/year the last few years but in 2021 we went back up. It could also be the result of more people looking, thanks to the higher monetary rewards offered. At the same time the number of lines of code have kept growing at a rate of around 6,000 lines per year.
We get almost no issues reported by OSS-Fuzz anymore. The first few years it ran it found many problems.
We are able to increase our bug-bounty payouts significantly and now pay more than one thousand USD almost every time. We know people are looking hard for security bugs.
For every pull-request and commit done in the project, we run about 100 different builds + test rounds.
Done using several different CI services for maximum performance, widest possible coverage and shortest time to completion.
We currently use the following CI services: Cirrus CI, AppVeyor, Azure Pipelines, GitHub Actions, Circle CI and Zuul CI.
We also have a separate autobuild system with systems run by volunteers that checkout the latest code, build, run all the tests and report back in a continuous manner a few times or maybe once per day.
New habits past mistakes have taught us
We have done several changes to curl internals as direct reactions to past security vulnerabilities and their root causes. Lessons learned.
Unified dynamic buffer functions
These days we have a family of functions for working with dynamically sized buffers. Be using the same set for this functionality we have it well tested and we reduce the risk that new code messes up. Again, nothing revolutionary or strange, but as curl had grown organically over the decades, we found ourselves in need of cleaning this up one day. So we did.
Maximum string sizes
Several past mistakes came from possible integer overflows due to libcurl accepting input string sizes of unrestricted lengths and after doing operations on such string sizes, they would sometimes lead to overflows.
Since a few years back now, no string passed to curl is allowed to be larger than eight megabytes. This limit is somewhat arbitrarily set but is meant to be way larger than the largest user names and passwords ever used etc. We could also update the limit in a future, should we want. It’s not a limit that is exposed in the API or even mentioned. It is there to trap mistakes and malicious use.
Thanks to the previous points we now avoid realloc as far as possible outside of those functions. History shows that realloc in combination with integer overflows have been troublesome for us. Now, both reallocs and integer overflows should be much harder to mess up.
A few years ago we ran code coverage reports for one build combo on one platform. This generated a number that really didn’t mean a lot to anyone but instead rather mislead users to drawing funny conclusions based on the report. We stopped that. Getting a “complete” and representative number for code coverage for curl is difficult and nobody has yet gone back to attempt this.
The impact of security problems
Every once in a while someone discovers a security problem in curl. To date, those security vulnerabilities have been limited to certain protocols and features that are not used by everyone and in many cases even disabled at build-time among many users. The issues also often rely on either a malicious user to be involved, either locally or remotely and for a lot of curl users, the environments it runs in limit that risk.
To date, I’m not aware of any curl user, ever, having been seriously impacted by a curl security problem.
This is not a guarantee that it will not ever happen. I’m only stating facts about the history so far. Security is super hard and I can only promise that we will keep working hard on shipping secure products.
Is it scary?
Changes done to curl code today will end up in billions of devices within a few years. That’s an intimidating fact that could truly make you paralyzed by fear of the risk that the world will “burn” due to a mistake of mine.
Rather than instilling fear by this outlook, I think the proper way to think it about it, is respecting the challenge and “shouldering the responsibility”. Make the changes we deem necessary, but make them according to the guidelines, follow the rules and trust that the system we have setup is likely to detect almost every imaginable mistake before it ever reaches a release tarball. Of course we plug holes in the test suite that we spot or suspect along the way.
The back-door threat
I blogged about that recently. I think a mistake is much more likely to slip-in and get shipped to the world than a deliberate back-door is.
Memory safe components might help
By rewriting parts of curl to use memory safe components, such as hyper for HTTP, we might be able to further reduce the risk of future vulnerabilities. That’s a long game to make reality. It will also be hard in the future to actually measure and tell for sure if it truly made an impact.
How can you help out?
Pay for a curl support contract. This is what enables me to work full time on curl.
Help out with reviews and adding new tests to curl
Help out with fixing issues and improving the code
An unexpected or undocumented feature in a piece of computer software, included as a joke or a bonus.
There are no Easter eggs in curl. For the good.
I’ve been asked about this many times. Among the enthusiast community, people seem to generally like the concept of Easter eggs and hidden treasures, features and jokes in software and devices. Having such an embedded surprise is considered fun and curl being a cool and interesting project should be fun too!
With the risk of completely ruining my chances of ever being considered a fun person, I’ll take you through my thought process on why curl does not feature any such Easter eggs and why it will not have any in the future either.
The primary and main reason is the question of trust.
We deliver products with known and documented functionality. Everything is known and documented. There’s nothing secret or hidden. The users see it all, can learn it all and it all is documented. We are always 100% transparent.
curl is installed in some ten billion installations to date and we are doing everything we can to be responsible and professional to make sure curl can and will be installed in many more places going forward.
Having an Easter egg in curl would violate several of the “commandments” we live by. If we could hide an Easter egg, what else is there that we haven’t shown or talked about?
Everything in curl needs to be scrutinized, poked at, “tortured” and reviewed for security. An Easter egg would as well, as otherwise it would be an insecure component and therefor a security risk. This makes it impossible to maintain an Easter egg even almost secret.
Adding code to perform an Easter egg would mean adding code that potentially could cause problems to users by the plain unexpected nature of an Easter egg. Unexpected behavior is not a good foundation for security and secure procedures.
Boring is good
curl is not meant to be “fun” (on that fun scale). curl is here to perform its job, exactly as documented and expected and it is not meant to be fun. Boring is good and completely predictable. Boring is to deliver nothing else than the expected.
Even more security
If we would add an Easter egg, which by definition would be a secret or surprise to many, it would need to be hidden or sneaked in somehow and remain undocumented. We cannot allow code or features to get “snuck in” or remain undocumented.
If we would allow some features to get added like that, where would we draw the line? What other functionality and code do we merge into curl without properly disclosing and documenting it?
If we would allow an Easter egg to get merged, we would soon start getting improvements to the egg code and people would like to add more eggs and to change the existing one. We would spend time and effort on the silly parts and we would need to spend testing and energy on these jokes instead of the real thing. We already have enough work without adding irrelevant work to the pile.
“Unintended Easter eggs”
We frequently ship bugs and features that go wrong. Due to fluke or random accidents, some of those mistakes can perhaps at times almost appear as Easter eggs, if you try hard. Still, when they are not done on purpose they are just bugs – not Easter eggs – and we will fix them as soon as we get them reported and have the chance.
Yes, some readers will take this denial as a sign that there actually exists an Easter egg in curl and I am just doing my best to hide it. My advice to you, if you are one of those thinking this, is to read the code. We all benefit if more people read and carefully investigate the code so we will just be happy if you do and then ask us about whatever you think is unclear or “suspicious”.
I am not judging
This is not a judgement on projects that ship Easter eggs. I respect and acknowledge that different projects and people resonate differently on these topics.
In a world that is now gradually adopting HTTP/3 (which, as you know, is implemented over QUIC), the problem with the missing API for QUIC is still a key problem.
There are a number of existing QUIC library implementation now since a few years back, and they are slowly maturing. The QUIC protocol became RFC 9000 and friends, but the most popular TLS libraries still don’t provide the necessary APIs to make QUIC libraries possible to use them.
Example that makes people want HTTP/3
For a long time, many people and projects (including yours truly) in the QUIC community were eagerly following the OpenSSL Pull Request 8797, which introduced the necessary QUIC APIs into OpenSSL. This change brought the same API to OpenSSL that BoringSSL already provides and as such the API has already been used and tested out by several independent implementations.
Implementations have a problem to ship to the world based on BoringSSL since that’s a TLS library without versions and proper releases, so it is not a good choice for the big wide world. OpenSSL is already the most widely used TLS library out there and lots of applications are already made to use that.
Delays made quictls happen
The OpenSSL PR8797 was delayed back in February 2020 on when the OpenSSL management committee (OMC) decreed that they would not deal with that PR until after their pending 3.0.0 release had shipped.
“It is our expectation that once the 3.0 release is done, QUIC will become a significant focus of our effort.”
OpenSSL then proceeded and their 3.0.0 release was delayed significantly compared to their initial time schedule.
In March 2021, Microsoft and Akamai announcedquictls, an OpenSSL fork with the express idea to ship OpenSSL + the QUIC API. They didn’t want to wait for OpenSSL to do it.
Several QUIC libraries can now use quictls. quictls has kept their fork up to date and now offers the equivalent of OpenSSL 3.0.0 + the QUIC API.
While we’ve been waiting for OpenSSL to adopt the API.
OpenSSL makes a turn instead
Then came the next blow to everyone’s expectations. An autumn surprise. On October 13, the OpenSSL OMC announces:
The focus for the next releases is QUIC, with the objective of providing a fully functional QUIC implementation over a series of releases (2-3).
OpenSSL has decided to implement a complete QUIC stack on their own and with the given time line it sounds like it will take them a few years (?) to ship. And instead of providing the API lots of implementers have been been waiting for so long, they explicitly say that it is a non-goal at the start:
The MVP will not contain a library API for an HTTP/3 implementation (it is a non-goal of the initial release).
I didn’t write my own QUIC implementation but I’ve followed the work of several of the implementations fairly closely and it is fairly complicated journey they set out for themselves – for very unclear reasons. There already exist several high quality QUIC libraries, why does OpenSSL think they need to make yet another one? They seem to be overloaded with work already before, which the long delays of the 3.0.0 release seemed to show, how are they going to be able to add a complete new stack implementation of top of this? The future will tell.
On October 20 2021, the pull request that was created in April 2019, is finally closed for real as a “won’t fix”.
Where are we now?
The lack of a QUIC API in OpenSSL has held us back and with this move from OpenSSL, it will continue to hold us back for an uncertain amount of time going forward.
QUIC stacks will have to stick to using or switching to other libraries.
James Snell, one of the key contributors on the QUIC and HTTP/3 work in nodejs tweeted:
Back in February of this year, I received one of the most chilling emails in my life when someone threatened my life: “I will slaughter you“.
That email penetrated something deep into my soul and heart and I was not quite myself for a few days.
Almost six month later (a few days ago), I received another email from a person who claims to be this “Al Nocai” from that first threat. I read the beginning of this email several times before I actually could make myself continue past the opening paragraphs. Slightly worried what to find, with the memories from the last time coming back.
This email however, says “an apology” in the subject. I believe this email is written by the same guy.
The long and full apology is inserted below. The gist of it is: he claims to have been the victim of all sorts of bad stuff by several people named “Dan” and I got lumped in there because of my first name. He (?) also says he suffers from schizophrenia.
I’m publishing this partly because I got a lot of attention when I made the initial threat public so I figure it could be interesting for some to learn about this development. I can of course not verify that this is in fact the same person nor will I even attempt to verify any of his many claims of wrongdoings against him.
I’m happy for “Al” that he’s getting help and tries to move on. For me, this apology at least finally proves that this threat is over and in fact never was intended literally. I hope I will never receive anything close to that again.
The apology in full
I am Al Nocai. When I contacted you initially, I believed you to be a Dan E., from texas, or a Dan S from delaware or a Dan from Minneapolis. I didn’t do my research, and when I found it was actually you and you had nothing to do with my situation, I became indignant and even more of an asshole. You had every right to be mad, and publish as you did. I’m not trying to justify what I did, there is none, I should have been a lot more cordial. I just want to provide context around what was happening, I believe I at least owe you why.
I had to retire from my career do to schizophrenia. Again, I should have not let my delusions go to the point they did nor should I have acted the way it does. My illness doesn’t detract from the rashness of my actions.
I, at the time, lost my defense project after getting hacked through the Department of Veterans Affairs in the US. It was a few years into development, and it was meant to be a pathway to get homeless veterans off the streets. I was trying to develop a “trade-route” in tech. At the time I was very lucid. I was volunteering regularly in my community and I had a great family life. In October 2020, there was an attempt on my life which left me bludgeoned and near death with 2 orbital fractures. These event led me to uncover a massive money laundering operation and then had ties elsewhere. I then got hacked. When I say hacked, I lost every device. They rooted my charge arbitrator, I was bios bonded and I basically lost every document and all my software.
The people who did it gloated to me about it to me through linkedin. Trying to take my computers back, I ended up QAing a lot of their malware. This led to be being whaled for months. People attempted to blackmail me, they stole my identity and I was lied to about what I was doing as the people approaching me hid their relations to the people who initially hacked me.
These stressors where the predicates to my psychotic break. My federal correspondence about the hacking was re-routed. I had people impersonating microsoft employees. I even had a $50 billion dollar mortgage servicer trying to sue me over tweets. I also had firms like outpost 24 doing MiTM attacks on me, which is also a factor as I didnt realize at first where all my EU Certs were coming from in my Active directory.
Last, just to give you a clearer picture at how much I was monitored: I ended up not being able to take anything back in Windows. They were in my Windows Registry directly from install. Finding what I did, I believe I misattributed a good amount of sloppy programming to malicious behavior due to stress and paranoia.
So why did I believe you to be someone else? All the people responsible for these actions, well most of them were named Dan. They were mostly ex-google employees. Which is why my results changed and I ended up with about 19k various porn accounts in my name. This, they were also posing as random people named Dan (i.e. when IO tried telling Convercent, Microsoft’s auditor) about the issues. They were named Dan, and they were fraudulent.
So when I found what I did, and as well maintained as I did, I jumped the gun and assumed you were, again, one of the individuals I had whaling me. And at that point, after going to every agency I could, state and local, about the issue, I was a vicious dog backed into a corner. And I also at the time didn’t mean I’d physically harm you. I meant I was going to keep taking out your site.
Again, this doesn’t excuse my behavior. And in the end after all I tried to do to report everything that was happening. I ended up making the people who whaled me a lot of money. And I ended up still losing everything, but I also hurt and was a real asshole to a lot of people that had nothing to do with anything.
In the end, I had a hard time realizing what I did and the curl reason was a piss poor one. Again, I was just being an indignant ass. Especially to someone who actually reported security issues.
Sorry it took me a long time to write this. I should have apologized right after I knew I was wrong. I apologize for that mistake as well.
This amends is also a part of my getting better. So I apologize if this angered you. I just needed to make sure I tell you that I was wrong and I should have had betterjudgement.
When you use the name localhost in a URL, what does it mean? Where does the network traffic go when you ask curl to download http://localhost ?
Is “localhost” just a name like any other or do you think it infers speaking to your local host on a loopback address?
The name was “resolved” using the standard resolver mechanism into one or more IP addresses and then curl connected to the first one that works and gets the data from there.
The (default) resolving phase there involves asking the getaddrinfo() function about the name. In many systems, it will return the IP address(es) specified in /etc/hosts for the name. In some systems things are a bit more unusually setup and causes a DNS query get sent out over the network to answer the question.
In other words: localhost was not really special and using this name in a URL worked just like any other name in curl. In most cases in most systems it would resolve to 127.0.0.1 and ::1 just fine, but in some cases it would mean something completely different. Often as a complete surprise to the user…
Starting in commit 1a0ebf6632f8, to be released in curl 7.78.0, curl now treats the host name “localhost” specially and will use an internal “hard-coded” set of addresses for it – the ones we typically use for the loopback device: 127.0.0.1 and ::1. It cannot be modified by /etc/hosts and it cannot be accidentally or deliberately tricked by DNS resolves. localhost will now always resolve to a local address!
Does that kind of mistakes or modifications really happen? Yes they do. We’ve seen it and you can find other projects report it as well.
Who knows, it might even be a few microseconds faster than doing the “full” resolve call.
(You can still build curl without IPv6 support at will and on systems without support, for which the ::1 address of course will not be provided for localhost.)
Specs say we can
The RFC 6761 is titled Special-Use Domain Names and in its section 6.3 it especially allows or even encourages this:
Users are free to use localhost names as they would any other domain names. Users may assume that IPv4 and IPv6 address queries for localhost names will always resolve to the respective IP loopback address.
Name resolution APIs and libraries SHOULD recognize localhost names as special and SHOULD always return the IP loopback address for address queries and negative responses for all other query types. Name resolution APIs SHOULD NOT send queries for localhost names to their configured caching DNS server(s).
Mike West at Google also once filed an I-D with even stronger wording suggesting we should always let localhost be local. That wasn’t ever turned into an RFC though but shows a mindset.
(Some) Browsers do it
Chrome has been special-casing localhost this way since 2017, as can be seen in this commit and I think we can safely assume that the other browsers built on their foundation also do this.
Firefox landed their corresponding change during the fall of 2020, as recorded in this bugzilla entry.
Safari (on macOS at least) does however not do this. It rather follows what /etc/hosts says (and presumably DNS of not present in there). I’ve not found any official position on the matter, but I found this source code comment indicating that localhost resolving might change at some point:
Since some time back, Windows already resolves “localhost” internally and it is not present in their /etc/hosts alternative. I believe it is more of a hybrid solution though as I believe you can put localhost into that file and then have that custom address get used for the name.
Secure over http://localhost
When we know for sure that http://localhost is indeed a secure context (that’s a browser term I’m borrowing, sorry), we can follow the example of the browsers and for example curl should be able to start considering cookies with the “secure” property to be dealt with over this host even when done over plain HTTP. Previously, secure in that regard has always just meant HTTPS.
This change in cookie handling has not happened in curl yet, but with localhost being truly local, it seems like an improvement we can proceed with.
Can you still trick curl?
When I mentioned this change proposal on twitter two of the most common questions in response were
can’t you still trick curl by routing 127.0.0.1 somewhere else
can you still use --resolve to “move” localhost?
The answers to both questions are yes.
You can of course commit the most hideous hacks to your system and reroute traffic to 127.0.0.1 somewhere else if you really wanted to. But I’ve never seen or heard of anyone doing it, and it certainly will not be done by mistake. But then you can also just rebuild your curl/libcurl and insert another address than the default as “hardcoded” and it’ll behave even weirder. It’s all just software, we can make it do anything.
The --resolve option is this magic thing to redirect curl operations from the given host to another custom address. It also works for localhost, since curl will check the cache before the internal resolve and --resolve populates the DNS cache with the given entries. (Provided to applications via the CURLOPT_RESOLVE option.)
What will break?
With enough number of users, every single little modification or even improvement is likely to trigger something unexpected and undesired on at least one system somewhere. I don’t think this change is an exception. I fully expect this to cause someone to shake their fist in the sky.
However, I believe there are fairly good ways to make to restore even the most complicated use cases even after this change, even if it might take some hands on to update the script or application. I still believe this change is a general improvement for the vast majority of use cases and users. That’s also why I haven’t provided any knob or option to toggle off this behavior.
The top photo was taken by me (the symbolism being that there’s a path to take somewhere but we don’t really know where it leads or which one is the right to take…). This curl change was written by me. Mike West provided me the Chrome localhost change URL. Valentin Gosu gave me the Firefox bugzilla link.
The official publication date of the relevant QUIC specifications is: May 27, 2021.
I’ve done many presentations about HTTP and related technologies over the years. HTTP/2 had only just shipped when the QUIC working group had been formed in the IETF and I started to mention and describe what was being done there.
I’ve explained HTTP/3
I started writing the document HTTP/3 explained in February 2018 before the protocol was even called HTTP/3 (and yeah the document itself was also called something else at first). The HTTP protocol for QUIC was just called “HTTP over QUIC” in the beginning and it took until November 2018 before it got the name HTTP/3. I did my first presentation using HTTP/3 in the title and on slides in early December 2018, My first recorded HTTP/3 presentation was in January 2019 (in Stockholm, Sweden).
In that talk I mentioned that the protocol would be “live” by the summer of 2019, which was an optimistic estimate based on the then current milestones set out by the IETF working group.
I think my optimism regarding the release schedule has kept up but as time progressed I’ve updated that estimation many times…
HTTP/3 – not yet
The first four RFC documentations to be ratified and published only concern QUIC, the transport protocol, and not the HTTP/3 parts. The two HTTP/3 documents are also in queue but are slightly delayed as they await some other prerequisite (“generic” HTTP update) documents to ship first, then the HTTP/3 ones can ship and refer to those other documents.
QUIC is a new transport protocol. It is done over UDP and can be described as being something of a TCP + TLS replacement, merged into a single protocol.
Okay, the title of this blog is misleading. QUIC is actually documented in four different RFCs:
RFC 9002 – QUIC Loss Detection and Congestion Control
My role: I’m just a bystander
I initially wanted to keep up closely with the working group and follow what happened and participate on the meetings and interims etc. It turned out to be too difficult for me to do that so I had to lower my ambitions and I’ve mostly had a casual observing role. I just couldn’t muster the energy and spend the time necessary to do it properly.
I’ve participated in many of the meetings, I’ve been present in the QUIC implementers slack, I’ve followed lots of design and architectural discussions on the mailing list and in GitHub issues. I’ve worked on implementing support for QUIC and h3 in curl and thanks to that helped out iron issues and glitches in various implementations, but the now published RFCs have virtually no traces of me or my feedback in them.
In the curl project we make great efforts to store a lot of meta data about each and every vulnerability that we have fixed over the years – and curl is over 23 years old. This data set includes CVE id, first vulnerable version, last vulnerable version, name, announce date, report to the project date, CWE, reward amount, code area and “C mistake kind”.
We also keep detailed data about releases, making it easy to look up for example release dates for specific versions.
All this, combined with my fascination (some would call it obsession) of graphs is what pushed me into creating the curl project dashboard, with an ever-growing number of daily updated graphs showing various data about the curl projects in visual ways. (All scripts for that are of course also freely available.)
What to show is interesting but of course it is sometimes even more important how to show particular data. I don’t want the graphs just to show off the project. I want the graphs to help us view the data and make it possible for us to draw conclusions based on what the data tells us.
The worst bugs possible in a project are the ones that are found to be security vulnerabilities. Those are the kind we want to work really hard to never introduce – but we basically cannot reach that point. This special status makes us focus a lot on these particular flaws and we of course treat them special.
For a while we’ve had two particular vulnerability graphs in the dashboard. One showed the number of fixed issues over time and another one showed how long each reported vulnerability had existed in released source code until a fix for it shipped.
CVE age in code until report
The CVE age in code until report graph shows that in general, reported vulnerabilities were introduced into the code base many years before they are found and fixed. In fact, the all time average time suggests they are present for more than 2,700 – more than seven years. Looking at the reports from the last 12 months, the average is even almost 1000 days more!
It takes a very long time for vulnerabilities to get found and reported.
When were the vulnerabilities introduced
Just the other day it struck me that even though I had a lot of graphs already showing in the dashboard, there was none that actually showed me in any nice way at what dates we created the vulnerabilities we spent so much time and effort hunting down, documenting and talking about.
I decided to use the meta data we already have and add a second plot line to the already existing graph. Now we have the previous line (shown in green) that shows the number of fixed vulnerabilities bumped at the date when a fix was released.
Added is the new line (in red) that instead is bumped for every date we know a vulnerability was first shipped in a release. We know the version number from the vulnerability meta data, we know the release date of that version from the release meta data.
This all new graph helps us see that out of the current 100 reported vulnerabilities, half of them were introduced into the code before 2010.
Using this graph it also very clear to me that the increased CVE reporting that we can spot in the green line started to accelerate in the project in 2016 was not because the bugs were introduced then. The creation of vulnerabilities rather seem to be fairly evenly distributed over time – with occasional bumps but I think that’s more related to those being particular releases that introduced a larger amount of features and code.
As the average vulnerability takes 2700 days to get reported, it could indicate that flaws landed since 2014 are too young to have gotten reported yet. Or it could mean that we’ve improved over time so that new code is better than old and thus when we find flaws, they’re more likely to be in old code paths… I don’t think the red graph suggests any particular notable improvement over time though. Possibly it does if we take into account the massive code growth we’ve also had over this time.
The green “fixed” line at least has a much better trend and growth angle.
Present in which releases
As we have the range of vulnerable releases stored in the meta data file for each CVE, we can then add up the number of the flaws that are present in every past release.
Together with the release dates of the versions, we can make a graph that shows the number of reported vulnerabilities that are present in each past release over time, in a graph.
You can see that some labels end up overwriting each other somewhat for the occasions when we’ve done two releases very close in time.
tldr: the level of HTTP/3 support in servers is surprisingly high.
The specifications are all done. They’re now waiting in queues to get their final edits and approvals before they will get assigned RFC numbers and get published as such – they will not change any further. That’s a set of RFCs (six I believe) for various aspects of this new stack. The HTTP/3 spec is just one of those. Remember: HTTP/3 is the application protocol done over the new transport QUIC. (See http3 explained for a high-level description.)
The HTTP/3 spec was written to refer to, and thus depend on, two other HTTP specs that are in the works: httpbis-cache and https-semantics. Those two are mostly clarifications and cleanups of older HTTP specs, but this forces the HTTP/3 spec to have to get published after the other two, which might introduce a small delay compared to the other QUIC documents.
The working group has started to take on work on new specifications for extensions and improvements beyond QUIC version 1.
In early April 2021, the usage of QUIC and HTTP/3 in the world is measured by a few different companies.
netray.io scans the IPv4 address space weekly and checks how many hosts that speak QUIC. Their latest scan found 2.1 million such hosts.
Arguably, the netray number doesn’t say much. Those two million hosts could be very well used or barely used machines.
HTTP/3 by w3techs
w3techs.com has been in the game of scanning web sites for stats purposes for a long time. They scan the top ten million sites and count how large share that runs/supports what technologies and they also check for HTTP/3. In their data they call the old Google QUIC for just “QUIC” which is confusing but that should be seen as the precursor to HTTP/3.
What stands out to me in this data except that the HTTP/3 usage seems very high: the top one-million sites are claimed to have a higher share of HTTP/3 support (16.4%) than the top one-thousand (11.9%)! That’s the reversed for HTTP/2 and not how stats like this tend to look.
It has been suggested that the growth starting at Feb 2021 might be explained by Cloudflare’s enabling of HTTP/3 for users also in their free plan.
HTTP/3 by Cloudflare
On radar.cloudflare.com we can see Cloudflare’s view of a lot of Internet and protocol trends over the world.
This HTTP/3 number is significantly lower than w3techs’. Presumably because of the differences in how they measure.
All the major browsers have HTTP/3 implementations and most of them allow you to manually enable it if it isn’t already done so. Chrome and Edge have it enabled by default and Firefox will so very soon. The caniuse.com site shows it like this (updated on April 4):
(Earlier versions of this blog post showed the previous and inaccurate data from caniuse.com. Not anymore.)
curl supports HTTP/3 since a while back, but you need to explicitly enable it at build-time. It needs to use third party libraries for the HTTP/3 layer and it needs a QUIC capable TLS library. The QUIC/h3 libraries are still beta versions. See below for the TLS library situation.
curl’s HTTP/3 support is not even complete. There are still unsupported areas and it’s not considered stable yet.
curl supports 14 different TLS libraries at this time. Two of them have QUIC support landed: BoringSSL and GnuTLS. And a third would be the quictls OpenSSL fork. (There are also a few other smaller TLS libraries that support QUIC.)
The by far most popular TLS library to use with curl, OpenSSL, has postponed their QUIC work:
At the same time they have delayed the OpenSSL 3.0 release significantly. Their release schedule page still today speaks of a planned release of 3.0.0 in “early Q4 2020”. That plan expects a few months from the beta to final release and we have not yet seen a beta release, only alphas.
Realistically, this makes QUIC in OpenSSL many months off until it can appear even in a first alpha. Maybe even 2022 material?
The Google powered OpenSSL fork BoringSSL has supported QUIC for a long time and provides the OpenSSL API, but they don’t do releases and mostly focus on getting a library done for Google. People outside the company are generally reluctant to use and depend on this library for those reasons.
The quiche QUIC/h3 library from Cloudflare uses BoringSSL and curl can be built to use quiche (as well as BoringSSL).
Microsoft and Akamai have made a fork of OpenSSL available that is based on OpenSSL 1.1.1 and has the QUIC pull-request applied in order to offer a QUIC capable OpenSSL flavor to the world before the official OpenSSL gets their act together. This fork is called quictls. This should be compatible with OpenSSL in all other regards and provide QUIC with an API that is similar to BoringSSL’s.
The ngtcp2 QUIC library uses quictls. curl can be built to use ngtcp2 as well as with quictls,
Is HTTP/3 faster?
I realize I can’t blog about this topic without at least touching this question. The main reason for adding support for HTTP/3 on your site is probably that it makes it faster for users, so does it?
We’ve seen other numbers say h3 is faster shown before but it’s hard to find up-to-date performance measurements published for the current version of HTTP/3 vs HTTP/2 in real world scenarios. Partly of course because people have hesitated to compare before there are proper implementations to compare with, and not just development versions not really made and tweaked to perform optimally.
I think there are reasons to expect h3 to be faster in several situations, but for people with high bandwidth low latency connections in the western world, maybe the difference won’t be noticeable?
I’ve previously shown the slide below to illustrate what needs to be done for curl to ship with HTTP/3 support enabled in distros and “widely” and I think the same works for a lot of other projects and clients who don’t control their TLS implementation and don’t write their own QUIC/h3 layer code.
This house of cards of h3 is slowly getting some stable components, but there are still too many moving parts for most of us to ship.
I assume that the rest of the browsers will also enable HTTP/3 by default soon, and the specs will be released not too long into the future. That will make HTTP/3 traffic on the web increase significantly.
The QUIC and h3 libraries will ship their first non-beta versions once the specs are out.
The TLS library situation will continue to hamper wider adoption among non-browsers and smaller players.
The big players already deploy HTTP/3.
I’ve updated this post after the initial publication, and the biggest corrections are in the Chrome/Edge details. Thanks to immediate feedback from Eric Lawrence. Remaining errors are still all mine! Thanks also to Barry Pollard who filed the PR to update the previously flawed caniuse.com data.
curl is an internet transfer engine. A rather modular one too. Parts of curl’s functionality is provided by selectable alternative implementations that we call backends. You select what backends to enable at build-time and in many cases the backends are enabled and powered by different 3rd party libraries.
curl has a range of such alternative backends for various features:
International Domain Names
HTTP content encoding
Stable API and ABI
Maintaining a stable API and ABI is key to libcurl. As long as those promises are kept, changing internals such as switching between backends is perfectly fine.
The API is the armored front door that we don’t change. The backends is the garden on the back of the house that we can dig up and replant every year if we want, without us having to change the front door.
Already back in 2005 we added support for using an alternative TLS library in curl when we added support for GnuTLS in addition to OpenSSL, and since then we’ve added many more. We do this by having an internal API through which we do all the TLS related things and for each third party library we support we have code that does the necessary logic to connect the internal API with the corresponding TLS library.
Today, we merged support for yet another TLS library: rustls. This is a TLS library written in rust and it has a C API provided in a separate project called crustls. Strictly speaking, curl is built to use crustls.
This is still early days for the rustls backend and it is not yet feature complete. There’s more work to do and polish to apply before we can think of it as a proper competitor to the already established and well-used TLS backends, but with this merge it makes it much easier for more people to help out and test it out. Feel free and encouraged to join in!
We count this addition as the 14th concurrently supported TLS library in curl. I’m not aware of any other project, anywhere, that supports more or even this many TLS libraries.
The TLS library named mesalink is actually already using rustls, but under an OpenSSL API disguise and we support that since a few years back…
The TLS backend code for rustls was written and contributed by Jacob Hoffman-Andrews.