The InterPlanetary File System (IPFS) is according to the Wikipedia description: “a protocol, hypermedia and file sharing peer-to-peer network for storing and sharing data in a distributed file system.”. It works a little like bittorrent and you typically access content on it using a very long hash in an ipfs:// URL. Like this:
I guess partly because IPFS is a rather new protocol not widely supported by many clients yet, people came up with the concept of IPFS “gateways”. (Others might have called it a proxy, because that is what it really is.)
This gateway is an HTTP server that runs on a machine that also knows how to speak and access IPFS. By sending the right HTTP request to the gateway, that includes the IPFS hash you are interested in, the gateway can respond with the contents from the IPFS hash.
This way, you just need an ordinary HTTP client to access IPFS. I think it is pretty clever. But as always, the devil is in the details.
Rewrite ipfs:// into https://
The quest for IPFS aficionados have seemingly become to add support for IPFS using this gateway approach to multiple widely used applications that know how to speak HTTP(S). Just rewrite the URL internally from IPFS to HTTPS.
The IPFS “URL rewrite” is done in such so that the example IPFS URL is converted into “https://$gateway/$hash”. The transfer is done by the client as if it was a plain old HTTPS transfer.
What if the gateway is not local?
This approach has its biggest benefit of course when you can actually use a remote IPFS gateway. I presume most random ordinary users who want to access IPFS does not actually want to download, install and run an IPFS gateway on their machine to use this new power. They might very well appreciate the idea and convenience of accessing a remote IPFS gateway.
After all, the gateways are using HTTPS so at least the transfers are secure, right?
Leading people behind IPFS are even running public IPFS gateways for anyone to use. Both dweb.link and ipfs.io have been mentioned and suggested for default use. Possibly they are the same physical host as they appear to have the same IP addresses (owned by Protocol Labs, one of the big proponents behind IPFS).
In an attempt to make this even easier for users, ffmpeg is made to use a built-in default gateway if none is set by the user.
I would expect very few users to actually have an IPFS gateway set. And even fewer to actually specify a local one.
So, when users want to watch a video using ffmpeg and an ipfs:// URL what happens?
The gateway sees it all
The remote gateway, that is administered by someone, somewhere, gets to see the full incoming request, and all traffic (video, whatever) that is sent back to the client (ffmpeg in this example) goes via the gateway. Sure, the client accesses the gateway via HTTPS so nobody can tamper with the traffic along the way, but the gateway is in full control and can inspect and tamper with the data as much as it likes. And there is no way for the client to know or detect if it is happening.
I am not involved in ffmpeg, nor do I have any insights into how the discussions evolved and were done around this subject, but for curl I have put my foot down and said that we must not blindly use and trust any default remote gateway like this. I believe it is our responsibility to not lure users into this traffic-monitoring setup.
Make sure you trust the gateway you use.
The curl way
curl will probably output an error message if there was no gateway set when an ipfs:// URL is used, and inform the user that it needs a gateway set to function and probably also show a URL for a page that contains more information about what this means and how to specify a gateway.
During the work of making the IPFS support for curl (which I review and comment on, not written or authored myself) I have also learned that some of the IPFS gateways even do regular HTTP 30x redirects to bounce over the client to another gateway.
Meaning: not only do you use and rely a total rando’s gateway on the Internet for your traffic. That gateway might even, on its own discretion, redirect you over to another host. Possibly run somewhere else, monitored by a separate team.
I have insisted, in the PR for ipfs support to curl, that the IPFS URL handling code should not automatically follow such gateway redirects, as I believe that adds even more risk to the user so if a user wants to allow this operation, it should be opt-in.
Imagine rising use of this
I would imagine that the ones hosting a popular default gateway either becomes overloaded and slow, or they need to scale up. If they scale up, they risk to leak traffic even wider. Scaling up also makes the operation more expensive, leading to incentives to make money somehow to finance it. Will the cookie car in the form of a massive data trove perhaps then be used/sold?
Users of these gateways get no promises, no rights, no contracts.
I am not dissing the idea nor implementation of IPFS itself. I think using HTTP gateways to access IPFS is a good idea in general as it makes the network more accessible. It is just a fragile solution that easily misleads users to do things they maybe shouldn’t. So maybe a little too accessible?
On July 1st 2017, exactly five years ago today, the OSS-Fuzz project “adopted” curl into their program and started running fuzz tests against it.
OSS-Fuzz is a project run by Google and they do fuzzing on a large amount of open source projects: OSS-Fuzz aims to make common open source software more secure and stable by combining modern fuzzing techniques with scalable, distributed execution.
That initial adoption of curl into OSS-Fuzz was done entirely by Google themselves and its fuzzing integration was rough and not ideal but it certainly got the ball rolling.
Later in in the fall of 2017, Max Dymond stepped up and seriously improved the curl-fuzzer so that it would better test protocols and libcurl options deeper and to a higher degree. (Max subsequently got a grant from Google for his work.)
Fuzzing is really the next-level thing all projects should have a go at when there is no compiler warning left to fix and the static code analyzers all report zero defects, because fuzzing has a way to find silly mistakes, off-by-ones etc in a way no analyzers can and that is very hard for human reviewers to match.
curl is a network tool and library written in C, it is meant to handle non-trusted data properly and doing it wrongly can have serious repercussions.
For the curl project, OSS-Fuzz has found several silly things and bad code paths especially after that rework Max put in. We have no less than 6 curl CVEs registered in 2017 and 2018 giving OSS-Fuzz (at least partial) credit for their findings.
In the first year or two, OSS-Fuzz truly kept us on our toes and we fixed numerous other flaws as well that weren’t CVE worthy but still bugs. At least 36 separate ones to date.
The beauty of OSS-Fuzz
For a little Open Source project such as curl, this service is great and in case you never worked with it let me elaborate what makes me enjoy it so much.
Google runs all the servers and keeps the service running. We don’t have to admin it, spend energy on maintaining etc.
They automatically pull the latest curl code from our master branch and keep fuzzing it. All day, every day, non-stop.
When they find a problem, we get a bug submitted and an email sent. The bug report they create contains stack traces and a minimized and reproducible test case. In most cases, that means I can download a small file that is often less than a hundred bytes, and run my locally built fuzzer code with this input and see the same issue happen in my end.
The bug they submit is initially closed for the public and only accessible to select team of curl developers.
90 days after the initial bug was filed, it is made public automatically.
The number of false positives are remarkably low, meaning that almost every time the system emails me, it spotted a genuine problem.
Fuzzing per commit
In 2020, the OSS-Fuzz team introduced CIFuzz, which allows us to run a little bit of fuzzing – for a limited time – on a PR as part of the CI pipeline. Using this, we catch the most obvious mistakes even before we merge the code into our master branch!
After the first few years of us fixing bad code paths, and with the introduction of CIFuzz, we nowadays rarely get reports from OSS-Fuzz anymore. And when we do, we have it complain on code that hasn’t been released yet. The number of CVEs detected in curl by OSS-Fuzz has therefore seemingly stopped at 6.
Taking fuzzing further
After five years of fuzzing, we would benefit from extending and improving the “hooks” where and how the fuzzing is made so that we can help it find and reach code paths with junk that it so far hasn’t reached. There are of course still potentially many undetected flaws in curl remaining because of the limited set of entry point we have for the fuzzer.
I hope OSS-Fuzz continues and lives on for a long time. It certainly helps curl a lot!
Saturday June 18: I had some curl time in the afternoon and I was just about to go edit the four security advisories I had pending for the next release, to brush up the language and check that they read fine, when it dawned on me.
These particular security advisories were still in draft versions but maybe 90% done. There were details, like dates and links to current in-progress patches, left to update. I also like to reread them a few times, especially in a webpage rendered format, to make sure they are clear and accurate in describing the problem, the solution and all other details, before I consider them ready for publication.
I checked out my local git branch where I expected the advisories to reside. I always work on pending security details in a local branch named security-next-release or something like that. The branch and its commits remain private and undisclosed until everything is ready for publication.
(I primarily use git command lines in terminal windows.)
The latest commits in my git log output did not show the advisories so I did a rebase but git promptly told me there was nothing to rebase! Hm, did I use another branch this time?
It took me a few seconds to realize my mistake. I saw four commits in the git master branch containing my draft advisories and then it hit me: I had accidentally pushed them to origin master and they were publicly accessible!
The secrets I was meant to guard until the release, I had already mostly revealed to the world – for everyone who was looking.
In retrospect I can’t remember exactly how the mistake was done, but I clearly committed the CVE documents in the wrong branch when I last worked on them, a little over a week ago. The commit date say June 9.
On June 14, I got a bug report about a problem with curl’s .well-known/security.txt file (RFC 9116) where it was mentioned that our file didn’t have an Expires: keyword in spite of it being required in the spec. So I fixed that oversight and pushed the update to the website.
When doing that push, I did not properly verify exactly what other changes that would be pushed in the same operation, so when I pressed enter there, my security advisories that had accidentally been committed in the wrong branch five days earlier and still were present there were also pushed to the remote origin. Swooosh.
The advisories are created in markdown format, and anyone who would update their curl-www repository after June 14 would then get them into their local repository. Admittedly, there probably are not terribly many people who do that regularly. Anyone could also browse them through the web interface on github. Also probably not something a lot of people do.
These pending advisories would however not appear on the curl website since the build files were not updated to generate the HTML versions. If you could guess the right URL, you could still get the markdown version to show on the site.
Nobody reported this mistake in the four days they were visible before I realized my own mistake (and nobody has reported it since either). I then tried googling the CVE numbers but no search seemed to find and link to the commits. The CVE numbers were registered already so you would mostly get MITRE and other vulnerability database listings that were still entirely without details.
After some quick deliberations with my curl security team friends, we decided expediting the release was the most sensible thing to do. To reduce the risk that someone takes advantage of this and if they do, we limit the time window before the problems and their fixes become known. For curl users security’s sake.
Previously, the planned release date was set to July 1st – thirteen days away. It had already been adjusted somewhat to not occur on its originally intended release Wednesday to cater for my personal summer plans.
To do a proper release with several security advisories I want at least a few days margin for the distros mailing list to prepare before we go public with everything. There was also the Swedish national midsummer holiday coming up next weekend and I did not feel like ruining my family’s plans and setup for that, so I picked the first weekday after midsummer: June 27th.
While that is just four days earlier than what we had previously planned, I figure those four days might be important and if we imagine that someone finds a way to exploit one of these problems before then, then at least we shorten the attack time window by four days.
When working with my security advisories, I must pay more attention and be more careful with which branch I commit to.
When pushing commits to the website and I know I have pending security sensitive details locally that have not been revealed yet, I should make it a habit to double-check that what I am about to push is only and nothing but what I expect to be there.
Simultaneously, I have worked using this process for many years now and this is the first time I did this mistake. I do not think we need to be alarmist about it.
Trailing dots on host names in URLs is the gift that keeps on giving.
Let me take you through a dwindling story of how the dot is handled differently in different places through the stack of an Internet client. The evil trailing dot.
When a given host name is to be resolved to an IP address on a networked computer, there are dedicated functions to use. The host name example.com resolves to a number of IP addresses.
If you add a dot to the end of that host name, it does not change what is resolved. “example.com.” resolves to the same set of addresses as “example.com” does. (But putting two dots at the end will make it fail.)
The reason why it works like this is based on how DNS is built up with different “labels” (that when written in text are separated with dots) and then having a trailing dot is just an empty final label, just as with no dot. So, in the DNS protocol there are no trailing dots so to speak. When trying two dots at the end, it makes a zero-length label between them and that is not allowed.
People accustomed to fiddle with DNS are used to ending Fully Qualified Domain Names (FQDN) with a trailing dot.
In addition to the name actually being resolved (sent to a DNS resolver), native resolver functions usually puts a meaning and a semantic difference between resolving “hello” and “hello.” (with or without the trailing dot). The trailing dot then means the name is to be used actually exactly only like that, it is specified in full, while the name without a trailing dot can be tried with a domain name appended to it. Or even a list of domain names, until one resolves. This makes people want to use a trailing dot at times, to avoid that domain test.
HTTP clients that want to work with a given URL needs to extract the name part from the URL and use that name to:
resolve the host name to a list of IP addresses to connect to
pass that name in the Host: or :authority: request headers, so that the HTTP server knows which specific server the clients speaks to – as it may run multiple servers on the same IP address
The HTTP spec says the name in the Host header should be used verbatim from the URL; the trailing dot should be included if it was present in the URL. This allows a server to host different content for “example.com” and “example.com.”, even if many servers will by default treat them as the same. Some hosts will just redirect the dot version to the non-dot. Some hosts will return error.
The HTTP client certainly connects to the same set of addresses for both.
For a lot of HTTP traffic, having the trailing dot there or not makes no difference. But they can be made to make a difference. And boy, they can certainly make a difference internally…
Cookies are passed back and forth over HTTP using dedicated request and response headers. When a server wants to pass a cookie to the client, it can specify for which particular domain it is valid for and the client will send back cookies to the server only when there is a match of the domain it speaks to and for which domain cookies are set to etc.
The cookie spec RFC 6265 section 5.1.2 defines the host name in a way that makes it ignore trailing dots. Cookies set for a domain with a dot are valid for the same domain without one and vice versa.
When speaking to a HTTPS server, a client passes on the name of the remote server during the TLS handshake, in the SNI (Server Name Indication) field, so that the server knows which instance the client wants to speak to. What about the trailing dot you think?
The hostname is represented as a byte string using ASCII encoding without a trailing dot.
Meaning, that a HTTPS server cannot – in the TLS layer – make a distinction between a server for “example.com.” and “example.com”. Different hosts for HTTP, the same host for HTTPS.
curl’s dotted past
In the curl project, we – as everyone else – have struggled with the trailing dot over time.
We started out being mostly oblivious about the effects of the trailing dot and most of the code just treated it as part of the host name and it would be in the host name everywhere. Until one day someone pointed out that the SNI field does not approve of it. We fixed that.
In 2014, curl started to always just cut off trailing dots internally if one was provided in the URL. The dot rarely makes a difference, it made the host name work fine with SNI and for HTTPS it is practically difficult to make a difference between them.
In 2022, someone found a web site that actually requires a trailing dot in the Host: header to respond correctly and reported it to the curl project.
Sigh. We back-pedaled on the eight years old decision and decided to internally keep the dot in the name, but strip it for the purpose of the SNI field. This seems to be how the browsers are doing it. We released curl 7.82.0 with this change. That site that needed the trailing dot kept in the Host: header could now be retrieved with curl. Yay.
As a bonus curl also lowercases the SNI name field now, because that is what the browsers do even if the spec says the field is supposed to be used case insensitively. That habit has made sure there are servers on the Internet that won’t work properly if the SNI name is not lowercase…
In your face
That “back-peddle” for 7.82.0 when we brought back the dot into the host name field, turned out to be incomplete, but it was not totally obvious nor immediately apparent.
When we brought back the trailing dot into the name field, we accidentally broke several internal name checks.
The checks broke in the cookie handling of domains even though cookies, as mentioned above, are supposed to not care about trailing dots.
To understand this, we have to back up a little bit and talk about how cookies and cookie domains work.
Cookies are strange beasts and because the server can tell the client for which domain the cookie applies to, a client needs to check so that the server does not try to set the cookies too broadly or for other domains. It does not stop there, but there is also the concept of something called “Public Suffix List” (PSL), which are known domains for which setting a cookie is not accepted. (This list is also used for limiting other things in browsers but they are out of scope here.) One widely known such domain to mention as an example is “co.uk”. A server should not be allowed to set a cookie for “co.uk” as then it would basically be sent back for every web site that exist in the UK.
The PSL is a maintained list with a huge number of domains in it. To manage those and to make sure tools like curl can check for them in a convenient way, a dedicated library was made for this several years ago: libpsl. curl has optionally used this since 2015.
I said optional
That public suffix list is huge, which is a primary reason why many users still opt to build curl without support for it. This means that curl needs to provide backup functionality for the builds where libpsl is not present. Typically in a lot of embedded systems.
Without knowledge of the PSL curl will not reject cookies for “co.uk” but it should reject cookies for “.uk” or “.com” as even without PSL knowledge it still knows that setting cookies for top-level domains is not okay.
How did the curl check used without PSL verify if the given domain is a TLD only?
It checked – if there is a dot present in the name, then it is not a TLD.
Axel Chong figured out that for a curl build without PSL knowledge, the server could set a cookie for a TLD if you just made sure to end the name with a dot.
With the 7.82.0 change in place, where curl keeps the trailing dot for the host name, combined with that cookie set for TLD domain with a trailing dot, they have matching tail ends. This means that curl would send cookies to servers that match the criteria. The broken TLD check was benign all those years until we let the trailing dot in. This is security vulnerability CVE-2022-27779.
It did not stop there. Axel did not stop there. Since curl now keeps the trailing dot in the name and did not do it before, there was a second important string comparison that broke in unexpected ways that Axel figured out and reported. A second vulnerability introduced by the same change.
HSTS is the concept that allows curl to store a “cache” of host names and keep it around, so that if you want to do a subsequent transfer to one of those host names again before they expire, curl will go directly to HTTPS even if HTTP is used in the URL. As a way to avoid the clear-text insecure redirect step some URLs use.
The new treatment of trailing dots, that basically allows users to provide the same host name in two different ways and yet resolve to the exact same addresses exposed that the HSTS code did take care of (ignored) the trailing dot properly. If you let curl store HSTS info for the host name without a trailing dot, you can then later bypass the HSTS by using the same host name with a trailing dot. Or vice versa. This is security vulnerability CVE-2022-30115.
The code for alt-svc also needed adjustment for the dot, but fortunately that was “just” a bug and had not security impact.
All these three separate areas in which trailing dots caused problems have been fixed in curl 7.83.1 and all of them are now tested and verified with an extended set of tests to make sure they keep handle the dots correctly.
Someone called it a dot release.
Is this the end of dot problems?
I don’t know but it seems unlikely. The trailing dots have kept on haunting us since a long time by now so I would say the chances are big that there are both some more flaws lingering and some future changes pending. That then can make the cycle take another loop or two.
I’m not in the US and I’m not a US citizen but I felt I should help out when asked and I was able to.
On April 21 2022, I joined the video meeting together with an OpenSSL and a Tomcat contributor and several members of the board. (I am not naming any names of participants in this post because I have not asked for permission nor do I think the names are important here.)
For about an hour we talked to the board how we develop Open Source, how we take on security problems and how we work on making sure we do things as securely as we can. It was striking how similarly the three of us looked at the issues and how we work in our project, despite our projects all being different and having our own specifics.
As projects, we believe we have pretty well-established and working procedures for getting problems reported and we think we fix the issues fairly swiftly. We ship fixes, advisories and updates not long after the issues get known. The CVE system where we register and publish security vulnerabilities in a global registry is working adequately. (I’m not saying things are perfect.)
The main problem
It was pretty clear to me that we agreed that the biggest problem in the Open Source supply chain today is the slow uptake in patching vulnerable software.
Lots of vendors and products have not been made or have any plans for how to handle upgrades when vulnerabilities are found. Many of those that do act, do that with such glacier like speeds that users of such products remain exposed for attackers for a long period after the flaws are already fixed and have become known.
My own analysis of this is that such vendors of course do this because its the cheapest way. Plain capitalistic reasons.
Addressing this is hard
If we had any easy fixes for this, we would already have them in progress. We were also asked by the board what kind of systems that we would not like to see.
Will Software Bill Of Materials (SBOM) fix this? Maybe it can help, by exposing to the world what software and versions are used in products, but it will certainly depend on how it is used and enforced. If done too heavy-handed, it risks causing overhead and added complications but in the other end it might end up too wishy-washy.
This was just an hour of conversation with a few follow-up clarifying emails. I hope that we were able to provide insights into how Open Source is made but I have no illusions of us changing anything in drastic ways.
I felt honored to represent “my kind” and help sharing knowledge of Open Source to areas of the world that might not always get informed about it.
One of the toughest jobs I have, is to assess if a reported security problem is indeed an actual security vulnerability or “just” a bug. Let me take you through a recent case to give you an insight…
curl is 24 years old and so far in our history we have registered 111 security vulnerabilities in curl. I’ve sided with the “security vulnerability” side in reported issues 111 times. I’ve taken the opposite stance many more times.
Over the last two years, we have received 129 reports about suspected security problems and less than 15% of them (17) were eventually deemed actual security vulnerabilities. In the other 112 cases, we ended up concluding that the report was not pointing out a curl security problem. In many of those 112 cases, it was far from easy to end up with that decision and in several instances the reporter disagreed with us. (But sure, in the majority of the cases we could fairly quickly conclude that the reports were completely bonkers.)
The reporter’s view
Many times, the reporter that reports a security bug over on Hackerone has spent a significant amount of time and effort to find it, research it, reproduce it and report it. The reporter thinks it is a security problem and there’s a promised not totally insignificant monetary reward for such problems. Not to mention that a found and reported vulnerability in curl might count as something of a feat and a “feather in the hat” for a security researcher. The reporter has an investment in this work and a strong desire to have their reported issue classified as a security vulnerability.
The project’s view
If the reported problem is a security problem then we must consider it as that and immediately work on fixing the issue to reduce the risk of users getting hurt, and to inform all users about the risk and ask them to upgrade or otherwise mitigate and take precautions against the risks.
Most reported security issues are not immediately obvious. At least not in my eyes. I usually need to object, discuss, question and massage the data for a while in order to land on how we should best view the issue. I’m a skeptic by nature and I need to be convinced before I accept it.
Labeling something a “security vulnerability” if it indeed is not, is rather hurting users and the entire community rather than helping it. We must not cry wolf for a problem that cannot hurt users or that in practical terms is impossible to occur. Or maybe it is a problem that users are already expected to deal with. Or a result of an explicit or implicit application choice rather than a mistake done by us.
A bug in current libcurl makes it misbehave under certain conditions. When the MQTT connection gets closed mid message, libcurl refuses to acknowledge that and thinks the connection is still alive. Easily triggered by a malicious server.
libcurl considers the connection readable non-stop
Reading from the connection brings no more data
Busy-looping in the event-loop. Goto 2
The loop stops only once it reaches the set timeout, the progress callback can stop it and the speed-limit options will stop it if the right conditions are met.
By default, none of those options are set for a transfer and therefore, by default this makes an endless busy-loop.
At the same time…
A transfer can always stall and take a very long time to complete. A server can basically always just stop delivering more data, making the transfer take an infinite amount of time to complete. Applications that have not set any options to stop such a transfer risk doing a transfer that never ends. An endless transfer.
Also: if libcurl makes a transfer over a really fast network, such as localhost or using a super fast local network, then it might also reach the same level of busy-loop due to never having to wait for data. Albeit for a limited amount of time – until the transfer is complete. This busy-loop is highly unlikely to actually starve out any important threads in a system.
Yes, a closed connection is a much “cheaper” attack from server’s point of view than maintaining a long-living connection, but the cost of the attack is not a factor here.
Where in this grey area do we land?
This is difficult one.
I can see the point of the reporter, but I can also see how this flaw will basically not hurt any existing curl user. Where is our responsibility here?
I ended up concluding that this issue not a security vulnerability. The reporter disagreed.
It is a terribly annoying bug for sure. But the only applications that are seriously affected by it, are the ones that already allow an endless transfer.
The bug-fix was instead submitted as a normal pull-request: PR 8644, targeted to be fixed and included in the pending curl 7.83.0 release.
We publicize the reports after the fact
We make all (non-rubbish) previously reported hackerone issues public, whether they ended up being a vulnerability or not. To give everyone involved time to object or redact sensitive details, the publication date is usually within a month after the issue was closed.
By making the reports public, we allow everyone interested enough the ability and chance to check out and follow past discussions and deliberations for going the directions we did. The idea is primarily to be completely open about the reported issues and how we classify them, to show that we are not hiding anything and it also provides a chance for us to get more feedback from the surrounding and from security people who might disagree with previous analyses.
On March 16 2022, the curl security team received an email in which the reporter highlighted an Apple web page. What can you tell us about this?
I hadn’t seen it before. On this page with the title “About the security content of macOS Monterey 12.3”, said to have been published just two days prior, Apple mentions recent package upgrades and the page lists a bunch of products and what security fixes that were done for them in this update. Among the many products listed, curl is mentioned.
This is what the curl section of the page looked like:
In the curl project we always make all CVEs public with as much detail as we can possibly extract and provide about them. We take great pride in being the best in class in security flaw information and transparency.
Apple listed four CVE fixed. The three first IDs we immediately recognized from the curl security page. The last one however, was a surprise. What was that?
This is not a CVE published by the curl project. The curl project has in fact not shipped any CVE at all in 2022 (yet) so that’s easy to spot. When we looked at the MITRE registration for the ID, it also didn’t disclose any clues really. Not that it was expected to. It did show it was created on January 5 though, so it wasn’t completely new.
Was it a typo?
I compared this number to other recent CVE numbers announced from curl and I laid eyes on CVE-2021-22923 which had just two digits changed. Did they perhaps mean that CVE?
The only “problem” with that CVE is that it was in regards to Metalink and I don’t think Apple ever shipped their curl package with metalink support so therefore they wouldn’t have fixed a Metalink problem. So probably not a typo for that number at least!
I reached out to a friend at Apple as well with an email to Apple Product Security.
Security is our number one priority
In the curl project, we take security seriously. The news that there might be a security problem in curl that we haven’t been told about and that looks like it was about to get public sooner or later was of course somewhat alarming and something we just needed to get to the bottom of. It was also slightly disappointing that a large vendor and packager of curl since over 20 years would go about it this way and jab this into our back.
No source code
Apple has not made the source code for their macOS 12.3 version and the packages they use in there public, so there was no way for us to run diffs or anything to check for the exact modifications that this claimed fix would’ve resulted in.
Apple said so
Several “security websites” (the quotes are there to indicate that clearly these sites are more security in the name than in reality) immediately posted details about this “vulnerability”. Some of them with CVSS scores and CWE numbers , explaining how this problem can hurt users. Obviously completely made up since none of that info was made available by any first party sources anywhere. Not from Apple and not from the curl project. If you now did a web search on that CVE number, several of the top search results linked to such sites providing details – obviously made up from thin air.
As I think these sites don’t add much value to humanity, I won’t link to them here but instead I will show you a screenshot from such an article to show you what a made up CVE number posted by Apple can make people claim:
At 23:28 (my time zone) on the 17th, my Apple friend responded saying they had forwarded the issue to “the right team”.
The Apple Product Security team I also emailed about this issue, answered at 00:23 (still my time) on the 18th saying “we are looking into this and will provide an update soon when we have more information.”
The MITRE page got more details
After the weekend passed with no response, I looked back again on the MITRE page for the CVE in question and it had then gotten populated with additional curl details; mentioning Apple as CNA and now featuring links back to the Apple page! Now it really started to look like the CVE was something real that Apple (or someone) had registered but not told us about. It included real curl related snippets like this:
Multiple issues were addressed by updating to curl version 7.79.1. This issue is fixed in macOS Monterey 12.3. Multiple issues in curl.
Please tell us more details
On Monday the 21st, I continued to get questions about this CVE. Among others, from a member of a major European ISP’s CERT team curious about this CVE as they couldn’t find any specific information about this issue either and they were concerned they might have this vulnerability in the curl versions they run. They of course (rightfully) assumed that I would know about curl CVEs.
It turns out that when a major company randomly mentions a new CVE, it actually has an impact on the world!
At around 20:30 on March 21st, someone on Twitter spotted that the ghost CVE had been removed from Apple’s web page and it only listed three issues (and a mention that the section had been updated). At 21:39 I get an email response from Apple Product Security:
Thank you for reaching out to us about the error with this CVE on our security advisory. We’ve updated our site and requested that MITRE reject CVE-2022-22623 on their end.
Please let us know if you have any questions.
The reject request to MITRE is expected to be slow so that page will remains showing the outdated data for a while longer.
When Apple had retracted the wrong CVE, I figured I should maybe try to get exploitone.com to remove their “article” to maybe at least stop one avenue of further misinformation about this curl “issue”. I tweeted (in perhaps a tad bit inflammatory manner):
I get the feeling they didn’t quite understand my point. They replied:
As I had questions about Apple’s mishap, I replied (sent off 22:28 on the 21st, still only early afternoon on the US west coast), asking for details on what exactly had happened here. If it was a typo, then how come it got registered with MITRE? It’s just so puzzling and mysterious!
When I’m posting this article on my blog (36 hours after I sent the question), I still haven’t gotten any response or explanation. I don’t expect to get any either, but if I do, I will update this post accordingly.
Update March 26
exploitone.com updated their page at some point after my tweet to remove the mention of the imaginary CVE, but the wording remains very odd:
I’ve talked on this topic before but I realized I never did a proper blog post on the topic. So here it is: how we develop curl to keep it safe. The topic of supply chain security is one that is discussed frequently these days and every so often there’s a very well used (open source) component that gets a terrible weakness revealed.
Don’t get me wrong. Proprietary packages have their share of issues as well, and probably even more so, but for obvious reasons we never get the same transparency, details and insight into those problems and solutions.
If we would find a critical vulnerability in curl, it could potentially exist in every internet-connected device on the globe. We don’t want that.
A critical security flaw in our products would be bad, but we also similarly need to make sure that we provide APIs and help users of our products to be safe and to use curl safely. To make sure users of libcurl don’t accidentally end up getting security problems, to the best of our ability.
In the curl project, we work hard to never have our own version of a “heartbleed moment“. How do we do this?
Our method is not strange, weird or innovative. We simply apply all best practices, tools and methods that are available to us. In all areas. As we go along, we tighten the screws and improve our procedures, learning from past mistakes.
There are no short cuts or silver bullets. Just hard work and running tools.
Not a coincidence
Getting safe and secure code into your product is not something that happens by chance. We need to work on it and we need to make a concerned effort. We must care about it.
We all know this and we all know how to do it, we just need to make sure that we also actually do it.
Write code following the rules
Review written code and make sure it is clear and easy to read.
Test the code. Before and after merge
Verify the products and APIs to find cracks
Bug-bounty to reward outside helpers
Act on mistakes – because they will happen
For users of libcurl we provide an API with safe and secure defaults as we understand the power of the default. We also document everything with details and take great pride in having world-class documentation. To reduce the risk of applications becoming unsafe just because our API was unclear.
We also document internal APIs and functions to help contributors write better code when improving and changing curl.
We don’t allow compiler warnings to remain – on any platform. This is sometimes quite onerous since we build on such a ridiculous amount of systems.
We encourage use of source code comments and assert()s to make assumptions obvious. (curl is primarily written in C.)
All code should be reviewed. Maintainers are however allowed to review and merge their own pull-requests for practical reasons.
Code should be easy to read and understand. Our code style must be followed and encourages that: for example, no assignments in conditions, one statement per line, no lines longer than 80 columns and more.
Strict compliance with the code style also means that the code gets a flow and a consistent look, which makes it easier to read and manage. We have a tool that verifies most aspects of the code style, which takes away most of that duty away from humans. I find that PR authors generally take code style remarks better when pointed out by a tool than when humans do it.
A source code change is accompanied with a git commit message that need to follow the template. A consistent commit message style makes it easier to later come back and understand it proper when viewing source code history.
We want everything tested.
Unit tests. We strive at writing more and more unit tests of internal functions to make sure they truly do what expected.
System tests. Do actual network transfers against test servers, and make sure different situations are handled.
Integration tests. Test libcurl and its APIs and verify that they handle what they are expected to.
Documentation tests. Check formats, check references and cross-reference with source code, check lists that they include all items, verify that all man pages have all sections, in the same order and that they all have examples.
“Fix a bug? Add a test!” is a mantra that we don’t always live up to, but we try.
curl runs on 80+ operating systems and 20+ CPU architectures, but we only run tests on a few platforms. This usually works out fine because most of the code is written to run on multiple platforms so if tested on one, it will also run fine on all the other.
curl has a flexible build system that offers many million different build combinations with over 30 different possible third-party libraries in countless version combinations. We cannot test all build combos, but we try to test all the popular ones and at least one for each config option enabled and disabled.
We have many tests, but there are unfortunately still gaps and details not tested by the test suite. For those things we simply have to rely on the code review and then that users report problems in the shipped products.
We run all the tests using valgrind to make sure nothing leaks memory or do bad memory accesses.
We build and run with address, undefined behavior and integer overflow sanitizers.
We are part of the OSS-Fuzz project which fuzzes curl code non-stop, and we run CIFuzz in CI builds, which runs “a little” fuzzing on the curl code in the normal pull-request process.
We do “torture testing“: run a test case once and count the number of “fallible” function calls it makes. Those are calls to memory allocation, file operations, socket read/write etc. Then re-run the test that many times, and for each new iteration we make another one of the fallible functions fail and return error. Verify that no memory leaks or crashes occur. Do this on all tests.
We use several different static code analyzers to scan the code checking for flaws and we always fix or otherwise handle every reported defect. Many of them for each pull-request and commit, some are run regularly outside of that process:
The exact set has varied and will continue to vary over time as services come and go.
No matter how hard we try, we still ship bugs and mistakes. Most of them of course benign and harmless but some are not. We run a bug-bounty program to reward security searchers real money for reported security vulnerabilities found in curl. Until today, we have paid almost 17,000 USD in total and we keep upping the amounts for new findings.
When we report security problems, we produce detailed and elaborate advisories to help users understand every subtle detail about the problem and we provide overview information that shows exactly what versions are vulnerable to which problems. The curl project aims to also be a world-leader in security advisories and related info.
Act on mistakes
We are not immune, no matter how hard we try. Bad things will happen. When they do, we:
Own the problem, responsibly
Fix it and announce it – as soon as possible
Learn from it
Make it harder to do the same or similar mistakes again
Does it work? Do we actually learn from our history of mistakes? Maybe. Having our product in ten billion installations is not a proof of this. There are some signs that might show we are doing things right:
We were reporting fewer CVEs/year the last few years but in 2021 we went back up. It could also be the result of more people looking, thanks to the higher monetary rewards offered. At the same time the number of lines of code have kept growing at a rate of around 6,000 lines per year.
We get almost no issues reported by OSS-Fuzz anymore. The first few years it ran it found many problems.
We are able to increase our bug-bounty payouts significantly and now pay more than one thousand USD almost every time. We know people are looking hard for security bugs.
For every pull-request and commit done in the project, we run about 100 different builds + test rounds.
Done using several different CI services for maximum performance, widest possible coverage and shortest time to completion.
We currently use the following CI services: Cirrus CI, AppVeyor, Azure Pipelines, GitHub Actions, Circle CI and Zuul CI.
We also have a separate autobuild system with systems run by volunteers that checkout the latest code, build, run all the tests and report back in a continuous manner a few times or maybe once per day.
New habits past mistakes have taught us
We have done several changes to curl internals as direct reactions to past security vulnerabilities and their root causes. Lessons learned.
Unified dynamic buffer functions
These days we have a family of functions for working with dynamically sized buffers. Be using the same set for this functionality we have it well tested and we reduce the risk that new code messes up. Again, nothing revolutionary or strange, but as curl had grown organically over the decades, we found ourselves in need of cleaning this up one day. So we did.
Maximum string sizes
Several past mistakes came from possible integer overflows due to libcurl accepting input string sizes of unrestricted lengths and after doing operations on such string sizes, they would sometimes lead to overflows.
Since a few years back now, no string passed to curl is allowed to be larger than eight megabytes. This limit is somewhat arbitrarily set but is meant to be way larger than the largest user names and passwords ever used etc. We could also update the limit in a future, should we want. It’s not a limit that is exposed in the API or even mentioned. It is there to trap mistakes and malicious use.
Thanks to the previous points we now avoid realloc as far as possible outside of those functions. History shows that realloc in combination with integer overflows have been troublesome for us. Now, both reallocs and integer overflows should be much harder to mess up.
A few years ago we ran code coverage reports for one build combo on one platform. This generated a number that really didn’t mean a lot to anyone but instead rather mislead users to drawing funny conclusions based on the report. We stopped that. Getting a “complete” and representative number for code coverage for curl is difficult and nobody has yet gone back to attempt this.
The impact of security problems
Every once in a while someone discovers a security problem in curl. To date, those security vulnerabilities have been limited to certain protocols and features that are not used by everyone and in many cases even disabled at build-time among many users. The issues also often rely on either a malicious user to be involved, either locally or remotely and for a lot of curl users, the environments it runs in limit that risk.
To date, I’m not aware of any curl user, ever, having been seriously impacted by a curl security problem.
This is not a guarantee that it will not ever happen. I’m only stating facts about the history so far. Security is super hard and I can only promise that we will keep working hard on shipping secure products.
Is it scary?
Changes done to curl code today will end up in billions of devices within a few years. That’s an intimidating fact that could truly make you paralyzed by fear of the risk that the world will “burn” due to a mistake of mine.
Rather than instilling fear by this outlook, I think the proper way to think it about it, is respecting the challenge and “shouldering the responsibility”. Make the changes we deem necessary, but make them according to the guidelines, follow the rules and trust that the system we have setup is likely to detect almost every imaginable mistake before it ever reaches a release tarball. Of course we plug holes in the test suite that we spot or suspect along the way.
The back-door threat
I blogged about that recently. I think a mistake is much more likely to slip-in and get shipped to the world than a deliberate back-door is.
Memory safe components might help
By rewriting parts of curl to use memory safe components, such as hyper for HTTP, we might be able to further reduce the risk of future vulnerabilities. That’s a long game to make reality. It will also be hard in the future to actually measure and tell for sure if it truly made an impact.
How can you help out?
Pay for a curl support contract. This is what enables me to work full time on curl.
Help out with reviews and adding new tests to curl
Help out with fixing issues and improving the code
In April 2019 we launched the current curl bug-bounty program under the Hackerone umbrella and from my point of view it has been nothing but a raging success. Until today we’ve paid almost 17,000 USD in rewards and and the average payment amount has been increasing all the time.
The reward money in this program have been paid to security reporters sourced from our own funds. Funds that have been donated to the curl project by our generous curl sponsors.
Before that day in 2019, when this program started, we did a few attempts to lean on and piggy-back on other bug-bounty efforts, but that never worked good enough. It mostly made the process unpredictable, outside of our control and ability to influence them and they never paid researchers “proper”.
We even started this latest program in association with a known brand company (that I won’t name here) who promised to chip-in and contribute money to the rewards whenever they would affect one of their use cases – but that similarly just ended up an empty promise for something that apparently never could happen. It feels much more honest and straight forward not giving anyone such false expectations – so they’re no longer involved here.
The original Internet Bug-Bounty
Another “failed program” from the past, at least as far as bounties for curl issues go, was the Hackerone driven bounty program known as IBB. It was an umbrella project to offer bounties for security problems in a set of “internet programs” including curl. I won’t bore you with details why that didn’t work. I think they paid some small bounties to two or three curl related issues.
IBB reborn but different now
The experience from all previous attempts and programs we’ve tried for bounties says that we need to be in control of what reported issues that are considered security related problems and I think it is important that we reward all such issues, without discrimination or other conditions. If the issue is indeed a security problem, then we appreciate getting told about it and we reward the person who did the job, figured it out and told us.
Therefore, skepticism was the initial response I felt when I was briefed about the re-introduction, rebirth if you want, of the IBB program. We’ve been there, we tried that.
But after talking to the people involved, I was subsequently convinced that we should give this effort a chance. There are several reasons that made me think this time can be different, to our benefit. They include:
The IBB program will pay the rewards from their funds, and they will do their own fund raising and “pester “big companies to help out, thus either entirely or mostly remove the need for us to fund the rewards or at least make our spending smaller. Or the rewards larger.
The members of the curl security team will still work with reported issues the exact same way as before and our security team will remain the sole arbiters of what problems that are in-scope and what problems that are not for issues reported on curl. We’ve established a decent working method for that over the last two something years and I feel good about us sticking to this. The IBB program is mostly involved at the end of the process when the reward amount and payout are handled.
We stick to mostly the same work-flow and site for reporting issues and communicating with reporters while the issues are in the initial non-disclosed state. Namely within the nicely working Hackerone issue tracker, which is designed and made specifically for this purpose.
We have not signed up for this new way of doing things for life. If it turns out that it is bad somehow for the curl project or for security researchers filing problems about curl, then we can always just backpedal back to the previous situation and continue as before.
This should be a fairly harmless test and change of process that should be an improvement for us as otherwise we won’t stick to it!
Welcome to the 200th curl release. We call it 200 OK. It coincides with us counting more than 900 commit authors and surpassing 2,400 credited contributors in the project. This is also the first release ever in which we thank more than 80 persons in the RELEASE-NOTES for having helped out making it and we’ve set two new record in the bug-bounty program: the largest single payout ever for a single bug (2,000 USD) and the largest total payout during a single release cycle: 3,800 USD.
This release cycle was 42 days only, two weeks shorter than normal due to the previous 7.76.1 patch release.
the 200th release 5 changes 42 days (total: 8,468) 133 bug-fixes (total: 6,966) 192 commits (total: 27,202) 0 new public libcurl function (total: 85) 2 new curl_easy_setopt() option (total: 290) 2 new curl command line option (total: 242) 82 contributors, 44 new (total: 2,410) 47 authors, 23 new (total: 901) 3 security fixes (total: 103) 3,800 USD paid in Bug Bounties (total: 9,000 USD)
We set two new records in the curl bug-bounty program this time as mentioned above. These are the issues that made them happen.
This is a Use-After-Free in the OpenSSL backend code that in the absolutely worst case can lead to an RCE, a Remote Code Execution. The flaw is reasonably recently added and it’s very hard to exploit but you should upgrade or patch immediately.
The issue occurs when TLS session related info is sent from the TLS server when the transfer that previously used it is already done and gone.
The reporter was awarded 2,000 USD for this finding.
In the Schannel backend code, the selected cipher for a transfer done with was stored in a static variable. This caused one transfer’s choice to weaken the choice for a single set transfer could unknowingly affect other connections to a lower security grade than intended.
The reporter was awarded 800 USD for this finding.
In this release we introduce 5 new changes that might be interesting to take a look at!
Make TLS flavor explicit
As explained separately, the curl configure script no longer defaults to selecting a particular TLS library. When you build curl with configure now, you need to select which library to use. No special treatment for any of them!
No more SSL
curl now has no more traces of support for SSLv2 or SSLv3. Those ancient and insecure SSL versions were already disabled by default by TLS libraries everywhere, but now it’s also impossible to activate them even in special build. Stripped out from both the curl tool and the library (thus counted as two changes).
HSTS in the build
We brought HSTS support a while ago, but now we finally remove the experimental label and ship it enabled in the build by default for everyone to use it more easily.
In-memory cert API
We introduce API options for libcurl that allow users to specify certificates in-memory instead of using files in the file system. See CURLOPT_CAINFO_BLOB.
Again we manage to perform a large amount of fixes in this release, so I’m highlighting a few of the ones I find most interesting!
The first line of curl -V output got updated: libcurl now includes OpenLDAP and its version of that was used in the build, and then the curl tool can add libmetalink and its version of that was used in the build!
curl_mprintf: add description
We’ve provided the *printf() clone functions in the API since forever, but we’ve tried to discourage users from using them. Still, now we have a first shot at a man page that clearly describes how they work.
This is important as they’re not quite POSIX compliant and users who against our advice decide to rely on them need to be able to know how they work!
CURLOPT_IPRESOLVE: preventing wrong IP version from being used
This option was made a little stricter than before. Previously, it would be lax about existing connections and prefer reuse instead of resolving again, but starting now this option makes sure to only use a connection with the request IP version.
This allows applications to explicitly create two separate connections to the same host using different IP versions when desired, which previously libcurl wouldn’t easily let you do.
Ignore SIGPIPE in curl_easy_send
libcurl makes its best at ignoring SIGPIPE everywhere and here we identified a spot where we had missed it… We also made sure to enable the ignoring logic when built to use wolfSSL.
There are no less than 6 separate fixes mentioned in the HTTP/2 module in this release. Some potential memory leaks but also some more behavior improving things. Possibly the most important one was the move of the transfer-related error code from the connection struct to the transfers struct since it was vulnerable to a race condition that could make it wrong. Another related fix is that libcurl no longer forcibly disconnects a connection over which a transfer gets HTTP_1_1_REQUIRED returned.
Partial CONNECT requests
When the CONNECT HTTP request sent to a proxy wasn’t all sent in a single send() call, curl would fail. It is baffling that this bug hasn’t been found or reported earlier but was detected this time when the reporter issued a CONNECT request that was larger than 16 kilobytes…
TLS: add USE_HTTP2 define
There was several remaining bad assumptions that HTTP/2 support in curl relies purely on nghttp2. This is no longer true as HTTP/2 support can also be provide by hyper.
When libcurl (built with libssh2 support) stopped an SFTP transfer because a timeout was triggered, the following SFTP disconnect procedure was subsequently also stopped because of the same timeout and therefore wasn’t allowed to properly clean up everything, leading to a memory-leak!