It started as just a test to see if I could use the existing advisory data we have for all curl CVEs to date and provide that as JSON. Maybe, I thought, if we provide it good enough it can be used to populate other databases automatically or even get queried easier by tools.
In a recent push I decided that all the info we have and provide could and should be offered in a more machine friendly format for whoever wants it.
After a first few test shots, fellow curl team mates pointed out to me that there is an existing effort called the Open Source Vulnerability format, an openly developed JSON schema designed for pretty much exactly what I was set out to do. I agreed that it made sense to follow something existing rather to make up my own format.
Can be improved
Of course we ran into some minor snags with this schema and there are still details in it that I think can be improved and we are discussing with the OSV team to see if there is merit to our ideas or not. Still, even without any changes we can now offer our data using this established format.
The two primary things I want to improve is how we provide project identification (whose issue is this) and how we convey the severity level of the issue.
As of now, we offer a set of different ways to get the CVE data as JSON.
1. Everything all at once
On the fixed URL https://curl.se/docs/vuln.json, we provide a JSON array holding a number of JSON objects. One object per existing CVE. Right now, this is 349KB of data. (If you ask for it compressed it will be smaller!)
This URL will always contain the entire set and it will automatically update in the future as we published new CVEs. It also automatically updates when we correct or change any of the previously published advisories.
2. Single object per issue
If you prefer to get the exact metadata for a single specific curl CVE, you can get the JSON for an issue by replacing the .html extension to .json for any CVE documented on the website. You will also find a menu option the “related box” for each documented CVE that links to it.
On the curl website we already offer a way for users to get a list of all known vulnerabilities a certain specific release is known to be vulnerable to. Of course always updated with the latest publications.
For example, curl version 7.87.0 has all its vulnerability information detailed on https://curl.se/docs/vuln-7.87.0.html. When I write this, there are eight known vulnerabilities for this version.
Again, either by clicking the JSON link there on the page under the table, or simply by replacing html by json in the URL, the user can get a listing of all the CVEs this version is vulnerable to, as a JSON array with a number of JSON objects inside. In this case, eight objects as of now.
While I expect the format might still change a little bit going forward, and not all issues have all the metadata provided just yet (for example, the git commit ranges are still lacking on a number of issues from before 2017), here is an illustration screenshot of jq displaying the JSON object for CVE-2022-27780.
The database_specific object near the top is metadata that we have and believe belongs with the issue but that has no defined established field in the JSON schema. Since I think the data still adds value to users, I decided to put them into this section that is designed and meant exactly for this kind of extensions.
You can see that we set an “id” that is the CVE ID with a CURL- prefix. This is just us catering to the conditions of OSV and the JSON schema. We apparently need our own ID and provide the actual CVE ID as an alias, so we “fake” this by simply prepending curl to the CVE ID. We don’t use any private ID when we work with vulnerabilities: we only deal with public issues and we only deal with issues that are CVE worthy so it seems unnecessary to involve anything else.
In one particular case, CVE-2022-43552 was reported by the curl project in December 2022. It is a use-after-free flaw that we determined to be severity low and not higher mostly because of the very limited time window you need to make something happen for it to be exploited or abused. NVD set it to medium which admittedly was just one notch higher (this time).
This is not helpful.
Lots of Windows users everywhere runs security scanners on their systems with regular intervals in order to verify that their systems are fine. At some point after December 21, 2022, some of these scanners started to detect installations of curl that included the above mentioned CVE. Nessus apparently started this on February 23.
This is not helpful.
Lots of Windows users everywhere then started to panic when these security applications warned them about their vulnerable curl.exe. Many Windows users are even contractually “forced” to fix (all) such security warnings within a certain time period or risk bad consequences and penalties.
How do you fix this?
I have been asked numerous times about how to fix this problem. I have stressed at every opportunity that it is a horrible idea to remove the system curl or to replace it with another executable. It is very easy to download a fresh curl install for Windows from the curl site – but we still strongly discourage everyone from replacing system files.
But of course, far from everyone asked us. A seemingly large enough crowd has proceeded and done exactly what we would stress they should not: they deleted or replaced their C:\Windows\System32\curl.exe.
The real fix is of course to let Microsoft ship an update and make sure to update then. The exact update that upgrades curl to version 8.0.1 is called KB5025221 and shipped on April 11. (And yes, this is the first time you get the very latest curl release shipped in a Windows update)
The people who deleted or replaced the curl executable noticed that they cannot upgrade because the Windows update procedure detects that the Windows install has been tampered with and it refuses to continue.
I do not know how to restore this to a state that Windows update is happy with. Presumably if you bring back curl.exe to the exact state from before it could work, but I do not know exactly what tricks people have tested and ruled out.
I have been pointed to responses on the Microsoft site answers.microsoft.com done by “helpful volunteers” that specifically recommend removing the curl.exe executable as a fix.
This is not helpful.
I don’t want to help spreading that idea so I will not link to any such post. I have reported this to Microsoft contacts and I hope they can maybe edit or comment those posts soon.
We are not responsible
I just want to emphasize that if you install and run Windows, your friendly provider is Microsoft. You need to contact Microsoft for support and help with Windows related issues. The curl.exe you have in System32 is only provided indirectly by the curl project and we cannot fix this problem for you. We in fact fixed the problem in the source code already back in December 2022.
If you have removed curl.exe or otherwise tampered with your Windows installation, the curl project cannot help you.
In 2011 I started to send “pre-notifications” about pending curl security vulnerabilities to the distros mailing list (back then it was still called linux-distros).
For several years we also asked them for CVE IDs for the new vulnerabilities that we were about to publish to the world. By notifying the distros ahead of time, the idea is that they get a little head-start to fix their curl packages so that at the day when we publish the vulnerabilities to the world, they can already provide curl upgrades.
The gap from us announcing a flaw until they offer curl upgrades could ideally be made a minimum.
The distros list’s rules forbid us to tell them more than 10 days before the planned release day. They call this an embargo as they are expected to not tell anyone who is not a mailing list member about these flaws.
During the last twelve plus years, I have told them about almost 130 pending curl vulnerabilities like this up until today.
Secrets are hard
For an open source project that has all its processes and test infrastructure public and open there are several challenges with how to deal with secrets, such as vulnerabilities and their corresponding fixes.
We recently updated our security process in the curl project: we have noticed that we have previously – several times – landed fixes to security problems that were defective and in some cases did not even fix the reported problem correctly. I believe one reason for this is that we had this policy to make the fix into a (public) pull-request no earlier than 48 hours before the pending release. 48 hours is enough to make all the tests and CI verify the fix, but it is a very short time window for the community to react or be able to test and find any problems with the fixes before the release goes out.
As an attempt to do better we have tweaked our policy. If a reported security problem is deemed to be of severity low or severity medium, we will instead allow and rather push for turning the fix into a public pull-request much earlier. We will however not mention the security aspect of the fix in the public communication about the pull-request, but only talk about the bugfix aspect.
This will allow us to merge fixes earlier in the release cycle. To give the bugfixes more time to mature and ripe in the repository before the pending release. It should increase the chances that we can do follow-up fixes and truly make it a good correction by the time we do the next release. Hopefully it leads to better releases with fewer regressions.
Of course the risk with this is that a malicious user somewhere finds out about a vulnerability this way, earlier than 48 hours before a release, and therefore gets an extended time window to perform nefarious actions. That is also why we limit this method to severity low and medium issues, as the ones rated more serious are deemed too dangerous to risk.
Policy vs policy
The week before we were about to ship the curl 8.0.0 release, I emailed the distros mailing list again like I have done so many times before and told them about the upcoming six(!) vulnerabilities we were about to reveal to the world.
This time turned out to be different.
Because of our updated policy where the fixes were already committed in a public git repository, the distros mailing list’s policy says that if there is a public commit they consider the issue to be public and thus they refuse to accept any embargo.
What they call embargo I of course call heads-up time.
I argue that while the fixes are public, the actual vulnerabilities and the security issues those fixes rectify are not. It takes a serious effort and pretty good insights to just detect that one or more of the commits for the pending release are done because of a security problem and then even more so if you want to convert that suspicion into an actual attack vector.
They maintain that while they could make an exception for me/us this time, this is an exception and their policy says this is not acceptable for embargos.
If we make commits public before telling distros, we may not “ask for an embargo”.
So we won’t tell
I thought we were doing this for their benefit. I was under the impression that we actually helped distributors of open source operating systems by telling them ahead of time what was going to ship very soon that they might want to get a head-start on so that their users stay protected.
I have been told in very clear terms that they do not want to be notified about vulnerabilities ahead of time if the commits are public.
I have informed them that I will not tell them anymore until they change their minds because I think our updated security process can make our releases better and I think improving curl and making better releases is more important than telling distros ahead of time.
I cannot understand how this stubbornness makes anything better for them. For me, it takes away some amount of work so I will manage just fine. For curl users “in the wild”, this will probably mean that they will get security-patched curl releases from their distros a little slower in the future.
We rarely see curl vulnerabilities rated higher than medium so this means we will effectively stop emailing distros about pending flaws. We are still allowed to tell them about more criticality scored vulnerabilities but I must confess I feel less inclined to do that than I used to.
This a major version number bump but without any ground-breaking changes or fireworks. We decided it was about time to reset the minor number down to more a manageable level and doing it exactly on curl’s 25th birthday made it extra fun. There is no API nor ABI break in this version.
This is likely the best curl release we ever made.
curl supports communicating using the TELNET protocol and as a part of this it offers users to pass on user name and “telnet options” for the server negotiation.
Due to lack of proper input scrubbing and without it being the documented functionality, curl would pass on user name and telnet options to the server as provided. This could allow users to pass in carefully crafted content that pass on content or do option negotiation without the application intending to do so. In particular if an application for example allows users to provide the data or parts of the data.
curl supports SFTP transfers. curl’s SFTP implementation offers a special feature in the path component of URLs: a tilde (~) character as the first path element in the path to denotes a path relative to the user’s home directory. This is supported because of wording in the once proposed to-become RFC draft that was to dictate how SFTP URLs work.
Due to a bug, the handling of the tilde in SFTP path did however not only replace it when it is used stand-alone as the first path element but also wrongly when used as a mere prefix in the first element.
Using a path like /~2/foo when accessing a server using the user dan (with home directory /home/dan) would then quite surprisingly access the file /home/dan2/foo.
This can be taken advantage of to circumvent filtering or worse.
libcurl would reuse a previously created FTP connection even when one or more options had been changed that could have made the effective user a very different one, thus leading to the doing the second transfer with wrong credentials.
libcurl would reuse a previously created connection even when the GSS delegation (CURLOPT_GSSAPI_DELEGATION) option had been changed that could have changed the user’s permissions in a second transfer.
libcurl keeps previously used connections in a connection pool for subsequent transfers to reuse if one of them matches the setup. However, this GSS delegation setting was left out from the configuration match checks, making them match too easily, affecting krb5/kerberos/negotiate/GSSAPI transfers.
libcurl supports sharing HSTS data between separate “handles”. This sharing was introduced without considerations for do this sharing across separate threads but there was no indication of this fact in the documentation.
Due to missing mutexes or thread locks, two threads sharing the same HSTS data could end up doing a double-free or use-after-free.
libcurl would reuse a previously created connection even when an SSH related option had been changed that should have prohibited reuse.
libcurl keeps previously used connections in a connection pool for subsequent transfers to reuse if one of them matches the setup. However, two SSH settings were left out from the configuration match checks, making them match too easily.
There is only one actual “change” in this release. This is the first curl release to drop support for building on a systems that lack a working 64 bit data type. curl now requires that ‘long long‘ or an equivalent exists.
This release cycle was half the length of a regular one but yet we managed to merge an impressive amount of bugfixes. Below I highlight a few that I think deserve a special mention.
build: drop the use of XC_AMEND_DISTCLEAN
A strange description but this change removed an old autotools macro that made configure sometimes “balloon” Makefiles to several gigabytes.
connect: fix time_connect and time_appconnect timer statistics
A regression after the new happy eyeball h2/h3 connect approach was introduced.
curl.1: list all “global options”
Command line options that survive the use of --next are called “global options” and the man page now lists all of them for easier identification.
To accomplish this, there is a new metadata “tag” for this purpose to mark the global options in their corresponding docs files.
ftp: active mode with SSL, add the filter
Regression: FTPS in active mode did not setup the data connection correctly.
replaced sscanf() in several parsers
From 24 occurrences of sscanf() calls in the code in the previous release, down to just 4 left.
headers: make curl_easy_header and nextheader return different buffers
error handling during parallel operations
fix http2 prior knowledge when reusing connections
RST and GOAWAY better recognize partial transfers
avoid upload busy loop
http: don’t send 100-continue for short PUT requests
Now aligns with and behaves more similarly to how curl has treated POST for a long time.
http: fix unix domain socket use in https connects
multi: make multi_perform ignore/unignore signals less often
When iterating over a potentially long list of individual transfers to “take care of”, we can avoid many ignore + unignore sequences by retaining the previous state when possible.
multi: remove PENDING + MSGSENT handles from the main linked list
To speed up the handling of large amounts of easy handles added to a multi handle that are either pending or already completed, those easy handles are now moved out of the main linked list to separate queues.
rand: use arc4random as fallback when available
Makes curl built without a TLS library get better random, assuming the platform supports it.
urlapi: ‘%’ is illegal in host names
The URL parser would wrongly accept a stand-alone percent as part of a host name. It remains accepted for percent-encoded host names and as separator between an IPv6 address and a zone id.
urlapi: parse IPv6 literals without ENABLE_IPV6
To make the URL parser behavior more consistent, it can now parse and deal with IPv6 addresses perfectly fine and the same way even if IPv6 connectivity does not actually work.
binding to an interface with host name using c-ares
When a security vulnerability has been found and confirmed in curl, we request a CVE Id for the issue. This is a global unique identifier for this specific problem. We request the ID from our CVE Numbering Authority (CNA), Hackerone, which once we make the issue public will publish all details about it to MITRE, which hosts the central database.
In the curl project we have until today requested CVE Ids for and provided information about 135 vulnerabilities spread out over twenty-five years.
A CVE identifier affects a specific product (or set of products), and the problem affects the product from a version until a fixed version. And then there is a severity. How bad is the problem?
The Common Vulnerability Scoring System (CVSS) is a way to grade severity on a scale from zero to ten. You typically use a CVSS calculator, fill in the info as good as you can and voilá, out comes a score.
The ranges have corresponding names:
9 or higher
CVSS is a shitty system
Anyone who ever gets a problem reported for their project and tries to assess and set a CVSS score will immediately realize what an imperfect, simplified and one-dimensional concept this is.
The CVSS score leaves out several very important factors like how widespread the affected platform is, how common the affected configuration is and yet it is still very subjective as you need to assess as and mark different things as None, Low, Medium or High.
The same bug is therefore likely to end up with different CVSS scores depending on who fills in the form – even when the persons are familiar with the product and the error in question.
In the curl project we decided to abandon CVSS years ago because of its inherent problems. Instead we use only the four severity names: Low, Medium, High, and Critical and we work out the severity together in the curl security team as we work on the vulnerability. We make sure we understand the problem, the risks, its prevalence and more. We take all factors into account and then we set a severity level we think helps the world understand it.
All security vulnerabilities are vulnerabilities and therefore security risks, even the ones set to severity Low, but having the correct severity is still important in messaging and for the rest of the world to get a better picture of howserious the issue is. Getting the right severity is important.
NVD hosts a database of vulnerabilities. All CVEs that are submitted to MITRE are sucked in into NVD’s database. NVD says it “performs analysis on CVEs that have been published to the CVE Dictionary“.
That last sentence is probably important.
NVD imports CVEs into their database and they in turn offer other databases to import vulnerabilities from them. One large and known user of the NVD database is this I mentioned in a recent blog post: GitHub Security Advisory Database (GHSA DB) .
This GitHub thing an ambitious database that subsequently hosts a lot of vulnerabilities that people and projects reported themselves in addition to them importing information about all vulnerabilities ever published with CVE Ids.
This creates a huge database that in theory should contain just about every software vulnerability ever reported in the public. Pretty cool.
NVD, in their great wisdom, rescores the CVSS score for CVE Ids they import into their database! (It’s not clear how or why, but they seem to not do it for all issues).
NVD decides they know better than the project that set the severity level for the issue, enters their own answers in the CVSS calculator and eventually sets that new score on the CVEs they import.
NVD clearly thinks they need to do this and that they improve the state of the CVEs by this practice, but the end result is close to scaremongering.
Because NVD sets their own severity level and they have some sort of “worst case” approach, virtually all issues that NVD sets severity for is graded worse or much worse when they do it than how we set the severity levels.
Let’s take an example: CVE-2022-42915: HTTP proxy double-free. We deemed this a medium severity. It was not made higher partly because of the very limited time-window between the two frees, making it harder to take advantage of.
Yes, it makes you wonder what magic insights and knowledge the person/bots on NVD possessed when they did this.
The different severity levels should not matter too much but people find those inflated ones and they believe them. Users also find the discrepancies, get confused and won’t know what to believe or whom to trust. After all, NVD is trust-inducing brand. People think they know their stuff and if they say critical and the curl project says medium, what are we expected to think?
I claim that NVD overstate their severity levels and there unnecessarily scares readers and make them think issues are worse and more dangerous than they actually are.
The fact that GitHub now imports all CVE data from NVD makes these severity levels get transported, shown and believed as they are now also shown in the GHSA DB.
Look how many critical issues there are!
Not exactly GitHub’s fault
This NVD habit of re-scoring is an old existing habit and I just recently learned it. GitHub’s displaying the severity levels highlighted it for me, especially since users out there seem to trust and use this GitHub database.
I have talked to humans on the GitHub database team and I push for them to ignore or filter out the severity levels as set by NVD, if possible. But me being just a single complaining maintainer I do not expect this to have much of an effect. I would urge NVD to stop this insanity if I had any way to.
(Updated after first post). It turns out that some CVEs that we have filed from the curl project that uses our CNA hackerone have been submitted to MITRE without any severity level or CVSS score at all. For such issues, I of course understand why someone would put their own score on the issue because then our originally set score/severity is not passed on. Then the “blame” is instead shifted to Hackerone. I have contacted them about it.
Dispute a CVSS
NVD provides a way to dispute their rescores, but that’s just an open free-text form. I have use that form to request that NVD stop rescoring all curl issues. Although I honestly think they should rather stop all rescoring and only do that in the rare occasions where the original score or severity is obviously wrong.
I cannot dispute the severity levels at GitHub. They show the NVD levels.
tldr: several hundred hours of dedicated scrutinizing of curl by a team of security experts resulted in two CVEs and a set of less serious remarks. The link to the reports is at the bottom of this article.
Thanks to an OpenSSF grant, OSTIF helped us set up a curl security audit, which the excellent Trail of Bits was selected to perform in September 2022. We are most grateful to OpenSSF for doing this for us, and I hope all users who use and rely on curl recognize this extraordinary gift. OSTIF and Trail of Bits both posted articles about this audit separately.
We previously had an audit performed on curl back in 2016 by Cure53 (sponsored by Mozilla) but I like to think that we (curl) have traveled quite far and matured a lot since those days. The fixes from the discoveries reported in that old previous audit were all merged and shipped in the 7.51.0 release, in November 2016. Now over six years ago.
Changes since previous audit
We have done a lot in the project that have improved our general security situation over the last six years. I believe we are in a much better place than the last time around. But we have also grown and developed a lot more features since then.
curl is now at150,000 lines of C code. This count is for “product code” only and excludes blank lines but includes 19% comments.
71 additional vulnerabilities have been reported and fixed since then. (42 of those even existed in the version that was audited in 2016 but were obviously not detected)
We have 30,000 additional lines of code today (+27%), and we have done over 8,000 commits since.
We have 50% more test cases (now 1550).
We have done 47 releases featuring more than 4,200 documented bugfixes and 150 changes/new features.
We have 25 times the number of CI jobs: up from 5 in 2016 to 127 today.
The OSS-Fuzz project started fuzzing curl in 2017, and it has been fuzzing curl non-stop since.
The Trail of Bits team was assigned this as a three-part project:
Create a Threat Model document
Testing Analysis and Improvements
Secure code Review
The project was setup to use a total of 380 man hours and most of the time two Trail of Bits engineers worked in parallel on the different tasks. The Trail of Bits team themselves eventually also voluntarily extended the program with about a week. They had no problems finding people who wanted to join in and look into curl. We can safely say that they spent a significant amount of time and effort scrutinizing curl.
The curl security team members had frequent status meetings and assisted with details and could help answer questions. We would also get updates and reports on how they progressed.
The second vulnerability was found after Trail of Bits had actually ended their work and their report, while they were still running a fuzzer that triggered a separate flaw. This second vulnerability is not covered in the report but was disclosed earlier today in sync with the curl 7.87.0 release announcement: CVE-2022-43552: HTTP Proxy deny use-after-free.
Minor frictions detected
Discoveries and remarks highlighted through their work that were not consider security sensitive we could handle on the fly. Some examples include:
Using --ssl now outputs a warning saying it is unsafe and instead recommending --ssl-reqd to be used.
The Alt-svc: header parser did not deal with illegal port numbers correctly
The URL parser accepted “illegal” characters in the host name part.
Harmless memory leaks
You should of course read the full reports to learn about all the twenty something issues with all details, including feedback from the curl security team.
The curl team acted on all reported issues that we think we could act on. We disagree with the Trail of Bits team on a few issues and there are some that are “good ideas” that we should probably work on getting addressed going forward but that can’t be fixed immediately – but also don’t leave any immediate problem or danger in the code.
Security is not something that can be checked off as done once and for all nor can it ever be considered complete. It is a process that needs to blend in and affect everything we do when we develop software. Now and forever going forward.
This team of security professionals spent more time and effort in this security auditing and poking on curl with fuzzers than probably anyone else has ever done before. Personally, I am thrilled that they only managed to uncovered two actual security problems. I think this shows that a lot of curl code has been written the right way. The CVEs they found were not even that terrible.
Twenty something issues were detected, and while the report includes advice from the auditors on how we should improve things going forward, they are of the kind we all already know we should do and paths we should follow. I could not really find any real lessons as in obvious things or patterns we should stop or new paradigms och styles to adapt.
I think we learned or more correctly we got these things reconfirmed:
we seem to be doing things mostly correct
we can and should do more and better fuzzing
adding more tests to increase coverage is good
Security is hard
To show how hard security can be, we received no less than three additional security reports to the project during the actual life-time when this audit was being done. Those additional security reports of course came from other people and identified security problems this team of experts did not find.
My comments on the reports
The term Unresolved is used for a few issues in the report and I have a minor qualm with the use of that particular word in this context for all cases. While it is correct that we in several cases did not act on the advice in the report, we saw some cases where we distinctly disagree with the recommendations and some issues that mentioned things we might work on and address in the future. They are all just marked as unresolved in the reports, but they are not all unresolved to us in the curl project.
In particular I am not overly pleased with how the issue called TOB-CURLTM-6 is labeled severity high and status unresolved as I believe this wrongly gives the impression that curl has issues with high severity left unresolved in the code.
If you want to read the specific responses for each and every reported issue from the curl project, they are stored in this separate GitHub gist.
You find the two reports linked to from the curl security page. A total of almost 100 pages in two PDF documents.
In 2022 we have already had 14 CVEs reported so far, and we will announce the 15th when we release curl 7.85.0 at the end of August. Going into September 2022, there have been a total of 18 reported CVEs in the last 12 months.
During the whole of 2021 we had 13 CVEs reported – and already that was a large amount and the most CVEs in a single year since 2016.
There has clearly been an increased CVE issue rate in curl as of late.
Finding and fixing problems is good
While every reported security problem stings my ego and makes my soul hurt since it was yet another mistake I feel I should have found or not made in the first place, the key take away is still that it is good that they are found and reported so that they can be fixed properly. Ideally, we also learn something from each such report and make it less likely that we ever introduce that (kind of) problem again.
That might be the absolutely hardest task around each CVE. To figure out what went wrong, detect a pattern and then lock it down. It’s almost amusing how all bugs look a like a one-off mistake with nothing to learn from…
Ironically, the only way we know people are looking really hard at curl security is when someone reports security problems. If there was no reports at all, can we really be sure that people are scrutinizing the code the way we want?
Who is counting?
Counting the amount of CVEs and giving that a meaning, or even comparing the number between projects, is futile and a bad idea. The number does not say much and comparing two projects this way is impossible and will not tell you anything. Every project is unique.
Just counting CVEs disregards the fact that they all have severity levels. Are they a dozen “severity low” or are they a handful of critical ones? Even if you take severity into account, they might have gotten entirely different severity for virtually the same error, or vice versa.
Further, some projects attract more scrutinize and investigation because they are considered more worthwhile targets. Or perhaps they just pay researchers more for their findings. Projects that don’t get the same amount of focus will naturally get fewer security problems reported for them, which does not necessarily mean that they have fewer problems.
The curl bug-bounty really works as an incentive as we do reward security researchers a sizable amount of money for every confirmed security flaw they report. Recently, we have handed over 2,400 USD for each Medium-severity security problem.
In addition to that, finding and getting credited for finding a flaw in a widespread product such as curl is also seen as an extra “feather in the hat” for a lot of security-minded bug hunters.
Over the last year alone, we have paid about 30,000 USD in bug bounty rewards to security researchers, summing up a total of over 40,000 USD since the program started.
We have been fortunate to have received the attention of some very skilled, patient and knowledgeable individuals. To find security problems in modern curl, the best bug hunters both know the ins and outs of the curl project source code itself while at the same time they know the protocols curl speaks to a deep level. That’s how you can find mismatches that shouldn’t be there and that could lead to security problems.
The 15 reports we have received in 2022 so far (including the pending one) have been reported by just four individuals. Two of them did one each, the other two did 87% of the reporting. Harry Sintonen alone reported 60% of them.
Hardly any curl security problems are found with source code analyzers or even fuzzers these days. Those low hanging fruits have already been picked.
We care, we act
Our average response time for security reports sent to the curl project has been less than two hours during 2022, for the 56 reports received so far.
We give each report a thorough investigation and we spend a serious amount of time and effort to really make sure we understand all the angles of the claim, that it really is a security problem, that we produce the best possible fix for it and not the least: that we produce a mighty fine advisory for the issue that explains it to the world with detail and accuracy.
Less than 8% of the submissions we get are eventually confirmed actual security problems.
As a general rule all security problems we confirm, are fixed in the pending next release. The only acceptable exception would be if the report arrives just a day or two from the next release date.
Since you cannot judge a project by the number of CVEs that come out of it, what you should instead pay more focus on when you assess the health of a software project is how it acts when security problems are reported.
Most problems are still very old
In the curl project we make a habit of tracing back and figuring out exactly in which release each and every security problem was once introduced. Often the exact commit. (Usually that commit was authored by me, but let’s not linger on that fact now.)
One fun thing this allows us to do, is to see how long time the offending code has been present in releases. The period during which all the eyeballs that presumably glanced over the code missed the fact that there was a security bug in there.
On average, curl security problems have been present an extended period of time before there are found and reported.
On average for all CVEs: 2,867 days
The average time bugs were present for CVEs reported during the last 12 months is a whopping 3,245 days. This is very close to nine years.
How many people read the code in those nine years?
The people who find security bugs do not know nor do they care about the age of a source code line when they dig up the problems. The fact that the bugs are usually old could be an indication that we introduced more security bugs in the past than we do now.
Finding vs introducing
Enough about finding issues for a moment. Let’s talk about introducing security problems. I already mentioned we track down exactly when security flaws were introduced. We know.
With this, we can look at the trend and see if we are improving over time.
The project has existed for almost 25 years, which means that if we introduce problems spread out evenly over time, we would have added 4% of them every year. About 5 CVE problems are introduced per year on average. So, being above or below 5 introduced makes us above or below an average year.
Bugs are probably not introduced as a product of time, but more as a product of number of lines of code or perhaps as a ratio of the commits.
75% of the CVE errors were introduced before March 2014 – and yet the code base “only” had 102,000 lines of code at the end of that period. At that time, we had done 61% of the git commits (17684). One CVE per 189 commits.
15% of the security problems were introduced during the last five years and now we are at 148,000 lines of code. Finding needles in a growing haystack. With more code than ever before, we introduce bugs at a lower-than average rate. One CVE per 352 commits. This probably is also related to the ever-growing number of tests and CI jobs that help us detect more problems before we merge them.
The commit rate has remained between 1,100 and 1,700 commits per year since 2007 with ups and downs but no obvious growing or declining trend.
Harry reported a large amount of the recent curl CVEs as mentioned above, and is credited for a total of 17 reported curl CVEs – no one else is even close to that track record. I figured it is apt to ask him about the curl situation of today, to make sure this is not just me hallucinating things up. What do you think is the reason for the increased number of CVEs in curl in 2022?
The news of good bounties being paid likely has been attracting more researchers to look at curl.
I did put considerable effort in doing code reviews. I’m sure some other people put in a lot of effort too.
When a certain before unseen type of vulnerability is found, it will attract people to look at the code as a whole for similar or surrounding issues. This quite often results in new, similar CVEs bundling up. This is kind of a clustering effect.
It’s worth mentioning, I think, that even though there has been more CVEs found recently, they have been found (well most of them at least) as part of the bug bounty program and get handled in a controlled manner. While it would be even better to have no vulnerabilities at all, finding and handling them in controlled manner is the 2nd best option.
This might be escaping a random observer who just looks at the recent amount of CVEs and goes “oh this is really bad – something must have gone wrong!”. Some of the issues are rather old and were only found now due to the increased attention.
We introduce CVE problems at a slower rate now than we did in the past even though we have gotten problems reported at a higher than usual frequency recently.
The way I see it, we are good. I suppose the future will tell if I am right.
The InterPlanetary File System (IPFS) is according to the Wikipedia description: “a protocol, hypermedia and file sharing peer-to-peer network for storing and sharing data in a distributed file system.”. It works a little like bittorrent and you typically access content on it using a very long hash in an ipfs:// URL. Like this:
I guess partly because IPFS is a rather new protocol not widely supported by many clients yet, people came up with the concept of IPFS “gateways”. (Others might have called it a proxy, because that is what it really is.)
This gateway is an HTTP server that runs on a machine that also knows how to speak and access IPFS. By sending the right HTTP request to the gateway, that includes the IPFS hash you are interested in, the gateway can respond with the contents from the IPFS hash.
This way, you just need an ordinary HTTP client to access IPFS. I think it is pretty clever. But as always, the devil is in the details.
Rewrite ipfs:// into https://
The quest for IPFS aficionados have seemingly become to add support for IPFS using this gateway approach to multiple widely used applications that know how to speak HTTP(S). Just rewrite the URL internally from IPFS to HTTPS.
The IPFS “URL rewrite” is done in such so that the example IPFS URL is converted into “https://$gateway/$hash”. The transfer is done by the client as if it was a plain old HTTPS transfer.
What if the gateway is not local?
This approach has its biggest benefit of course when you can actually use a remote IPFS gateway. I presume most random ordinary users who want to access IPFS does not actually want to download, install and run an IPFS gateway on their machine to use this new power. They might very well appreciate the idea and convenience of accessing a remote IPFS gateway.
After all, the gateways are using HTTPS so at least the transfers are secure, right?
Leading people behind IPFS are even running public IPFS gateways for anyone to use. Both dweb.link and ipfs.io have been mentioned and suggested for default use. Possibly they are the same physical host as they appear to have the same IP addresses (owned by Protocol Labs, one of the big proponents behind IPFS).
In an attempt to make this even easier for users, ffmpeg is made to use a built-in default gateway if none is set by the user.
I would expect very few users to actually have an IPFS gateway set. And even fewer to actually specify a local one.
So, when users want to watch a video using ffmpeg and an ipfs:// URL what happens?
The gateway sees it all
The remote gateway, that is administered by someone, somewhere, gets to see the full incoming request, and all traffic (video, whatever) that is sent back to the client (ffmpeg in this example) goes via the gateway. Sure, the client accesses the gateway via HTTPS so nobody can tamper with the traffic along the way, but the gateway is in full control and can inspect and tamper with the data as much as it likes. And there is no way for the client to know or detect if it is happening.
I am not involved in ffmpeg, nor do I have any insights into how the discussions evolved and were done around this subject, but for curl I have put my foot down and said that we must not blindly use and trust any default remote gateway like this. I believe it is our responsibility to not lure users into this traffic-monitoring setup.
Make sure you trust the gateway you use.
The curl way
curl will probably output an error message if there was no gateway set when an ipfs:// URL is used, and inform the user that it needs a gateway set to function and probably also show a URL for a page that contains more information about what this means and how to specify a gateway.
During the work of making the IPFS support for curl (which I review and comment on, not written or authored myself) I have also learned that some of the IPFS gateways even do regular HTTP 30x redirects to bounce over the client to another gateway.
Meaning: not only do you use and rely a total rando’s gateway on the Internet for your traffic. That gateway might even, on its own discretion, redirect you over to another host. Possibly run somewhere else, monitored by a separate team.
I have insisted, in the PR for ipfs support to curl, that the IPFS URL handling code should not automatically follow such gateway redirects, as I believe that adds even more risk to the user so if a user wants to allow this operation, it should be opt-in.
Imagine rising use of this
I would imagine that the ones hosting a popular default gateway either becomes overloaded and slow, or they need to scale up. If they scale up, they risk to leak traffic even wider. Scaling up also makes the operation more expensive, leading to incentives to make money somehow to finance it. Will the cookie car in the form of a massive data trove perhaps then be used/sold?
Users of these gateways get no promises, no rights, no contracts.
I am not dissing the idea nor implementation of IPFS itself. I think using HTTP gateways to access IPFS is a good idea in general as it makes the network more accessible. It is just a fragile solution that easily misleads users to do things they maybe shouldn’t. So maybe a little too accessible?
On July 1st 2017, exactly five years ago today, the OSS-Fuzz project “adopted” curl into their program and started running fuzz tests against it.
OSS-Fuzz is a project run by Google and they do fuzzing on a large amount of open source projects: OSS-Fuzz aims to make common open source software more secure and stable by combining modern fuzzing techniques with scalable, distributed execution.
That initial adoption of curl into OSS-Fuzz was done entirely by Google themselves and its fuzzing integration was rough and not ideal but it certainly got the ball rolling.
Later in in the fall of 2017, Max Dymond stepped up and seriously improved the curl-fuzzer so that it would better test protocols and libcurl options deeper and to a higher degree. (Max subsequently got a grant from Google for his work.)
Fuzzing is really the next-level thing all projects should have a go at when there is no compiler warning left to fix and the static code analyzers all report zero defects, because fuzzing has a way to find silly mistakes, off-by-ones etc in a way no analyzers can and that is very hard for human reviewers to match.
curl is a network tool and library written in C, it is meant to handle non-trusted data properly and doing it wrongly can have serious repercussions.
For the curl project, OSS-Fuzz has found several silly things and bad code paths especially after that rework Max put in. We have no less than 6 curl CVEs registered in 2017 and 2018 giving OSS-Fuzz (at least partial) credit for their findings.
In the first year or two, OSS-Fuzz truly kept us on our toes and we fixed numerous other flaws as well that weren’t CVE worthy but still bugs. At least 36 separate ones to date.
The beauty of OSS-Fuzz
For a little Open Source project such as curl, this service is great and in case you never worked with it let me elaborate what makes me enjoy it so much.
Google runs all the servers and keeps the service running. We don’t have to admin it, spend energy on maintaining etc.
They automatically pull the latest curl code from our master branch and keep fuzzing it. All day, every day, non-stop.
When they find a problem, we get a bug submitted and an email sent. The bug report they create contains stack traces and a minimized and reproducible test case. In most cases, that means I can download a small file that is often less than a hundred bytes, and run my locally built fuzzer code with this input and see the same issue happen in my end.
The bug they submit is initially closed for the public and only accessible to select team of curl developers.
90 days after the initial bug was filed, it is made public automatically.
The number of false positives are remarkably low, meaning that almost every time the system emails me, it spotted a genuine problem.
Fuzzing per commit
In 2020, the OSS-Fuzz team introduced CIFuzz, which allows us to run a little bit of fuzzing – for a limited time – on a PR as part of the CI pipeline. Using this, we catch the most obvious mistakes even before we merge the code into our master branch!
After the first few years of us fixing bad code paths, and with the introduction of CIFuzz, we nowadays rarely get reports from OSS-Fuzz anymore. And when we do, we have it complain on code that hasn’t been released yet. The number of CVEs detected in curl by OSS-Fuzz has therefore seemingly stopped at 6.
Taking fuzzing further
After five years of fuzzing, we would benefit from extending and improving the “hooks” where and how the fuzzing is made so that we can help it find and reach code paths with junk that it so far hasn’t reached. There are of course still potentially many undetected flaws in curl remaining because of the limited set of entry point we have for the fuzzer.
I hope OSS-Fuzz continues and lives on for a long time. It certainly helps curl a lot!
Saturday June 18: I had some curl time in the afternoon and I was just about to go edit the four security advisories I had pending for the next release, to brush up the language and check that they read fine, when it dawned on me.
These particular security advisories were still in draft versions but maybe 90% done. There were details, like dates and links to current in-progress patches, left to update. I also like to reread them a few times, especially in a webpage rendered format, to make sure they are clear and accurate in describing the problem, the solution and all other details, before I consider them ready for publication.
I checked out my local git branch where I expected the advisories to reside. I always work on pending security details in a local branch named security-next-release or something like that. The branch and its commits remain private and undisclosed until everything is ready for publication.
(I primarily use git command lines in terminal windows.)
The latest commits in my git log output did not show the advisories so I did a rebase but git promptly told me there was nothing to rebase! Hm, did I use another branch this time?
It took me a few seconds to realize my mistake. I saw four commits in the git master branch containing my draft advisories and then it hit me: I had accidentally pushed them to origin master and they were publicly accessible!
The secrets I was meant to guard until the release, I had already mostly revealed to the world – for everyone who was looking.
In retrospect I can’t remember exactly how the mistake was done, but I clearly committed the CVE documents in the wrong branch when I last worked on them, a little over a week ago. The commit date say June 9.
On June 14, I got a bug report about a problem with curl’s .well-known/security.txt file (RFC 9116) where it was mentioned that our file didn’t have an Expires: keyword in spite of it being required in the spec. So I fixed that oversight and pushed the update to the website.
When doing that push, I did not properly verify exactly what other changes that would be pushed in the same operation, so when I pressed enter there, my security advisories that had accidentally been committed in the wrong branch five days earlier and still were present there were also pushed to the remote origin. Swooosh.
The advisories are created in markdown format, and anyone who would update their curl-www repository after June 14 would then get them into their local repository. Admittedly, there probably are not terribly many people who do that regularly. Anyone could also browse them through the web interface on github. Also probably not something a lot of people do.
These pending advisories would however not appear on the curl website since the build files were not updated to generate the HTML versions. If you could guess the right URL, you could still get the markdown version to show on the site.
Nobody reported this mistake in the four days they were visible before I realized my own mistake (and nobody has reported it since either). I then tried googling the CVE numbers but no search seemed to find and link to the commits. The CVE numbers were registered already so you would mostly get MITRE and other vulnerability database listings that were still entirely without details.
After some quick deliberations with my curl security team friends, we decided expediting the release was the most sensible thing to do. To reduce the risk that someone takes advantage of this and if they do, we limit the time window before the problems and their fixes become known. For curl users security’s sake.
Previously, the planned release date was set to July 1st – thirteen days away. It had already been adjusted somewhat to not occur on its originally intended release Wednesday to cater for my personal summer plans.
To do a proper release with several security advisories I want at least a few days margin for the distros mailing list to prepare before we go public with everything. There was also the Swedish national midsummer holiday coming up next weekend and I did not feel like ruining my family’s plans and setup for that, so I picked the first weekday after midsummer: June 27th.
While that is just four days earlier than what we had previously planned, I figure those four days might be important and if we imagine that someone finds a way to exploit one of these problems before then, then at least we shorten the attack time window by four days.
When working with my security advisories, I must pay more attention and be more careful with which branch I commit to.
When pushing commits to the website and I know I have pending security sensitive details locally that have not been revealed yet, I should make it a habit to double-check that what I am about to push is only and nothing but what I expect to be there.
Simultaneously, I have worked using this process for many years now and this is the first time I did this mistake. I do not think we need to be alarmist about it.