Category Archives: Security

curl bug-bounty stats

tldr: the curl bug-bounty has been an astounding success so far.

We started the current curl bug-bounty setup in April 2019. We have thus run it for five and a half years give or take.

In the beginning we awarded researchers just a few hundred USD per issue because we did not know where it would go and as we used money from the curl fund (donated money) we wanted to make sure we could afford it.

Since a few years back, the money part of the bug-bounty is sponsored by the Internet Bug Bounty, meaning that the curl project actually earns money for every flaw as we get 20% of the IBB money for each bounty paid.

While the exact award amounts per report vary over time, they are roughly 500 USD for a low severity issue, 2,500 USD for a Medium and almost 5,000 USD for a High severity one.

To this day, we have paid out 84,260 USD to security researchers as rewards for their findings, distributed over 69 separate CVEs. 1,220 USD on average.

Counters

In this period we have received 477 reports, which is about 6 per month on average.

73 of the reports (15.4%) were confirmed and treated as valid security vulnerabilities that ended up CVEs. This also means that we get roughly one valid security report per month on average. Only 3 of these security problems were rated severity High, the rest were Low or Medium. None of them reached the worst level: Critical.

92 of of the reports (19.4%) were confirmed legitimate bugs but not security problems.

311 of the reports (65.3%) were Not Applicable. They were not bugs and not security problems. See below for more on this category.

1 of the reports is still being assessed as I write this.

Tightening the screws

Security is top priority for us but we also continue to develop curl at a high pace. We merge code into the repository at a frequency of more than four bugfixes per day on average over the last couple of years. When we tighten the screws in this project in order to avoid future problems and to mitigate the risks that we add new ones, we need to do it using policies and concepts that still allow us to move fast and be agile.

First response

We have an ambition to always have a first response posted within 24 hours. Over these first 477 reports, we have had a medium response time on under one hour and we have never missed our 24 hour goal. I am personally a little amazed by this feat.

Time to triage

The medium time from filed report until the curl security team has determined and concluded with some confidence that the problem is a security problem is 36 hours.

Assessing

Assessing a (good) report is hard and usually involves a lot of work: reading up on protocols details, reading code, trying different reproducer builds/scripts and bouncing back and forth with the reporters and the security team.

Acknowledging that it is a security problem is only one step. The adjacent one that is at least equally difficult is to then figuring out the severity. How serious is this flaw? A normal pattern is of course that the researcher considers the problem to be several degrees worse than the curl security team does so it can take a great deal of reasoning to reach an agreement. Sometimes we even decree a certain severity against the will of the researcher.

The team

There is a curl security team that works on and with security reports. The awesome people in this group are:

Max Dymond, Dan Fandrich, Daniel Gustafsson, James Fuller, Viktor Szakats, Stefan Eissing and myself.

They are all long-time curl maintainers. Knowledgeable, skilled, trusted.

Report quality

65.3% of the incoming reports are deemed not even a bug.

These reports can be all sorts of different things of course. When promising people money for their reports, there is no surprise that we get a fair share of luck-seekers trying to earn a few bucks the easy route.

Some reporters run scanners against the code, the mail server or the curl website and insist some findings are bounty worthy. The curl bug-bounty does not cover infrastructure, only the products, so they are not covered no matter what.

A surprisingly large amount of the bad reports are on various kinds of “information exposure” on the website – which is often ironic since the entire website already is available in a public git repository and the information exposed is hardly secret.

Reporting scanner results on code without applying your own thinking and confirming that the findings are indeed correct – and actual security problems – is rarely a good idea. That also goes for when asking AIs for finding problems.

Dismissing

Typically, the worse the report is, the quicker it is to dismiss. That is also why having this large share of rubbish is usually not a problem: we normally get rid of them with just a few minutes work spent.

The better crap we get, the worse the problem gets. An AI or a person that writes a long and good-looking report arguing for their sake can take a long time to analyze, asses and eventually debunk.

Since security problems are top priority in the project, getting too much good crap can to some degree cause a denial of service in the project as we need to halt other activities while we take care of the incoming reports.

We run our bug-bounty program on Hackerone, which has a reputation system for reporters. When we close reports as N/A, they get a reputation cut. This works as a mild deterrence for submitting low quality reports. Of course it also sometimes gives the reporter a reason to argue with us and insist we should rather close it as informative which does not come with a reputation penalty.

The good findings

I would claim that it is pretty hard to find a security problem in curl these days, but since we still average in maybe twelve per year recently they certainly still exist.

The valid reports today tend to happen because either a user accidentally did something that made them look, research and unveil something troublesome, or in the more common case: they have put in some real effort into research.

In the latter cases, we see researchers run their own custom fuzzers on parts of the code that our own fuzzers have not exercised as well, we see them check for code patterns that have led to problems before or in other projects and we also see researchers get inspiration by previous reports and fixes to see if perhaps there were gaps left.

The best curl security problem finders today understand the underlying involved protocols, the curl architecture, the source code and they look for inconsistencies between them all, as such might cause security problems.

Bounty hunters

The 69 bug bounty payouts so far have been done to 27 separate individuals. Five reporters have been rewarded for more than two issues each. The true curl security researcher heroes:

ReportsNameRewarded
25Harry Sintonen29,620 USD
8Hiroki Kurosawa9,800 USD
4Axel Chong7,680 USD
4Patrick Monnerat7,300 USD
3z2_4,080 USD
Top-5 curl bounty hunters

We are extremely fortunate to have this skilled set of people tracking down and highlighting our worst mistakes.

Harry of course sticks out in the top with his 25 rewarded curl security reports. More than three times the amount the number two has.

(Before you think the math is wrong: a few reports have been filed that ended up as valid CVEs but for which the reporters have declined getting a monetary reward.)

My advice

I think the curl bug-bounty is an absolute and undisputed success. I believe it is a key part in our mission to keep our users safe and secure.

If you consider kicking off a bug-bounty for your project here’s my little checklist:

  • Do your software engineering proper. Run all the tools, tests, checks, analyzers, scanners, fuzzers you can and make sure they are at zero reported defects. To avoid a raging herd of reports when you open the gates.
  • Start out with conservative bounty amounts to get a lay of the land, then raise them as you go.
  • Own all security problems for your project. Whoever reports them and however they appear, you assess, evaluate, research and fix them. You write and publish the complete and original security advisory.
  • Make sure you have a team. Even the best maintainers need sleep and occasional vacation days. Security is hard and having good people around to bounce problems with is priceless.
  • Close/reject crap reports as quickly as possible to prevent them from wasting team time and energy.
  • Always fix security problems with haste. Never let them linger around.
  • Transparency. Make as much as possible open and public once the CVEs are out, so that your processes, communications, methods are visible. This builds trust and allows for feedback and iterative improvements of the process.

Future

I think we will continue to receive valid security reports going forward, simply because we keep developing at a high pace and we change and add a lot of source code every year.

The trend in recent years have been more security reports, but the ratio of low/medium vs high/critical has sky-rocketed. The issues reported these days tend to be less sever than they were in the past.

My explanation for this is primarily that we have more people looking harder for problems now than in the past. Due to mitigations and past reports we introduce really bad security problems at a lower frequency than before.

Talk: Keeping the world from Burning

On Monday this week, I did a talk at the Nordic Software Security Summit conference in Stockholm Sweden. I titled it CVEMITRECVSSNVDCNAOSS WTF with the subtitle “Keeping the world from Burning”.

The talk was well received and I think it added something to the conversation. Almost every other talk during the rest of the conference that I saw referred back to it.

Since the talk was not recorded (no talks were at this event), I intend to do the presentation again – from home. This time live-streamed and recorded.

This happens on:

Monday September 30, 2024
14:00 UTC (16:00 CEST)

The stream happens on Twitch where I as always am curlhacker. Join the chatroom, ask questions, have a good time. There will of course be room for a Q&A.

No registration. No fee. Just show up.

At the conference, I did the presentation in under thirty minutes. This version might go on a few more minutes.

Abstract

The abstract I provided for this talk to the conference says:

Bogus CVEs, know-better organizations, conflicting databases, AI hallucinations, inflated severity scoring, security scanners, Jia Tan. As the lead developer in the curl project, Daniel describes some of the challenges involved and what you need to do to stay on top of security when working in a high profile Open Source project running in some twenty billion instances. The talk will be involving many examples from real life.

Differences

Since this is a second run of a talk I already did and I have no script, it will not be identical. I will also try to polish some minor details that I felt could need some brush-ups.

Recording

curl, Tor, dot onion and SOCKS

You can of course use curl to access hosts through Tor. (I know you know Tor so I am not going to explain it here.)

SOCKS

The typical way to access Tor is via a SOCKS5 proxy and curl has supported that since some time during 2002. Like this:

curl --socks5-hostname localhost:5432 https://example.com

or

curl --proxy socks5h://localhost:5432 https://example.com

or

export HTTPS_PROXY=socks5h://localhost:5432
curl https://example.com

Name resolving with SOCKS5

You know Tor, but do you know SOCKS5? It is an old and simple protocol for setting up a connection and when using it, the client can decide to either pass on the full hostname it wants to connect to, or it can pass on the exact IP address.

(SOCKS5 is by the way a minor improvement of the SOCKS4 protocol, which did not support IPv6.)

When you use curl, you decide if you want curl or the proxy to resolve the target hostname. If you connect to a site on the public Internet it might not even matter who is resolving it as either party would in theory get the same set of IP addresses.

The .onion TLD

There is a concept of “hidden” sites within the Tor network. They are not accessible on the public Internet. They have names in the .onion top-level domain. For example. the search engine DuckDuckGo is available at https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/.

.onion names are used to provide access to end to end encrypted, secure, anonymized services; that is, the identity and location of the server is obscured from the client. The location of the client is obscured from the server.

To access a .onion host, you must let Tor resolve it because a normal DNS server aware of the public Internet knows nothing about it.

This is why we recommend you ask the SOCKS5 proxy to resolve the hostname when accessing Tor with curl.

The proxy connection

The SOCKS5 protocol is clear text so you must make sure you do not access the proxy over a network as then it will leak the hostname to eavesdroppers. That is why you see the examples above use localhost for the proxy.

You can also step it up and connect to the SOCKS5 proxy over unix domain sockets with recent curl versions like this:

curl --proxy socks5h://localhost/run/tor/socks https://example.com

.onion leakage

Sites using the .onion TLD are not on the public Internet and it is pointless to ask your regular DNS server to resolve them. Even worse: if you in fact ask your normal resolver you practically advertise your intention of connection to a .onion site and you give the full name of that site to the outsider. A potentially significant privacy leak.

To combat the leakage problem, RFC 7686 The “.onion” Special-Use Domain Name was published in October 2015. With the involvement and consent from people involved in the Tor project.

It only took a few months after 7686 was published until there was an accurate issue filed against curl for leaking .onion names. Back then, in the spring of 2016, no one took upon themselves to fix this and it was instead simply added to the queue of known bugs.

This RFC details (among other things) how libraries should refuse to resolve .onion host names using the regular means in order to avoid the privacy leak.

After having stewed in the known bugs lists for almost five years, it was again picked up in 2023, a pull-request was authored, and when curl 8.1.0 shipped on May 17 2023 curl refused to resolve .onion hostnames.

Tor still works remember?

Since users are expected to connect using SOCKS5 and handing over the hostname to the proxy, the above mention refusal to resolve a .onion address did not break the normal Tor use cases with curl.

Turns out there are other common ways to do it.

A few days before the 8.1.0 release shipped a discussion thread was created: I want to resolve onion addresses.

Every change breaks someone’s workflow

XKCD 1172 – we hear you

Transparent proxies

Turns out there is a group of people who runs transparent proxies who automatically “catches” all local traffic and redirects it over Tor. They have a local DNS server who can resolve .onion host names and they intercept outgoing traffic to instead tunnel it through Tor.

With this setup now curl no longer works because it will not send .onion addresses to the local resolver because RFC 7686 tells us we should not,

curl of course does not know when it runs in a presumed safe and deliberate transparent proxy network or when it does not. When a leak is not a leak or when it actually is a leak.

torsocks

A separate way to access tor is to use the torsocks tool. Torsocks allows you to use most applications in a safe way with Tor. It ensures that DNS requests are handled safely and explicitly rejects any traffic other than TCP from the application you’re using.

You run it like

torsocks curl https://example.com

Because of curl’s new .onion filtering, the above command line works fine for “normal” hostnames but no longer for .onion hostnames.

Arguably, this is less of a problem because when you use curl you typically don’t need to use torsocks since curl has full SOCKS support natively.

Option to disable the filter?

In the heated discussion thread we are told repeatedly how silly we are who block .onion name resolves – exactly in the way the RFC says, the RFC that had the backing and support from the Tor project itself. There are repeated cries for us to add ways to disable the filter.

I am of course sympathetic with the users whose use cases now broke.

A few different ways to address this have been proposed, but the problem is difficult: how would curl or a user know that it is fine to leak a name or not? Adding a command line option to say it is okay to leak would just mean that some scripts would use that option and users would run it in the wrong conditions and your evil malicious neighbors who “help out” will just add that option when they convince their victims to run an innocent looking curl command line.

The fact that several of the louder voices show abusive tendencies in the discussion of course makes these waters even more challenging to maneuver.

Future

I do not yet know how or where this lands. The filter has now been in effect in curl for a year. Nothing is forever, we keep improving. We listen to feedback and we are of course eager to make sure curl remains and awesome tool and library also for content over Tor.

This discussion is also held within the more proper realms of the tor project itself.

Credits

Image by Couleur from Pixabay

The I in LLM stands for intelligence

I have held back on writing anything about AI or how we (not) use AI for development in the curl factory. Now I can’t hold back anymore. Let me show you the most significant effect of AI on curl as of today – with examples.

Bug Bounty

Having a bug bounty means that we offer real money in rewards to hackers who report security problems. The chance of money attracts a certain amount of “luck seekers”. People who basically just grep for patterns in the source code or maybe at best run some basic security scanners, and then report their findings without any further analysis in the hope that they can get a few bucks in reward money.

We have run the bounty for a few years by now, and the rate of rubbish reports has never been a big problem. Also, the rubbish reports have typically also been very easy and quick to detect and discard. They have rarely caused any real problems or wasted our time much. A little like the most stupid spam emails.

Our bug bounty has resulted in over 70,000 USD paid in rewards so far. We have received 415 vulnerability reports. Out of those, 64 were ultimately confirmed security problems. 77 of the report were informative, meaning they typically were bugs or similar. Making 66% of the reports neither a security issue nor a normal bug.

Better crap is worse

When reports are made to look better and to appear to have a point, it takes a longer time for us to research and eventually discard it. Every security report has to have a human spend time to look at it and assess what it means.

The better the crap, the longer time and the more energy we have to spend on the report until we close it. A crap report does not help the project at all. It instead takes away developer time and energy from something productive. Partly because security work is consider one of the most important areas so it tends to trump almost everything else.

A security report can take away a developer from fixing a really annoying bug. because a security issue is always more important than other bugs. If the report turned out to be crap, we did not improve security and we missed out time on fixing bugs or developing a new feature. Not to mention how it drains you on energy having to deal with rubbish.

AI generated security reports

I realize AI can do a lot of good things. As any general purpose tool it can also be used for the wrong things. I am also sure AIs can be trained and ultimately get used even for finding and reporting security problems in productive ways, but so far we have yet to find good examples of this.

Right now, users seem keen at using the current set of LLMs, throwing some curl code at them and then passing on the output as a security vulnerability report. What makes it a little harder to detect is of course that users copy and paste and include their own language as well. The entire thing is not exactly what the AI said, but the report is nonetheless crap.

Detecting AI crap

Reporters are often not totally fluent in English and sometimes their exact intentions are hard to understand at once and it might take a few back and fourths until things reveal themselves correctly – and that is of course totally fine and acceptable. Language and cultural barriers are real things.

Sometimes reporters use AIs or other tools to help them phrase themselves or translate what they want to say. As an aid to communicate better in a foreign language. I can’t find anything wrong with that. Even reporters who don’t master English can find and report security problems.

So: just the mere existence of a few give-away signs that parts of the text were generated by an AI or a similar tool is not an immediate red flag. It can still contain truths and be a valid issue. This is part of the reason why a well-formed crap report is harder and takes longer to discard.

Exhibit A: code changes are disclosed

In the fall of 2023, I alerted the community about a pending disclosure of CVE-2023-38545. A vulnerability we graded severity high.

The day before that issue was about to be published, a user submitted this report on Hackerone: Curl CVE-2023-38545 vulnerability code changes are disclosed on the internet

That sounds pretty bad and would have been a problem if it actually was true.

The report however reeks of typical AI style hallucinations: it mixes and matches facts and details from old security issues, creating and making up something new that has no connection with reality. The changes had not been disclosed on the Internet. The changes that actually had been disclosed were for previous, older, issues. Like intended.

In this particular report, the user helpfully told us that they used Bard to find this issue. Bard being a Google generative AI thing. It made it easier for us to realize the craziness, close the report and move on. As can be seen in the report log, we did have to not spend much time on researching this.

Exhibit B: Buffer Overflow Vulnerability

A more complicated issue, less obvious, done better but still suffering from hallucinations. Showing how the problem grows worse when the tool is better used and better integrated into the communication.

On the morning of December 28 2023, a user filed this report on Hackerone: Buffer Overflow Vulnerability in WebSocket Handling. It was morning in my time zone anyway.

Again this sounds pretty bad just based on the title. Since our WebSocket code is still experimental, and thus not covered by our bug bounty it helped me to still have a relaxed attitude when I started looking at this report. It was filed by a user I never saw before, but their “reputation” on Hackerone was decent – this was not their first security report.

The report was pretty neatly filed. It included details and was written in proper English. It also contained a proposed fix. It did not stand out as wrong or bad to me. It appeared as if this user had detected something bad and as if the user understood the issue enough to also come up with a solution. As far as security reports go, this looked better than the average first post.

In the report you can see my first template response informing the user their report had been received and that we will investigate the case. When that was posted, I did not yet know how complicated or easy the issue would be.

Nineteen minutes later I had looked at the code, not found any issue, read the code again and then again a third time. Where on earth is the buffer overflow the reporter says exists here? Then I posted the first question asking for clarification on where and how exactly this overflow would happen.

After repeated questions and numerous hallucinations I realized this was not a genuine problem and on the afternoon that same day I closed the issue as not applicable. There was no buffer overflow.

I don’t know for sure that this set of replies from the user was generated by an LLM but it has several signs of it.

Ban these reporters

On Hackerone there is no explicit “ban the reporter from further communication with our project” functionality. I would have used it if it existed. Researchers get their “reputation” lowered then we close an issue as not applicable, but that is a very small nudge when only done once in a single project.

I have requested better support for this from Hackerone. Update: this function exists, I just did not look at the right place for it…

Future

As these kinds of reports will become more common over time, I suspect we might learn how to trigger on generated-by-AI signals better and dismiss reports based on those. That will of course be unfortunate when the AI is used for appropriate tasks, such as translation or just language formulation help.

I am convinced there will pop up tools using AI for this purpose that actually work (better) in the future, at least part of the time, so I cannot and will not say that AI for finding security problems is necessarily always a bad idea.

I do however suspect that if you just add an ever so tiny (intelligent) human check to the mix, the use and outcome of any such tools will become so much better. I suspect that will be true for a long time into the future as well.

I have no doubts that people will keep trying to find shortcuts even in the future. I am sure they will keep trying to earn that quick reward money. Like for the email spammers, the cost of this ends up in the receiving end. The ease of use and wide access to powerful LLMs is just too tempting. I strongly suspect we will get more LLM generated rubbish in our Hackerone inboxes going forward.

Discussion

Hacker news

Credits

Image by TungArt7

Making it harder to do wrong

You know I spend all my days working on curl and related matters. I also spend a lot of time thinking on the project; like how we do things and how we should do things.

The security angle of this project is one of the most crucial ones and an area where I spend a lot of time and effort. Dealing with and assessing security reports, handling the verified actual security vulnerabilities and waiving off the imaginary ones.

150 vulnerabilities

The curl project recently announced its 150th published security vulnerability and its associated CVE. 150 security problems through a period of over 25 years in a library that runs in some twenty billion installations? Is that a lot? I don’t know. Of course, the rate of incoming security reports is much higher in modern days than it was decades ago.

Out of the 150 published vulnerabilities, 60 were reported and awarded money through our bug-bounty program. In total, the curl bug-bounty has of today paid 71,400 USD to good hackers and security researchers. The monetary promise is an obvious attraction to researchers. I suppose the fact that curl also over time has grown to run in even more places, on more architectures and in even more systems also increases people’s interest in looking into and scrutinize our code. curl is without doubt one of the world’s most widely installed software components. It requires scrutiny and control. Do we hold up our promises?

curl is a C program running in virtually every internet connect device you can think of.

Trends

Another noticeable trend among the reports the last decade is that we are getting way more vulnerabilities reported with severity level low or medium these days, while historically we got more ones rated high or even critical. I think this is partly because of the promise of money but also because of a generally increased and sharpened mindset about security. Things that in the past would get overlooked and considered “just a bug” are nowadays more likely to get classified as security problems. Because we think about the problems wearing our security hats much more now.

Memory-safety

Every time we publish a new CVE people will ask about when we will rewrite curl in a memory-safe language. Maybe that is good, it means people are aware and educated on these topics.

I will not rewrite curl. That covers all languages. I will however continue to develop it, also in terms of memory-safety. This is what happens:

  1. We add support for more third party libraries written in memory-safe languages. Like the quiche library for QUIC and HTTP/3 and rustls for TLS.
  2. We are open to optionally supporting a separate library instead of native code, where that separate library could be written in a memory-safe language. Like how we work with hyper.
  3. We keep improving the code base with helper functions and style guides to reduce risks in the C code going forward. The C code is likely to remain with us for a long time forward, no matter how much the above mention areas advance. Because it is the mature choice and for many platforms still the only choice. Rust is cool, but the language, its ecosystem and its users are rookies and newbies for system library level use.

Step 1 and 2 above means that over time, the total amount of executable code in curl gradually can become more and more memory-safe. This development is happening already, just not very fast. Which is also why number 3 is important and is going to play a role for many years to come. We move forward in all of these areas at the same time, but with different speeds.

Why no rewrite

Because I’m not an expert on rust. Someone else would be a much more suitable person to lead such a rewrite. In fact, we could suspect that the entire curl maintainer team would need to be replaced since we are all old C developers maybe not the most suitable to lead and take care of a twin project written in rust. Dedicated long-term maintainer internet transfer library teams do not grow on trees.

Because rewriting is an enormous project that will introduce numerous new problems. It would take years until the new thing would be back at a similar level of rock solid functionality as curl is now.

During the initial years of the port’s “beta period”, the existing C project would continue on and we would have two separate branches to maintain and develop, more than doubling the necessary work. Users would stay on the first version until the second is considered stable, which will take a long time since it cannot become stable until it gets a huge amount of users to use it.

There is quite frankly very little (if any) actual demand for such a rewrite among curl users. The rewrite-it-in-rust mantra is mostly repeated by rust fans and people who think this is an easy answer to fixing the share of security problems that are due to C mistakes. Typically, the kind who has no desire or plans to participate in said venture.

C is unsafe and always will be

The C programming language is not memory-safe. Among the 150 reported curl CVEs, we have determined that 61 of them are “C mistakes”. Problems that most likely would not have happened had we used a memory-safe language. 40.6% of the vulnerabilities in curl reported so far could have been avoided by using another language.

Rust is virtually the only memory-safe language that is starting to become viable. C++ is not memory-safe and most other safe languages are not suitable for system/library level use. Often because how they fail to interface well with existing C/C++ code.

By June 2017 we had already made 51 C mistakes that ended up as vulnerabilities and at that time Rust was not a viable alternative yet. Meaning that for a huge portion of our problems, Rust was too late anyway.

40 is not 70

In lots of online sources people repeat that when writing code with C or C++, the share of security problems due to lack of memory-safety is in the range 60-70% of the flaws. In curl, looking per the date of when we introduced the flaws (not reported date), we were never above 50% C mistakes. Looking at the flaw introduction dates, it shows that this was true already back when the project was young so it’s not because of any recent changes.

If we instead count the share per report-date, the share has fluctuated significantly over time, as then it has depended on when people has found which problems. In 2010, the reported problems caused by C mistakes were at over 60%.

Of course, curl is a single project and not a statistical proof of any sort. It’s just a 25 year-old project written in C with more knowledge of and introspection into these details than most other projects.

Additionally, the share of C mistakes is slightly higher among the issues rated with higher severity levels: 51% (22 of 43) of the issues rated high or critical was due to C mistakes.

Help curl authors do better

We need to make it harder to write bad C code and easier to write correct C code. I do not only speak of helping others, I certainly speak of myself to a high degree. Almost every security problem we ever got reported in curl, I wrote. Including most of the issues caused by C mistakes. This means that I too need help to do right.

I have tried to learn from past mistakes and look for patterns. I believe I may have identified a few areas that are more likely than others to cause problems:

  1. strings without length restrictions, because the length might end up very long in edge cases which risks causing integer overflows which leads to issues
  2. reallocs, in particular without length restrictions and 32 bit integer overflows
  3. memory and string copies, following a previous memory allocation, maybe most troublesome when the boundary checks are not immediately next to the actual copy in the source code
  4. perhaps this is just subset of (3), but strncpy() is by itself complicated because of the padding and its not-always-null-terminating functionality

We try to avoid the above mentioned “problem areas” like this:

  1. We have general maximum length restrictions for strings passed to libcurl’s API, and we have set limits on all internally created dynamic buffers and strings.
  2. We avoid reallocs as far as possible and instead provide helper functions for doing dynamic buffers. In fact, avoiding all sorts of direct memory allocations help.
  3. Many memory copies cannot be avoided, but if we can use a pointer and length instead that is much better. If we can snprintf() a target buffer that is better. If not, try do the copy close to the boundary check.
  4. Avoid strncpy(). In most cases, it is better to just return error on too long input anyway, and then instead do plain strcpy or memcpy with the exact amount. Ideally of course, just using a pointer and the length is sufficient.

These helper functions and reduction of “difficult functions” in the code are not silver bullets. They will not magically make us avoid future vulnerabilities, they should just ideally make it harder to do security mistakes. We still need a lot of reviews, tools and testing to verify the code.

Clean code

Already before we created these helpers we have gradually and slowly over time made the code style and the requirements to follow it, stricter. When the source code looks and feels coherent, consistent, as if written by a single human, it becomes easier to read. Easier to read becomes easier to debug and easier to extend. Harder to make mistakes in.

To help us maintain a consistent code style, we have tool and CI job that runs it, so that obvious style mistakes or conformance problems end up as distinct red lines in the pull request.

Source verification

Together with the strict style requirement, we also of course run many compilers with as many picky compiler flags enabled as possible in CI jobs, we run fuzzers, valgrind, address/memory/undefined behavior sanitizers and we throw static code analyzers on the code – in a never-ending fashion. As soon as one of the tools gives a warning or indicates that something could perhaps be wrong, we fix it.

Of course also to verify the correct functionality of the code.

Data for this post

All data and numbers I speak of in this post are publicly available in the curl git repositories: curl and curl-www. The graphs come from the curl web site dashboard. All graph code is available.

NVD damage continued

There is something about having your product installed in over twenty billion instances all over the world and even out of the globe. In my case it helps me remain focused on and committed to working on the security aspects of curl. Ideally, we will never have our heartbleed moment.

Security is also a generally growing concern in the world around us and Open Source security perhaps especially so. This is one reason why NVD making things up is such a big problem.

The National Vulnerability Database (NVD) has a global presence. They host and share information about security vulnerabilities. If you search for a CVE Id using your favorite search engine, it is likely that the first result you get is a link to NVD’s page with information about that specific CVE. They take it upon themselves to educate the world about security issues. A job that certainly is needed but also one that puts a responsibility and requirement on them to be accurate. When they get things wrong they help distributing misinformation. Misinformation makes people potentially draw the wrong conclusions or act in wrong, incomplete or exaggerated ways.

Low or Medium severity issues

There are well-known, recognized and reputable Open Source projects who by policy never issue CVEs for security vulnerabilities they rank severity low or medium. (I will not identify such projects here because it is not the point of this post.)

Such a policy successfully avoids the risk that NVD will greatly inflate their issues since they can already only be high or critical. But is it helping the users and the ecosystem at large?

In the curl project we have a policy which makes us register a CVE for every single reported or self-detected problem that can have a security impact. Either at will or by mistake. This includes a fair amount of low and medium issues. The amount of low and medium issues as a total of all issues increases over time as we keep finding issues, but the really bad ones are less frequently reported.

As we have all data recorded and stored, we can visualize this development over time. Below is a graph showing the curl vulnerability and severity level trends since 2010.

Severity distribution in reported curl vulnerabilities since 2010

Out of the 145 published curl security vulnerabilities so far, 28% have been rated severity high or critical while 104 of them were set low or medium by the curl security team.

I think this trend is easy to explain. It is because of two separate developments:

  1. We as a project have matured and have learned over time how to test better, write code better to minimize risk and we have existed for a while to have a series of truly bad flaws already found (and fixed). We make less serious bugs these days.
  2. Since 2010, lots of more people look for security problems and these days we are much better better at identifying problems as security related and we have better tools, while for a few years ago the same problem would just have become “a bug fix”.

Deciding severity

When a security problem is reported to curl, the curl security team and the reporter collaborate. First to make sure we understand the full width of the problem and its security impact. What can happen and what is required for that badness to trigger? Further, we assess what the likeliness that this can be done on purpose or by mistake and how common those situations and required configurations might be. We know curl, we know the code but we also often go back and double-check exactly what the documentation says and promises to better assess what users should be expected to know and do, and what is not expected from them etc. And we re-read the involved code again and again.

curl is currently a little over 160,000 lines of feature packed C code (excluding blank lines). It might not always be straight forward to a casual observer exactly how everything is glued together even if we try to also document internals to help you find deeper knowledge.

I think it is fair to say that it requires a certain amount of experience and time spent with the code to be able to fully understand a curl security issue and what impact it might have. I believe it is difficult or next to impossible for someone without knowledge about how it works to just casually read our security advisories and try to second-guess our assessments and instead make your own.

Yet this is exactly what NVD does. They don’t even ask us for help or for clarifications of anything. They think they can assess the severity of our problems without knowing curl nor fully understanding the reported issues.

A case to prove my point

In March 2023 we published a security advisory for the problem commonly referred to as CVE-2023-27536.

This is potentially a security problem, probably never hurts anyone and is in fact quite unlikely to ever cause a problem. But it might. So after deliberating we accepted it and ranked it severity low.

Bear with me here. I’ll spend two paragraphs revealing some details from the internal libcurl engine:

The problem is of a kind we have had several times in the past: curl has a connection pool and when a user makes a subsequent request which this particular option modified (compared to how it was when the previous connection was setup) it would wrongly reuse the first connection thinking they had the exact same properties.

The second would then accidentally get the wrong rights because it was setup differently. Still, the first connection would need the correct credentials and everything and so would the second one, it would just differ depending on what “GSSAPI delegation” that is allowed.

NVD ranks this

The person or team at NVD whose job it is to make up stuff for security vulnerabilities ranked this as CRITICAL 9.8. Almost as bad as it gets apparently. 10 is the max as you might recall.

When realizing this, at the end of May, I first fell off my chair in shock by this insanity, but after a quick recovery I emailed them (again) and complained (yet again) on setting this severity for *27536. I used the word “ridiculous” in my email to describe their actions. Why and who benefits from them scaremongering the world like this? It makes no sense. On the contrary, this is bad for everyone.

As a reaction to my complaint, someone at NVD went back and agreed to revise the CVSS string they had set and suddenly it was “only” ranked HIGH 7.2. I say “someone” because they never communicate with names and never sign the emails which whomever I talk to. They are just “NVD”.

I objected to their new CVSS string as well. It is just not a high severity security problem!

In my new argument I changed two particular details in the CVSS string (compared to the one they insisted was good) and presented arguments for that. For your pleasure, I include my exact wording below. (Some emphasis is added here for display purposes.)

How I motivated a downgrade

I could possibly live with: AV:N/AC:H/PR:H/UI:N/S:U/C:H/I:N/A:N (4.4) - even if that means Medium and we argue Low.

These are two changes and my motivations:

Attack complexity high - because how this requires that you actually have a working first communication and then do a second is slightly changed and you would expect the second to be different but in reality it accidentally reuse the first connection and therefore gives different/elevated rights.

It is a super-niche and almost impossible attack and there has been no report ever of anyone having suffered from this or even the existence of an application that actually would enable it to happen.

It is more likely to only happen by mistake by an application, but it also seems unlikely to ever be used by an application in a way that would trigger it since having the same user credentials with different values for GSS delegation and assume different access levels seems … weird.

This almost impossible chance of occurring is the primary reason we think this is a Low severity. With CVSS, it seems impossible to reach Low.

Privileges Required high - because the only way you can trigger this flaw is by having full privileges for the *same* user credentials that is later used again but with changed GSS API delegation set. While the previous connection is still live in the connection pool.

It would also only be an attack or a flaw if that second transfer actually assumes to have different access properties, which is probably debatable if users of the API would expect or not

CVSS still sucks

CVSS is a crap system so using this single-dimension number it seems next to impossible to actually get severity low report.

NVD wants “public sources”

NVD does not just take my word for how curl works. I mean, I only wrote a large chunk of it and am probably the single human that knows most about its internals and how it works. I also wrote the patch for this issue, I wrote the connection pool logic and I understand the problem exactly. Nope, just because I say so does not make it true.

My claims above about this issue can of course be verified by reading the publicly available source code and you can run tests to reproduce my claims. Not to mention that the functionality in question is documented.

But no.

They decided to agree to one of my proposed changes, which further downgraded the severity to MEDIUM 5.9. Quite far away from their initial stance. I think it is at least a partial victory.

For the second change to the CVSS string I requested, they demand that I provide more information for them. In their words:

There is no publicly available information about the CVE that clarifies your statement so we must request clarification from you and additionally have this detail added to the HackerOne report or some other public interface for transparency purposes prior to making changes to the CVSS vector.

… which just emphasizes exactly what I have stated already in this post. They set a severity on this without understanding the issue, with no knowledge of the feature that gets this wrong and without clues about what is actually necessary to trigger this flaw in the first place.

For people intimately familiar with curl internals, we actually don’t have to spell out all these facts with excruciating details. We know how the connection pool works, how the reuse of connections should work and what it means when curl gets it wrong. We have also had several other issues in this areas in the past. (It is a tricky area to get right.)

But it does not make this CVE more than a Low severity issue.

Conclusion

This issue is now stuck at this MEDIUM 5.9 at NVD. Much less bad than where they started. Possibly Low or Medium does not make a huge difference out there in the world.

I think it is outrageous that I need to struggle and argue for such a big and renowned organization to do right. I can’t do this for every CVE we have reported because it takes serious time and energy, but at the same time I have zero expectation of them getting this right. I can only assume that they are equally lost and bad when assessing security problems in other projects as well.

A completely broken and worthless system. That people seem to actually use.

It is certainly tempting to join the projects that do not report Low or Medium issues at all. If we would stop doing that, at least NVD would not shout wolf and foolishly claim they are critical.

My response

That is a ridiculous request.

I'm stating *verifiable facts* about the flaw and how curl is vulnerable to it. The publicly available information this is based on is the actual source code which is openly available. You can also verify my claims by running code and checking what happens and then you'd see that my statements match what the code does.

The fact that you assess the severity of this (and other) CVE without understanding the basic facts of how it works and what the vulnerability is, just emphasizes how futile your work is: it does not work. If you do not even bother to figure these things out then of course you cannot set a sensible severity level or CVSS score. Now I understand your failures much better.

We in the curl project's security team already know how curl works, we understand this vulnerability and we set the severity accordingly. We don't need to restate known facts. curl functionality is well documented and its source code has always been open and public.

If you have questions after having read that, feel free to reach out to the curl security team and we can help you. You reach us at security@curl.se

I recommend that you (NVD) always talk to us before you set CVSS scores for curl issues so that we can help guide you through them. I think that could make the world a better place and it would certainly benefit a world of curl users who trust the info you provide.

 / Daniel

deleting system32\curl.exe

Let me tell you a story about how Windows users are deleting files from their installation and as a consequence end up in tears.

Background

The real and actual curl tool has been shipped as part of Windows 10 and Windows 11 for many years already. It is called curl.exe and is located in the System32 directory.

Microsoft ships this bundled with its Operating system. They get the code from the curl project but Microsoft builds, tests, ships and are in all ways responsible for their operating system.

NVD inflation

As I have blogged about separately earlier, the next brick in the creation of this story is the fact that National Vulnerability Database deliberately inflates the severity levels of security flaws in its vast database. They believe scaremongering serves their audience.

In one particular case, CVE-2022-43552 was reported by the curl project in December 2022. It is a use-after-free flaw that we determined to be severity low and not higher mostly because of the very limited time window you need to make something happen for it to be exploited or abused. NVD set it to medium which admittedly was just one notch higher (this time).

This is not helpful.

“Security scanners”

Lots of Windows users everywhere runs security scanners on their systems with regular intervals in order to verify that their systems are fine. At some point after December 21, 2022, some of these scanners started to detect installations of curl that included the above mentioned CVE. Nessus apparently started this on February 23.

This is not helpful.

Panic

Lots of Windows users everywhere then started to panic when these security applications warned them about their vulnerable curl.exe. Many Windows users are even contractually “forced” to fix (all) such security warnings within a certain time period or risk bad consequences and penalties.

How do you fix this?

I have been asked numerous times about how to fix this problem. I have stressed at every opportunity that it is a horrible idea to remove the system curl or to replace it with another executable. It is very easy to download a fresh curl install for Windows from the curl site – but we still strongly discourage everyone from replacing system files.

But of course, far from everyone asked us. A seemingly large enough crowd has proceeded and done exactly what we would stress they should not: they deleted or replaced their C:\Windows\System32\curl.exe.

The real fix is of course to let Microsoft ship an update and make sure to update then. The exact update that upgrades curl to version 8.0.1 is called KB5025221 and shipped on April 11. (And yes, this is the first time you get the very latest curl release shipped in a Windows update)

The people who deleted or replaced the curl executable noticed that they cannot upgrade because the Windows update procedure detects that the Windows install has been tampered with and it refuses to continue.

I do not know how to restore this to a state that Windows update is happy with. Presumably if you bring back curl.exe to the exact state from before it could work, but I do not know exactly what tricks people have tested and ruled out.

Bad advice

I have been pointed to responses on the Microsoft site answers.microsoft.com done by “helpful volunteers” that specifically recommend removing the curl.exe executable as a fix.

This is not helpful.

I don’t want to help spreading that idea so I will not link to any such post. I have reported this to Microsoft contacts and I hope they can maybe edit or comment those posts soon.

We are not responsible

I just want to emphasize that if you install and run Windows, your friendly provider is Microsoft. You need to contact Microsoft for support and help with Windows related issues. The curl.exe you have in System32 is only provided indirectly by the curl project and we cannot fix this problem for you. We in fact fixed the problem in the source code already back in December 2022.

If you have removed curl.exe or otherwise tampered with your Windows installation, the curl project cannot help you.

Credits

Image by Alexa from Pixabay

Discussions

Hacker news

NVD makes up vulnerability severity levels

When a security vulnerability has been found and confirmed in curl, we request a CVE Id for the issue. This is a global unique identifier for this specific problem. We request the ID from our CVE Numbering Authority (CNA), Hackerone, which once we make the issue public will publish all details about it to MITRE, which hosts the central database.

In the curl project we have until today requested CVE Ids for and provided information about 135 vulnerabilities spread out over twenty-five years.

A CVE identifier affects a specific product (or set of products), and the problem affects the product from a version until a fixed version. And then there is a severity. How bad is the problem?

CVSS score

The Common Vulnerability Scoring System (CVSS) is a way to grade severity on a scale from zero to ten. You typically use a CVSS calculator, fill in the info as good as you can and voilá, out comes a score.

The ranges have corresponding names:

NameRange
Lowlower 4
Medium4.0-6.9
High7.0-8.9
Critical9 or higher

CVSS is a shitty system

Anyone who ever gets a problem reported for their project and tries to assess and set a CVSS score will immediately realize what an imperfect, simplified and one-dimensional concept this is.

The CVSS score leaves out several very important factors like how widespread the affected platform is, how common the affected configuration is and yet it is still very subjective as you need to assess as and mark different things as None, Low, Medium or High.

The same bug is therefore likely to end up with different CVSS scores depending on who fills in the form – even when the persons are familiar with the product and the error in question.

curl severity

In the curl project we decided to abandon CVSS years ago because of its inherent problems. Instead we use only the four severity names: Low, Medium, High, and Critical and we work out the severity together in the curl security team as we work on the vulnerability. We make sure we understand the problem, the risks, its prevalence and more. We take all factors into account and then we set a severity level we think helps the world understand it.

All security vulnerabilities are vulnerabilities and therefore security risks, even the ones set to severity Low, but having the correct severity is still important in messaging and for the rest of the world to get a better picture of how serious the issue is. Getting the right severity is important.

NVD

Let me introduce yet another player in this game. The National Vulnerability Database (NVD). (And no, it’s not “national” really).

NVD hosts a database of vulnerabilities. All CVEs that are submitted to MITRE are sucked in into NVD’s database. NVD says it “performs analysis on CVEs that have been published to the CVE Dictionary“.

That last sentence is probably important.

NVD imports CVEs into their database and they in turn offer other databases to import vulnerabilities from them. One large and known user of the NVD database is this I mentioned in a recent blog post: GitHub Security Advisory Database (GHSA DB) .

GHSA DB

This GitHub thing an ambitious database that subsequently hosts a lot of vulnerabilities that people and projects reported themselves in addition to them importing information about all vulnerabilities ever published with CVE Ids.

This creates a huge database that in theory should contain just about every software vulnerability ever reported in the public. Pretty cool.

Enter reality

NVD, in their great wisdom, rescores the CVSS score for CVE Ids they import into their database! (It’s not clear how or why, but they seem to not do it for all issues).

NVD decides they know better than the project that set the severity level for the issue, enters their own answers in the CVSS calculator and eventually sets that new score on the CVEs they import.

NVD clearly thinks they need to do this and that they improve the state of the CVEs by this practice, but the end result is close to scaremongering.

Result

Because NVD sets their own severity level and they have some sort of “worst case” approach, virtually all issues that NVD sets severity for is graded worse or much worse when they do it than how we set the severity levels.

Let’s take an example: CVE-2022-42915: HTTP proxy double-free. We deemed this a medium severity. It was not made higher partly because of the very limited time-window between the two frees, making it harder to take advantage of.

What did NVD say? Severity 9.8: critical. See the same issue on GitHub.

Yes, it makes you wonder what magic insights and knowledge the person/bots on NVD possessed when they did this.

Scaremongering

The different severity levels should not matter too much but people find those inflated ones and they believe them. Users also find the discrepancies, get confused and won’t know what to believe or whom to trust. After all, NVD is trust-inducing brand. People think they know their stuff and if they say critical and the curl project says medium, what are we expected to think?

I claim that NVD overstate their severity levels and there unnecessarily scares readers and make them think issues are worse and more dangerous than they actually are.

The fact that GitHub now imports all CVE data from NVD makes these severity levels get transported, shown and believed as they are now also shown in the GHSA DB.

Look how many critical issues there are!

Not exactly GitHub’s fault

This NVD habit of re-scoring is an old existing habit and I just recently learned it. GitHub’s displaying the severity levels highlighted it for me, especially since users out there seem to trust and use this GitHub database.

I have talked to humans on the GitHub database team and I push for them to ignore or filter out the severity levels as set by NVD, if possible. But me being just a single complaining maintainer I do not expect this to have much of an effect. I would urge NVD to stop this insanity if I had any way to.

Hackerone glitches?

(Updated after first post). It turns out that some CVEs that we have filed from the curl project that uses our CNA hackerone have been submitted to MITRE without any severity level or CVSS score at all. For such issues, I of course understand why someone would put their own score on the issue because then our originally set score/severity is not passed on. Then the “blame” is instead shifted to Hackerone. I have contacted them about it.

Dispute a CVSS

NVD provides a way to dispute their rescores, but that’s just an open free-text form. I have use that form to request that NVD stop rescoring all curl issues. Although I honestly think they should rather stop all rescoring and only do that in the rare occasions where the original score or severity is obviously wrong.

I cannot dispute the severity levels at GitHub. They show the NVD levels.

The 2022 curl security audit

tldr: several hundred hours of dedicated scrutinizing of curl by a team of security experts resulted in two CVEs and a set of less serious remarks. The link to the reports is at the bottom of this article.

Thanks to an OpenSSF grant, OSTIF helped us set up a curl security audit, which the excellent Trail of Bits was selected to perform in September 2022. We are most grateful to OpenSSF for doing this for us, and I hope all users who use and rely on curl recognize this extraordinary gift. OSTIF and Trail of Bits both posted articles about this audit separately.

We previously had an audit performed on curl back in 2016 by Cure53 (sponsored by Mozilla) but I like to think that we (curl) have traveled quite far and matured a lot since those days. The fixes from the discoveries reported in that old previous audit were all merged and shipped in the 7.51.0 release, in November 2016. Now over six years ago.

Changes since previous audit

We have done a lot in the project that have improved our general security situation over the last six years. I believe we are in a much better place than the last time around. But we have also grown and developed a lot more features since then.

curl is now at150,000 lines of C code. This count is for “product code” only and excludes blank lines but includes 19% comments.

71 additional vulnerabilities have been reported and fixed since then. (42 of those even existed in the version that was audited in 2016 but were obviously not detected)

We have 30,000 additional lines of code today (+27%), and we have done over 8,000 commits since.

We have 50% more test cases (now 1550).

We have done 47 releases featuring more than 4,200 documented bugfixes and 150 changes/new features.

We have 25 times the number of CI jobs: up from 5 in 2016 to 127 today.

The OSS-Fuzz project started fuzzing curl in 2017, and it has been fuzzing curl non-stop since.

We introduced our “dynbuf” system internally in 2020 for managing growing buffers to maybe avoid common C mistakes around those.

Audit

The Trail of Bits team was assigned this as a three-part project:

  1. Create a Threat Model document
  2. Testing Analysis and Improvements
  3. Secure code Review

The project was setup to use a total of 380 man hours and most of the time two Trail of Bits engineers worked in parallel on the different tasks. The Trail of Bits team themselves eventually also voluntarily extended the program with about a week. They had no problems finding people who wanted to join in and look into curl. We can safely say that they spent a significant amount of time and effort scrutinizing curl.

The curl security team members had frequent status meetings and assisted with details and could help answer questions. We would also get updates and reports on how they progressed.

Two security vulnerabilities were confirmed

The first vulnerability they found ended up known as the CVE-2022-42915: HTTP proxy double-free issue.

The second vulnerability was found after Trail of Bits had actually ended their work and their report, while they were still running a fuzzer that triggered a separate flaw. This second vulnerability is not covered in the report but was disclosed earlier today in sync with the curl 7.87.0 release announcement: CVE-2022-43552: HTTP Proxy deny use-after-free.

Minor frictions detected

Discoveries and remarks highlighted through their work that were not consider security sensitive we could handle on the fly. Some examples include:

  • Using --ssl now outputs a warning saying it is unsafe and instead recommending --ssl-reqd to be used.
  • The Alt-svc: header parser did not deal with illegal port numbers correctly
  • The URL parser accepted “illegal” characters in the host name part.
  • Harmless memory leaks

You should of course read the full reports to learn about all the twenty something issues with all details, including feedback from the curl security team.

Actions

The curl team acted on all reported issues that we think we could act on. We disagree with the Trail of Bits team on a few issues and there are some that are “good ideas” that we should probably work on getting addressed going forward but that can’t be fixed immediately – but also don’t leave any immediate problem or danger in the code.

Conclusions

Security is not something that can be checked off as done once and for all nor can it ever be considered complete. It is a process that needs to blend in and affect everything we do when we develop software. Now and forever going forward.

This team of security professionals spent more time and effort in this security auditing and poking on curl with fuzzers than probably anyone else has ever done before. Personally, I am thrilled that they only managed to uncovered two actual security problems. I think this shows that a lot of curl code has been written the right way. The CVEs they found were not even that terrible.

Lessons

Twenty something issues were detected, and while the report includes advice from the auditors on how we should improve things going forward, they are of the kind we all already know we should do and paths we should follow. I could not really find any real lessons as in obvious things or patterns we should stop or new paradigms och styles to adapt.

I think we learned or more correctly we got these things reconfirmed:

  • we seem to be doing things mostly correct
  • we can and should do more and better fuzzing
  • adding more tests to increase coverage is good

Security is hard

To show how hard security can be, we received no less than three additional security reports to the project during the actual life-time when this audit was being done. Those additional security reports of course came from other people and identified security problems this team of experts did not find.

My comments on the reports

The term Unresolved is used for a few issues in the report and I have a minor qualm with the use of that particular word in this context for all cases. While it is correct that we in several cases did not act on the advice in the report, we saw some cases where we distinctly disagree with the recommendations and some issues that mentioned things we might work on and address in the future. They are all just marked as unresolved in the reports, but they are not all unresolved to us in the curl project.

In particular I am not overly pleased with how the issue called TOB-CURLTM-6 is labeled severity high and status unresolved as I believe this wrongly gives the impression that curl has issues with high severity left unresolved in the code.

If you want to read the specific responses for each and every reported issue from the curl project, they are stored in this separate GitHub gist.

The reports

You find the two reports linked to from the curl security page. A total of almost 100 pages in two PDF documents.

Increased CVE activity in curl?

Recently I have received curious questions from users, customers and bystanders.

Can you explain the seemingly increased CVE activity in curl over like the last year or so?

(data and stats for this post comes mostly from the curl dashboard)

Pointless but related poll I ran on Twitter

Frequency

In 2022 we have already had 14 CVEs reported so far, and we will announce the 15th when we release curl 7.85.0 at the end of August. Going into September 2022, there have been a total of 18 reported CVEs in the last 12 months.

During the whole of 2021 we had 13 CVEs reported – and already that was a large amount and the most CVEs in a single year since 2016.

There has clearly been an increased CVE issue rate in curl as of late.

Finding and fixing problems is good

While every reported security problem stings my ego and makes my soul hurt since it was yet another mistake I feel I should have found or not made in the first place, the key take away is still that it is good that they are found and reported so that they can be fixed properly. Ideally, we also learn something from each such report and make it less likely that we ever introduce that (kind of) problem again.

That might be the absolutely hardest task around each CVE. To figure out what went wrong, detect a pattern and then lock it down. It’s almost amusing how all bugs look a like a one-off mistake with nothing to learn from…

Ironically, the only way we know people are looking really hard at curl security is when someone reports security problems. If there was no reports at all, can we really be sure that people are scrutinizing the code the way we want?

Who is counting?

Counting the amount of CVEs and giving that a meaning, or even comparing the number between projects, is futile and a bad idea. The number does not say much and comparing two projects this way is impossible and will not tell you anything. Every project is unique.

Just counting CVEs disregards the fact that they all have severity levels. Are they a dozen “severity low” or are they a handful of critical ones? Even if you take severity into account, they might have gotten entirely different severity for virtually the same error, or vice versa.

Further, some projects attract more scrutinize and investigation because they are considered more worthwhile targets. Or perhaps they just pay researchers more for their findings. Projects that don’t get the same amount of focus will naturally get fewer security problems reported for them, which does not necessarily mean that they have fewer problems.

Incentives

The curl bug-bounty really works as an incentive as we do reward security researchers a sizable amount of money for every confirmed security flaw they report. Recently, we have handed over 2,400 USD for each Medium-severity security problem.

In addition to that, finding and getting credited for finding a flaw in a widespread product such as curl is also seen as an extra “feather in the hat” for a lot of security-minded bug hunters.

Over the last year alone, we have paid about 30,000 USD in bug bounty rewards to security researchers, summing up a total of over 40,000 USD since the program started.

Accumulated bug-bounty rewards for curl CVEs over time

Who’s looking?

We have been fortunate to have received the attention of some very skilled, patient and knowledgeable individuals. To find security problems in modern curl, the best bug hunters both know the ins and outs of the curl project source code itself while at the same time they know the protocols curl speaks to a deep level. That’s how you can find mismatches that shouldn’t be there and that could lead to security problems.

The 15 reports we have received in 2022 so far (including the pending one) have been reported by just four individuals. Two of them did one each, the other two did 87% of the reporting. Harry Sintonen alone reported 60% of them.

Hardly any curl security problems are found with source code analyzers or even fuzzers these days. Those low hanging fruits have already been picked.

We care, we act

Our average response time for security reports sent to the curl project has been less than two hours during 2022, for the 56 reports received so far.

We give each report a thorough investigation and we spend a serious amount of time and effort to really make sure we understand all the angles of the claim, that it really is a security problem, that we produce the best possible fix for it and not the least: that we produce a mighty fine advisory for the issue that explains it to the world with detail and accuracy.

Less than 8% of the submissions we get are eventually confirmed actual security problems.

As a general rule all security problems we confirm, are fixed in the pending next release. The only acceptable exception would be if the report arrives just a day or two from the next release date.

We work hard to make curl more secure and to use more ways of writing secure code and tools to detect mistakes than ever, to minimize the risk for introducing security flaws.

Judge a project on how it acts

Since you cannot judge a project by the number of CVEs that come out of it, what you should instead pay more focus on when you assess the health of a software project is how it acts when security problems are reported.

Most problems are still very old

In the curl project we make a habit of tracing back and figuring out exactly in which release each and every security problem was once introduced. Often the exact commit. (Usually that commit was authored by me, but let’s not linger on that fact now.)

One fun thing this allows us to do, is to see how long time the offending code has been present in releases. The period during which all the eyeballs that presumably glanced over the code missed the fact that there was a security bug in there.

On average, curl security problems have been present an extended period of time before there are found and reported.

On average for all CVEs: 2,867 days

The average time bugs were present for CVEs reported during the last 12 months is a whopping 3,245 days. This is very close to nine years.

How many people read the code in those nine years?

The people who find security bugs do not know nor do they care about the age of a source code line when they dig up the problems. The fact that the bugs are usually old could be an indication that we introduced more security bugs in the past than we do now.

Finding vs introducing

Enough about finding issues for a moment. Let’s talk about introducing security problems. I already mentioned we track down exactly when security flaws were introduced. We know.

All CVEs in curl, green are found, red are introduced

With this, we can look at the trend and see if we are improving over time.

The project has existed for almost 25 years, which means that if we introduce problems spread out evenly over time, we would have added 4% of them every year. About 5 CVE problems are introduced per year on average. So, being above or below 5 introduced makes us above or below an average year.

Bugs are probably not introduced as a product of time, but more as a product of number of lines of code or perhaps as a ratio of the commits.

75% of the CVE errors were introduced before March 2014 – and yet the code base “only” had 102,000 lines of code at the end of that period. At that time, we had done 61% of the git commits (17684). One CVE per 189 commits.

15% of the security problems were introduced during the last five years and now we are at 148,000 lines of code. Finding needles in a growing haystack. With more code than ever before, we introduce bugs at a lower-than average rate. One CVE per 352 commits. This probably is also related to the ever-growing number of tests and CI jobs that help us detect more problems before we merge them.

Number of lines of code. Includes comments, excludes blank lines

The commit rate has remained between 1,100 and 1,700 commits per year since 2007 with ups and downs but no obvious growing or declining trend.

Number of commits per year

Harry Sintonen

Harry reported a large amount of the recent curl CVEs as mentioned above, and is credited for a total of 17 reported curl CVEs – no one else is even close to that track record. I figured it is apt to ask him about the curl situation of today, to make sure this is not just me hallucinating things up. What do you think is the reason for the increased number of CVEs in curl in 2022?

Harry replied:

  1. The news of good bounties being paid likely has been attracting more researchers to look at curl.
  2. I did put considerable effort in doing code reviews. I’m sure some other people put in a lot of effort too.
  3. When a certain before unseen type of vulnerability is found, it will attract people to look at the code as a whole for similar or surrounding issues. This quite often results in new, similar CVEs bundling up. This is kind of a clustering effect.

It’s worth mentioning, I think, that even though there has been more CVEs found recently, they have been found (well most of them at least) as part of the bug bounty program and get handled in a controlled manner. While it would be even better to have no vulnerabilities at all, finding and handling them in controlled manner is the 2nd best option.

This might be escaping a random observer who just looks at the recent amount of CVEs and goes “oh this is really bad – something must have gone wrong!”. Some of the issues are rather old and were only found now due to the increased attention.

Conclusion

We introduce CVE problems at a slower rate now than we did in the past even though we have gotten problems reported at a higher than usual frequency recently.

The way I see it, we are good. I suppose the future will tell if I am right.