Tag Archives: Security

curl 7.57.0 happiness

The never-ending series of curl releases continued today when we released version 7.57.0. The 171th release since the beginning, and the release that follows 37 days after 7.56.1. Remember that 7.56.1 was an extra release that fixed a few most annoying regressions.

We bump the minor number to 57 and clear the patch number in this release due to the changes introduced. None of them very ground breaking, but fun and useful and detailed below.

41 contributors helped fix 69 bugs in these 37 days since the previous release, using 115 separate commits. 23 of those contributors were new, making the total list of contributors now contain 1649 individuals! 25 individuals authored commits since the previous release, making the total number of authors 540 persons.

The curl web site currently sends out 8GB data per hour to over 2 million HTTP requests per day.

Support RFC7616 – HTTP Digest

This allows HTTP Digest authentication to use the must better SHA256 algorithm instead of the old, and deemed unsuitable, MD5. This should be a transparent improvement so curl should just be able to use this without any particular new option has to be set, but the server-side support for this version seems to still be a bit lacking.

(Side-note: I’m credited in RFC 7616 for having contributed my thoughts!)

Sharing the connection cache

In this modern age with multi core processors and applications using multi-threaded designs, we of course want libcurl to enable applications to be able to get the best performance out of libcurl.

libcurl is already thread-safe so you can run parallel transfers multi-threaded perfectly fine if you want to, but it doesn’t allow the application to share handles between threads. Before this specific change, this limitation has forced multi-threaded applications to be satisfied with letting libcurl has a separate “connection cache” in each thread.

The connection cache, sometimes also referred to as the connection pool, is where libcurl keeps live connections that were previously used for a transfer and still haven’t been closed, so that a subsequent request might be able to re-use one of them. Getting a re-used connection for a request is much faster than having to create a new one. Having one connection cache per thread, is ineffective.

Starting now, libcurl’s “share concept” allows an application to specify a single connection cache to be used cross-thread and cross-handles, so that connection re-use will be much improved when libcurl is used multi-threaded. This will significantly benefit the most demanding libcurl applications, but it will also allow more flexible designs as now the connection pool can be designed to survive individual handles in a way that wasn’t previously possible.

Brotli compression

The popular browsers have supported brotli compression method for a while and it has already become widely supported by servers.

Now, curl supports it too and the command line tool’s –compressed option will ask for brotli as well as gzip, if your build supports it. Similarly, libcurl supports it with its CURLOPT_ACCEPT_ENCODING option. The server can then opt to respond using either compression format, depending on what it knows.

According to CertSimple, who ran tests on the top-1000 sites of the Internet, brotli gets contents 14-21% smaller than gzip.

As with other compression algorithms, libcurl uses a 3rd party library for brotli compression and you may find that Linux distributions and others are a bit behind in shipping packages for a brotli decompression library. Please join in and help this happen. At the moment of this writing, the Debian package is only available in experimental.

(Readers may remember my libbrotli project, but that effort isn’t really needed anymore since the brotli project itself builds a library these days.)

Three security issues

In spite of our hard work and best efforts, security issues keep getting reported and we fix them accordingly. This release has three new ones and I’ll describe them below. None of them are alarmingly serious and they will probably not hurt anyone badly.

Two things can be said about the security issues this time:

1. You’ll note that we’ve changed naming convention for the advisory URLs, so that they now have a random component. This is to reduce potential information leaks based on the name when we pass these around before releases.

2. Two of the flaws happen only on 32 bit systems, which reveals a weakness in our testing. Most of our CI tests, torture tests and fuzzing are made on 64 bit architectures. We have no immediate and good fix for this, but this is something we must work harder on.

1. NTLM buffer overflow via integer overflow

(CVE-2017-8816) Limited to 32 bit systems, this is a flaw where curl takes the combined length of the user name and password, doubles it, and allocates a memory area that big. If that doubling ends up larger than 4GB, an integer overflow makes a very small buffer be allocated instead and then curl will overwrite that.

Yes, having user name plus password be longer than two gigabytes is rather excessive and I hope very few applications would allow this.

2. FTP wildcard out of bounds read

(CVE-2017-8817) curl’s wildcard functionality for FTP transfers is not a not very widely used feature, but it was discovered that the default pattern matching function could erroneously read beyond the URL buffer if the match pattern ends with an open bracket ‘[‘ !

This problem was detected by the OSS-Fuzz project! This flaw  has existed in the code since this feature was added, over seven years ago.

3. SSL out of buffer access

(CVE-2017-8818) In July this year we introduced multissl support in libcurl. This allows an application to select which TLS backend libcurl should use, if it was built to support more than one. It was a fairly large overhaul to the TLS code in curl and unfortunately it also brought this bug.

Also, only happening on 32 bit systems, libcurl would allocate a buffer that was 4 bytes too small for the TLS backend’s data which would lead to the TLS library accessing and using data outside of the heap allocated buffer.

Next?

The next release will ship no later than January 24th 2018. I think that one will as well add changes and warrant the minor number to bump. We have fun pending stuff such as: a new SSH backend, modifiable happy eyeballs timeout and more. Get involved and help us do even more good!

The life of a curl security bug

The report

Usually, security problems in the curl project come to us out of the blue. Someone has found a bug they suspect may have a security impact and they tell us about it on the curl-security@haxx.se email address. Mails sent to this address reach a private mailing list with the curl security team members as the only subscribers.

An important first step is that we respond to the sender, acknowledging the report. Often we also include a few follow-up questions at once. It is important to us to keep the original reporter in the loop and included in all subsequent discussions about this issue – unless they prefer to opt out.

If we find the issue ourselves, we act pretty much the same way.

In the most obvious and well-reported cases there are no room for doubts or hesitation about what the bugs and the impact of them are, but very often the reports lead to discussions.

The assessment

Is it a bug in the first place, is it perhaps even documented or just plain bad use?

If it is a bug, is this a security problem that can be abused or somehow put users in some sort of risk?

Most issues we get reported as security issues are also in the end treated as such, as we tend to err on the safe side.

The time plan

Unless the issue is critical, we prefer to schedule a fix and announcement of the issue in association with the pending next release, and as we do releases every 8 weeks like clockwork, that’s never very far away.

We communicate the suggested schedule with the reporter to make sure we agree. If a sooner release is preferred, we work out a schedule for an extra release. In the past we’ve did occasional faster security releases also when the issue already had been made public, so we wanted to shorten the time window during which users could be harmed by the problem.

We really really do not want a problem to persist longer than until the next release.

The fix

The curl security team and the reporter work on fixing the issue. Ideally in part by the reporter making sure that they can’t reproduce it anymore and we add a test case or two.

We keep the fix undisclosed for the time being. It is not committed to the public git repository but kept in a private branch. We usually put it on a private URL so that we can link to it when we ask for a CVE, see below.

All security issues should make us ask ourselves – what did we do wrong that made us not discover this sooner? And ideally we should introduce processes, tests and checks to make sure we detect other similar mistakes now and in the future.

Typically we only generate a single patch from the git master master and offer that as the final solution. In the curl project we don’t maintain multiple branches. Distros and vendors who ship older or even multiple curl versions backport the patch to their systems by themselves. Sometimes we get backported patches back to offer users as well, but those are exceptions to the rule.

The advisory

In parallel to working on the fix, we write up a “security advisory” about the problem. It is a detailed description about the problem, what impact it may have if triggered or abused and if we know of any exploits of it.

What conditions need to be met for the bug to trigger. What’s the version range that is affected, what’s the remedies that can be done as a work-around if the patch is not applied etc.

We work out the advisory in cooperation with the reporter so that we get the description and the credits right.

The advisory also always contains a time line that clearly describes when we got to know about the problem etc.

The CVE

Once we have an advisory and a patch, none of which needs to be their final versions, we can proceed and ask for a CVE. A CVE is a unique “ID” that is issued for security problems to make them easy to reference. CVE stands for Common Vulnerabilities and Exposures.

Depending on where in the release cycle we are, we might have to hold off at this point. For all bugs that aren’t proprietary-operating-system specific, we pre-notify and ask for a CVE on the distros@openwall mailing list. This mailing list prohibits an embargo longer than 14 days, so we cannot ask for a CVE from them longer than 2 weeks in advance before our release.

The idea here is that the embargo time gives the distributions time and opportunity to prepare updates of their packages so they can be pretty much in sync with our release and reduce the time window their users are at risk. Of course, not all operating system vendors manage to actually ship a curl update on two weeks notice, and at least one major commercial vendor regularly informs me that this is a too short time frame for them.

For flaws that don’t affect the free operating systems at all, we ask MITRE directly for CVEs.

The last 48 hours

When there is roughly 48 hours left until the coming release and security announcement, we merge the private security fix branch into master and push it. That immediately makes the fix public and those who are alert can then take advantage of this knowledge – potentially for malicious purposes. The security advisory itself is however not made public until release day.

We use these 48 hours to get the fix tested on more systems to verify that it is not doing any major breakage. The weakest part of our security procedure is that the fix has been worked out in secret so it has not had the chance to get widely built and tested, so that is performed now.

The release

We upload the new release. We send out the release announcement email, update the web site and make the advisory for the issue public. We send out the security advisory alert on the proper email lists.

Bug Bounty?

Unfortunately we don’t have any bug bounties on our own in the curl project. We simply have no money for that. We actually don’t have money at all for anything.

Hackerone offers bounties for curl related issues. If you have reported a critical issue you can request one from them after it has been fixed in curl.

 

The backdoor threat

— “Have you ever detected anyone trying to add a backdoor to curl?”

— “Have you ever been pressured by an organization or a person to add suspicious code to curl that you wouldn’t otherwise accept?”

— “If a crime syndicate would kidnap your family to force you to comply, what backdoor would you be be able to insert into curl that is the least likely to get detected?” (The less grim version of this question would instead offer huge amounts of money.)

I’ve been asked these questions and variations of them when I’ve stood up in front of audiences around the world and talked about curl and how it is one of the most widely used software components in the world, counting way over three billion instances.

Back door (noun)
— a feature or defect of a computer system that allows surreptitious unauthorized access to data.

So how is it?

No. I’ve never seen a deliberate attempt to add a flaw, a vulnerability or a backdoor into curl. I’ve seen bad patches and I’ve seen patches that brought bugs that years later were reported as security problems, but I did not spot any deliberate attempt to do bad in any of them. But if done with skills, certainly I wouldn’t have noticed them being deliberate?

If I had cooperated in adding a backdoor or been threatened to, then I wouldn’t tell you anyway and I’d thus say no to questions about it.

How to be sure

There is only one way to be sure: review the code you download and intend to use. Or get it from a trusted source that did the review for you.

If you have a version you trust, you really only have to review the changes done since then.

Possibly there’s some degree of safety in numbers, and as thousands of applications and systems use curl and libcurl and at least some of them do reviews and extensive testing, one of those could discover mischievous activities if there are any and report them publicly.

Infected machines or owned users

The servers that host the curl releases could be targeted by attackers and the tarballs for download could be replaced by something that carries evil code. There’s no such thing as a fail-safe machine, especially not if someone really wants to and tries to target us. The safeguard there is the GPG signature with which I sign all official releases. No malicious user can (re-)produce them. They have to be made by me (since I package the curl releases). That comes back to trusting me again. There’s of course no safe-guard against me being forced to signed evil code with a knife to my throat…

If one of the curl project members with git push rights would get her account hacked and her SSH key password brute-forced, a very skilled hacker could possibly sneak in something, short-term. Although my hopes are that as we review and comment each others’ code to a very high degree, that would be really hard. And the hacked person herself would most likely react.

Downloading from somewhere

I think the highest risk scenario is when users download pre-built curl or libcurl binaries from various places on the internet that isn’t the official curl web site. How can you know for sure what you’re getting then, as you couldn’t review the code or changes done. You just put your trust in a remote person or organization to do what’s right for you.

Trusting other organizations can be totally fine, as when you download using Linux distro package management systems etc as then you can expect a certain level of checks and vouching have happened and there will be digital signatures and more involved to minimize the risk of external malicious interference.

Pledging there’s no backdoor

Some people argue that projects could or should pledge for every release that there’s no deliberate backdoor planted so that if the day comes in the future when a three-letter secret organization forces us to insert a backdoor, the lack of such a pledge for the subsequent release would function as an alarm signal to people that something is wrong.

That takes us back to trusting a single person again. A truly evil adversary can of course force such a pledge to be uttered no matter what, even if that then probably is more mafia level evilness and not mere three-letter organization shadiness anymore.

I would be a bit stressed out to have to do that pledge every single release as if I ever forgot or messed it up, it should lead to a lot of people getting up in arms and how would such a mistake be fixed? It’s little too irrevocable for me. And we do quite frequent releases so the risk for mistakes is not insignificant.

Also, if I would pledge that, is that then a promise regarding all my code only, or is that meant to be a pledge for the entire code base as done by all committers? It doesn’t scale very well…

Additionally, I’m a Swede living in Sweden. The American organizations cannot legally force me to backdoor anything, and the Swedish versions of those secret organizations don’t have the legal rights to do so either (caveat: I’m not a lawyer). So, the real threat is not by legal means.

What backdoor would be likely?

It would be very hard to add code, unnoticed, that sends off data to somewhere else. Too much code that would be too obvious.

A backdoor similarly couldn’t really be made to split off data from the transfer pipe and store it locally for other systems to read, as that too is probably too much code that is too different than the current code and would be detected instantly.

No, I’m convinced the most likely backdoor code in curl is a deliberate but hard-to-detect security vulnerability that let’s the attacker exploit the program using libcurl/curl by some sort of specific usage pattern. So when triggered it can trick the program to send off memory contents or perhaps overwrite the local stack or the heap. Quite possibly only one step out of several steps necessary for a successful attack, much like how a single-byte-overwrite can lead to root access.

Any past security problems on purpose?

We’ve had almost 70 security vulnerabilities reported through the project’s almost twenty years of existence. Since most of them were triggered by mistakes in code I wrote myself, I can be certain that none of those problems were introduced on purpose. I can’t completely rule out that someone else’s patch modified curl along the way and then by extension maybe made a vulnerability worse or easier to trigger, could have been made on purpose. None of the security problems that were introduced by others have shown any sign of “deliberateness”. (Or were written cleverly enough to not make me see that!)

Maybe backdoors have been planted that we just haven’t discovered yet?

Discussion

Follow-up discussion/comments on hacker news.

keep finding old security problems

I decided to look closer at security problems and the age of the reported issues in the curl project.

One theory I had when I started to collect this data, was that we actually get security problems reported earlier and earlier over time. That bugs would be around in public release for shorter periods of time nowadays than what they did in the past.

My thinking would go like this: Logically, bugs that have been around for a long time have had a long time to get caught. The more eyes we’ve had on the code, the fewer old bugs should be left and going forward we should more often catch more recently added bugs.

The time from a bug’s introduction into the code until the day we get a security report about it, should logically decrease over time.

What if it doesn’t?

First, let’s take a look at the data at hand. In the curl project we have so far reported in total 68 security problems over the project’s life time. The first 4 were not recorded correctly so I’ll discard them from my data here, leaving 64 issues to check out.

The graph below shows the time distribution. The all time leader so far is the issue reported to us on March 10 this year (2017), which was present in the code since the version 6.5 release done on March 13 2000. 6,206 days, just three days away from 17 whole years.

There are no less than twelve additional issues that lingered from more than 5,000 days until reported. Only 20 (31%) of the reported issues had been public for less than 1,000 days. The fastest report was reported on the release day: 0 days.

The median time from release to report is a whopping 2541 days.

When we receive a report about a security problem, we want the issue fixed, responsibly announced to the world and ship a new release where the problem is gone. The median time to go through this procedure is 26.5 days, and the distribution looks like this:

What stands out here is the TLS session resumption bypass, which happened because we struggled with understanding it and how to address it properly. Otherwise the numbers look all reasonable to me as we typically do releases at least once every 8 weeks. We rarely ship a release with a known security issue outstanding.

Why are very old issues still found?

I think partly because the tools are gradually improving that aid people these days to find things much better, things that simply wasn’t found very often before. With new tools we can find problems that have been around for a long time.

Every year, the age of the oldest parts of the code get one year older. So the older the project gets, the older bugs can be found, while in the early days there was a smaller share of the code that was really old (if any at all).

What if we instead count age as a percentage of the project’s life time? Using this formula, a bug found at day 100 that was added at day 50 would be 50% but if it was added at day 80 it would be 20%. Maybe this would show a graph where the bars are shrinking over time?

But no. In fact it shows 17 (27%) of them having been present during 80% or more of the project’s life time! The median issue had been in there during 49% of the project’s life time!

It does however make another issue the worst offender, as one of the issues had been around during 91% of the project’s life time.

This counts on March 20 1998 being the birth day. Of course we got no reports the first few years since we basically had no users then!

Specific or generic?

Is this pattern something that is specific for the curl project or can we find it in other projects too? I don’t know. I have not seen this kind of data being presented by others and I don’t have the same insight on such details of projects with an enough amount of issues to be interesting.

What can we do to make the bars shrink?

Well, if there are old bugs left to find they won’t shrink, because for every such old security issue that’s still left there will be a tall bar. Hopefully though, by doing more tests, using more tools regularly (fuzzers, analyzers etc) and with more eyeballs on the code, we should iron out our security issues over time. Logically that should lead to a project where newly added security problems are detected sooner rather than later. We just don’t seem to be at that point yet…

Caveat

One fact that skews the numbers is that we are much more likely to record issues as security related these days. A decade ago when we got a report about a segfault or something we would often just consider it bad code and fix it, and neither us maintainers nor the reporter would think much about the potential security impact.

These days we’re at the other end of the spectrum where we people are much faster to jumping to a security issue suspicion or conclusion. Today people report bugs as security issues to a much higher degree than they did in the past. This is basically a good thing though, even if it makes it harder to draw conclusions over time.

Data sources

When you want to repeat the above graphs and verify my numbers:

  • vuln.pm – from the curl web site repository holds security issue meta data
  • releaselog – on the curl web site offers release meta data, even as a CSV download on the bottom of the page
  • report2release.pl – the perl script I used to calculate the report until release periods.

curl bug bounty

The curl project is a project driven by volunteers with no financing at all except for a few sponsors who pay for the server hosting and for contributors to work on features and bug fixes on work hours. curl and libcurl are used widely by companies and commercial software so a fair amount of work is done by people during paid work hours.

This said, we don’t have any money in the project. Nada. Zilch. We can’t pay bug bounties or hire people to do specific things for us. We can only ask people or companies to volunteer things or services for us.

This is not a complaint – far from it. It works really well and we have a good stream of contributions, bugs reports and more. We are fortunate enough to make widely used software which gives our project a certain impact in the world.

Bug bounty!

Hacker One coordinates a bug bounty program for flaws that affects “the Internet”, and based on previously paid out bounties, serious flaws in libcurl match that description and can be deemed worthy of bounties. For example, 3000 USD was paid for libcurl: URL request injection (the curl advisory for that flaw) and 1000 USD was paid for libcurl duphandle read out of bounds (the corresponding curl advisory).

I think more flaws in libcurl could’ve met the criteria, but I suspect more people than me haven’t been aware of this possibility for bounties.

I was glad to find out that this bounty program pays out money for libcurl issues and I hope it will motivate people to take an extra look into the inner workings of libcurl and help us improve.

What qualifies?

The bounty program is run and administered completely out of control or insight from the curl project itself and I must underscore that while libcurl issues can qualify, their emphasis is on fixing vulnerabilities in Internet software that have a potentially big impact.

To qualify for this bounty, vulnerabilities must meet the following criteria:

  • Be implementation agnostic: the vulnerability is present in implementations from multiple vendors or a vendor with dominant market share. Do not send vulnerabilities that only impact a single website, product, or project.
  • Be open source: finding manifests itself in at least one popular open source project.

In addition, vulnerabilities should meet most of the following criteria:

  • Be widespread: vulnerability manifests itself across a wide range of products, or impacts a large number of end users.
  • Have critical impact: vulnerability has extreme negative consequences for the general public.
  • Be novel: vulnerability is new or unusual in an interesting way.

If your libcurl security flaw matches this, go ahead and submit your request for a bounty. If you’re at a company using libcurl at scale, consider joining that program as a bounty sponsor!

Yes C is unsafe, but…

I posted curl is C a few days ago and it raced on hacker news, reddit and elsewhere and got well over a thousand comments in those forums alone. The blog post has been read more than 130,000 times so far.

Addendum a few days later

Many commenters of my curl is C post struck down on my claim that most of our security flaws aren’t due to curl being written in C. It turned out into some sort of CVE counting game in some of the threads.

I think that’s missing the point I was trying to make. Even if 75% of them happened due to us using C, that fact alone would still not be a strong enough reason for me to reconsider our language of choice (at this point in time). We use C for a whole range of reasons as I tried to lay out there in spite of the security challenges the language brings. We know C has tricky corners and we know we are likely to do more mistakes going forward.

curl is currently one of the most distributed and most widely used software components in the universe, be it open or proprietary and there are easily way over three billion instances of it running in appliances, servers, computers and devices across the globe. Right now. In your phone. In your car. In your TV. In your computer. Etc.

If we then have had 40, 50 or even 60 security problems because of us using C, through-out our 19 years of history, it really isn’t a whole lot given the scale and time we’re talking about here.

Using another language would’ve caused at least some problems due to that language, plus I feel a need to underscore the fact that none of the memory safe languages anyone would suggest we should switch to have been around for 19 years. A portion of our security bugs were even created in our project before those alternatives you would suggest were available! Let alone as stable and functional alternatives.

This is of course no guarantee that there isn’t still more ugly things to discover or that we won’t mess up royally in the future, but who will throw the first stone when it comes to that? We will continue to work hard on minimizing risks, detecting problems early by ourselves and work closely together with everyone who reports suspected problems to us.

Number of problems as a measurement

The fact that we have 62 CVEs to date (and more will follow surely) is rather a proof that we work hard on fixing bugs, that we have an open process that deals with the problems in the most transparent way we can think of and that people are on their toes looking for these problems. You should not rate a project in any way purely based on the number of CVEs – you really need to investigate what lies behind the numbers if you want to understand and judge the situation.

Future

Let me clarify this too: I can very well imagine a future where we transition to another language or attempt various others things to enhance the project further – security wise and more. I’m not really ruling anything out as I usually only have very vague ideas of what the future might look like. I just don’t expect it to be happening within the next few years.

These “you should switch language” remarks are strangely enough from the backseat drivers of the Internet. Those who can tell us with confidence how to run our project but who don’t actually show us any code.

Languages

What perhaps made me most sad in the aftermath of said previous post, is everyone who failed to hold more than one thought at a time in their heads. In my post I wrote 800 words on some of the reasoning behind us sticking to the language C in the curl project. I specifically did not say that I dislike certain other languages or that any of those alternative languages are bad or should be avoided. Please friends, I wrote about why curl uses C. There are many fine languages out there and you should all use them as much as you possibly can, and I will too – but not in the curl project (at the moment). So no, I don’t hate language XXXX. I didn’t say so, and I didn’t imply it either. Don’t put that label on me, thanks.

6 hours of bliss

I sent out the release announcement for curl 7.52.0 exactly 07:59 in the morning of December 21, 2016. A Wednesday. We typically  release curl on Wednesdays out of old habit. It is a good release day.

curl 7.52.0 was just as any other release. Perhaps with a slightly larger set of new features than what’s typical for us. We introduce TLS 1.3 support, we now provide HTTPS-proxy support and the command line tool has this option called –fail-early that I think users will start to appreciate once they start to discover it. We also  announced three fixed security vulnerabilities. And some other good things.

I pushed the code to git, signed and uploaded the tarballs, I updated the info on the web site and I sent off that release announcement email and I felt good. Release-time good. That short feeling of relief and starting over on a new slate that I often experience these release days. Release days make me happy.

Any bets?

It is not unusual for someone to find a bug really fast after a release has shipped. As I was feeling good, I had to joke in the #curl IRC channel (42 minutes after that email):

08:41 <bagder> any bets on when the first bug report on the new release shows up? =)

Hours passed and maybe, just maybe there was not going to be any quick bugs filed on this release?

But of course. I wouldn’t write this blog post if it all had been nice and dandy. At 14:03, I got the email. 6 hours and 4 minutes since I wrote the 7.52.0 announcement email.

The email was addressed to the curl project security email list and included a very short patch and explanation how the existing code is wrong and needs “this fix” to work correctly. And it was entirely correct!

Now I didn’t feel that sense of happiness anymore. For some reason it was now completely gone and instead I felt something that involved sensations like rage, embarrassment and general tiredness. How the [beep] could this slip through like this?

I’ve done releases in the past that were broken to various extents but this is a sort of a new record and an unprecedented event. Enough time had passed that I couldn’t just yank the package from the download page either. I had to take it through the correct procedures.

What happened?

As part of a general code cleanup during this last development round, I changed all the internals to use a proper internal API to get random data and if libcurl is built with a TLS library it uses its provided API to get secure and safe random data. As a move to improve our use of random internally. We use this internal API for getting the nonce in authentication mechanisms such as Digest and NTLM and also for generating the boundary string in HTTP multipart formposts and more. (It is not used for any TLS or SSH level protocol stuff though.)

I did the largest part of the random overhaul of this in commit f682156a4f, just a little over a month ago.

Of course I made sure that all test cases kept working and there were no valgrind reports or anything, the code didn’t cause any compiler warnings. It did not generate any reports in the many clang-analyzer or Coverity static code analyzer runs we’ve done since. We run clang-analyzer daily and Coverity perhaps weekly.

But there’s a valgrind report just here!

Kamil Dudka, who sent the 14:03 email, got a valgrind error and that’s what set him off – but how come he got that and I didn’t?

The explanation consists of the following two conditions that together worked to hide the problem for us quite successfully:

  1. I (and I suppose several of the other curl hackers) usually build curl and libcurl “debug enabled”. This allows me to run more tests, do more diagnostics and debug it easier when I run into problems. It also provides a system with “fake random” so that we can actually verify that functions that otherwise use real random values generate the correct output when given a known random value… and yeah, this debug system prevented valgrind from detecting any problem!
  2. In the curl test suite we once had a problem with valgrind generating reports on third party libraries etc which then ended up as false positives. We then introduced a “valgrind report parser” that would detect if the report concerns curl or something else. It turns out this parser doesn’t detect the errors if curl is compiled without the cc’s -g command line option. And of course… curl and libcurl both build without -g by default!

The patch?

The vulnerable function basically uses this simple prototype. It is meant to get an “int” worth of random value stored in the buffer ‘rnd’ points to. That’s 4 bytes.

randit(struct Curl_easy *data, unsigned int *rnd)

But due to circumstances I can’t explain on anything other than my sloppy programming, I managed to write the function store random value in the actual pointer instead of the buffer it points to. So when the function returns, there’s nothing stored in the buffer. No 4 bytes of random. Just the uninitialized value of whatever happened to be there, on the stack.

The patch that fixes this problem looks like this (with some names shortened to simplify but keep the idea):

- res = random(data, (char *)&rnd, sizeof(rnd));
+ res = random(data, (char *)rnd, sizeof(*rnd));

So yeah. I introduced this security flaw in 7.52.0. We had it fixed in 7.52.1, released roughly 48 hours later.

(I really do not need comments on what other languages that wouldn’t have allowed this mistake or otherwise would’ve brought us world peace a long time ago.)

Make it not happen again

The primary way to make this same mistake not happen again easily, is that I’m removing the valgrind report parsing function from the test suite and we will now instead assume that valgrind reports will be legitimate and if not, work on suppressing the false positives in a better way.

References

This flaw is officially known as CVE-2016-9594

The real commit that fixed this problem is here, or as stand-alone patch.

The full security advisory for this flaw is here: https://curl.haxx.se/docs/adv_20161223.html

Facepalm photo by Alex E. Proimos.

xkcd: 221

curl security audit

“the overall impression of the state of security and robustness
of the cURL library was positive.”

I asked for, and we were granted a security audit of curl from the Mozilla Secure Open Source program a while ago. This was done by Mozilla getting a 3rd party company involved to do the job and footing the bill for it. The auditing company is called Cure53.

good_curl_logoI applied for the security audit because I feel that we’ve had some security related issues lately and I’ve had the feeling that we might be missing something so it would be really good to get some experts’ eyes on the code. Also, as curl is one of the most used software components in the world a serious problem in curl could have a serious impact on tools, devices and applications everywhere. We don’t want that to happen.

Scans and tests and all

We run static analyzers on the code frequently with a zero warnings tolerance. The daily clang-analyzer scan hasn’t found a problem in a long time and the Coverity once-every-few-weeks occasionally finds something suspicious but we always fix those immediately.

We have  thousands of tests and unit tests that we run non-stop on the code on multiple platforms running multiple build combinations. We also use valgrind when running tests to verify memory use and check for potential memory leaks.

Secrecy

The audit itself. The report and the work on fixing the issues were all done on closed mailing lists without revealing to the world what was really going on. All as our fine security process describes.

There are several downsides with fixing things secretly. One of the primary ones is that we get much fewer eyes on the fixes and there aren’t that many people involved when discussing solutions or approaches to the issues at hand. Another is that our test infrastructure is made for and runs only public code so the code can’t really be fully tested until it is merged into the public git repository.

The report

We got the report on September 23, 2016 and it certainly gave us a lot of work.

The audit report has now been made public and is a very interesting work if you’re into security, C code and curl hacking. I find the report very clear, well written and it spells out each problem very accurately and even shows proof of concept code snippets and exploit examples to drive the points home.

Quoted from the report intro:

As for the approach, the test was rooted in the public availability of the source code belonging to the cURL software and the investigation involved five testers of the Cure53 team. The tool was tested over the course of twenty days in August and September of 2016 and main efforts were focused on examining cURL 7.50.1. and later versions of cURL. It has to be noted that rather than employ fuzzing or similar approaches to validate the robustness of the build of the application and library, the latter goal was pursued through a classic source code audit. Sources covering authentication, various protocols, and, partly, SSL/TLS, were analyzed in considerable detail. A rationale behind this type of scoping pointed to these parts of the cURL tool that were most likely to be prone and exposed to real-life attack scenarios. Rounding up the methodology of the classic code audit, Cure53 benefited from certain tools, which included ASAN targeted with detecting memory errors, as well as Helgrind, which was tasked with pinpointing synchronization errors with the threading model.

They identified no less than twenty-three (23) potential problems in the code, out of which nine were deemed security vulnerabilities. But I’d also like to emphasize that they did also actually say this:

At the same time, the overall impression of the state of security and robustness of the cURL library was positive.

Resolving problems

In the curl security team we decided to downgrade one of the 9 vulnerabilities to a “plain bug” since the required attack scenario was very complicated and the risk deemed small, and two of the issues we squashed into treating them as a single one. That left us with 7 security vulnerabilities. Whoa, that’s a lot. The largest amount we’ve ever fixed in a single release before was 4.

I consider handling security issues in the project to be one of my most important tasks; pretty much all other jobs are down-prioritized in comparison. So with a large queue of security work, a lot of bug fixing and work on features basically had to halt.

You can get a fairly detailed description of our work on fixing the issues in the fix and validation log. The report, the log and the advisories we’ve already posted should cover enough details about these problems and associated fixes that I don’t feel a need to write about them much further.

More problems

Just because we got our hands full with an audit report doesn’t mean that the world stops, right? While working on the issues one by one to have them fixed we also ended up getting an additional 4 security issues to add to the set, by three independent individuals.

All these issues gave me a really busy period and it felt great when we finally shipped 7.51.0 and announced all those eleven fixes to the world and I could get a short period of relief until the next tsunami hits.

a single byte write opened a root execution exploit

Thursday, September 22nd 2016. An email popped up in my inbox.

Subject: ares_create_query OOB write

As one of the maintainers of the c-ares project I’m receiving mails for suspected security problems in c-ares and this was such a one. In this case, the email with said subject came from an individual who had reported a ChromeOS exploit to Google.

It turned out that this particular c-ares flaw was one important step in a sequence of necessary procedures that when followed could let the user execute code on ChromeOS from JavaScript – as the root user. I suspect that is pretty much the worst possible exploit of ChromeOS that can be done. I presume the reporter will get a fair amount of bug bounty reward for this. (Update: he got 100,000 USD for it.)

The setup and explanation on how this was accomplished is very complicated and I am deeply impressed by how this was figured out, tracked down and eventually exploited in a repeatable fashion. But bear with me. Here comes a very simplified explanation on how a single byte buffer overwrite with a fixed value could end up aiding running exploit code as root.

The main Google bug for this problem is still not open since they still have pending mitigations to perform, but since the c-ares issue has been fixed I’ve been told that it is fine to talk about this publicly.

c-ares writes a 1 outside its buffer

c-ares has a function called ares_create_query. It was added in 1.10 (released in May 2013) as an updated version of the older function ares_mkquery. This detail is mostly interesting because Google uses an older version than 1.10 of c-ares so in their case the flaw is in the old function. This is the two functions that contain the problem we’re discussing today. It used to be in the ares_mkquery function but was moved over to ares_create_query a few years ago (and the new function got an additional argument). The code was mostly unchanged in the move so the bug was just carried over. This bug was actually already present in the original ares project that I forked and created c-ares from, back in October 2003. It just took this long for someone to figure it out and report it!

I won’t bore you with exactly what these functions do, but we can stick to the simple fact that they take a name string as input, allocate a memory area for the outgoing packet with DNS protocol data and return that newly allocated memory area and its length.

Due to a logic mistake in the function, you could trick the function to allocate a too short buffer by passing in a string with an escaped trailing dot. An input string like “one.two.three\.” would then cause the allocated memory area to be one byte too small and the last byte would be written outside of the allocated memory area. A buffer overflow if you want. The single byte written outside of the memory area is most commonly a 1 due to how the DNS protocol data is laid out in that packet.

This flaw was given the name CVE-2016-5180 and was fixed and announced to the world in the end of September 2016 when c-ares 1.12.0 shipped. The actual commit that fixed it is here.

What to do with a 1?

Ok, so a function can be made to write a single byte to the value of 1 outside of its allocated buffer. How do you turn that into your advantage?

The Redhat security team deemed this problem to be of “Moderate security impact” so they clearly do not think you can do a lot of harm with it. But behold, with the right amount of imagination and luck you certainly can!

Back to ChromeOS we go.

First, we need to know that ChromeOS runs an internal HTTP proxy which is very liberal in what it accepts – this is the software that uses c-ares. This proxy is a key component that the attacker needed to tickle really badly. So by figuring out how you can send the correctly crafted request to the proxy, it would send the right string to c-ares and write a 1 outside its heap buffer.

ChromeOS uses dlmalloc for managing the heap memory. Each time the program allocates memory, it will get a pointer back to the request memory region, and dlmalloc will put a small header of its own just before that memory region for its own purpose. If you ask for N bytes with malloc, dlmalloc will use ( header size + N ) and return the pointer to the N bytes the application asked for. Like this:

malloced-area

With a series of cleverly crafted HTTP requests of various sizes to the proxy, the attacker managed to create a hole of freed memory where he then reliably makes the c-ares allocated memory to end up. He knows exactly how the ChromeOS dlmalloc system works and its best-fit allocator, how big the c-ares malloc will be and thus where the overwritten 1 will end up. When the byte 1 is written after the memory, it is written into the header of the next memory chunk handled by dlmalloc:

two-mallocs

The specific byte of that following dlmalloc header that it writes to, is used for flags and the lowest bits of size of that allocated chunk of memory.

Writing 1 to that byte clears 2 flags, sets one flag and clears the lowest bits of the chunk size. The important flag it sets is called prev_inuse and is used by dlmalloc to tell if it can merge adjacent areas on free. (so, if the value 1 simply had been a 2 instead, this flaw could not have been exploited this way!)

When the c-ares buffer that had overflowed is then freed again, dlmalloc gets fooled into consolidating that buffer with the subsequent one in memory (since it had toggled that bit) and thus the larger piece of assumed-to-be-free memory is partly still being in use. Open for manipulations!

freed-malloc

Using that memory buffer mess

This freed memory area whose end part is actually still being used opened up the play-field for more “fun”. With doing another creative HTTP request, that memory block would be allocated and used to store new data into.

The attacker managed to insert the right data in that further end of the data block, the one that was still used by another part of the program, mostly since the proxy pretty much allowed anything to get crammed into the request. The attacker managed to put his own code to execute in there and after a few more steps he ran whatever he wanted as root. Well, the user would have to get tricked into running a particular JavaScript but still…

I cannot even imagine how long time it must have taken to make this exploit and how much work and sweat that were spent. The report I read on this was 37 very detailed pages. And it was one of the best things I’ve read in a long while! When this goes public in the future, I hope at least parts of that description will become available for you as well.

A lesson to take away from this?

No matter how limited or harmless a flaw may appear at a first glance, it can serve a malicious purpose and serve as one little step in a long chain of events to attack a system. And there are skilled people out there, ready to figure out all the necessary steps.

Update: A detailed write-up about this flaw (pretty much the report I refer to above) by the researcher who found it was posted on Google’s Project Zero blog on December 14:
Chrome OS exploit: one byte overflow and symlinks.

No more heartbleeds please

caution-quarantine-areaAs a reaction to the whole Heartbleed thing two years ago, The Linux Foundation started its Core Infrastructure Initiative (CII for short) with the intention to help track down well used but still poorly maintained projects or at least detect which projects that might need help. Where the next Heartbleed might occur.

A bunch of companies putting in money to improve projects that need help. Sounds almost like a fairy tale to me!

Census

In order to identify which projects to help, they run their Census Project: “The Census represents CII’s current view of the open source ecosystem and which projects are at risk.

The Census automatically extracts a lot of different meta data about open source projects in order to deduce a “Risk Index” for each project. Once you’ve assembled such a great data trove for a busload of projects, you can sort them all based on that risk index number and then you basically end up with a list of projects in a priority order that you can go through and throw code at. Or however they deem the help should be offered.

Which projects will fail?

The old blog post How you know your Free or Open Source Software Project is doomed to FAIL provides such a way, but it isn’t that easy to follow programmatically. The foundation has its own 88 page white paper detailing its methods and algorithm.

Risk Index

  • A project without a web site gets a point
  • If the project has had four or more CVEs (publicly disclosed security vulnerabilities) since 2010, it receives 3 points and if fewer than four there’s a diminishing scale.
  • The number of contributors the last 12 months is a rather heavy factor, which thus could make the index grow old fairly quick. 3 contributors still give 4 points.
  • Popular packages based on Debian’s popcon get points.
  • If the project’s main language is C or C++, it gets two points.
  • Network “exposed” projects get points.
  • some additional details like dependencies and how many outstanding patches not accepted upstream that exist

All combined, this grades projects’ “risk” between 0 and 15.

Not high enough resolution

Assuming that a larger number of CVEs means anything bad is just wrong. Even the most careful and active projects can potentially have large amounts of CVEs. It means they disclose what they find and that people are actually reviewing code, finding problems and are reporting problems. All good things.

Sure, security problems are not good but the absence of CVEs in a project doesn’t say that the project is one bit more secure. It could just mean that nobody ever looked closely enough or that the project doesn’t deal with responsible disclosure of the problems.

When I look through the projects they have right now, I get the feeling the resolution (0-15) is too low and they’ve shied away from more aggressively handing out penalty based on factors we all recognize in abandoned/dead projects (some of which are decently specified in Tom Calloway’s blog post mentioned above).

The result being that the projects get a score that is mostly based on what kind of project it is.

But this said, they have several improvements to their algorithm already suggested in their issue tracker. I firmly believe this will improve over time.

The riskiest ?

The top three projects, the only ones that scores 13 right now are expat, procmail and unzip. All of them really small projects (source code wise) that have been around since a very long time.

curl, being the project I of course look out for, scores a 9: many CVEs (3), written in C (2), network exposure (2), 5+ apps depend on it (2). Seriously, based on these factors, how would you say the project is situated?

In the sorted list with a little over 400 projects, curl is rated #73 (at the time of this writing at least). Just after reportbug but before libattr1. [curl summary – which is mentioning a very old curl release]

But the list of projects mysteriously lack many projects. Like I couldn’t find neither c-ares nor libssh2. They may not be super big, but they’re used by a bunch of smaller and bigger projects at least, including curl itself.

The full list of projects, their meta-data and scores are hosted in their repository on github.

Benefits for projects near me

I can see how projects in my own backyard have gotten some good out of this effort.

I’ve received some really great bug reports and gotten handed security problems in curl by an individual who did his digging funded by this project.

I’ve seen how the foundation sponsored a test suite for c-ares since the project lacked one. Now it doesn’t anymore!

Badges!

In addition to that, the Linux Foundation has also just launched the CII Best Practices Badge Program, to allow open source projects to fill in a bunch of questions and if meeting enough requirements, they will get a “badge” to boast to the world as a “well run project” that meets current open source project best practices.

I’ve joined their mailing list and provided some of my thoughts on the current set of questions, as I consider a few of them to be, well, lets call them “less than optimal”. But then again, which project doesn’t have bugs? We can fix them!

curl is just now marked as “100% compliance” with all the best practices listed. I hope to be able to keep it like that even with future and more best practices added.