Tag Archives: bugfixes

10,000 bugfixes in 10,000 days

We keep track of bugfixes done to curl. All bugfixes ever done. A while back I also went back and populated the lists with details from all the releases to the pre-cursors of curl: httpget and urlget. All and every change made since November 1996.

The bugfixes are all listed on the curl changelog page. The bugfix counter can be found on the release log page.

The rate of bugfixes has been increasing over the years. I think in terms of actual bugs being squashed and fixes being merged, but also partly because we have gotten much better at keeping meticulous logs and do better release notes.

A bugfix can be a single letter typo fix in a document, a spell-fix in a source code comment or it can fix a serious security vulnerability. From high to low, from important to a small subtle detail. The counter does not value, it is just a counter.

When we shipped the recent curl version, 8.6.0, the counter said 9,888 shipped bugfixes. The other day, when 8.7.0 and 8.7.1 shipped, the counter was upped to surpass 10,000 and now says: 10,051.

These bugfixes happened thanks to 3,134 contributors, out of which 1,252 persons have authored commits merged into the curl source repository.

This journey started with httpget. The first ever release of httpget 0.1 that was made public happened on November 11 1996. Today, that is exactly 10,000 days ago.

We only have git commits stored from late 1999, but that counter is almost at 32,000 now. Making a little less than every third commit ever done a logged bugfix.

How I do the release notes

This is highly scripted task.

It starts with: every commit of the RELEASE-NOTES file in git that makes it up-to-date needs to use the single word “synced” as commit message. The commit that syncs it.

Further, I have an alias in my ~/.gitconfig file that says:

[alias]
  latest = log @^{/RELEASE-NOTES:.synced}..

This allows me to invoke git latest to get a list of the latest changes done in the repository since I most recently synced the RELEASE-NOTES.

Sync

When that list starts to grow, typically roughly every four to ten days something, I invoke the release-notes.pl script we have in the curl git repository. This scripts gets all the changes since the most previous sync and inserts them into the RELEASE-NOTES file, complete with a correct reference to the associated GitHub issue or pull-request.

The actual bullet point text it inserts comes from the first line of the corresponding commit message. The links comes from parsing commit messages and finding keywords and links according to how the project dictates how they should be used. This is one reason why it is important to do good commit messages following the correct style in the project. It makes the release notes job easier and the results better.

The script does not know what’s a change, what’s a bugfix or what’s not even worthy of mentioning. It just adds all changes to top the list of changes (and includes a convenient separator so that it is easy to spot the newly added ones) and the next step for me is then to manually go over the list and delete the ones that aren’t intended to be mentioned there and move the few changes into the correct section of the release notes.

I run release-notes.pl cleanup which then sorts the lists alphabetically and removes dangling references (which are leftovers from the lines I removed).

Contributors

We keep track, try to say thanks to and give credit to every contributor that helps out in the project. No matter the size of the contribution. When someone has reported a bug. the reporter is credited in the commit message of the bugfix. We also give credit to co-authors and people assisting in solving the issues etc. To make sure we mention and give credit to the contributors and keep track of them beyond what git itself does.

We can also add names manually to the release notes file, like if we had forgotten to mention them in a commit message. Then I run the contributors.sh script. It reads the list of names currently in the RELEASE-NOTES and then scans all the git changes since the previous sync and generates an updated list of all git authors, committers and everyone else who are credited, and it outputs an updated list (and contributor counter). That updated list is then pasted into RELEASE-NOTES.

In recent years, in a normal eight week release cycle, we typically feature 60 to 80 named contributors in this file. Of course, top contributors in the project tend to get mentioned in just about every release notes file, as they just have to help out and contribute once every 56 days to appear there.

On release days, we update the docs/THANKS file (using the contrithanks.sh script) where all contributors who ever helped out are mentioned and saved for the future. That list of people is also made visible on the thanks page on the curl website.

Counters

At the top of the release notes we have a few counters displayed. It looks similar to:

Public curl releases:         255
Command line options:         258
curl_easy_setopt() options:   304
Public functions in libcurl:  93
Contributors:                 3119

After the list of contributors have been pasted into the current release notes, I invoke the delta script, which shows a lot of curl git repository statistics since the most previous release tag. That input includes the numbers shown in the release notes top, so if they are different now I update the release notes accordingly with the updated data. Most frequently, the contributor counter has been bumped.

Commit

  • included the lists of bugfixes and changes
  • updated contributors
  • updated the counters

The RELEASE-NOTES file is then committed to git using “synced” as commit message. Until it is time to sync it again.

Because of this work, we can offer the pending release notes on the website, as it is the work in progress file with the changes we have already logged that is targeted to be included in the next release.

Release

Of course, on release days I make sure to do a final update so that all the last changes get into the file before release as then the file ends up in the release tarball, that is locked, signed and stored the archives.

After a release, I just manually erase the lists from the file and clear the list of names and commit. Then we start rebuilding it again with new stuff in the new release cycle.

5 years on OSS-Fuzz

On July 1st 2017, exactly five years ago today, the OSS-Fuzz project “adopted” curl into their program and started running fuzz tests against it.

OSS-Fuzz is a project run by Google and they do fuzzing on a large amount of open source projects: OSS-Fuzz aims to make common open source software more secure and stable by combining modern fuzzing techniques with scalable, distributed execution.

That initial adoption of curl into OSS-Fuzz was done entirely by Google themselves and its fuzzing integration was rough and not ideal but it certainly got the ball rolling.

Later in in the fall of 2017, Max Dymond stepped up and seriously improved the curl-fuzzer so that it would better test protocols and libcurl options deeper and to a higher degree. (Max subsequently got a grant from Google for his work.)

Fuzzing curl

Fuzzing is really the next-level thing all projects should have a go at when there is no compiler warning left to fix and the static code analyzers all report zero defects, because fuzzing has a way to find silly mistakes, off-by-ones etc in a way no analyzers can and that is very hard for human reviewers to match.

curl is a network tool and library written in C, it is meant to handle non-trusted data properly and doing it wrongly can have serious repercussions.

For the curl project, OSS-Fuzz has found several silly things and bad code paths especially after that rework Max put in. We have no less than 6 curl CVEs registered in 2017 and 2018 giving OSS-Fuzz (at least partial) credit for their findings.

In the first year or two, OSS-Fuzz truly kept us on our toes and we fixed numerous other flaws as well that weren’t CVE worthy but still bugs. At least 36 separate ones to date.

The beauty of OSS-Fuzz

For a little Open Source project such as curl, this service is great and in case you never worked with it let me elaborate what makes me enjoy it so much.

  1. Google runs all the servers and keeps the service running. We don’t have to admin it, spend energy on maintaining etc.
  2. They automatically pull the latest curl code from our master branch and keep fuzzing it. All day, every day, non-stop.
  3. When they find a problem, we get a bug submitted and an email sent. The bug report they create contains stack traces and a minimized and reproducible test case. In most cases, that means I can download a small file that is often less than a hundred bytes, and run my locally built fuzzer code with this input and see the same issue happen in my end.
  4. The bug they submit is initially closed for the public and only accessible to select team of curl developers.
  5. 90 days after the initial bug was filed, it is made public automatically.

The number of false positives are remarkably low, meaning that almost every time the system emails me, it spotted a genuine problem.

Fuzzing per commit

In 2020, the OSS-Fuzz team introduced CIFuzz, which allows us to run a little bit of fuzzing – for a limited time – on a PR as part of the CI pipeline. Using this, we catch the most obvious mistakes even before we merge the code into our master branch!

After the first few years of us fixing bad code paths, and with the introduction of CIFuzz, we nowadays rarely get reports from OSS-Fuzz anymore. And when we do, we have it complain on code that hasn’t been released yet. The number of CVEs detected in curl by OSS-Fuzz has therefore seemingly stopped at 6.

Taking fuzzing further

After five years of fuzzing, we would benefit from extending and improving the “hooks” where and how the fuzzing is made so that we can help it find and reach code paths with junk that it so far hasn’t reached. There are of course still potentially many undetected flaws in curl remaining because of the limited set of entry point we have for the fuzzer.

I hope OSS-Fuzz continues and lives on for a long time. It certainly helps curl a lot!

Thanks!

curl + hackerone = TRUE

There seems to be no end to updated posts about bug bounties in the curl project these days. Not long ago I mentioned the then new program that sadly enough was cancelled only a few months after its birth.

Now we are back with a new and refreshed bug bounty program! The curl bug bounty program reborn.

This new program, which hopefully will manage to survive a while, is setup in cooperation with the major bug bounty player out there: hackerone.

Basic rules

If you find or suspect a security related issue in curl or libcurl, report it! (and don’t speak about it in public at all until an agreed future date.)

You’re entitled to ask for a bounty for all and every valid and confirmed security problem that wasn’t already reported and that exists in the latest public release.

The curl security team will then assess the report and the problem and will then reward money depending on bug severity and other details.

Where does the money come from?

We intend to use funds and money from wherever we can. The Hackerone Internet Bug Bounty program helps us, donations collected over at opencollective will be used as well as dedicated company sponsorships.

We will of course also greatly appreciate any direct sponsorships from companies for this program. You can help curl getting even better by adding funds to the bounty program and help us reward hard-working researchers.

Why bounties at all?

We compete for the security researchers’ time and attention with other projects, both open and proprietary. The projects that can help put food on these researchers’ tables might have a better chance of getting them to use their tools, time, skills and fingers to find our problems instead of someone else’s.

Finding and disclosing security problems can be very time and resource consuming. We want to make it less likely that people give up their attempts before they find anything. We can help full and part time security engineers sustain their livelihood by paying for the fruits of their labor. At least a little bit.

Only released code?

The state of the code repository in git is not subject for bounties. We need to allow developers to do mistakes and to experiment a little in the git repository, while we expect and want every actual public release to be free from security vulnerabilities.

So yes, the obvious downside with this is that someone could spot an issue in git and decide not to report it since it doesn’t give any money and hope that the flaw will linger around and ship in the release – and then reported it and claim reward money. I think we just have to trust that this will not be a standard practice and if we in fact notice that someone tries to exploit the bounty in this manner, we can consider counter-measures then.

How about money for the patches?

There’s of course always a discussion as to why we should pay anyone for bugs and then why just pay for reported security problems and not for heroes who authored the code in the first place and neither for the good people who write the patches to fix the reported issues. Those are valid questions and we would of course rather pay every contributor a lot of money, but we don’t have the funds for that. And getting funding for this kind of dedicated bug bounties seem to be doable, where as a generic pay contributors fund is trickier both to attract money but it is also really hard to distribute in an open project of curl’s nature.

How much money?

At the start of this program the award amounts are as following. We reward up to this amount of money for vulnerabilities of the following security levels:

Critical: 2,000 USD
High: 1,500 USD
Medium: 1,000 USD
Low: 500 USD

Depending on how things go, how fast we drain the fund and how much companies help us refill, the amounts may change over time.

Found a security flaw?

Report it!

Coverity’s open source bug report

The great guys at scan.coverity.com published their Open Source Report 2008 in which they detail findings about source code they’ve monitored and how quality and bug density etc have changed over time since they started scanning over 250 popular open source projects. curl is one of the projects included.

Some highlights from the report:

  • curl is mentioned as one of the (few) projects that fixed all defects identified by coverity
  • from their start, the average defect frequency has gone down from one defect per 3333 lines of code to one defect per 4000 lines
  • they find no support to backup the old belief that there’s a correlation between function length and bug count
  • the average function length is 66 lines

And the top-5 most frequently detected defects are:

  1. NULL Pointer Dereference 28%
  2. Resource Leak 26%
  3. Unintentional Ignored Expressions 10%
  4. Use Before Test (NULL) 8%
  5. Buffer Overrun (statically allocated) 6%

For all details and more fun reading, see the full Open Source Report 2008 (1MB pdf)

curl and libcurl 7.18.1

Mainly thanks to the 22 friends named in the release notes, curl and libcurl 7.18.1 was released today with the news and fixes that should prove this the best curl and libcurl versions ever – I guess we always have to believe that our latest is the greatest, why else would we release it?

cURL

The release notes identifies 23 bug fixes we did during the two months since the last release, and the news we introduce include these goodies:

  • added support for “HttpOnly” cookies
  • ‘make ca-bundle’ downloads and generates an updated ca bundle file
  • we no longer distribute or install a ca cert bundle
  • SSLv2 is now disabled by default for SSL operations
  • the test509-style setting URL in callback is officially no longer supported
  • support a full chain of certificates in a given PKCS12 certificate
  • resumed transfers work with SFTP
  • added type checking macros for curl_easy_setopt() and curl_easy_getinfo(), watch out for new warnings in code using libcurl (needs gcc-4.3 and currently only works in C mode)
  • curl_easy_setopt(), curl_easy_getinfo(), curl_share_setopt() and curl_multi_setopt() uses are now checked to use exactly three arguments
  • –with-ca-path=DIR configure option allows to set an openSSL CApath instead of a default ca bundle.

As usual, you can download it here.

curl on scan.coverity.com

On scan.coverity.com, the nice guys at Coverity run scans on open source projects to check for flaws in their source code. Their list currently includes 265 projects, and curl is one of them. I have only good words to say about their scanning, as they found no less than 27 flaws in curl 7.16.1 and only one of them was a false positive. All the others were valid and true flaws that we could fix. I don’t think anyone was any serious security risk, but still. 26 bugs detected in one go.

On January 8th 2008, Coverity announced their “rung 2” for eleven projects that had zero flaws left in rung 1 and the rung 2 projects get an upgraded analysis. curl was also at zero flaws left, but it isn’t clear to me what else we could to do to reach rung 2 or even how we can get them to do a follow-up scan on a newer release since 7.16.1 is quite old by now and with all the changes in the code over time there’s always the risk new nasty bugs have crept in… So we’re at rung 1 still with no recent release scanned.