Category Archives: Open Source

Open Source, Free Software, and similar

webinar: Why everyone is using curl and you should too

I’m please to invite you to our live webinar, “Why everyone is using curl and you should too!”, hosted by wolfSSL. Daniel Stenberg (me!), founder and Chief Architect of curl, will be live and talking about why everyone is using curl and you should too!

This is planned to last roughly 20-30 minutes with a following 10 minutes Q&A.

Space is limited so please register early!

When: Jan 14, 2020 08:00 AM Pacific Time (US and Canada) (16:00 UTC)

Register in advance for this webinar!

After registering, you will receive a confirmation email containing information about joining the webinar.

Not able to attend? Register now and after the event you will receive an email with link to the recorded presentation.

The presentation

curl 7.68.0 with etags and BearSSL

The year is still young, and we’re here to really kick off 2020 with a brand new curl release! curl 7.68.0 is available at curl.haxx.se as always. Once again we’ve worked hard and pushed through another release cycle to bring you the very best we could do in the 63 days since 7.67.0.

(The previous release was said to be the 186th, but it turned out we’ve been off-by-one on the release counter for a while.)

Numbers

the 188th release
6 changes
63 days (total: 7,964)

124 bug fixes (total: 5,788)
193 commits (total: 25,124)
1 new public libcurl function (total: 82)
0 new curl_easy_setopt() option (total: 269)

3 new curl command line option (total: 229)
70 contributors, 32 new (total: 2,088)
31 authors, 13 new (total: 756)
1 security fixes (total: 93)
400 USD paid in Bug Bounties

Security Vulnerability

CVE-2019-15601: SMB access smuggling via FILE URL on Windows.

Simply put: you could provide a FILE:// URL to curl that could trick it to try to access a host name over SMB – on Windows machines. This could happen because Windows apparently always do this automatically if given the correct file name and curl had no specific filter to avoid it.

For this discovery and report, the curl Bug Bounty program has rewarded Fernando Muñoz 400 USD.

Changes

We ship a new TLS backend: BearSSL. The 14th.

We ship two new command line options for ETags.

We provide a new API call to wakeup “sleeping” libcurl poll calls.

We changed the default handling in libcurl with OpenSSL for verifying certificates. We now allow “partial chains” by default, meaning that you can use an intermediate cert to verify the server cert, not necessarily the whole chain to the root, like you did before. This brings the OpenSSL backend to work more similar to the other TLS backends, and we offer a new option for applications to switch back on the old behavior (CURLSSLOPT_NO_PARTIALCHAIN).

The progress callback has a new feature: if you return CURL_PROGRESSFUNC_CONTINUE from the callback, it will continue and call the internal progress meter.

The new command line option --parallel-immediate is added, and if used will make curl do parallel transfers like before 7.68.0. Starting with 7.68.0, curl will default to defer new connections and rather try to multiplex new transfer over an existing connection if more than one transfer is specified to be done from the same host name.

Bug-fixes

Some of my favorite fixes done since the last release include…

Azure CI and torture

This cycle we started running a bunch of CI tests on Azure Pipelines, both Linux and macOS tests. We also managed to get torture tests running thanks to the new shallow mode.

Azure seem to run faster and more reliable than Travis CI, so moving a few jobs over has made a total build run often complete in less total time now.

prefer multiplexing to using new connections

A regression was found that made the connection reuse logic in libcurl to prefer new connections to multiplexing more than what was actually intended and once fixed we should see libcurl-using application do more and better HTTP/2 multiplexing.

support for ECDSA and ed25519 knownhost keys with libssh2

libssh2 is the primary SSH backend people use with curl. While the library itself has supported these new “knownhost” keys for a while, we hadn’t previously adjusted curl to play nicely with them. Until now.

openssl: Revert to less sensitivity for SYSCALL errors

Another regression in the OpenSSL backend code made curl overly sensitive to some totally benign TLS messages which would cause a curl error when they should just have been silently handled and closed the connection cleanly.

openssl: set X509_V_FLAG_PARTIAL_CHAIN by default

The OpenSSL backend now behaves more similar to other TLS backends in curl and now accepts “partial” certificate chains. That means you don’t need to have the entire chain locally all the way to the root in order to verify a server certificate. Unless you set CURLSSLOPT_NO_PARTIALCHAIN to enforce that behavior.

parsedate: offer a getdate_capped() alternative

The date parser was extended to make curl handle dates beyond 2038 better on 32 bit systems, which primarily seems to happen with cookies. Now the parser understands that they’re too big and will use the max time value it can hold instead of failing and using a zero, as that would make the cookies into “session cookies” which would have slightly different behavior.

Video presenting the release

curl option-of-the-week

The curl command line tool has a lot of options. And I really mean a lot, as in more than two hundred of them. After the pending release, I believe the exact count is 229.

Starting now, I intend to write and blog about a specific curl command line option per week (or so). With details, examples, maybe history and whatever I can think of is relevant to the option. Quite possibly some weeks will just get shorter posts if I don’t have a lot to say about it.

curl command line option of the week; ootw.

This should keep me occupied for a while.

Options

This will later feature the list of options with links to the dedicate blog posts about them. If you have preferences of what options to cover, do let me know and I can schedule them early on, otherwise I plan to go over them in a mostly random order.

--raw
-m, --max-time
-k, --insecure
-U, --proxy-user
--keepalive-time
--mail-from
--ftp-pasv
--next
--quote
--happy-eyeballs-timeout-ms
--retry-max-time
--proxy-basic
--verbose
--append
--ipv4
--remote-name-all
-G, --get
-Y, --speed-limit
-r, --range
--socks5
--ftp-skip-pasv-ip
--connect-timeout
--remote-time
--silent

Restored complete curl changelog

For a long time, the curl changelog on the web site showed the history of changes in the curl project all the way back to curl 6.0. Released on September 13 1999. Older changes were not displayed.

The reason for this was always basically laziness. The page in its current form was initially created back in 2001 and then I just went back a little in history and filled up with a set of previous releases. Since we don’t have pre-1999 code in our git tree (because of a sloppy CVS import), everything before 1999 is a bit of manual procedure to extract so we left it like that.

Until now.

I decided to once and for all fix this oversight and make sure that we get a complete changelog from the first curl release all the way up until today. The first curl release was called 4.0 and was shipped on March 20, 1998.

Before 6.0 we weren’t doing very careful release notes and they were very chatty. I got the CHANGES file from the curl 6.0 tarball and converted them over to the style of the current changelog.

Notes on the restoration work

The versions noted as “beta” releases in the old changelog are not counted or mentioned as real releases.

For the released versions between 4.0 and 4.9 there are no release dates recorded, so I’ve “estimated” the release dates based on the knowledge that we did them fairly regularly and that they probably were rather spread out over that 200 day time span. They won’t be exact, but close enough.

Complete!

The complete changelog is now showing on the site, and in the process I realized that I have at some point made a mistake and miscounted the total number of curl releases. Off-by one actually. The official count now says that the next release will become the 188th.

As a bonus from this work, the “releaselog” page is now complete and shows details for all curl releases ever. (Also, note that we provide all that info in a CSV file too if you feel like playing with the data.)

There’s a little caveat on the updated vulnerability information there: when we note how far vulnerabilities go, we have made it a habit to sometimes mark the first vulnerable version as “6.0” if the bad code exists in the first ever git imported code – simply because going back further and checking isn’t easy and usually isn’t worth the effort because that old versions are not used anymore.

Therefore, we will not have accurate vulnerability information for versions before 6.0. The vulnerability table will only show versions back to 6.0 for that reason.

Many bug-fixes

With the complete data, we also get complete numbers. Since the birth of curl until version 7.67.0 we have fixed exactly 5,664 bugs shipped in releases, and there were exactly 7,901 days between the 4.0 the 7.67.0 releases.

curl receives 10K USD donation

The largest ever single-shot monetary donation to the curl project just happened when indeed.com graciously boosted our economy with 10,000 USD. (It happened before the new year but as I was away then I haven’t had the chance to blog about it until now.)

curl remains a small project with no major financial backing, with no umbrella organization (*) and no major company sponsorships.

Indeed’s FOSS fund

At Indeed they run this awesome fund for donating to projects they use. See Duane O’Brien’s FOSDEM 2019 talk about it.

How to donate to curl

curl is not a legal, registered organization or company or anything that can actually hold on to assets such as money. In any country.

What we do have however, is a “collective” over at Open Collective. Skip over there to make monetary donations. Over there you also get a complete look into previous donations with full transparency as to what funds we have and spend in the project.

Money donated to us will only be spent on project related activities.

Other ways to donate to the project is of course to donate time and effort. Allow your employees to help out or spend your own time at writing code, fixing bugs or extend the documentation. Every little bit helps and will be appreciated!

curl sponsors

curl is held upright and pushing forward much thanks to the continuous financial support from champion companies. The primary curl sponsors being Haxx, wolfSSL, Fastly and Teamviewer.

The curl project’s use of donated money

We currently have two primary expenses in the project that aren’t already covered by sponsors:

The curl bug bounty. We’ve already discussed internally that we should try to raise the amounts we hand out as rewards for the flaws we get reported going forward. We started out carefully since we didn’t want to drain the funds immediately, but time has shown that we haven’t received so many reports and the funds are growing. This means we will raise the rewards levels to encourage researchers to dig deeper.

The annual curl up developers conference. I’d like us to sponsor top contributors’ and possibly student developers’ travels to enable a larger attendance – and a social development team dinner! The next curl up will take place in Berlin in May 2020.

(*) = curl has previously applied for membership in both Software Freedom Conservancy and Linux Foundation as they seemed like suitable stewards, but the first couldn’t accept us due to work load and the latter didn’t even bother to respond. It’s not a big bother, just reality.

Summing up My 2019

2019 is special in my heart. 2019 was different than many other years to me in several ways. It was a great year! This is what 2019 was to me.

curl and wolfSSL

I quit Mozilla last year and in the beginning of the year I could announce that I joined wolfSSL. For the first time in my life I could actually work with curl on my day job. As the project turned 21 I had spent somewhere in the neighborhood of 15,000 unpaid spare time hours on it and now I could finally do it “for real”. It’s huge.

Still working from home of course. My commute is still decent.

HTTP/3

Just in November 2018 the name HTTP/3 was set and this year has been all about getting it ready. I was proud to land and promote HTTP/3 in curl just before the first browser (Chrome) announced their support. The standard is still in progress and we hope to see it ship not too long into next year.

curl

Focusing on curl full time allows a different kind of focus. I’ve landed more commits in curl during 2019 than any other year going back all the way to 2005. We also reached 25,000 commits and 3,000 forks on github.

We’ve added HTTP/3, alt-svc, parallel transfers in the curl tool, tiny-curl, fixed hundreds of bugs and much, much more. Ten days before the end of the year, I’ve authored 57% (over 700) of all the commits done in curl during 2019.

We ran our curl up conference in Prague and it was awesome.

We also (re)started our own curl Bug Bounty in 2019 together with Hackerone and paid over 1000 USD in rewards through-out the year. It was so successful we’re determined to raise the amounts significantly going into 2020.

Public speaking

I’ve done 28 talks in six countries. A crazy amount in front of a lot of people.

In media

Dagens Nyheter published this awesome article on me. I’m now shown on the internetmuseum. I was interviewed and highlighted in Bloomberg Businessweek’s “Open Source Code Will Survive the Apocalypse in an Arctic Cave” and Owen William’s Medium post The Internet Relies on People Working for Free.

When Github had their Github Universe event in November and talked about their new sponsors program on stage (which I am part of, you can sponsor me) this huge quote of mine was shown on the big screen.

Maybe not media, but in no less than two Mr Robot episodes we could see curl commands in a TV show!

Podcasts

I’ve participated in three podcast episodes this year, all in Swedish. Kompilator episode 5 and episode 8, and Kodsnack episode 331.

Live-streamed

I’ve toyed with live-streamed programming and debugging sessions. That’s been a lot of fun and I hope to continue doing them on and off going forward as well. They also made me consider and get started on my libcurl video tutorial series. We’ll see where that will end…

2020?

I figure it can become another fun year too!

Internetmuseum

The Internet Museum translated to Swedish becomes “internetmuseum“. It is a digital, online-only, museum that collects Internet- and Web related historical information, especially focused on the Swedish angle to all of this. It collects stories from people who did the things. The pioneers, the ground breakers, the leaders, the early visionaries. Most of their documentation is done in the form of video interviews.

I was approached and asked to be part of this – as an Internet Pioneer. Me? Internet Pioneer, really?

Internetmuseum’s page about me.

I’m humbled and honored to be considered and I certainly had a lot of fun doing this interview. To all my friends not (yet) fluent in Swedish: here’s your grand opportunity to practice, because this is done entirely in this language of curl founders and muppet chefs.

Photo from Internetmusuem

Back in the morning of October 18th 2019, two guys showed up as planned at my door and I let them in. One of my guests was a photographer who set up his gear in my living room for the interview, and then me and and guest number two, interviewer Jörgen, sat down and talked for almost an hour straight while being recorded.

The result can be seen here below.

The Science museum was first

This is in fact the second Swedish museum to feature me.

I have already been honored with a display about me, at the Tekniska Museet in Stockholm, the “Science museum” which has an exhibition about past Polhem Prize award winners.

Information displayed about me at the Swedish Science museum in Stockholm. I have a private copy of the cardboard posters.

(Top image by just-pics from Pixabay)

How randomly skipping tests made them better!

In the curl project we produce and ship a rock solid and reliable library for the masses, we must never exit, leak memory or do anything in an ungraceful manner. We must free all resources and error out nicely whatever problem we run into at whatever moment in the process.

To help us stay true to this, we have a way of testing we call “torture tests”. They’re very effective error path tests. They work like this:

Torture tests

They require that the code is built with a “debug” option.

The debug option adds wrapper functions for a lot of common functions that allocate and free resources, such as malloc, fopen, socket etc. Fallible functions provided by the system curl runs on.

Each such wrapper function logs what it does and can optionally either work just like it normally does or if instructed, return an error.

When running a torture test, the complete individual test case is first run once, the fallible function log is analyzed to count how many fallible functions this specific test case invoked. Then the script reruns that same test case that number of times and for each iteration it makes another of the fallible functions return error.

First make function 1 return an error. Then make function 2 return and error. Then 3, 4, 5 etc all the way through to the total number. Right now, a typical test case uses between 100 and 200 such function calls but some have magnitudes more.

The test script that iterates over these failure points also verifies that none of these invokes cause a memory leak or a crash.

Very slow

Running many torture tests takes a long time.

This test method is really effective and finds a lot of issues, but as we have thousands of tests and this iterative approach basically means they all need to run a few hundred times each, completing a full torture test round takes many hours even on the fastest of machines.

In the CI, most systems don’t allow jobs to run more than an hour.

The net result: the CI jobs only run torture tests on a few selected test cases and virtually no human ever runs the full torture test round due to lack of patience. So most test cases end up never getting “tortured” and therefore we miss out verifying error paths even though we can and we have the tests for it!

But what if…

It struck me that when running these torture tests on a large amount of tests, a lot of error paths are actually identical to error paths that were already tested and will just be tested again and again in subsequent tests.

If I could identify the full code paths that were already tested, we wouldn’t have to test them again. But getting that knowledge would require insights that our test script just doesn’t have and it will be really hard to make portable to even a fraction of the platforms we run and test curl on. Not the most feasible idea.

I went with something much simpler.

I simply estimate that most test cases actually have many code paths in common with other test cases. By randomly skipping a few iterations on each test, those skipped code paths might still very well be tested in another test. As long as the skipping is random and we do a large number of tests, chances are we cover most paths anyway. I say most because it certainly will not be all.

Random skips

In my first shot at this (after I had landed to change that allows me to control the torture tests this way) I limited the number of errors to 40 per test case. Now suddenly the CI machines can actually blaze through the test cases at a much higher speed and as a result, they ran torture tests on tests we hadn’t tortured in a long time.

I call this option to the runtests.pl script --shallow.

Already on this my first attempt in doing this, I struck gold and the script highlighted code paths that would create memory leaks or even crashes!

As a direct result of more test cases being tortured, I found and fixed nine independent bugs in curl already before I could land this in the master branch, and there seems to be more failures that pop up after the merge too! The randomness involved may of course delay the detection of some problems.

Room for polishing

The test script right now uses a fixed random seed so that repeated invokes will make it work exactly the same. Which is good when you want to reproduce it elsewhere. It is bad in the way that each test will have the exact same tests skipped every test round – as long as the set of fallible functions are unmodified.

The seed can be set by a command line argument so I imagine a future improvement would be to set the random seed based on the git commit hash at the point where the tests are run, or something. That way, torture tests on subsequent commits would get a different random spread.

Alternatively, I will let the CI systems use a true random seed to make it test a different set every time independent of git etc – as when it detects an error the informational output will still be enough for a user to reproduce the problem without the need of a seed.

Further, I’ve started out running --shallow=40 (on the Ubuntu version) which is highly unscientific and arbitrary. I will experiment altering this amount both up and down a bit to see what I learn from that.

Torture via strace?

Another idea that’s been brewing in my head for a while but I haven’t yet actually attempted to do this.

The next level of torture testing is probably to run the tests with strace and use its error injection ability, as then we don’t even need to build a debug version of our code and we don’t need to write wrapper code etc.

Credits

Dice image by Erik Stein from Pixabay

Reporting documentation bugs in curl got easier

After I watched a talk by Marcus Olsson about docs as code (at foss-sthlm on December 12 2019), I got inspired to provide links on the curl web site to make it easier for users to report bugs on documentation.

Starting today, there are two new links on the top right side of all libcurl API function call documentation pages.

File a bug about this page – takes the user directly to a new issue in the github issue tracker with the title filled in with the name of the function call, and the label preset to ‘documentation’. All there’s left is for the user to actually provide a description of the problem and pressing submit (and yeah, a github account is also required).

View man page source – instead takes the user over to browsing that particular man pages’s source file in the github source code repository.

Since this is also already live on the site, you can also browse the documentation there. Like for example the curl_easy_init man page.

If you find mistakes or omissions in the docs – no matter how big or small – feel free to try out these links!

Credits

Image by Pexels from Pixabay

BearSSL is curl’s 14th TLS backend

curl supports more TLS libraries than any other software I know of. The current count stops at 14 different ones that can be used to power curl’s TLS-based protocols (HTTPS primarily, but also FTPS, SMTPS, POP3S, IMAPS and so on).

The beginning

The very first curl release didn’t have any TLS support, but already in June 1998 we shipped the first version that supported HTTPS. Back in those days the protocol was still really SSL. The library we used then was called SSLeay. (No, I never understood how that’s supposed to be pronounced)

The SSLeay library became OpenSSL very soon after but the API was brought along so curl supported it from the start.

More than one

In the spring of 2005 we merged the first support for building curl with a different TLS library. It was GnuTLS, which comes under a different license than OpenSSL and had a slightly different feature set. The race had began.

BearSSL

A short while ago and in time to get shipped in the coming 7.68.0 release (set to ship on January 8th 2020), the 14th TLS backend was merged into the curl source tree in the shape of support for BearSSL. BearSSL is a TLS library aimed at smaller devices and is perhaps lacking a bit in features (like no TLS 1.3 for example) but has still been requested by users in the past.

Multi-SSL

Since September 2017, you can even build libcurl to support one or more TLS libraries in the same build. When built that way, users can select which TLS backend curl should use at each start-up. A feature used and appreciated by for example git for Windows.

Time line

Below is an attempt to visualize how curl has grown in this area. Number of supported TLS backends over time, from the first curl release until today. The image comes from a slide I intend to use in a future curl presentation. A notable detail on this graph is the removal of axTLS support in late 2018 (removed in 7.63.0). PolarSSL is targeted to meet the same destiny in February 2020 since it gets no updates anymore and has in practice already been replaced by mbedTLS.

Click the image to enjoy the full resolution version!

QUIC and TLS

If you’ve heard me talk about HTTP/3 (h3) and QUIC (like my talk at Full Stack Fest 2019), you already know that QUIC needs new APIs from the TLS libraries.

For h3 support to become reality in curl shipped in distros etc, the TLS library curl is set to use needs to provide a QUIC compatible API and the QUIC/h3 library curl uses then needs to support that.

It is likely that some TLS libraries are going to be fast with providing such APIs and some are going be (very) slow. Their particular individual abilities combined with the desire to ship curl with h3 support is likely going to affect what TLS library you will see used by curl in your distro will affect what TLS library you will build your own curl builds to use in the future.

Credits

The recently added BearSSL backend was written by Michael Forney. Top image by LEEROY Agency from Pixabay