Category Archives: Open Source

Open Source, Free Software, and similar

Happy 21st, curl!

Another year has passed. The curl project is now 21 years old.

I think we can now say that it is a grown-up in most aspects. What have we accomplished in the project in these 21 years?

We’ve done 179 releases. Number 180 is just a week away.

We estimate that there are now roughly 6 billion curl installations world-wide. In phones, computers, TVs, cars, video games etc. With 4 billion Internet users, that’s like 1.5 curl installations per Internet-connected human on earth.

669 persons have authored patches that were merged.

The curl source code now consists of 160,000 lines of code made in over 24,000 commits.

1,927 persons have helped out so far. With code, bug reports, advice, help and more.

The curl repository also hosts 429 man pages with a total of 36,900 lines of documentation. That count doesn’t even include the separate project Everything curl which is a dedicated book on curl with an additional 10,165 lines.

In this time we have logged more than 4,900 bug-fixes, out of which 87 were security related problems.

We keep doing more and more CI builds, auto-builds, fuzzing and static code analysis on our code day-to-day and non-stop. Each commit is now built and tested in over 50 different builds and environments and is checked by at least four different static code analyzers, spending upwards of 20-25 CPU hours per commit.

We have had 2 curl developer conferences, with the third curl up about to happen this coming weekend in Prague, Czech Republic.

The curl project was created by me and I’m still the lead developer. Up until today, almost 60% of the commits in the project have my name on them. I have done most commits per month in the project every single month since August 2015, and in 186 months out of the 232 months for which we have logged data.

Looking for the Refresh header

The other day someone filed a bug on curl that we don’t support redirects with the Refresh header. This took me down a rabbit hole of Refresh header research and I’ve returned to share with you what I learned down there.

tl;dr Refresh is not a standard HTTP header.

As you know, an HTTP redirect is specified to use a 3xx response code and a Location: header to point out the new URL (I use the term URL here but you know what I mean). This has been the case since RFC 1945 (HTTP/1.0). According to an old mail from Roy T Fielding (dated June 1996), Refresh “didn’t make it” into that spec. That was the first “real” HTTP specification. (And the HTTP we used before 1.0 didn’t even have headers!)

The little detail that it never made it into the 1.0 spec or any later one doesn’t seem to have affected the browsers. Still today, browsers keep supporting the Refresh header as a sort of Location: replacement even though it seems to never have been present in an HTTP spec.

In good company

curl is not the only HTTP library that doesn’t support this non-standard header. The popular Python library requests apparently doesn’t either, according to this bug from 2017, and another bug was filed about it already back in 2011 but it was just closed as “old” in 2014.

I’ve found no support in wget or wget2 either for this header.

I didn’t do any further extensive search for other toolkits’ support, but it seems that the browsers are fairly alone in supporting this header.

How common is the Refresh header?

I decided to make an attempt to figure that out, and for this venture I used the Rapid7 data trove. The method that data is collected with may not be the best – it scans the IPv4 address range and sends an HTTP request to each TCP port 80, setting the IP address in the Host: header. The result of that scan is 52+ million HTTP responses from different and current HTTP origins. (Exactly 52254873 responses in my 59GB data dump, dated end of February 2019).

Results from my scans

  • Location is used in 18.49% of the responses
  • Refresh is used in 0.01738% of the responses (exactly 9080 responses featured it)
  • Location is thus used 1064 times more often than Refresh
  • In 35% of the cases when Refresh is used, Location is also used
  • curl thus handles 99.9939% of the redirects in this test
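The Refresh share and the Location-to-Refresh ratio follow directly from the raw counts. Here is a minimal Python sketch of that arithmetic. It assumes the 18.49% Location share is a rounded figure, since the exact Location count isn't quoted above, and the function names are just made up for the example.

```python
# Raw counts quoted in the post.
TOTAL_RESPONSES = 52_254_873
REFRESH_RESPONSES = 9_080

# Location share as quoted (assumed to be rounded to two decimals).
LOCATION_SHARE_PCT = 18.49


def refresh_share_pct(refresh=REFRESH_RESPONSES, total=TOTAL_RESPONSES):
    """Percentage of all responses that carried a Refresh header."""
    return 100.0 * refresh / total


def location_to_refresh_ratio(location_pct=LOCATION_SHARE_PCT):
    """How many times more often Location shows up than Refresh."""
    return location_pct / refresh_share_pct()


if __name__ == "__main__":
    print(f"Refresh share: {refresh_share_pct():.5f}%")
    print(f"Location/Refresh ratio: {location_to_refresh_ratio():.0f}")
```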

Additional notes

  • When Refresh is the only redirect header, the response code is usually 200 (with 404 being the second most common)
  • When both headers are used, the response code is almost always 30x
  • When both are used, it is common to redirect to the same target and it is also common for the Refresh header value to only contain a number (for the number of seconds until “refresh”).
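The value forms mentioned above (a bare number of seconds, or a number followed by a target) can be picked apart with a few lines of code. The Python sketch below assumes the commonly seen "seconds; url=target" shape. Since the header isn't specified in any HTTP RFC, that grammar and the function name parse_refresh are assumptions for illustration only, not what any browser actually implements.

```python
import re

# Assumed shape: "<seconds>" or "<seconds>; url=<target>" ("," also accepted).
_REFRESH_RE = re.compile(
    r"""^\s*(?P<delay>\d+)          # whole seconds to wait
        (?:\s*[;,]\s*               # optional separator
           (?:url\s*=\s*)?          # optional "url=" prefix, any case
           (?P<target>.*?)          # the redirect target, if any
        )?\s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def parse_refresh(value):
    """Return (delay_seconds, target_or_None), or None if unparsable."""
    m = _REFRESH_RE.match(value)
    if m is None:
        return None
    target = m.group("target")
    if target:
        # Some servers wrap the target in quotes; strip them.
        target = target.strip("\"'")
    return int(m.group("delay")), (target or None)
```

A value with only a number, which the scan found to be common, comes back with no target, so a client could treat it as "reload the same URL after that many seconds".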

Refresh from HTML content

Redirects can also be done by meta tags in HTML and sending the refresh that way, but I have not investigated how common that is, as it isn’t strictly speaking HTTP and so is outside of my research (and interest) here.

In use, not documented, not in the spec

Just another undocumented corner of the web.

When I posted about these findings on the HTTPbis mailing list, it was pointed out that WHATWG mentions this header in their iana page. I say mention because calling that documenting would be a stretch…

It is not at all clear exactly what the header is supposed to do and it is not documented anywhere. It’s not exactly a redirect, but almost?

Will/should curl support it?

A decision hasn’t been made about it yet. With such a very low use frequency and since we’ve managed fine without support for it so long, maybe we can just maintain the situation and instead argue that we should just completely deprecate this header use from the web?

Updates

After this post first went live, I got some further feedback and data that are relevant and interesting.

  • Yoav Weiss created a patch for Chrome to count how often they see this header used in real life.
  • Eric Lawrence pointed out that IE had several incompatibilities in its Refresh parser back in the day.
  • Boris pointed out (in the comments below) the WHATWG documented steps for handling the header.
  • The use of <meta> tag refresh in contents is fairly high. The Chrome counter says almost 4% of page loads!

alt-svc in curl

RFC 7838 was published back in April 2016. It describes the new HTTP header Alt-Svc, or as the title of the document says, HTTP Alternative Services.

HTTP Alternative Services

An alternative service in HTTP lingo is quite simply another server instance that can provide the same service and act as the same origin as the original one. The alternative service can run on another port, on another host name, on another IP address, or over another HTTP version.

An HTTP server can inform a client about the existence of such alternatives by returning this Alt-Svc header. The header, which has an expiry time, tells the client that there’s an optional alternative to this service that is hosted on that host name, that port number using that protocol. If that client is a browser, it can connect to the alternative in the background and if that works out fine, continue to use that host for the rest of the time that alternative is said to work.

In reality, this header becomes a little similar to the DNS records SRV or URI: it points out a different route to the server than what the A/AAAA records for it say.

The Alt-Svc header came to life as an attempt to help out with HTTP/2 load balancing. With the introduction of HTTP/2, clients would suddenly use much more persistent and long-lived connections instead of the very short ones used for traditional HTTP/1 web browsing, which changed the nature of how connections are done. With Alt-Svc, a system that is about to go down can hint to the clients how to continue using the service elsewhere.

Alt-Svc: h2="backup.example.com:443"; ma=2592000;
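To make the pieces of that header concrete, here is a minimal Python sketch that splits an Alt-Svc value into protocol, host, port and lifetime. It only handles the simple quoted form shown above plus the "clear" keyword. The function name parse_alt_svc and the returned dictionary layout are made up for this illustration; this is not how curl parses the header.

```python
# RFC 7838: an alternative is fresh for 24 hours unless "ma" says otherwise.
DEFAULT_MAX_AGE = 24 * 60 * 60


def parse_alt_svc(value):
    """Parse a simple Alt-Svc header value into a list of dicts."""
    value = value.strip()
    if value.lower() == "clear":
        return []  # the origin asks the client to forget its alternatives

    entries = []
    # Naive split: good enough as long as no quoted string contains a comma.
    for alternative in value.split(","):
        parts = [p.strip() for p in alternative.split(";") if p.strip()]
        if not parts or "=" not in parts[0]:
            continue
        protocol, authority = parts[0].split("=", 1)
        authority = authority.strip().strip('"')
        host, _, port = authority.rpartition(":")
        if not port.isdigit():
            continue  # skip anything without a usable port number
        entry = {
            "protocol": protocol.strip(),
            "host": host,  # empty means "same host as the origin"
            "port": int(port),
            "max_age": DEFAULT_MAX_AGE,
        }
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "ma" and val.strip().isdigit():
                entry["max_age"] = int(val.strip())
        entries.append(entry)
    return entries
```

Fed the value from the example header, this yields one entry for protocol h2 on backup.example.com port 443 that stays valid for 2592000 seconds, which is 30 days.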

HTTP upgrades

Once that header was published, the by then already existing and deployed Google QUIC protocol switched to using the Alt-Svc header to hint clients (read “Chrome users”) that “hey, this service is also available over gQUIC”. (Prior to that, they used their own custom alternative header that basically had the same meaning.)

This is important because QUIC is not TCP. Resources on the web that are pointed out using the traditional HTTPS:// URLs still imply that you connect to them using TCP on port 443 and you negotiate TLS over that connection. Upgrading from HTTP/1 to HTTP/2 on the same connection was “easy” since they were both still TCP and TLS. All we needed then was to use the ALPN extension and voila: a nice and clean version negotiation.
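That ALPN step can be tried out from a script. The following Python sketch uses the standard ssl module to offer h2 and http/1.1 and report which one the server picked. The helper names are made up for this example, and the host to connect to is whatever you pass in.

```python
import socket
import ssl


def make_alpn_context(protocols=("h2", "http/1.1")):
    """Create a TLS client context that offers the given ALPN protocols."""
    ctx = ssl.create_default_context()
    # Protocols are listed in order of preference, most wanted first.
    ctx.set_alpn_protocols(list(protocols))
    return ctx


def negotiated_protocol(host, port=443, timeout=10.0):
    """Connect with TLS over TCP and return the ALPN protocol the server chose.

    Returns None if the server did not select any of the offered protocols.
    """
    ctx = make_alpn_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()
```

Note that nothing like this exists for going from TCP to QUIC: the negotiation above happens inside a TLS-over-TCP connection, so it can only choose between protocols that run on that same connection.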

To upgrade a client and server communication into a post-TCP protocol, the only official way to do it is to first connect using the lowest common denominator that the HTTPS URL implies: TLS over TCP. Only once the server tells the client what more there is to try can the client go on and try out the new toys.

For HTTP/3, this is the official way for HTTP servers to tell users about the availability of an HTTP/3 upgrade option.

curl

I want curl to support HTTP/3 as soon as possible and then as I’ve mentioned above, understanding Alt-Svc is a key prerequisite to have a working “bootstrap”. curl needs to support Alt-Svc. When we’re implementing support for it, we can just as well support the whole concept and other protocol versions and not just limit it to HTTP/3 purposes.

curl will only consider received Alt-Svc headers when talking HTTPS since only then can it know that it actually speaks with the right host, one that has enough authority to point to other places.

Experimental

This is the first feature and code that we merge into curl under a new concept we do for “experimental” code. It is a way for us to mark this code as: we’re not quite sure exactly how everything should work so we allow users in to test and help us smooth out the quirks but as a consequence of this we might actually change how it works, both behavior and API wise, before we make the support official.

We strongly discourage anyone from shipping code marked experimental in production. You need to explicitly enable this in the build to get the feature. (./configure --enable-alt-svc)

But at the same time we urge and encourage interested users to test it out, try how it works and bring back your feedback, criticism, praise, bug reports and help us make it work the way we’d like it to work so that we can make it land as a “normal” feature as soon as possible.

Ship

The experimental alt-svc code has been merged into curl as of commit 98441f3586 (merged March 3rd 2019) and will be present in the curl code starting in the public release 7.64.1 that is planned to ship on March 27, 2019. I don’t have any time schedule for when to remove the experimental tag but ideally it should happen within just a few release cycles.

alt-svc cache

The curl implementation of alt-svc has an in-memory cache of known alternatives. It can also both save that cache to a text file and load that file back into memory. Saving the alt-svc cache to disk allows it to survive between curl invocations and to truly work the way it was intended. The cache file stores the expiry timestamp per entry so it doesn’t matter if you try to use a stale file.

curl --alt-svc
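The idea of a cache that can be saved and reloaded without stale entries causing harm can be sketched in a few lines of Python. The class name, the tab-separated file layout and the field order below are all invented for this sketch. curl’s actual cache file format is not described here and may well differ.

```python
import time


class AltSvcCache:
    """Toy in-memory alt-svc cache with absolute expiry timestamps."""

    def __init__(self):
        # origin (scheme, host, port) -> (protocol, alt_host, alt_port, expires)
        self._entries = {}

    def add(self, origin, protocol, alt_host, alt_port, max_age, now=None):
        now = time.time() if now is None else now
        # Store an absolute timestamp so the entry means the same after a reload.
        self._entries[origin] = (protocol, alt_host, alt_port, int(now + max_age))

    def lookup(self, origin, now=None):
        now = time.time() if now is None else now
        entry = self._entries.get(origin)
        if entry is None or entry[3] <= now:
            self._entries.pop(origin, None)  # drop stale entries lazily
            return None
        return entry[:3]

    def save(self, path):
        with open(path, "w", encoding="utf-8") as fh:
            for origin, entry in self._entries.items():
                fields = [str(f) for f in (*origin, *entry)]
                fh.write("\t".join(fields) + "\n")

    @classmethod
    def load(cls, path, now=None):
        now = time.time() if now is None else now
        cache = cls()
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                fields = line.rstrip("\n").split("\t")
                if len(fields) != 7:
                    continue  # ignore lines this sketch does not understand
                scheme, host, port, proto, ahost, aport, exp = fields
                if int(exp) > now:  # expired lines are simply ignored
                    cache._entries[(scheme, host, int(port))] = (
                        proto, ahost, int(aport), int(exp))
        return cache
```

Because every line carries its own absolute expiry time, loading an old file is harmless: entries that have passed their time never make it into memory.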

Caveat: I now talk about how a feature works that I’ve just above said might change before it ships. With the curl tool you ask for alt-svc support by pointing out the alt-svc cache file to use. Or pass a “” (empty name) to make it not load or save any file. It makes curl load an existing cache from that file and at the end, also save the cache to that file.

curl has also for a long time featured fancy connection options such as --resolve and --connect-to, which both let a user control where curl connects to, and which in many cases work a little like a static poor man’s alt-svc. Learn more about those in my curl another host post.

libcurl options for alt-svc

We start out the alt-svc support for libcurl with two separate options. One sets the file name of the alt-svc cache on disk (CURLOPT_ALTSVC), and the other controls various aspects of how libcurl should behave in regards to alt-svc specifics (CURLOPT_ALTSVC_CTRL).

I’m quite sure that we will have reason to slightly adjust these when the HTTP/3 support comes closer to actually merging.

commercial curl support!

If you want commercial support, ports of curl to other operating systems or just instant help to fix your curl related problems, we’re here to help. Get in touch now! This is the premiere. This has not been offered by me or anyone else before.

I’m not sure I need to say it, but I personally have authored almost 60% of all commits in the curl source code during my more than twenty years in the project. I started the project, I’ve designed its architecture etc. There is simply no one around with my curl experience and knowledge of curl internals. You can’t buy better curl expertise.

curl has become one of the world’s most widely used software components and is the transfer engine doing a large chunk of all non-browser Internet transfers in the world today. curl has reached this level of success entirely without anyone offering commercial services around it. Still, not every company and product made out there has a team of curl experts and in this demanding time and age we know there are times when you would rather hire the right team to help you out.

We are the curl experts that can help you and your team. Contact us for all and any support questions at support@wolfssl.com.

What about the curl project?

I’m heading into this new chapter of my life and the curl project with the full knowledge that this blurs the lines between my job and my spare time even more than before. But fear not!

The curl project is free and open and will remain independent of any commercial enterprise helping out customers. I realize that my offering to deal with curl problems and solve curl issues for companies and organizations for compensation creates new challenges and questions about where the boundaries go, if nothing else for me personally. I still think this is worth pursuing and I’m sure we can figure out and handle whatever minor issues this can lead to.

My friends, the community, the users and harsh critics on Twitter will all help me stay true and honest. I know this. This should end up a plus for the curl project in general as well as for me personally. More focus, more work and more money involved in curl related activities should improve the project.

It is with great joy and excitement I take on this new step.

curl 7.64.0 – like there’s no tomorrow

I know, has it been eight weeks since the previous release already? But yes it has – I double-checked! And then as the laws of nature dictate, there has been yet another fresh curl version released out into the wild.

Numbers

  • the 179th release
  • 5 changes
  • 56 days (total: 7,628)
  • 76 bug fixes (total: 4,913)
  • 128 commits (total: 23,927)
  • 0 new public libcurl functions (total: 80)
  • 3 new curl_easy_setopt() options (total: 265)
  • 1 new curl command line option (total: 220)
  • 56 contributors, 29 new (total: 1,904)
  • 32 authors, 13 new (total: 658)
  • 3 security fixes (total: 87)

Security fixes

This release we have no fewer than three different security related fixes. I’ll describe them briefly here, but for the finer details I advise you to read the dedicated pages and documentation we’ve written for each one of them.

CVE-2018-16890 is a bug where the existing range check in the NTLM code is wrong, which allows a malicious or broken NTLM server to send a header to curl that will make it read outside a buffer and possibly crash or otherwise misbehave.

CVE-2019-3822 is related to the previous but with much worse potential effects. Another bad range check actually allows a sneaky NTLMv2 server to be able to send back crafted contents that can overflow a local stack based buffer. This is potentially in the worst case a remote code execution risk. I think this might be the worst security issue found in curl in a long time. A small comfort is that by disabling NTLM, you will avoid it until patched.

CVE-2019-3823 is a potential read out of bounds of a heap based buffer in the SMTP code. It is fairly hard to trigger and it will mostly cause a crash when it does.

Changes

  1. curl now supports Mike West’s cookie update known as draft-ietf-httpbis-cookie-alone. It basically means that cookies that are set as “secure” have to be set over HTTPS to be allowed to override a previous secure cookie. Safer cookies.
  2. The --resolve option as well as CURLOPT_RESOLVE now support specifying a wildcard as port number.
  3. libcurl can now send trailing headers in chunked uploads using the new options.
  4. curl now offers options to enable HTTP/0.9 responses. The default is still enabled, but the plan is to deprecate that and in 6 months’ time switch the default to off.
  5. curl now uses higher resolution timer accuracy on Windows.

Bug-fixes

Check out the full change log to see the whole list. Here are some of the bug fixes I consider to be most noteworthy:

  • We re-implemented the code coverage support for autotools builds due to a license problem. It turned out the previously used macro was GPLv2 licensed in an unusual way for autoconf macros.
  • We make sure --xattr never stores URLs with credentials, following the security problem reported on a related tool. Not considered a security problem since this is actually what the user asked for, but still done like this for added safety.
  • With -J, curl should not be allowed to append to the file. It could lead to curl appending to a file that was in the download directory since before.
  • --tls-max didn’t work correctly on macOS when built to use Secure Transport.
  • A couple of improvements in the libssh-powered SSH backend.
  • Adjusted the build for OpenSSL 3.0.0 (the coming future version).
  • We no longer refer to Schannel as “winssl” anywhere. winssl is dead. Long live Schannel!
  • When built with mbedTLS, ignore SIGPIPE accordingly!
  • Test cases were adjusted and verified to work fine up until February 2037.
  • We fixed several parsing errors in the URL parser, mostly related to IPv6 addresses. Regressions introduced in 7.62.0.

Next

The next release cycle will be one week shorter and we expect to ship the next release on March 27 – just immediately after curl turns 21 years old. There are already several changes in the pipe so we expect that to become 7.65.0.

We love your help and support! File bugs you experience or see, submit pull requests for the features or corrections you work on!

I’m on team wolfSSL

Let me start by saying thank you to all and everyone who sent me job offers or otherwise reached out with suggestions and interesting career moves. I received more than twenty different offers and almost every one of those was a truly good option that I could have said yes to and still pulled home a good job. What a luxury challenge to have to select something from that! Publicly announcing me leaving Mozilla turned out to be a great ego-boost.

I took some time off to really reflect and contemplate on what I wanted from my next career step. What would the right next move be?

I love working on open source. Internet protocols, transfers and doing libraries written in C are things I consider pure fun. Can I get all that and yet keep working from home, not sacrifice my wage and perhaps integrate working on curl better in my day to day job?

I talked to different companies. Very interesting companies too, where I have friends and people who like me and who really wanted to get me working for them, but in the end there was one offer with a setup that stood out. One offer for which basically all check marks in my wish-list were checked.

wolfSSL

On February 5, 2019 I’m starting my new job at wolfSSL. My short and sweet period as unemployed is over and now it’s full steam ahead again! (Some members of my family have expressed that they haven’t really noticed any difference between me having a job and me not having a job as I spend all work days the same way nevertheless: in front of my computer.)

Starting now, we offer commercial curl support and various services for and around curl that companies and organizations previously really haven’t been able to get. Time I do not spend on curl related activities for paying customers I will spend on other networking libraries in the wolfSSL “portfolio”. I’m sure I will be able to keep busy.

I’ve met Larry at wolfSSL physically many times over the years and every year at FOSDEM I’ve made certain to say hello to my wolfSSL friends in their booth they’ve had there for years. They’re truly old-time friends.

wolfSSL is mostly a US-based company – I’m the only Swede on the team and the only one based in Sweden. My new colleagues all of course know just as well as you that I’m prevented from traveling to the US. All coming physical meetings with my work mates will happen in other countries.

commercial curl support!

We offer all sorts of commercial support for curl. I’ll post separately with more details around this.

HTTP/3 talk on video

Yesterday, I attracted a large enough audience to fill up the largest presentation room GOTO 10 has, which means about one hundred interested souls.

The subject of the day was HTTP/3. The event was filmed with a mevo camera and I captured the presentation directly from my laptop as well, and I then stitched together the two sources into this final version late last night. As you’ll notice, the sound isn’t awesome and the rest of the “production” isn’t exactly top notch either, but hey, I don’t think it matters too much.

I’ll talk about HTTP/3 (Photo by Jon Åslund)
I’m Daniel Stenberg. I was handed a medal from the Swedish king in 2017 for my work on… (Photo by OpenTokix)
HTTP/2 vs HTTP/3 (Photo by OpenTokix)
Some of the challenges to deploy HTTP/3 are…. (Photo by Jonathan Sulo)

The slide set can also be viewed on slideshare.

QUIC and missing APIs

I trust you’ve heard by now that HTTP/3 is coming. It is the next destined HTTP version, targeted to get published as an RFC in July 2019. Not very far off.

HTTP/3 will not be done over TCP. It will only be performed over QUIC, which is a transport protocol replacement for TCP that always is done encrypted. There’s no clear-text version of QUIC.

TLS 1.3

The encryption in QUIC is based on TLS 1.3 technologies which I believe everyone thinks is a good idea and generally the correct decision. We need to successively raise the bar as we move forward with protocols.

However, QUIC is a transport protocol that does encryption by itself, while TLS is typically done on top of TCP (and was designed that way). On top of that, the team of engineers behind QUIC came up with a design that requires APIs from the TLS layer that the traditional TLS over TCP use case doesn’t need!

New TLS APIs

A QUIC implementation needs to extract traffic secrets from the TLS connection and it needs to be able to read/write TLS messages directly – not using the TLS record layer. TLS records are what’s used when we send TLS over TCP. (This was discussed and decided back around the time for the QUIC interim in Kista.)

These operations need APIs that are still missing in, for example, the very popular OpenSSL library, but also in other commonly used ones like GnuTLS and LibreSSL. And of course Schannel and Secure Transport.

Libraries known to already have done the job and expose the necessary mechanisms include BoringSSL, NSS, quicly, PicoTLS and Minq. All of those are incidentally TLS libraries with a more limited number of application users and less mainstream. They’re also more or less developed by people who are also actively engaged in the QUIC protocol development.

The QUIC libraries in progress now are typically using either one of the TLS libraries that already are adapted or do what ngtcp2 does: it hosts a custom-patched version of OpenSSL that brings the needed functionality.

Matt Caswell of the OpenSSL development team acknowledged this situation already back in September 2017, but so far we haven’t seen this result in updated code shipped in a released version.

curl and QUIC

curl is TLS library agnostic and can get built with around 12 different TLS libraries – one or many actually, as you can build it to allow users to select TLS backend in run-time!

OpenSSL is without competition the most popular choice to build curl with outside of the proprietary operating systems like macOS and Windows 10. But even the vendor-built and vendor-provided macOS and Windows versions are built with libraries that lack APIs for this.

With our current keen interest in QUIC and HTTP/3 support for curl, we’re about to run into an interesting TLS situation. How exactly is someone going to build curl to simultaneously support both traditional TLS based protocols as well as QUIC going forward?

I don’t have a good answer to this yet. Right now (assuming we would have the code ready on our end, which we don’t), we can’t ship QUIC or HTTP/3 support enabled for curl built to use the most popular TLS libraries! Hopefully by the time we get our code in order, the situation has improved somewhat.

This will slow down QUIC deployment

I’m personally convinced that this little API problem will be friction enough when going forward that it will slow down and hinder QUIC deployment at least initially.

When the HTTP/2 spec shipped in May 2015, it introduced a dependency on the fairly new TLS extension called ALPN. For a long time that caused headaches for server admins, since ALPN wasn’t supported in the OpenSSL versions that were typically installed and used at the time; you had to upgrade OpenSSL to version 1.0.2 to get it.

At that time, almost four years ago, OpenSSL 1.0.2 had already been released, so the fix was simply to upgrade to it. This time, the API we’re discussing here is not even in a beta version of OpenSSL and thus hasn’t been released in any version yet. That’s far worse than the HTTP/2 situation we had and that took a few years to ride out.

Will we get these APIs into an OpenSSL release to test before the QUIC specification is done? If the schedule sticks, there’s about six months left…

A curl 2018 retrospective

Another year reaches its calendar end and a new year awaits around the corner. In the curl project we’ve had another busy and eventful year. Here’s a look back at some of the fun we’ve had during 2018.

Releases

We started out the year with the 7.58.0 release in January, and we managed to squeeze in another six releases during the year. In total we count 658 documented bug-fixes and 31 changes. The total number of bug-fixes was actually slightly lower this year compared to last year’s 683. An average of 1.8 bug-fixes per day is still not too shabby.

Authors

I’m very happy to say that we again managed to break our previous record as 155 unique authors contributed code. 111 of them for the first time in the project, and 126 did fewer than three commits during the year. Basically this means we merged code from a brand new author every three days throughout the year!

The list of “contributors”, where we also include helpers, bug reporters, security researchers etc, increased by another 169 new names this year to a total of 1829 in the last release of the year. Of course we also got a lot of help from people who were already mentioned in there!

Will we be able to reach 2000 names before the end of 2019?

Commits

At the time of this writing, almost two weeks before the end of the year, we’re still behind the last few years with 1051 commits done this year. 1381 commits were done in 2017.

Daniel’s commit share

I personally authored 535 (50.9%) of all commits during 2018. Marcel Raad did 65 and Daniel Gustafsson 61. In general I maintain my general share of the changes done in the project over time. Possibly I’ve even increased it slightly the last few years. This graph shows my share of the commits layered on top of the number of commits done.
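The derived figures in this retrospective so far are simple ratios of the quoted counts. Here is a small Python sketch that recomputes them, assuming a 365-day year. The function names are only for illustration, and since the post’s numbers were extracted a few weeks before the end of the year, the results are approximations.

```python
DAYS_IN_YEAR = 365  # assumption: a plain 365-day year

# Counts quoted in the retrospective.
BUG_FIXES_2018 = 658
FIRST_TIME_AUTHORS_2018 = 111
COMMITS_2018 = 1051
DANIEL_COMMITS_2018 = 535


def bug_fixes_per_day():
    """Average number of documented bug-fixes per day."""
    return BUG_FIXES_2018 / DAYS_IN_YEAR


def days_per_new_author():
    """Average number of days between two first-time authors."""
    return DAYS_IN_YEAR / FIRST_TIME_AUTHORS_2018


def commit_share_pct(commits=DANIEL_COMMITS_2018, total=COMMITS_2018):
    """One author's share of the year's commits, in percent."""
    return 100.0 * commits / total


if __name__ == "__main__":
    print(f"bug-fixes per day:   {bug_fixes_per_day():.1f}")
    print(f"days per new author: {days_per_new_author():.1f}")
    print(f"commit share:        {commit_share_pct():.1f}%")
```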

Vulnerabilities

This year we got exactly the same number of security problems reported as we did last year: 12. Some of the problems were one-offs due to curl being added to the OSS-Fuzz project in 2018, and it has taken a while for it to really hit some of our soft spots. As we’ve seen a slow-down in reports from there, it’ll be interesting to see if 2019 will be a brighter year in this department. (In total, OSS-Fuzz is credited for having found six security vulnerabilities in curl to date.)

During the year we managed to both introduce a new bug bounty program and retract that very same program again when it shut down almost at once! 🙁

Lines of code

Counting all lines in the git repo in the src, lib and include directories, they grew nearly 6,000 lines (3.7%) during the year to 155,912. The primary code growing activities this year were:

  1. DNS-over-HTTPS support
  2. The new URL API and using that internally as well

Deprecating legacy

In July we created the DEPRECATE.md document to keep order of some things we’re stowing away in the cyberspace attic. During the year we cut off axTLS support as a first example of this deprecation procedure. HTTP pipelining, global DNS cache and HTTP/0.9 accepted by default are features next in line marked for removal, and the first two are already disabled in code.

curl up

We had our second curl conference this year, in Stockholm. It was a blast again and I’m already looking forward to curl up 2019 in Prague.

Sponsor updates

Yours truly quit Mozilla and with that we lost them as a sponsor of the curl project. We have however gotten several new backers and sponsors over the year since we joined opencollective, and can receive donations from there.

Governance

Together with a bunch of core team members I put together a two-step proposal that I posted back in October:

  1. we join an umbrella organization
  2. we create a “board” to decide over money

As the first step turned out to be a very slow operation (i.e. we’ve applied, but the process has not gone very far yet) we haven’t yet made step 2 happen either.

2019

Things that didn’t happen in 2018 but very well might happen in 2019 include:

  1. Some first HTTP/3 and QUIC code attempts in curl
  2. HSTS support? A pull request for this has been lingering for a while already.

Note: the numbers for 2018 in this post were extracted and graphs were prepared a few weeks before the actual end of year, so some of the data quite possibly changed a little bit since.