Category Archives: Technology

Really everything related to technology

The backdoor threat

— “Have you ever detected anyone trying to add a backdoor to curl?”

— “Have you ever been pressured by an organization or a person to add suspicious code to curl that you wouldn’t otherwise accept?”

— “If a crime syndicate would kidnap your family to force you to comply, what backdoor would you be able to insert into curl that is the least likely to get detected?” (The less grim version of this question would instead offer huge amounts of money.)

I’ve been asked these questions and variations of them when I’ve stood up in front of audiences around the world and talked about curl and how it is one of the most widely used software components in the world, counting way over three billion instances.

Back door (noun)
— a feature or defect of a computer system that allows surreptitious unauthorized access to data.

So how is it?

No. I’ve never seen a deliberate attempt to add a flaw, a vulnerability or a backdoor into curl. I’ve seen bad patches and I’ve seen patches that brought bugs that years later were reported as security problems, but I did not spot any deliberate attempt to do harm in any of them. But if it had been done skillfully, I certainly wouldn’t have noticed that it was deliberate, would I?

If I had cooperated in adding a backdoor, or been threatened into doing so, I wouldn’t tell you anyway and would thus answer no to questions about it.

How to be sure

There is only one way to be sure: review the code you download and intend to use. Or get it from a trusted source that did the review for you.

If you have a version you trust, you really only have to review the changes done since then.
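A minimal sketch of what that can look like in practice, assuming you work from the git repository (the tag names below are only examples of curl release tags):

$ git clone https://github.com/curl/curl.git
$ cd curl
$ git diff curl-7_54_0 curl-7_55_0 | less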

Possibly there’s some degree of safety in numbers, and as thousands of applications and systems use curl and libcurl and at least some of them do reviews and extensive testing, one of those could discover mischievous activities if there are any and report them publicly.

Infected machines or owned users

The servers that host the curl releases could be targeted by attackers and the tarballs for download could be replaced by something that carries evil code. There’s no such thing as a fail-safe machine, especially not if someone really wants to target us. The safeguard there is the GPG signature with which I sign all official releases. No malicious user can (re-)produce them. They have to be made by me (since I package the curl releases). That comes back to trusting me again. There’s of course no safeguard against me being forced to sign evil code with a knife to my throat…
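A minimal sketch of how such a signature check can be done, assuming you have already fetched and decided to trust my public key (the version number and file names here are only examples):

$ curl -O https://curl.haxx.se/download/curl-7.55.0.tar.gz
$ curl -O https://curl.haxx.se/download/curl-7.55.0.tar.gz.asc
$ gpg --verify curl-7.55.0.tar.gz.asc curl-7.55.0.tar.gz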

If one of the curl project members with git push rights got her account hacked and her SSH key password brute-forced, a very skilled hacker could possibly sneak in something, short-term. My hope though is that since we review and comment on each other’s code to a very high degree, that would be really hard. And the hacked person herself would most likely react.

Downloading from somewhere

I think the highest risk scenario is when users download pre-built curl or libcurl binaries from various places on the internet that aren’t the official curl web site. How can you know for sure what you’re getting then, when you can’t review the code or the changes made? You just put your trust in a remote person or organization to do what’s right for you.

Trusting other organizations can be totally fine, such as when you download using a Linux distro’s package management system: there you can expect a certain level of checking and vouching to have happened, and digital signatures and more are involved to minimize the risk of external malicious interference.

Pledging there’s no backdoor

Some people argue that projects could or should pledge for every release that no deliberate backdoor has been planted, so that if the day comes in the future when a three-letter secret organization forces us to insert a backdoor, the lack of such a pledge for the subsequent release would function as an alarm signal to people that something is wrong.

That takes us back to trusting a single person again. A truly evil adversary can of course force such a pledge to be uttered no matter what, even if that is probably more mafia-level evilness than mere three-letter-organization shadiness.

I would be a bit stressed out having to make that pledge every single release: if I ever forgot or messed it up, it would lead to a lot of people getting up in arms, and how would such a mistake be fixed? It’s a little too irrevocable for me. And we do quite frequent releases, so the risk of mistakes is not insignificant.

Also, if I were to pledge that, would it be a promise regarding only my own code, or a pledge for the entire code base as written by all committers? It doesn’t scale very well…

Additionally, I’m a Swede living in Sweden. The American organizations cannot legally force me to backdoor anything, and the Swedish versions of those secret organizations don’t have the legal rights to do so either (caveat: I’m not a lawyer). So, the real threat is not by legal means.

What backdoor would be likely?

It would be very hard to add code, unnoticed, that sends off data to somewhere else. Too much code that would be too obvious.

A backdoor similarly couldn’t really be made to split off data from the transfer pipe and store it locally for other systems to read, as that too would probably require too much code that is too different from the existing code, and it would be detected instantly.

No, I’m convinced the most likely backdoor code in curl is a deliberate but hard-to-detect security vulnerability that lets the attacker exploit the program using libcurl/curl through some specific usage pattern. So when triggered it can trick the program into sending off memory contents or perhaps overwrite the local stack or the heap. Quite possibly it would be only one step out of several necessary for a successful attack, much like how a single-byte overwrite can lead to root access.

Any past security problems on purpose?

We’ve had almost 70 security vulnerabilities reported through the project’s almost twenty years of existence. Since most of them were triggered by mistakes in code I wrote myself, I can be certain that none of those problems were introduced on purpose. I can’t completely rule out that someone else’s patch that modified curl along the way, and by extension maybe made a vulnerability worse or easier to trigger, could have been made on purpose. But none of the security problems that were introduced by others have shown any sign of “deliberateness”. (Or they were written cleverly enough that I didn’t see it!)

Maybe backdoors have been planted that we just haven’t discovered yet?

Discussion

Follow-up discussion/comments on hacker news.

keep finding old security problems

I decided to look closer at security problems and the age of the reported issues in the curl project.

One theory I had when I started to collect this data was that we actually get security problems reported earlier and earlier over time. That bugs would be around in public releases for shorter periods of time nowadays than they were in the past.

My thinking would go like this: Logically, bugs that have been around for a long time have had a long time to get caught. The more eyes we’ve had on the code, the fewer old bugs should be left and going forward we should more often catch more recently added bugs.

The time from a bug’s introduction into the code until the day we get a security report about it, should logically decrease over time.

What if it doesn’t?

First, let’s take a look at the data at hand. In the curl project we have so far reported in total 68 security problems over the project’s life time. The first 4 were not recorded correctly so I’ll discard them from my data here, leaving 64 issues to check out.

The graph below shows the time distribution. The all time leader so far is the issue reported to us on March 10 this year (2017), which was present in the code since the version 6.5 release done on March 13 2000. 6,206 days, just three days away from 17 whole years.

There are no less than twelve additional issues that lingered for more than 5,000 days until reported. Only 20 (31%) of the reported issues had been public for less than 1,000 days. The fastest one was reported on the release day itself: 0 days.

The median time from release to report is a whopping 2541 days.

When we receive a report about a security problem, we want the issue fixed, responsibly announced to the world, and a new release shipped where the problem is gone. The median time to go through this procedure is 26.5 days, and the distribution looks like this:

What stands out here is the TLS session resumption bypass, which happened because we struggled with understanding it and how to address it properly. Otherwise the numbers look all reasonable to me as we typically do releases at least once every 8 weeks. We rarely ship a release with a known security issue outstanding.

Why are very old issues still found?

I think it is partly because the tools that aid people in finding things are gradually improving, so things get found these days that simply weren’t found very often before. With new tools we can find problems that have been around for a long time.

Every year, the oldest parts of the code become one year older. So the older the project gets, the older the bugs that can be found, while in the early days only a small share of the code (if any at all) was really old.

What if we instead count age as a percentage of the project’s life time? Using this formula, a bug found at day 100 that was added at day 50 would be 50% but if it was added at day 80 it would be 20%. Maybe this would show a graph where the bars are shrinking over time?

But no. In fact it shows 17 (27%) of them having been present during 80% or more of the project’s life time! The median issue had been in there during 49% of the project’s life time!

It does however make another issue the worst offender, as one of the issues had been around during 91% of the project’s life time.

This counts on March 20 1998 being the birth day. Of course we got no reports the first few years since we basically had no users then!

Specific or generic?

Is this pattern something that is specific to the curl project or can we find it in other projects too? I don’t know. I have not seen this kind of data being presented by others, and I don’t have the same insight into the details of other projects with a large enough number of issues to be interesting.

What can we do to make the bars shrink?

Well, if there are old bugs left to find they won’t shrink, because for every such old security issue that’s still left there will be a tall bar. Hopefully though, by doing more tests, using more tools regularly (fuzzers, analyzers etc) and with more eyeballs on the code, we should iron out our security issues over time. Logically that should lead to a project where newly added security problems are detected sooner rather than later. We just don’t seem to be at that point yet…

Caveat

One fact that skews the numbers is that we are much more likely to record issues as security related these days. A decade ago when we got a report about a segfault or something we would often just consider it bad code and fix it, and neither us maintainers nor the reporter would think much about the potential security impact.

These days we’re at the other end of the spectrum, where people are much quicker to suspect or conclude that something is a security issue. Today people report bugs as security issues to a much higher degree than they did in the past. This is basically a good thing though, even if it makes it harder to draw conclusions over time.

Data sources

When you want to repeat the above graphs and verify my numbers:

  • vuln.pm – from the curl web site repository holds security issue meta data
  • releaselog – on the curl web site offers release meta data, even as a CSV download on the bottom of the page
  • report2release.pl – the perl script I used to calculate the report-to-release periods.
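For example, once you have extracted a plain list of day counts from that data (one number per line, in a hypothetical file called days.txt), the median can be computed with a small shell pipeline:

# days.txt is hypothetical: one day-count per line, extracted from the data above
sort -n days.txt | awk '{ a[NR] = $1 }
  END { if (NR % 2) print a[(NR+1)/2]; else print (a[NR/2] + a[NR/2+1]) / 2 }'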

“OPTIONS *” with curl

(Note: this blog post has been updated as the command line option changed after first publication, based on comments to this very post!)

curl is arguably a “Swiss army knife” of HTTP fiddling. It is one of the tools in the toolbox, with a large set of switches and options that allow us to tweak and modify our HTTP requests to really test, debug and torture our HTTP servers and services.

That’s the way we like it.

In curl 7.55.0 it will take yet another step into this territory when we finally introduce a way for users to send “OPTIONS *” and similar requests to servers. It has been requested occasionally by users over the years but now the waiting is over. (brought by this commit)

“OPTIONS *” is special and peculiar just because it is one of the few specified requests you can make to an HTTP server where the path part doesn’t start with a slash. Thus you cannot really end up with this based on a URL, and as you know curl is pretty much all about URLs.

The OPTIONS method was introduced in HTTP 1.1 already back in RFC 2068, published in January 1997 (even before curl was born). With curl you’ve always been able to send an OPTIONS request with the -X option, you just were never able to send that single asterisk instead of a path.

In curl 7.55.0 and later versions, you can remove the initial slash from the path part that ends up in the request by using --request-target. So to send an “OPTIONS *” to example.com for http and https URLs, you could do it like:

$ curl --request-target "*" -X OPTIONS http://example.com
$ curl --request-target "*" -X OPTIONS https://example.com/

In classical curl-style this also opens up the opportunity for you to issue completely illegal or otherwise nonsensical paths to your server to see what it does on them, to send totally weird options to OPTIONS and similar games:

$ curl --request-target "*never*" -X OPTIONS http://example.com

$ curl --request-target "allpasswords" http://example.com

Enjoy!

curling over HTTP proxy

Starting in curl 7.55.0 (this commit), curl will no longer try to ask HTTP proxies to perform non-HTTP transfers with GET, except for FTP. For all other protocols, curl now assumes you want to tunnel through the HTTP proxy when you use such a proxy and protocol combination.

Protocols and proxies

curl supports 23 different protocols right now, if we count the S-versions (the TLS based alternatives) as separate protocols.

curl also currently supports seven different proxy types that can be set independently of the protocol.

One type of proxy that curl supports is a so called “HTTP proxy”. The official HTTP standard includes a defined way to speak to such a proxy and ask it to perform the request on behalf of the client. curl supports doing that over either HTTP/1.1 or HTTP/1.0, where you’d typically only use the latter version if the first really doesn’t work with your ancient proxy.
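As a quick sketch with a made-up proxy host and port: the first command talks to the proxy over HTTP/1.1 using -x, the second asks curl to use HTTP/1.0 towards the proxy with --proxy1.0:

$ curl -x http://proxy.example.com:3128 http://curl.haxx.se/
$ curl --proxy1.0 proxy.example.com:3128 http://curl.haxx.se/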

HTTP proxy

All that is fine and good. But HTTP proxies were really only defined to handle HTTP, and to some extent HTTPS. When doing plain HTTP transfers over a proxy, the client will send its request to the proxy like this:

GET http://curl.haxx.se/ HTTP/1.1
Host: curl.haxx.se
Accept: */*
User-Agent: curl/7.55.0

… but for HTTPS, which should provide end to end encryption, a client needs to ask the proxy to instead tunnel through the proxy so that it can do TLS all the way, without any middle man, to the server:

CONNECT curl.haxx.se:443 HTTP/1.1
Host: curl.haxx.se:443
User-Agent: curl/7.55.0

When successful, the proxy responds with a “200” which means that the proxy has established a TCP connection to the remote server the client asked it to connect to, and the client can then proceed and do the TLS handshake with that server. When the TLS handshake is completed, a regular GET request is then sent over that established and secure TLS “tunnel” to the server. A GET request that then looks like one that is sent without proxy:

GET / HTTP/1.1
Host: curl.haxx.se
User-Agent: curl/7.55.0
Accept: */*

FTP over HTTP proxy

Things get more complicated when trying to perform transfers over the HTTP proxy using schemes that aren’t HTTP. As already described above, HTTP proxies are basically designed only for doing HTTP over them, but as they have this concept of tunneling through to the remote server it doesn’t have to be limited to just HTTP.

Also, historically, for decades people have deployed HTTP proxies that recognize FTP URLs and transparently handle them for the client, so the client can almost believe it is doing HTTP while the proxy speaks FTP to the remote server at the other end and converts the response back to HTTP for the client. On such proxies (Squid and Apache both support this mode for example), this sort of request is possible:

GET ftp://ftp.funet.fi/ HTTP/1.1
Host: ftp.funet.fi
User-Agent: curl/7.55.0
Accept: */*

curl knows this, and if you ask curl for FTP over an HTTP proxy it will assume you have one of these proxies. It should be noted that this method of course limits what you can do FTP-wise; for example, FTP upload usually doesn’t work, and if you ask curl to do an FTP upload over an HTTP proxy it will do that with an HTTP PUT.
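On the command line that could look like this (the proxy host is made up; the FTP site is the one from the example above):

$ curl -x http://proxy.example.com:3128 ftp://ftp.funet.fi/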

HTTP proxy tunnel

curl features an option (--proxytunnel) that lets the user forcibly tell the client to not assume that the proxy speaks this protocol, and instead use the CONNECT method to establish a tunnel through the proxy to the remote server.

It should of course be noted that very few deployed HTTP proxies in the wild allow clients to CONNECT to whatever port they like. HTTP proxies tend to only allow connecting to port 443, as that is the official HTTPS port, and if you ask for another port they will respond with a 4xx response code and refuse to comply.
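-p is the short version of --proxytunnel. A sketch of forcing a tunnel for a plain HTTP transfer, again with a made-up proxy host:

$ curl -p -x http://proxy.example.com:3128 http://curl.haxx.se/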

Not HTTP not FTP over HTTP proxy

So HTTP, HTTPS and FTP are sent over the HTTP proxy just fine. That leaves us with nineteen more protocols. What happens with them when you ask curl to perform them over an HTTP proxy?

Now we have finally reached the change that has just been merged in curl and changes what curl does.

Before 7.55.0

curl would send all protocols as a regular GET to the proxy if asked to use an HTTP proxy without the explicit proxy-tunnel option. This came from how FTP was done and grew from there without many people questioning it. Of course it wouldn’t ever work, but also very few people would actually attempt it because of that.

From 7.55.0

All protocols that aren’t HTTP, HTTPS or FTP will enable the tunnel-through mode automatically when an HTTP proxy is used. No more sending funny GET requests to proxies when they won’t work anyway. It will also prevent users from accidentally leaking credentials intended for the server to the proxy, which previously could happen if you omitted the tunnel option with a few authentication setups.
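A sketch of what that means in practice, with made-up host names and placeholder credentials: asking for an IMAPS mailbox through an HTTP proxy now makes curl issue a CONNECT automatically, no extra flag needed:

$ curl -u user:password -x http://proxy.example.com:3128 imaps://imap.example.com/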

HTTP/2 proxy

Sorry, curl doesn’t support that yet. Patches welcome!

HTTP Workshop s03e02

(Season three, episode two)

Previously, on the HTTP Workshop. Yesterday ended with a much appreciated group dinner and now we’re back energized and eager to continue blabbing about HTTP frames, headers and similar things.

Martin from Mozilla talked on “connection management is hard“. Parts of the discussion were around the HTTP/2 connection coalescing that I’ve blogged about before. The ORIGIN frame is a draft for a suggested way for servers to more clearly announce which origins they can answer for on that connection, which should reduce how often 421 responses are needed. The ORIGIN frame overrides DNS and will allow coalescing even for origins that don’t otherwise resolve to the same IP addresses. We also touched on the Alt-Svc header, a suggested CERTIFICATE frame, and how an HTTP/2 server knows which origins it can do PUSH for.

A lot of positive words were expressed about the ORIGIN frame. Wildcard support?

Willy from HA-proxy talked about his Memory and CPU efficient HPACK decoding algorithm. Personally, I think the award for the best slides of the day goes to Willy’s hand-drawn notes.

Lucas from BBC talked about usage data for iPlayer, how much data and how many requests they serve, and how their largest share of users are “non-browsers”. Lucas mentioned their work on writing a libcurl adaptation to make gstreamer use it instead of libsoup. Lucas’ talk triggered a lengthy discussion on what needs there are and how (if at all) you can divide clients into browsers and non-browsers.

Wenbo from Google spoke about Websockets and showed usage data from Chrome. The median websockets connection time is 20 seconds and roughly 10% are shorter than 0.5 seconds. At the 97th percentile they live over an hour. The connection success rates for Websockets are depressingly low when done in the clear, while the situation is better when done over HTTPS. For some reason the success rate on Mac seems to be extra low, and Firefox telemetry seems to agree. Websockets over HTTP/2 (or not) is an old hot topic that brought us back to reiterate issues we’ve debated a lot before. This time we also got a lovely and long side track into web push and how that works.

Roy talked about Waka, an HTTP replacement protocol idea and concept that Roy has been carrying around for a long time (he started this in 2001) and on which he is now coming back to do actual work. A big part of the discussion focused on the wakli compression ideas, what the idea is and how it could be done and evaluated. Also, Roy is not a fan of content negotiation and wants it done differently, so he’s addressing that in Waka.

Vlad talked about his suggestion for how to do cross-stream compression in HTTP/2 to significantly enhance the compression ratio when, for example, switching to many small resources over h2 compared to a single huge resource over h1. The security aspect of this feature is what caught most of the attention and drove the discussion that followed. How can we make sure this doesn’t leak sensitive information? What protocol mechanisms exist or can we invent to help make this work in a way that is safer (by default)?

Trailers. This is again a favorite topic that we’ve discussed before and that has resurfaced. There are people around the table who’d like to see trailers supported, and we discussed the same topic in the HTTP Workshop in 2016 as well. The corresponding issue on trailers filed in the fetch github repo shows a lot of the concerns.

Julian brought up the subject of “7230bis” – when and how do we start the work. What do we want from such a revision? Fixing the bugs seems like the primary focus. “10 years is too long until update”.

Kazuho talked about “HTTP/2 attack mitigation” and how to handle clients doing many parallel slow POST requests to a CDN and them having an origin server behind that runs a new separate process for each upload.

And with this, the day and the workshop 2017 was over. Thanks to Facebook for hosting us. Thanks to the members of the program committee for driving this event nicely! I had a great time. The topics, the discussions and the people – awesome!

HTTP Workshop – London edition. First day.

The HTTP workshop series is back for a third time this northern hemisphere summer. The selected location for the 2017 version is London and this time we’re down to a two-day event (we seem to remove a day every year)…

Nothing in this blog entry is a quote to be attributed to a specific individual; these are my interpretations and paraphrasings of things said or presented. Any mistakes or errors are all mine.

At 9:30 this clear Monday morning, 35 persons sat down around a huge table in a room in the Facebook offices. Most of us are the same familiar faces that have already participated in one or two HTTP workshops, but we also have a set of people this year who haven’t attended before. Getting fresh blood into these discussions is certainly valuable. Most major players are represented, including Mozilla, Google, Facebook, Apple, Cloudflare, Fastly, Akamai, HA-proxy, Squid, Varnish, BBC, Adobe and curl!

Mark (independent, co-chair of the HTTP working group as well as the QUIC working group) kicked it all off with a presentation on quic and where it is right now in terms of standardization and progress. The upcoming draft-04 is becoming the first implementation draft even though the goal for interop is set basically at handshake and some very basic data interaction. The quic transport protocol is still in a huge flux and things have not settled enough for it to be interoperable right now to a very high level.

Jana from Google presented on quic deployment over time and how it right now uses about 7% of internet traffic. The Android Youtube app’s switch to QUIC last year showed a huge bump in usage numbers. Quic is a lot about reducing latency and numbers show that users really do get a reduction. By that nature, it improves the situation best for those who currently have the worst connections.

It doesn’t solve first world problems, this solves third world connection issues.

The currently observed 2x CPU usage increase for QUIC connections as compared to h2+TLS is mostly blamed on the Linux kernel, which apparently is not at all as up for this job as it should be. Things have clearly been more optimized for TCP over the years, leaving room for improvement in the UDP areas going forward. “Making kernel bypassing an interesting choice”.

Alan from Facebook talked header compression for quic and presented data, graphs and numbers on how HPACK(-for-quic), QPACK and QCRAM compare when used for quic in different networking conditions and scenarios. Those are the three current header compression alternatives that are open for quic, and Alan first explained the basics behind them and then how they compare when run in his simulator. The current HPACK version (adapted to quic) seems to be out of the question for head-of-line-blocking reasons; the QCRAM suggestion seems to run well but has two main flaws, as it requires an awkward layering violation and an annoying possible reframing requirement on resends. Clearly some more experiments can be done, possibly with a hybrid where some QCRAM ideas are brought into QPACK. Alan hopes to get his simulator open sourced in the coming months, which would then allow more people to experiment and reproduce his numbers.

Hooman from Fastly talked about problems and challenges with HTTP/2 server push, the 103 early hints HTTP response and cache digests. This took the discussions on push into the weeds and into the dark protocol corners we’ve been in before, and all sorts of ideas and suggestions were brought up. Some of them have been discussed before without having been resolved yet and some ideas were new, at least to me. The general consensus seems to be that push is fairly complicated and there are a lot of corner cases and murky areas that haven’t been clearly documented, but it is a feature that is now being used and for the CDN use case it can help with a lot more than “just an RTT”. But is perhaps the 103 response good enough for most of the cases?

The discussion on server push and how well it fares is something the QUIC working group is interested in, since the question was asked already this morning if a first version of quic could be considered to be made without push support. The jury is still out on that I think.

ekr from Mozilla spoke about TLS 1.3, 0-RTT, what the TLS 1.3 handshake looks like and how applications and servers can take advantage of the new 0-RTT and “0.5-RTT” features. TLS 1.3 has already passed WGLC and there are now “only” a few issues pending to get solved. Taking advantage of 0-RTT in an HTTP world opens up interesting questions and issues as HTTP request resends and retries are becoming increasingly prevalent.

Next: day two.

A curl delivery network

I’ve run my own public web sites on hardware I’ve administered myself for over twenty years now. I’ve hosted the curl web site myself since its inception.

The curl web site at curl.haxx.se has recently been delivering roughly 1.5 terabytes of data to the world per month. The CA bundle we convert to PEM from the Mozilla source code is alone downloaded more than 100,000 times per day. Occasional blog entries I’ve posted here have climbed very fast on popular sites such as Hacker News and Reddit, and have resulted in intense visitor storms hitting this same server – sometimes reaching visitor counts above 200,000 “uniques” – most of them within the first few hours of publication. At times, those visitor spikes have effectively brought the server to its knees.

Yes, my personal web site and the curl web site are both sharing the same physical server. It also hosts more than a dozen other sites and numerous services for our own pleasures and fun, providing services for a handful of different open source projects. So when the server has to cease doing work because it runs out of memory or hits other resource restraints, that causes interruptions all over. Oh yes, and my email doesn’t reach me.

Inconvenient and annoying.

The server

Haxx owns and runs this co-located server that we have a busload of web servers on – for the good of the projects and people that run things on it. This machine’s worst bottleneck is available RAM, and perhaps I/O performance. Every time the server slows down to a crawl due to network traffic overload we discuss how we should upgrade it. Installing a new machine and transferring over all the sites and services is work. Work that none of us at Haxx are very happy to volunteer to do. So it hasn’t been done yet, and frankly the server handles the daily load just fine and without even a blink. Which is ninety nine point something percent of the time…

Haxx pays for a certain amount of network traffic so as long as we’re below some threshold we remain paying the same monthly fee. We don’t want to increase the traffic by magnitudes as that would cost more.

The specific machine, which sits deep inside a server room in Stockholm, Sweden, is a five(?) year old Dell PowerEdge E310 with an Intel Xeon X3440 2.53GHz and 8GB of RAM. This model is shown in the image at the top.

Alternatives that haven’t helped

Why not a mirror system? We had a fair amount of curl site mirrors a few years ago, but it never worked well because they were always less reliable than the main site and they often turned stale and out of sync with the master site which eventually just hurt users.

They also trick visitors into bookmarking or otherwise going back to the mirror site instead of the real one, and there were always the annoying people who couldn’t resist filling the mirror with ads and stuff. Plus, they didn’t help much with the storms to the main site.

Why not a cloud server? Because with the amount of services, servers and various things we do on our server, it would be inconvenient and expensive. But perhaps even more because we started out like this so we have invested time and energy into the infrastructure as it works right now. And I enjoy rowing my own boat!

The CDN

Fastly reached out and graciously offered to help us handle the load. Both on account of the traffic amounts and to save our machine from struggling this hard the next time I write something that tickles people’s curiosity (or rage) to the level where several thousand visitors want to read the same article at the same time.

Starting now, the curl.haxx.se and the daniel.haxx.se web sites are fronted by Fastly. It should give web site visitors from all over the world faster response times and it will make the site more reliable and less likely to have problems due to traffic load going forward.

In case you’re not familiar with what a CDN is, a simplified explanation would say it is a globally distributed network of reverse proxy servers deployed in multiple data centers. These CDN servers front the Internet and will to the largest extent possible serve the visitors with the right content directly from their own caches instead of them reaching the actual lowly backend server I run that hosts the original content. Fastly has lots of servers across the globe for this purpose. Users who are a long way away from Sweden will probably be the ones who will notice this change the most, as you may suddenly find haxx.se content much closer (network round-trip wise) than before.
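If you are curious whether a page was served from a CDN cache, checking the response headers is a quick way to peek. The exact header names vary between CDNs; the grep pattern below is just a guess at headers a Fastly-fronted site typically adds:

$ curl -sI https://curl.haxx.se/ | grep -i -E '^(x-served-by|x-cache|age):'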

Standards

These new servers will host the sites over HTTPS just like before, and they will require TLS 1.2 and SNI. They will work over IPv6 and support HTTP/2.  Network standard wise, there shouldn’t be any step down – and honestly, I haven’t exactly been on the cutting edge of these technologies myself for these sites in the past.

Editing the site

We will keep editing and maintaining the site like before. It is made up of an old system with templates and include files that generate mostly static web pages. The site is mostly available on github and using that, you can build a local version for development and trying out changes before they land.

Hopefully, this move to Fastly will only make the site faster and more reliable. If you notice any glitches or experience any problems with the site, please let us know!

Fewer mallocs in curl

Today I landed yet another small change to libcurl internals that further reduces the number of small mallocs we do. This time the generic linked list functions got converted to become malloc-less (the way linked list functions should behave, really).

Instrument mallocs

I started out my quest a few weeks ago by instrumenting our memory allocations. This is easy since we have had our own memory debug and logging system in curl for many years. Using a debug build of curl, I run this script in my build dir:

#!/bin/sh
export CURL_MEMDEBUG=$HOME/tmp/curlmem.log
./src/curl http://localhost
./tests/memanalyze.pl -v $HOME/tmp/curlmem.log

For curl 7.53.1, this counted about 115 memory allocations. Is that many or a few?

The memory log is very basic. To give you an idea what it looks like, here’s an example snippet:

MEM getinfo.c:70 free((nil))
MEM getinfo.c:73 free((nil))
MEM url.c:294 free((nil))
MEM url.c:297 strdup(0x559e7150d616) (24) = 0x559e73760f98
MEM url.c:294 free((nil))
MEM url.c:297 strdup(0x559e7150d62e) (22) = 0x559e73760fc8
MEM multi.c:302 calloc(1,480) = 0x559e73760ff8
MEM hash.c:75 malloc(224) = 0x559e737611f8
MEM hash.c:75 malloc(29152) = 0x559e737a2bc8
MEM hash.c:75 malloc(3104) = 0x559e737a9dc8

Check the log

I then studied the log closer and I realized that there were many small memory allocations done from the same code lines. We clearly had some rather silly code patterns where we would allocate a struct and then add that struct to a linked list or a hash and that code would then subsequently add yet another small struct and similar – and then often do that in a loop.  (I say we here to avoid blaming anyone, but of course I myself am to blame for most of this…)

Those two allocations would always happen in pairs and they would be freed at the same time. I decided to address those. Doing very small allocations (less than, say, 32 bytes) is also wasteful simply because of the proportionally large amount of data needed just to keep track of that tiny memory area (within the malloc system). Not to mention fragmentation of the heap.

So, fixing the hash code and the linked list code to not use mallocs were immediate and easy ways to remove over 20% of the mallocs for a plain and simple ‘curl http://localhost’ transfer.

At this point I sorted all allocations based on size and checked all the smallest ones. One that stood out was one we made in curl_multi_wait(), a function that is called over and over in a typical curl transfer main loop. I converted it over to use the stack for most typical use cases. Avoiding mallocs in very repeatedly called functions is a good thing.

Recount

Today, the script from above shows that the same “curl localhost” command is down to 80 allocations from the 115 curl 7.53.1 used. Without sacrificing anything really. An easy 26% improvement. Not bad at all!

But okay, since I modified curl_multi_wait() I wanted to also see how it actually improves things for a slightly more advanced transfer. I took the multi-double.c example code, added the call to initiate the memory logging, made it use curl_multi_wait() and had it download these two URLs in parallel:

http://www.example.com/
http://localhost/512M

The second one being just 512 megabytes of zeroes and the first being a 600-byte-something public HTML page. Here’s the count-malloc.c code.

First, I brought out 7.53.1 and built the example against that and had the memanalyze script check it:

Mallocs: 33901
Reallocs: 5
Callocs: 24
Strdups: 31
Wcsdups: 0
Frees: 33956
Allocations: 33961
Maximum allocated: 160385

Okay, so it used 160KB of memory in total and it did over 33,900 allocations. But ok, it downloaded over 512 megabytes of data so it makes one malloc per 15KB of data. Good or bad?

Back to git master, the version we call 7.54.1-DEV right now – since we’re not quite sure which version number it’ll become when we release the next release. It can become 7.54.1 or 7.55.0, it has not been determined yet. But I digress, I ran the same modified multi-double.c example again, ran memanalyze on the memory log again and it now reported…

Mallocs: 69
Reallocs: 5
Callocs: 24
Strdups: 31
Wcsdups: 0
Frees: 124
Allocations: 129
Maximum allocated: 153247

I had to look twice. Did I do something wrong? I better run it again just to double-check. The results are the same no matter how many times I run it…

33,961 vs 129

curl_multi_wait() is called a lot of times in a typical transfer, and it had at least one of the memory allocations we normally did during a transfer so removing that single tiny allocation had a pretty dramatic impact on the counter. A normal transfer also moves things in and out of linked lists and hashes a bit, but they too are mostly malloc-less now. Simply put: the remaining allocations are not done in the transfer loop so they’re way less important.

The old curl did 263 times the number of allocations the current does for this example. Or the other way around: the new one does 0.37% the number of allocations the old one did…

As an added bonus, the new one also allocates less memory in total as it decreased that amount by 7KB (4.3%).

Are mallocs important?

In the day and age with many gigabytes of RAM and all, does a few mallocs in a transfer really make a notable difference for mere mortals? What is the impact of 33,832 extra mallocs done for 512MB of data?

To measure what impact these changes have, I decided to compare HTTP transfers from localhost and see if we can see any speed difference. localhost is fine for this test since there’s no network speed limit, but the faster curl is, the faster the download will be. The server side will be equally fast/slow since I’ll use the same setup for both tests.

I built curl 7.53.1 and curl 7.54.1-DEV identically and ran this command line:

curl http://localhost/80GB -o /dev/null

80 gigabytes downloaded as fast as possible written into the void.

The exact numbers I got for this may not be totally interesting, as they will depend on the CPU in the machine, which HTTP server serves the file, the optimization level when I build curl, etc. But the relative numbers should still be highly relevant. The old code vs the new.

7.54.1-DEV repeatedly performed 30% faster! The 2200MB/sec in my build of the earlier release increased to over 2900 MB/sec with the current version.

The point here is of course not that it easily can transfer HTTP at over 20 Gigabit/sec using a single core on my machine – since there are very few users who actually do such speedy transfers with curl. The point is rather that curl now uses less CPU per byte transferred, which leaves more CPU over to the rest of the system to perform whatever it needs to do. Or to save battery if the device is a portable one.

On the cost of malloc: The 512MB test I did resulted in 33832 more allocations using the old code. The old code transferred HTTP at a rate of about 2200MB/sec. That equals 145,827 mallocs/second – that are now removed! A 600 MB/sec improvement means that curl managed to transfer 4300 bytes extra for each malloc it didn’t do, each second.

Was removing these mallocs hard?

Not at all, it was all straightforward. It is however interesting that there’s still room for changes like this in a project this old. I’ve had this idea for some years and I’m glad I finally took the time to make it happen. Thanks to our test suite I could do this level of “drastic” internal change with a fairly high degree of confidence that I don’t introduce too terrible regressions. Thanks to our APIs being good at hiding internals, this change could be done completely without changing anything for old or new applications.

(Yeah I haven’t shipped the entire change in a release yet so there’s of course a risk that I’ll have to regret my “this was easy” statement…)

Caveats on the numbers

There have been 213 commits in the curl git repo from 7.53.1 till today. There’s a chance one or more other commits than just the pure alloc changes have made a performance impact, even if I can’t think of any.

More?

Are there more “low hanging fruits” to pick here in the similar vein?

Perhaps. We don’t do a lot of performance measurements or comparisons so who knows, we might do more silly things that we could stop doing and do even better. One thing I’ve always wanted to do, but never got around to, was to add daily “monitoring” of memory/mallocs used and how fast curl performs in order to better track when we unknowingly regress in these areas.

Addendum, April 23rd

(Follow-up on some comments on this article that I’ve read on hacker news, Reddit and elsewhere.)

Someone asked and I ran the 80GB download again with ‘time’. Three times each with the old and the new code, and the “middle” run of them showed these timings:

Old code:

real    0m36.705s
user    0m20.176s
sys     0m16.072s

New code:

real    0m29.032s
user    0m12.196s
sys     0m12.820s

The server that hosts this 80GB file is a standard Apache 2.4.25, and the 80GB file is stored on an SSD. The CPU in my machine is a core-i7 3770K 3.50GHz.

Someone also mentioned alloca() as a solution for one of the patches, but alloca() is not portable enough to work as the sole solution, meaning we would have to do ugly #ifdefs if we wanted to use alloca() there.

curl bug bounty

The curl project is a project driven by volunteers with no financing at all except for a few sponsors who pay for the server hosting and for contributors to work on features and bug fixes on work hours. curl and libcurl are used widely by companies and commercial software so a fair amount of work is done by people during paid work hours.

This said, we don’t have any money in the project. Nada. Zilch. We can’t pay bug bounties or hire people to do specific things for us. We can only ask people or companies to volunteer things or services for us.

This is not a complaint – far from it. It works really well and we have a good stream of contributions, bugs reports and more. We are fortunate enough to make widely used software which gives our project a certain impact in the world.

Bug bounty!

Hacker One coordinates a bug bounty program for flaws that affect “the Internet”, and based on previously paid out bounties, serious flaws in libcurl match that description and can be deemed worthy of bounties. For example, 3000 USD was paid for libcurl: URL request injection (the curl advisory for that flaw) and 1000 USD was paid for libcurl duphandle read out of bounds (the corresponding curl advisory).

I think more flaws in libcurl could’ve met the criteria, but I suspect more people than me haven’t been aware of this possibility for bounties.

I was glad to find out that this bounty program pays out money for libcurl issues and I hope it will motivate people to take an extra look into the inner workings of libcurl and help us improve.

What qualifies?

The bounty program is run and administered completely outside the control or insight of the curl project itself, and I must underscore that while libcurl issues can qualify, the emphasis is on fixing vulnerabilities in Internet software that have a potentially big impact.

To qualify for this bounty, vulnerabilities must meet the following criteria:

  • Be implementation agnostic: the vulnerability is present in implementations from multiple vendors or a vendor with dominant market share. Do not send vulnerabilities that only impact a single website, product, or project.
  • Be open source: finding manifests itself in at least one popular open source project.

In addition, vulnerabilities should meet most of the following criteria:

  • Be widespread: vulnerability manifests itself across a wide range of products, or impacts a large number of end users.
  • Have critical impact: vulnerability has extreme negative consequences for the general public.
  • Be novel: vulnerability is new or unusual in an interesting way.

If your libcurl security flaw matches this, go ahead and submit your request for a bounty. If you’re at a company using libcurl at scale, consider joining that program as a bounty sponsor!

Talk: web transport, today and tomorrow

At the Netnod spring meeting 2017 in Stockholm on the 5th of April I did a talk with the title of this post.

Why was HTTP/2 introduced, how well has HTTP/2 been deployed and used, did it deliver on its promises, where doesn’t HTTP/2 perform as well. Then a quick (haha) overview on what QUIC is and how it intends to fix some of the shortcomings of HTTP/2 and TCP. In 28 minutes.