bug-bounty reward amounts in curl

A while ago I tweeted the good news that we’ve handed over our largest single monetary reward yet in the curl bug-bounty program: 700 USD. We announced this security problem in association with the curl 7.71.0 release the other day.

Someone responded to me and wanted this clarified: we award 700 USD to someone for reporting a curl bug that potentially affects users on virtually every computer system out there – while Apple just days earlier awarded a researcher 100,000 USD for an Apple-specific security flaw.

The difference in “amplitude” is notable.

A bug-bounty

I think first we should start with appreciating that we have a bug-bounty program at all! Most open source projects don’t, and we didn’t have any program like this for the first twenty or so years. Our program is just getting started and we’re getting up to speed.

Donations only

How can we in the curl project hand out any money at all? We get donations from companies and individuals. This is the only source of funds we have. We can only give away rewards if we have enough donations in our fund.

When we started the bug-bounty, we also rather recently had started to get donations (to our Open Collective fund) and we were careful to not promise higher amounts than we would be able to pay, as we couldn’t be sure how many problems people would report and exactly how it would take off.

The more donations the larger the rewards

Over time it has gradually become clear that we’re getting donations at a level and frequency that far surpasses what we’re handing out as bug-bounty rewards. As a direct result of that, we’ve agreed in the the curl security team to increase the amounts.

For all security reports we get now that end up in a confirmed security advisory, we will increase the handed out award amount – until we reach a level we feel we can be proud of and stand for. I think that level should be more than 1,000 USD even for the lowest graded issues – and maybe ten times that amount for an issue graded “high”. We will however never get even within a few magnitudes of what the giants can offer.

Accumulated curl bug-bounty payouts to date. A so called hockey stick graph.

Are we improving security-wise?

The graph with number of reported CVEs per year shows that we started to get a serious number of reports in 2013 (5 reports) and it also seems to show that we’ve passed the peak. I’m not sure we have enough enough data and evidence to back this up, but I’m convinced we do a lot of things much better in the project now that should help to keep the amount of reports down going forward. In a few years when we look back we can see if I was right.

We’re at mid year 2020 now with only two reports so far, which if we keep this rate will make this the best CVE-year after 2012. This, while we offer more money than ever for reported issues and we have a larger amount of code than ever to find problems in.

Number of CVEs reported for curl distributed over the year of the announcement

The companies surf along

One company suggests that they will chip in and pay for an increased curl bug bounty if the problem affects their use case, but for some reason the problems just never seem to affect them and I’ve pretty much stopped bothering to even ask them.

curl is shipped with a large number of operating systems and in a large number of applications but yet not even the large volume users participate in the curl bug bounty program but leave it to us (and they rarely even donate). Perhaps you can report curl security issues to them and have a chance of a higher reward?

You would possibly imagine that these companies should be keen on helping us out to indirectly secure users of their operating systems and applications, but no. We’re an open source project. They can use our products for free and they do, and our products improve their end products. But if there’s a problem in our stuff, that issue is ours to sort out and fix and those companies can then subsequently upgrade to the corrected version…

This is not a complaint, just an observation. I personally appreciate the freedom this gives us.

What can you do to help?

Help us review code. Report bugs. Report all security related problems you can find or suspect exists. Get your company to sponsor us. Write awesome pull requests that improve curl and the way it does things. Use curl and libcurl in your programs and projects. Buy commercial curl support from the best and only provider of commercial curl support.

curl 7.71.0 – blobs and retries

Welcome to the “prose version” of the curl 7.71.0 change log. There’s just been eight short weeks since I last blogged abut a curl release but here we are again and there’s quite a lot to say about this new one.

Presentation

Numbers

the 192nd release
4 changes
56 days (total: 8,132)

136 bug fixes (total: 6,209)
244 commits (total: 25,911)
0 new public libcurl function (total: 82)
7 new curl_easy_setopt() option (total: 277)

1 new curl command line option (total: 232)
59 contributors, 33 new (total: 2,202)
33 authors, 17 new (total: 803)
2 security fixes (total: 94)
1,100 USD paid in Bug Bounties

Security

CVE-2020-8169 Partial password leak over DNS on HTTP redirect

This is a nasty bug in user credential handling when doing authentication and HTTP redirects, which can lead to a part pf the password being prepended to the host name when doing name resolving, thus leaking it over the network and to the DNS server.

This bug was reported and we fixed it in public – and then someone else pointed out the security angle of it! Just shows my lack of imagination. As a result, even though this was a bug already reported – and fixed – and therefor technically not subject for a bug bounty, we decide to still reward the reporter, just maybe not with the full amount this would otherwise had received. We awarded the reporter 400 USD.

CVE-2020-8177 curl overwrite local file with -J

When curl -J is used it doesn’t work together with -i and there’s a check that prevents it from getting used. The check was flawed and could be circumvented, which the effect that a server that provides a file name in a Content-Disposition: header could overwrite a local file, since the check for an existing local file was done in the code for receiving a body – as -i wasn’t supposed to work… We awarded the reporter 700 USD.

Changes

We’re counting four “changes” this release.

CURLSSLOPT_NATIVE_CA – this is a new (experimental) flag that allows libcurl on Windows, built to use OpenSSL to use the Windows native CA store when verifying server certificates. See CURLOPT_SSL_OPTIONS. This option is marked experimental as we didn’t decide in time exactly how this new ability should relate to the existing CA store path options, so if you have opinions on this you know we’re interested!

CURLOPT-BLOBs – a new series of certificate related options have been added to libcurl. They all take blobs as arguments, which are basically just a memory area with a given size. These new options add the ability to provide certificates to libcurl entirely in memory without using files. See for example CURLOPT_SSLCERT_BLOB.

CURLOPT_PROXY_ISSUERCERT – turns out we were missing the proxy version of CURLOPT_ISSUERCERT so this completed the set. The proxy version is used for HTTPS-proxy connections.

--retry-all-errors is the new blunt tool of retries. It tells curl to retry the transfer for all and any error that might occur. For the cases where just --retry isn’t enough and you know it should work and retrying can get it through.

Interesting bug-fixes

This is yet another release with over a hundred and thirty different bug-fixes. Of course all of them have their own little story to tell but I need to filter a bit to be able to do this blog post. Here are my collected favorites, in no particular order…

  • Bug-fixed happy eyeballs– turns out the happy eyeballs algorithm for doing parallel dual-stack connections (also for QUIC) still had some glitches…
  • Curl_addrinfo: use one malloc instead of three – another little optimize memory allocation step. When we allocate memory for DNS cache entries and more, we now allocate the full struct in a single larger allocation instead of the previous three separate smaller ones. Another little cleanup.
  • options-in-versions – this is a new document shipped with curl, listing exactly which curl version added each command line option that exists today. Should help everyone who wants their curl-using scripts to work on their uncle’s ancient setup.
  • dynbuf – we introduced a new internal generic dynamic buffer functions to cake care of dynamic buffers, growing and shrinking them. We basically simplified and reduced the number of different implementations into a single one with better checks and stricter controls. The internal API is documented.
  • on macOS avoid DNS-over-HTTPS when given a numerical IP address – this bug made for example FTP using DoH fail on macOS. The reason this is macOS-specific is that it is the only OS on which we call the name resolving functions even for numerical-only addresses.
  • http2: keep trying to send pending frames after req.upload_done – HTTP/2 turned 5 years old in May 2020 but we can still find new bugs. This one was a regression that broke uploads in some conditions.
  • qlog support – for the HTTP/3 cowboys out there. This makes curl generate QUIC related logs in the directory specified with the environment variable QLOGDIR.
  • OpenSSL: have CURLOPT_CRLFILE imply CURLSSLOPT_NO_PARTIALCHAIN – another regression that had broken CURLOPT_CRLFILE. Two steps forward, one step back.
  • openssl: set FLAG_TRUSTED_FIRST unconditionally – with this flag set unconditionally curl works around the issue with OpenSSL versions before 1.1.0 when it would have problems if there are duplicate trust chains and one of the chains has an expired cert. The AddTrust issue.
  • fix expected length of SOCKS5 reply – my recent SOCKS overhaul and improvements brought this regression with SOCKS5 authentication.
  • detect connection close during SOCKS handshake – the same previous overhaul also apparently made the SOCKS handshake logic not correctly detect closed connection, which could lead to busy-looping and using 100% CPU for a while…
  • add https-proxy support to the test suite – Finally it happened. And a few new test cases for it was also then subsequently provided.
  • close connection after excess data has been read – a very simple change that begged the question why we didn’t do it before! If a server provides more data than what it originally told it was gonna deliver, the connection is one marked for closure and won’t be re-used. Such a re-use would usually just fail miserably anyway.
  • accept “any length” credentials for proxy auth – we had some old limits of 256 byte name and password for proxy authentication lingering for no reason – and yes a user ran into the limit. This limit is now gone and was raised to… 8MB per input string.
  • allocate the download buffer at transfer start– just more clever way to allocate (and free) the download buffers, to only have them around when they’re actually needed and not longer. Helps reducing the amount of run-time memory curl needs and uses.
  • accept “::” as a valid IPv6 address – the URL parser was a tad bit too strict…
  • add SSLKEYLOGFILE support for wolfSSLSSLKEYLOGFILE is a lovely tool to inspect curl’s TLS traffic. Now also available when built with wolfSSL.
  • enable NTLM support with wolfSSL – yeps, as simple as that. If you build curl with wolfSSL you can now play with NTLM and SMB!
  • move HTTP header storage to Curl_easy from connectdata – another one of those HTTP/2 related problems that surprised me still was lingering. Storing request-related data in the connection-oriented struct is a bad idea as this caused a race condition which could lead to outgoing requests with mixed up headers from another request over the same connection.
  • CODE_REVIEW: how to do code reviews in curl – thanks to us adding this document, we could tick off the final box and we are now at gold level
  • leave the HTTP method untouched in the set.* struct – when libcurl was told to follow a HTTP redirect and the response code would tell libcurl to change that the method, that new method would be set in the easy handle in a way so that if the handle was re-used at that point, the updated and not the original method would be used – contrary to documentation and how libcurl otherwise works.
  • treat literal IPv6 addresses with zone IDs as a host name – the curl tool could mistake a given numerical IPv6 address with a “zone id” containing a dash as a “glob” and return an error instead…

Coming up

There are more changes coming and some PR are already pending waiting for the feature window to open. Next release is likely to become version 7.72.0 and have some new features. Stay tuned!

curl ootw: –connect-timeout

Previous options of the week.

--connect-timeout [seconds] was added in curl 7.7 and has no short option version. The number of seconds for this option can (since 7.32.0) be specified using decimals, like 2.345.

How long to allow something to take?

curl shipped with support for the -m option already from the start. That limits the total time a user allows the entire curl operation to spend.

However, if you’re about to do a large file transfer and you don’t know how fast the network will be so how do you know how long time to allow the operation to take? In a lot of of situations, you then end up basically adding a huge margin. Like:

This operation usually takes 10 minutes, but what if everything is super overloaded at the time, let’s allow it 120 minutes to complete.

Nothing really wrong with that, but sometimes you end up noticing that something in the network or the remote server just dropped the packets and the connection wouldn’t even complete the TCP handshake within the given time allowance.

If you want your shell script to loop and try again on errors, spending 120 minutes for every lap makes it a very slow operation. Maybe there’s a better way?

Introducing the connect timeout

To help combat this problem, the --connect-timeout is a way to “stage” the timeout. This option sets the maximum time curl is allowed to spend on setting up the connection. That involves resolving the host name, connecting TCP and doing the TLS handshake. If curl hasn’t reached its “established connection” state before the connect timeout limit has been reached, the transfer will be aborted and an error is returned.

This way, you can for example allow the connect procedure to take no more than 21 seconds, but then allow the rest of the transfer to go on for a total of 120 minutes if the transfer just happens to be terribly slow.

Like this:

curl --connect-timeout 21 --max-time 7200 https://example.com

Sub-second

You can even set the connection timeout to be less than a second (with the exception of some special builds that aren’t very common) with the use of decimals.

Require the connection to be established within 650 milliseconds:

curl --connect-timeout 0.650 https://example.com

Just note that DNS, TCP and the local network conditions etc at the moment you run this command line may vary greatly, so maybe restricting the connection time a lot will have the effect that it sometimes aborts a connection a little too easily. Just beware.

A connection that stalls

If you prefer a way to detect and abort transfers that stall after a while (but maybe long before the maximum timeout is reached), you might want to look into using –limit-speed.

Also, if a connection goes down to zero bytes/second for a period of time, as in it doesn’t send any data at all, and you still want your connection and transfer to survive that, you might want to make sure that you have your –keepalive-time set correctly.

webinar: testing curl for security

Alternative title: “testing, Q&A, CI, fuzzing and security in curl”

June 30 2020, at 10:00 AM in Pacific Time (17:00 GMT, 19:00 CEST).

Time: 30-40 minutes

Abstract: curl runs in some ten billion installations in the world, in
virtually every connected device on the planet and ported to more operating systems than most. In this presentation, curl’s lead developer Daniel Stenberg talks about how the curl project takes on testing, QA, CI and fuzzing etc, to make sure curl remains a stable and secure component for everyone while still getting new features and getting developed further. With a Q&A session at the end for your questions!

Register here at attend the live event. The video will be made available afterward.

Daniel presenting at cs3sthlm 2019

curl user survey 2020 analysis

In 2020, the curl user survey ran for the 7th consecutive year. It ended on May 31 and this year we manage to get feedback donated by 930 individuals.

Number of respondents per year

Analysis

Analyzing this huge lump of data, comments and shared experiences is a lot of work and I’m sorry it’s taken me several weeks to complete it. I’m happy to share this 47 page PDF document here with you:

curl user survey 2020 analysis

If you have questions on the content or find mistakes or things looking odd in the data or graphs, do let me know!

If you want to help out to do a better survey or analysis next year, I hope you know that you’d be much appreciated…

QUIC with wolfSSL

We have started the work on extending wolfSSL to provide the necessary API calls to power QUIC and HTTP/3 implementations!

Small, fast and FIPS

The TLS library known as wolfSSL is already very often a top choice when users are looking for a small and yet very fast TLS stack that supports all the latest protocol features; including TLS 1.3 support – open source with commercial support available.

As manufacturers of IoT devices and other systems with memory, CPU and footprint constraints are looking forward to following the Internet development and switching over to upcoming QUIC and HTTP/3 protocols, wolfSSL is here to help users take that step.

A QUIC reminder

In case you have forgot, here’s a schematic view of HTTPS stacks, old vs new. On the right side you can see HTTP/3, QUIC and the little TLS 1.3 box there within QUIC.

ngtcp2

There are no plans to write a full QUIC stack. There are already plenty of those. We’re talking about adjustments and extensions of the existing TLS library API set to make sure wolfSSL can be used as the TLS component in a QUIC stack.

One of the leading QUIC stacks and so far the only one I know of that does this, ngtcp2 is written to be TLS library agnostic and allows different TLS libraries to be plugged in as different backends. I believe it makes perfect sense to make such a plugin for wolfSSL to be a sensible step as soon as there’s code to try out.

A neat effect of that, would be that once wolfSSL works as a backend to ngtcp2, it should be possible to do full-fledged HTTP/3 transfers using curl powered by ngtcp2+wolfSSL. Contact us with other ideas for QUIC stacks you would like us to test wolfSSL with!

FIPS 140-2

We expect wolfSSL to be the first FIPS-based implementation to add support for QUIC. I hear this is valuable to a number of users.

When

This work begins now and this is just a blog post of our intentions. We and I will of course love to get your feedback on this and whatever else that is related. We’re also interested to get in touch with people and companies who want to be early testers of our implementation. You know where to find us!

I can promise you that the more interest we can sense to exist for this effort, the sooner we will see the first code to test out.

It seems likely that we’re not going to support any older TLS drafts for QUIC than draft-29.

curl meets gold level best practices

About four years ago I announced that curl was 100% compliant with the CII Best Practices criteria. curl was one of the first projects on that train to reach a 100% – primarily of course because we were early joiners and participants of the Best Practices project.

The point of that was just to highlight and underscore that we do everything we can in the curl project to act as a responsible open source project and citizen of the larger ecosystem. You should be able to trust curl, in every aspect.

Going above and beyond basic

Subsequently, the best practices project added higher levels of compliance. Basically adding a bunch of requirements so if you want to grade yourself at silver or even gold level there are a whole series of additional requirements to meet. At the time those were added, I felt they were asking for quite a lot of specifics that we didn’t provide in the curl project and with a bit of a sigh I felt I had to accept the fact that we would remain on “just” 100% compliance and only reaching a part of the way toward Silver and Gold. A little disheartened of course because I always want curl to be in the top.

So maybe Silver?

I had left the awareness of that entry listing in a dusty corner of my brain and hadn’t considered it much lately, when I noticed the other day that it was announced that the Linux kernel project reached gold level best practice.

That’s a project with around 50 times more developers and commits than curl for an average release (and even a greater multiplier for amount of code) so I’m not suggesting the two projects are comparable in any sense. But it made me remember our entry on CII Best Practices web site.

I came back, updated a few fields that seemed to not be entirely correct anymore and all of a sudden curl quite unexpectedly had a 100% compliance at Silver level!

Further?

If Silver was achievable, what’s actually left for gold?

Sure enough, soon there were only a few remaining criteria left “unmet” and after some focused efforts in the project, we had created the final set of documents with information that were previously missing. When we now finally could fill in links to those docs in the final few entries, project curl found itself also scoring a 100% at gold level.

Best Practices: Gold Level

What does it mean for us? What does it mean for you, our users?

For us, it is a double-check and verification that we’re doing the right things and that we are providing the right information in the project and we haven’t forgotten anything major. We already knew that we were doing everything open source in a pretty good way, but getting a bunch of criteria that insisted on a number of things also made us go the extra way and really provide information for everything in written form. Some of what previously really only was implied, discussed in IRC or read between lines in various pull requests.

I’m proud to lead the curl project and I’m proud of all our maintainers and contributors.

For users, having curl reach gold level makes it visible that we’re that kind of open source project. We’re part of this top clique of projects. We care about every little open source detail and this should instill trust and confidence in our users. You can trust curl. We’re a golden open source project. We’re with you all the way.

The final criteria we checked off

Which was the last criteria of them all for curl to fulfill to reach gold?

The project MUST document its code review requirements, including how code review is conducted, what must be checked, and what is required to be acceptable (link)

This criteria is now fulfilled by the brand new document CODE_REVIEW.md.

What’s next?

We’re working on the next release. We always do. Stop the slacking now and get back to work!

Credits

Gold image by Erik Stein from Pixabay

800 authors and counting

Today marks the day when we merged the commit authored by the 800th person in the curl project.

We turned 22 years ago this spring but it really wasn’t until 2010 when we switched to git when we started to properly keep track of every single author in the project. Since then we’ve seen a lot of new authors and a lot of new code.

The “explosion” is clearly visible in this graph generated with fresh data just this morning (while we were still just 799 authors). See how we’ve grown maybe 250 authors since 1 Jan 2018.

Author number 800 is named Nicolas Sterchele and he submitted an update of the TODO document. Appreciated!

As the graph above also shows, a majority of all authors only ever authored a single commit. If you did 10 commits in the curl project, you reach position #61 among all the committers while 100 commits takes you all the way up to position #13.

Become one!

If you too want to become one of the cool authors of curl, I fine starting point for that journey could be the Help Us document. If that’s not enough, you’re also welcome to contact me privately or maybe join the IRC channel for some socializing and “group mentoring”.

If we keep this up, we could reach a 1,000 authors in 2022…

curl ootw: –ftp-skip-pasv-ip

(Other command line options of the week.)

--ftp-skip-pasv-ip has no short option and it was added to curl in 7.14.2.

Crash course in FTP

Remember how FTP is this special protocol for which we create two connections? One for the “control” where we send commands and read responses and then a second one for the actual data transfer.

When setting up the second connection, there are two ways to do it: the active way and the passive way. The wording there is basically in the eyes of the FTP server: should the server be active or passive in the creation and that’s the key. The traditional underlying FTP commands to do this is either PORT or PASV.

Due to the prevalence of firewalls and other network “complications” these days, the passive style is dominant for FTP. That’s when the client asks the server to listen on a new port (by issuing the PASV command) and then the client connects to the server with a second connection.

The PASV response

When a server responds to a PASV command that the client sends to it, it sends back an IPv4 address and a port number for the client to connect to – in a rather arcane way that looks like this:

227 Entering Passive Mode (192,168,0,1,156,64)

This says the server listens to the IPv4 address 192.168.0.1 on port 40000 (== 156 x 256 + 64).

However, sometimes the server itself isn’t perfectly aware of what IP address it actually is accessible as “from the outside”. Maybe there’s a NAT involved somewhere, maybe there are even more than one NAT between the client and the server.

We know better

For the cases when the server responds with a crazy address, curl can be told to ignore the address in the response and instead assume that the IP address used for the control connection will in fact work for the data connection as well – this is generally true and has actually become even more certain over time as FTP servers these days typically never return a different IP address for PASV.

Enter the “we know better than you” option --ftp-skip-pasv-ip.

What about IPv6 you might ask

The PASV command, as explained above, is explicitly only working with IPv4 as it talks about numerical IPv4 addresses. FTP was actually first described in the early 1970s, quite a lot time before IPv6 was born.

When FTP got support for IPv6, another command was introduced as a PASV replacement.: the EPSV command. If you run curl with -v (verbose mode) when doing FTP transfers, you will see that curl does indeed first try to use EPSV before it eventually falls back and tries PASV if the previous command doesn’t work.

The response to the EPSV command doesn’t even include an IP address but then it always assumes the same address as the control connection and it only returns back a TCP port number.

Example

Download a file from that server giving you a crazy PASV response:

curl --ftp-skip-pasv-ip ftp://example.com/file.txt

Related options

Change to active FTP mode with --ftp-port, switch off EPSV attempts with --disable-epsv.

on-demand buffer alloc in libcurl

Okay, so I’ll delve a bit deeper into the libcurl internals than usual here. Beware of low-level talk!

There’s a never-ending stream of things to polish and improve in a software project and curl is no exception. Let me tell you what I fell over and worked on the other day.

Smaller than what holds Linux

We have users who are running curl on tiny devices, often put under the label of Internet of Things, IoT. These small systems typically have maybe a megabyte or two of ram and flash and are often too small to even run Linux. They typically run one of the many different RTOS flavors instead.

It is with these users in mind I’ve worked on the tiny-curl effort. To make curl a viable alternative even there. And believe me, the world of RTOSes and IoT is literally filled with really low quality and half-baked HTTP client implementations. Often certainly very small but equally as often with really horrible shortcuts or protocol misunderstandings in them.

Going with curl in your IoT device means going with decades of experience and reliability. But for libcurl to be an option for many IoT devices, a libcurl build has to be able to get really small. Both the footprint on storage but also in the required amount of dynamic memory used while executing.

Being feature-packed and attractive for the high-end users and yet at the same time being able to get really small for the low-end is a challenge. And who doesn’t like a good challenge?

Reduce reduce reduce

I’ve set myself on a quest to make it possible to build libcurl smaller than before and to use less dynamic memory. The first tiny-curl releases were only the beginning and I already then aimed for a libcurl + TLS library within 100K storage size. I believe that goal was met, but I also think there’s more to gain.

I will make tiny-curl smaller and use less memory by making sure that when we disable parts of the library or disable specific features and protocols at build-time, they should no longer affect storage or dynamic memory sizes – as far as possible. Tiny-curl is a good step in this direction but the job isn’t done yet – there’s more “dead meat” to carve off.

One example is my current work (PR #5466) on making sure there’s much less proxy remainders left when libcurl is built without support for such. This makes it smaller on disk but also makes it use less dynamic memory.

To decrease the maximum amount of allocated memory for a typical transfer, and in fact for all kinds of transfers, we’ve just switched to a model with on-demand download buffer allocations (PR #5472). Previously, the download buffer for a transfer was allocated at the same time as the handle (in the curl_easy_init call) and kept allocated until the handle was cleaned up again (with curl_easy_cleanup). Now, we instead lazy-allocate it first when the transfer starts, and we free it again immediately when the transfer is over.

It has several benefits. For starters, the previous initial allocation would always first allocate the buffer using the default size, and the user could then set a smaller size that would realloc a new smaller buffer. That double allocation was of course unfortunate, especially on systems that really do want to avoid mallocs and want a minimum buffer size.

The “price” of handling many handles drastically went down, as only transfers that are actively in progress will actually have a receive buffer allocated.

A positive side-effect of this refactor, is that we could now also make sure the internal “closure handle” actually doesn’t use any buffer allocation at all now. That’s the “spare” handle we create internally to be able to associate certain connections with, when there’s no user-provided handles left but we need to for example close down an FTP connection as there’s a command/response procedure involved.

Downsides? It means a slight increase in number of allocations and frees of dynamic memory for doing new transfers. We do however deem this a sensible trade-off.

Numbers

I always hesitate to bring up numbers since it will vary so much depending on your particular setup, build, platform and more. But okay, with that said, let’s take a look at the numbers I could generate on my dev machine. A now rather dated x86-64 machine running Linux.

For measurement, I perform a standard single transfer getting a 8GB file from http://localhost, written to stderr:

curl -s http://localhost/8GB -o /dev/null

With all the memory calls instrumented, my script counts the number of memory alloc/realloc/free/etc calls made as well as the maximum total memory allocation used.

The curl tool itself sets the download buffer size to a “whopping” 100K buffer (as it actually makes a difference to users doing for example transfers from localhost or other really high bandwidth setups or when doing SFTP over high-latency links). libcurl is more conservative and defaults it to 16K.

This command line of course creates a single easy handle and makes a single HTTP transfer without any redirects.

Before the lazy-alloc change, this operation would peak at 168978 bytes allocated. As you can see, the 100K receive buffer is a significant share of the memory used.

After the alloc work, the exact same transfer instead ended up using 136188 bytes.

102,400 bytes is for the receive buffer, meaning we reduced the amount of “extra” allocated data from 66578 to 33807. By 49%

Even tinier tiny-curl: in a feature-stripped tiny-curl build that does HTTPS GET only with a mere 1K receive buffer, the total maximum amount of dynamically allocated memory is now below 25K.

Caveats

The numbers mentioned above only count allocations done by curl code. It does not include memory used by system calls or, when used, third party libraries.

Landed

The changes mentioned in this blog post have landed in the master branch and will ship in the next release: curl 7.71.0.

tech, open source and networking