tldr: curl uses 30K of dynamic memory for downloading a large HTTP file, plus the size of the download buffer.
Back in September 2020 I wrote about my work to trim curl allocations done for FTP transfers. Now I’m back again on the memory use in curl topic, from a different angle.
This time, I learned about the awesome tool pahole, which can (among other things) show structs and their sizes from a built library – and when embracing this fun toy, I ran some scripts on a range of historic curl releases to get a sense of how we’re doing over time, in terms of memory size and number of allocations.
The task I set for myself was: figure out how the sizes of key structs in curl have changed over time, and correlate that with the number and size of allocations done at run-time – to make sure that trimming down the size of a specific struct doesn’t just shift that memory into another allocation instead, nullifying the gain. I want to make sure we’re not slowly degrading – and if we are, we should at least know about it!
Also: we keep developing curl at a fairly good pace and we add features in almost every release, so some growth is to be expected and should be tolerated, I think. We also keep the build process very configurable, so users with particular needs and requirements can switch off features and save memory that way.
Memory sizes in modern computing
Of course systems grow every year and machines ship with more and more RAM, and that goes for the smallest machines too. But there is still a vast number of systems out there with limited memory that want good Internet transfers as well. Keeping sizes down also allows applications and systems to scale better: a 10% decrease in size can imply a 10% increase in the number of possible parallel transfers. curl, and especially libcurl, is still today in 2021 frequently used on machines with limited amounts of available memory – sometimes with just a few megabytes of RAM.
Fixed configuration
For these tests I used the exact same build configuration for all versions tested. The sizes and behavior vary greatly depending on config, but I tried to use a fairly complete and typical build to see how code and memory use look for “most” users. I ran everything on my x86_64 Debian Linux dev machine. My focus is on curl versions from the last 3-4 years; I figured going back to ancient times won’t help here.
Key structs
struct Curl_easy
– this is the “easy handle”, allocated by curl_easy_init() and the anchor for every transfer done with libcurl, no matter which API you’re using. An application creates one of these for each concurrent transfer it wants to do or keep around. Some applications allocate hundreds or even thousands of these.
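As a reminder of what this looks like from the application side, here is a minimal sketch of the easy API (the URL is just a placeholder and error checking is omitted):

#include <curl/curl.h>

int main(void)
{
  curl_global_init(CURL_GLOBAL_DEFAULT);

  /* every concurrent transfer needs its own easy handle */
  CURL *h = curl_easy_init();
  if(h) {
    curl_easy_setopt(h, CURLOPT_URL, "https://example.com/");
    curl_easy_perform(h);  /* run the transfer */
    curl_easy_cleanup(h);  /* frees the struct Curl_easy */
  }
  curl_global_cleanup();
  return 0;
}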
struct Curl_multi
– this is the “multi handle”, allocated with curl_multi_init(). This handle is created by applications as a holder of many concurrent transfers, so applications typically do not have a very large number of these.
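A minimal sketch of how a single multi handle drives several easy handles concurrently (again with placeholder URLs and no error checking):

#include <curl/curl.h>

int main(void)
{
  curl_global_init(CURL_GLOBAL_DEFAULT);

  /* one multi handle can hold many concurrent transfers */
  CURLM *multi = curl_multi_init();
  CURL *h1 = curl_easy_init();
  CURL *h2 = curl_easy_init();
  int running;

  curl_easy_setopt(h1, CURLOPT_URL, "https://example.com/a");
  curl_easy_setopt(h2, CURLOPT_URL, "https://example.com/b");
  curl_multi_add_handle(multi, h1);
  curl_multi_add_handle(multi, h2);

  do {
    curl_multi_perform(multi, &running);
    curl_multi_poll(multi, NULL, 0, 1000, NULL); /* wait for activity */
  } while(running);

  curl_multi_remove_handle(multi, h1);
  curl_multi_remove_handle(multi, h2);
  curl_easy_cleanup(h1);
  curl_easy_cleanup(h2);
  curl_multi_cleanup(multi);
  curl_global_cleanup();
  return 0;
}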
struct connectdata
– this is an internal struct that isn’t visible externally to applications. It is the holder of connection-related data for a connection to a specific server. The connection pool curl uses to handle persistent connections will keep a number of these structs in memory after the transfer has completed, to allow subsequent reuse. The size of the connection pool is customizable. A busy application doing lots of transfers might end up with a sizeable number of connections in the pool, so the size of this struct adds up.
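For example, the pool kept by a multi handle can be capped with the CURLMOPT_MAXCONNECTS option. A sketch, where the cap of 5 is an arbitrary example value:

#include <curl/curl.h>

int main(void)
{
  curl_global_init(CURL_GLOBAL_DEFAULT);
  CURLM *multi = curl_multi_init();

  /* keep at most 5 connections alive in the pool after use; each
     cached connection keeps one struct connectdata in memory */
  curl_multi_setopt(multi, CURLMOPT_MAXCONNECTS, 5L);

  /* ... add easy handles and drive transfers as usual ... */

  curl_multi_cleanup(multi);
  curl_global_cleanup();
  return 0;
}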
Dynamic allocations
In early curl history, the download and upload buffers for transfers were part of the Curl_easy struct, which made it fairly large.
In curl 7.53.0 (February 2017) the download buffer was made dynamically sized and has since been allocated separately. Before that transition, curl 7.52.0 had a Curl_easy struct of 36584 bytes, which included both the download and the upload buffers. In 7.58.0 the size was down to 21264 bytes, since the download buffer was by then allocated separately and could also be made much larger than the previous fixed 16KB size.
The 16KB upload buffer was moved out of the Curl_easy handle in the 7.62.0 release (October 2018) to be allocated on demand – which of course especially benefits everyone who doesn’t do uploads at all… The size of this struct was then down to 6208 bytes.
In curl 7.71.0 we also made the download buffer allocated on demand, and it is freed immediately when the transfer completes. This makes applications that keep handles around for reuse use significantly less memory. Applications are generally encouraged to keep the handles around to better facilitate connection reuse.
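Both buffer sizes can also be tuned per handle, with the CURLOPT_BUFFERSIZE and CURLOPT_UPLOAD_BUFFERSIZE options. A sketch, where the sizes shown are example values:

#include <curl/curl.h>

int main(void)
{
  curl_global_init(CURL_GLOBAL_DEFAULT);
  CURL *h = curl_easy_init();
  if(h) {
    curl_easy_setopt(h, CURLOPT_URL, "https://example.com/big-file");
    /* example: ask for a 100KB download buffer; since 7.71.0 it is
       only allocated while a transfer actually runs */
    curl_easy_setopt(h, CURLOPT_BUFFERSIZE, 102400L);
    /* example: a 16KB upload buffer; since 7.62.0 it is not
       allocated at all unless the handle actually uploads */
    curl_easy_setopt(h, CURLOPT_UPLOAD_BUFFERSIZE, 16384L);
    curl_easy_perform(h);
    curl_easy_cleanup(h);
  }
  curl_global_cleanup();
  return 0;
}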
Struct size development
Curl_easy
The size in bytes of struct Curl_easy over the last few years:
7.52.0 36584
7.58.0 21264
7.61.0 21344
7.62.0 6208
7.63.0 6216
7.64.0 6312
7.65.0 5976
7.66.0 6024
7.67.0 6040
7.68.0 6040
7.69.0 6040
7.70.0 6080
7.71.0 6448
7.72.0 6472
7.73.0 6464
7.74.0 6512
Current git: 5272 bytes (-19% from last release). With this, the struct is smaller than it has ever been before.
How did we make this extra reduction? Primarily, I noticed that we had a DoH-related struct in the handle by default, which was turned into an on-demand allocation. DoH is still rare, and that data only needs to be allocated during the name resolving phase.
Curl_multi
The size in bytes of struct Curl_multi has remained very stable over the last few years, and it keeps being very small. Notably, when we removed pipelining support in 7.65.0 it took 96 bytes away from this struct.
7.50.0 384
7.52.0 384
7.58.0 480
7.59.0 488
7.60.0 488
7.61.0 512
7.62.0 512
7.63.0 512
7.64.0 512
7.65.0 416
7.66.0 416
7.67.0 424
7.68.0 432
7.69.0 416
7.70.0 416
7.71.0 416
7.72.0 416
7.73.0 416
7.74.0 416
Current git: 416 bytes.
With this, we’re smaller than we were at the beginning of 2018.
connectdata
The size in bytes of struct connectdata. It’s been both up and down.
7.50.0 1904
7.52.0 2104
7.58.0 2112
7.59.0 2112
7.60.0 2112
7.61.0 2128
7.62.0 2152
7.63.0 2160
7.64.0 2160
7.65.0 1944
7.66.0 1960
7.67.0 1976
7.68.0 1976
7.69.0 2600
7.70.0 2608
7.71.0 2608
7.72.0 2624
7.73.0 2640
7.74.0 2656
Current git: 1472 bytes (-44% from last release)
The size bump in 7.69.0 was the insertion of a new struct for state data when doing non-blocking SOCKS connections, and the corresponding decrease for the pending release is the removal of the buffer from that struct. With this, we’re down to a size we had a very long time ago.
Run-time memory use
To make sure that we don’t just move memory to other on-demand buffers that we need to allocate anyway, I ran a script over a lot of curl versions and counted the number of allocations needed and the peak amount of memory allocated, for a plain 512MB download over HTTP from localhost. The counted allocations were only those done by curl code (malloc, calloc, realloc, strdup etc).
Number of allocations
There are many reasons to allocate memory, and while we want to keep the number down, lots of factors of course need to be taken into account.
In the list below you’ll see that we clearly had some mistake in 7.52.0 (and perhaps in a few more versions), as it did over 32,000 allocations. The situation was fixed in or before 7.58.0, and I haven’t bothered to go back to check exactly what it was.
7.52.0 32883
7.58.0 82
7.59.0 82
7.60.0 82
7.61.0 82
7.62.0 86
7.63.0 87
7.64.0 87
7.65.0 82
7.66.0 101
7.67.0 107
7.68.0 111
7.69.0 113
7.70.0 113
7.71.0 99
7.72.0 99
7.73.0 96
7.74.0 96
Current git: 96 allocations.
We do more allocations than some years back, but I think it is still reasonable growth.
Peak memory allocations
Here are some developments to look closer at!
If we start out looking at the oldest versions in my test, we can see that they allocated less than 100KB – but we need to take into account that back then we used a fixed 16KB download buffer. In curl 7.54.1 we bumped the default buffer size the curl tool uses to 100KB, which in the table below becomes visible in the 7.58.0 numbers.
7.50.0 84473
7.52.0 85329
7.58.0 174243
7.59.0 174315
7.60.0 174339
7.61.0 174531
7.62.0 143886
7.63.0 143928
7.64.0 144128
7.65.0 143152
7.66.0 168188
7.67.0 173365
7.68.0 168575
7.69.0 169167
7.70.0 169303
7.71.0 136573
7.72.0 136765
7.73.0 136875
7.74.0 137043
Current git: 131680 bytes.
The gain in 7.62.0 was mostly the removal of the default allocation of the upload buffer, which isn’t used in this test…
The current size tells me several things. We’re at a memory consumption level that is probably at its lowest point in the last decade – while at the same time having more features and being better than ever before. If we deduct the 100KB download buffer, we have 29280 additional bytes allocated. Compare this to 7.50.0, which allocated 68089 bytes on top of its download buffer!
If I change my curl to use the smallest download buffer size libcurl allows (1KB) instead of the default 100KB, it ends up peaking at 30304 bytes. That’s 44% of the memory needed by 7.50.0.
In my opinion, this is very good.
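In libcurl terms, that change corresponds to a single option. A sketch (my test actually changed the buffer size the curl tool asks for):

#include <curl/curl.h>

/* a sketch: perform one download with the smallest download buffer
   libcurl allows (1KB) instead of the curl tool's 100KB default */
static CURLcode small_buffer_get(const char *url)
{
  CURLcode res = CURLE_FAILED_INIT;
  CURL *h = curl_easy_init();
  if(h) {
    curl_easy_setopt(h, CURLOPT_URL, url);
    curl_easy_setopt(h, CURLOPT_BUFFERSIZE, 1024L); /* the allowed minimum */
    res = curl_easy_perform(h);
    curl_easy_cleanup(h);
  }
  return res;
}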
It might also be worth reiterating that this is with a full-featured libcurl build. We can shrink it even further if we switch off undesired features or just go tiny-curl.
I hope this goes without saying, but of course all of this work has been done with the API and ABI still intact.
Graphs?
You know I like graphs, but for now I decided this blog post and analysis were enough. I’m going to think about how we can perhaps get this info generated in a more regular and automated way in the future. I’m not sure it is worth spending a lot of effort on, though.
Reproduce
- Build curl with the --enable-debug option to configure. Don’t use the threaded resolver – I use the c-ares one – because it otherwise breaks the memdebug system.
- Run your command line with tracing enabled and then run memanalyze on the log:
#!/bin/sh
# tell the debug build where to log all memory operations
export CURL_MEMDEBUG=/tmp/curlmem.log
# do the transfer: a 512MB download from localhost to /dev/null
./src/curl -v localhost/512M -o /dev/null
# summarize the log: number of allocations and peak allocated memory
./tests/memanalyze.pl -v /tmp/curlmem.log
To get the struct sizes, just run pahole on the static libcurl lib after the build:
pahole -s lib/.libs/libcurl.a > sizes.txt
Credits
The photo was taken by me, in Siem Reap, Cambodia. “A smaller transport”
Updates
After the initial posting of this article I optimized the structs even further so the numbers have been updated since then to reflect the state of what’s in git a week later.