In the Swiss crime comedy TV series Tschugger, season two episode two at roughly 25:20, there is a shot with a curl command line in a terminal window using an unnecessary --request option.
Following the curl line is what looks like an interactive login procedure, which certainly is not something a real curl would present. Based on this, I think we need to give this use of curl a fairly low realism score: a 2 out of 5.
Trying that displayed command line in a real terminal unfortunately only gives us Could not resolve host: secure.da-34-22.remote.com. I doubt that the TV company actually purchased this domain though. It seems a little too generic.
I have not seen it
I have not been able to view this episode so I cannot yet comment on the conditions and the surroundings for when this snapshot is taken. Once I do, I might be able to extend the description above somewhat.
Credits
First brought to my attention by Cybergossipgirl, who also took the snapshot seen above.
In the 2021 movie Silk Road, at around 19:23-19:26 into the film we can see Ross Ulbricht, the lead character, write a program on his laptop that uses curl. For a few seconds we get a look at the screen as Ross types on the keyboard and explains to the female character, who says “I didn’t know you know how to code”, that he’s teaching himself to write code.
The code
Let’s take a look at the code on the screen. This is PHP code using the well known PHP/CURL binding. The URL on the screen on line two has really bad contrast, but I believe this is what it says:
.onion is a TLD for websites on Tor so this seems legit, as a URL for this purpose could look like this. But then Ross confuses matters a little. He uses two curl_init() calls, one that sets a URL and then again a call without a URL. He could just have removed line three and four. This doesn’t prohibit the code from working, it just wouldn’t have passed a review.
The code then sets a proxy to use for the transfer, specified as an HTTP URL which is a little odd since the proxy type he then sets on the line below is 7, the number corresponding to CURLPROXY_SOCKS5_HOSTNAME – so not an HTTP proxy at all but a SOCKS5 proxy. This is the typical way you access Tor: as a SOCKS5 proxy to which you pass the host name, as opposed to resolving the host name locally.
The last line is incomplete but should ultimately be curl_close($ch); to close the handle after use.
All in all a seemingly credible piece of code, especially if we consider it a work in progress. The minor mistakes would soon be fixed.
Credits
Viktor Szakats spotted this and sent me the screenshot above. Thanks!
This adventure started with an issue where a user pointed out that the libcurl function for base64 encoding actually would allocate a few bytes too many at times.
That turned out to be true and we fixed it fairly quickly.
As I glanced at that base64 encoder function that was still loaded and showing in my editor window, it struck me that it really was not written in an optimal way.
Base64 encoding
This “encoding” converts 8 bit data into 6 bit data, where each 6 bit combination has a dedicated ASCII character. It uses these 64 different characters: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/.
Three 8-bit bytes make up 24 bits of data, which can be represented by four 6-bit symbols. Like this: the example byte sequence 0x12, 0x34 and 0x56 creates the 24 bit value 0x123456. Shown in binary it looks like: 00010010 00110100 01010110
That 24 bit number is split into 6-bit chunks: 000100, 100011, 010001 and 010110. Written in decimal, they are 4, 35, 17 and 22. Pick the corresponding symbols from those indexes in the base64 table shown above and they make the base64 encoded sequence: EjRW. And so on.
That’s base 64 encoding.
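The worked example above can be reproduced with a few lines of C. This is only a minimal sketch for a single complete triplet; the names b64_table and encode_triplet are made up for this illustration and are not taken from the curl source.

```c
/* The 64 symbols, indexed by their 6 bit value */
static const char b64_table[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode exactly three input bytes into four symbols plus a
   terminating zero. 'out' must have room for five bytes.
   (encode_triplet is a hypothetical helper name) */
static void encode_triplet(const unsigned char in[3], char out[5])
{
  /* combine the three bytes into one 24 bit number */
  unsigned long x = ((unsigned long)in[0] << 16) |
                    ((unsigned long)in[1] << 8) |
                    (unsigned long)in[2];

  /* split it into four 6 bit chunks, most significant first */
  out[0] = b64_table[(x >> 18) & 0x3f];
  out[1] = b64_table[(x >> 12) & 0x3f];
  out[2] = b64_table[(x >> 6) & 0x3f];
  out[3] = b64_table[x & 0x3f];
  out[4] = '\0';
}
```

Feeding it the bytes 0x12, 0x34 and 0x56 selects indexes 4, 35, 17 and 22 from the table.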
A realization
The base64 encoder function source code I looked at was introduced in curl in the late 1990s and existed in the first commit we have saved. It has remained mostly intact since. Over twenty two years old.
This is how the code used to look. Fairly readable, but with a lot of conditions and perhaps most importantly, with calls to msnprintf() to output data. msnprintf() is our internal snprintf implementation.
(The padstr variable in there is for handling the mode where it does not output any final = padding characters. )
Improving
I started out by writing a test program. I created a huge string, and made the test program base64-encode a part of that string. It starts with a one byte string, then increases the length one byte at a time, iterating over all sizes until the final size, which happened to be 106128 bytes. Maybe not the most realistic test in the world, but at least it does a lot of base64 encoding. The base64 encode algorithm is also content agnostic so it doesn’t matter what the exact content is, the size of it is the main thing.
My first casual attempt that only replaced the snprintf() calls with direct assigns into the target buffer first made me doubt my numbers or test program. My test program ran 14 times faster.
Motivated by the enormous performance gain seen with that minor change, I continued. I removed the use of the obuf array and it occurred to me I should deal with the encoding in two phases; one main one for all complete three-byte triplets and then do the padding final chunk separately – as then we can avoid conditions in the main loop.
The final result that I ended up merging showed an almost 29 times improvement. With the old code the test program took six minutes to complete, the new one finished in twelve seconds.
I think the new version still is highly readable, and it actually is significantly smaller in size than the previous version!
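To make the two-phase idea concrete, here is a rough sketch of how such an encoder can be structured: a main loop that only handles complete triplets and writes directly into the target buffer, followed by a separate block for the padded tail. This is not the code that was merged into curl; the function name b64_encode and its signature are assumptions for this example.

```c
#include <stddef.h>

static const char b64_table[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode 'len' bytes from 'src' into 'dst' as a zero terminated string.
   'dst' needs room for 4 * ((len + 2) / 3) + 1 bytes.
   Returns the number of symbols written. (hypothetical helper) */
static size_t b64_encode(const unsigned char *src, size_t len, char *dst)
{
  char *out = dst;

  /* phase one: all complete three-byte groups, no padding checks */
  while(len >= 3) {
    *out++ = b64_table[src[0] >> 2];
    *out++ = b64_table[((src[0] & 0x03) << 4) | (src[1] >> 4)];
    *out++ = b64_table[((src[1] & 0x0f) << 2) | (src[2] >> 6)];
    *out++ = b64_table[src[2] & 0x3f];
    src += 3;
    len -= 3;
  }

  /* phase two: the final one or two bytes, padded with '=' */
  if(len) {
    *out++ = b64_table[src[0] >> 2];
    if(len == 1) {
      *out++ = b64_table[(src[0] & 0x03) << 4];
      *out++ = '=';
    }
    else {
      *out++ = b64_table[((src[0] & 0x03) << 4) | (src[1] >> 4)];
      *out++ = b64_table[(src[1] & 0x0f) << 2];
    }
    *out++ = '=';
  }
  *out = '\0';
  return (size_t)(out - dst);
}
```

The sketch always emits the = padding, so the mode that leaves padding out is not covered here.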
Base64 decoding
Energized by that fascinating improvement I managed to do to the encoder function, I turned my eyes to the base64 decoder function. This function is slightly newer in curl than the encoder, but still traces back to 2001 and it too was never improved much after its initial merge.
I started again by writing another test program. This one creates a 2948 bytes base64 encoded string which the application then iterates over and decodes pieces of. From one byte up to the full size, and then it loops so that it repeats that procedure a thousand times.
When decoding base64, the core of the code needs to find out what binary number each particular input octet represents. ‘A’ is zero, ‘B’ is one etc. Then it needs to take four such consecutive (effectively 6 bit) letters at a time and output 3 eight bit bytes. Over and over. In a final round it might need to deal with =-padding to make it an even 4 bytes size.
The old decoder
This is what the old decodeQuantum function looked like. It decodes a 4-letter sequence into three output bytes.
What immediately sticks out in the old code is the use of strchr() to find the letters’ offset in the base64 table as a means to figure out its byte value. The larger the value, the longer the function needs to search through the string to find it. And it needs to search for every single byte used in the input string.
What if we could replace the search by a lookup table instead?
My updated code features a lookup table for each input byte, and in similar fashion to how the encoder logic works, it now handles the final padding step in a separate block at the end to avoid extra conditions in the main loop.
With the simplified quad decoder, I put the whole thing in the same loop. Unfortunately I could not think of a way to avoid the check for invalid characters in the loop.
It turns out this new base64 decoder is about 4.5 times faster than the previous code!
  for(i = 0; i < fullQuantums; i++) {
    unsigned char val;
    unsigned int x = 0;
    int j;
    for(j = 0; j < 4; j++) {
      val = lookup[(unsigned char)*src++];
      if(val == 0xff) /* bad symbol */
        goto bad;
      x = (x << 6) | val;
    }
    pos[2] = x & 0xff;
    pos[1] = (x >> 8) & 0xff;
    pos[0] = (x >> 16) & 0xff;
    pos += 3;
  }
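The loop depends on a lookup table that maps every possible input byte to its 6 bit value, with 0xff marking bytes that are not base64 symbols. One way such a table could be produced is sketched below, together with a small wrapper around the same loop. Building the table at run-time, and the names init_lookup and decode_full, are assumptions made for this example; the real code may well use a precomputed constant table.

```c
#include <stddef.h>
#include <string.h>

static const char b64_table[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

static unsigned char lookup[256];

/* Fill the table once: 0xff marks a byte that is not a base64 symbol */
static void init_lookup(void)
{
  int i;
  memset(lookup, 0xff, sizeof(lookup));
  for(i = 0; i < 64; i++)
    lookup[(unsigned char)b64_table[i]] = (unsigned char)i;
}

/* Decode 'quads' complete 4-symbol groups (no padding) from 'src'
   into 'dst'. Returns 0 on success, -1 on a bad symbol. */
static int decode_full(const char *src, size_t quads, unsigned char *dst)
{
  size_t i;
  for(i = 0; i < quads; i++) {
    unsigned int x = 0;
    int j;
    for(j = 0; j < 4; j++) {
      /* constant time lookup instead of a strchr() search */
      unsigned char val = lookup[(unsigned char)*src++];
      if(val == 0xff)
        return -1;
      x = (x << 6) | val;
    }
    dst[0] = (unsigned char)((x >> 16) & 0xff);
    dst[1] = (unsigned char)((x >> 8) & 0xff);
    dst[2] = (unsigned char)(x & 0xff);
    dst += 3;
  }
  return 0;
}
```

Call init_lookup() once before the first decode. Every symbol then costs a single array access regardless of its value, which is the point of replacing the search.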
Final words
I wrote my test applications to measure the impact of my changes. We could argue if they show real world use cases or not but I think they still prove that my changes were good and made the functions faster. The exact numbers are not that important.
The exact performance numbers I mention in this blog post are based on results that I saw on my old Intel-based development machine as the best result of at least three consecutive runs. Other systems and architectures will for sure show different numbers.
Base64 encoding and decoding are not significant functions, and are not even very frequently called ones, in curl. These improvements are not likely to even be noticeable to curl or libcurl users. I still consider these optimizations worthwhile because why not do things as fast as you can if you are going to do them anyway. As long as there’s no particular penalty involved.
The changes I did for the base64 handling were done entirely without changing the behavior or function prototypes in order to compartmentalize them and keep them constrained.
I occasionally do talks about curl. In these talks I often include a few slides that say something about curl’s coverage and presence on different platforms. Mostly to boast of course, but also to help explain to the audience how curl has managed to reach its ten billion installations.
This is the current incarnation of those seven slides in November 2022. I am of course also eager to get your feedback on the specific contents, especially if you miss details in them that I should add, so that my future curl presentations include more accurate data.
curl runs in all your devices
curl is used in (almost) every Internet-connected device out there, and I try to visualize that with this packed slide. Cars, servers, game consoles, medical devices, games, apps, operating systems, watches, robots, TVs, speakers, light-bulbs, freezers, printers, motorcycles, music instruments and more.
The intent being to show with images that it runs in quite a few devices.
28 transfer protocols
More strictly speaking these are the 28 URL schemes curl supports right now, including in experimental builds. The image tries to put them in a sort of hierarchical way so that you can see which underlying protocol is used for them: TCP, SSH, TLS, QUIC etc.
60 bindings
Volunteers have created and maintain libcurl bindings for at least these 60 different environments, making it possible for developers to access and use libcurl powers from virtually every programming language you can think of.
37 third party dependencies
curl’s modular design lets the developer who builds curl mix and match features and choose which particular third party dependencies to use, which makes it possible to craft exactly the curl builds you want. Device manufacturers get the combination they like for exactly their purposes and needs.
89 operating systems
This list has been worked on and bounced around several times between friends, and it always brings out a few questions. People like to argue with me about a few entries I include and about a few entries I do not include. The problem is both that there is no clear line between an operating system and a distribution or a different running environment, and that sometimes it is just branding differences separating X from Y. With a “flexible” attitude to the definition of operating systems, this is the current collection of no less than 89 individual ones on which curl runs or has run:
24 CPU architectures
Older versions of this slide used to have x86-64 as a separate one, but I think we have concluded that a large number of the architectures have separate 64 bit versions, so if I were to keep x86-64 I should also include a lot of other 64 bit versions. I therefore removed x86-64 from the slide. Maybe I should rather go the other way and add all the 64 bit versions as separate architectures?
Anyway, curl has been made to run on virtually all modern or semi-modern 32 bit or larger CPU architectures we can find:
2 planets
I admit it. I include this slide in my presentation and in this blog post because it feels like the ultimate show-off. curl was used in the Mars 2020 helicopter mission.
The curl project builds on foundations that started in late 1996 with the tool named httpget.
ANSI C became known as C89
In 1996 there were not too many good alternatives for making a small and efficient command line tool for doing Internet transfers. I am not saying that C was the only available language, but for me the choice was easy and frankly I did not even think about any other languages when this journey started. We called the C flavor “ANSI C” back then, as compared to the K&R “old style” C. The ANSI C version would later be renamed to C89 (confusingly enough it is also sometimes known as C90).
In the year 2000 we introduced libcurl, the library that provides Internet transfer super powers to whoever wants it. This made the choice of using C even better. C made it possible for us to provide a stable API/ABI without problems – something not even C++ could offer at the time. It was also a reasonably portable language that made it possible for us to bring curl and libcurl to virtually all modern operating systems.
As I wanted curl and libcurl to be system level options and I aimed for the widest possible adoption, they could not be written in any of the higher level languages like Perl, Python or similar. That would make them too big and require too much “extra baggage”.
I am convinced that the use of (conservative) C for curl is a key factor to its success and its ability to get used “everywhere”.
C99
C99 was published in (surprise!) 1999 but the adoption in compilers took a long time and it remained a blocker for adoption for us. We want curl available “everywhere” so as long as some of the major compilers did not support C99 we did not even consider switching C flavor, as it would risk hampering curl adoption.
The slowest of the “big compilers” to adopt C99 was the Microsoft Visual C++ compiler, which did not adopt it properly until 2015 and added more compliance in 2019. A large number of our users/developers are still stuck on older MSVC versions so not even all users of this compiler suite can build C99 programs even today, in late 2022.
C11, C17 and beyond
Meanwhile, the ISO C Working Group continues to crank out updates to the C language. C11 shipped, C17 came and now they are working on the C2x pending version, presumed to end up called C23.
Bump the requirement for curl?
We are aware that other widely popular C projects are moving forward and have raised their requirements to C99 or beyond. Like the Linux kernel, the git project and more.
The discussion about bumping C flavor has been brought up on the libcurl mailing list as well, in particular as we are already planning a version 8 release to happen in the spring of 2023 so in theory it could be a good moment to make some changes like this.
What C99 features would improve a project like curl? The most interesting parts of C99 that could impact curl code that I could think of are:
// comments
__func__ predefined identifier
boolean type in <stdbool.h>
designated struct initializers
empty macro arguments
extended integer types in <inttypes.h> and <stdint.h>
flexible array members (zero size arrays)
inline functions
integer constant type rules
mixed declarations and code
the long long type and library functions
the snprintf() family of functions
trailing comma allowed in enum declaration
vararg macros
variable-length arrays
So sure, there are lots of cool things we could use. But do we need them?
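For readers who have not used them, here is a small sketch that exercises a handful of the features from the list in one place: // comments, <stdbool.h>, <stdint.h>, inline functions, designated initializers, mixed declarations and code, __func__ and snprintf(). The struct and function names are invented for this illustration and have nothing to do with curl internals.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* hypothetical struct, only for showing the syntax */
struct transfer {
  const char *url;
  int64_t size;   /* fixed-width type from <stdint.h> */
  bool done;      /* boolean type from <stdbool.h> */
};

// an inline function, introduced by a C99 style comment
static inline int64_t remaining(const struct transfer *t, int64_t received)
{
  return t->size - received;
}

/* Writes a one-line description into 'buf', returns what snprintf()
   returns */
static int describe(char *buf, size_t len)
{
  // designated initializers: members not named here are zeroed
  struct transfer t = { .url = "https://example.com/", .size = 1000 };
  t.done = false;                      // a statement...
  int64_t left = remaining(&t, 400);   // ...followed by a declaration
  return snprintf(buf, len, "%s: %s, %lld left, done=%d",
                  __func__, t.url, (long long)left, (int)t.done);
}
```

None of this compiles as strict C89, which is exactly the trade-off discussed here.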
For several of the features above, we already have decent and functional replacements. Several of the features don’t matter. The rest risk becoming distractions.
Opening up for C99 without conditions in curl code would risk opening the flood gates for people rewriting things, so we would have to go gently and open up for allowing new C99 features slowly. That is also how the git project does it. A challenge with that approach is that it is hard to verify which features are allowed versus used, as existing tooling normally doesn’t have that resolution.
The question has also been asked that if we consider bumping the requirement, should we then not bump it to C11 at once instead of staying at C99?
Not now
Ultimately, not a single person has yet been able to clearly articulate what benefits such a C flavor requirement bump would provide for the curl project. We mostly see a risk that we all get caught in rather irrelevant discussions and changes that perhaps will not actually bring the project forward very much. Neither in features nor in quality/security.
I think there are still much better things to do and much more worthwhile efforts to spend our energy on that could actually improve the project and bring it forward.
Like improving the test suite, increasing test coverage, making sure more code is exercised by the fuzzers.
A minor requirement change
We have decided that starting with curl 8, we will require that the compiler supports a 64 bit data type. This is not something that existed in the original C89 version but was introduced in C99. However, there is no longer any modern compiler around that does not support this.
This allows us to stop caring about those odd platforms and to stop writing code and checks for when the large types are not very large. It is hard to verify that code nowadays since virtually nobody actually uses such compilers/systems.
Maybe this is the way we can continue to adapt to and use specific post C89 features going forward. By cherry-picking them one by one and adapting to them slowly over time.
It is not a no to C99 forever
I am sure we will bring up this topic for discussion again in the future. We have not closed the door forever or written anything in stone. We have only decided that for the moment we have not been persuaded to switch. Maybe we will in the future.
Other languages
We do not consider switching or rewriting curl into any other language.
In the curl project, one of the holiest and most sacred rules is:
we do not break the API or ABI
Everything else is a matter of discussion.
More features all the time
We keep adding features and we do improvements at a rather high pace. So much that we actually rarely do a release without introducing something new.
To be able to add features and to keep changing curl and making sure that it keeps up with the world around it and that it provides the features and the abilities that a world of Internet transfers needs, we need to make sure that the internals are written correctly. And by correctly, I mean in a way that allows us to extend and change curl when we want to – that doesn’t break the ABIs nor the tests.
Refactors
curl is old and choices sometimes need to be reconsidered. Over the years we have refactored and changed the curl internals and design quite drastically several times. Thanks to an extensive test suite and a library API that was designed from the start to hide most internal choices, this has been possible to do without being visible to users. The upside has been that the internals have become easier to maintain and to extend with more features.
Refactoring again
This time, we are again on a mission to extend the curl feature set as I blogged about recently, and this time we have Stefan Eissing on board to do it.
So, without changing any API or breaking the ABI and having the large set of test cases remain working in the many CI jobs we have, Stefan introduced a new internal concept for curl: connection filters.
Filters
We call them filters but they could also be seen as layers or maybe even domino pieces. Each filter is a piece of network logic and the idea is that we can chain them together at run-time to create protocol cakes (my word). curl can connect to an HTTP proxy, do TLS and speak HTTP/2 over that. That makes three separate filters put together.
Adding for example TLS to the proxy would just be inserting a filter in the right place in the chain, while using the filter-chain is done the same way no matter the filter chain length and independently of which exact filters it consists of.
The previous logic, before the filters, was more like a vast number of conditional flag checks done in the right order. This new system reduces the amount of conditional checks and it also moves code for handling the different network filters into more localized and compartmentalized functions.
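As a purely illustrative sketch of the chaining idea, and not how curl actually implements it, a filter can be modeled as a struct with a function pointer and a link to the next filter. Every name below (struct filter, passthrough_send, chain_add, chain_send) is made up for this example; the real design is described in the curl source tree.

```c
#include <stddef.h>

struct filter;
typedef size_t (*send_fn)(struct filter *f, const char *data, size_t len);

/* hypothetical filter: one piece of network logic in a chain */
struct filter {
  const char *name;
  send_fn do_send;
  struct filter *next;  /* the filter one step closer to the network */
  size_t calls;         /* how many times this filter was used */
};

/* A filter that does its own work (here: only counting) and then
   hands the data to the next one. The last filter "sends" the data. */
static size_t passthrough_send(struct filter *f, const char *data,
                               size_t len)
{
  f->calls++;
  if(f->next)
    return f->next->do_send(f->next, data, len);
  return len;
}

/* Put 'f' on top of the existing chain and return the new top */
static struct filter *chain_add(struct filter *head, struct filter *f)
{
  f->next = head;
  return f;
}

/* The caller only talks to the top filter, whatever the chain length */
static size_t chain_send(struct filter *head, const char *data, size_t len)
{
  return head->do_send(head, data, len);
}
```

Stacking a socket filter, a proxy filter and a TLS filter is then three chain_add() calls, and chain_send() does not need to know which of them are present.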
More protocol combos
In addition to the more localized code for specific features, this new concept more notably makes it easier to build new protocol layer combinations. Adding support for HTTP/2 to the proxy for example, should now ideally be a matter of adding a filter the right way and the transfer pipeline should otherwise “just work”.
Not everything internally is yet converted to filters even if we have merged the first large pull request. Stefan now works on getting more curl code to use this concept before he can get into the actual protocol changes lined up for him.
Performance
The filters do not impact transfer performance, I/O works the same as before.
Details
If you long for more technical details and explanations about this, maybe to be able to dig into the curl source code yourself, then an excellent starting point is the document in the curl source made for this purpose: CONNECTION-FILTERS.md.
When setting up a TLS or QUIC connection, a client like curl needs a CA store in order to verify the certificate(s) the server provides in the TLS handshake.
CA store
A CA store is a fancy name for a number of certificates. Certificates for the Certificate Authorities (CAs) that a TLS client trusts. On the curl website, we offer a PEM version of the CA store that Mozilla maintains, for download. This set currently contains 142 certificates and while the exact number varies a little over time, it has been more than a hundred for many years. A fair amount. And there is nothing in the pipe that will bring down the number significantly anytime soon, to my knowledge. These 142 certificates make up a file that is exactly 225,403 bytes. 1587 bytes per certificate on average.
Load and parse
When setting up a TLS connection, the 142 certificates need to be loaded from the external file into memory and parsed so that the server’s certificate can be verified. So that curl knows that the server it has connected to is indeed the correct server and not a man in the middle, an impostor.
This procedure is a rather costly one, in terms of CPU cycles needed.
Another cache
A classic approach to avoid heavy work is to cache the results from a previous use to be able to reuse them again. Starting in curl 7.87.0 curl introduces a CA store cache.
Now, curl can keep the loaded and parsed CA store in memory associated with the handle and then subsequent requests can avoid re-loading and re-parsing the CA data when new connections are created – if they use the same CA store of course. The performance gain in doing this shortcut can be enormous. After all, most transfers are done using the same single CA store.
To quote the numbers Michael Drake presented in the pull request for this new feature: he measured the number of instructions needed to load and render a particular web page from BBC with the NetSurf browser (which obviously is using libcurl for its HTTPS transfers), with and without this cache.
| CA store cache | Total instruction fetch cost |
|----------------|------------------------------|
| None           | 5,168,090,712                |
| Enabled        | 1,020,984,411                |
I think a reduction to one fifth of the original cost is significant.
Converted into a little graph they compare like this (smaller is better):
But even in simpler applications and curl command lines this caching should have a measurable impact as soon as multiple TLS connections are done using the same handle. An extremely common usage pattern.
Life-time
Keeping the data around after use potentially changes the behavior a little, but the huge performance gain made us decide to still do this by default. We compensate for this a little by setting the default life-time to 24 hours, so applications that keep handles alive for a very long time will still get the cache flushed and read from file again every day.
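The general pattern, a cached object that is reused until it gets older than a fixed life-time, can be sketched in a few lines. This is a simplified illustration and not the libcurl implementation: struct ca_cache, load_store() and ca_cache_get() are hypothetical names, the expensive load-and-parse step is replaced by a stub, and the check that the same CA file is being used is left out.

```c
#include <stddef.h>
#include <time.h>

#define CA_CACHE_LIFETIME (24 * 60 * 60) /* 24 hours, in seconds */

/* hypothetical cache state kept together with a handle */
struct ca_cache {
  void *store;     /* the parsed CA store, NULL when nothing is cached */
  time_t loaded;   /* when the store was loaded */
  unsigned loads;  /* number of times the file had to be loaded */
};

/* stand-in for the costly "load file and parse certificates" step */
static void *load_store(void)
{
  static int dummy_store;
  return &dummy_store;
}

/* Return the CA store, reloading it only when there is no cached copy
   or when the cached copy has reached its life-time */
static void *ca_cache_get(struct ca_cache *c, time_t now)
{
  if(!c->store || (now - c->loaded) >= CA_CACHE_LIFETIME) {
    c->store = load_store();
    c->loaded = now;
    c->loads++;
  }
  return c->store;
}
```

Passing the current time in as an argument keeps the sketch easy to test; whether expiry triggers exactly at the life-time or just after it is a detail this example picks arbitrarily.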
The CA store is typically not updated more frequently than once every few months or weeks.
CURLOPT_CA_CACHE_TIMEOUT is a new option for libcurl that allows applications to tweak the life-time and CA cache behavior for when the default as described above is not enough.
Details
This CA cache system is so far only supported when curl is built to use OpenSSL or one of its forks. I hope others will get inspired and bring this support for other TLS backends as well as we go forward.
CA cache support for curl was authored by Michael Drake. Thanks!
curl offered the -d / --data option already in its first release back in 1998. curl 4.0. A trusted old friend.
curl also has some companion versions of this option that work slightly differently, but they all have the common feature that they append data to the request body. Put simply: with these options users construct the body contents to POST. Very useful and powerful. Still today one of the most commonly used curl options, for apparent reasons.
A few years into curl’s lifetime, in 2001, we introduced the -G / --get option. This option lets you use -d to create a data set, but the data is not sent as a POST body anymore but is instead converted to a query string and used in a GET request.
This would make curl send a GET request to this URL: https://example.com/?name=mrsmith&color=blue
The “query” is the part of the URL that sits on the right side of the question mark (but before the fragment that if it exists starts with the first # following the question mark).
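How separate data pieces end up as one query string can be shown with a small helper that appends a piece to a URL, using ? for the first piece and & for the following ones. This is a sketch and not curl’s code: append_query is a hypothetical function, and it assumes the URL has no fragment and that the pieces are already URL-encoded.

```c
#include <stddef.h>
#include <string.h>

/* Append 'piece' to the query part of the URL stored in 'url', a
   buffer of 'size' bytes. Assumes the URL has no #fragment.
   Returns 0 on success, -1 if the buffer is too small. */
static int append_query(char *url, size_t size, const char *piece)
{
  size_t len = strlen(url);
  /* the first piece starts the query, later ones are joined with & */
  char sep = strchr(url, '?') ? '&' : '?';

  if(len + 1 + strlen(piece) + 1 > size)
    return -1;
  url[len] = sep;
  strcpy(url + len + 1, piece);
  return 0;
}
```

Starting from https://example.com/ and appending name=mrsmith followed by color=blue produces the URL shown above.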
URL-encode
In 2008 we added --data-urlencode which made it even easier for users and scripts to use these options correctly as now curl itself can URL-encode the given data instead of relying on the user to do it. Previously, script authors would have to do that encoding before passing the data to curl which was tedious and error prone. This feature also works in combination with -G of course.
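For those who have never had to do it by hand, URL-encoding roughly means replacing every byte outside a small safe set with a percent sign and two hex digits. The sketch below keeps the RFC 3986 unreserved characters as-is and encodes everything else. It is an illustration of the technique, not the function curl uses; url_encode is a hypothetical name and the choice of uppercase hex digits is an assumption.

```c
#include <stddef.h>

/* Percent-encode 'in' into 'out' (a buffer of 'outlen' bytes).
   Letters, digits and - . _ ~ are copied, all other bytes become %XX.
   Returns 0 on success, -1 if the output does not fit. */
static int url_encode(const char *in, char *out, size_t outlen)
{
  static const char hex[] = "0123456789ABCDEF";
  size_t o = 0;

  for(; *in; in++) {
    unsigned char c = (unsigned char)*in;
    int unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                     (c >= '0' && c <= '9') ||
                     c == '-' || c == '.' || c == '_' || c == '~';
    if(unreserved) {
      if(o + 1 >= outlen)  /* one byte plus the terminating zero */
        return -1;
      out[o++] = (char)c;
    }
    else {
      if(o + 3 >= outlen)  /* three bytes plus the terminating zero */
        return -1;
      out[o++] = '%';
      out[o++] = hex[c >> 4];
      out[o++] = hex[c & 0x0f];
    }
  }
  if(o >= outlen)
    return -1;
  out[o] = '\0';
  return 0;
}
```

A space becomes %20 and an ampersand becomes %26, so a value containing them can no longer be confused with the separators of the query syntax.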
How about both?
The -d options family makes a POST. The -G converts it to a GET.
If you want convenient curl command line options to both make content to send in the POST body and to create query parameters in the URL you were however out of luck. You would then have to go back to use -d but handcraft and encode the query parameters “manually”.
Until curl 7.87.0. Due to ship on December 21, 2022. (this commit)
--url-query is your new friend
This is curl’s 249th command line option and it lets you append parameters to the query part of the given URL, using the same syntax as --data-urlencode uses.
Using this, a user can now conveniently create a POST request body and at the same time add a set of query parameters for the URL which the request uses.
A basic example that sends the same POST body and URL query:
I told you it uses the data-urlencode syntax, but let me remind you how that works. You use --url-query [data] where [data] can be provided using these different ways:
content
This will make curl URL-encode the content and pass that on. Just be careful so that the content does not contain any = or @ symbols, as that will then make the syntax match one of the other cases below!
=content
This will make curl URL-encode the content and pass that on. The preceding = symbol is not included in the data.
name=content
This will make curl URL-encode the content part and pass that on. Note that the name part is expected to be URL-encoded already.
@filename
This will make curl load data from the given file (including any newlines), URL-encode that data and pass it on.
name@filename
This will make curl load data from the given file (including any newlines), URL-encode that data and pass it on. The name part gets an equal sign appended, resulting in name=urlencoded-file-content. Note that the name is expected to be URL-encoded already.
+content
The data is provided as-is unencoded
For each new --url-query, curl will insert an ampersand (&) between the parts it adds to the query.
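The different forms listed above can be told apart with very little code. Below is a sketch of such a classifier, written for illustration only: the type and function names are invented, and the order in which the symbols are checked (a leading + first, then the first =, then the first @) is an assumption about precedence rather than something stated above.

```c
#include <stddef.h>
#include <string.h>

enum arg_kind {
  KIND_CONTENT,  /* content, =content and name=content */
  KIND_FILE,     /* @filename and name@filename */
  KIND_RAW       /* +content, used as-is */
};

/* hypothetical result of splitting one option argument */
struct parsed_arg {
  enum arg_kind kind;
  size_t namelen;     /* length of the name at the start of the arg */
  const char *value;  /* the content, or the file name */
};

static struct parsed_arg parse_arg(const char *arg)
{
  struct parsed_arg p;
  const char *sep;

  p.kind = KIND_CONTENT;
  p.namelen = 0;
  p.value = arg;

  if(arg[0] == '+') {
    p.kind = KIND_RAW;
    p.value = arg + 1;
    return p;
  }
  sep = strchr(arg, '=');
  if(!sep) {
    sep = strchr(arg, '@');
    if(sep)
      p.kind = KIND_FILE;
  }
  if(sep) {
    /* everything before the separator is the (already encoded) name */
    p.namelen = (size_t)(sep - arg);
    p.value = sep + 1;
  }
  return p;
}
```

A caller would then URL-encode the value (or the contents of the named file), put name= in front when namelen is not zero, and join the result to the query with an ampersand.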
Replaces -G
This new friend we call --url-query makes -G rather pointless, as this is a more powerful option that does everything -G ever did and a lot more. We will of course still keep -G supported and working. Because that is how we work.
A boring fact of life is that new versions of curl trickle out into the world rather slowly to ordinary users. Because of this, we can be certain that scripts and users all over will need to keep using -G for yet another undefined period of time.
Trace
Finally: remember that if you want curl to show you what it sends in a POST request, the normal -v / --verbose does not suffice as it will not show you the request body. You then rather need to use --trace or --trace-ascii.
When doing HTTP(S) transfers, libcurl might erroneously use the read callback (CURLOPT_READFUNCTION) to ask for data to send, even when the CURLOPT_POSTFIELDS option has been set, if the same handle previously was used to issue a PUT request which used that callback.
This flaw may surprise the application and cause it to misbehave and either send off the wrong data or use memory after free or similar in the subsequent POST request.
The problem exists in the logic for a reused handle when it is changed from a PUT to a POST.
curl can be told to parse a .netrc file for credentials. If that file ends in a line with consecutive non-white space letters and no newline, curl could read past the end of the stack-based buffer, and if the read works, write a zero byte possibly beyond its boundary.
This will in most cases cause a segfault or similar, but circumstances might also cause different outcomes.
If a malicious user can provide a custom .netrc file to an application or otherwise affect its contents, this flaw could be used as denial-of-service.
If curl is told to use an HTTP proxy for a transfer with a non-HTTP(S) URL, it sets up the connection to the remote server by issuing a CONNECT request to the proxy, and then tunnels the rest of protocol through.
An HTTP proxy might refuse this request (HTTP proxies often only allow outgoing connections to specific port numbers, like 443 for HTTPS) and instead return a non-200 response code to the client.
Due to flaws in the error/cleanup handling, this could trigger a double-free in curl if one of the following schemes were used in the URL for the transfer: dict, gopher, gophers, ldap, ldaps, rtmp, rtmps, telnet
curl’s HSTS check could be bypassed to trick it to keep using HTTP.
Using its HSTS support, curl can be instructed to use HTTPS directly instead of using an insecure clear-text HTTP step even when HTTP is provided in the URL. This mechanism could be bypassed if the host name in the given URL uses IDN characters that get replaced to ASCII counterparts as part of the IDN conversion. Like using the character UTF-8 U+3002 (IDEOGRAPHIC FULL STOP) instead of the common ASCII full stop (U+002E).
Like this: http://curl。se。
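To see why such a name slips past a plain string comparison, note that U+3002 is three bytes in UTF-8 (0xE3 0x80 0x82) while the ASCII full stop is the single byte 0x2E. The sketch below replaces that one sequence with an ASCII dot before any comparison is made. It only illustrates the class of problem: normalize_dots is a hypothetical helper, real IDN conversion maps many more characters, and this is not the fix that went into curl.

```c
#include <string.h>

/* Replace every UTF-8 encoded U+3002 (bytes E3 80 82) in 'host' with
   an ASCII full stop. Works in place since the result never grows. */
static void normalize_dots(char *host)
{
  char *in = host;
  char *out = host;

  while(*in) {
    if(!strncmp(in, "\xe3\x80\x82", 3)) {
      *out++ = '.';
      in += 3;
    }
    else
      *out++ = *in++;
  }
  *out = '\0';
}
```

Before normalization the host name compares as different from curl.se., so a lookup keyed on the ASCII name misses it; after normalization the two are byte-for-byte identical.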
Changes
This time around we add one and we remove one.
NPN support removed
curl no longer supports using NPN for negotiating HTTP/2. The standard way for doing this has been ALPN for a long time and the browsers removed their support for NPN several years ago.
WebSocket API
There is an experimental WebSocket API included in this release. It comes in the form of three new functions and a new setopt option to control behavior. The possibly best introduction to this new API is in everything curl.
I am very interested in feedback on the API.
Bugfixes
Here are some of the fixed issues from this cycle that I think are especially worthy to highlight.
aws_sigv4 header computation
The sigv4 code got a significant overhaul and should now do much better than before. This is a fairly complicated setup and there are more improvements coming for future releases.
curl man page details multi-use for each option
Every command line option is documented in its own file, which is then used as input when the huge curl.1 man page is generated. Starting now, each such file needs to specify how the option functions when specified more than once. So from now on, this information is mentioned in the man page for all supported options.
deprecate builds with small curl_off_t
Starting in this release, we deprecate support for building curl for systems without 64-bit data types. Those systems are extremely rare these days and we believe it makes sense to finally simplify a few internals when it hurts virtually no one. This is still only deprecated, so users can still build on such systems for a short while longer if they really want to.
the ngtcp2 configure option defaults to ‘no’
You need to explicitly ask for ngtcp2 to be enabled in the build.
reject cookie names or content with TAB characters
Cookies with tabs in names or content are not interoperable and they caused issues when curl saved them to disk, so we decided to reject them instead.
for builds with gcc + want warnings, set gnu89 standard
Just to make better sure we maintain compatibility.
use -O2 as default optimize for clang in configure
It was just a mistake that it did not already do this.
warn for --ssl use, considered insecure
This better highlights for users that --ssl merely suggests that curl should use TLS for the protocol, while --ssl-reqd is the option that requires TLS.
ctype functions converted to macros-only
We replaced the entire function family with macros.
100+ documentation spellfixes
After a massive effort and new CI jobs, we now regularly run a spellcheck on most man pages and as a result we fixed lots of typos and we should now be able to better maintain properly spelled documentation going forward.
make nghttp2 less picky about field whitespace in HTTP/2
If built with a new enough nghttp2 library, curl will now ask it to be less picky about trailing white space after header fields. The protocol spec says they should cause failure, but they are simply too prevalent in live server responses for this to be a sensible behavior by curl.
use the URL-decoded user name for .netrc parsing
This regression made curl not URL decode the user name provided in a URL properly when it later used a .netrc file to find the corresponding password.
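As a rough illustration of why the decoding step matters, consider this Python sketch. The `netrc_entries` dictionary and the `netrc_password` helper are hypothetical stand-ins and do not reflect how curl parses .netrc files.

```python
from urllib.parse import urlsplit, unquote

# Hypothetical stand-in for parsed .netrc entries: (machine, login) -> password.
netrc_entries = {("example.com", "user@home"): "s3cret"}

def netrc_password(url: str):
    """Look up the password for the user name given in the URL, if any."""
    parts = urlsplit(url)
    # urlsplit() leaves percent-encoding in place, so decode before the lookup.
    login = unquote(parts.username or "")
    return netrc_entries.get((parts.hostname, login))

# "%40" is the URL-encoded "@"; without unquote() the lookup would miss.
print(netrc_password("https://user%40home@example.com/"))
```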
make certinfo available for QUIC
The CURLOPT_CERTINFO option now works for QUIC/HTTP/3 transfers as well.
make forced IPv4 transfers only use A queries
When asking curl to use IPv4-only for transfers, curl now only resolves IPv4 names. Out in the wide world there is a significant share of systems causing problems when asking for AAAA addresses so having this option to avoid them is needed.
schannel: when importing PFX, disable key persistence
Some operations when using the Schannel backend caused leftover files on disk afterward. It really makes you wonder who ever thought designing such a thing was a good idea, but now curl no longer triggers this effect.
add and use Curl_timestrcmp
curl now uses this new constant-time function when comparing secrets in the library, in an attempt to make it even less likely for an outsider to be able to use timing as feedback on how closely a guessed user name or password matches the secret one.
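The general idea behind such a comparison can be sketched in Python as follows. This is not the Curl_timestrcmp implementation, and the `timing_safe_equal` name is invented for the example. Python also gives no real timing guarantees for a loop like this, so in actual Python code the standard `hmac.compare_digest` is the function to use.

```python
import hmac

def timing_safe_equal(a: str, b: str) -> bool:
    """Compare two secrets without bailing out at the first differing byte."""
    x, y = a.encode(), b.encode()
    if len(x) != len(y):
        return False
    diff = 0
    for p, q in zip(x, y):
        diff |= p ^ q          # accumulate differences, never exit early
    return diff == 0

print(timing_safe_equal("hunter2", "hunter2"))
print(timing_safe_equal("hunter2", "hunter3"))
# The standard library helper for the same job:
print(hmac.compare_digest(b"hunter2", b"hunter2"))
```

An ordinary string compare stops at the first mismatch, so its run time leaks how many leading characters were guessed correctly. Always walking the full length removes that signal.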
curl: prevent over-queuing in parallel mode
The command line tool would too eagerly create and queue up pending transfers in parallel mode, making a command line with millions of transfers easily use ridiculous amounts of memory.
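One way to picture the difference is the following Python sketch, in which transfer objects are only created when there is room for them. The `transfers` and `run_parallel` functions and the dictionary used as a "transfer" are assumptions for illustration only and say nothing about how the curl tool is structured internally.

```python
from itertools import islice

def transfers(urls):
    """Lazily create one (hypothetical) transfer object per URL."""
    for url in urls:
        yield {"url": url}

def run_parallel(urls, max_parallel=50):
    """Keep at most max_parallel transfers alive at any time."""
    pending = transfers(urls)
    peak = done = 0
    active = list(islice(pending, max_parallel))
    while active:
        peak = max(peak, len(active))
        active.pop(0)                       # pretend the oldest transfer finished
        done += 1
        active.extend(islice(pending, 1))   # only now create the next one
    return done, peak

urls = (f"https://example.com/{i}" for i in range(100_000))
print(run_parallel(urls))   # (number completed, peak number held in memory)
```

Memory use is then bounded by the parallelism limit rather than by the number of URLs on the command line.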
url parser: extract scheme better when not guessing
A URL has a scheme and we can use that fact to detect it better and more reliably when not using the scheme-guessing mode.
fix parsing URL without slash with CURLU_URLENCODE
When the URL encode option is set when parsing a URL, the parser would not always properly handle URLs with queries that also lacked a slash in the path. Like https://example.com?moo.
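For readers unfamiliar with this URL shape, the sketch below uses Python's standard `urllib.parse` (not libcurl's URL API) to show how such a URL splits into parts and how encoding the query must still work when the path is empty. The `encode_url` helper is a made-up name for this example.

```python
from urllib.parse import urlsplit, urlunsplit, quote

parts = urlsplit("https://example.com?moo")
print(parts.netloc)       # example.com
print(repr(parts.path))   # '' : there is no slash, so the path is empty
print(parts.query)        # moo

def encode_url(url: str) -> str:
    """Percent-encode path and query, even when the path is empty."""
    p = urlsplit(url)
    return urlunsplit(p._replace(path=quote(p.path),
                                 query=quote(p.query, safe="=&")))

print(encode_url("https://example.com?moo cow"))
```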
url parser: leaner with fewer allocs
The URL parser is now a few percent faster and makes significantly fewer memory allocations and it uses less memory in total.
url parser: reject more bad characters from the host name field
Another step on the journey of making the parser stricter.
wolfSSL: fix session management bug
The session-id cache handling could trigger a crash due to a missing reference counter.
Future
We have several pull-requests in the pipe that will add changes to trigger a minor number bump.
Removals
We are planning to remove the following features in the future:
support for systems without 64-bit data types
support for the NSS TLS library
If you depend on one of those features, yell at us on the mailing list!
I am happy to announce that curl receives funding from The Sovereign Tech Fund. This funding is directed towards three specific projects that we have identified as interesting and worthwhile to push forward as ways to improve curl and the life of curl users.
This “investment” will fund two developers to work on curl over a period of six months: Stefan Eissing and myself. The three projects are explained in some detail below. Of course everyone and anyone is welcome to join in and help out with these projects. Everything will be done in the open, as usual.
At the end of the period, we will produce some kind of report or summary of how things turned out.
The three projects we are getting funded for have been specifically created and crafted (by me) to be good solid projects that we really want to see done. This funding is different from many others we have gotten over the years in that we got to decide and plan what we wanted done. These are things that are meant to improve curl as a project and to generally make Internet transfers better and more powerful for a vast number of users.
Project 1 – known bugs cleanup
The curl project currently lists 120+ items as known bugs (up from 77 just two years ago).
The items in that list are reported problems that were recognized as problems at the time of their submission, but as nobody worked on the issues at the time, they were added to this list. The list includes everything from smaller irks up to big things that will either take a long time to fix or be (almost) impossible to address.
There is a good chance that the list will be extended during the project period just because some new bugs fall into the description mentioned above.
This project is an effort to go through as many as possible to make sure they are correctly categorized/described and work on fixing the issues or whatever is necessary to get them off the list.
The process would entail an initial proper research round to extend the description and increase the understanding of each entry, followed by a rough assessment of the amount of work it would take to fix them. Possibly with a 1 (easy) to 5 (extreme) scale.
The action would then be to address the issues, possibly in easy-to-hard order. Addressing an issue could mean fixing the code to remove it, dismissing it as not actually intended to work, documenting it as not working, or even moving it over to the TODO document if it is more of a good idea for the future.
The goal is to reduce the list to zero entries and thus polish off numerous rough corners and annoyances in the project.
This will be done by Daniel Stenberg over a period of 6 months.
Project 2 – HTTP/3
Make HTTP/3 release-ready.
curl has featured experimental HTTP/3 and QUIC support for a few years already, but several details are still lacking:
known bugs
proper multiplexing: doing multiple transfers to the same host should be able to reuse an existing connection and multiplex over that, just as curl already does when using HTTP/2
HTTP/3 support for the test suite (and CI jobs) needs to be done for us to be able to consider the support release ready. Cooperation can be had with QUIC libraries such as ngtcp2 to consider where/how some of the testing is best performed.
considerations for 0-RTT connection establishments (if anything needs to be done)
support for early data: to send off the HTTP request to the server faster.
connection migration: a QUIC feature that allows a server to move over a live connection from one server to another without disruption
fallback to h1/h2 if the QUIC connection fails. The failure rate for QUIC connections is still generally in the 3-7% range, so having a good fallback mechanism, or documentation for how applications can go back to an older HTTP version instead, is important.
HTTPS RR. This fairly new DNS resource record might contain information about the target server’s support for HTTP/3. If such a record is provided, curl can avoid superfluous round-trips to get the Alt-Svc header and rather connect directly to the HTTP/3 server.
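The fallback item in the list above can be pictured with a small Python sketch. It only shows the general "try the newest version first, then step down" idea. The `fetch_with_fallback` function, the `connectors` table and the fake connector functions are all hypothetical, and nothing here describes how curl does or will implement fallback.

```python
def fetch_with_fallback(url, connectors, order=("h3", "h2", "http/1.1")):
    """Try each HTTP version in order; return (version, response) on first success."""
    errors = []
    for version in order:
        try:
            return version, connectors[version](url)
        except ConnectionError as exc:
            errors.append(f"{version}: {exc}")
    raise ConnectionError("; ".join(errors))

# Hypothetical connectors for the demo: QUIC is blocked, HTTP/2 works.
def quic_blocked(url):
    raise ConnectionError("UDP blocked")

connectors = {
    "h3": quic_blocked,
    "h2": lambda url: f"200 OK from {url}",
    "http/1.1": lambda url: f"200 OK from {url}",
}

print(fetch_with_fallback("https://example.com/", connectors))
```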
All features and changes need to be documented. Functionality needs to be verified by test cases. Interop with real world servers is of course implied and assumed.
Stefan Eissing will spend 4 months on this project.
Project 3 – HTTP/2 over proxy
curl has provided support for doing network transfers via HTTP proxies for decades, and this is a very commonly used feature and network setup.
curl however only supports using HTTP version 1 over proxies. This makes applications less efficient, as it sometimes leads to many more TCP connections being used than would be necessary if HTTP/2 could be used, in particular when applications behind a proxy operate against many different hosts on the other side of the proxy.
It can also be noted that in many enterprise setups, this kind of HTTP proxy is used for all kinds of network operations through the use of the CONNECT method, so this functionality is not limited to plain HTTP(S), it should work for all TCP based protocols libcurl supports and which already work over HTTP/1 proxies.
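For reference, this is roughly what the HTTP/1.1 CONNECT exchange looks like on the wire, sketched in Python. The `build_connect` and `tunnel_established` helpers are invented for this example and are not part of any curl API. A 2xx response means the proxy has set up the tunnel, while anything else means it refused.

```python
def build_connect(host: str, port: int) -> bytes:
    """Build the request a client sends to an HTTP/1.1 proxy to open a tunnel."""
    target = f"{host}:{port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")

def tunnel_established(response_head: bytes) -> bool:
    """Return True if the proxy's status line carries a 2xx code."""
    status_line = response_head.split(b"\r\n", 1)[0].decode("ascii")
    fields = status_line.split(" ", 2)
    return len(fields) >= 2 and fields[1].isdigit() and 200 <= int(fields[1]) < 300

print(build_connect("example.com", 443))
print(tunnel_established(b"HTTP/1.1 200 Connection established\r\n\r\n"))
print(tunnel_established(b"HTTP/1.1 403 Forbidden\r\n\r\n"))
```

Once the tunnel is up, the client speaks whatever TCP-based protocol it wants through it, which is why this matters for more than plain HTTP(S).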
The project will require adding new options to enable this functionality, to both the library and the command line tool. With associated documentation.
It will require creating or extending the HTTP/2 support in the curl test suite so that this new functionality can be verified and proven to work at a satisfactory level.
Care must be taken so that this work does not close the door on extending this to support HTTP/3 over proxy in the future. Time permitting, work should be done to pave the road for that, or perhaps even gently start the work to support that as well.
Stefan Eissing will spend 2 months on this project.
The outcome(s)
I hope and presume that the results of these projects will appear as a stream of pull requests for curl that will be done and managed throughout the project period and not saved up to the end or anything. The review, test and merge process of these pull requests will follow our normal and standard project guidelines and procedures.
The projects are fully packed and (over) ambitious. There is a high risk that we will not be able to complete all the details for these projects within the time frame. But we will try.