Category Archives: cURL and libcurl

curl and/or libcurl related

A headers API for libcurl

For many years we’ve had this outstanding idea to add a new API to libcurl that would offer applications easy access to HTTP response headers.

Applications could already retrieve the headers using existing methods, but that requires them to write a callback and do a certain amount of parsing and “understanding” of HTTP. We always felt that was a little unfortunate, a bit error-prone on behalf of the applications, and perhaps also something that forced a lot of applications out there to write the same kind of extra logic.

If libcurl provides this functionality, it would remove a lot of (duplicated) code from a lot of applications.
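To illustrate the kind of logic applications have had to write themselves, here is a minimal sketch of a header callback that picks out a single header value. The function uses the prototype of a CURLOPT_HEADERFUNCTION callback; the wanted_header struct and the names same_name and header_cb are made up for this example, and the parsing is deliberately simplified (no duplicate handling, only the last matching value is kept).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical application state: remembers the value of one header. */
struct wanted_header {
  const char *name;   /* header name to look for, without the colon */
  char value[256];    /* last value seen, empty string if none */
};

/* Case-insensitive comparison of the first n bytes of a and b. */
static int same_name(const char *a, const char *b, size_t n)
{
  size_t i;
  for(i = 0; i < n; i++) {
    if(tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
      return 0;
  }
  return 1;
}

/* Same prototype as a CURLOPT_HEADERFUNCTION callback. Each call gets one
   header line that is not null-terminated and usually ends in CRLF. */
static size_t header_cb(char *buffer, size_t size, size_t nitems,
                        void *userdata)
{
  struct wanted_header *w = userdata;
  size_t len = size * nitems;
  size_t nlen = strlen(w->name);
  const char *p = buffer;
  const char *end = buffer + len;

  if(len > nlen && p[nlen] == ':' && same_name(p, w->name, nlen)) {
    size_t vlen;
    p += nlen + 1;
    /* skip whitespace before the value */
    while(p < end && (*p == ' ' || *p == '\t'))
      p++;
    /* trim trailing CR, LF and whitespace */
    while(end > p && (end[-1] == '\r' || end[-1] == '\n' ||
                      end[-1] == ' ' || end[-1] == '\t'))
      end--;
    vlen = (size_t)(end - p);
    if(vlen >= sizeof(w->value))
      vlen = sizeof(w->value) - 1;   /* truncate overly long values */
    memcpy(w->value, p, vlen);
    w->value[vlen] = '\0';
  }
  return len;   /* report the whole line as consumed */
}
```

An application would register header_cb with CURLOPT_HEADERFUNCTION and pass a struct wanted_header pointer with CURLOPT_HEADERDATA. Even this small version has to deal with case-insensitive names, whitespace and line endings, which is the duplicated effort a dedicated API takes away.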

Designing the API

We started this process a while ago when I first wrote down a basic approach to an API for this and sent it off to the curl-library mailing list for feedback and critique.

/* first take */
char *curl_easy_header(CURL *easy,
                       const char *name);

The conversation that followed that first plea for help made me realize that my first proposal had been far too basic and that it wouldn’t at all satisfy the needs and use cases we could think of for this API.

Try again

I went back to mull over what I had learned and update my design proposal, trying to take the feedback into account in the best possible way. A few weeks later, I returned with a “proposal v2” and again I asked for comments and opinions on what I had put together.

/* second shot */
CURLHcode curl_easy_header(CURL *easy,
                           const char *name,
                           size_t index,
                           struct curl_header **h);

As I had already adjusted the API based on feedback the first time around, the feedback this time did not call for as big changes or as radical differences. I could adapt my proposal to what people asked for and suggested. We arrived at something that seemed like a pretty solid API for offering HTTP headers to applications.

Let’s do this

As the API proposal feedback settled down and the interface felt good and sensible, I decided it was time for me to write up a first implementation so that we could offer code to people and give everyone a chance to try out the API in real life as well. It is one thing to give feedback on a “paper product”; actually being able to use it and try it in an application is way better. I dove in.

The final take

When the code worked to the level that I started to be able to extract the first headers with the API, it proved that we needed to adjust the API a little more, so I did. I then ran into more questions and thoughts about specifics that we hadn’t yet dealt with or nailed properly in the discussions up to that point, and I took some questions back to the curl community. This became an iterative process and we smoothed out questions about how to access different header “sources” as well as how to deal with multiple headers and “request sequences”. All supported now.

/* final version */
CURLHcode curl_easy_header(CURL *easy,
                           const char *name,
                           size_t index,
                           unsigned int origin,
                           int request,
                           struct curl_header **h);

Multiple headers

This API allows applications to extract all headers from a previous transfer. It can get one or many headers when there are duplicates, which is how Set-Cookie: headers commonly arrive.

Sources

The application can ask for “normal” headers, for trailers (that arrive after the body), headers associated with the CONNECT request (if such a one was performed), pseudo headers (that might arrive when HTTP/2 or HTTP/3 is used) or headers associated with an HTTP 1xx “intermediate” response.

Multiple responses

The libcurl APIs typically work on transfers, and a single transfer may end up doing multiple HTTP requests. This happens primarily when redirects are followed, but it can also be due to other reasons. This header API therefore allows the caller to extract headers from the entire “chain” of requests a previous transfer was made with.
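How the name, index, origin and request arguments could narrow down a lookup can be sketched with a small stand-alone model. This is not libcurl’s implementation: the stored_header struct, the ORIGIN_* bits and find_header are hypothetical names invented here, and treating a request value of -1 as “the last request in the chain” is an assumption of the model.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical origin bits, loosely modeled on the header "sources". */
#define ORIGIN_HEADER  (1u << 0)
#define ORIGIN_TRAILER (1u << 1)
#define ORIGIN_CONNECT (1u << 2)

struct stored_header {
  const char *name;
  const char *value;
  unsigned int origin; /* exactly one ORIGIN_* bit */
  int request;         /* 0 for the first request in the chain */
};

/* Return the index'th header called 'name' among those that match the
   origin bitmask and belong to the given request. A request of -1 selects
   the last request in the chain (an assumption made for this model).
   Returns NULL when there is no such header. */
static const struct stored_header *
find_header(const struct stored_header *all, size_t count,
            const char *name, size_t index,
            unsigned int origin, int request)
{
  size_t i;
  size_t seen = 0;

  if(request == -1) {
    request = 0;
    for(i = 0; i < count; i++)
      if(all[i].request > request)
        request = all[i].request;
  }
  for(i = 0; i < count; i++) {
    /* HTTP header names are case-insensitive; an exact match is used
       here only to keep the sketch short */
    if(all[i].request == request && (all[i].origin & origin) &&
       !strcmp(all[i].name, name)) {
      if(seen == index)
        return &all[i];
      seen++;
    }
  }
  return NULL;
}
```

With a table holding a redirect response (request 0) followed by a final response carrying two Set-Cookie: headers (request 1), asking for index 1 of “set-cookie” in the last request returns the second cookie, and asking for index 2 returns NULL.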

EXPERIMENTAL

This API is initially merged (in this commit) labeled “experimental” to be included in the upcoming 7.83.0 release. The experimental label means a few different things to us:

  • The API is disabled by default in the build and you need to explicitly ask for it with --enable-headers-api when you run configure
  • There are no ABI and API promises for these functions yet. We might change the functions based on feedback before we remove the label.
  • We strongly discourage anyone from shipping experimentally labeled functions in production.
  • We rely on people to enable and test this and provide feedback, to give us confidence enough to remove the experimental label as soon as possible.

We use the experimental “route” to lower the bar for merging new stuff, so that we get some extra chances to fix up mistakes before the rules and API are carved in stone and we are set to support them for a lifetime.

This setup relies on users actually trying out the experimental stuff, as otherwise it isn’t a method for improving the API; it will only delay its introduction to the general public. And the API risks ending up less good.

Documentation

The two new functions have detailed man pages: curl_easy_header and curl_easy_nextheader. If there is anything missing or unclear in there, let us know!

I have also created an initial example source snippet showing header API use. See headerapi.c.

This API deserves its own little section in the everything curl book, but I think I will wait for it to get landed “for real” before I work on adding that.

Trevlig Mjukvara

It was a while since I last spoke Swedish on a podcast. I joined the friendly hosts Sebastian and Alex of the Trevlig Mjukvara (translates to something like “Nice Software”) podcast and we talked software development, open source, curl, Mozilla and a few other topics for an hour. I had a great time. (We had Jitsi act up on us more than once so we had to switch away from it mid-recording!)

Listen to Trevlig Mjukvara s10e04. In Swedish!

I’ve also participated in a lot of other podcasts over the years.

Fedora and curl-minimal

(This blog post has been updated a few times after the initial publication.)

In the Fedora project there is/was a proposal to introduce a curl-minimal package (and its companion libcurl-minimal) by default, as a way to provide default packages with smaller security risk areas. The full curl version packages would then be offered next to the minimal ones and require users to opt-in. (Related article on lwn.net)

The proposal is for making curl-minimal the default for “non-containerized installations of Fedora”. The curl-minimal packages have existed since 2017. Kamil Dudka gave a talk about them at curl-up 2018.

curl-minimal would disable lesser used protocols and features. The discussion around exactly which parts it should disable is ongoing. The proposal to make it the default was at least initially shut down by the Fedora Steering Committee on March 8, 2022, but I get the sense the curl-minimal idea has not died yet.

Balance

The balance is really tricky, yet it seems to be the key to whether this is going to be a worthwhile effort or not.

Disabling too many things in the name of security will make many more users install the full package, and then there is no security gain.

Enabling too many things in the minimal version makes it less of a security gain to begin with.

Security

The harsh truth about past security problems in curl and libcurl is that most of them were found in components and parts that this minimal package would include.

The question is really how much a minimal package will actually save users from risk and not just cause endless amounts of friction going forward.

Not to mention that since Fedora aims to provide the full package as well, they will not avoid the risk of security problems even in the parts that are disabled in the minimal version. They can only reduce the impact of such flaws.

Features

It is really hard for packagers to know which curl features are used and which are not. There simply is no way to find out, besides shipping a version and listening to the screams of users in pain when things break. It will also force them into line-drawing decisions such as “only N users seem to use feature Z so let’s keep that in the full package”, and figuring out the N number is a fuzzy estimate at best.

Some curl features are generally assumed to be there by tools and environments. An example is how a lot of tools and services, like for example web browsers, these days offer copy-as-curl functions. They put a generated curl command line in the clipboard so that users can paste that command in a shell prompt to reproduce an operation with curl.

If those generated command lines stop working because the newly installed curl package doesn’t have feature Z enabled while the generated curl command lines use it, that’s going to make users unhappy.

The worst part for us in the curl project is probably that this is ultimately going to lead to an increased number of bug reports to the curl project because people will not understand why or how things go wrong.

Nobody asked us

Neither the curl project nor I personally have been asked or prompted for our views or feedback on this. It seems the Fedora people have not even considered the little and uncertain numbers on curl usage that exist – namely the results from the annual curl user surveys.

The 2021 analysis is here.

Update: I have been informed they are using that data and results as input. I was wrong above.

Loadable modules is not the fix

In the lwn comments on this topic, several people brought up that the curl project could “fix this” by making the support for different protocols into separate loadable modules, as then people could choose to only install the modules for the particular protocols they want.

That wouldn’t solve the issue at all. That would then instead just push users into installing several different protocol modules instead of minimal vs full. It would still be the same “this application suddenly broke because it needs YYY from libcurl”. Plus, the discussion around the curl-minimal package goes into more details and features than just protocols, and we can’t make every single feature a loadable module.

I have no intention of working on loadable modules for libcurl – for anything. That’s just a lot of work for no obvious benefit. It would introduce lots of new error and problem surfaces for users, and since it cannot be supported on all platforms it would also need to be provided conditionally.

Will curl-minimal happen?

I don’t know.

webinar: getting started with libcurl

On Thursday March 17 2022 at 09:00 PDT (16:00 UTC, 17:00 CET) I will run this free live webinar.

It is an introduction to doing Internet transfers using libcurl: maybe a 30-35 minute presentation followed by a Q&A session where I can answer any and all questions you may have.

Sign up here

The presentation will include:

  • Something about different versions of libcurl
  • How to find libcurl documentation
  • API and ABI promises
  • The different API sets within libcurl
  • API principles
  • “Easy handles” are for transfers
  • The easy interface basics
  • Creating an “easy handle”
  • Setting options in an easy handle
  • Include files to use
  • A first libcurl source code using the easy interface
  • Learn how to get and send data
  • Extract extra information from a performed transfer
  • Some hints on how to debug your libcurl-using code
  • Where and how to ask for help
  • The multi interface
  • How to do many concurrent transfers with libcurl

The webinar was recorded.

curl no clobber

Do you remember August 26 2002? I can’t say I particularly do but the curl git log remembers for us that it was on that day we added this TODO item:

Add an option that prevents cURL from overwriting existing local files. When used, and there already is an existing file with the target file name (either -O or -o), a number should be appended (and increased if already existing).

That idea hadn’t even been listed for twenty years before it was converted into code by HexTheDragon and landed in curl the other day (with this commit). To get included in the pending curl 7.83.0 release.

--no-clobber

This new command line option (curl’s 247th) is called --no-clobber and it works as suggested already back in 2002. If the output file already exists at the time when curl wants to create it, it will instead append a number to the end of the name. If that file also exists, curl retries iteratively with numbers up to 100 before it gives up and returns an error.

To help you write even cooler scripts. Oh, and the -w variable %{filename_effective} will show this actually used file name.
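The retry logic described above can be sketched in a few lines of C. This is only an illustration and not curl’s actual code: the is_taken and pick_free_name names are made up, the “name.N” format is an assumption, and the set of existing files is passed in as a plain list so that the example stays self-contained.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if 'path' is among the 'ntaken' names that already exist. */
static int is_taken(const char *path, const char *const *taken,
                    size_t ntaken)
{
  size_t i;
  for(i = 0; i < ntaken; i++)
    if(!strcmp(path, taken[i]))
      return 1;
  return 0;
}

/* Pick an output name that does not clobber an existing file. Writes the
   chosen name to 'out' and returns 0. Returns -1 if the plain name and all
   numbered variants up to 'max_tries' are taken, or if 'out' is too small.
   The "name.N" format is an assumption of this sketch. */
static int pick_free_name(const char *name,
                          const char *const *taken, size_t ntaken,
                          int max_tries, char *out, size_t outlen)
{
  int n;
  int len = snprintf(out, outlen, "%s", name);
  if(len < 0 || (size_t)len >= outlen)
    return -1;
  if(!is_taken(out, taken, ntaken))
    return 0;
  for(n = 1; n <= max_tries; n++) {
    len = snprintf(out, outlen, "%s.%d", name, n);
    if(len < 0 || (size_t)len >= outlen)
      return -1;
    if(!is_taken(out, taken, ntaken))
      return 0;
  }
  return -1;
}
```

A real tool would more likely try to create each candidate file atomically (for example with O_CREAT | O_EXCL on POSIX systems) rather than check first and create later, since another process could create the file in between.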

remove leftovers on curl error

We have just merged curl’s 246th command line option: --remove-on-error (with this commit). To be included in the upcoming curl 7.83.0 release.

This command line option is quite simple and does exactly what the name suggests. If you tell curl to download something into a local file and something goes wrong in that transfer – that makes curl return an error – this option will make curl remove that file rather than leaving the leftovers on disk in a possibly partial file.

The most basic use can look similar to:

curl --remove-on-error -O https://example.com/file

The option is in fact slightly more useful in more complicated cases, like when you want to download lots of files in parallel, some of them might fail, and you would rather only keep the files that actually were transferred successfully:

curl --parallel --remove-on-error -O 'https://example.com/file[1-999]'

Enjoy!

Deprecating things in curl

The curl project has been alive for decades. We gradually introduce new features and options into the command line tool and library over time and we work hard never to break existing behavior and keep the ABI and API stable.

Still, some features and functionalities go out of style sooner or later. Versions get deprecated, third party dependencies go stale and turn unsuitable for use.

How to discard “dead branches”

I like the mental image of curl as a big flourishing tree, with roots, the main trunk, branches and a multitude of thick green leaves.

Every once in a while some leaves or branches die. They turn all brown and dry and we need to do something about them. We trim the tree.

Dependencies go sour

A few times during curl’s life-time we have found ourselves in a position where we supported a third-party dependency for some functionality, but that library was maybe no longer a product we want to recommend our users to actually lean on. For libraries that aren’t being maintained correctly or that fall behind in other aspects and we are made aware of that fact, we need to make a decision.

By continuing to support a library, we indirectly give it our blessing. Products that no longer get updates, or that we no longer trust to keep up with the world, we need to “chop off” from the curl family tree. We have done so with a few TLS libraries over the years for example. Users that want to keep doing TLS powered protocols with curl and libcurl then “just” have to switch to a supported TLS library.

We don’t have any special mechanisms or policies to detect these kinds of expired products; we simply have to use our judgement and do the best we can.

Protocol versions go extinct

The most obvious examples here are SSL and TLS versions. Back in 1998 there actually existed servers that supported SSL version 2. Since that day, all SSL versions have been phased out from the internet and so have several TLS versions. I doubt we have seen the last deprecated protocol version, so more are likely to follow going forward.

In curl we follow the internet transfer ecosystem and in many cases we get told by the TLS libraries what curl can and cannot support. The options that we once added that ask for certain specific protocol versions thus no longer actually work for most system installs. This is rarely a problem: even if users could ask for a really old TLS version to be attempted, hardly any servers actually support those versions.

The most desperate users in niche situations can usually go build their own versions of the TLS libraries and re-enable the deprecated versions if they really need to.

We might give up on features

Sometimes we add features to libcurl for stuff that then over time never really gets used or works correctly. This can happen because the Internet world decided that the particular feature isn’t cool or even because it doesn’t work perfectly in the curl architecture. In cases where we can then remove the feature without breaking the ABI, like when the option the user sets asks for something to get used if possible, things are good. This is for example how we could remove support for HTTP/1 Pipelining from curl without breaking any promises.

Unused platforms erode

The source code for curl and libcurl is written to be extremely portable and has reportedly run on at least 86 different operating systems at some point. Platforms come and go, and popularity and support for them go up and down over time. We might add support for a popular platform one year, which later, fifteen years or so down the line is basically never used or heard of any longer. Code for platforms that we never build or verify slowly but very surely “rots” and will no longer be possible to build.

After a certain number of years without attention, the cost of keeping around code that presumably does not work anyway gets higher than the value of keeping it. If then nobody raises their hand and says they are in fact using curl on platform ZZ, we might rip out the adjustments we have in the code for that platform. We can always bring back support if someone would suddenly emerge with such a desire and plan.

In practical terms

When we figure out that there’s something in the project we think should be deprecated (a feature, a backend, something), we bring it up for discussion on the libcurl mailing list and we add it to the docs/DEPRECATE.md document, to be removed no earlier than six months later. This way, users can check this document and get informed about these plans several releases ahead.

If someone would object to a particular deprecation plan, we would take that discussion and possibly reconsider or delay the deprecation plans.

Credits

Top image by Ron Porter from Pixabay

Tree image by OpenClipart-Vectors from Pixabay

curl 7.82.0 Impartial Content

Welcome to the 206th curl release, 59 days since we shipped curl 7.81.0. The extra three days are because I was away on the day the release would normally have been done. (I call it Impartial Content as a little play on the HTTP 206 response code message.)

Download curl 7.82.0 from curl.se as always.

Numbers

the 206th release
2 changes
59 days (total: 8,751)

173 bug-fixes (total: 7,691)
266 commits (total: 28,321)
0 new public libcurl function (total: 86)
0 new curl_easy_setopt() option (total: 295)

1 new curl command line option (total: 245)
67 contributors, 39 new (total: 2,597)
43 authors, 24 new (total: 1,014)
0 security fixes (total: 111)
0 USD paid in Bug Bounties (total: 16,900 USD)

Changes

There are only two changes this time around:

The JSON option

With the new --json command line option, curl suddenly made it more convenient to send JSON from command lines and shell scripts.

MesaLink: removed support

curl supports a crazy number of different TLS libraries, but that number has now decreased by one (to 13) as we officially drop support for MesaLink. The library is not developed anymore so we don’t want to encourage future users to go down that road.

Bug-fixes

Here are some of my favorite bug-fixes from this release cycle.

bearssl

We landed three notable fixes for the bearssl backend which should make users of it happy: for cert expiration, incomplete CA certs and session resumption.

strlen call removals

After I posted a library-call count to the mailing list that showed quite a large number of calls to strlen(), a cleanup race started that subsequently reduced the number of calls by over 60% in some use cases! Primarily replaced by compile-time constants.
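The general technique of replacing a run-time strlen() on a string literal with a compile-time constant can be sketched like this. The LITERAL_LEN and HAS_PREFIX macros and the has_prefix function are names invented for this illustration and are not taken from the curl source.

```c
#include <stddef.h>
#include <string.h>

/* Length of a string literal, computed by the compiler. sizeof counts the
   terminating null byte, hence the -1. Only valid for literals and arrays,
   not for char pointers. */
#define LITERAL_LEN(s) (sizeof(s) - 1)

/* Returns 1 if the first 'len' bytes of 'line' start with 'prefix'. The
   caller passes the prefix length, so no strlen() call is needed here. */
static int has_prefix(const char *line, size_t len,
                      const char *prefix, size_t plen)
{
  return len >= plen && !memcmp(line, prefix, plen);
}

/* Convenience wrapper that expands a literal into pointer plus length. */
#define HAS_PREFIX(line, len, lit) \
  has_prefix(line, len, lit, LITERAL_LEN(lit))
```

A call such as HAS_PREFIX(line, len, "Location:") then passes the constant 9 as the prefix length instead of counting the bytes every time the function runs.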

configure requires --with-nss-deprecated

To build curl with the NSS backend using configure, you now need to confirm this choice by also passing on --with-nss-deprecated to make it clear to users what the future looks like for our NSS support.

erase some more sensitive command line arguments

After a lengthy discussion we now hide even more command line arguments from appearing in ps output (on systems that support it). Since the hiding is done by curl itself, there is still a short moment during which they will be visible, and we cannot hide everything, so there is still a risk that some argument might leak information unintentionally. That is the nature of command line arguments. Use the config file concept or stdin etc to work around that.

NPN is deprecated

The TLS extension NPN is now marked “deprecated” and is scheduled for removal in six months unless someone yells very loudly and explains why not. This extension was once used to negotiate SPDY and early HTTP/2 but has no purpose these days. The browsers removed support for it several years ago.

allow CURLOPT_HTTPHEADER change “:scheme”

The only pseudo header for HTTP/2 and HTTP/3 that couldn’t be modified by a user can now be changed at will.

remove support for TPF, Netware, vxWorks

Support for these platforms, for which the code hasn’t been modified for the last decade or so and is therefore highly unlikely to still work, was dropped. After this, I had it confirmed that you can still build curl for vxWorks using the “regular” build!

remove support for CURL_DOES_CONVERSIONS

After support for TPF was dropped, we took the next step and removed support for the charset conversion functions necessary to run curl on non-ASCII platforms, such as those using EBCDIC. As TPF was the last such platform we supported, this cleanup improved lots of code paths.

allow user callbacks to call curl_multi_assign

A regression in 7.81.0 made curl_multi_assign() return an error if used from a callback.

http3: quiche and ngtcp2 fixes

We landed several fixes in both HTTP/3 backends, improving the situation for everyone who plays HTTP/3 with curl.

reduce memory use when FTP is disabled

After several cleanups the total memory footprint for builds with FTP and/or proxy support disabled has been reduced.

check for ~/.config/curlrc too

curl now also checks for its default “config file” in the path mentioned above.

DNS options that need c-ares now fail without it

The command line tool offers a set of options to control DNS-specific details. Since those options only work if libcurl was built to use c-ares, and not at all if it was built to use another resolver backend, curl now correctly returns an error when one of those options is used and libcurl can’t execute it.

keep trailing dot in host name

If there is a trailing dot after the host name in the URL, that dot is now kept in the name when used everywhere internally – except for the SNI field in TLS.
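The SNI exception can be pictured as a tiny helper that produces the name to send in the TLS handshake. This is a hypothetical sketch, with sni_name being a made-up function rather than anything from curl; it only drops a single trailing dot and leaves the original host name untouched for every other use.

```c
#include <stddef.h>
#include <string.h>

/* Copy 'host' into 'out' for use as the TLS SNI value, leaving out a
   single trailing dot if there is one. Returns 0 on success or -1 if
   'out' is too small to hold the name. */
static int sni_name(const char *host, char *out, size_t outlen)
{
  size_t len = strlen(host);
  if(len && host[len - 1] == '.')
    len--;                 /* the SNI field carries no trailing dot */
  if(len >= outlen)
    return -1;
  memcpy(out, host, len);
  out[len] = '\0';
  return 0;
}
```

Calling it with “example.com.” yields “example.com”, while a name without a trailing dot is copied unchanged.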

wolfssl: when SSL_read() returns zero, check the error

Although obviously very rare, curl could wrongly report an “end of transfer” prematurely before this fix.

Next

The next release is scheduled to ship on April 27, 2022.

curl on “software at scale”

I was a guest on the Software at Scale podcast a while ago and the recording went up recently. We talked about a lot of things curl, including:

  • The complexity behind HTTP. What goes on behind the scenes when I make a web request?
  • The organizational work behind internet-wide RFCs, like HTTP/3.
  • Rust in curl. The developer experience, and the overall experience of integrating Hyper.
  • WebSockets support in curl
  • Fostering an open-source community.
  • People around the world think Daniel has hacked their system, because of the curl license often included in malicious tools.
  • Does curl have a next big thing?

Listen to it here.

Also: links to all my guest appearances on podcasts and shows.