All posts by Daniel Stenberg

curl 7.77.0 – 200 OK

Welcome to the 200th curl release. We call it 200 OK. It coincides with us counting more than 900 commit authors and surpassing 2,400 credited contributors in the project. This is also the first release ever in which we thank more than 80 persons in the RELEASE-NOTES for having helped out making it and we’ve set two new record in the bug-bounty program: the largest single payout ever for a single bug (2,000 USD) and the largest total payout during a single release cycle: 3,800 USD.

This release cycle was 42 days only, two weeks shorter than normal due to the previous 7.76.1 patch release.

Release Presentation

Numbers

the 200th release
5 changes
42 days (total: 8,468)

133 bug-fixes (total: 6,966)
192 commits (total: 27,202)
0 new public libcurl function (total: 85)
2 new curl_easy_setopt() option (total: 290)

2 new curl command line option (total: 242)
82 contributors, 44 new (total: 2,410)
47 authors, 23 new (total: 901)
3 security fixes (total: 103)
3,800 USD paid in Bug Bounties (total: 9,000 USD)

Security

We set two new records in the curl bug-bounty program this time as mentioned above. These are the issues that made them happen.

CVE-2021-22901: TLS session caching disaster

This is a Use-After-Free in the OpenSSL backend code that in the absolutely worst case can lead to an RCE, a Remote Code Execution. The flaw is reasonably recently added and it’s very hard to exploit but you should upgrade or patch immediately.

The issue occurs when TLS session related info is sent from the TLS server when the transfer that previously used it is already done and gone.

The reporter was awarded 2,000 USD for this finding.

CVE-2021-22898: TELNET stack contents disclosure

When libcurl accepts custom TELNET options to send to the server, it the input parser was flawed which could be exploited to have libcurl instead send contents from the stack.

The reporter was awarded 1,000 USD for this finding.

CVE-2021-22897: schannel cipher selection surprise

In the Schannel backend code, the selected cipher for a transfer done with was stored in a static variable. This caused one transfer’s choice to weaken the choice for a single set transfer could unknowingly affect other connections to a lower security grade than intended.

The reporter was awarded 800 USD for this finding.

Changes

In this release we introduce 5 new changes that might be interesting to take a look at!

Make TLS flavor explicit

As explained separately, the curl configure script no longer defaults to selecting a particular TLS library. When you build curl with configure now, you need to select which library to use. No special treatment for any of them!

No more SSL

curl now has no more traces of support for SSLv2 or SSLv3. Those ancient and insecure SSL versions were already disabled by default by TLS libraries everywhere, but now it’s also impossible to activate them even in special build. Stripped out from both the curl tool and the library (thus counted as two changes).

HSTS in the build

We brought HSTS support a while ago, but now we finally remove the experimental label and ship it enabled in the build by default for everyone to use it more easily.

In-memory cert API

We introduce API options for libcurl that allow users to specify certificates in-memory instead of using files in the file system. See CURLOPT_CAINFO_BLOB.

Favorite bug-fixes

Again we manage to perform a large amount of fixes in this release, so I’m highlighting a few of the ones I find most interesting!

Version output

The first line of curl -V output got updated: libcurl now includes OpenLDAP and its version of that was used in the build, and then the curl tool can add libmetalink and its version of that was used in the build!

curl_mprintf: add description

We’ve provided the *printf() clone functions in the API since forever, but we’ve tried to discourage users from using them. Still, now we have a first shot at a man page that clearly describes how they work.

This is important as they’re not quite POSIX compliant and users who against our advice decide to rely on them need to be able to know how they work!

CURLOPT_IPRESOLVE: preventing wrong IP version from being used

This option was made a little stricter than before. Previously, it would be lax about existing connections and prefer reuse instead of resolving again, but starting now this option makes sure to only use a connection with the request IP version.

This allows applications to explicitly create two separate connections to the same host using different IP versions when desired, which previously libcurl wouldn’t easily let you do.

Ignore SIGPIPE in curl_easy_send

libcurl makes its best at ignoring SIGPIPE everywhere and here we identified a spot where we had missed it… We also made sure to enable the ignoring logic when built to use wolfSSL.

Several HTTP/2-fixes

There are no less than 6 separate fixes mentioned in the HTTP/2 module in this release. Some potential memory leaks but also some more behavior improving things. Possibly the most important one was the move of the transfer-related error code from the connection struct to the transfers struct since it was vulnerable to a race condition that could make it wrong. Another related fix is that libcurl no longer forcibly disconnects a connection over which a transfer gets HTTP_1_1_REQUIRED returned.

Partial CONNECT requests

When the CONNECT HTTP request sent to a proxy wasn’t all sent in a single send() call, curl would fail. It is baffling that this bug hasn’t been found or reported earlier but was detected this time when the reporter issued a CONNECT request that was larger than 16 kilobytes…

TLS: add USE_HTTP2 define

There was several remaining bad assumptions that HTTP/2 support in curl relies purely on nghttp2. This is no longer true as HTTP/2 support can also be provide by hyper.

normalize numerical IPv4 hosts

The URL parser now knows about the special IPv4 numerical formats and parses and normalizes URLs with numerical IPv4 addresses.

Timeout, timed out libssh2 disconnects too

When libcurl (built with libssh2 support) stopped an SFTP transfer because a timeout was triggered, the following SFTP disconnect procedure was subsequently also stopped because of the same timeout and therefore wasn’t allowed to properly clean up everything, leading to a memory-leak!

IRC network switch

We moved the #curl IRC channel to the new network libera.chat. Come join us there!

Next release

On Jul 21, 2021 we plan to ship the next release. The version number for that is not yet decided but we have changes in the pipeline, making a minor version number bump very likely.

Credits

7.77.0 release image by Filip Dimitrovski.

The curl user survey 2021

For the eighth consecutive year we run the annual curl user survey again in 2021. The form just went up and I would love to have you spend 10 minutes of your busy life to tell us how you think curl works, what doesn’t work and what we should do next.

We have no tracking on the website and we have no metrics or usage measurements of the curl tool or the libcurl library. The only proper way we have left to learn how users and people in general think of us and how curl works, is to ask. So this is what we do, and we limit the asking to once per year.

You can also view this from your own “selfish” angle: this is a way for you to submit your input, your opinions and we will listen.

The survey will be up two weeks during which I hope to get as many people as possible to respond. If you have friends you know use curl or libcurl, please have them help us out too!

Take the survey

Yes really, please take the survey!

Bonus: see the extensive analysis of the 2020 user survey. There’s a lot of user feedback to learn from it.

“I could rewrite curl”

Collected quotes and snippets from people publicly sneezing off or belittling what curl is, explaining how easy it would be to make a replacement in no time with no effort or generally not being very helpful.

These are statements made seriously. For all I know, they were not ironic. If you find others to add here, please let me know!

Listen. I’ve been young too once and I’ve probably thought similar things myself in the past. But there’s a huge difference between thinking and saying. Quotes included here are mentioned for our collective amusement.

I can do it in less than a 100 lines

[source]

I can do it in a three day weekend

(The yellow marking in the picture was added by me.)

[source]

No reason to be written in C

Maybe not exactly in the same category as the two ones above, but still a significant “I know this” vibe:

[source]

We sold a curl exploit

Some people deliberately decides to play for the other team.

[source]

This isn’t a big deal

It’s easy to say things on Twitter…

This tweet was removed by its author after I and others replied to it so I cannot link it. The name has been blurred on purpose because of this.

100 lines of Python

“I think you could replace 99% of the uses of Curl … with like 100 lines of Python or Rust or Go”

[source]

I should just fix curl

I reported a vulnerable old curl installation being hosted by nuget, only to get this user tell me… to fix the vulnerabilities we already fixed long ago.

Discussions

Hacker news, Reddit

200 OK

One day in March 1998 I released a little file transfer tool I called curl. The first ever curl release. That was good.

10

By the end of July the same year, I released the 10th curl release. I’ve always believed in release early release often as a service to users and developers alike.

20

In December 1998 I released the 20th curl release. I started to get a hang of this.

50

In January 2001, not even three years in, we shipped the 50th curl release (version 7.5.2). We were really cramming them out back then!

200

Next week. 23 years, two months and six days after the first release, we will ship the 200th curl release. We call it curl 7.77.0.

Yes, there are exactly 200 stickers used in the photo. But the visual comparison with 50 is also apt: it isn’t that big difference seen from a distance.

I’ve personally done every release to date, but there’s nothing in the curl release procedure that says it has to be me, as long as the uploader has access to put the new packages on the correct server.

The fact that 200 is an HTTP status code that is indicating success is an awesome combination.

Release cadence

In 2014 we formally switched to an eight week release cycle. It was more or less what we already used at the time, but from then on we’ve had it documented and we’ve tried harder to stick to it.

Assuming no alarmingly bad bugs are found, we let 56 days pass until we ship the next release. We occasionally slip up and fail on this goal, and then we usually do a patch release and cut the next cycle short. We never let the cycle go longer than those eight weeks. This makes us typically manage somewhere between 6 and 10 releases per year.

Lessons learned

  • Make a release checklist, and stick to that when making releases
  • Update the checklist when needed
  • Script as much as possible of the procedure
  • Verify the release tarballs/builds too in CI
  • People never test your code properly until you actually release
  • No matter how hard you try, some releases will need quick follow-up patch releases
  • There is always another release
  • Time-based release scheduling beats feature-based

curl -G vs curl -X GET

(This is a repost of a stackoverflow answer I once wrote on this topic. Slightly edited. Copied here to make sure I own and store my own content properly.)

curl knows the HTTP method

You normally use curl without explicitly saying which request method to use.

If you just pass in a HTTP URL like curl http://example.com, curl will use GET. If you use -d or -F curl will use POST, -I will cause a HEAD and -T will make it a PUT.

If for whatever reason you’re not happy with these default choices that curl does for you, you can override those request methods by specifying -X [WHATEVER]. This way you can for example send a DELETE by doing curl -X DELETE [URL].

It is thus pointless to do curl -X GET [URL] as GET would be used anyway. In the same vein it is pointless to do curl -X POST -d data [URL]... But you can make a fun and somewhat rare request that sends a request-body in a GET request with something like curl -X GET -d data [URL].

Digging deeper

curl -GET (using a single dash) is just wrong for this purpose. That’s the equivalent of specifying the -G, -E and -T options and that will do something completely different.

There’s also a curl option called --get to not confuse matters with either. It is the long form of -G, which is used to convert data specified with -d into a GET request instead of a POST.

(I subsequently used this answer to populate the curl FAQ to cover this.)

Warnings

Modern versions of curl will inform users about this unnecessary and potentially harmful use of -X when verbose mode is enabled (-v) – to make users aware. Further explained and motivated here.

-G converts a POST + body to a GET + query

You can ask curl to convert a set of -d options and instead of sending them in the request body with POST, put them at the end of the URL’s query string and issue a GET, with the use of `-G. Like this:

curl -d name=daniel -d grumpy=yes -G https://example.com/

… which does the exact same thing as this command:

curl https://example.com/?name=daniel&grumpy=yes

The libcurl transfer state machine

I’ve worked hard on making the presentation I ended up calling libcurl under the hood. A part of that presentation is spent on explaining the main libcurl transfer state machine and here I’ll try to document some of what, in a written form. Understanding the main transfer state machine in libcurl could be valuable and interesting for anyone who wants to work on libcurl internals and maybe improve it.

Background

The state is kept in easy handle in the struct field called mstate. The source file for this state machine is called multi.c.

An easy handle is always in exactly one of these states for as long as it exists.

This transfer state machine is designed to work for all protocols libcurl supports, but basically no protocol will transition through all states. As you can see in the drawing, there are many different possible transitions from a lot of the states.

libcurl transfer state machine

(click the image for a larger version)

Start

A transfer starts up there above the surface in the INIT state. That’s a yellow box next to the little start button. Basically the boat shows how it goes from INIT to the right over to MSGSENT with it’s finish flag, but the real path is all done under the surface.

The yellow boxes (states) are the ones that exist before or when a connection is setup. The striped background is for all states that has a single and specific connectdata struct associated with the transfer.

CONNECT

If there’s a connection limit, either in total or per host etc, the transfer can get sent to the PENDING state to wait for conditions to change. If not, the state probably moves on to one of the blue ones to resolve host name and connect to the server etc. If a connection could be reused, it can shortcut immediately over to the green DO state.

The green states are all about setting up the connection to a state of fully connected, authenticated and logged in. Ready to send the first request.

DO

The green DO states are all about sending the request with one or more commands so that the file transfer can begin. There are several such states to properly support all protocols but also for historical reasons. We could probably remove a state there by some clever reorgs if we wanted.

PERFORMING

When a request has been issued and the transfer starts, it transitions over to PERFORMING. In the white states data is flowing. Potentially a lot. Potentially in both or either direction. If during the transfer curl finds out that the transfer is faster than allowed, it will move into RATELIMITING until it has cooled down a bit.

DONE

All the post-transfer states are red in the picture. The DONE is the first of them and after having done what it needs to round up the transfer, it disassociates with the connection and moves to COMPLETED. There’s no stripes behind that state. Disassociate here means that the connection is returned back to the connection pool for later reuse, or in the worst case if deemed that it can’t be reused or if the application has instructed it so, closed.

As you’ll note, there’s no disconnect anywhere in the state machine. This is simply because the disconnect is not really a part of the transfer at all.

COMPLETED

This is the end of the road. In this state a message will be created and put in the outgoing queue for the application to read, and then as a final last step it moves over to MSGSENT where nothing more happens.

A typical handle remains in this state until the transfer is reused and restarted, in which it will be set back to the INIT state again and the journey begins again. Possibly with other transfer parameters and URL this time. Or perhaps not.

State machines within each state

What this state diagram and explanation doesn’t show is of course that in each of these states, there can be protocol specific handling and each of those functions might in themselves of course have their own state machines to control what to do and how to handle the protocol details.

Each protocol in libcurl has its own “protocol handler” and most of the protocol specific stuff in libcurl is then done by calls from the generic parts to the protocol specific parts with calls like protocol_handler->proto_connect() that calls the protocol specific connection procedure.

This allows the generic state machine described in this blog post to not really know the protocol specifics and yet all the currently support 26 transfer protocols can be supported.

libcurl under the hood – the video

Here’s the full video of libcurl under the hood.

If you want to skip directly to the state machine diagram and the following explanation, go here.

Credits

Image by doria150 from Pixabay

curl up 2021

curl up 2021 happened today.

We had five presentations done, all prerecorded and made available before the event. At the Sunday afternoon we gathered to discuss the presentations and everything around those topics.

The presentations

  1. The state of curl 2021 – Daniel Stenberg
  2. curl security 2021 – Daniel Stenberg
  3. libcurl under the hood – Daniel Stenberg
  4. Interfacing rust – Stefan Eissing
  5. Curl profiling – Jim Fuller.

Discussions

We were not very many who actually joined the meeting, and out of the people in the meeting a majority decided to be spectators only and remained muted with their cameras off.

It turned out as a two hour long mostly casual talk among me, Stefan Eissing and Emil Engler about the presentations and related topics. Toward the end, Kamil Dudka appeared.

The three of us get to talk about roadmap items, tests, security, writing code that interfaces modules written in rust and what more details in the libcurl internals that could use further descriptions and documentation.

The video

The agenda in the video is roughly following the agenda order in the 2021 wiki page and the discussion topics mentioned there.

Sponsored

Thanks to wolfSSL for sponsoring the video meeting account used!

curl pictures

“Memes” or other fun images involving curl. Please send or direct me to other ones you think belong in this collection! Kept here solely to boost my ego.

All modern digital infrastructure

This is the famous xkcd strip number 2347, modified to say Sweden and 1997 by @tsjost. I’ve seen this picture taking some “extra rounds” in various places, somehow also being claimed to be xkcd 2347 when people haven’t paid attention to the “patch” in the text.

Entire web infrastructure

Image by @matthiasendler

Car contract

This photo of a rental car contract with an error message on the printed paper was given to me by a good person I’ve unfortunately lost track of.

The developer dice

Thanks to Cassidy. (For purchase here.)

Don’t use -X

Remember that using curl -X is very often just the wrong thing to do. Jonas Forsberg helps us remember:

The curl

In an email from NASA that I received and shared, the person asked about details for “the curl”.

Image by eichkat3r at mastodon.

You’re sure this is safe?

Piping curl output straight into a shell is a much debated practice…

Picture by Tim Chase.

curl, reinvented by…

Remember the powershell curl alias?

Picture by Shashimal Senarath.

Billboard

This is an old classic:

curl please

Related

Screenshotted curl credits.

curl is just the hobby

Every base is base 10

This image originally comes from cowbirdsinlove.com but sadly it seems the page that once showed it is no longer there. I saved it from that site already back in 2015, but I cannot recall the exact URL it used. The image is still available at https://cowbirdsinlove.com/comics/base10[1].png.

Since I consider this picture such an iconic classic and masterpiece, I decided I better host it here in a small attempt to preserve it for everyone to enjoy.

Because, you know, every base is base 10.

Update: the original page on archive.org.

fixed vulnerabilities were once created

In the curl project we make great efforts to store a lot of meta data about each and every vulnerability that we have fixed over the years – and curl is over 23 years old. This data set includes CVE id, first vulnerable version, last vulnerable version, name, announce date, report to the project date, CWE, reward amount, code area and “C mistake kind”.

We also keep detailed data about releases, making it easy to look up for example release dates for specific versions.

Dashboard

All this, combined with my fascination (some would call it obsession) of graphs is what pushed me into creating the curl project dashboard, with an ever-growing number of daily updated graphs showing various data about the curl projects in visual ways. (All scripts for that are of course also freely available.)

What to show is interesting but of course it is sometimes even more important how to show particular data. I don’t want the graphs just to show off the project. I want the graphs to help us view the data and make it possible for us to draw conclusions based on what the data tells us.

Vulnerabilities

The worst bugs possible in a project are the ones that are found to be security vulnerabilities. Those are the kind we want to work really hard to never introduce – but we basically cannot reach that point. This special status makes us focus a lot on these particular flaws and we of course treat them special.

For a while we’ve had two particular vulnerability graphs in the dashboard. One showed the number of fixed issues over time and another one showed how long each reported vulnerability had existed in released source code until a fix for it shipped.

CVE age in code until report

The CVE age in code until report graph shows that in general, reported vulnerabilities were introduced into the code base many years before they are found and fixed. In fact, the all time average time suggests they are present for more than 2,700 – more than seven years. Looking at the reports from the last 12 months, the average is even almost 1000 days more!

It takes a very long time for vulnerabilities to get found and reported.

When were the vulnerabilities introduced

Just the other day it struck me that even though I had a lot of graphs already showing in the dashboard, there was none that actually showed me in any nice way at what dates we created the vulnerabilities we spent so much time and effort hunting down, documenting and talking about.

I decided to use the meta data we already have and add a second plot line to the already existing graph. Now we have the previous line (shown in green) that shows the number of fixed vulnerabilities bumped at the date when a fix was released.

Added is the new line (in red) that instead is bumped for every date we know a vulnerability was first shipped in a release. We know the version number from the vulnerability meta data, we know the release date of that version from the release meta data.

This all new graph helps us see that out of the current 100 reported vulnerabilities, half of them were introduced into the code before 2010.

Using this graph it also very clear to me that the increased CVE reporting that we can spot in the green line started to accelerate in the project in 2016 was not because the bugs were introduced then. The creation of vulnerabilities rather seem to be fairly evenly distributed over time – with occasional bumps but I think that’s more related to those being particular releases that introduced a larger amount of features and code.

As the average vulnerability takes 2700 days to get reported, it could indicate that flaws landed since 2014 are too young to have gotten reported yet. Or it could mean that we’ve improved over time so that new code is better than old and thus when we find flaws, they’re more likely to be in old code paths… I don’t think the red graph suggests any particular notable improvement over time though. Possibly it does if we take into account the massive code growth we’ve also had over this time.

The green “fixed” line at least has a much better trend and growth angle.

Present in which releases

As we have the range of vulnerable releases stored in the meta data file for each CVE, we can then add up the number of the flaws that are present in every past release.

Together with the release dates of the versions, we can make a graph that shows the number of reported vulnerabilities that are present in each past release over time, in a graph.

You can see that some labels end up overwriting each other somewhat for the occasions when we’ve done two releases very close in time.

curl security 2021