Tag Archives: cURL and libcurl

curl is 18 years old tomorrow

Another notch on the wall as we’ve reached the esteemed age of 18 years in the cURL project. 9 releases were shipped since our last birthday and we managed to fix no less than a total of 457 bugs in that time.

18notches

On this single day in history…

20,000 persons will be visiting the web site, transferring over 4GB of data.

1.3 bug fixes will get pushed to the git repository (out of the 3 commits made)

300 git clones are made of the curl source tree, by 100 unique users.

4000 curl source archives will be downloaded from the curl web site

8 mails get posted on the curl mailing lists (at least one of them will be posted by me).

I will spend roughly 2 hours on curl related work. Mostly answering mail, bug reports and debugging, but also maintaining infrastructure, poke on the web site and if lucky, actually spending a few minutes writing new code.

Every human in the connected world will use at least one service,  tool or application that runs curl.

Happy birthday to us all!

HTTP redirects

I find that many web minded people working client-side or even server-side have neglected to learn the subtle details of the redirects of today. Here’s my attempt at writing another text about it that the ones who should read it still won’t.

Nothing here, go there!

The “redirect” is a fundamental part of the HTTP protocol. The concept was present and is documented already in the first spec (RFC 1945), published in 1996, and it has remained well used ever since.

A redirect is exactly what it sounds like. It is the sredirect-signerver sending back an instruction to the client – instead of giving back the contents the client wanted. The server basically says “go look over [here] instead for that thing you asked for“.

But not all redirects are alike. How permanent is the redirect? What request method should the client use in the next request?

All redirects also need to send back a Location: header with the new URI to ask for, which can be absolute or relative.

Permanent or Temporary

Is the redirect meant to last or just remain for now? If you want a GET to resource A permanently redirect users to resource B with another GET, send back a 301. It also means that the user-agent (browser) is meant to cache this and keep going to the new URI from now on when the original URI is requested.

The temporary alternative is 302. Right now the server wants the client to send a GET request to B, but it shouldn’t cache this but keep trying the original URI when directed to it.

Note that both 301 and 302 will make browsers do a GET in the next request, which possibly means changing method if it started with a POST (and only if POST). This changing of the HTTP method to GET for 301 and 302 responses is said to be “for historical reasons”, but that’s still what browsers do so most of the public web will behave this way.

In practice, the 303 code is very similar to 302. It will not be cached and it will make the client issue a GET in the next request. The differences between a 302 and 303 are subtle, but 303 seems to be more designed for an “indirect response” to the original request rather than just a redirect.

These three codes were the only redirect codes in the HTTP/1.0 spec.

GET or POST?

All three of these response codes, 301 and 302/303, will assume that the client sends a GET to get the new URI, even if the client might’ve sent a POST in the first request. This is very important, at least if you do something that doesn’t use GET.

If the server instead wants to redirect the client to a new URI and wants it to send the same method in the second request as it did in the first, like if it first sent POST it’d like it to send POST again in the next request, the server would use different response codes.

To tell the client “the URI you sent a POST to, is permanently redirected to B where you should instead send your POST now and in the future”, the server responds with a 308. And to complicate matters, the 308 code is only recently defined (the spec was published in June 2014) so older clients may not treat it correctly! If so, then the only response code left for you is…

The (older) response code to tell a client to send a POST also in the next request but temporarily is 307. This redirect will not be cached by the client though so it’ll again post to A if requested to. The 307 code was introduced in HTTP/1.1.

Oh, and redirects work the exact same way in HTTP/2 as they do in HTTP/1.1.

The helpful table version

Permanent Temporary
Switch to GET 301 302 and 303
Keep original method 308 307

It’s a gap!

Yes. The 304, 305, and 306 codes are not used for redirects at all.

What about other HTTP methods?

They don’t change methods! This table above is only for changing from POST to GET, other methods will not change.

curl and redirects

I couldn’t write a text like this without spicing it up with some curl details!

First, curl and libcurl don’t follow redirects by default. You need to ask curl to do it with -L (or –location) or libcurl with CURLOPT_FOLLOWLOCATION.

It turns out that there are web services out there in the world that want a POST sent, are responding with HTTP redirects that use a 301, 302 or 303 response code and still want the HTTP client to send the next request as a POST. As explained above, browsers won’t do that and neither will curl – by default.

Since these setups exist, and they’re actually not terribly rare, curl offers options to alter its behavior.

You can tell curl to not change the POST request method to GET after a 30x response by using the dedicated options for that:
–post301, –post302 and –post303. If you are instead writing a libcurl based application, you control that behavior with the CURLOPT_POSTREDIR option.

Here’s how a simple HTTP/1.1 redirect can look like. Note the 301, this is “permanent”:
curl-shows-redirect

Survey: a curl related event?

Call it a conference, a meetup or a hackathon. As curl is about to turn 18 years next month, I’m checking if there’s enough interest to try to put together a physical event to gather curl hackers and fans somewhere at some point. We’ve never done it in the past. Is the time ripe now?

Please tell us your views on this by filling out this survey that we run during this week only!

“Subject: Urgent Warning”

Back in December I got a desperate email from this person. A woman who said her Instagram had been hacked and since she found my contact info in the app she mailed me and asked for help. I of course replied and said that I have nothing to do with her being hacked but I also have nothing to do with Instagram other than that they use software I’ve written.

Today she writes back. Clearly not convinced I told the truth before, and now she strikes back with more “evidence” of my wrongdoings.

Dear Daniel,

I had emailed you a couple months ago about my “screen dumps” aka screenshots and asked for your help with restoring my Instagram account since it had been hacked, my photos changed, and your name was included in the coding. You claimed to have no involvement whatsoever in developing a third party app for Instagram and could not help me salvage my original Instagram photos, pre-hacked, despite Instagram serving as my Photography portfolio and my career is a Photographer.

Since you weren’t aware that your name was attached to Instagram related hacking code, I thought you might want to know, in case you weren’t already aware, that your name is also included in Spotify terms and conditions. I came across this information using my Spotify which has also been hacked into and would love your help hacking out of Spotify. Also, I have yet to figure out how to unhack the hackers from my Instagram so if you change your mind and want to restore my Instagram to its original form as well as help me secure my account from future privacy breaches, I’d be extremely grateful. As you know, changing my passwords did nothing to resolve the problem. Please keep in mind that Facebook owns Instagram and these are big companies that you likely don’t want to have a trail of evidence that you are a part of an Instagram and Spotify hacking ring. Also, Spotify is a major partner of Spotify so you are likely familiar with the coding for all of these illegally developed third party apps. I’d be grateful for your help fixing this error immediately.

Thank you,

[name redacted]

P.S. Please see attached screen dump for a screen shot of your contact info included in Spotify (or what more likely seems to be a hacked Spotify developed illegally by a third party).

Spotify credits screenshot

Here’s the Instagram screenshot she sent me in a previous email:

Instagram credits screenshot

I’ve tried to respond with calm and clear reasonable logic and technical details on why she’s seeing my name there. That clearly failed. What do I try next?

Tales from my inbox, part++

“Josh” sent me an email. Pardon the language here but I decided to show the mail body unaltered:

From: Josh Yanez <a gmail address>
Date: Wed, 6 Jan 2016 22:27:13 -0800
To: daniel
Subject: Hey fucker

I got all your fucking info either you turn yourself in or ill show it to the police. You think I'm playing try me I got all your stupid little coding too.

Sent from my iPhone

This generates so many questions

  1. I’ve had threats mailed to be before (even done over phone) so this is far from the first time. The few times I’ve bothered to actually try to understand what these people are hallucinating about, it usually turns out that they’ve discovered that someone has hacked them or targeted them in some sort of attack and curl was used and I am the main author so I’m the bad guy.
  2. He has all my “info” and my “stupid little coding too” ? What “coding” could that be? What is all my info?
  3. Is this just a spam somehow that wants me to reply? It is directed to me only and I’ve not heard of anyone else who got a mail similar to this.
  4. The lovely “Sent from my iPhone” signature is sort of hilarious too after such an offensive message.

Very aware this could just as well suck me into a deep and dark hole of sadness, I was just too curious to resist so I responded. Unfortunately I didn’t get anything further back so the story thus ends here, a bit abrupt. 🙁

A 2015 retrospective

Wow, another year has passed. Summing up some things I did this year.

Commits

I don’t really have good global commit count for the year, but github counts 1300 commits and I believe the vast majority of my commits are hosted there. Most of them in curl and curl-oriented projects.

We did 8 curl releases during the year featuring a total of 575 bug fixes. The almost 1,200 commits were authored by 107 different individuals.

Books

I continued working on http2 explained during the year, and after having changed to markdown format it is now available in more languages than ever thanks to our awesome translators!

I started my second book project in the fall of 2015, using the working title everything curl, which is a much larger book effort than the HTTP/2 book and after having just passed 23,500 words that create over 110 pages in the PDF version, almost half of the planned sections are still left to write…

Twitter

I almost doubled my number of twitter followers during this year, now at 2,850 something. While this is a pointless number, reaching out slightly further does have the advantage that I get better responses and that makes me appreciate and get more out of twitter.

Stackoverflow

I’ve continued to respond to questions there, and my total count is now at 550 answers, out of which I wrote about 80 this year. The top scored answer I wrote during 2015 is for a question that isn’t phrased like one: Apache and HTTP2.

Keyboard use

I’ve pressed a bit over 6.4 million keys on my primary keyboard during the year, and 10.7% of the keys were pressed on weekends.

During the 2900+ hours when at least one key press were registered, I averaged on 2206 key presses per hour.

The most excessive key banging hour of the year started  September 21 at 14:00 and ended with me reaching 10,875 key presses.

The most excessive day was June 9, during which I pushed 63,757 keys.

Talks

This is all the 16 opportunities where I’ve talked in front of an audience during 2015. As you will see, the list of topics were fairly limited…

Daniel talking at Apachecon 2015

curl and HTTP/2 by default

cURL

Followers of this blog know that I’ve dabbled with HTTP/2 stuff for quite some time, and curl got its initial support for the new protocol version already in September 2013.

curl shipped “proper” HTTP/2 support as it looks in the final specification both for the command line tool and the libcurl library before any browsers did in their release versions. (Firefox was the first browser to ship HTTP/2 in a release version, followed by Chrome. Both did this in the beginning of 2015.)

libcurl features an option that lets the application to select HTTP version to use, and that includes HTTP/2 since back then. The command line tool got a corresponding command line option (aptly named –http2) to switch on this protocol version.

This goes hand in hand with curl’s general philosophy that it just does the basics and you have to specifically switch on more features and tell it to enable things you want to use. This conservative approach makes it very reliable protocol-wise and provides applications a very large level of control. The downside is of course that fewer people switch on certain features since they’re just not aware of them. Or as in this case with HTTP/2, it also complicates matters that only a subset of users still have a HTTP/2 tool and library since they might still run outdated versions or they may run recent versions that were built without the necessary prerequisites present (basically the nghttp2 library).

By default?

libcurl is even more conservative that the curl tool so switching default for the library isn’t really on the agenda yet. We are very careful of modifying behavior so we’re saving that for later but what about upping the curl tool a notch?

We could switch the default to use HTTP/2 as soon as the tool has the powers built-in. But for regular clear text HTTP, using the Upgrade: header has a few drawbacks. It makes the requests larger, it complicates matter somewhat that most servers don’t do upgrades on HTTP POST requests (and a few others) so there might indeed be several requests before an upgrade is actually made even on a server that supports HTTP/2 and perhaps the strongest reason: we already found servers that (wrongly, I would claim) reject requests with Upgrade: headers they don’t understand. All this taken together, Upgrade over HTTP will break too many requests that work with HTTP 1.1. And then we haven’t even considered how the breakage situation will be when using explicit or transparent proxies…

By default!

To help users with this problem of HTTP upgrades not being feasible by default, we’ve just landed a new alternative to the “set HTTP version” that only sets HTTP/2 for HTTPS requests and leaves it to HTTP/1.1 for clear text HTTP requests. This option will ship in the next release, to be called 7.47.0, and can of course be tested out before that with git or daily snapshot builds.

Setting this option is next to risk-free, as the HTTP/2 negotiation in TLS is based on one or two TLS extensions (NPN and ALPN) that both have proper fallbacks to 1.1.

Said and done. The curl tool now sets this option. Using the curl tool on a HTTPS:// URL will from now on use HTTP/2 by default as soon as both the libcurl it uses and the server it connects to support HTTP/2!

We will of course keep our eyes and ears open to see if this causes any problems. Let us know what you see!

Everything curl – work in progress

everything-curl-cover

… is a book I’m slowly working on. Click the image above to see it in its current state.  It is not complete.

As the title should hint, I intend to cover just about everything that is to say about curl. The project, the products, the development, the source code, its history, its future, the policies, the ideas and whatever else that I can think of has anything to do with curl.

The book is completely open and available for free – in a variety of formats. When I write this, there are about 60 pages and almost 13,000 words written. There are 220+ sections or sub chapters planned (so far) out of which 111 are still to be written. Of course that doesn’t really mean that the 115 already written ones are complete or without flaws that need to be corrected. I also suspect I’ve written the easiest ones first…

I welcome and encourage all the help I can get. The source is all written in markdown, and everything is on github. File issues, send pull-requests or whatever you can think of!

I’m especially interested in getting suggestions for new sections that I haven’t yet thought about. Or sub sections, or examples. Or some fun stories from the wild Internet that you overcame with the help of curl. Or suggestions on where we should insert images (and what images to insert). Or other artworks, like a nicer cover. Anything!

If things go as planned, I have filled in most of the blanks by the summer 2016 and can then offer the complete curl book.

What’s new in curl

CURL keyboardWe just shipped our 150th public release of curl. On December 2, 2015.

curl 7.46.0

One hundred and fifty public releases done during almost 18 years makes a little more than 8 releases per year on average. In mid November 2015 we also surpassed 20,000 commits in the git source code repository.

With the constant and never-ending release train concept of just another release every 8 weeks that we’re using, no release is ever the grand big next release with lots of bells and whistles. Instead we just add a bunch of things, fix a bunch of bugs, release and then loop. With no fanfare and without any press-stopping marketing events.

So, instead of just looking at what was made in this last release, because you can check that out yourself in our changelog, I wanted to take a look at the last two years and have a moment to show you want we have done in this period. curl and libcurl are the sort of tool and library that people use for a long time and a large number of users have versions installed that are far older than two years and hey, now I’d like to tease you and tell you what can be yours if you take the step straight into the modern day curl or libcurl.

Thanks

Before we dive into the real contents, let’s not fool ourselves and think that we managed these years and all these changes without the tireless efforts and contributions from hundreds of awesome hackers. Thank you everyone! I keep calling myself lead developer of curl but it truly would not not exist without all the help I get.

We keep getting a steady stream of new contributors and quality patches. Our problem is rather to review and receive the contributions in a timely manner. In a personal view, I would also like to just add that during these two last years I’ve had support from my awesome employer Mozilla that allows me to spend a part of my work hours on curl.

What happened the last 2 years in curl?

We released curl and libcurl 7.34.0 on December 17th 2013 (12 releases ago). What  did we do since then that could be worth mentioning? Well, a lot, and then I’m going to mostly skip the almost 900 bug fixes we did in this time.

Many security fixes

Almost half (18 out of 37) of the security vulnerabilities reported for our project were reported during the last two years. It may suggest a greater focus and more attention put on those details by users and developers. Security reports are a good thing, it means that we address and find problems. Yes it unfortunately also shows that we introduce security issues at times, but I consider that secondary, even if we of course also work on ways to make sure we’ll do this less in the future.

URL specific options: –next

A pretty major feature that was added to the command line tool without much bang or whistles. You can now add –next as a separator on the command line to “group” options for specific URLs. This allows you to run multiple different requests on URLs that still can re-use the same connection and so on. It opens up for lots of more fun and creative uses of curl and has in fact been requested on and off for the project’s entire life time!

HTTP/2

There’s a new protocol version in town and during the last two years it was finalized and its RFC went public. curl and libcurl supports HTTP/2, although you need to explicitly ask for it to be used still.

HTTP/2 is binary, multiplexed, uses compressed headers and offers server push. Since the command line tool is still serially sending and receiving data, the multiplexing and server push features can right now only get fully utilized by applications that use libcurl directly.

HTTP/2 in curl is powered by the nghttp2 library and it requires a fairly new TLS library that supports the ALPN extension to be fully usable for HTTPS. Since the browsers only support HTTP/2 over HTTPS, most HTTP/2 in the wild so far is done over HTTPS.

We’ve gradually implemented and provided more and more HTTP/2 features.

Separate proxy headers

For a very long time, there was no way to tell curl which custom headers to use when talking to a proxy and which to use when talking to the server. You’d just add a custom header to the request. This was never good and we eventually made it possible to specify them separately and then after the security alert on the same thing, we made it the default behavior.

Option man pages

We’ve had two user surveys as we now try to make it an annual spring tradition for the project. To learn what people use, what people think, what people miss etc. Both surveys have told us users think our documentation needs improvement and there has since been an extra push towards improving the documentation to make it more accessible and more readable.

One way to do that, has been to introduce separate, stand-alone, versions of man pages for each and very libcurl option. For the functions curl_easy_setopt, curl_multi_setopt and curl_easy_getinfo. Right now, that means 278 new man pages that are easier to link directly to, easier to search for with Google etc and they are now written with more text and more details for each individual option. In total, we now host and maintain 351 individual man pages.

The boringssl / libressl saga

The Heartbleed incident of April 2014 was a direct reason for libressl being created as a new fork of OpenSSL and I believe it also helped BoringSSL to find even more motivation for its existence.

Subsequently, libcurl can be built to use either one of these three forks based on the same origin.  This is however not accomplished without some amount of agony.

SSLv3 is also disabled by default

The continued number of problems detected in SSLv3 finally made it too get disabled by default in curl (together with SSLv2 which has been disabled by default for a while already). Now users need to explicitly ask for it in case they need it, and in some cases the TLS libraries do not even support them anymore. You may need to build your own binary to get the support back.

Everyone should move up to TLS 1.2 as soon as possible. HTTP/2 also requires TLS 1.2 or later when used over HTTPS.

support for the SMB/CIFS protocol

For the first time in many years we’ve introduced support for a new protocol, using the SMB:// and SMBS:// schemes. Maybe not the most requested feature out there, but it is another network protocol for transfers…

code of conduct

Triggered by several bad examples in other projects, we merged a code of conduct document into our source tree without much of a discussion, because this is the way this project always worked. This just makes it clear to newbies and outsiders in case there would ever be any doubt. Plus it offers a clear text saying what’s acceptable or not in case we’d ever come to a point where that’s needed. We’ve never needed it so far in the project’s very long history.

–data-raw

Just a tiny change but more a symbol of the many small changes and advances we continue doing. The –data option that is used to specify what to POST to a server can take a leading ‘@’ symbol and then a file name, but that also makes it tricky to actually send a literal ‘@’ plus it makes scripts etc forced to make sure it doesn’t slip in one etc.

–data-raw was introduced to only accept a string to send, without any ability to read from a file and not using ‘@’ for anything. If you include a ‘@’ in that string, it will be sent verbatim.

attempting VTLS as a lib

We support eleven different TLS libraries in the curl project – that is probably more than all other transfer libraries in existence do. The way we do this is by providing an internal API for TLS backends, and we call that ‘vtls’.

In 2015 we started made an effort in trying to make that into its own sub project to allow other open source projects and tools to use it. We managed to find a few hackers from the wget project also interested and to participate. Unfortunately I didn’t feel I could put enough effort or time into it to drive it forward much and while there was some initial work done by others it soon was obvious it wouldn’t go anywhere and we pulled the plug.

The internal vtls glue remains fine though!

pull-requests on github

Not really a change in the code itself but still a change within the project. In March 2015 we changed our policy regarding pull-requests done on github. The effect has been a huge increase in number of pull-requests and a slight shift in activity away from the mailing list over to github more. I think it has made it easier for casual contributors to send enhancements to the project but I don’t have any hard facts backing this up (and I wouldn’t know how to measure this).

… as mentioned in the beginning, there have also been hundreds of smaller changes and bug fixes. What fun will you help us make reality in the next two years?