isalnum() is not my friend

The other day we noticed some curl test case failures, that only happened on macos and not on Linux. Curious!

The failures were detected in our unit test 1307, when testing a particular internal pattern matching function (Curl_fnmatch). Both targets run almost identical code but somehow they ended up with different results! Test cases acting differently on different platforms isn’t an extremely rare situation, but in this case it is just a pattern matching function and there’s really nothing timing dependent or anything that I thought could explain different behaviors. It piqued my interest, so I dug in.

The isalnum() return value

Eventually I figured out that the libc function isalnum(), when it got the 8 input value hexadecimal c3 (decimal 195), would return true on the macos machine and false on the box running Linux with glibc!

int value = isalnum(0xc3);

Setting LANG=C before running the test on macos made its isalnum() return false. The input became c3 because the test program has an UTF-8 encoded character in it and the function works on bytes, not “characters”.

Or in the words of the opengroup.org documentation:

The isalnum() function shall test whether c is a character of class alpha or digit in the program’s current locale.

It’s all documented – of course. It was just me not really considering the impact of this.

Avoiding this

I don’t like different behaviors on different platforms given the same input. I don’t like having string functions in curl act differently depending on locale, mostly because curl and libcurl can very well be used with many different locales and I prefer having a stable fixed behavior that we can document and stand by. Also, the libcurl functionality has never been documented to vary due to locale so it would be a surprise (bug!) to users anyway.

We’ve now introduced a private version of isalnum() and the rest of the ctype family of functions for curl. Hopefully this will make the tests more stable now. And make our functions work more similar and independent of locale.

See also: strcasecmp in Turkish.

Time’s up to shut up and sign up for curl up

We have just opened up the registration site for curl up 2018, the annual curl developers meeting that this year takes place in Stockholm, Sweden, over the weekend April 14-15. There’s a limited number of seats available, so if you want to join in the fun it might be a good idea to decide early on.

Sign up!

Also, to get into the proper curl up spirit, here’s the curl quiz we ran last year. I hope to run something similar again this year, but of course with a different set of questions.

Cheers for curl 7.58.0

Here’s to another curl release!

curl 7.58.0 is the 172nd curl release and it contains, among other things, 82 bug fixes thanks to 54 contributors (22 new). All this done with 131 commits in 56 days.

The bug fix rate is slightly lower than in the last few releases, which I tribute mostly to me having been away on vacation for a month during this release cycle. I retain my position as “committer of the Month” and January 2018 is my 29th consecutive month where I’ve done most commits in the curl source code repository. In total, almost 58% of the commits have been done by me (if we limit the count to all commits done since 2014, I’m at 43%). We now count a total of 545 unique commit authors and 1,685 contributors.

So what’s new this time? (full changelog here)

libssh backend

Introducing the pluggable SSH backend, and libssh is now the new alternative SSH backend to libssh2 that has been supported since late 2006. This change alone brought thousands of new lines of code.

Tell configure to use it with –with-libssh and you’re all set!

The libssh backend work was done by Nikos Mavrogiannopoulos, Tomas Mraz, Stanislav Zidek, Robert Kolcun and Andreas Schneider.

Security

Yet again we announce security issues that we’ve found and fixed. Two of them to be exact:

  1. We found a problem with how HTTP/2 trailers was handled, which could lead to crashes or even information leakage.
  2. We addressed a problem for users sending custom Authorization: headers to HTTP servers and who are then redirected to another host that shouldn’t receive those Authorization headers.

Progress bar refresh

A minor thing, but we refreshed the progress bar layout for when no total size is known.

Next?

March 21 is the date set for next release. Unless of course we find an urgent reason to fix and release something before then…

A flying curl progress bar

curl features an alternative progress bar. When you invoke it with -# or the longer version –progress-bar, curl will show the transfer progress using a single “bar” on the screen instead of the default meter that shows a lot of data like amount of data, transfer speeds and times.

$ curl -# -O https://example.com/coolfile.tar.gz
############################################ 100.0%

The alternative progress bar works great when the amount of data to transfer is known since then it can actually know how large part of the transfer that is done etc. If the amount of data is unknown – which is not a super rare situation – the progress bar output instead used to output one ‘#’ per kilobyte of data so that it would still show something. That could then end up filling up the screen and more if you did a large transfer.

$ curl -# -O https://example.com/nosize.html
###########################################################################################################

The space ship bar

Starting in curl 7.58.0 (to be released on January 24, 2018), this latter progress bar layout is modified. If the total size is unknown, it will now instead display a small space ship flying across the line, back and forth – and it will only move as long as there is data being transferred. If it stalls, the little ship stops.

“Over” the space ship there are four nonsensical flying hashes (‘#’) that are simply moving across the line on a sine wave, following each other. They move independently of there being data transferred or not.

It can then end up looking similar to this:

Pointless

There’s no real “meaning” behind this new progress bar output mode. I wanted it to

  1. only use a single line, even in the no-total size known case
  2. somehow indicate when there’s no data flying (ie space ship stops)
  3. make it slightly more interesting to watch than just one # per kilobyte

Since this new bar has just landed and this is the first time we ship a release with it, I wouldn’t be surprised if we end up polishing it further later on.

Can you tell I started out my programming life as a demo programmer on the Commodore 64? 🙂

Inspect curl’s TLS traffic

Since a long time back, the venerable network analyzer tool Wireshark (screenshot above) has provided a way to decrypt and inspect TLS traffic when sent and received by Firefox and Chrome.

You do this by making the browser tell Wireshark the SSL secrets:

  1. set the environment variable named SSLKEYLOGFILE to a file name of your choice before you start the browser
  2. Setting the same file name path in the Master-secret field in Wireshark. Go to Preferences->Protocols->SSL and edit the path as shown in the screenshot below.

Having done this simple operation, you can now inspect your browser’s HTTPS traffic in Wireshark. Just super handy and awesome.

Just remember that if you record TLS traffic and want to save it for analyzing later, you need to also save the file with the secrets so that you can decrypt that traffic capture at a later time as well.

curl

Adding curl to the mix. curl can be built using a dozen different TLS libraries and not just a single one as the browsers do. It complicates matters a bit.

In the NSS library for example, which is the TLS library curl is typically built with on Redhat and Centos, handles the SSLKEYLOGFILE magic all by itself so by extension you have been able to do this trick with curl for a long time – as long as you use curl built with NSS. A pretty good argument to use that build really.

Since curl version 7.57.0 the SSLKEYLOGFILE feature can also be enabled when built with GnuTLS, BoringSSL or OpenSSL. In the latter two libs, the feature is powered by new APIs in those libraries and in GnuTLS the library’s own logic similar to how NSS does it. Since OpenSSL is the by far most popular TLS backend for curl, this feature is now brought to users more widely.

In curl 7.58.0 (due to ship on Janurary 24, 2018), this feature is built by default also for curl with OpenSSL and in 7.57.0 you need to define ENABLE_SSLKEYLOGFILE to enable it for OpenSSL and BoringSSL.

And what’s even cooler? This feature is at the same time also brought to every single application out there that is built against this or later versions of libcurl. In one single blow. now suddenly a whole world opens to make it easier for you to debug, diagnose and analyze your applications’ TLS traffic when powered by libcurl!

Like the description above for browsers, you

  1. set the environment variable SSLKEYLOGFILE to a file name to store the secrets in
  2. tell Wireshark to use that same file to find the TLS secrets (Preferences->Protocols->SSL), as the screenshot showed above
  3. run the libcurl-using application (such as curl) and Wireshark will be able to inspect TLS-based protocols just fine!

trace options

Of course, as a light weight alternative: you may opt to use the –trace or –trace-ascii options with the curl tool and be fully satisfied with that. Using those command line options, curl will log everything sent and received in the protocol layer without the TLS applied. With HTTPS you’ll see all the HTTP traffic for example.

Credits

Most of the curl work to enable this feature was done by Peter Wu and Ray Satiro.

Microsoft curls too

On December 19 2017, Microsoft announced that since insider build 17063 of Windows 10, curl is now a default component. I’ve been away from home since then so I haven’t really had time to sit down and write and explain to you all what this means, so while I’m a bit late, here it comes!

I see this as a pretty huge step in curl’s road to conquer the world.

curl was already existing on Windows

Ever since we started shipping curl, it has been possible to build curl for Windows and run it on Windows. It has been working fine on all Windows versions since at least Windows 95. Running curl on Windows is not new to us. Users with a little bit of interest and knowledge have been able to run curl on Windows for almost 20 years already.

Then we had the known debacle with Microsoft introducing a curl alias to PowerShell that has put some obstacles in the way for users of curl.

Default makes a huge difference

Having curl shipped by default by the manufacturer of an operating system of course makes a huge difference. Once this goes out to the general public, all of a sudden several hundred million users will get a curl command line tool install for them without having to do anything. Installing curl yourself on Windows still requires some skill and knowledge and on places like stackoverflow, there are many questions and users showing how it can be problematic.

I expect this to accelerate the curl command line use in the world. I expect this to increase the number of questions on how to do things with curl.

Lots of people mentioned how curl is a “good” new tool to use for malicious downloads of files to windows machines if you manage to run code on someone’s Windows computer. curl is quite a capable thing that you truly do not want to get invoked involuntarily. But sure, any powerful and capable tool can of course be abused.

About the installed curl

This is what it looks when you check out the curl version on this windows build:

(screenshot from Steve Holme)

I don’t think this means that this is necessarily exactly what curl will look like once this reaches the general windows 10 installation, and I also expect Microsoft to update and upgrade curl as we go along.

Some observations from this simple screenshot, and if you work for Microsoft you may feel free to see this as some subtle hints on what you could work on improving in future builds:

  1. They ship 7.55.1, while 7.57.0 was the latest version at the time. That’s just three releases away so I consider that pretty good. Lots of distros and others ship (much) older releases. It’ll be interesting to see how they will keep this up in the future.
  2. Unsurprisingly, they use a build that uses the WinSSL backend for TLS.
  3. They did not build it with IDN support.
  4. They’ve explicitly disabled support a whole range of protocols that curl supports natively by default (gopher, smb, rtsp etc), but they still have a few rare protocols enabled (like dict).
  5. curl supports LDAP using the windows native API, but that’s not used.
  6. The Release-Date line shows they built curl from unreleased sources (most likely directly from a git clone).
  7. No HTTP/2 support is provided.
  8. There’s no automatic decompression support for gzip or brotli content.
  9. The build doesn’t support metalink and no PSL (public suffix list).

(curl gif from the original Microsoft curl announcement blog post)

Independent

Finally, I’d like to add that like all operating system distributions that ship curl (macOS, Linux distros, the BSDs, AIX, etc) Microsoft builds, packages and ships the curl binary completely independently from the actual curl project.

Sure I’ve been in contact with the good people working on this from their end, but they are working totally independently of us in the curl project. They mostly get our code, build it and ship it.

I of course hope that we will get bug fixes and improvement from their end going forward when they find problems or things to polish.

The future looks as great as ever before!

Update: in March 2018, they mentioned that curl comes in Windows 10 version 1803.