
Test servers for curl

curl supports some twenty-three protocols (depending on exactly how you count).

In order to properly test and verify curl’s implementations of each of these protocols, we have a test suite. In the test suite we have a set of handcrafted servers that speak the server-side of these protocols. The more used a protocol is, the more important it is to have it thoroughly tested.

We believe in having test servers that are “stupid” and that offer buttons, levers and thresholds for us to control and manipulate how they act and how they respond for testing purposes. The control of what to send should be dictated as much as possible by the test case description file. If we want a server to send back a slightly broken protocol sequence to check how curl supports that, the server must be open for this.

In order to do this with a large degree of freedom and without restrictions, we’ve found that using “real” server software for this purpose is usually not good enough: the broken and bad cases are typically not easy to produce with it. Actual server software tries hard to do the right thing and obey standards and protocols, while we rather want the server to make no decisions of its own at all and just send exactly the bytes we ask it to.

Of course we don’t always get what we want, and some of these protocols are fairly complicated, which makes sticking to this policy all the way a challenge. Then we need to be pragmatic and go with what’s available and what we can make work. Having test cases run against a real server is still better than having no test cases at all.

Now SOCKS

“SOCKS is an Internet protocol that exchanges network packets between a client and server through a proxy server. Practically, a SOCKS server proxies TCP connections to an arbitrary IP address, and provides a means for UDP packets to be forwarded.”

(according to Wikipedia)

Recently we fixed a bug in how curl sends credentials to a SOCKS5 proxy, as it turned out the protocol itself only supports user name and password lengths of up to 255 bytes each, while curl normally has no such limits and could pass on credentials of virtually infinite length. OK, that was silly and we fixed the bug. Now curl will properly return an error if you try such long credentials with your SOCKS5 proxy.
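The 255 byte cap comes straight from the wire format: in the SOCKS5 username/password sub-negotiation (RFC 1929), each credential field is prefixed by a single length byte. Here is a little sketch (my illustration, not curl’s actual code) of how such a request is built:

#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* build an RFC 1929 auth request into 'buf' (at least 513 bytes),
   return its length, or 0 if a field doesn't fit in the single
   length byte the protocol offers */
static size_t socks5_auth_request(uint8_t *buf,
                                  const char *user, const char *pass)
{
  size_t ulen = strlen(user), plen = strlen(pass), i = 0;
  if(ulen > 255 || plen > 255)
    return 0;                 /* the case curl now rejects with an error */
  buf[i++] = 0x01;            /* sub-negotiation version */
  buf[i++] = (uint8_t)ulen;   /* one single length byte for the user name */
  memcpy(&buf[i], user, ulen);
  i += ulen;
  buf[i++] = (uint8_t)plen;   /* ...and one for the password */
  memcpy(&buf[i], pass, plen);
  i += plen;
  return i;
}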

As a general rule, fixing a bug should mean adding at least one new test case, right? Up to this time we had been testing the curl SOCKS support by firing up an ssh client and having that set up a SOCKS proxy that connects to the other test servers.

curl -> ssh with SOCKS proxy -> test server

Since this setup doesn’t support SOCKS5 authentication, it turned out to be complicated to add a test case verifying that this bug was actually fixed.

This test problem was fixed by the introduction of a newly written SOCKS proxy server dedicated to the curl test suite (which I simply named socksd). It implements the basic SOCKS4 and SOCKS5 protocol logic and also supports a range of commands that control how it behaves and what it allows. We can now write test cases against this server and ask it to misbehave or otherwise do fun things, to make really sure curl supports those cases as well.

It also has the additional bonus that it works without ssh being present, so it can run on more systems and the SOCKS code in curl will now be tested more widely than before.

curl -> socksd -> test server

Going forward, we should also be able to create even more SOCKS tests with this and make sure to get even better SOCKS test coverage.

curl 7.62.0 MOAR STUFF

This is a feature-packed release with more new stuff than usual.

Numbers

the 177th release
10 changes
56 days (total: 7,419)
118 bug fixes (total: 4,758)
238 commits (total: 23,677)
5 new public libcurl functions (total: 80)
2 new curl_easy_setopt() options (total: 261)
1 new curl command line option (total: 219)
49 contributors, 21 new (total: 1,808)
38 authors, 19 new (total: 632)
3 security fixes (total: 84)

Security

New since the previous release is the dedicated curl bug bounty program. I’m not sure if this program has caused any increase in reports; it feels a little too early to tell.

CVE-2018-16839 – an integer overflow case that triggers on 32 bit machines given an extremely long user name argument, when using POP3, SMTP or IMAP.

CVE-2018-16840 – a use-after-free issue. Immediately after having freed a struct in the easy handle close function, libcurl might write a boolean to that struct!

CVE-2018-16842 – is a vulnerability in the curl command line tool’s “warning” message display code which can make it read outside of a buffer and send unintended memory contents to stderr.

All three of these issues are deemed to have low severity and to be hard to exploit.

New APIs!

We introduce a brand new URL API that lets applications parse and generate URLs, using libcurl’s own parser. Five new public functions in one go there! The link goes to the separate blog entry that explains it.
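A quick taste of how the new functions are used (error handling trimmed):

#include <curl/curl.h>
#include <stdio.h>

int main(void)
{
  CURLU *h = curl_url(); /* one of the five new functions */
  char *host = NULL;

  /* parse a full URL... */
  if(!curl_url_set(h, CURLUPART_URL,
                   "https://example.com/file?name=moo", 0)) {
    /* ...then extract a single component from it */
    if(!curl_url_get(h, CURLUPART_HOST, &host, 0)) {
      printf("host: %s\n", host);
      curl_free(host);
    }
  }
  curl_url_cleanup(h);
  return 0;
}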

A brand new function is introduced (curl_easy_upkeep) to let applications maintain idle connections while no transfers are in progress! Perfect for maintaining HTTP/2 connections, for example, which might need a PING frame sent to stay alive.
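Used roughly like this (a minimal sketch; CURLOPT_UPKEEP_INTERVAL_MS is the companion option, also new in 7.62.0, that controls how often upkeep probes may be sent):

#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/");
  /* allow upkeep probes (such as HTTP/2 PING) at most every 30 seconds */
  curl_easy_setopt(curl, CURLOPT_UPKEEP_INTERVAL_MS, 30000L);
  curl_easy_perform(curl);

  /* call this now and then while the connection sits idle */
  curl_easy_upkeep(curl);

  curl_easy_cleanup(curl);
  return 0;
}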

More changes

Applications using libcurl’s multi interface will now get multiplexing enabled by default, and HTTP/2 will be selected for HTTPS connections. With these new defaults, we hope that lots of applications out there will just transparently and magically start to perform better over time without anyone having to change anything! An application that prefers the old behavior can still turn multiplexing off, as sketched below.
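A minimal sketch of opting out:

#include <curl/curl.h>

int main(void)
{
  CURLM *multi = curl_multi_init();
  /* multiplexing is now on by default; this restores the old
     one-connection-per-transfer behavior */
  curl_multi_setopt(multi, CURLMOPT_PIPELINING, CURLPIPE_NOTHING);
  curl_multi_cleanup(multi);
  return 0;
}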

We shipped DNS-over-HTTPS support. With DoH, your internet client can do secure and private name resolves more easily. Follow the link for the full blog entry with details.
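In libcurl terms, enabling DoH is a single option on the easy handle (the DoH server URL below is just an example):

#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/");
  /* resolve host names with this DoH server instead of the regular
     resolver */
  curl_easy_setopt(curl, CURLOPT_DOH_URL,
                   "https://cloudflare-dns.com/dns-query");
  curl_easy_perform(curl);
  curl_easy_cleanup(curl);
  return 0;
}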

The good people at MesaLink have a TLS library written in Rust, and in this release you can build libcurl to use that library. We haven’t had a new supported TLS backend since 2012!

Our default IMAP handling is slightly changed, to use the proper standards compliant “UID FETCH” method instead of just “FETCH”. This might introduce some changes in behavior, so if you’re doing IMAP transfers, I advise you to mind your step going into this upgrade.

Starting in 7.62.0, applications can set the buffer size libcurl will use for uploads. The buffers used for download and upload are separate, and applications have long been able to specify the download buffer size; now they can finally do it for uploads too. Most applications won’t need to bother with it, but for some edge cases, such as SFTP uploads over high latency, high bandwidth connections, there are performance gains to be had by bumping this size up.
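Setting it is a single setopt call (a sketch; the 2 MB value is just an example of bumping it up):

#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  curl_easy_setopt(curl, CURLOPT_URL, "sftp://example.com/upload/file");
  curl_easy_setopt(curl, CURLOPT_UPLOAD, 1L);
  /* use a larger upload buffer (in bytes) than the default */
  curl_easy_setopt(curl, CURLOPT_UPLOAD_BUFFERSIZE, 2L * 1024 * 1024);
  /* ...set CURLOPT_READFUNCTION/CURLOPT_READDATA to provide the data... */
  curl_easy_cleanup(curl);
  return 0;
}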

curl builds that use libressl will now at last show the correct libressl version number in the “curl -V” output.

Deprecating legacy

CURLOPT_DNS_USE_GLOBAL_CACHE is deprecated! If there’s not a massive complaint uproar, this means this option will effectively be made pointless in April 2019. The global cache isn’t thread-safe and has been called obsolete in the docs since 2002!

HTTP pipelining support is deprecated! Starting in this version, asking for pipelining will be ignored by libcurl. We strongly urge users to switch to and use HTTP/2, which in 99% of the cases is the better alternative to HTTP/1.1 Pipelining. The pipelining code in libcurl has stability problems. The impact of disabled pipelining should be minimal but some applications will of course notice. Also note the section about HTTP/2 and multiplexing by default under “changes” above.

To get an overview of all things marked for deprecation in curl and their individual status check out this page.

Interesting bug-fixes

TLS 1.3 support for GnuTLS landed. Now you can build curl to support TLS 1.3 with most of the TLS libraries curl supports: GnuTLS, OpenSSL, BoringSSL, libressl, Secure Transport, WolfSSL, NSS and MesaLink.

curl got Windows VT support and UTF-8 output enabled, which should make fancy things like “curl wttr.in” render nicely out of the box on Windows as well!

The TLS backends got a little cleanup and error code use unification so that they should now all return the same error code for the same problem no matter which backend you use!

When you use curl to do URL “globbing”, as in for example “curl http://localhost/[1-22]” to fetch a range or a series of resources, and accidentally mess up the range, curl would previously just say that it detected an error in the glob pattern. Starting now, it will also try to show exactly where in which pattern it found the error that made it stop processing.

CI

The curl for Windows CI builds on AppVeyor are now finally also running the test suite! Actually making sure that the Windows build is intact in every commit and PR is a huge step forward for us and our aim to keep curl functional. We also build several additional and different build combinations on Windows in the CI than we did previously. All in an effort to reduce regressions.

We’ve added four new checks to Travis (that run on every pull request and commit):

  1. The “tidy” build runs clang-tidy on all sources in src/ and lib/.
  2. A --disable-verbose build makes sure this configure option still builds curl warning-free.
  3. The “distcheck” build now scans all files for accidental Unicode BOM markers.
  4. A MesaLink-using build verifies that configuration.

CI build times

We’re right now doing 40 builds on every commit, spending around 12 hours of CPU time for a full round. With more than 230 commits landed in the tree, originating from 150-something pull requests, many of which were worked out over multiple commits, we’ve done perhaps 500 full CI rounds in these 56 days.

This of course doesn’t include all the CPU time developers spend locally before submitting PRs, or the autobuild system that currently runs somewhere in the order of 50 builds per day. If we assume an average of 20 minutes per build+test run, the autobuilds add another 930 hours from the time of the previous release until this one.

To sum up, that’s about 7,000 hours of CI spent in 56 days, equaling about 520% non-stop CPU time!

We are grateful for all the help we get!

Next release

The next release will ship on December 12, 2018 unless something urgent happens before that.

Note that this date breaks the regular eight week release cycle: it comes only six weeks after the previous one. We do this since the originally planned date would land in the middle of Christmas, when “someone” plans to be off traveling…

The next release will probably become 7.63.0 since we already have new changes knocking on the door waiting to get merged that will warrant another minor number bump. Stay tuned for details!

isalnum() is not my friend

The other day we noticed some curl test case failures, that only happened on macos and not on Linux. Curious!

The failures were detected in our unit test 1307, when testing a particular internal pattern matching function (Curl_fnmatch). Both targets run almost identical code but somehow they ended up with different results! Test cases acting differently on different platforms isn’t an extremely rare situation, but in this case it is just a pattern matching function and there’s really nothing timing dependent or anything that I thought could explain different behaviors. It piqued my interest, so I dug in.

The isalnum() return value

Eventually I figured out that the libc function isalnum(), when given the 8 bit input value hexadecimal c3 (decimal 195), would return true on the macos machine and false on the box running Linux with glibc!

int value = isalnum(0xc3);

Setting LANG=C before running the test on macos made its isalnum() return false. The input became c3 because the test program has a UTF-8 encoded character in it and the function works on bytes, not “characters”.

Or in the words of the opengroup.org documentation:

The isalnum() function shall test whether c is a character of class alpha or digit in the program’s current locale.

It’s all documented – of course. It was just me not really considering the impact of this.
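A small standalone demo of the effect (my illustration, not the actual test code; which locale names exist, and what they make isalnum() return for this byte, varies between systems):

#include <ctype.h>
#include <locale.h>
#include <stdio.h>

int main(void)
{
  int c = 0xc3; /* the first byte of a UTF-8 encoded non-ASCII letter */

  setlocale(LC_ALL, "C");
  printf("C locale: %d\n", isalnum(c)); /* zero everywhere */

  /* in other locales the very same byte may suddenly count as
     alphanumeric; in ISO 8859-1 for example, 0xc3 is a letter */
  if(setlocale(LC_ALL, "en_US.ISO8859-1"))
    printf("8859-1 locale: %d\n", isalnum(c)); /* may be non-zero */
  return 0;
}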

Avoiding this

I don’t like different behaviors on different platforms given the same input. I don’t like having string functions in curl act differently depending on locale, mostly because curl and libcurl can very well be used with many different locales and I prefer having a stable fixed behavior that we can document and stand by. Also, the libcurl functionality has never been documented to vary due to locale so it would be a surprise (bug!) to users anyway.

We’ve now introduced a private version of isalnum() and the rest of the ctype family of functions for curl. Hopefully this will make the tests more stable, and make our functions behave the same way regardless of locale.
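The replacements only need to care about ASCII, since that is what the protocols curl speaks use. A sketch of the idea (the real versions live in curl’s source tree):

/* locale-independent, ASCII-only classification */
#define ISDIGIT(x) ((x) >= '0' && (x) <= '9')
#define ISLOWER(x) ((x) >= 'a' && (x) <= 'z')
#define ISUPPER(x) ((x) >= 'A' && (x) <= 'Z')
#define ISALPHA(x) (ISLOWER(x) || ISUPPER(x))
#define ISALNUM(x) (ISALPHA(x) || ISDIGIT(x))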

See also: strcasecmp in Turkish.

Testing curl

In order to ship a quality product – once every eight weeks – we need lots of testing. This is what we do to test curl and libcurl.

checksrc

We have a basic script that verifies that the source code adheres to our code standard. It doesn’t catch all possible mistakes, but usually it complains with enough detail to help contributors write their code to match the style we already use. Consistent code style makes the code easier to read. Easier reading means fewer bugs and quicker debugging.

By doing this check with a script (that can be run automatically when building curl), it makes it easier for everyone to ship properly formatted code.

We have not (yet) managed to convince clang-format or other tools to reformat code to correctly match our style, and we don’t feel like changing the style just for the sake of such a tool. I consider this a decent work-around.

make test

The test suite that we bundle with the source code in the git repository has a large number of tests that test…

  • curl – it runs the command line tool against test servers for a large range of protocols and verifies the error code, the output, the protocol details and that there are no memory leaks
  • libcurl – we build many small test programs that use the libcurl API, perform tests against test servers and verify that they behave correctly and don’t leak memory etc.
  • unit tests – we build small test programs that use libcurl internal functions that aren’t exposed in the API and verify that they behave correctly and generate the expected output.
  • valgrind – all the tests above can be run with and without valgrind to better detect memory issues
  • “torture” – a special mode that first runs an entire test while counting the number of memory related functions called (malloc, strdup, fopen, etc), and then runs the test again that many times, each time making one of those memory related functions fail – while making sure that no memory is leaked and no crash occurs in any of those situations. It repeats this until every memory related call has been made to fail once. A sketch of the principle follows below.
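For illustration, here is a sketch of the torture principle using a hypothetical allocator wrapper (curl’s actual implementation is its own memory debug system):

#include <stdlib.h>

static long alloc_counter; /* memory related calls seen so far */
static long fail_at = -1;  /* which call to make fail, -1 = none */

/* the test build routes all malloc() calls through this wrapper */
void *test_malloc(size_t size)
{
  if(++alloc_counter == fail_at)
    return NULL; /* simulate an allocation failure */
  return malloc(size);
}

/* run the test once cleanly to count the calls, then once per call
   with exactly that call failing */
void torture(void (*testcase)(void))
{
  long i, total;

  alloc_counter = 0;
  fail_at = -1;
  testcase();
  total = alloc_counter;

  for(i = 1; i <= total; i++) {
    alloc_counter = 0;
    fail_at = i;
    testcase(); /* must neither leak memory nor crash */
  }
}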

Right now, a single “make test” runs over 1100 test cases, varying a little depending on exactly what features are enabled in the build. Without valgrind, running those tests takes about 8 minutes on a reasonably fast machine, but still over 25 minutes with valgrind.

Then we of course want to run all tests with different build options…

CI

For every pull request and for every source code commit, curl is built for Linux, mac and windows with a large set of different build options and TLS libraries, and all the tests mentioned above are run for most of these build combinations. Running ‘checksrc’ on the pull requests is of course awesome, so that humans don’t have to remark on code style mistakes much. Around 30 different builds are done and verified for each commit.

If any CI build fails, the pull request on github gets a red X to signal that something was not OK.

We also run test case coverage analyses in the CI so that we can quickly detect if we for some reason significantly decrease test coverage or similar.

We use Travis CI, Appveyor and Coveralls.io for this.

Autobuilds

Independently of the CI builds, volunteers run machines that regularly update from git, build and run the entire test suite and finally email the results back to a central server. These setups help us cover even more platforms, architectures and build combinations, just with a somewhat longer turnaround time.

With millions of build combinations and support for virtually every operating system and CPU architecture under the sun, we have to accept that not everything can be fully tested. But since almost all code is shared for many platforms, we can still be reasonably sure about the code even for targets we don’t test regularly.

Static code analyzing

We run the clang scan-build on the source code daily and we run Coverity scans on the code “regularly”, about once a week.

We always address defects detected by these analyzers immediately when notified.

Fuzzing

We’re happy to be part of Google’s OSS-fuzz effort, which, with a little integration help from us, keeps hammering our code with fuzz to make sure we’re solid.

OSS-fuzz has so far resulted in two security advisories for curl and a range of other bug fixes. It hasn’t been going on for very long, and based on the number of flaws it has detected so far, I expect it to keep finding more – at least for a while into the future.

Fuzzing is really the best way to hammer out bugs. When we’re down to zero static analyzer defects and thousands of test cases that all pass, the fuzzers can still find holes in the net.

External

Independently of what we test, there is a large amount of external testing going on for each curl release we do.

In a presentation by Google at curl up 2017, they mentioned their use of curl in “hundreds of applications” and that each curl release they adopt gets tested more than 400,000 times. We also know that a lot of other users have curl as a core component in their systems and test their installations extensively.

We have a large set of security interested developers who run tests and fuzzers on curl on their own initiative.


6 hours of bliss

I sent out the release announcement for curl 7.52.0 at exactly 07:59 in the morning of December 21, 2016. A Wednesday. We typically release curl on Wednesdays, out of old habit. It is a good release day.

curl 7.52.0 was just like any other release. Perhaps with a slightly larger set of new features than is typical for us. We introduced TLS 1.3 support, we now provide HTTPS-proxy support, and the command line tool has this option called --fail-early that I think users will come to appreciate once they discover it. We also announced three fixed security vulnerabilities. And some other good things.

I pushed the code to git, signed and uploaded the tarballs, I updated the info on the web site and I sent off that release announcement email and I felt good. Release-time good. That short feeling of relief and starting over on a new slate that I often experience these release days. Release days make me happy.

Any bets?

It is not unusual for someone to find a bug really fast after a release has shipped. As I was feeling good, I had to joke in the #curl IRC channel (42 minutes after that email):

08:41 <bagder> any bets on when the first bug report on the new release shows up? =)

Hours passed and maybe, just maybe there was not going to be any quick bugs filed on this release?

But of course. I wouldn’t write this blog post if it all had been nice and dandy. At 14:03, I got the email. 6 hours and 4 minutes since I wrote the 7.52.0 announcement email.

The email was addressed to the curl project security email list and included a very short patch and an explanation of how the existing code is wrong and needs “this fix” to work correctly. And it was entirely correct!

Now I didn’t feel that sense of happiness anymore. For some reason it was now completely gone and instead I felt something that involved sensations like rage, embarrassment and general tiredness. How the [beep] could this slip through like this?

I’ve done releases in the past that were broken to various extents, but this was a sort of new record and an unprecedented event. Enough time had passed that I couldn’t just yank the package from the download page either. I had to take it through the correct procedures.

What happened?

As part of a general code cleanup during this last development round, I changed all the internals to use a proper internal API for getting random data; if libcurl is built with a TLS library, it uses that library’s API to get secure and safe random data. This was a move to improve our use of random data internally. We use this internal API for getting the nonce in authentication mechanisms such as Digest and NTLM and also for generating the boundary string in HTTP multipart formposts and more. (It is not used for any TLS or SSH level protocol stuff though.)

I did the largest part of the random overhaul of this in commit f682156a4f, just a little over a month ago.

Of course I made sure that all test cases kept working, that there were no valgrind reports and that the code didn’t cause any compiler warnings. It did not generate any reports in the many clang-analyzer or Coverity static code analyzer runs we’ve done since; we run clang-analyzer daily and Coverity perhaps weekly.

But there’s a valgrind report just here!

Kamil Dudka, who sent the 14:03 email, got a valgrind error and that’s what set him off – but how come he got that and I didn’t?

The explanation consists of the following two conditions that together worked to hide the problem for us quite successfully:

  1. I (and I suppose several of the other curl hackers) usually build curl and libcurl “debug enabled”. This allows me to run more tests, do more diagnostics and debug it easier when I run into problems. It also provides a system with “fake random” so that we can actually verify that functions that otherwise use real random values generate the correct output when given a known random value… and yeah, this debug system prevented valgrind from detecting any problem!
  2. In the curl test suite we once had a problem with valgrind generating reports on third party libraries etc, which then ended up as false positives. We therefore introduced a “valgrind report parser” that detects whether a report concerns curl or something else. It turns out this parser doesn’t detect the errors if curl is compiled without the compiler’s -g option. And of course… curl and libcurl both build without -g by default!

The patch?

The vulnerable function basically has this simple prototype. It is meant to store an “int” worth of random value in the buffer ‘rnd’ points to. That’s 4 bytes.

randit(struct Curl_easy *data, unsigned int *rnd)

But due to circumstances I can’t explain as anything other than my sloppy programming, I managed to make the function store the random value in the pointer variable itself instead of in the buffer it points to: &rnd is the address of the local pointer, so the random bytes overwrote that pointer on the stack. So when the function returns, there’s nothing stored in the buffer. No 4 bytes of random. Just the uninitialized value of whatever happened to be there, on the stack.

The patch that fixes this problem looks like this (with some names shortened to simplify but keep the idea):

- res = random(data, (char *)&rnd, sizeof(rnd));
+ res = random(data, (char *)rnd, sizeof(*rnd));

So yeah. I introduced this security flaw in 7.52.0. We had it fixed in 7.52.1, released roughly 48 hours later.

(I really do not need comments on what other languages that wouldn’t have allowed this mistake or otherwise would’ve brought us world peace a long time ago.)

Make it not happen again

The primary way to keep this particular mistake from happening again is that I’m removing the valgrind report parsing function from the test suite. We will now instead assume that valgrind reports are legitimate and, if they are not, work on suppressing the false positives in a better way.

References

This flaw is officially known as CVE-2016-9594.

The real commit that fixed this problem is here, or as stand-alone patch.

The full security advisory for this flaw is here: https://curl.haxx.se/docs/adv_20161223.html



Testing curl_multi_socket_action

We’re introducing a brand new way to test the event-based socket_action API in libcurl! (available in curl since commit 6cf8413e3162)

Background

Since 2006 we’ve had three major API families in libcurl for doing file transfers:

  1. the easy interface – a synchronous and yes, easy, interface for getting things done
  2. the multi interface – a non-blocking interface that allows multiple simultaneous transfers (in both directions) in the same thread
  3. the socket_action interface – a brother of the multi interface but designed for use with an event-based library/engine for high performance and large scale transfers

The curl command line tool uses the easy interface and our test suite for curl + libcurl consists of perhaps 80% curl tests, while the rest are libcurl-using programs testing both the easy interface and the multi interface.

Early this year we modified libcurl’s internals so that the functions driving the easy interface transfers use the multi interface internally. All of a sudden, all the curl-using tests that use the easy interface then by definition also tested that the operations worked fine with the multi interface. Needless to say, this pushed several bugs up to the surface that we could fix.

So the multi and the easy interfaces are tested by many hundred test cases on a large number of various systems every day around the clock. Nice! But what about the third interface? The socket_action interface isn’t tested at all! Time to change this sorry state.

Event-based test challenges

The event-based API has its own set of challenges; like it needs to react on socket state changes (only) and allow smooth interactions with the user’s own choice of event library etc. This is our newest API family and also the least commonly used. One reason for this may very well be that event-based coding is generally harder to do than more traditional poll-based code. Event-based code forces the application into using state-machines all over to a much higher degree and the frequent use of callbacks easily makes the code hard to read and its logic hard to follow.

So, curl_multi_socket_action() acts in ways that aren’t exercised or even necessary when the regular select-oriented multi interface is used. Code that needs to be tested to remain working!
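To show the shape of the API that needs exercising, here is a minimal sketch of driving a single transfer with curl_multi_socket_action(), using plain select() as the “event library” (my illustration with error handling trimmed, not curl’s test code):

#include <curl/curl.h>
#include <string.h>
#include <sys/select.h>

struct ctx {
  fd_set rd, wr;
  int maxfd;
  long timeout_ms;
};

/* libcurl tells us which sockets to watch for what */
static int sock_cb(CURL *e, curl_socket_t s, int what, void *userp,
                   void *sockp)
{
  struct ctx *c = userp;
  (void)e; (void)sockp;
  FD_CLR(s, &c->rd);
  FD_CLR(s, &c->wr);
  if(what == CURL_POLL_IN || what == CURL_POLL_INOUT)
    FD_SET(s, &c->rd);
  if(what == CURL_POLL_OUT || what == CURL_POLL_INOUT)
    FD_SET(s, &c->wr);
  if(what != CURL_POLL_REMOVE && (int)s >= c->maxfd)
    c->maxfd = (int)s + 1;
  return 0;
}

/* libcurl tells us how long it is prepared to wait */
static int timer_cb(CURLM *m, long timeout_ms, void *userp)
{
  struct ctx *c = userp;
  (void)m;
  c->timeout_ms = timeout_ms; /* -1 means no pending timeout */
  return 0;
}

int main(void)
{
  struct ctx c;
  CURLM *multi;
  CURL *easy;
  int running = 1;

  memset(&c, 0, sizeof(c));
  FD_ZERO(&c.rd);
  FD_ZERO(&c.wr);
  c.timeout_ms = -1;

  curl_global_init(CURL_GLOBAL_DEFAULT);
  multi = curl_multi_init();
  curl_multi_setopt(multi, CURLMOPT_SOCKETFUNCTION, sock_cb);
  curl_multi_setopt(multi, CURLMOPT_SOCKETDATA, &c);
  curl_multi_setopt(multi, CURLMOPT_TIMERFUNCTION, timer_cb);
  curl_multi_setopt(multi, CURLMOPT_TIMERDATA, &c);

  easy = curl_easy_init();
  curl_easy_setopt(easy, CURLOPT_URL, "https://example.com/");
  curl_multi_add_handle(multi, easy);

  /* kick things off by telling libcurl a timeout expired */
  curl_multi_socket_action(multi, CURL_SOCKET_TIMEOUT, 0, &running);

  while(running) {
    fd_set rd = c.rd;
    fd_set wr = c.wr;
    struct timeval tv = { 1, 0 }; /* fall back to one second if no timer */
    int rc;
    if(c.timeout_ms >= 0) {
      tv.tv_sec = c.timeout_ms / 1000;
      tv.tv_usec = (c.timeout_ms % 1000) * 1000;
    }
    rc = select(c.maxfd, &rd, &wr, NULL, &tv);
    if(rc < 0)
      break; /* select error, give up in this sketch */
    if(rc == 0)
      /* timeout: let libcurl handle its timers */
      curl_multi_socket_action(multi, CURL_SOCKET_TIMEOUT, 0, &running);
    else {
      int fd;
      for(fd = 0; fd < c.maxfd; fd++) {
        int ev = (FD_ISSET(fd, &rd) ? CURL_CSELECT_IN : 0) |
                 (FD_ISSET(fd, &wr) ? CURL_CSELECT_OUT : 0);
        if(ev) /* tell libcurl which socket saw which event */
          curl_multi_socket_action(multi, fd, ev, &running);
      }
    }
  }
  curl_multi_remove_handle(multi, easy);
  curl_easy_cleanup(easy);
  curl_multi_cleanup(multi);
  curl_global_cleanup();
  return 0;
}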

Introducing an alternative curl_easy_perform

As I mentioned before, we made the general multi interface widely tested by making sure the easy interface code uses the multi interface under the hood. We can’t easily do the same operation again, but instead this time we introduce a separate implementation (for debug-enabled builds) called curl_easy_perform_ev that instead uses the event-based API internally to drive the transfer.

The curl_multi_socket_action() interface is meant to be used with an event library to work really well multi-platform, or with something like epoll directly if Linux-only functionality is fine for you. curl and libcurl are quite likely among the most portable code you can find, so after having fought with this agony a while (how to introduce event-based testing without narrowing the set of tested platforms too much) I settled on a simple but yet brilliant solution (I can call it brilliant since I didn’t come up with the idea on my own):

We write an internal “simulated” event-based library with functionality provided by the libcurl internal function Curl_poll() (the link unfortunately goes to a line number, you may need to move around in the file to find the function). It is in itself a wrapper function that can work with either poll() or select() and should therefore work on just about any operating system written since the 90s, and most of the ones from before that as well! Such an emulation may not be the cleverest approach if the aim were to write a high performance, low latency application, but since my objective is to exercise the API and make a curl_easy_perform clone, it is perfect!

It should be carefully noted that curl_easy_perform_ev is only for testing and will only exist in debug-enabled builds, and is therefore not considered stable nor a part of the public API!

Running event-based tests

The fake event library works with the curl_multi_socket_action() family of functions and when curl is invoked with --test-event, it will call curl_easy_perform_ev instead of curl_easy_perform and the transfer should then work exactly as without --test-event.

The main test suite running script called ‘runtests.pl’ now features the option -e that will run all ~800 curl tests with --test-event. It will skip tests it can’t run event-based – basically all the tests that don’t use the curl tool.

Handling many sockets is slow if not done with events

Some very old performance measurements done on libcurl in the year 2005 showed the time spent growing exponentially as the number of sockets grows – which is exactly why you want to use something event-based instead of something poll or select based.

See also my document discussing poll, select and event-based.

ptest because “make test” is insufficient

Much thanks to autoconf and automake, we have an established, more or less standardized way to build and install tools, libraries and other software. We build them with ‘make’ and we install them with ‘make install’. This works great, and it works equally fine even when we build stuff cross-compiled.

For testing however, the established concept and procedure is not as good. For testing we have ‘make test’ or ‘make check’ which typically first builds whatever needs to get built for the tests to run and then it runs all tests.

This is not good enough

Why? Because in lots of use-cases we build software using a cross-compiler on a build system that can’t run the executables. Therefore we need to first build the tests, then install the tests (somewhere that is reachable from the target system) and finally execute them. These steps need to be possible to run independently since at least the building and installing will sometimes happen on a different host than the execution of the tests.

Introducing ptest

Within the Yocto project, Björn Stenberg has pushed for ptest to be the basis of this new reform and concept. The responses he’s gotten so far have been positive, and there’s a pending updated patch to be posted to the upstream oe-core list soon.

The work does not end there

Even if or when this can be incorporated into OpenEmbedded and Yocto – and I really think it is a matter of when, since I believe we can work out all the flaws and quirks until virtually everyone involved is happy – the bulk of the changes really should be done upstream, in hundreds and thousands of open source packages. We (as upstream open source projects) need to start doing testing in at least two separate steps: one step that builds everything that needs to be built for the tests, and a second step that runs the suite. In a cross-compiled scenario, the two steps can then be executed first on the host system and then on the target system.

I expect that this will mean a whole bunch of patches and scripts having to be maintained within OpenEmbedded for a while, while we try to get things merged into upstream projects. I also foresee that a certain percentage of all projects just won’t accept this new approach and will reject all patches in this vein.

Output format

I think the most controversial part of these suggested “universal” changes is the common test suite output format. The common format is of course required so that we can “supervise” the output and results from any package without having to know any specifics.

While the ptest output format follows the automake test output syntax, I expect many projects that have selected a particular output format to rather stick with that. Hopefully we can then make projects introduce a separate make target or option that runs the test suite with the standard output format.

One little step forward

Building full-fledged Linux distributions cross-compiled that are completely tested on target will remain being hard work for a while more. But we are improving things, one step at a time.

Of course, ‘ptest’ is the name Björn currently uses for the system within the yocto/OE environment. It is not supposed to be a catchy name for this idea outside of there. The ‘p’ refers to package, as opposed to for example system test, and makes it less generic than simply ‘test’.