Tag Archives: windows

much faster curl uploads on Windows with a single tiny commit

These days, operating system kernels provide TCP/IP stacks that can do really fast network transfers. It's not even unusual for ordinary people to have gigabit connections at home and of course we want our applications to be able take advantage of them.

I don't think many readers here will be surprised when I say that fulfilling this desire turns out much easier said than done in the Windows world.


Since Windows 7 / 2008R2, Windows implements send buffer autotuning. Simply put, the faster transfer and longer RTT the connection has, the larger the buffer it uses (up to a max) so that more un-acked data can be outstanding and thus enable the system to saturate even really fast links.

Turns out this useful feature isn't enabled when applications use non-blocking sockets. The send buffer isn't increased at all then.

Internally, curl is using non-blocking sockets and most of the code is platform agnostic so it wouldn't be practical to switch that off for a particular system. The code is pretty much independent of the target that will run it, and now with this latest find we have also started to understand why it doesn't always perform as well on Windows as on other operating systems: the upload buffer (SO_SNDBUF) is fixed size and simply too small to perform well in a lot of cases

Applications can still enlarge the buffer, if they're aware of this bottleneck, and get better performance without having to change libcurl, but I doubt a lot of them do. And really, libcurl should perform as good as it possibly can just by itself without any necessary tuning by the application authors.

Users testing this out

Daniel Jelinski brought a fix for this that repeatedly poll Windows during uploads to ask for a suitable send buffer size and then resizes it on the go if it deems a new size is better. In order to figure out that if this patch is indeed a good idea or if there's a downside for some, we went wide and called out for users to help us.

The results were amazing. With speedups up to almost 7 times faster, exactly those newer Windows versions that supposedly have autotuning can obviously benefit substantially from this patch. The median test still performed more than twice as fast uploads with the patch. Pretty amazing really. And beyond weird that this crazy thing should be required to get ordinary sockets to perform properly on an updated operating system in 2018.

Windows XP isn't affected at all by this fix, and we've seen tests running as VirtualBox guests in NAT-mode also not gain anything, but we believe that's VirtualBox's "fault" rather than Windows or the patch.


The commit is merged into curl's master git branch and will be part of the pending curl 7.61.1 release, which is due to ship on September 5, 2018. I think it can serve as an interesting case study to see how long time it takes until Windows 10 users get their versions updated to this.

Table of test runs

The Windows versions, and the test times for the runs with the unmodified curl, the patched one, how much time the second run needed as a percentage of the first, a column with comments and last a comment showing the speedup multiple for that test.

Thank you everyone who helped us out by running these tests!

Version Time vanilla Time patched New time Comment speedup
6.0.6002 15.234 2.234 14.66% Vista SP2 6.82
6.1.7601 8.175 2.106 25.76% Windows 7 SP1 Enterprise 3.88
6.1.7601 10.109 2.621 25.93% Windows 7 Professional SP1 3.86
6.1.7601 8.125 2.203 27.11% 2008 R2 SP1 3.69
6.1.7601 8.562 2.375 27.74% 3.61
6.1.7601 9.657 2.684 27.79% 3.60
6.1.7601 11.263 3.432 30.47% Windows 2008R2 3.28
6.1.7601 5.288 1.654 31.28% 3.20
10.0.16299.309 4.281 1.484 34.66% Windows 10, 1709 2.88
10.0.17134.165 4.469 1.64 36.70% 2.73
10.0.16299.547 4.844 1.797 37.10% 2.70
10.0.14393 4.281 1.594 37.23% Windows 10, 1607 2.69
10.0.17134.165 4.547 1.703 37.45% 2.67
10.0.17134.165 4.875 1.891 38.79% 2.58
10.0.15063 4.578 1.907 41.66% 2.40
6.3.9600 4.718 2.031 43.05% Windows 8 (original) 2.32
10.0.17134.191 3.735 1.625 43.51% 2.30
10.0.17713.1002 6.062 2.656 43.81% 2.28
6.3.9600 2.921 1.297 44.40% Windows 2012R2 2.25
10.0.17134.112 5.125 2.282 44.53% 2.25
10.0.17134.191 5.593 2.719 48.61% 2.06
10.0.17134.165 5.734 2.797 48.78% run 1 2.05
10.0.14393 3.422 1.844 53.89% 1.86
10.0.17134.165 4.156 2.469 59.41% had to use the HTTPS endpoint 1.68
6.1.7601 7.082 4.945 69.82% over proxy 1.43
10.0.17134.165 5.765 4.25 73.72% run 2 1.36
5.1.2600 10.671 10.157 95.18% Windows XP Professional SP3 1.05
10.0.16299.547 1.469 1.422 96.80% in a VM runing on Linux 1.03
5.1.2600 11.297 11.046 97.78% XP 1.02
6.3.9600 5.312 5.219 98.25% 1.02
5.2.3790 5.031 5 99.38% Windows 2003 1.01
5.1.2600 7.703 7.656 99.39% XP SP3 1.01
10.0.17134.191 1.219 1.531 125.59% FTP 0.80
TOTAL 205.303 102.271 49.81% 2.01
MEDIAN 43.51% 2.30

Removing the PowerShell curl alias?

PowerShell is a spiced up command line shell made by Microsoft. According to some people, it is a really useful and good shell alternative.

Already a long time ago, we got bug reports from confused users who couldn't use curl from their PowerShell prompts and it didn't take long until we figured out that Microsoft had added aliases for both curl and wget. The alias had the shell instead invoke its own command called "Invoke-WebRequest" whenever curl or wget was entered. Invoke-WebRequest being PowerShell's own version of a command line tool for fiddling with URLs.

Invoke-WebRequest is of course not anywhere near similar to neither curl nor wget and it doesn't support any of the command line options or anything. The aliases really don't help users. No user who would want the actual curl or wget is helped by these aliases, and user who don't know about the real curl and wget won't use the aliases. They were and remain pointless. But they've remained a thorn in my side ever since. Me knowing that they are there and confusing users every now and then - not me personally, since I'm not really a Windows guy.

Fast forward to modern days: Microsoft released PowerShell as open source on github yesterday. Without much further ado, I filed a Pull-Request, asking the aliases to be removed. It is a minuscule, 4 line patch. It took way longer to git clone the repo than to make the actual patch and submit the pull request!

It took 34 minutes for them to close the pull request:

"Those aliases have existed for multiple releases, so removing them would be a breaking change."

To be honest, I didn't expect them to merge it easily. I figure they added those aliases for a reason back in the day and it seems unlikely that I as an outsider would just make them change that decision just like this out of the blue.

But the story didn't end there. Obviously more Microsoft people gave the PR some attention and more comments were added. Like this:

"You bring up a great point. We added a number of aliases for Unix commands but if someone has installed those commands on WIndows, those aliases screw them up.

We need to fix this."

So, maybe it will trigger a change anyway? The story is ongoing...

curl on windows versions

I had to ask. Just to get a notion of which Windows versions our users are actually using, so that we could get an indication which versions we still should make an effort to keep working on. As people download and run libcurl on their own, we just have no other ways to figure this out.

As always when asking a question to our audience, we can't really know which part of our users that responded and it is probably more safe to assume that it is not a representative distribution of our actual user base but it is simply as good as it gets. A hint.

I posted about this poll on the libcurl mailing list and over twitter. I had it open for about 48 hours. We received 86 responses. Click the image below for the full res version:

windows-versions-used-for-curlSo, Windows 10, 8 and 7 are very well used and even Vista and XP clocked in fairly high on 14% and 23%. Clearly those are Windows versions we should strive to keep supported.

For Windows versions older than XP I was sort of hoping we'd get a zero, but as you can see in the graph we have users claiming to use curl on as old versions as Windows NT 4. I even checked, and it wasn't the same two users that checked all those three oldest versions.

The "Other" marks were for Windows 2008 and 2012, and bonus points for the user who added "Other: debian 7". It is interesting that I specifically asked for users running curl on windows to answer this survey and yet 26% responded that they don't use Windows at all..

Changing networks with Firefox running

Short recap: I work on network code for Mozilla. Bug 939318 is one of "mine" - yesterday I landed a fix (a patch series with 6 individual patches) for this and I wanted to explain what goodness that should (might?) come from this!


diffstat reports this on the complete patch series:

29 files changed, 920 insertions(+), 162 deletions(-)

The change set can be seen in mozilla-central here. But I guess a proper description is easier for most...

The bouncy road to inclusion

This feature set and associated problems with it has been one of the most time consuming things I've developed in recent years, I mean in relation to the amount of actual code produced. I've had it "landed" in the mozilla-inbound tree five times and yanked out again before it landed correctly (within a few hours), every time of course reverted again because I had bugs remaining in there. The bugs in this have been really tricky with a whole bunch of timing-dependent and race-like problems and me being unfamiliar with a large part of the code base that I'm working on. It has been a highly frustrating journey during periods but I'd like to think that I've learned a lot about Firefox internals partly thanks to this resistance.

As I write this, it has not even been 24 hours since it got into m-c so there's of course still a risk there's an ugly bug or two left, but then I also hope to fix the pending problems without having to revert and re-apply the whole series...

Many ways to connect to networks

Firefox Nightly screenshotIn many network setups today, you get an environment and a network "experience" that is crafted for that particular place. For example you may connect to your work over a VPN where you get your company DNS and you can access sites and services you can't even see when you connect from the wifi in your favorite coffee shop. The same thing goes for when you connect to that captive portal over wifi until you realize you used the wrong SSID and you switch over to the access point you were supposed to use.

For every one of these setups, you get different DHCP setups passed down and you get a new DNS server and so on.

These days laptop lids are getting closed (and the machine is put to sleep) at one place to be opened at a completely different location and rarely is the machine rebooted or the browser shut down.

Switching between networks

Switching from one of the networks to the next is of course something your operating system handles gracefully. You can even easily be connected to multiple ones simultaneously like if you have both an Ethernet card and wifi.

Enter browsers. Or in this case let's be specific and talk about Firefox since this is what I work with and on. Firefox - like other browsers - will cache images, it will cache DNS responses, it maintains connections to sites a while even after use, it connects to some sites even before you "go there" and so on. All in the name of giving the users an as good and as fast experience as possible.

The combination of keeping things cached and alive, together with the fact that switching networks brings new perspectives and new "truths" offers challenges.

Realizing the situation is new

The changes are not at all mind-bending but are basically these three parts:

  1. Make sure that we detect network changes, even if just the set of available interfaces change. Send an event for this.
  2. Make sure the necessary parts of the code listens and understands this "network topology changed" event and acts on it accordingly
  3. Consider coming back from "sleep" to be a network changed event since we just cannot be sure of the network situation anymore.

The initial work has been made for Windows only but it allows us to smoothen out any rough edges before we continue and make more platforms support this.

The network changed event can be disabled by switching off the new "network.notify.changed" preference. If you do end up feeling a need for that, I really hope you file a bug explaining the details so that we can work on fixing it!

Act accordingly

So what is acting properly? What if the network changes in a way so that your active connections suddenly can't be used anymore due to the new rules and routing and what not? We attack this problem like this: once we get a "network changed" event, we "allow" connections to prove that they are still alive and if not they're torn down and re-setup when the user tries to reload or whatever. For plain old HTTP(S) this means just seeing if traffic arrives or can be sent off within N seconds, and for websockets, SPDY and HTTP2 connections it involves sending an actual ping frame and checking for a response.

The internal DNS cache was a bit tricky to handle. I initially just flushed all entries but that turned out nasty as I then also killed ongoing name resolves that caused errors to get returned. Now I instead added logic that flushes all the already resolved names and it makes names "in transit" to get resolved again so that they are done on the (potentially) new network that then can return different addresses for the same host name(s).

This should drastically reduce the situation that could happen before when Firefox would basically just freeze and not want to do any requests until you closed and restarted it. (Or waited long enough for other timeouts to trigger.)

The 'N seconds' waiting period above is actually 5 seconds by default and there's a new preference called "network.http.network-changed.timeout" that can be altered at will to allow some experimentation regarding what the perfect interval truly is for you.

Firefox BallInitially on Windows only

My initial work has been limited to getting the changed event code done for the Windows back-end only (since the code that figures out if there's news on the network setup is highly system specific), and now when this step has been taken the plan is to introduce the same back-end logic to the other platforms. The code that acts on the event is pretty much generic and is mostly in place already so it is now a matter of making sure the event can be generated everywhere.

My plan is to start on Firefox OS and then see if I can assist with the same thing in Firefox on Android. Then finally Linux and Mac.

I started on Windows since Windows is one of the platforms with the largest amount of Firefox users and thus one of the most prioritized ones.

More to do

There's separate work going on for properly detecting captive portals. You know the annoying things hotels and airports for example tend to have to force you to do some login dance first before you are allowed to use the internet at that location. When such a captive portal is opened up, that should probably qualify as a network change - but it isn't yet.

WSAPoll is broken

Microsoft admits the WSApoll function is broken but won't do anything about it. Unless perhaps if customers keep nagging them.

Doing portable socket programming has always meant using a bunch of #ifdefs and similar. A program needs to be built on many systems and slowly get adjusted to work really well all over. For ages, for example, Windows only supported select() and not poll() while all sensible systems[*] out there supported poll(). There are several reasons to prefer poll to select when writing code.

Then one day in 2006, Chad Charlin, a developer at Microsoft wrote the following when talking about the new WSApoll() function they introduced in Windows Vista:

Among the many improvements to the Winsock API shipping in Vista is the new WSAPoll function. Its primary purpose is to simplify the porting of a sockets application that currently uses poll() by providing an identical facility in Winsock for managing groups of sockets.

Great! Starting September 2006 curl started using it (shipped in the release curl and libcurl 7.16.0). It seemed like a huge step forward, and as Chad wrote:

If you have experience developing applications using poll(), WSAPoll will be very familiar. It is designed to behave just like poll().

Emphasis added by me. It was (of course) made to work like poll, and that's why the API is made like that. Why would you introduce something that is almost like poll() except in minor details?

Since the new function only was available in Vista and later, it took a while until libcurl users in a more wider scale got to use it but over time Windows XP users are slowly shifting away and more and more libcurl Windows users therefore use the WSApoll based builds. Life seemed to be good. Some users noticed funny things and reported bugs we couldn't repeat (on other platforms) but nothing really stood out and no big alarm bells went off.

During July 2012, a user of libcurl on Windows, Jan Koen Annot experienced such problems and he didn't just sigh and move on. He rolled up his sleeves and decided to get to the bottom. Perhaps he could fix a bug or two while at it? (It seems reasonable that he thought so, I haven't actually asked him!) What he found was however not a bug in libcurl. He found out that WSApoll did indeed not work like poll (his initial post to curl-library on the problem)! On August 1st he submitted a support issue to Microsoft about it. On August 7 we pushed the commit to curl that removed our use of WSApoll.

A few days go Jan reported back on how the case has gone, where his journey down the support alleys took him.

It turns out Microsoft already knew about this bug, which they apparently have named "Windows 8 Bugs 309411 - WSAPoll does not report failed connections". The ticket has been resolved as Won't Fix... (I haven't found any public access of this.)

Jan argued for the case that since WSApoll is designed and used as a plain poll() replacement it would make sense to actually make it also work the same way:

First, it will cost much time to find out that some 'real-life' issue can be traced back to this WSAPoll bug. In my case we were lucky to have a regression test which triggered when we started using a slightly different cURL-library configuration on Windows. Tracing back that the test was triggered because of this bug in WSAPoll took several hours. Imagine what it would cost, if some customer in the field reported annoying delays, to trace such a vague complaint back to a bug in the WSAPoll function!

Second, even if we know beforehand about this bug in WSAPoll, then it is difficult to determine in which situations in your code you can safely use WSAPoll and in which situations you might suffer from this bug. So a better recommendation would be to simply not use WSAPoll. (...)

Third, porting code which uses the poll() function to the Windows sockets API is made more complex. The introduction of WSAPoll was meant specifically for this, so it should have compatible behavior, without a recommendation to not use it in certain circumstances.

Fourth, your recommendation will only have effect when actively promoted to developers using WSAPoll. A much better approach would be to repair the bug and publish an update. Microsoft has some nice mechanisms in place for that.

So, my conclusion is that, even if in our case the business impact may be low because we found the bug in an early stage, it is still important that Microsoft fixes the bug and publishes an update.

In my eyes all very good and sensible arguments. Perhaps not too surprisingly, these fine reasons didn't have any particular impact on how Microsoft views this old and known bug that "has been like this forever and people are already used to it.". It will remain closed, and Microsoft motivated this decision to Jan quite clearly and with arguments one can understand:

A discussion has been conducted around this topic and the taken decision was not to have the fix implemented due to the following reasons:

  • This issue since Vista
  • no other Microsoft customer has asked for a Hotfix since Vista timeframe
  • fixing this old issue might have some application compatibility risk (for those customers who might have somehow taken a dependency on WSAPoll failing with a timeout in the cases of connect failure as opposed to POLLERR).
  • This API will become more irrelevant as the Windows versions increase; the networking APIs will move away from classic select/poll to more advanced I/O completion mechanisms.

Argument one and two are really weak and silly. Microsoft users are very rarely complaining to Microsoft and most wouldn't even know how to do it. Also, this problem may certainly still affect many users even if nobody has asked for a fix.

The compatibility risk is a valid point, but that's a bit of a hard argument to have. All bugs that are about behavior will of course risk that users have adapted to the wrong behavior so a bug fix may break those. All of us who write and maintain stable APIs are used to this problem, but sticking to the buggy way of working because it has been doing this for so long is in my eyes only correct if you document this with very large letters and emphasis in all documentation: WSApoll is not fully emulating poll - beware!

The fact that they will focus more on other APIs is also understandable but besides the point. We want reliable APIs that work as documented. Applications that are Windows-only probably already very rarely use WSApoll, it will probably remain being more important for porting socket style programs to Windows.

Jan also especially highlights a funny line from this Microsoft person:

The best way to add pressure for a hotfix to be released would be to have the customers reporting it again on http://connect.microsoft.com.

Okay, so even if they have motives why they won't fix this bug they seem to hint that if more customers nag them about it they might change their minds. Fair enough. But the users of libcurl who for five years perhaps experienced funny effects are extremely unlikely to ever report and complain to Microsoft about this. They are way more likely to complain to us, or possibly to just work around the issue somehow.

Of course, users of WSApoll can adapt to the differences and make conditional code that handles them and that could be what we end up with in the curl project in the future if we just get volunteers to adapt the code accordingly. In the mean time we've just reverted to the old select()-using code instead, since select() does in fact mimic the "real" select much better...

[*] = clearly Mac OS X is not a sensible system since its poll() implementation is even worse than Windows and is mostly broken or just unreliable. Subject for another blog post another time.

schannel support in libcurl

schannel is the API Microsoft provides to allow applications to for example implement SSL natively, without needing any third part library.

On Monday June 11th we merged the 30+ commits Marc Hörsken brought us. This is now the 8th SSL variation supported by libcurl, and I figure this is going to become fairly popular now in the Windows camp coming the next release: curl 7.27.0.

So now my old talk about the seven SSL libraries libcurl supported has become outdated...

It can be worth noting that as long as you build (lib)curl to also support SCP and SFTP, powered by libssh2, that library will still require a separate crypto library and libssh2 supports to get built with either OpenSSL or gcrypt. Marc mentioned that he might work on making that one use schannel as well.


Who’s 0xabadbabe and why?

It is Friday after all, so I'll offer this little glimpse as an example from what I do at work...

A while ago, I was working for a customer (who shall remain unnamed here) doing system simulation software. I worked on this project for a year or so. I ran full x86 systems completely simulated. During that time I was chasing some nasty bugs in the simulated usb-disk device that caused my Windows boot to end up in a blue screen.


I struggled to figure out why Windows 7 would write 0xABADBABE to EHCI register index 0x1C - which is a reserved register - during boot some 10 milliseconds before the blue screen appears, and I was convinced that it was due to a flaw in the EHCI simulation code and thus was the first indication of the failure. If I didn't have any simulated usb-disk inserted that write wouldn't occur, and similarly that write would occur even if I inserted the usb-disk much later - like even after Windows 7 had started and I was passed the login screen.

An interesting exercise is to grep for this (little-endian so twist it around!) 32 bit pattern in a freshly installed windows 7 file system - I found it on no less than 16 places in a 20GB file system. This bgrep utility was handy for this.

To properly disassemble that code, I hacked up a quick bcut tool so that I could cut out a suitable piece of the 20GB file to pass to objdump, as objdump very inconveniently does not offer an option to skip an arbitrary amount from the beginning of a file! Also, as it is not really possible to easily tell on which byte x86 code starts at, I had to be able to fine-adjust the beginning of the cut so that objdump would show correctly (this is x86-64):

      callq  *0x9061(%rip)        # 0x9080
      mov    0x40(%rsi),%r11d
      mov    %rsi,0x58(%rdi)
      mov    %r11d,(%rdi)
      mov    0x40(%rsi),%eax
      mov    %rsi,0x60(%rdi)
      mov    %eax,0x4(%rdi)
      mov    0xa0(%r13),%rax
      movl   $0xabadbabe,0x1c(%rax)

But then, reading that code never gave me enough clues to figure out why the offending MOV is made.

Thanks to a friend with a good eye and useful resources, I finally learned that Windows does this write on purpose to offer some kind of breakpoint for a debugger. It always does this (assuming a USB device or something is attached)!

A red herring as far as I'm concerned. Nothing to bother about, just MOV on! I simply made the simulation accept this.

Oh. You want to know what happened to the blue screen? It had nothing at all to do with the bad babe constant, but turned out to be because the ehci driver finds out that some USB data structs the controller fills in get pointers that point to memory outside of the area the driver has mapped for this purpose. In other words it was a really hard to track down bug in the simulated device.

localhost hack on Windows

There's no place like of my blog and friends in general know that I'm not really a Windows guy. I never use it and I never develop things explicitly for windows - but I do my best in making sure my portable code also builds and runs on windows. This blog post is about a new detail that I've just learned and that I think I could help shed the light on, to help my fellow hackers. The other day I was contacted by a user of libcurl because he was using it on Windows and he noticed that when wanting to transfer data from the loopback device (where he had a service of his own), and he accessed it using "localhost" in the URL passed to libcurl, he would spot a DNS request for the address of that host name while when he used regular windows tools he would not see that! After some mails back and forth, the details got clear:

Windows has a default /etc/hosts version (conveniently instead put at "c:\WINDOWS\system32\drivers\etc\hosts") and that default  /etc/hosts alternative used to have an entry for "localhost" in it that would point to

When Windows 7 was released, Microsoft had removed the localhost entry from the /etc/hosts file. Reading sources on the net, it might be related to them supporting IPv6 for real but it's not at all clear what the connection between those two actions would be.

getaddrinfo() in Windows has since then, and it is unclear exactly at which point in time it started to do this, been made to know about the specific string "localhost" and is documented to always return "all loopback addresses on the local computer".

So, a custom resolver such as c-ares that doesn't use Windows' functions to resolve names but does it all by itself, that has been made to look in the /etc/host file etc now suddenly no longer finds "localhost" in a local file but ends up asking the DNS server for info about it... A case that is far from ideal. Most servers won't have an entry for it and others might simply provide the wrong address.

I think we'll have to give in and provide this hack in c-ares as well, just the way Windows itself does.

Oh, and as a bonus there's even an additional hack mentioned in the getaddrinfo docs: On Windows Server 2003 and later if the pNodeName parameter points to a string equal to "..localmachine", all registered addresses on the local computer are returned.

Windows localhost slowness

A client of mine and myself ran a bunch of tests doing FTP and SFTP transfers against localhost to measure how fast our custom solution is compared to a set of existing solutions.

The specific results from this aren't what caught my eyes, mostly because they're currently still only used for comparisons and to measure relative improvements, but it was instead the relative speed differences between the tests run on Mac 10.5.5, on Windows XP SP3 and on Linux 2.6.26.

Some of the Windows transfers took a magnitude more time than the others. Ten times longer. Since we could see this across multiple tests each being run multiple times and it was also visible with third party tools, the only conclusion I can draw from this is that Windows for some reason has a much slower localhost.

Does any reader of this have any further knowledge or details to share on this topic? Anyone knows if more recent Windows versions do this any better?

It should be noted that on Windows the ssh server used was running in cygwin, which may account for some of the slowness as cygwin isn't really known for being blazingly fast...


Three friends responded to this question:

The first mention that he'd got problems on windows in the past where worked but 'localhost' didn't which might indicate that localhost for some reason would be treated differently.

The second said that it has been mentioned that Windows Vista has significant TCP improvements compared to older versions for which version the TCP/IP stack was rewritten completely.

Pierre (at Microsoft) pointed out that on Vista localhost resolves first to ::1 (ipv6) only, which may explain why some people experience quirks on Vista at least. This test was however done on XP...