Category Archives: Windows

Tech stuff on or with Windows

Blessed curl builds for Windows

The curl project is happy to introduce official and blessed curl builds for Windows for download on the curl web site.

This means we have a set of recommended curl packages that we advice users on Windows to download.

On Linux, macOS, cygwin and pretty much all the other alternatives you have out there, you don’t need to go to random sites on the Internet and download a binary package provided by a (to you) unknown stranger to get curl for your system. Unfortunately that is basically what we have forced Windows users into doing for a few years since our previous maintainer of curl builds for Windows dropped off the project.

These new official curl builds for Windows are the same set of builds Viktor Szakats has been building and providing to the community for a long time already. Now just with the added twist that he feeds his builds and information about them to the main curl site so that users can get them from the same site and thus lean on the same trust they already have in the curl brand in general.

These builds are reproducible, provided with sha256 hashes and a link to the full build log. Everything is public and transparently done.

All the hard work to get these builds in this great shape was done by Viktor Szakats.

Go get it!

much faster curl uploads on Windows with a single tiny commit

These days, operating system kernels provide TCP/IP stacks that can do really fast network transfers. It’s not even unusual for ordinary people to have gigabit connections at home and of course we want our applications to be able take advantage of them.

I don’t think many readers here will be surprised when I say that fulfilling this desire turns out much easier said than done in the Windows world.

Autotuning?

Since Windows 7 / 2008R2, Windows implements send buffer autotuning. Simply put, the faster transfer and longer RTT the connection has, the larger the buffer it uses (up to a max) so that more un-acked data can be outstanding and thus enable the system to saturate even really fast links.

Turns out this useful feature isn’t enabled when applications use non-blocking sockets. The send buffer isn’t increased at all then.

Internally, curl is using non-blocking sockets and most of the code is platform agnostic so it wouldn’t be practical to switch that off for a particular system. The code is pretty much independent of the target that will run it, and now with this latest find we have also started to understand why it doesn’t always perform as well on Windows as on other operating systems: the upload buffer (SO_SNDBUF) is fixed size and simply too small to perform well in a lot of cases

Applications can still enlarge the buffer, if they’re aware of this bottleneck, and get better performance without having to change libcurl, but I doubt a lot of them do. And really, libcurl should perform as good as it possibly can just by itself without any necessary tuning by the application authors.

Users testing this out

Daniel Jelinski brought a fix for this that repeatedly poll Windows during uploads to ask for a suitable send buffer size and then resizes it on the go if it deems a new size is better. In order to figure out that if this patch is indeed a good idea or if there’s a downside for some, we went wide and called out for users to help us.

The results were amazing. With speedups up to almost 7 times faster, exactly those newer Windows versions that supposedly have autotuning can obviously benefit substantially from this patch. The median test still performed more than twice as fast uploads with the patch. Pretty amazing really. And beyond weird that this crazy thing should be required to get ordinary sockets to perform properly on an updated operating system in 2018.

Windows XP isn’t affected at all by this fix, and we’ve seen tests running as VirtualBox guests in NAT-mode also not gain anything, but we believe that’s VirtualBox’s “fault” rather than Windows or the patch.

Landing

The commit is merged into curl’s master git branch and will be part of the pending curl 7.61.1 release, which is due to ship on September 5, 2018. I think it can serve as an interesting case study to see how long time it takes until Windows 10 users get their versions updated to this.

Table of test runs

The Windows versions, and the test times for the runs with the unmodified curl, the patched one, how much time the second run needed as a percentage of the first, a column with comments and last a comment showing the speedup multiple for that test.

Thank you everyone who helped us out by running these tests!

Version Time vanilla Time patched New time Comment speedup
6.0.6002 15.234 2.234 14.66% Vista SP2 6.82
6.1.7601 8.175 2.106 25.76% Windows 7 SP1 Enterprise 3.88
6.1.7601 10.109 2.621 25.93% Windows 7 Professional SP1 3.86
6.1.7601 8.125 2.203 27.11% 2008 R2 SP1 3.69
6.1.7601 8.562 2.375 27.74% 3.61
6.1.7601 9.657 2.684 27.79% 3.60
6.1.7601 11.263 3.432 30.47% Windows 2008R2 3.28
6.1.7601 5.288 1.654 31.28% 3.20
10.0.16299.309 4.281 1.484 34.66% Windows 10, 1709 2.88
10.0.17134.165 4.469 1.64 36.70% 2.73
10.0.16299.547 4.844 1.797 37.10% 2.70
10.0.14393 4.281 1.594 37.23% Windows 10, 1607 2.69
10.0.17134.165 4.547 1.703 37.45% 2.67
10.0.17134.165 4.875 1.891 38.79% 2.58
10.0.15063 4.578 1.907 41.66% 2.40
6.3.9600 4.718 2.031 43.05% Windows 8 (original) 2.32
10.0.17134.191 3.735 1.625 43.51% 2.30
10.0.17713.1002 6.062 2.656 43.81% 2.28
6.3.9600 2.921 1.297 44.40% Windows 2012R2 2.25
10.0.17134.112 5.125 2.282 44.53% 2.25
10.0.17134.191 5.593 2.719 48.61% 2.06
10.0.17134.165 5.734 2.797 48.78% run 1 2.05
10.0.14393 3.422 1.844 53.89% 1.86
10.0.17134.165 4.156 2.469 59.41% had to use the HTTPS endpoint 1.68
6.1.7601 7.082 4.945 69.82% over proxy 1.43
10.0.17134.165 5.765 4.25 73.72% run 2 1.36
5.1.2600 10.671 10.157 95.18% Windows XP Professional SP3 1.05
10.0.16299.547 1.469 1.422 96.80% in a VM runing on Linux 1.03
5.1.2600 11.297 11.046 97.78% XP 1.02
6.3.9600 5.312 5.219 98.25% 1.02
5.2.3790 5.031 5 99.38% Windows 2003 1.01
5.1.2600 7.703 7.656 99.39% XP SP3 1.01
10.0.17134.191 1.219 1.531 125.59% FTP 0.80
TOTAL 205.303 102.271 49.81% 2.01
MEDIAN 43.51% 2.30

Microsoft curls too

On December 19 2017, Microsoft announced that since insider build 17063 of Windows 10, curl is now a default component. I’ve been away from home since then so I haven’t really had time to sit down and write and explain to you all what this means, so while I’m a bit late, here it comes!

I see this as a pretty huge step in curl’s road to conquer the world.

curl was already existing on Windows

Ever since we started shipping curl, it has been possible to build curl for Windows and run it on Windows. It has been working fine on all Windows versions since at least Windows 95. Running curl on Windows is not new to us. Users with a little bit of interest and knowledge have been able to run curl on Windows for almost 20 years already.

Then we had the known debacle with Microsoft introducing a curl alias to PowerShell that has put some obstacles in the way for users of curl.

Default makes a huge difference

Having curl shipped by default by the manufacturer of an operating system of course makes a huge difference. Once this goes out to the general public, all of a sudden several hundred million users will get a curl command line tool install for them without having to do anything. Installing curl yourself on Windows still requires some skill and knowledge and on places like stackoverflow, there are many questions and users showing how it can be problematic.

I expect this to accelerate the curl command line use in the world. I expect this to increase the number of questions on how to do things with curl.

Lots of people mentioned how curl is a “good” new tool to use for malicious downloads of files to windows machines if you manage to run code on someone’s Windows computer. curl is quite a capable thing that you truly do not want to get invoked involuntarily. But sure, any powerful and capable tool can of course be abused.

About the installed curl

This is what it looks when you check out the curl version on this windows build:

(screenshot from Steve Holme)

I don’t think this means that this is necessarily exactly what curl will look like once this reaches the general windows 10 installation, and I also expect Microsoft to update and upgrade curl as we go along.

Some observations from this simple screenshot, and if you work for Microsoft you may feel free to see this as some subtle hints on what you could work on improving in future builds:

  1. They ship 7.55.1, while 7.57.0 was the latest version at the time. That’s just three releases away so I consider that pretty good. Lots of distros and others ship (much) older releases. It’ll be interesting to see how they will keep this up in the future.
  2. Unsurprisingly, they use a build that uses the WinSSL backend for TLS.
  3. They did not build it with IDN support.
  4. They’ve explicitly disabled support a whole range of protocols that curl supports natively by default (gopher, smb, rtsp etc), but they still have a few rare protocols enabled (like dict).
  5. curl supports LDAP using the windows native API, but that’s not used.
  6. The Release-Date line shows they built curl from unreleased sources (most likely directly from a git clone).
  7. No HTTP/2 support is provided.
  8. There’s no automatic decompression support for gzip or brotli content.
  9. The build doesn’t support metalink and no PSL (public suffix list).

(curl gif from the original Microsoft curl announcement blog post)

Independent

Finally, I’d like to add that like all operating system distributions that ship curl (macOS, Linux distros, the BSDs, AIX, etc) Microsoft builds, packages and ships the curl binary completely independently from the actual curl project.

Sure I’ve been in contact with the good people working on this from their end, but they are working totally independently of us in the curl project. They mostly get our code, build it and ship it.

I of course hope that we will get bug fixes and improvement from their end going forward when they find problems or things to polish.

The future looks as great as ever before!

Update: in March 2018, they mentioned that curl comes in Windows 10 version 1803.

Removing the PowerShell curl alias?

PowerShell is a spiced up command line shell made by Microsoft. According to some people, it is a really useful and good shell alternative.

Already a long time ago, we got bug reports from confused users who couldn’t use curl from their PowerShell prompts and it didn’t take long until we figured out that Microsoft had added aliases for both curl and wget. The alias had the shell instead invoke its own command called “Invoke-WebRequest” whenever curl or wget was entered. Invoke-WebRequest being PowerShell’s own version of a command line tool for fiddling with URLs.

Invoke-WebRequest is of course not anywhere near similar to neither curl nor wget and it doesn’t support any of the command line options or anything. The aliases really don’t help users. No user who would want the actual curl or wget is helped by these aliases, and user who don’t know about the real curl and wget won’t use the aliases. They were and remain pointless. But they’ve remained a thorn in my side ever since. Me knowing that they are there and confusing users every now and then – not me personally, since I’m not really a Windows guy.

Fast forward to modern days: Microsoft released PowerShell as open source on github yesterday. Without much further ado, I filed a Pull-Request, asking the aliases to be removed. It is a minuscule, 4 line patch. It took way longer to git clone the repo than to make the actual patch and submit the pull request!

It took 34 minutes for them to close the pull request:

“Those aliases have existed for multiple releases, so removing them would be a breaking change.”

To be honest, I didn’t expect them to merge it easily. I figure they added those aliases for a reason back in the day and it seems unlikely that I as an outsider would just make them change that decision just like this out of the blue.

But the story didn’t end there. Obviously more Microsoft people gave the PR some attention and more comments were added. Like this:

“You bring up a great point. We added a number of aliases for Unix commands but if someone has installed those commands on WIndows, those aliases screw them up.

We need to fix this.”

So, maybe it will trigger a change anyway? The story is ongoing…

curl on windows versions

I had to ask. Just to get a notion of which Windows versions our users are actually using, so that we could get an indication which versions we still should make an effort to keep working on. As people download and run libcurl on their own, we just have no other ways to figure this out.

As always when asking a question to our audience, we can’t really know which part of our users that responded and it is probably more safe to assume that it is not a representative distribution of our actual user base but it is simply as good as it gets. A hint.

I posted about this poll on the libcurl mailing list and over twitter. I had it open for about 48 hours. We received 86 responses. Click the image below for the full res version:

windows-versions-used-for-curlSo, Windows 10, 8 and 7 are very well used and even Vista and XP clocked in fairly high on 14% and 23%. Clearly those are Windows versions we should strive to keep supported.

For Windows versions older than XP I was sort of hoping we’d get a zero, but as you can see in the graph we have users claiming to use curl on as old versions as Windows NT 4. I even checked, and it wasn’t the same two users that checked all those three oldest versions.

The “Other” marks were for Windows 2008 and 2012, and bonus points for the user who added “Other: debian 7”. It is interesting that I specifically asked for users running curl on windows to answer this survey and yet 26% responded that they don’t use Windows at all..

schannel support in libcurl

schannel is the API Microsoft provides to allow applications to for example implement SSL natively, without needing any third part library.

On Monday June 11th we merged the 30+ commits Marc Hörsken brought us. This is now the 8th SSL variation supported by libcurl, and I figure this is going to become fairly popular now in the Windows camp coming the next release: curl 7.27.0.

So now my old talk about the seven SSL libraries libcurl supported has become outdated…

It can be worth noting that as long as you build (lib)curl to also support SCP and SFTP, powered by libssh2, that library will still require a separate crypto library and libssh2 supports to get built with either OpenSSL or gcrypt. Marc mentioned that he might work on making that one use schannel as well.

cURL

Who’s 0xabadbabe and why?

It is Friday after all, so I’ll offer this little glimpse as an example from what I do at work…

A while ago, I was working for a customer (who shall remain unnamed here) doing system simulation software. I worked on this project for a year or so. I ran full x86 systems completely simulated. During that time I was chasing some nasty bugs in the simulated usb-disk device that caused my Windows boot to end up in a blue screen.

BUGCODE_USB_DRIVER

I struggled to figure out why Windows 7 would write 0xABADBABE to EHCI register index 0x1C – which is a reserved register – during boot some 10 milliseconds before the blue screen appears, and I was convinced that it was due to a flaw in the EHCI simulation code and thus was the first indication of the failure. If I didn’t have any simulated usb-disk inserted that write wouldn’t occur, and similarly that write would occur even if I inserted the usb-disk much later – like even after Windows 7 had started and I was passed the login screen.

An interesting exercise is to grep for this (little-endian so twist it around!) 32 bit pattern in a freshly installed windows 7 file system – I found it on no less than 16 places in a 20GB file system. This bgrep utility was handy for this.

To properly disassemble that code, I hacked up a quick bcut tool so that I could cut out a suitable piece of the 20GB file to pass to objdump, as objdump very inconveniently does not offer an option to skip an arbitrary amount from the beginning of a file! Also, as it is not really possible to easily tell on which byte x86 code starts at, I had to be able to fine-adjust the beginning of the cut so that objdump would show correctly (this is x86-64):

      callq  *0x9061(%rip)        # 0x9080
      mov    0x40(%rsi),%r11d
      mov    %rsi,0x58(%rdi)
      mov    %r11d,(%rdi)
      mov    0x40(%rsi),%eax
      mov    %rsi,0x60(%rdi)
      mov    %eax,0x4(%rdi)
      mov    0xa0(%r13),%rax
      movl   $0xabadbabe,0x1c(%rax)

But then, reading that code never gave me enough clues to figure out why the offending MOV is made.

Thanks to a friend with a good eye and useful resources, I finally learned that Windows does this write on purpose to offer some kind of breakpoint for a debugger. It always does this (assuming a USB device or something is attached)!

A red herring as far as I’m concerned. Nothing to bother about, just MOV on! I simply made the simulation accept this.

Oh. You want to know what happened to the blue screen? It had nothing at all to do with the bad babe constant, but turned out to be because the ehci driver finds out that some USB data structs the controller fills in get pointers that point to memory outside of the area the driver has mapped for this purpose. In other words it was a really hard to track down bug in the simulated device.

localhost hack on Windows

There's no place like 127.0.0.1Readers of my blog and friends in general know that I’m not really a Windows guy. I never use it and I never develop things explicitly for windows – but I do my best in making sure my portable code also builds and runs on windows. This blog post is about a new detail that I’ve just learned and that I think I could help shed the light on, to help my fellow hackers. The other day I was contacted by a user of libcurl because he was using it on Windows and he noticed that when wanting to transfer data from the loopback device (where he had a service of his own), and he accessed it using “localhost” in the URL passed to libcurl, he would spot a DNS request for the address of that host name while when he used regular windows tools he would not see that! After some mails back and forth, the details got clear:

Windows has a default /etc/hosts version (conveniently instead put at “c:\WINDOWS\system32\drivers\etc\hosts”) and that default  /etc/hosts alternative used to have an entry for “localhost” in it that would point to 127.0.0.1.

When Windows 7 was released, Microsoft had removed the localhost entry from the /etc/hosts file. Reading sources on the net, it might be related to them supporting IPv6 for real but it’s not at all clear what the connection between those two actions would be.

getaddrinfo() in Windows has since then, and it is unclear exactly at which point in time it started to do this, been made to know about the specific string “localhost” and is documented to always return “all loopback addresses on the local computer”.

So, a custom resolver such as c-ares that doesn’t use Windows’ functions to resolve names but does it all by itself, that has been made to look in the /etc/host file etc now suddenly no longer finds “localhost” in a local file but ends up asking the DNS server for info about it… A case that is far from ideal. Most servers won’t have an entry for it and others might simply provide the wrong address.

I think we’ll have to give in and provide this hack in c-ares as well, just the way Windows itself does.

Oh, and as a bonus there’s even an additional hack mentioned in the getaddrinfo docs: On Windows Server 2003 and later if the pNodeName parameter points to a string equal to “..localmachine”, all registered addresses on the local computer are returned.

Concepts of a new distributed build

It was time to make an overhaul of our distributed builds system for Rockbox. The one currently in place is quite fancy and it does build 106 builds in around 7-8 minutes, but during the years it has served us we have found a few areas where we want to improve.

The goals for the new system were primarily:

  • do all the builds faster
  • reverse the connection so that people can contribute clients easier
  • make a system that is more allowing for slower machines to contribute

The biggest weaknesses of the existing system:

  • The master uses ssh to the distributed clients, which forces them to have an accessible ssh server and port etc. It also makes it awkward for people behind NATs who wants to run more clients.
  • It only hands out a particular build to one client, so thus if a large build happens to get handed to a slow client towards the end of a build round, all the other clients will sit idle waiting for the last client to finish.
  • The build and the subsequent upload of results to the master are synchronous, so thus a client with a very slow uplink may spend a significant time on the upload before it can start the next build.

The  new system is currently in development. It consists of a server that runs on one of our main servers, and there’s a client script that each volunteer contributor runs on their systems.

The clients connect to the master on a dedicated TCP port, specifying user name, password, name of the particular client instance, what particular architectures the client can build and how many bogomips the client boasts. While bogomips is a bogus way to measure anything, we’ve started out using it for a rough way to sort the the build clients based on speed.

The clients keep connected to the server all the time. There’s a ping message from the master every N second of idleness to make sure the connection is kept alive. As soon as the master wants the client to do a build, it sends a message to it detailing exactly how it should build it and using what SVN revision. The client will then do the build at once, upload the results using HTTP to a dedicated place and then tell the server the build is complete.

The server knows about all builds to do at a  commit, what we call a build round. It has a rough “score” or “weight” for each build that grades them in a slow to fast order. When a build round starts, the server will first sort all builds based on number of times they’ve been handed out and as secondary sort key the “weight” of it. Then it loops over the currently connected build clients and hand out builds from the sorted build table. The server then continues to do that until all clients have three builds each to build. As soon as a build is reported to have been completed by a client, that client will get the next build from the sorted build list.

If a client connects to the server and the server deems the client to be too old (since it does specify its version in the handshake message), it will be told to update to a specific version instead and come back then. This way the server can update all build clients when important things are fixed.

The clients will soon start to get assigned builds that already have been assigned to another client. This is not a problem but in fact our intention. The client that completes the build first will simply tell the server, and the server will then tell all the other clients that build that same build that they should cancel that particular build.

A client that joins the server in the middle of a build round will simply get a bunch of builds immediately and join in. A client that disconnects during a build round simply won’t complete its builds and other clients will instead do them. The system is also tolerant against the fact that bogomips is lame to compare computers with, and that the build “score” may not be very accurate or even that some server will have very slow or very fast upload speeds at unpredictable times.

The build master itself does not know when to start a new build round. It simply knows about the concept and it knows how to tell clients to complete a round. To make the master to start a new round, you need to connect to the server’s listening port and issue a special command and provide a password and then you can tell the server to start a build of a specific SVN revision. Or to queue up a build to be performed after the current one if there happens to be one in progress already.

When a full build round is complete, a hundred or so builds have been done, and full packages and log files are now in a directory on the build server, the server will simply trigger an external script that then takes care of updating our build table etc. In fact, every single completed build will optionally trigger an external script to allow web pages or stats pages to get updated as we go.

This build system is currently pretty Rockbox-specific as this is the project and development system we’re writing this for, but there’s really nothing in this that must be this way. I’m sure that if someone (you?) wants to adapt this for another project, I’d be more than happy to assist and to help ensuring that this becomes a more generic distributed build system. Just raise your hand and step forward!

At the time of this writing, (primarily) me and Björn are still ironing out quirks in this new system to hopefully get it going live real soon…

Rockbox

Windows localhost slowness

A client of mine and myself ran a bunch of tests doing FTP and SFTP transfers against localhost to measure how fast our custom solution is compared to a set of existing solutions.

The specific results from this aren’t what caught my eyes, mostly because they’re currently still only used for comparisons and to measure relative improvements, but it was instead the relative speed differences between the tests run on Mac 10.5.5, on Windows XP SP3 and on Linux 2.6.26.

Some of the Windows transfers took a magnitude more time than the others. Ten times longer. Since we could see this across multiple tests each being run multiple times and it was also visible with third party tools, the only conclusion I can draw from this is that Windows for some reason has a much slower localhost.

Does any reader of this have any further knowledge or details to share on this topic? Anyone knows if more recent Windows versions do this any better?

It should be noted that on Windows the ssh server used was running in cygwin, which may account for some of the slowness as cygwin isn’t really known for being blazingly fast…

Update:

Three friends responded to this question:

The first mention that he’d got problems on windows in the past where 127.0.0.1 worked but ‘localhost’ didn’t which might indicate that localhost for some reason would be treated differently.

The second said that it has been mentioned that Windows Vista has significant TCP improvements compared to older versions for which version the TCP/IP stack was rewritten completely.

Pierre (at Microsoft) pointed out that on Vista localhost resolves first to ::1 (ipv6) only, which may explain why some people experience quirks on Vista at least. This test was however done on XP…