Tag Archives: Network

WSAPoll is broken

Microsoft admits the WSAPoll function is broken but won’t do anything about it. Unless, perhaps, customers keep nagging them.

Doing portable socket programming has always meant using a bunch of #ifdefs and similar. A program needs to be built on many systems and slowly gets adjusted until it works really well all over. For ages, for example, Windows only supported select() and not poll(), while all sensible systems[*] out there supported poll(). There are several reasons to prefer poll() to select() when writing code.

Then one day in 2006, Chad Charlin, a developer at Microsoft, wrote the following about the new WSAPoll() function they introduced in Windows Vista:

Among the many improvements to the Winsock API shipping in Vista is the new WSAPoll function. Its primary purpose is to simplify the porting of a sockets application that currently uses poll() by providing an identical facility in Winsock for managing groups of sockets.

Great! Starting in September 2006, curl started using it (it shipped in the curl and libcurl 7.16.0 release). It seemed like a huge step forward, and as Chad wrote:

If you have experience developing applications using poll(), WSAPoll will be very familiar. It is designed to behave just like poll().

Emphasis added by me. It was (of course) made to work like poll(); that’s why the API looks the way it does. Why would you introduce something that is almost like poll() except in minor details?

Since the new function was only available in Vista and later, it took a while until libcurl users on a wider scale got to use it, but over time Windows XP users have slowly shifted away, so more and more Windows libcurl users run the WSAPoll-based builds. Life seemed to be good. Some users noticed funny things and reported bugs we couldn’t repeat (on other platforms), but nothing really stood out and no big alarm bells went off.

During July 2012, Jan Koen Annot, a user of libcurl on Windows, experienced such problems, and he didn’t just sigh and move on. He rolled up his sleeves and decided to get to the bottom of it. Perhaps he could fix a bug or two while at it? (It seems reasonable that he thought so; I haven’t actually asked him!) What he found, however, was not a bug in libcurl. He found out that WSAPoll did indeed not work like poll() (his initial post to curl-library on the problem)! On August 1st he submitted a support issue to Microsoft about it. On August 7 we pushed the commit to curl that removed our use of WSAPoll.

A few days ago Jan reported back on how the case has gone, and where his journey down the support alleys took him.

It turns out Microsoft already knew about this bug, which they apparently have named “Windows 8 Bugs 309411 – WSAPoll does not report failed connections”. The ticket has been resolved as Won’t Fix… (I haven’t found any public access to it.)
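To make the failure concrete: detecting a failed connection attempt is one of the basic jobs a poll()-based connect loop gives to poll(). Here is a rough sketch of that pattern in Python (for illustration only; curl itself is written in C and this is not its code). With a working poll(), a refused connection wakes the call up almost immediately; with the broken WSAPoll, the same logic would sit silently until the timeout expired:

```python
import errno
import select
import socket

def poll_connect(host, port, timeout_ms=1000):
    """Start a non-blocking connect and use poll() to learn its fate.

    With a working poll(), a refused connection wakes us up right away
    and getsockopt(SO_ERROR) reveals the failure. The broken WSAPoll
    reported nothing, so callers only ever saw a timeout."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    rc = s.connect_ex((host, port))
    if rc not in (0, errno.EINPROGRESS):
        s.close()
        return rc  # loopback refusals may fail immediately
    p = select.poll()
    p.register(s.fileno(), select.POLLOUT)
    if not p.poll(timeout_ms):  # the buggy WSAPoll always ended up here
        s.close()
        return errno.ETIMEDOUT
    err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    s.close()
    return err
```

On a POSIX system, connecting to a closed port on localhost makes this return ECONNREFUSED almost instantly, while code running on the buggy WSAPoll would burn the whole timeout and report ETIMEDOUT instead.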

Jan argued the case that since WSAPoll is designed and used as a plain poll() replacement, it would make sense to actually make it work the same way:

First, it will cost much time to find out that some ‘real-life’ issue can be traced back to this WSAPoll bug. In my case we were lucky to have a regression test which triggered when we started using a slightly different cURL-library configuration on Windows. Tracing back that the test was triggered because of this bug in WSAPoll took several hours. Imagine what it would cost, if some customer in the field reported annoying delays, to trace such a vague complaint back to a bug in the WSAPoll function!

Second, even if we know beforehand about this bug in WSAPoll, then it is difficult to determine in which situations in your code you can safely use WSAPoll and in which situations you might suffer from this bug. So a better recommendation would be to simply not use WSAPoll. (…)

Third, porting code which uses the poll() function to the Windows sockets API is made more complex. The introduction of WSAPoll was meant specifically for this, so it should have compatible behavior, without a recommendation to not use it in certain circumstances.

Fourth, your recommendation will only have effect when actively promoted to developers using WSAPoll. A much better approach would be to repair the bug and publish an update. Microsoft has some nice mechanisms in place for that.

So, my conclusion is that, even if in our case the business impact may be low because we found the bug in an early stage, it is still important that Microsoft fixes the bug and publishes an update.

In my eyes these are all very good and sensible arguments. Perhaps not too surprisingly, these fine reasons didn’t have any particular impact on how Microsoft views this old and known bug that “has been like this forever and people are already used to it”. It will remain closed, and Microsoft motivated this decision to Jan quite clearly and with arguments one can understand:

A discussion has been conducted around this topic and the taken decision was not to have the fix implemented due to the following reasons:

  • This issue [has existed] since Vista
  • no other Microsoft customer has asked for a Hotfix since Vista timeframe
  • fixing this old issue might have some application compatibility risk (for those customers who might have somehow taken a dependency on WSAPoll failing with a timeout in the cases of connect failure as opposed to POLLERR).
  • This API will become more irrelevant as the Windows versions increase; the networking APIs will move away from classic select/poll to more advanced I/O completion mechanisms.

Arguments one and two are really weak and silly. Microsoft users very rarely complain to Microsoft and most wouldn’t even know how to do it. Also, this problem may certainly still affect many users even if nobody has asked for a fix.

The compatibility risk is a valid point, but it’s a bit of a hard argument to make. Every bug about behavior carries the risk that users have adapted to the wrong behavior, so a fix may break them. All of us who write and maintain stable APIs are used to this problem, but sticking to the buggy way of working just because it has been that way for so long is in my eyes only correct if you document it with very large letters and emphasis in all documentation: WSAPoll is not fully emulating poll – beware!

The fact that they will focus more on other APIs is also understandable but beside the point. We want reliable APIs that work as documented. Windows-only applications probably already very rarely use WSAPoll; it will probably remain more important for porting socket-style programs to Windows.

Jan also especially highlights a funny line from this Microsoft person:

The best way to add pressure for a hotfix to be released would be to have the customers reporting it again on http://connect.microsoft.com.

Okay, so even if they have reasons why they won’t fix this bug, they seem to hint that if more customers nag them about it they might change their minds. Fair enough. But the users of libcurl who for five years perhaps experienced funny effects are extremely unlikely to ever report and complain to Microsoft about this. They are way more likely to complain to us, or possibly to just work around the issue somehow.

Of course, users of WSAPoll can adapt to the differences and write conditional code that handles them, and that could be what we end up with in the curl project in the future, if we get volunteers to adapt the code accordingly. In the meantime we’ve just reverted to the old select()-using code, since Windows’ select() does in fact mimic the “real” select() much better…

[*] = clearly Mac OS X is not a sensible system, since its poll() implementation is even worse than Windows’ and is mostly broken or just unreliable. Subject for another blog post another time.

Fixed name to dynamic IP with CNAME

Notice: this is not advanced nor secret trickery. It is just something I’ve found that even tech-savvy people around me haven’t done, so it’s worth being highlighted.

When I upgraded to fiber from ADSL, I had to give up the fixed IPv4 address I’d been using for around 10 years and switch to a dynamic DNS service.

In this situation I see lots of people use dyndns.org and similar services that dynamically assign a new IP to a fixed host name, so that you and your computer-illiterate friends can still reach your home site.

I suggest a minor variation of this that still avoids having to run your own dyndns service. It only requires that you already admin and control a DNS domain, but who doesn’t these days?

  1. Get a dyndns host name from the free provider, which keeps the name pointing at your current IP. Let’s call it home.dyndns.org.
  2. In your own DNS zone for your domain (example.com), add an entry for your host ‘home.example.com’ as a CNAME pointing over to home.dyndns.org.
  3. Now you can ping or ssh or whatever to ‘home.example.com’ and it will follow your home IP even when it moves.
  4. Smile and keep using that host name in your own domain for your dynamic IP.
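Step 2, as it would look in BIND-style zone file syntax (a sketch; substitute your own names):

```
; in the example.com zone file
home    IN  CNAME  home.dyndns.org.
```

Note the trailing dot on the target name; it stops the zone’s origin from being appended, so the record really points at home.dyndns.org and not home.dyndns.org.example.com.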

A network provider for my fiber

In late May I finally got my media converter installed inside my house so now my fiber gets terminated into a 4-port gigabit switch.

Inteno fiber equipment

Now the quest to find the right provider started. I have a physical 1 gigabit connection to “the station”, and of the 12 providers I can choose to have the internet service delivered by (listed on bredbandswebben.se), at least two offer 1000 mbit download speeds (with 100 mbit upload). I would ideally like a fixed IPv4 address and an IPv6 subnet, and I want my company to subscribe to this service.

Those two companies are T3 and Alltele. Strangely enough, both of them failed to respond in a timely manner, so I went on to probe a few of the other companies that deliver less than 1000 mbit services.

The one company that responded fastest and with more details than any other was Tyfon. They informed me that currently nobody can sell a “company subscription” on this service and that at my address I can only get at most a 100/100 mbit service right now. (Amusingly, most of these operators also offer 250/25 and 500/50 rates, but I would really like to finally get a decent upstream speed so that I can, for example, back up to a remote site at a reasonable rate.)

So, I went with 100/100 mbit for 395 SEK/month (~ 44 Euro or 57 USD). I just now submitted my order and their confirmation arrived at 23:00:24. They say it may take a little while to deliver so we’ll see (“normally within 1-2 weeks“). I’ll report back when I have news.

(And I’ve not yet gotten the invoice for the physical installation…)

shorter HTTP requests for curl

Starting in curl 7.26.0 (due to be released at the end of May 2012), we will shrink the User-Agent: header that curl sends by default in HTTP(S) requests to something much shorter! I suspect that this will raise some eyebrows out there, so even though I’ve emailed about it to the curl-users list before, I thought I’d better write it up and elaborate.

A default ‘curl localhost’ on Debian Linux makes 170 bytes get sent in that single request:

GET / HTTP/1.1
User-Agent: curl/7.24.0 (i486-pc-linux-gnu) libcurl/7.24.0 OpenSSL/1.0.0g zlib/1.2.6 libidn/1.23 libssh2/1.2.8 librtmp/2.3
Host: localhost
Accept: */*

As you can see, the user-agent description takes up a large portion of that request, and for really no good reason at all. Without sacrificing any functionality, I shrank the same request down to 71 bytes:

GET / HTTP/1.1
User-Agent: curl/7.24.0
Host: localhost
Accept: */*

That means we shrunk it down to 41% of the original size. I’ll admit the example is a bit extreme and most other normal use cases will use longer host names and longer paths, but even for a URL like “http://daniel.haxx.se/docs/curl-vs-wget.html” we’re down to 50% of the original request size (100 vs 199).
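The arithmetic is easy to check yourself. Here is a small Python sketch (mine, just for illustration, not anything curl ships) that counts the bytes the same way as the numbers above: each header line plus its CRLF, with the blank line terminating the request left out of the count:

```python
# The long default User-Agent curl used to send, and the new short one.
LONG_UA = ("curl/7.24.0 (i486-pc-linux-gnu) libcurl/7.24.0 OpenSSL/1.0.0g "
           "zlib/1.2.6 libidn/1.23 libssh2/1.2.8 librtmp/2.3")
SHORT_UA = "curl/7.24.0"

def request_size(path, host, user_agent):
    """Bytes sent for a minimal curl-style GET request.

    Each line is terminated by CRLF (+2); the empty line that ends the
    request is not counted, matching the 170/71 figures quoted above."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"User-Agent: {user_agent}",
        f"Host: {host}",
        "Accept: */*",
    ]
    return sum(len(line) + 2 for line in lines)
```

With these inputs, request_size("/", "localhost", LONG_UA) gives 170 bytes and the short variant gives 71, and you can plug in longer hosts and paths to see how the saving shrinks in relative terms.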

Can we shrink it even more? Sure, we could leave out the version number too. I left it in there for now only to allow some kind of statistics to be extracted. We can’t remove the entire header; we need to include a user-agent in requests since there are too many servers that won’t function properly otherwise.

And before anyone asks: this change is only for the curl command line tool and not for libcurl, the library. libcurl does in fact not send any user-agent at all by default…

Digging the fiber

Finally the installation of my open fiber is moving along.

Roughly two weeks ago, the team responsible for getting the thing from the boundary of my estate to my house arrived. They spent a great deal of time trying to piggyback the existing tube already running under my driveway for the telephone cable – until they gave up and had to use their shovels to dig a ditch through my garden. Apparently the existing tube was too tight and already too filled up with the existing cables. A little stroke of bad luck, I think, since now they instead had to make a mess of my garden. Here’s a little picture of the dig work they did:

a ditch for the fiber through the garden

They aim for a depth of 25 cm for the cable while going through people’s estates, while outside of my garden they need 50 cm depth underneath the road and sidewalk down my little suburb street.

Once they were done we could see this orange cable sticking up next to my mailbox:

the outer end of the cable by my mailbox

… and the other end is sticking up next to my front door. I expect the next team to arrive and do the installation from here: pull the fibre in through my wall and install the media converter etc, possibly in the closet next to my front door. We’ll see…

the end of the cable next to the stairs by my front door

Today, when I arrived home after work, the team that was digging up the sidewalk had already connected the cable end that was previously sticking out next to my mailbox (the middle picture).

Of course, they did their best at putting things (like soil) back as it was but I’ll admit that my better half used some rather colorful expressions to describe her sentiments about getting the garden remade like this.

I’ll get back with more reports later on, when I get things installed internally and when the garden starts to recover.

Network hardware deaths

Things went south already this morning. My wife was about to work from home and called me before 8am asking for help to get online, as the wireless Internet access didn’t work for her.

As this has happened at some occasions before she knew she might need to reboot the wifi router to get things running again. So she did. Only this time, when she inserted the power plug again there was not a single LED turning on. None. She yanked it out again and re-inserted it. Nothing.

Okay, so she was not able to use the wifi and the router was dead.

At lunch, I took a short walk in the sunshine to my nearest “Kjell & Co” and got myself a new wifi router, brought it home after work and immediately replaced the dead one with the new shiny one. I ran upstairs (most of my network gear is under the staircase on the bottom floor while my main computer and work space is on the upper floor), configured the new router with the static IP and those things that need to be there and…

…weird, I still can’t access the Internet!

I then decided to do the power-cycle dance with the ADSL modem as well. I could see how the “WAN” LED blinked, turned stable, and then I could actually successfully send several ping packets (that got responses) before the connection broke again and the WAN LED on the modem switched off. I retried the power-cycle procedure but the LED stayed off.

I called customer support for my ADSL service (Bredbandsbolaget) and they immediately spotted how old my modem is, indicating that it was probably the reason for the failure and set me up to receive a free replacement unit within 2-3 days.

This left me with several problems still nudging my brain:

  1. Why would two devices standing next to each other, connected with a cat5 cable, suddenly break on the same day when they both have been running flawlessly like this for years? I had perfect network access when I went to bed last night and there were no power outages, lightning strikes or similar.
  2. Why and how could the customer service so quickly judge that the reason was the age of my modem? I get the sense they just knee-jerk ship a replacement unit because of the age of mine, and there’s a rather big risk that when I plug in the new modem in a few days it will show the same symptoms…
  3. 2-3 days!! Gaaaah. Thank God I can tether with my phone, but man, 3G may be nice and all but it’s not like the trusty old 12 mbit ADSL I tend to get. Not to mention that the RTT is much worse, and that’s a factor for me who uses quite a lot of SSH to remote machines.

I guess I will find out when the new hardware arrives. I may get reason to write a follow-up then. I hope not!

Update on September 23rd:

A new ADSL modem arrived just two days after my call and yay, it could sync and I could use the internet. Unfortunately something was still wrong, though, as my telephone didn’t work (I have an IP-telephony service that goes through the ADSL box). It took me until Sunday to call customer service again, and on Tuesday a second replacement modem arrived, which I installed on Thursday and… now even the phone works!

I never figured out why both devices died, but the end result is that my 802.11n wifi works properly with speeds above 6.5MB/sec in my house.

What SOCKS is good for

You ever wondered what SOCKS is good for these days?

To help us use the Internet better without letting our surroundings watch us as much as they otherwise could!

There are basically two good scenarios and use areas for us ordinary people to use SOCKS:

  1. You’re a consultant, or you’re doing some kind of work where you are physically connected to a customer’s or a friend’s network. You access the big bad Internet via their proxy, or entirely proxy-less using their equipment and cables. This allows the network admin(s) to capture and snoop on your network traffic, be it on purpose or by mistake, as long as you don’t use HTTPS or other secure mechanisms. When surfing the web, it is very easy to drop out of HTTPS and into HTTP by mistake. Also, even if you use HTTPS to the world, the name resolutions and more are still done unencrypted and will leak information.
  2. You’re using an open wifi network that isn’t encrypted. Anyone else in that same area can basically capture anything you send and receive.

What do you need to set it up? You run

ssh -D 8080 myname@myserver.example.com

… and once you’ve connected, you make sure that you change the network settings of your favourite programs (browsers, IRC clients, mail reader, etc) to reach the Internet using the SOCKS proxy on localhost port 8080. Now you’re done.

Now all your traffic will reach the Internet via your remote server and all traffic between that and your local machine is sent encrypted and secure. This of course requires that you have a server running OpenSSH somewhere, but don’t we all?
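Under the hood, the SOCKS5 protocol your applications then speak to that local port is pleasantly simple. As a rough sketch (Python, following RFC 1928; this is my illustration, not what ssh runs internally), the CONNECT request a client sends for a host name looks like this:

```python
import struct

def socks5_connect_request(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request for a domain name (RFC 1928).

    VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name), followed by a
    length-prefixed host name and the port in network byte order."""
    name = host.encode("ascii")
    if len(name) > 255:
        raise ValueError("host name too long for SOCKS5")
    return (b"\x05\x01\x00\x03"
            + bytes([len(name)])
            + name
            + struct.pack(">H", port))
```

Because the proxy receives the host name itself, the name resolution can happen at the remote end too, which is exactly why the unencrypted DNS leak mentioned in scenario 1 goes away.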

If you are behind another proxy in the first place, it gets a little more complicated but still perfectly doable. See my separate SSH through or over proxy document for details.

Open fibre

One of the big telecom operators in Sweden, Telia, has started to offer “fibre to the house” – called “Öppen Fiber” in Swedish – and I’ve signed up for it. They’re investing 5 billion SEK into building fibre infrastructure, and I happen to live in one of the first areas in Sweden that gets the chance to participate. What’s in this blog post is information as I’ve received and understood it. I will of course follow up in the future and tell how it all turns out in reality.

Copper is a Dead End

fiber cable

I have my own house. My thinking is that copper-based technologies, such as the up-to-24mbit-but-really-12mbit ADSL I have now (I have some 700 meters or so to the nearest station), have reached something of an end of the road. I had 3 mbit/sec ADSL almost ten years ago: obviously not a lot of improvement is happening in this area. We need to look elsewhere in order to up our connection speeds. I think getting a proper fibre connection to the house will be a good thing for years to come. I don’t expect wireless/radio techniques to be able to compete properly, at least not within the coming years.

Open

This is an “open fibre” in the sense that Telia will install and own the physical fibre and installation but they will not run any services on top of it. I will then buy my internet services, TV and telephone services (should I decide that TV and phone over the fibre is desirable) from the selection of service companies that decide to join in and compete for my money.

Installation

They’re promising delivery “before the end of the year”. I won’t even get an estimated installation date until around mid August. If there is no existing tube for the copper or electricity that they can use to push the fibre through, they will dig: from the road outside my house to my building, across whatever land is there. They need to dig roughly 40 cm deep.

The fibre is terminated inside the house (a maximum of 5 meters inside the building) in a small “media converter” box which basically converts from fibre to an RJ45 network plug. It is the size of a regular small switch or so. It is claimed to be possible to get a different “box” that provides a direct fibre plug of some sort, for the people who may already have fibre installed in their houses.

I currently have a burglar alarm in my house that uses the current phone connection, which I’ll need to get either dumped completely or converted over to a telephone-over-fibre concept. I don’t plan on paying for or using any copper-based service once the fibre gets here. (There’s however no way to use the Swedish tax deduction “rot-avdrag”.)

Price

dlink DIR 635

There’s no monthly fee for the fibre; I only pay a one-time installation fee of 16700 SEK (roughly 1800 Euros) to get it. I will of course have to pay for the services if I want to actually use the installation, but until I do there are no fees involved. This price is fixed and the same for all the houses in my area that got this deal. On August 15th the deal ends and they’ll increase the installation price to 26700 SEK. Given the amount of work they have to put in for each new customer, I don’t really consider this price steep. A lot of money, sure, but also quite a lot of value.

Speeds to expect

The physical speed between my house and the other end (some kind of fibre termination station somewhere) will be exactly 1000 mbit/sec, with no more “up to” phrasing or similar in the contract. Of course, that’s just the physical speed of the link, and with this equipment the network cannot be any faster than 1000 mbit. There will then be ISPs that offer an internet connection on top of it, and they may very well offer lower speeds, and varying speeds at different tariffs. Right now, other fibre installations done by Telia seem to get offered up to 100/100 mbit connections. As this is not the physical maximum, it should allow for future increases without much problem. The 1000 mbit/sec speed over the fibre is a limitation in the actual installed hardware (not the fibre), so in the future Telia can indeed replace the media converters at both ends and bump the speed up significantly, should they want to and feel that there’s business in doing so. My current D-Link wifi router only has 100 mbit WAN support, so clearly I’ll have to replace that if I go beyond.

IPv6

Seriously, I believe I may be closer to actually getting a real IPv6 offer with this than with ADSL here in Sweden. I haven’t really investigated this for real though.

Update

December 16th: I got a mail from Telia today that informed me that the installation in my area has been delayed so it won’t happen until Q2 2012! 🙁

IRC use is declining

I discovered IRC around 1993.

Back then, before EFnet split in two, the IRC channel I frequented was #amiga, and we were a small bunch of people from all over the globe who got to know each other pretty well. In the 90s I participated in one of my first open source projects and we created the IRC bot we named Dancer. Dancer was a really talented “defence bot” back in the days of the “wild west” of IRC, when channel takeovers, flood attacks and nick collisions were widespread and frequently occurring. Dancer helped us keep things calm. Later on, I was part of the team that created and set up the new IRC network called amiganet.

I’ve been using IRC on and off since those days in the early 1990s and still today I hang out on 5-6 channels on freenode every day.

IRC was launched to the world already in 1988, almost 23 years ago. I’ve been trying to document the basic history of IRC, and when I updated that page the other day with some usage numbers for freenode, I decided to have a look around the net to see if there are any general numbers for IRC usage at large. I found out that usage is decreasing all over and has been doing so for years. Without research, I figure IRC users are either old farts like myself or at least very tech-oriented and geeky. Younger, newer and less techy people use other means of communication.

IRC never “took off” among the general public. In general, I find that people prefer various IM systems (something that I’ve never understood or adopted myself) and most “ordinary” humans I know don’t even know what IRC is. Possible explanations include the fact that the IRC protocol never got very good (there’s only that original spec from ’93), that there are a million completely separate IRC networks with no cross-network messages, and that all IRC networks still today suffer from netsplits and other artifacts due to deficiencies in how the IRC servers talk to each other.

5-6 years ago, the four most popular networks all regularly had over 100,000 users. Quakenet was well over 200,000. Last year, only Quakenet reached over 100,000. It seems basically all of them have roughly half the numbers they had in 2004.

Graphs from irc.netsplit.de:

2004

IRC usage 2004

2010

IRC usage 2010