Pretending port zero is a normal one

Saturday, October 25th, 2014

Speaking the TCP protocol, we communicate between “ports” in the local and remote ends. Each of these port fields are 16 bits in the protocol header so they can hold values between 0 – 65535. (IPv4 or IPv6 are the same here.) We usually do HTTP on port 80 and we do HTTPS on port 443 and so on. We can even play around and use them on various other custom ports when we feel like it.

But what about port 0 (zero) ? Sure, IANA lists the port as “reserved” for TCP and UDP but that’s just a rule in a list of ports, not actually a filter implemented by anyone.

In the actual TCP protocol port 0 is nothing special but just another number. Several people have told me “it is not supposed to be used” or that it is otherwise somehow considered bad to use this port over the internet. I don’t really know where this notion comes from more than that IANA listing.

Frank Gevaerts helped me perform some experiments with TCP port zero on Linux.

In the Berkeley sockets API widely used for doing TCP communications, port zero has a bit of a harder situation. Most of the functions and structs treat zero as just another number so there’s virtually no problem as a client to connect to this port using for example curl. See below for a printout from a test shot.

Running a TCP server on port 0 however, is tricky since the bind() function uses a zero in the port number to mean “pick a random one” (I can only assume this was a mistake done eons ago that can’t be changed). For this test, a little iptables trickery was run so that incoming traffic on TCP port 0 would be redirected to port 80 on the server machine, so that we didn’t have to patch any server code.

Entering a URL with port number zero to Firefox gets this message displayed:

This address uses a network port which is normally used for purposes other than Web browsing. Firefox has canceled the request for your protection.

… but Chrome accepts it and tries to use it as given.

The only little nit that remains when using curl against port 0 is that it seems glibc’s getpeername() assumes this is an illegal port number and refuses to work. I marked that line in curl’s output in red below just to highlight it for you. The actual source code with this check is here. This failure is not lethal for libcurl, it will just have slightly less info but will still continue to work. I claim this is a glibc bug.

$ curl -v http://10.0.0.1:0 -H "Host: 10.0.0.1"
* Rebuilt URL to: http://10.0.0.1:0/
* Hostname was NOT found in DNS cache
* Trying 10.0.0.1...
* getpeername() failed with errno 107: Transport endpoint is not connected
* Connected to 10.0.0.1 () port 0 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.38.1-DEV
> Accept: */*
> Host: 10.0.0.1
>
< HTTP/1.1 200 OK
< Date: Fri, 24 Oct 2014 09:08:02 GMT
< Server: Apache/2.4.10 (Debian)
< Last-Modified: Fri, 24 Oct 2014 08:48:34 GMT
< Content-Length: 22
< Content-Type: text/html

<html>testpage</html>

Why doing this experiment? Just for fun to to see if it worked.

FOSS them students

Thursday, October 16th, 2014

On October 16th, I visited DSV at Stockholm University where I had the pleasure of holding a talk and discussion with students (and a few teachers) under the topic Contribute to Open Source. Around 30 persons attended.

Here are the slides I use, as usual possibly not perfectly telling stand-alone without the talk but there was no recording made and I talked in Swedish anyway…

Contribute to Open Source from Daniel Stenberg

Changing networks with Firefox running

Friday, September 26th, 2014

Short recap: I work on network code for Mozilla. Bug 939318 is one of “mine” – yesterday I landed a fix (a patch series with 6 individual patches) for this and I wanted to explain what goodness that should (might?) come from this!

diffstat

diffstat reports this on the complete patch series:

29 files changed, 920 insertions(+), 162 deletions(-)

The change set can be seen in mozilla-central here. But I guess a proper description is easier for most…

The bouncy road to inclusion

This feature set and associated problems with it has been one of the most time consuming things I’ve developed in recent years, I mean in relation to the amount of actual code produced. I’ve had it “landed” in the mozilla-inbound tree five times and yanked out again before it landed correctly (within a few hours), every time of course reverted again because I had bugs remaining in there. The bugs in this have been really tricky with a whole bunch of timing-dependent and race-like problems and me being unfamiliar with a large part of the code base that I’m working on. It has been a highly frustrating journey during periods but I’d like to think that I’ve learned a lot about Firefox internals partly thanks to this resistance.

As I write this, it has not even been 24 hours since it got into m-c so there’s of course still a risk there’s an ugly bug or two left, but then I also hope to fix the pending problems without having to revert and re-apply the whole series…

Many ways to connect to networks

Firefox Nightly screenshotIn many network setups today, you get an environment and a network “experience” that is crafted for that particular place. For example you may connect to your work over a VPN where you get your company DNS and you can access sites and services you can’t even see when you connect from the wifi in your favorite coffee shop. The same thing goes for when you connect to that captive portal over wifi until you realize you used the wrong SSID and you switch over to the access point you were supposed to use.

For every one of these setups, you get different DHCP setups passed down and you get a new DNS server and so on.

These days laptop lids are getting closed (and the machine is put to sleep) at one place to be opened at a completely different location and rarely is the machine rebooted or the browser shut down.

Switching between networks

Switching from one of the networks to the next is of course something your operating system handles gracefully. You can even easily be connected to multiple ones simultaneously like if you have both an Ethernet card and wifi.

Enter browsers. Or in this case let’s be specific and talk about Firefox since this is what I work with and on. Firefox – like other browsers – will cache images, it will cache DNS responses, it maintains connections to sites a while even after use, it connects to some sites even before you “go there” and so on. All in the name of giving the users an as good and as fast experience as possible.

The combination of keeping things cached and alive, together with the fact that switching networks brings new perspectives and new “truths” offers challenges.

Realizing the situation is new

The changes are not at all mind-bending but are basically these three parts:

  1. Make sure that we detect network changes, even if just the set of available interfaces change. Send an event for this.
  2. Make sure the necessary parts of the code listens and understands this “network topology changed” event and acts on it accordingly
  3. Consider coming back from “sleep” to be a network changed event since we just cannot be sure of the network situation anymore.

The initial work has been made for Windows only but it allows us to smoothen out any rough edges before we continue and make more platforms support this.

The network changed event can be disabled by switching off the new “network.notify.changed” preference. If you do end up feeling a need for that, I really hope you file a bug explaining the details so that we can work on fixing it!

Act accordingly

So what is acting properly? What if the network changes in a way so that your active connections suddenly can’t be used anymore due to the new rules and routing and what not? We attack this problem like this: once we get a “network changed” event, we “allow” connections to prove that they are still alive and if not they’re torn down and re-setup when the user tries to reload or whatever. For plain old HTTP(S) this means just seeing if traffic arrives or can be sent off within N seconds, and for websockets, SPDY and HTTP2 connections it involves sending an actual ping frame and checking for a response.

The internal DNS cache was a bit tricky to handle. I initially just flushed all entries but that turned out nasty as I then also killed ongoing name resolves that caused errors to get returned. Now I instead added logic that flushes all the already resolved names and it makes names “in transit” to get resolved again so that they are done on the (potentially) new network that then can return different addresses for the same host name(s).

This should drastically reduce the situation that could happen before when Firefox would basically just freeze and not want to do any requests until you closed and restarted it. (Or waited long enough for other timeouts to trigger.)

The ‘N seconds’ waiting period above is actually 5 seconds by default and there’s a new preference called “network.http.network-changed.timeout” that can be altered at will to allow some experimentation regarding what the perfect interval truly is for you.

Firefox BallInitially on Windows only

My initial work has been limited to getting the changed event code done for the Windows back-end only (since the code that figures out if there’s news on the network setup is highly system specific), and now when this step has been taken the plan is to introduce the same back-end logic to the other platforms. The code that acts on the event is pretty much generic and is mostly in place already so it is now a matter of making sure the event can be generated everywhere.

My plan is to start on Firefox OS and then see if I can assist with the same thing in Firefox on Android. Then finally Linux and Mac.

I started on Windows since Windows is one of the platforms with the largest amount of Firefox users and thus one of the most prioritized ones.

More to do

There’s separate work going on for properly detecting captive portals. You know the annoying things hotels and airports for example tend to have to force you to do some login dance first before you are allowed to use the internet at that location. When such a captive portal is opened up, that should probably qualify as a network change – but it isn’t yet.

daniel.haxx.se week #3

Monday, September 22nd, 2014

I won’t keep posting every video update here, but I mostly wanted to mention that I’ve kept posting a weekly video over at youtube basically explaining what’s going on right now within my dearest projects. Mostly curl and some Firefox stuff.

This week: libcurl server cert verification API got a bashing at SEC-T, is HTTP for UDP a good idea? How about adding HTTP cache support to libcurl? HTTP/2 is getting deployed as we speak. Interesting curl bug when used by XBMC. The patch series for Firefox bug 939318 is improving slowly – will it ever land?

Video perhaps?

Monday, September 8th, 2014

I decided to try to do a short video about my current work right now and make it available for you all. I try to keep it short (5-7 minutes) and I’m certainly no pro at it, but I will try to make a weekly one for a while and see if it gets any fun. I’m going to read your comments and responses to this very eagerly and that will help me decide how I will proceed on this experiment.

Enjoy.

HTTP/2 interop pains

Tuesday, September 2nd, 2014

At around 06:49 CEST on the morning of August 27 2014, Google deployed an HTTP/2 draft-14 implementation on their front-end servers that handle logins to Google accounts (and possibly others). Those at least take care of all the various login stuff you do with Google, G+, gmail, etc.

The little problem with that was just that their implementation of HTTP2 is in disagreement with all existing client implementations of that same protocol at that draft level. Someone immediately noticed this problem and filed a bug against Firefox.

The Firefox Nightly and beta versions have HTTP2 enabled by default and so users quickly started to notice this and a range of duplicate bug reports have been filed. And keeps being filed as more users run into this problem. As far as I know, Chrome does not have this enabled by default so much fewer Chrome users get this ugly surprise.

The Google implementation has a broken cookie handling (remnants from the draft-13 it looks like by how they do it). As I write this, we’re on the 7th day with this brokenness. We advice bleeding-edge users of Firefox to switch off HTTP/2 support in the mean time until Google wakes up and acts.

You can actually switch http2 support back on once you’ve logged in and it then continues to work fine. Below you can see what a lovely (wildly misleading) error message you get if you try http2 against Google right now with Firefox:

google-http2-draft14-cookies

This post is being debated on hacker news.

Updated: 20:14 CEST: There’s a fix coming, that supposedly will fix this problem on Thursday September 4th.

Update 2: In the morning of September 4th (my time), Google has reverted their servers to instead negotiate SPDY 3.1 and Firefox is fine with this.

Firefox and partial content

Monday, June 16th, 2014

Firefox BallOne of the first bugs that fell into my lap when I started working for Mozilla not a very long time ago, was bug 237623. Anyone involved in Mozilla knows a bug in that range is fairly old (we just recently passed one million filed bugs). This particular bug was filed in March 2004 and there are (right now) 26 other bugs marked as duplicates of this. Today, the fix for this problem has landed.

The core of the problem is that when a HTTP server sends contents back to a client, it can send a header along indicating the size of the data in the response. The header is called “Content-Length:”. If the connection gets broken during transfer for whatever reason and the browser hasn’t received as much data as was initially claimed to be delivered, that’s a very good hint that something is wrong and the transfer was incomplete.

The perhaps most annoying way this could be seen is when you download a huge DVD image or something and for some reason the connection gets cut off after only a short time, way before the entire file is downloaded, but Firefox just silently accept that as the end of the transfer and think everything was fine and dandy.

What complicates the issue is the eternal problem: not everything abides to the protocol. This said, if there are frequent violators of the protocol we can’t strictly fail on each case of problem we detect but we must instead do our best to handle it anyway.

Is Content-Length a frequently violated HTTP response header?

Let’s see…

  1. Back in the HTTP 1.0 days, the Content-Length header was not very important as the connection was mostly shut down after each response anyway. Alas, clients/browsers would swiftly learn to just wait for the disconnect anyway.
  2. Back in the old days, there were cases of problems with “large files” (files larger than 2 or 4GB) which every now and then caused the Content-Length: header to turn into negative or otherwise confused values when it wrapped. That’s not really happening these days anymore.
  3. With HTTP 1.1 and its persuasive use of persistent connections it is important to get the size right, as otherwise the chain of requests get messed up and we end up with tears and sad faces
  4. In curl’s HTTP parser we’ve always been strictly abiding to this header and we’ve bailed out hard on mismatches. This is a very rare error for users to get and based on this (admittedly unscientific data) I believe that there is not a widespread use of servers sending bad Content-Length headers.
  5. It seems Chrome at least in some aspects is already much more strict about this header.

My fix for this problem takes a slightly careful approach and only enforces the strictness for HTTP 1.1 or later servers. But then as a bonus, it has grown to also signal failure if a chunked encoded transfer ends without the ending trailer or if a SPDY or http2 transfer gets prematurely stopped.

This is basically a 6-line patch at its core. The rest is fixing up old test cases, added new tests etc.

As a counter-point, Eric Lawrence apparently worked on adding stricter checks in IE9 three years ago as he wrote about in Content-Length in the Real World. They apparently subsequently added the check again in IE10 which seems to have caused some problems for them. It remains to be seen how this change affects Firefox users out in the real world. I believe it’ll be fine.

This patch also introduces the error code for a few other similar network situations when the connection is closed prematurely and we know there are outstanding data that never arrived, and I got the opportunity to improve how Firefox behaves when downloading an image and it gets an error before the complete image has been transferred. Previously (when a partial transfer wasn’t an error), it would always throw away the image on an error and instead show the “image not found” picture. That really doesn’t make sense I believe, as a partial image is better than that default one – especially when a large portion of the image has been downloaded already.

Follow-up effects

Other effects of this change that possibly might be discovered and cause some new fun reports: prematurely cut off transfers of javascript or CSS will discard the entire javascript/CSS file. Previously the partial file would be used.

Of course, I doubt that these are the files that are as commonly cut off as many other file types but still on a very slow and bad connection it may still happen and the new behavior will make Firefox act as if the file wasn’t loaded at all, instead of previously when it would happily used the portions of the files that it had actually received. Partial CSS and partial javascript of course could lead to some “fun” effects of brokenness.

Less plain-text is better. Right?

Tuesday, May 13th, 2014

Every connection and every user on the Internet is being monitored and snooped at to at least some extent every now and then. Everything from the casual firesheep user in your coffee shop, an admin in your ISP, your parents/kids on your wifi network, your employer on the company network, your country’s intelligence service in a national network hub or just a random rogue person somewhere in the middle of all this.

My involvement in HTTP make me mostly view and participate in this discussion with this protocol primarily in mind, but the discussion goes well beyond HTTP and the concepts can (and will?) be applied to most Internet protocols in the future. You can follow some of these discussions in the httpbis group, the UTA group, the tcpcrypt list on twitter and elsewhere.

IETF just published RFC 7258 which states:

Pervasive Monitoring Is a Widespread Attack on Privacy

Passive monitoring

Most networking surveillance can be done entirely passively by just running the correct software and listening in on the correct cable. Because most internet traffic is still plain-text and readable by anyone who wants to read it when the bytes come flying by. Like your postman can read your postcards.

Opportunistic?

Recently there’s been a fierce discussion going on both inside and outside of IETF and other protocol and standards groups about doing “opportunistic encryption” (OE) and its merits and drawbacks. The term, which in itself is being debated and often is said to be better called “opportunistic keying” (OK) instead, is about having protocols transparently (invisible to the user) upgrade plain-text versions to TLS unauthenticated encrypted versions of the protocols. I’m emphasizing the unauthenticated word there because that’s a key to the debate. Recently I’ve been told that the term “opportunistic security” is the term to use instead…

In the way of real security?

Basically the argument against opportunistic approaches tends to be like this: by opportunistically upgrading plain-text to unauthenticated encrypted communication, sysadmins and users in the world will consider that good enough and they will then not switch to using proper, strong and secure authentication encryption technologies. The less good alternative will hamper the adoption of the secure alternative. Server admins should just as well buy a cert for 10 USD and use proper HTTPS. Also, listeners can still listen in on or man-in-the-middle unauthenticated connections if they capture everything from the start of the connection, including the initial key exchange. Or the passive listener will just change to become an active party and this unauthenticated way doesn’t detect that. OE doesn’t prevent snooping.

Isn’t it better than plain text?

The argument for opportunism here is that there will be nothing to the user that shows that it is “upgrading” to something less bad than plain text. Browsers will not show the padlock, clients will not treat the connection as “secure”. It will just silently and transparently make passive monitoring of networks much harder and it will force actors who truly want to snoop on specific traffic to up their game and probably switch to active monitoring for more cases. Something that’s much more expensive for the listener. It isn’t about the cost of a cert. It is about setting up and keeping the cert up-to-date, about SNI not being widely enough adopted and that we can see only 30% of all sites on the Internet today use HTTPS – for these reasons and others.

HTTP:// over TLS

In the httpbis work group in IETF the outcome of this debate is that there is a way being defined on how to do HTTP as specified with a HTTP:// URL – that we’ve learned is plain-text – over TLS, as part of the http2 work. Alt-Svc is the way. (The header can also be used to just load balance HTTP etc but I’ll ignore that for now)

Mozilla and Firefox is basically the only team that initially stands behind the idea of implementing this in a browser. HTTP:// done over TLS will not be seen nor considered any more secure than ordinary HTTP is and users will not be aware if that happens or not. Only true HTTPS connections will get the padlock, secure cookies and the other goodies true HTTPS sites are known and expected to get and show.

HTTP:// over TLS will just silently send everything through TLS (assuming that it can actually negotiate such a connection), thus making passive monitoring of the network less easy.

Ideally, future http2 capable servers will only require a config entry to be set TRUE to make it possible for clients to do OE on them.

HTTPS is the secure protocol

HTTP:// over TLS is not secure. If you want security and privacy, you should use HTTPS. This said, MITMing HTTPS transfers is still a widespread practice in certain network setups…

TCPcrypt

I find this initiative rather interesting. If implemented, it removes the need for all these application level protocols to do anything about opportunistic approaches and it could instead be handled transparently on TCP level! It still has a long way to go though before we will see anything like this fly in real life.

The future will tell

Is this just a fad that will get no adoption and go away or is it the beginning of something that will change how we do protocols in the future? Time will tell. Many harsh words are being exchanged over this topic in many a debate right now…

(I’m trying to stick to “HTTP:// over TLS” here when referring to doing HTTP OE/OK over TLS. This is partly because RFC2818 that describes how to do HTTPS uses the phrase “HTTP over TLS”…)

Wireshark dissector work

Thursday, April 24th, 2014

WiresharkRecently I cloned the Wireshark git repository and started updating the http2 dissector. That’s the piece of code that gets called to analyze a stream of data that Wireshark thinks is http2.

The current http2 dissector was left at draft-09 state, while the current draft at the time was number 11 and there have been several changes on the binary format since so any reasonably updated client or server would send or receive byte streams that Wireshark couldn’t properly display.

I never wrote any dissector code before but I must say Wireshark didn’t disappoint. It was straight forward and mostly downright easy to fix most of the wrong details. I’m not pretending to be a master at this nor is the dissector code anywhere near “finished” yet but I still enjoyed the API and how to write a thing like this.

I’ve since dissected plain-text http2 streams that I’ve done with curl+nghttp2 and I’ve also used the SSLKEYLOGFILE trick with Firefox to automatically decrypt the TLS session and have the dissector figure out the underlying http2 parts.

If there’s any little snag to mention, it is the fact that they insist on getting patches submitted directly to gerrit instead of any mailing list or similar. This required me to create a gerrit account, and really figure out how to push my stuff from git to there, instead of the more traditional and simpler approach of just sending my patch to a mailing list or possibly submitting it to a bug/patch tracker somewhere with my browser.

Call me old-style but in fact the hip way of today with a pull-request github style would also have been much easier. Here’s what my gerrit submission looks like. But I get it, gerrit does push a little more work over to the submitter and I figure that once a submitter such as myself finally has fixed all the nits in the patch it is very easy for the project to actually merge it. I actually got someone else to help me point out how to even find the link to view the code review after the first one was submitted on the site… (when I post this, my patch has not yet been accepted or merged into the wireshark git repo)

Here’s a basic screenshot showing a trace of Firefox requesting https://nghttp2.org using http2. Click it for the full thing.

wireshark-screenshot

.. and what happens this morning my time? There’s a brand new http2 draft-12 out with more changes on the on-the-wire format! Well to be honest, that really wasn’t a surprise. I’ll get the new stuff supported too, but I’ll do that in a separate patch as I prefer to hold off until I see a live stream by at least one implementation to test against.

Presentation: what is http2

Wednesday, April 2nd, 2014

We had the 14th meetup with foss-sthlm yesterday and I talked about http2 for the almost 100 attendees in the audience. See my slides at slideshare:

Http2 from Daniel Stenberg