Tag Archives: IETF

Testing 2-digit year numbers in cookies

In the current work of the IETF http-state working group, we’re documenting how cookies work. The question came up how browsers and clients treat years in ‘expires’ strings if the year is only specified with two digits. And more precisely, is 69 in the future or in the past?

I decided to figure that out. I setup a little CGI that can be used to check what your browser thinks:

http://daniel.haxx.se/cookie.cgi

It sends a single cookie header that looks like:

Set-Cookie: testme=yesyes; expires=Wed Sep  1 22:01:55 69;

The CGI script looks like this:

print "Content-Type: text/plain\n";
print "Set-Cookie: testme=yesyes; expires=Wed Sep  1 22:01:55 69;\n";
print "\nempty?\n";
print $ENV{'HTTP_COOKIE'};

You see that it prints the Cookie: header, so if you reload that URL you should see “testme=yesyes” being output if the cookie is still there. If the cookie is still there, your browser of choice treats the date above as a date in the future.

So, what browsers think 69 is in the future and what think 69 is in the past? Feel free to try out more browsers and tell me the results, this is the list we have so far:

Future:

Firefox v3 and v4 (year 2069)
curl (year 2038)
IE 7 (year 2069)
Opera (year 2036)
Konqueror 4.5
Android

Past:

Chrome (both v4 and v5)
Gnome Epiphany-Webkit

Thanks to my friends in #rockbox-community that helped me out!

(this info was originally posted to the httpstate mailing list)

Beyond just “69”

(this section was added after my first post)

After having done the above basic tests, I proceeded and wrote a slightly more involved test that sets 100 cookies in this format:

Set-Cookie: test$yy=set; expires=Wed Oct  1 22:01:55 $yy;

When the user reloads this page, the page prints all “test$yy” cookies that get sent to the server. The results with the various browsers is very interesting. These are the ranges different browsers think are future:

  • Firefox: 21 – 69 (Safari and Fennec and MicroB on n900) [*]
  • Chrome: 10 – 68
  • Konqueror: 00 – 99 (and IE3, Links, Netsurf, Voyager)
  • curl: 10 – 70
  • Opera: 41 – 69 (and Opera Mobile) [*]
  • IE8: 31 – 79 (and slimbrowser)
  • IE4: 61 – 79 (and IE5, IE6)
  • Midori: 10 – 69 (and IBrowse)
  • w3m: 10 – 37
  • AWeb: 10 – 77
  • Nokia 6300: [none]

[*] = Firefox has a default limit of 50 cookies per host which is the explanation to this funny range. When I changed the config ‘network.cookie.maxPerHost’ to 200 instead (thanks to Dan Witte), I got the more sensible and expected range 10 – 69. Opera has the similar thing, it has a limit of 30 cookies by default which explains the 41-69 limit in this case. It would otherwise get 10-69 as well. (thanks to Stanislaw Adrabinski). I guess that the IE8 range is similarly restricted due to it using a limit of 50 cookies per host and an epoch at 1980.

I couldn’t help myself from trying to parse what this means. The ranges can roughly be summarized like this:

0-9: mostly in the past
10-20: almost always in the future except Firefox
21-30: even more likely to be in the future except IE8
31-37: everyone but opera thinks this is the future
38-40: now w3m and opera think this is the past
41-68: everyone but w3m thinks this is the future
69: Chrome and w3m say past
70: curl, IE8, Konqueror say future
71-79: IE8 and Konqueror say future, everyone else say past
80-99: Konqueror say future, everyone else say past

How to test a browser near you:

  1. goto http://daniel.haxx.se/cookie2.cgi
  2. reload once
  3. the numbers shown on the screen is the year numbers the browsers consider
    to be the future as described above

Websockets right now

Lots of sites today have JavaScript running that connects to the site and keeps the connection open for a long time (or just doing very frequent checks if there is updated info to get). This to such an wide extent that people have started working on a better way. A way for scripts in browsers to connect to a site in a TCP-like manner to exchange messages back and forth, something that doesn’t suffer from the problems HTTP provides when (ab)used for this purpose.

Websockets is the name of the technology that has been lifted out from the HTML5 spec, taken to the IETF and is being worked on there to produce a network protocol that is basically a message-based low level protocol over TCP, designed to allow browsers to do long-living connections to servers instead of using “long-polling HTTP”, Ajax polling or other more or less ugly tricks.

Unfortunately, the name is used for both the on-the-wire protocol as well as the JavaScript API so there’s room for confusion when you read or hear this term, but I’m completely clueless about JavaScript so you can rest assured that I will only refer to the actual network protocol when I speak of Websockets!

I’m tracking the Hybi mailing list closely, and I’m summarizing this post on some of the issues that’s been debated lately. It is not an attempt to cover it all. There is a lot more to be said, both now and in the future.

The first versions of Websockets that were implemented by browsers (Chrome) was the -75 version, as written by Ian Hickson as editor and the main man behind it as it came directly from the WHATWG team (the group of browser-people that work on the HTML5 spec).

He also published a -76 version (that was implemented by Firefox) before it was taken over by the IETF’s Hybi working group, who published that same document as its -00 draft.

There are representatives for a lot of server software and from browsers like Opera, Firefox, Safari and Chrome on the list (no Internet Explorer people seem to be around).

Some Problems

The 76/00 version broke compatibility with the previous version and it also isn’t properly HTTP compatible that is wanted and needed by many.

The way Ian and WHATWG have previously worked on this (and the html5 spec?) is mostly by Ian leading and being the benevolent dictator of what’s good, what’s right and what goes into the spec. The collision with IETF’s way of working that include “everyone can have a say” and “rough consensus” has been harsh and a reason for a lot of arguments and long threads on the hybi mailing list, and I will throw out a guess that it will continue to be the source of “flames” even in the future.

Ian maintains – for example – that a hard requirement on the protocol is that it needs to be “trivial to implement for amateurs”, while basically everyone else have come to the conclusion that this requirement is mostly silly and should be reworded to “be simple” or similar, that avoids the word “amateur” completely or the implying that amateurs wouldn’t be able to follow specs etc.

Right now

(This is early August 2010, things and facts in this protocol are likely to change, potentially a lot, over time so if you read this later on you need to take the date of this writing into account.)

There was a Hybi meeting at the recent IETF 78 meeting in Maastricht where a range of issues got discussed and some of the controversial subjects did actually get quite convincing consensus – and some of those didn’t at all match the previous requirements or the existing -00 spec.

Almost in parallel, Ian suggested a schedule for work ahead that most people expressed a liking in: provide a version of the protocol that can be done in 4 weeks for browsers to adjust to right now, and then work on an updated version to be shipped in 6 months with more of the unknowns detailed.

There’s a very lively debate going on about the need for chunking, the need for multiplexing multiple streams over a single TCP connection and there was a very very long debate going on regarding if the protocol should send data length-prefixed or if data should use a sentinel that marks the end of it. The length-prefix approach (favored by yours truly) seem to have finally won. Exactly how the framing is to be done, what parts that are going to be core protocol and what to leave for extensions and how extensions should work are not yet settled. I think everyone wants the protocol to be “as simple as possible” but everyone has their own mind setup on what amount of features that “the simplest” possible form of the protocol has.

The subject on exactly how the handshake is going to be done isn’t quite agreed to either by my reading, even if there’s a way defined in the -00 spec. The primary connection method will be using a HTTP Upgrade: header and thus connect to the server’s port 80 and then upgrade the protocol from HTTP over to Websockets. A procedure that is documented and supported by HTTP, but there’s been a discussion on how exactly that should be done so that cross-protocol requests can be avoided to the furthest possible extent.

We’ve seen up and beyond 50 mails per day on the hybi mailing list during some of the busiest days the last couple of weeks. I don’t see any signs of it cooling down short-term, so I think the 4 weeks goal seems a bit too ambitious at this point but I’m not ruling it out. Especially since the working procedure with Ian at the wheel is not “IETF-ish” but it’s not controlled entirely by Ian either.

Future

We are likely to see a future world with a lot of client-side applications connecting back to the server. The amount of TCP connections for a single browser is likely to increase, a lot. In fact, even a single web application within a single tab is likely to do multiple websocket connections when they re-use components and widgets from several others.

It is also likely that libcurl will get a websocket implementation in the future to do raw websocket transfers with, as it most likely will be needed in a future to properly emulate a browser/client.

I hope to keep tracking the development, occasionally express my own opinion on the matter (on the list and here), but mostly stay on top of what’s going on so that I can feed that knowledge to my friends and make sure that the curl project keeps up with the time.

chinese-socket

(The pictured socket might not be a websocket, but it is a pretty remarkably designed power socket that is commonly seen in China, and as you can see it accepts and works with many of the world’s different power plug designs.)

curl and speced cookie order

I just posted this on the curl-library list, but I feel it suits to be mentioned here separately.

As I’ve mentioned before, I’m involved in the IETF http-state working group which is working to document how cookies are used in the wild today. The idea is to create a spec that new implementations can follow and that existing implementations can use to become more interoperable.

(If you’re interested in these matters, I can only urge you to join the http-state mailing list and participate in the discussions.)

The subject of how to order cookies in outgoing HTTP Cookie: headers have been much debated over the recent months and I’ve also blogged about it. Now, the issue has been closed and the decision went quite opposite to my standpoint and now the spec will say that while the servers SHOULD not rely on the order (yeah right, some obviously already do and with this specified like this even more will soon do the same) it will recommend clients to sort the cookies in a given way that is close to the way current Firefox does it[*].

This has the unfortunate side-effect that to become fully compatible with how the browsers do cookies, we will need to sort our cookies a bit more than what we just recently introduced. That in itself really isn’t very hard since once we introduced qsort() it is easy to sort on more/other keys.

The biggest problem we get with this, is that the sorting uses creation time of the cookies. libcurl and curl and others mostly use the Netscape cookie files to store cookies and keep state between invokes, and that file format doesn’t include creation time info! It is a simple text-based file format with TAB-separated columns and the last (7th) column is the cookie’s content.

In order to support the correct sorting between sessions, we need to invent a way to pass on the creation time. My thinking is that we do this in a way that allows older libcurls still understand the file but just not see/understand the creation time, while newer versions will be able to get it. This would be possible by extending the expires field (the 6th) as it is a numerical value that the existing code will parse as a number and it will stop at the first non-digit character. We could easily add a character separation and store the
creation time after. Like:

Old expire time:

2345678

New expire+creation time:

2345678/1234567

This format might even work with other readers of this file format if they do similar assumptions on the data, but the truth is that while we picked the format in the first place to be able to exchange cookies with a well known and well used browser, no current browser uses that format anymore. I assume there are still a bunch of other tools that do, like wget and friends.

Update: as my friend Micah Cowan explains: we can in fact use the order of the cookie file as “creation time” hint and use that as basis for sorting. Then we don’t need to modify the file format. We just need to make sure to store them in time-order… Internally we will need to keep a line number or something per cookie so that we can use that for sorting.

[*] – I believe it sorts on path length, domain length and time of creation, but as soon as the -06 draft goes online it will be easy to read the exact phrasing. The existing -05 draft exists at: http://tools.ietf.org/html/draft-ietf-httpstate-cookie-05

An FTP hash command

Anthony Bryan strikes again. This time his name is attached to a new standards draft for how to get a hash checksum of a given file when using the FTP protocol. draft-bryan-ftp-hash-00 was published just a few days ago.

The idea is basically to introduce a spec for a new command named ‘HASH’ that a client can issue to a server to get a hash checksum for a given file in order to know that the file has the exact same contents you want before you even start downloading it or otherwise consider it for actions.

The spec details how you can ask for different hash algorithms, how the server announces its support for this in its FEAT response etc.

I’ve already provided some initial feedback on this draft, and I’ll try to assist Anthony a bit more to get this draft pushed onwards.

cookie order

I’ve previously mentioned the IETF httpstate working group several times, and here’s some insights into one topic we’re currently discussing:

HTTP Cookie: sort order

The current httpstate draft for the updated cookie spec says that cookies that are sent to a server with the Cookie: header in a HTTP request need to be sorted. They must be sorted primarily on path length and secondary on creation-time.

According to others on the http-state mailinglist, that’s exactly what the major browsers do already and they thus think we should document that as that’s how cookies work.

During these discussions I’ve learned that cookies with the same name that have been specified with different paths, will all be sent in the request and thus they need to be sent in a particular order and in fact even the original Netscape “spec” said so. I agree with this and I’ve also modified libcurl to act accordingly.

I hold the position that only cookies that have identical names need this treatment and that’s what we must specify. Implementers however will most likely find that sorting all cookies at once will be easier.

The secondary sort key, the creation-time, is much more questionable. Why would the server care about that? How can they even rely on that?

Previously in this discussion (back in August 2009) I checked other open source cookie implementations to see how they deal with cookie sorting. I learned that perl’s LWP does sort them based on path length but on nothing else. The following tools did no sorting at all: wget, curl, libsoup, pavuk, lftp and aria2. I have no doubts that we can find more if we would search for more implementations in more languages and environments.

Specifying that sorting is a MUST on path length will still keep LWP and the next curl version compliant. Specifying that they MUST sort on creation-date as well will then make all of these projects non-conforming.

One problem I have in the cookie discussion in general is that the (5?) major browsers pretty much all behave the same and while the 5 major browsers have almost 100% of the web browser market for users, we cannot then automatically assume that they have 100% of the HTTP client or cookie-using client market. We just don’t know how many applications, tools and frameworks exist out there that aren’t actually browsers but still are using cookies.

Of course, I want to say that creation-time sorting is pointless but I have nothing to back that up with. No numbers. The other side of the discussion has a bunch of browsers that sort like this already, but no numbers or evidence if servers rely on this or how many that do.

Can any reader of this find a site out there that depends on the cookie order being sorted on creation-date?

(Reminder: the charter for the HTTPSTATE working group is to document existing widely used practices. It is not to solve problems or to fix problems in the existing cookie protocol. We all know and acknowledge that cookies as they currently work are quite flawed and painful.)

2010 conferences

What are the good conferences 2010 that I really shouldn’t miss? I’m talking open source, tech and internet protocols. Where are you going? I’m currently planning like this:

Fosdem 6-7 Feb in Brussels Belgium: I’m going and I’m doing a Rockbox talk.

foss-sthlm 24 Feb in Stockholm Sweden: I’m arranging and I’m doing a curl talk. This isn’t really a “conference” but I wanted to mention it anyway!

IETF 77 in Anaheim USA, March 21-26: While it would’ve been a blast to go there, it really doesn’t sync very well with my work schedule and other lifely matters so I’ll pass this one! Sorry all friends whom I otherwise would’ve met there!

OWASP AppSec Research in Stockholm Sweden, June 21-24: since it is in Stockholm and these guys tend to have interesting stuff I just may go. It depends a little on how the program will end up and if I manage to cough up a talk for it.

IETF 78 in Maastricht Netherlands July 25-30: I want to go there and I think the timing is much better for this IETF meeting than the previous few ones. With a little luck we’ll get both the HTTPBIS and the HTTPSTATE groups to have meetings here, and who knows what other fun there will be?!

Slackathon 2010 in August in Stockholm Sweden? It’s not decided yet but I hope they will go for it and I will try to attend. Slackathon reminders.

FSCONS in Gothenburg Sweden Oct/Nov: Since this is our current major open source conference in Sweden I really want to go and I hope to be able to do a talk too. I don’t think the date is set yet, which is a bit unfortunate since November this year is a bit special to me so there will be some other events going on at that time that risk conflicting with FSCONS.

httpstate cookie domain pains

Back in 2008, I wrote about when Mozilla started to publish their effort the “public suffixes list” and I was a bit skeptic.

Well, the problem with domains in cookies of course didn’t suddenly go away and today when we’re working in the httpstate working group in the IETF with documenting how cookies work, we need to somehow describe how user agents tend to work with these and how they should work with them. It’s really painful.

The problem in short, is that a site that is called ‘www.example.com’ is allowed to set a cookie for ‘example.com’ but not for ‘com’ alone. That’s somewhat easy to understand. It gets more complicated at once when we consider the UK where ‘www.example.co.uk’ is fine but it cannot be allowed to set a cookie for ‘co.uk’.

So how does a user agent know that co.uk is magic?

Firefox (and Chrome I believe) uses the suffixes list mentioned above and IE is claimed to have its own (not published) version and Opera is using a trick where it tries to check if the domain name resolves to an IP address or not. (Although Yngve says Opera will soon do online lookups against the suffix list as well.) But if you want to avoid those tricks, Adam Barth’s description on the http-state mailing list really is creepy.

For me, being a libcurl hacker, I can’t of course but to think about Adam’s words in his last sentence: “If a user agent doesn’t care about security, then it can skip the public suffix check”.

Well. Ehm. We do care about security in the curl project but we still (currently) skip the public suffix check. I know, it is a bit of a contradiction but I guess I’m just too stuck in my opinion that the public suffixes list is a terrible solution but then I can’t figure anything that will work better and offer the same level of “safety”.

I’m thinking perhaps we should give in. Or what should we do? And how should we in the httpstate group document that existing and new user agents should behave to be optimally compliant and secure?

(in case you do take notice of details: yes the mailing list is named http-state while the working group is called httpstate without a dash!)

My Nordic Free Software Awards 2009 nominees

Hey, it’s really about time to nominate your favourite Free Software persons and projects from the nordic region for the 2009 awards before the time runs out.

This year, I decided to nominate the following “nordic” heroes:

Simon Josefsson

For his excellent work in GnuTLS, libssh2 and a bunch of other projects.

Henrik Nordström

For his work in the Squid project, and his efforts within IETF and its HTTP related struggles and more.

Björn Stenberg

As the primary founder of the Rockbox project. He started somehting special back in 2001 that now is a huge, thriving and succesful Free Software project.

As you might spot, I favor “doers”. I don’t believe in the concept of “nordic projects” when it comes to free or open software – the entire concept of open and free should mean that projects cross borders and regions.

In fact, it feels so out of the ordinary to think about open source people in a geographical context I find it hard to come up with a lot of names. It would be cool if ohloh had some ways to list people and projects based on where people live.

Then again, if a person from a nordic country moves somewhere else, is he or she still a nordic person? Does it depend on where the person lived during the actual act? Is Linus Torvalds a nordic person since he was born, lived many years and started his big project in Finland?

(yeah I already blogged about this subject but hey, it can’t hurt can it?)

My IETF 75 Story

A while ago, like a couple of years ago, I joined the mailing list for the HTTPbis effort. That’s the IETF work group which is working on producing an update to the HTTP 1.1 spec RFC2616 that clarifies it and removes things that nobody does or that doesn’t work. That spec is huge and the number of conflicting statements or just generally muddy expressions is large. This work is not near completion yet.

So when I learned the IETF75 meeting was to be held here in Stockholm, I of course cheered the opportunity to get to join in the talks and meet the people here.IETF

IETF is basically just an informal bunch of people who likes internet protocols, “above the wire and below the application”. The goal of the IETF is to make the Internet work better. This is the organization behind the RFCs. They wrote the specs for TCP, FTP, SMTP, POP3, HTTP and a wide range of other protocols we all use non-stop these days. I’m actually quite surprised IETF is so little known. When I mention the organization name, most people just give me a blank face back that reveals it just isn’t known to very many “civilians”.

IETF has meetings around the world 2 or 3 times per year, and every time some 1K-2K people join up and there’s a week filled with working group meetings, talks and a lot of socializing and so on. The attitude is generally relaxed all over and very welcoming for newcomers (like me). Everyone can join any talk or discussion. There’s simply no dress code, but everyone just wears whatever they think is comfortable.

The whole week was packed with scheduled talks and sessions, but I only spent roughly 1.5 days at the conference center as I had to get some “real work” done as well and quite honestly I didn’t find sessions that matched my interest every single day anyway.

After the scheduled days, and between sessions and sometimes instead of the scheduled stuff, a lot of social events took place. People meet in informal gatherings, talk, plans, sessions and dinners. The organizers of this particular even also arranged two separate off-topic social events. As one of the old-timers I talked to said something like “this year was unusually productive, but just about all of that was done outside of the schedule”…

This time. The 75th IETF meeting was hosted by .se in Stockholm Sweden in the end of July 2009. 1230 persons had signed up to come. The entrance fee was 650 USD unless you wanted to pay very late or at the door, as then it was a 130 extra or so. Oh, and we got a t-shirt!

One amusing little detail: on the t-shirt we got there was a “free invite code” to the Spotify streaming music service. But when people at the IETF meeting tried to use it at the conference center, Spotify refused to accept the users claiming it doesn’t work in the US! Clearly they’re not using the most updated ip-geography database in the world! 😉

So let me quickly mention a few of the topics I caught and found interesting, some of which I’ve blogged about separately.

HTTP over SCTP

The guys in the SCTP team works on this draft on how to do HTTP the best possible way over SCTP. They cite tests that claim “web browsing” becomes a better experience when done over SCTP. I’m personally quite interested in this work and in SCTP in general and I hope to be able to play with making libcurl support this in a not too distant future.

How to select TCP or SCTP

Assuming that SCTP gets widespread adoption and even browsers and browser-like apps start support it. How should the clients figure out which transport mechanism to use? The problem is similar to the selection between IPv4 or IPv6 but for the IP choice we can at least use A and AAAA lookups. The suggestions that were presented basically argued for trying all combinations in parallel and going with the fastest to respond, but a few bright minds questioned the smartness of that as it scales very badly if more transport options are added and it potentially introduces a lot more (start-up) traffic.

Getting HTTP from multiple mirrors

What the authors call Multiserver HTTP, is a pretty small suggestion of a few additional HTTP headers that a server would be able to return to a client hinting about other URIs where the exact same file/resource can be downloaded from. A client would then get that resource in parallel from multiple HTTP servers using range requests. Quite inspired by other technologies such as bittorrent and metalink. This first draft faced some criticism of not using existing HTTP as it could, but the general spirit has been welcoming.

The presenter of the idea, Mike Hanley, also mentioned the ability to allow the server response include a wildcard mention, so that a server for example could hint that “other images from this dir path can also be found under this dir path on this other server”. Thus a client would be able to download such images from multiple servers. This idea is not in that draft and I’m not personally sure I think it fits as nicely.

HTTP-state wg

During the IETF week, it was announced that the HTTP-state working group is being formed. It didn’t actually happen on the actual meeting but still… See my separate blog post on http-state.

Multipath TCP

The idea and concept behind MPTCP was new to me but I quickly come to like the thought of getting this into network stacks around me. I hope this will grow up to become something fine! See my separate blog post on multipath tcp for all details.

IRI

I visited the IRI BOF and got some fine insights on the troubles of creating the IRI spec. Without revealing too much, it’s quite clear sometimes that politics can be hard even in these surroundings…

tng – Transport Next Generation

What felt pretty “researchy” and still not really ready for adaption (or am I wrong?) is this effort they call Transport Next Generation. Their ideas include the concept of inserting a whole bunch of more layers into the typical network stuff, to for example move congestion handling into its own layer to be able to make it per network-segment basis instead of only doing end-to-end like today. Apparently they have tests and studies that suggest that the per network-segment basis can improve traffic a lot. These days a lot of the first part and last part of network accesses are done over wireless networks while the core center tends to still be physical cables.

DCCP

I found it interesting and amusing when they presented DCCP with all its bells and whistles, and then toward the end of the presentation it surfaces that they don’t really know what DCCP would be used for and at the moment the work group is pretty much done but there’s just nobody that’s using the protocol…

HTTPbis

HTTPbis is more or less my “home” in the IETF. We had a meeting in which things were discussed, some decisions were made and some new topics were raised. RFC2616 is a monster of a spec and it certainly contains so much details, so many potentially conflicting statements and quite clearly very many implementers have interpreted sections differently, that doing these clarifications is a next to endless work. I figure the work will simply be deemed “done” one day, and the remaining confusions will then just be left. The good part is then that the new document should at least be heaps better than the former. It will certainly benefit future and existing HTTP implementers nonetheless.

And a bunch of the HTTPbis guys got together a bit outside of the meeting as well, so we get to talk quite a bit and top off the evening with a dinner…

OpenDNSSec

There was quite a lot of DNSSEC talk during the week, and it annoys me that I double-booked the evening they had their opendnssec release (or was it tech preview?) party so I couldn’t go there and take advantage of my two free beers!

Observations

Apple laptops. A crushing majority of the people seemed to have Apple branded laptops, and nearly all presentations I saw were done with Apples.

Not too surprising, the male vs female ratio was very very high. I would guess 20:1 to 30:1, at least in those surrounding where I spent my time.

Upcoming Meetings

This week was lots of fun. More fun than I have had in a conference in a very long time. I’ll definately consider going to some upcoming meetings, although the next one in Japan in November doesn’t fit my schedule. Possibly Anaheim in March 2010 and even more likely the 78th meeting in Maastricht in July 2010.

Thanks everyone who was there. Thanks to the hosters for a great event. It was a blast!

Multipath TCP

During the IETF 75 meeting in Stockholm, there was this multipath tcp BOF (“start-up meeting” sort of) on Thursday morning that I visited.

Multipath TCP (shortened to MPTCP at times) is basically an idea to make everything look like TCP for both end points, but allow for additional TCP paths to get added and allow packets to get routed over any of the added flows to overcome congestion and to select the routes where it flows “best”. The socket API would remain unmodified in both ends. The individual TCP paths would all look and work like regular TCP streams for the rest of the network. It is basically a way to introduce these new fancy features without breaking compatibility. Of course a big point of that is to keep functionality over NATs or other middle-boxes. (See full description.)

The guys holding the BOF had already presented a fairly detailed draft how it could be designed both one-ended and with multiple adresses,  but could also boast with an already written implementation that was even demoed live in front of the audience.

The term ‘path’ is basically used for a pair of address+port sets. I would personally rather call it “flow” or “stream” or something, as we cannot really control that the paths are separate from each other as those are entirely in the hands of those who route the IP packets to the destination.

They stressed that their goals here included:

  • perform no worse than TCP would on the best of the single TCP paths
  • be no harder on the network than a single TCP flow would be, not even for single bottlenecks (network and bottleneck fairness)
  • allow resource pooling over multiple TCP paths

A perfect use-case for this is hosts with multiple interfaces. Like a mobile phone with 3G and wifi, as it could have a single TCP connection using paths over both interfaces, and it could even change paths along the way when you move to handover to new wifi access-points or when you plug in your Ethernet cable or whatever. Kind of like a solution to the mobile ip concept with roaming that was never made to actually work in the past.

The multipath tcp mailinglist is already quite active, and it didn’t take long until possible flaws in the backwards compatibility have been discovered and are being discussed. Like if you use TCP to verify that a particular link is alive, MPTCP may in fact break that as the proposal is currently written.

What struck me as an interesting side-effect of this concept, is that if implemented it will separate packets from the same original stream further from each other and possibly make snooping on plain-text TCP traffic harder. Like in the case where you monitor traffic going through a router or similar.