Category Archives: Network

Internet. Networking.

The cookie RFC 6265

http://www.rfc-editor.org/rfc/rfc6265.txt is out!

Back when I was an HTTP rookie in the late 90s, I once expected that there was a fine RFC document somewhere describing how to do HTTP cookies. I was wrong. A lot of others have missed that document too, both before and after my initial search.

I was wrong in the sense that sure, there were RFCs for cookies. There were even two of them (RFC2109 and RFC2965)! The sad thing, however, was that both of them were effectively pointless: in practice nobody (neither servers nor clients) implemented cookies like that, so they documented idealistic protocols that didn’t exist in the real world. This sad state has made people run into cookie problems all the way into modern days, when they implement services according to those RFCs and then blame their browser for failing.


It turned out that the only existing document that was actually being used was the original Netscape cookie document. It can’t even be called a specification, because it is so short and so lacking in details that it leaves large holes open and forces implementers to guess about the missing pieces. A sweet irony in itself is the fact that even Netscape removed the document from their site, so the only place to find it is at archive.org or copies like the one I link to above at the curl.haxx.se site. (For some further and more detailed reading about the history of cookies and a bunch of the flaws in the protocol/design, I recommend Michal Zalewski’s excellent blog post HTTP cookies, or how not to design protocols.)

While HTTP kept increasing in popularity as a protocol during the 00s (and still does), with more and more stuff getting done in browsers and everything and everyone using cookies, the protocol was still not documented anywhere the way it was actually used.

Somewhat modeled after the httpbis working group (which is working on updating and bugfixing the HTTP 1.1 spec), the IETF set up a mailing list named httpstate in early 2009 to start discussing what problems there are with cookies and all related matters. After lively discussions throughout the year, a working group with the same name as the mailing list was founded on December 11th 2009.

One of the initial sparks to get the httpstate group going came from Bill Corry who said this about the start:

In late 2008, Jim Manico and I connected to create a specification for HTTPOnly — we saw the security issues arising from how the browser vendors were implementing HTTPOnly in varying ways[1] due to a lack of a specification and formed an ad-hoc working group to tackle the issue[2].

When I approached the IETF about forming a charter for an official working group, I was told that I was “wasting my time” because cookies itself did not have a proper specification, so it didn’t make sense to work on a spec for HTTPOnly. Soon after, we pursued reopening the IETF httpstate Working Group to tackle the entire cookie spec, not just HTTPOnly. Eventually Adam Barth would become editor and Jeff Hodges our chair.

Since then, Adam Barth has worked fiercely as author of the specification, lots of people have joined in and contributed their views, comments and experiences, and over time we have really nailed down how cookies work in the wild today. The current spec now actually describes how to send and receive cookies the way existing browsers and clients do it. Of course, parts of this new spec say things I don’t think they should, like how it deals with the order of cookies in headers, but as with everything in life we needed to compromise, and I seemed to be rather lonely on my side of that “fence”.
I must stress that the work has only involved documenting how things work today, not inventing or creating anything new. We don’t fix any of the many known problems with cookies; we describe how to write your protocol implementation if you want it to interact well with existing infrastructure.
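To give a rough feel for what is being documented, here is a purely hypothetical exchange of the kind the spec pins down (with a SID value of the sort used in the spec’s own examples): the server sets a cookie in a response header and the client returns it in subsequent requests to the same site.

  (server response header)
  Set-Cookie: SID=31d4d96e407aad42; Path=/; HttpOnly

  (header in the client's subsequent requests)
  Cookie: SID=31d4d96e407aad42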

The new spec explicitly obsoletes the older RFC2965, but doesn’t obsolete RFC2109. That was done already by RFC2965. (I updated this paragraph after my initial post.)

Oh, and yours truly is mentioned in the ending “acknowledgements” section. It’s actually the second RFC I get to be mentioned in, the first being RFC5854.

Future

I’m convinced I’ll have reason to return to the cookie topic soon and describe what is being worked on for the future. Now that the existing cookie behavior has been documented, there’s a desire among people to design something that overcomes the problems with the existing protocol, Adam’s CAKE proposal being one of the attempts and ideas in the pipeline.

Another parallel IETF effort is the http-auth mailing list, in which lots of discussions around HTTP authentication are being held, and as those these days often involve cookies there’s a lot of talk about them there as well. See for example Timothy D. Morgan’s document Weaning the Web off of Session Cookies.

I’ll certainly track the development. And possibly even participate in shaping how this will go. We’ll see.


HTTP transfer compression

HTTP is a protocol that looks simple in its simplest form, and its readability can easily fool you into believing an implementation is straightforward and quickly done.

That’s not the reality though. HTTP is a very big protocol with lots of corners and twisting mazes that one can get lost in. Even after having been the primary author of curl for 13+ years, there are still lots of HTTP things I don’t master.

To name an example of an area with little-known quirks, there’s a funny situation when it comes to how HTTP does and doesn’t support compression of data, and compression of data in transfer.

No header compression

A little flaw in HTTP in regard to compression is that there’s no way to compress headers, in either direction. No matter what we do, we must send the header text as-is, and both requests and responses are sometimes very big these days, especially taking into account how cookies are always inserted in requests if they match. Anyway, this flaw is nothing we can do anything about in HTTP 1.1, so we need to live with it.

On the other hand, compression of the response body is supported.

Compressing data

Compression of data can be done in two ways: either the actual transfer is compressed or the body data is compressed. The difference is subtle, but when the body data is compressed there’s really nothing that mandates that the client has to uncompress it for the end user, and if the transfer is compressed the receiver must uncompress it in order to deal with the transfer properly.

For reasons that are unknown to me, HTTP clients and servers started out supporting compression only using the Content-Encoding style. It means that the client tells the server what kind of content encodings it supports (using Accept-Encoding:) and the server then sends the response data using one of the supported encodings. The client then decides on its own that, if it gets the content in one of the compressed formats it said it can handle, it will automatically uncompress it on arrival.

The HTTP protocol designers, however, intended this kind of automatic compression and subsequent decompression to be done using Transfer-Encoding, as the end result is then completely transparent and the decompression is implied and intended by the protocol design. This is done by the client telling the server what transfer encodings it supports with the TE: header, and the server adding a Transfer-Encoding: header in the response telling how the transfer is encoded.
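To illustrate the difference with a rough, hypothetical exchange against an imaginary host: in the content encoding style the negotiation looks something like

  GET /file HTTP/1.1
  Host: example.com
  Accept-Encoding: gzip

  HTTP/1.1 200 OK
  Content-Encoding: gzip

while the transfer encoding style instead uses

  GET /file HTTP/1.1
  Host: example.com
  Connection: TE
  TE: gzip

  HTTP/1.1 200 OK
  Transfer-Encoding: gzip, chunked

(TE is a hop-by-hop header, which is why it is also listed in Connection:, and chunked is applied last, on top of the compression.)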

HTTP 1.1 introduced a transfer encoding that all clients must support and that servers can use whenever they feel like it: chunked encoding. So all HTTP 1.1 clients already have to deal with Transfer-Encoding to some degree.

Surely curl is better than all those other guys, right?

Not really. Not yet anyway.

curl has a long history of copying its behavior from what the browsers do, in order to allow users to basically script anything imaginable that is HTTP-like with curl. In this vein, we implemented compression support the same way as all the browsers did it: the content encoding style. (I have reason to believe that at least Opera actually supports or used to support compressed Transfer-Encoding.)

Starting now (code pushed to the git repo just after the 7.21.5 release), we’ve taken steps to improve things. We’re changing gears and introducing support for asking for and using compressed Transfer-Encoding. This will start out as an optional feature/flag (--tr-encoding / CURLOPT_TRANSFER_ENCODING) so that we can see how servers in the wild behave and make sure we deal with them properly. Then possibly we can switch the default in the future to always ask for compressed transfers, at least for the command line tool.
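A minimal sketch of how asking for this could look from libcurl, assuming a libcurl built from git after 7.21.5 and using a made-up URL:

  #include <curl/curl.h>

  int main(void)
  {
    CURL *curl = curl_easy_init();
    if(curl) {
      curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/");
      /* ask for a compressed Transfer-Encoding; libcurl then decodes
         the transfer transparently before handing data to the app */
      curl_easy_setopt(curl, CURLOPT_TRANSFER_ENCODING, 1L);
      curl_easy_perform(curl);
      curl_easy_cleanup(curl);
    }
    return 0;
  }

The command line tool equivalent is simply curl --tr-encoding [URL].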

We know from the few tests we are aware of that there is at least one known little problem, or shall we call it a little detail to keep an eye on, with introducing compressed Transfer-Encoding. As was nicely reported several years ago in the Opera blog post Browser sniffing gone wrong (again): Cars.com, there are cases where this may cause the server to send data that gets compressed twice (using both Content and Transfer Encoding), and that needs to be taken care of properly by the client.
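In such a case the response carries both markers at once, roughly like this (hypothetical headers):

  HTTP/1.1 200 OK
  Content-Encoding: gzip
  Transfer-Encoding: gzip, chunked

and a proper client then has to undo the transfer encoding first and the content encoding after that.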

At the time of this writing I’ve not yet taken care of the double-compress case in the code, but I intend to get to it shortly.

I’m otherwise very interested in hearing what kind of experience people will have from this. What servers and sites will support this as documented and intended?

Future transports, the video

The talk I did at FSCONS 2010 titled “Future Transports” has now been made available online and you can see the whole thing. It is split up in three separate video snippets. Click on the picture below to get started:

(picture link: FSCONS 2010, Future Transports)

I originally embedded the videos here on my blog, but that turned out to be a really reliable way to kill Firefox, which got annoying. Now you’ll instead get handed over to the video on Vimeo’s site.

IRC use is declining

I discovered IRC around 1993.

Back then, before EFnet split in two, the IRC channel I frequented was #amiga and we were a small bunch of people from all over the globe who got to know each other pretty well. In the 90s I participated in one of my first open source projects and we created the IRC bot we named Dancer. Dancer was a really talented “defence bot” back in the days of the “wild west” of IRC, when channel takeovers, flood attacks and nick collisions were widespread and frequent. Dancer helped us keep things calm. Later on, I was part of the team that created and set up the new IRC network called amiganet.

I’ve been using IRC on and off since those days in the early 1990s and still today I hang out on 5-6 channels on freenode every day.

IRC was launched to the world already in 1988, almost 23 years ago. I’ve been trying to document the basic history of IRC, and when I updated that page the other day with some usage numbers for freenode, I decided to have a look around the net to see if there are any general numbers for IRC usage at large. I found out that usage is decreasing all over and has been doing so for years. Without research, I figure IRC users are either old farts like myself or at least very tech-oriented and geeky. Younger, newer and less techy people use other means of communication.

IRC never “took off” among the general public. I find that people in general prefer various IM systems (something I’ve never understood or adopted myself) and most “ordinary” humans I know don’t even know what IRC is. Possibly it’s due to the fact that the IRC protocol never got very good (there’s only that original spec from ’93), that there are a million completely separate IRC networks with no cross-network messages, or that all IRC networks still today suffer from netsplits and other artifacts due to deficiencies in how the IRC servers talk to each other.

5-6 years ago, the four most popular networks all regularly had over 100,000 users; Quakenet was well over 200,000. Last year, only Quakenet reached over 100,000. It seems basically all of them have roughly half the numbers they had in 2004.

Graphs from irc.netsplit.de:

2004

IRC usage 2004

2010

IRC usage 2010

Haxxup – cheap remote backup

The pain and guilty conscience that come from not having a proper backup setup in place are widely shared. I honestly don’t know anyone (and I mean it) who can say with a straight face that they have their (home, private) backups covered. We all know we should back up both locally and remotely, so that we can do fast recovery of the easy things we mistakenly remove or ruin, and so that if we get burgled or the house burns down we still have a backup remotely.

The importance of private computer backups has only increased over time, as these days most of us have vast amounts of family pictures and videos stored as well, things that in the old days were stored (and lost) separately.

A growing problem with remote backups is of course that we all have ridiculous amounts of data to back up. Getting a commercial remote backup deal for, say, 300GB (and growing) isn’t cheap. And we’re also very often at a loss when it comes to getting a solution that works on Linux.

At Haxx, we also recognized and suffered from these problems, so we came up with a scheme: a distributed networked backup among ourselves! Getting large hard drives to use locally is fairly cheap, and we all have fairly good fixed-fee, no-bandwidth-limit internet connections (although admittedly the uplink speeds are lacking for us typical ADSL users).

We decided that among the four of us, each gets an account on two of the others’ servers and can upload backups there at his own pace, storing whatever he wants. We went with two places for everyone to decrease the risk even further, for example if you urgently need to get something back while one of us has a network problem (not completely unheard of) or something else.

My current total backup is about 100GB and I have a 1mbit uplink. If I use the entire bandwidth for this, other things get a little sluggish, so I’ve capped the rsync job to 90KB/sec… My first run thus completed in roughly 13 days. Luckily I don’t add content at a very high pace, so the ordinary sync jobs from then on should be much smaller and able to complete within hours. As long as I add less than ~3.5GB during a 24 hour period, it should be able to keep up with syncing to two remote places.
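For the curious, the numbers work out roughly like this (counting decimal gigabytes and ignoring rsync and TCP overhead):

  100 GB / 90 KB/s            ≈ 1.1 million seconds ≈ 13 days  (the initial full upload)
  90 KB/s × 86400 s/day       ≈ 7.8 GB/day of total uplink budget
  7.8 GB/day / 2 destinations ≈ 3.9 GB/day of new data, a bit less in practice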

localhost hack on Windows

There's no place like 127.0.0.1

Readers of my blog and friends in general know that I’m not really a Windows guy. I never use it and I never develop things explicitly for Windows, but I do my best to make sure my portable code also builds and runs on Windows. This blog post is about a new detail that I’ve just learned and that I think I could help shed some light on, to help my fellow hackers. The other day I was contacted by a libcurl user on Windows who noticed that when he transferred data from the loopback device (where he ran a service of his own) and accessed it using “localhost” in the URL passed to libcurl, he would spot a DNS request for the address of that host name, while he would not see that when using regular Windows tools! After some mails back and forth, the details became clear:

Windows has its own version of /etc/hosts (conveniently put at “c:\WINDOWS\system32\drivers\etc\hosts” instead), and that default /etc/hosts alternative used to have an entry for “localhost” in it pointing to 127.0.0.1.

When Windows 7 was released, Microsoft had removed the localhost entry from that hosts file. Reading sources on the net, it might be related to them supporting IPv6 for real, but it’s not at all clear what the connection between those two actions would be.
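For reference, the tail end of a stock Windows 7 hosts file looks roughly like this, with both loopback entries commented out:

  # localhost name resolution is handled within DNS itself.
  #    127.0.0.1       localhost
  #    ::1             localhost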

getaddrinfo() on Windows has since then (it is unclear exactly at which point in time it started doing this) been made to know about the specific string “localhost”, and is documented to always return “all loopback addresses on the local computer” for it.
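A quick way to see what the platform resolver hands back is a little getaddrinfo() test such as this sketch (POSIX headers shown; on Windows you would include winsock2.h and ws2tcpip.h, call WSAStartup() first and link with ws2_32):

  #include <stdio.h>
  #include <string.h>
  #include <netdb.h>
  #include <netinet/in.h>
  #include <sys/socket.h>
  #include <arpa/inet.h>

  int main(void)
  {
    struct addrinfo hints, *res, *p;
    char buf[INET6_ADDRSTRLEN];

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* ask for both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    if(getaddrinfo("localhost", NULL, &hints, &res) != 0)
      return 1;

    /* print every loopback address the resolver handed back */
    for(p = res; p; p = p->ai_next) {
      void *addr = (p->ai_family == AF_INET)
        ? (void *)&((struct sockaddr_in *)p->ai_addr)->sin_addr
        : (void *)&((struct sockaddr_in6 *)p->ai_addr)->sin6_addr;
      printf("localhost -> %s\n",
             inet_ntop(p->ai_family, addr, buf, sizeof(buf)));
    }
    freeaddrinfo(res);
    return 0;
  }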

So a custom resolver such as c-ares, which doesn’t use Windows’ functions to resolve names but does it all by itself and has been made to look in the hosts file etc, now suddenly no longer finds “localhost” in a local file and instead ends up asking the DNS server for info about it… A situation that is far from ideal: most DNS servers won’t have an entry for it, and others might simply provide the wrong address.

I think we’ll have to give in and provide this hack in c-ares as well, just the way Windows itself does.
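The hack itself would be tiny. Here’s a hypothetical sketch (not actual c-ares code) of the kind of short-circuit a stand-alone resolver can add before it goes anywhere near the hosts file or the network:

  #include <stdio.h>
  #include <strings.h>   /* strcasecmp */

  /* hand back the loopback literals for "localhost" directly, without
     reading the hosts file or sending a single DNS query; every other
     name would fall through to the normal lookup path */
  static int lookup(const char *name)
  {
    if(strcasecmp(name, "localhost") == 0) {
      printf("%s -> 127.0.0.1\n", name);
      printf("%s -> ::1\n", name);
      return 0;
    }
    /* ...hosts file lookup, then a DNS query, as before... */
    return -1;
  }

  int main(void)
  {
    return lookup("localhost");
  }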

Oh, and as a bonus there’s even an additional hack mentioned in the getaddrinfo docs: On Windows Server 2003 and later if the pNodeName parameter points to a string equal to “..localmachine”, all registered addresses on the local computer are returned.

Fosdem 2011: my libcurl talk on video

Kai Engert was good enough to capture all the talks in the security devroom at Fosdem 2011, and while I’m seeding the full torrent I’ve made my own talk available as a direct download from here:

Fosdem 2011: security-room at 14:15 by Daniel Stenberg

The thing is about 107MB big, 640×480 resolution and is roughly 26 minutes playing time. WebM format.

Cookies and Websockets and HTTP headers

So yesterday we held a little HTTP-related event in Stockholm, arranged by OWASP Sweden. We talked a bit about cookies, websockets and recent HTTP headers.

Below are all the slides from the presentations that I, Martin Holst Swende and John Wilander did. (The entire event was done in Swedish.)

Martin Holst Swende’s talk:

John Wilander’s slides from his talk are here:

NASed and RAID1ed

I finally got my act together and bought myself a Synology DS211j NAS with two 2TB drives. I’ll use it as a shared network disk at home and I intend to back up to it; as I also have a home office, it feels better to be able to back up company-related data somewhat more safely too. My previous backup was only copying data from one HDD to another within the same physical machine.

To make it slightly more resistant to disk and hardware errors I’ve configured the disks to do RAID1 (all data is stored on both disks simultaneously).

The GPLv2 license was provided printed on paper in the package.

HTTP security, websockets and more


Together with friends in OWASP, I’m happy to mention that we will do an event on January 31st on the topic “HTTP security, websockets and more”, where I’ll talk. It starts at 17:30; the exact location is not decided yet and will depend a bit on popularity, but it will be in Stockholm, Sweden.

Apart from myself, the two other speakers at the event are John Wilander and Martin Holst-Swende. My part of the session will be about the WebSockets protocol, the upcoming cookie RFC and some bits about the ongoing HTTPbis work.

Sign up to attend; the opportunity is only open for one week.

Omegapoint will sponsor with something to eat and drink, and we do plan to go out and grab a beer afterwards and continue the discussion.

See you!