Category Archives: Technology

Really everything related to technology

The cookie RFC 6265

April 28, 2011 Daniel Stenberg 4 Comments

http://www.rfc-editor.org/rfc/rfc6265.txt is out!

Back when I was an HTTP rookie in the late 90s, I once expected that there was this fine RFC document somewhere describing how to do HTTP cookies. I was wrong. A lot of others have missed that document too, both before and after my initial search.

I was wrong in the sense that, sure, there were RFCs for cookies. There were even two of them (RFC2109 and RFC2965)! The sad thing, however, was that both of them were totally pointless: in effect nobody (neither servers nor clients) implemented cookies like that, so they documented idealistic protocols that didn’t exist in the real world. This sad state has made people fall into cookie problems all the way into modern days, when they’ve implemented services according to those RFCs and then blamed their browser for failing.

It turned out that the only existing document that was actually being used was the original Netscape cookie document. It can’t even be called a specification because it is so short and so lacking in details that it leaves large holes open and forces implementers to guess about the missing pieces. A sweet irony in itself is the fact that even Netscape removed the document from their site, so the only place to find it is at archive.org or in copies like the one I link to above at the curl.haxx.se site. (For some further and more detailed reading about the history of cookies and a bunch of the flaws in the protocol/design, I recommend Michal Zalewski’s excellent blog post HTTP cookies, or how not to design protocols.)

While HTTP kept increasing in popularity as a protocol during the 00s (and still does), with more and more stuff getting done in browsers and everything and everyone using cookies, the protocol was still not documented anywhere as it was actually used.

Somewhat modeled after the httpbis working group (which is working on updating and bugfixing the HTTP 1.1 spec), the IETF set up a mailing list named httpstate in early 2009 to start discussing what problems there are with cookies and all related matters. After lively discussions throughout the year, the working group with the same name as the mailing list was founded on December 11th 2009.

One of the initial sparks to get the httpstate group going came from Bill Corry who said this about the start:

In late 2008, Jim Manico and I connected to create a specification for HTTPOnly — we saw the security issues arising from how the browser vendors were implementing HTTPOnly in varying ways[1] due to a lack of a specification and formed an ad-hoc working group to tackle the issue[2].

When I approached the IETF about forming a charter for an official working group, I was told that I was <quote> “wasting my time” because cookies itself did not have a proper specification, so it didn’t make sense to work on a spec for HTTPOnly. Soon after, we pursued reopening the IETF httpstate Working Group to tackle the entire cookie spec, not just HTTPOnly. Eventually Adam Barth would become editor and Jeff Hodges our chair.

Since then Adam Barth has worked fiercely as author of the specification and lots of people have joined in and contributed their views, comments and experiences, and we have over time really nailed down how cookies work in the wild today. The current spec now actually describes how to send and receive cookies, the way it is done by existing browsers and clients. Of course, parts of this new spec say things I don’t think it should, like how it deals with the order of cookies in headers, but as with everything in life we needed to compromise and I seemed to be rather lonely on my side of that “fence”.
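
To make the ordering point concrete, here is a minimal Python sketch of the order that RFC 6265 (section 5.4, as I read it) suggests for the Cookie: header: longer paths first and, among equal-length paths, earlier creation times first. The Cookie class and the cookie_header() function are hypothetical names made up for this illustration and are not taken from any real client.

```python
from dataclasses import dataclass


@dataclass
class Cookie:
    # Hypothetical minimal cookie record, for illustration only.
    name: str
    value: str
    path: str
    creation_time: float


def cookie_header(cookies):
    """Build a Cookie: header value using the ordering sketched above."""
    # Longer paths first; ties are broken by earlier creation time.
    ordered = sorted(cookies, key=lambda c: (-len(c.path), c.creation_time))
    return "; ".join(f"{c.name}={c.value}" for c in ordered)


jar = [
    Cookie("a", "1", "/", 1.0),
    Cookie("b", "2", "/docs", 2.0),
    Cookie("c", "3", "/", 0.5),
]
print(cookie_header(jar))  # b=2; c=3; a=1
```

The spec words this ordering as a recommendation for clients and tells servers not to rely on it, so treat the sketch as one reading of the text rather than a rule every implementation follows.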

I must stress that the work has only involved documenting how things work today, not inventing or creating anything new. We don’t fix any of the many known problems with cookies, but we describe how to write your protocol implementation if you want it to interoperate well with existing infrastructure.

The new spec explicitly obsoletes the older RFC2965, but doesn’t obsolete RFC2109. That was done already by RFC2965. (I updated this paragraph after my initial post.)

Oh, and yours truly is mentioned in the ending “acknowledgements” section. It’s actually the second RFC I get to be mentioned in, the first being RFC5854.

Future

I am convinced that I will get reason to get back to the cookie topic soon and describe what is being worked on for the future. Once the existing cookies have been documented, there’s a desire among people to design something that overcomes the problems with the existing protocol. Adam’s CAKE proposal is one of the attempts and ideas in the pipe.

Another parallel IETF effort is the http-auth mailing list, in which lots of discussions around HTTP authentication are being held, and as these often involve cookies today there’s a lot of talk about them there as well. See for example Timothy D. Morgan’s document Weaning the Web off of Session Cookies.

I’ll certainly track the development. And possibly even participate in shaping how this will go. We’ll see.


cURL and libcurl, Network, Web

HTTP transfer compression

April 18, 2011 Daniel Stenberg

HTTP is a protocol that looks simple in its simplest form, and its readability can easily fool you into believing an implementation is straightforward and quickly done.

That’s not the reality though. HTTP is a very big protocol with lots of corners and twisting mazes that one can get lost in. Even after having been the primary author of curl for 13+ years, there are still lots of HTTP things I don’t master.

To name an example of an area with little-known quirks, there’s a funny situation when it comes to how HTTP supports, and doesn’t support, compression of data and compression of data in transfer.

No header compression

A little flaw in HTTP in regards to compression is that there’s no way to compress headers, in either direction. No matter what we do, we must send the text as-is, and both requests and responses are sometimes very big these days, especially taking into account how cookies are always inserted in requests if they match. Anyway, this flaw is nothing we can do anything about in HTTP 1.1 so we need to live with it.

On the other hand, compression of the response body is supported.

Compressing data

Compression of data can be done in two ways: either the actual transfer is compressed or the body data is compressed. The difference is subtle, but when the body data is compressed there’s really nothing that mandates that the client has to uncompress it for the end user, and if the transfer is compressed the receiver must uncompress it in order to deal with the transfer properly.

For reasons that are unknown to me, HTTP clients and servers started out supporting compression only using the Content-Encoding style. It means that the client tells the server what kind of content encodings it supports (using Accept-Encoding:) and the server then sends the response data using one of the supported encodings. The client then decides on its own that if it gets the content in one of the compressed formats that it said it can handle, it will automatically uncompress that on arrival.

The HTTP protocol designers however intended this kind of automatic compression and subsequent uncompress to be done using Transfer-Encoding, as the end result is then completely transparent and the uncompress action is implied and intended by the protocol design. This is done by the client telling the server what transfer encodings it supports with the TE: header and the server adding a Transfer-Encoding: header in the response telling how the transfer is encoded.

HTTP 1.1 introduced a mandatory encoding that all servers can use whenever they feel like it: chunked encoding, so all HTTP 1.1 clients already have to deal with Transfer-Encoding to some degree.
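
To give an idea of what dealing with chunked encoding means for a receiver, here is a minimal Python sketch of a decoder for an already fully received body. The function name decode_chunked() is made up for this illustration; the sketch assumes well-formed input, drops chunk extensions and ignores trailers, so it is nowhere near what a real client needs.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode a complete HTTP/1.1 chunked body (trailers are ignored)."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a hexadecimal size line ending in CRLF.
        line_end = raw.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension; drop it.
        size_field = raw[pos:line_end].split(b";", 1)[0]
        size = int(size_field.strip().decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # the zero-sized chunk marks the end of the body
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


wire = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(wire))  # b'Wikipedia'
```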

Surely curl is better than all those other guys, right?

Not really. Not yet anyway.

curl has a long history of copying its behavior from what the browsers do, in order to allow users to basically script anything imaginable that is HTTP-like with curl. In this vein, we implemented compression support the same way as all the browsers did it: the content encoding style. (I have reason to believe that at least Opera actually supports or used to support compressed Transfer-Encoding.)

Starting now (code pushed to git repo just after the 7.21.5 release), we’ve taken steps to improve things. We’re changing gears and we’re introducing support for asking for and using compressed Transfer-Encoding. This will start out as an optional feature/flag (--tr-encoding / CURLOPT_TRANSFER_ENCODING) so that we can see how servers in the wild behave and make sure that we can deal with them properly. Then possibly we can switch the default in the future to always ask for compressed transfers. At least for the command line tool.

We know from the few tests we are aware of that there is at least one known little problem, or shall we call it a little detail to keep an eye on, with introducing compressed Transfer-Encoding. As was so nicely reported several years ago in the Opera blog post Browser sniffing gone wrong (again): Cars.com, there are cases where this may cause the server to send data that gets compressed twice (using both Content and Transfer Encoding), and that needs to be taken care of properly by the client.

At the time of this writing, I’ve not yet taken care of the double-compress case in the code, but I intend to get on to it shortly.
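
As an illustration of the double-compress case (and explicitly not how curl does or will do it), here is a minimal Python sketch of a receiver that undoes the two layers in the right order. The function decode_body() is a hypothetical name; the sketch assumes gzip is the only compression in play and that any chunked framing has already been removed.

```python
import gzip


def decode_body(data: bytes, transfer_encoding: str, content_encoding: str) -> bytes:
    """Undo Transfer-Encoding first, then Content-Encoding (gzip only)."""
    # The sender applies Transfer-Encoding last, so the receiver removes it first.
    transfer_codings = [t.strip().lower() for t in transfer_encoding.split(",")]
    if "gzip" in transfer_codings:
        data = gzip.decompress(data)
    # What remains is the body as described by Content-Encoding.
    if content_encoding.strip().lower() == "gzip":
        data = gzip.decompress(data)
    return data


original = b"hello world " * 50
twice = gzip.compress(gzip.compress(original))  # compressed by both layers
print(decode_body(twice, "gzip", "gzip") == original)  # True
```

The point is only the order: the transfer layer is peeled off by the transport, and the content layer is what the application asked to have decoded.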

I’m otherwise very interested in hearing what kind of experience people will have from this. What servers and sites will support this as documented and intended?

Network

Future transports, the video

April 4, 2011 Daniel Stenberg

The talk I did at FSCONS 2010 titled “Future Transports” has now been made available online and you can see the whole thing. It is split up in three separate video snippets. Click on the picture below to get started:

I originally put the videos embedded here on my blog, but it turned out to be a really certain way to kill Firefox so it turned out to be annoying. Now you’ll instead get handed over to the video on vimeo’s site.

Network

IRC use is declining

March 31, 2011 Daniel Stenberg

I discovered IRC around 1993.

Back then, before EFnet split in two, the IRC channel I frequented was #amiga and we were a small bunch of people from all over the globe who got to know each other pretty well. In the 90s I participated in one of my first open source projects and we created the IRC bot we named Dancer. Dancer was a really talented “defence bot” back in the days of the “wild west” of IRC when channel takeovers, flood attacks and nick collisions were widespread and frequently occurring. Dancer helped us keep things calm. Later on, I was part of the team that created and set up the new IRC network called amiganet.

I’ve been using IRC on and off since those days in the early 1990s and still today I hang out on 5-6 channels on freenode every day.

IRC was launched to the world already in 1988, almost 23 years ago. I’ve been trying to document the basic history of IRC and when I updated that page the other day with some usage numbers for freenode, I decided to have a look around the net to see if there are any general numbers for IRC usage at large, and I found out that usage is decreasing all over and has been doing so for years. Without having researched it, I figure IRC users are either old farts like myself or at least very tech oriented and geeky. Younger, newer and less techy people use other means of communication.

IRC never “took off” among the general public. In general, I find that most people prefer various IM systems (something that I’ve never understood or adopted myself) and most “ordinary” humans I know don’t even know what IRC is. Possibly this is due to the fact that the IRC protocol never got very good (there’s only that original spec from ’93), that there are a million completely separate IRC networks with no cross-network messages, or that all IRC networks still today suffer from netsplits and other artifacts due to deficiencies in how the IRC servers talk to each other.

5-6 years ago the four most popular networks were all over 100,000 users regularly. Quakenet was well over 200,000. Last year, only Quakenet reached over 100,000. It seems basically all of them have roughly half the numbers they had in 2004.

Graphs from irc.netsplit.de:

2004

2010

Mail

Email asking for my products

March 30, 2011 Daniel Stenberg

In my mini-series of strange mails I receive, here’s another one:

Subject: Product Request

Hello,
I am interested in purchasing some of your products, I will like to know
if youcan ship directly to SPAIN , I also want you to know my mode of
payment for this order is via Credit Card. Get back to me if you can ship
to that destination and also if you accept the payment type I indicated.
Kindly return this email with your price list of your products..

I assume I’ll never figure out what products he speaks of, or how on earth he ended up sending me this… I’ll admit I was tempted to make up some “interesting” products to offer.

Update: I was informed that this is probably “just” another online fraud attempt. How boring.

Haxx, Network

Haxxup – cheap remote backup

March 28, 2011 Daniel Stenberg

The pains and guilty consciences from not having a proper backup concept established are very common. I honestly don’t know anyone (and I mean it) who can say with a straight face that they have their (home, private) backup covered. We all know we should back up locally and remotely, so that we can do fast recovery for the easy things we mistakenly remove or ruin, and if we get burgled or the house burns down we need to have a backup remotely.

The importance of private computer backups has only increased over time, as these days most of us have vast amounts of family pictures and videos stored as well, things that in the old days were stored (and lost) separately.

A growing problem with remote backups is of course that we all have ridiculous amounts of data to back up. Getting a commercial remote backup deal for say 300GB (and growing) isn’t cheap. And we’re also very often at a loss when it comes to finding a solution that works on Linux.

In Haxx, we also recognized and suffered from these problems. We came up with a scheme to fix a distributed networked backup among ourselves! Getting large hard drives to use locally is fairly cheap. We all have fairly good fixed-fee no-bandwidth-limit internet connections (although admittedly the uplink speeds are lacking for us typical ADSL users).

We decided that among the 4 of us, each gets an account at two of our friends’ servers and we’ll be able to upload our backups to those at our own pace to store whatever we want. We decided on getting two places for everyone to decrease the risk even further, especially if you for example urgently need to get something back and one of us has a network problem (not completely unheard of) or something else.

My current total backup is about 100GB and I have a 1 Mbit uplink. If I use the entire bandwidth for this, other things get a little sluggish so I’ve capped the rsync job to 90KB/sec… My first run thus completed in roughly 13 days. Luckily I don’t add contents at a very high pace so the ordinary sync jobs from then on should be much smaller and should be able to complete within hours. As long as I add less than ~3.5GB during a 24 hour period, it should be able to keep up with syncing to two remote places.
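
The numbers above can be checked with a bit of arithmetic. The Python sketch below assumes decimal units (1KB = 1000 bytes, 1GB = 1000^3 bytes), a first run to a single destination, and that the 90KB/sec cap is shared between the two destinations afterwards; the function names are made up for this illustration.

```python
KB = 1000
GB = 1000 ** 3
SECONDS_PER_DAY = 24 * 60 * 60


def days_for_first_run(total_gb: float, rate_kb_s: float) -> float:
    """Days needed to push total_gb through a link capped at rate_kb_s."""
    return total_gb * GB / (rate_kb_s * KB) / SECONDS_PER_DAY


def daily_budget_gb(rate_kb_s: float, destinations: int) -> float:
    """GB of new data per day that can still be synced to every destination."""
    return rate_kb_s * KB * SECONDS_PER_DAY / GB / destinations


print(f"{days_for_first_run(100, 90):.1f} days")   # 12.9 days
print(f"{daily_budget_gb(90, 2):.1f} GB per day")  # 3.9 GB per day
```

Under these assumptions the theoretical ceiling is just under 4GB of new data per day, so the ~3.5GB figure leaves some margin for protocol overhead and for the link not being saturated all day.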

Network, Windows

localhost hack on Windows

February 21, 2011 Daniel Stenberg

Readers of my blog and friends in general know that I’m not really a Windows guy. I never use it and I never develop things explicitly for Windows, but I do my best to make sure my portable code also builds and runs on Windows. This blog post is about a new detail that I’ve just learned and that I think I can help shed some light on, to help my fellow hackers. The other day I was contacted by a libcurl user on Windows. He wanted to transfer data from the loopback device (where he had a service of his own) and used “localhost” in the URL passed to libcurl. He then spotted a DNS request for the address of that host name, while he would not see one when he used regular Windows tools! After some mails back and forth, the details got clear:

Windows has a default /etc/hosts version (conveniently instead put at “c:\WINDOWS\system32\drivers\etc\hosts”) and that default /etc/hosts alternative used to have an entry for “localhost” in it that would point to 127.0.0.1.

When Windows 7 was released, Microsoft had removed the localhost entry from the /etc/hosts file. Reading sources on the net, it might be related to them supporting IPv6 for real but it’s not at all clear what the connection between those two actions would be.

getaddrinfo() in Windows has since then, and it is unclear exactly at which point in time it started to do this, been made to know about the specific string “localhost” and is documented to always return “all loopback addresses on the local computer”.

So, a custom resolver such as c-ares, which doesn’t use Windows’ functions to resolve names but does it all by itself and has been made to look in the /etc/hosts file etc, now suddenly no longer finds “localhost” in a local file but ends up asking the DNS server for info about it… A case that is far from ideal. Most servers won’t have an entry for it and others might simply provide the wrong address.

I think we’ll have to give in and provide this hack in c-ares as well, just the way Windows itself does.
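
In rough terms, the hack means checking for the literal name before consulting anything else. Here is a minimal Python sketch of that lookup order. It is not c-ares code: resolve(), the hosts dictionary and the dns_lookup callback are all hypothetical, and the loopback addresses returned are an assumption about what “all loopback addresses” would typically be.

```python
LOOPBACK_ADDRESSES = ["127.0.0.1", "::1"]  # assumed typical loopback addresses


def resolve(name, hosts, dns_lookup):
    """Resolve a host name: special-case localhost, then hosts file, then DNS."""
    key = name.lower()
    # The hack: never let "localhost" leave the machine as a DNS query.
    if key == "localhost":
        return list(LOOPBACK_ADDRESSES)
    # Entries parsed from the hosts file (modelled here as a plain dict).
    if key in hosts:
        return list(hosts[key])
    # Only now fall back to asking a DNS server.
    return dns_lookup(name)


def fake_dns(name):
    # Stand-in for a real DNS query, for demonstration only.
    print("DNS query for", name)
    return ["192.0.2.1"]


print(resolve("localhost", {}, fake_dns))    # ['127.0.0.1', '::1']
print(resolve("example.com", {}, fake_dns))  # prints the query, then ['192.0.2.1']
```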

Oh, and as a bonus there’s even an additional hack mentioned in the getaddrinfo docs: On Windows Server 2003 and later if the pNodeName parameter points to a string equal to “..localmachine”, all registered addresses on the local computer are returned.

cURL and libcurl, libssh2, Network

Fosdem 2011: my libcurl talk on video

February 8, 2011 Daniel Stenberg 1 Comment

Kai Engert was good enough to capture all the talks in the security devroom at Fosdem 2011, and while I’m seeding the full torrent I’ve made my own talk available as a direct download from here:

Fosdem 2011: security-room at 14:15 by Daniel Stenberg

The thing is about 107MB big, 640×480 resolution and is roughly 26 minutes playing time. WebM format.

Technology

News flash! Tech terms used almost correctly!

February 2, 2011 Daniel Stenberg 6 Comments

Ok, The Social Network isn’t a new movie by any means at this time, but I happened to see it the other day. I’ll leave aside the entire story and whatever facts it did or didn’t portray correctly.

But I did spot several basic technical terms in the beginning that struck me as amazingly correctly used! The movie character Mark actually used wget to download images (at about 10:05 into the movie), and as you can see on my first screenshot the initial keystrokes we get to see on the command line also actually resemble a correct wget command line. You can click on these images to get a slightly larger version of the pics. I’m sorry I couldn’t get any higher quality ones, but I figure the point is still the same!

After having invoked wget, as is explained he gets many pictures downloaded and what do you know, the screen output actually looks like it could’ve been a wget that has downloaded a couple of files:

He also mentioned the terms ‘Apache’, ’emacs’ and ‘perl scripts’ in complete and correct sentences.

Where is the world heading?!

Update: Hrvoje Niksic, the founder of wget, helped out with some additional observations:

The options looked right to me, something like -r -A.jpg …

I was wondering about the historical accuracy of the progress bar, but it checks out. The movie takes place about a year and a half after the release of Wget 1.8, which added the feature. The department that takes care of these things did a good job. 🙂

Network, Web

Cookies and Websockets and HTTP headers

February 1, 2011 Daniel Stenberg

So yesterday we held a little HTTP-related event in Stockholm, arranged by OWASP Sweden. We talked a bit about cookies, websockets and recent HTTP headers.

Below are all the slides from the presentations that I, Martin Holst Swende and John Wilander did. (The entire event was done in Swedish.)

Cookies och Websockets

Martin Holst Swende’s talk:

WebSockets för applikationstestare

John Wilander’s slides from his talk are here:

Kommer nya HTTP-headers rädda oss?

daniel.haxx.se