Category Archives: Network

Internet. Networking.

HTTP transfer compression

April 18, 2011 Daniel Stenberg

HTTP is a protocol that looks simple in its simplest form and its readability can easily fool you into believing an implementation is straight forward and quickly done.

That’s not the reality though. HTTP is a very big protocol with lots of corners and twisting mazes that one can get lost in. Even after having been the primary author of curl for 13+ years, there are still lots of HTTP things I don’t master.

To name an example of an area with little known quirks, there’s a funny situation when it comes to how HTTP supports and doesn’t support compression of data and compression of data in transfer.

No header compression

A little flaw in HTTP in regards to compression is that there’s no way to compress headers, in either direction. No matter what we do, we must send the text as-is and both requests and responses are sometimes very big these days. Especially taken into account how cookies are always inserted in requests if they match. Anyway, this flaw is nothing we can do anything about in HTTP 1.1 so we need to live with it.

On the other side, compression of the response body is supported.

Compressing data

Compression of data can be done in two ways: either the actual transfer is compressed or the body data is compressed. The difference is subtle, but when the body data is compressed there’s really nothing that mandates that the client has to uncompress it for the end user, and if the transfer is compressed the receiver must uncompress it in order to deal with the transfer properly.

For reasons that are unknown to me, HTTP clients and servers started out supporting compression only using the Content-Encoding style. It means that the client tells the server what kind of content encodings it supports (using Accept-Encoding:) and the server then sends the response data using one of the supported encodings. The client then decides on its own that if it gets the content in one of the compressed formats that it said it can handle, it will automatically uncompress that on arrival.

The HTTP protocol designers however intended this kind of automatic compression and subsequent uncompress to be done using Transfer-Encoding, as the end result is the completely transparent and the uncompress action is implied and intended by the protocol design. This is done by the client telling the server what transfer encodings it supports with the TE: header and the server adds a Transfer-Encoding: header in the response telling how the transfer is encoded.

HTTP 1.1 introduced a mandatory encoding that all servers can use whenever they feel like it: chunked encoding, so all HTTP 1.1 clients already have to deal with Transfer-Encoding to some degree.

Surely curl is better than all those other guys, right?

Not really. Not yet anyway.

curl has a long history of copying its behavior from what the browsers do, in order to allow users to basically script anything imaginable that is HTTP-like with curl. In this vein, we implemented compression support the same way as all the browsers did it: the content encoding style. (I have reason to believe that at least Opera actually supports or used to support compressed Transfer-Encoding.)

Starting now (code pushed to git repo just after the 7.21.5 release), we’ve taken steps to improve things. We’re changing gears and we’re introducing support for asking for and using compressed Transfer-Encoding. This will start out as an optional feature/flag (–tr-encoding / CURLOPT_TRANSFER_ENCODING) so that we can start out and see how servers in the wild behave and that we can deal with them properly. Then possibly we can switch the default in the future to always ask for compressed transfers. At least for the command line tool.

We know from the little tests we are aware of, that there are at least one known little problem or shall we call it a little detail to keep on eye at, with introducing compressed Transfer-Encoding. As has been so fine reported several years ago in the opera blog Browser sniffing gone wrong (again): Cars.com, there are cases where this may cause the server to send data that gets compressed twice (using both Content and Transfer Encoding) and that needs to be taken care of properly by the client.

At the time of this writing, I’ve not yet taken care of the double-compress case in the code, but I intend to get on to it within shortly.

I’m otherwise very interested in hearing what kind of experience people will have from this. What servers and sites will support this as documented and intended?

Network

Future transports, the video

April 4, 2011 Daniel Stenberg

The talk I did at FSCONS 2010 titled “Future Transports” has now been made available online and you can see the whole thing. It is split up in three separate video snippets. Click on the picture below to get started:

I originally put the videos embedded here on my blog, but it turned out to be a really certain way to kill Firefox so it turned out to be annoying. Now you’ll instead get handed over to the video on vimeo’s site.

Network

IRC use is declining

March 31, 2011 Daniel Stenberg

I discovered IRC around 1993.

Back then, before EFnet split in two, the IRC channel I frequented was #amiga and we were a small bunch of people from all over the globe who got to know each other pretty good. In the 90s I participated in one of my first open source projects and we created the IRC bot we named Dancer. Dancer was a really talented “defence bot” back in the days of the “wild west” of IRC when channel take overs, flood attacks and nick collisions were widespread and frequently occurring. Dancer helped us keep things calm. Later on, I was part of the team that created and setup the new IRC network called amiganet.

I’ve been using IRC on and off since those days in the early 1990s and still today I hang out on 5-6 channels on freenode every day.

IRC was launched to the world already 1988, almost 23 years ago. I’ve been trying to document the basic history of IRC and when I updated that page the other day with some usage numbers for freenode, I decided to have a look around the net to see if there are any general numbers for IRC usage at large, and I found out that usage is decreasing all over and has been doing so for years. Without research, I figure IRC users are either old farts like myself or at least very tech oriented and geeky. Younger, newer and less techy people use other means of communication.

IRC never “took off” among the general public. In general, I find that general people prefer various IM systems (something that I’ve never understood or adopted myself) and most “ordinary”humans I know don’t even know what IRC is. Possibly, the fact that the IRC protocol never got very good (there’s only that original spec from ’93), that there’s a million completely separated IRC networks with no cross-network messages or that all IRC networks still today suffer from netsplits and other artifacts dueÂ deficienciesÂ in how the IRC servers are talking to each other.

5-6 years ago the four most popular networks were all over 100,000 users regularly. Quakenet were well over 200,000. Last year, only Quakenet reached over 100,000. It seems basically all ofÂ them have roughly half the numbers they had 2004.

Graphs from irc.netsplit.de:

2004

2010

Haxx, Network

Haxxup – cheap remote backup

March 28, 2011 Daniel Stenberg

The pains and guilty consciences from having a lacking backup concept established are widely common. I honestly don’t know anyone (and I mean it) that can say that they have their (home, private) backup covered with a straight face. We all know we should backup locally and remotely, so that we can do fast recovery for the easy things we mistakenly remove or ruin, and if we getÂ burgledÂ or the house burns down we need to have a backup remotely.

The importance of private computer backups has only increased over time, as these days most of us have vast amounts of family pictures and videos stored as well, things that in the old days were stored (and lost) separately. boom

A growing problem with remote backups is of course that we all have ridiculous amounts of data to backup. Getting a commercial remote backup deal for say 300GB (and growing) isn’t cheap. And we’re also very often at loss when it comes to get a solution that works on Linux.

In Haxx, we also recognized and suffered from these problems. We came up with a scheme to fix a distributed networked backup among ourselves! Getting largeÂ hard-drivesÂ to use locally is fairly cheap. We all have fairly good fixed-fee no-bandwidth-limit internet connections (although admittedly the uplink speeds are lacking for us typical ADSL users).

We decided that among us 4, each of us gets an account at two of our friends’ servers and we’ll be able to upload our backups to those at our own pace to store whatever we want. We decided on getting two places for everyone toÂ decreaseÂ the risk even further, especially if you for example urgently need to get something back and one of us have a network problem (not completely unheard of) or something else.

My current total backup is about 100GB and I have a 1mbit uplink. If I use the entire bandwidth for this, other things get a little sluggish so I’ve capped the rsync job to 90KB/sec… My first run thus completed in roughly 13 days. Luckily I don’t add contents at a very high pace so the ordinary sync jobs from then on should be much smaller and should be able to complete within hours. As long as I add less than ~3.5GB during a 24 hour period, it should be able to keep up to sync to two remote places.

Network, Windows

localhost hack on Windows

February 21, 2011 Daniel Stenberg

Readers of my blog and friends in general know that I’m not really a Windows guy. I never use it and I never develop things explicitly for windows – but I do my best in making sure my portable code also builds and runs on windows. This blog post is about a new detail that I’ve just learned and that I think I could help shed the light on, to help my fellow hackers. The other day I was contacted by a user of libcurl because he was using it on Windows and he noticed that when wanting to transfer data from the loopback device (where he had a service of his own), and he accessed it using “localhost” in the URL passed to libcurl, he would spot a DNS request for the address of that host name while when he used regular windows tools he would not see that! After some mails back and forth, the details got clear:

Windows has a default /etc/hosts version (conveniently instead put at “c:\WINDOWS\system32\drivers\etc\hosts”) and that default Â /etc/hosts alternative used to have an entry for “localhost” in it that would point to 127.0.0.1.

When Windows 7 was released, Microsoft had removed the localhost entry from the /etc/hosts file. Reading sources on the net, it might be related to them supporting IPv6 for real but it’s not at all clear what the connection between those two actions would be.

getaddrinfo() in Windows has since then, and it is unclear exactly at which point in time it started to do this, been made to know about the specific string “localhost” and is documented to always return “all loopback addresses on the local computer”.

So, a custom resolver such as c-ares that doesn’t use Windows’ functions to resolve names but does it all by itself, that has been made to look in the /etc/host file etc now suddenly no longer finds “localhost” in a local file but ends up asking the DNS server for info about it… A case that is far from ideal. Most servers won’t have an entry for it and others might simply provide the wrong address.

I think we’ll have to give in and provide this hack in c-ares as well, just the way Windows itself does.

Oh, and as a bonus there’s even an additional hack mentioned in the getaddrinfo docs: On Windows Server 2003 and later if the pNodeName parameter points to a string equal to “..localmachine”, all registered addresses on the local computer are returned.

cURL and libcurl, libssh2, Network

Fosdem 2011: my libcurl talk on video

February 8, 2011 Daniel Stenberg 1 Comment

Kai Engert was good enough to capture all the talks in the security devroom at Fosdem 2011, and while I’m seeding the full torrent I’ve made my own talk available as a direct download from here:

Fosdem 2011: security-room at 14:15 by Daniel Stenberg

The thing is about 107MB big, 640×480 resolution and is roughly 26 minutes playing time. WebM format.

Network, Web

Cookies and Websockets and HTTP headers

February 1, 2011 Daniel Stenberg

So yesterday we held a little HTTP-related event in Stockholm, arranged by OWASP Sweden. We talked a bit about cookies, websockets and recent HTTP headers.

Below are all the slides from the presentations I, Martin Holst Swende andÂ John Wilanders did. (The entire event was done in Swedish.)

Cookies och Websockets

Martin Holst Swende’s talk:

WebSockets fÃ¶r applikationstestare

John Wilander’s slides from his talk are here:

Kommer nya HTTP-headers rÃ¤dda oss?

Electronics, Network

NASed and RAID1ed

January 19, 2011 Daniel Stenberg

I finally got my act together and bought myself a Synology DS211j NASÂ with two 2TB drives. I’ll use it as a shared network disk at home and I intend to backup to it – as I also have a home office it’ll feel better to be able to also backup company related data somewhat more safely. My previous backup was only copying data from one HDD to another within the same physical machine.

To make it slightly more resistant to disk and hardware errors I’ve configured the disks to do RAID1 (all data is stored on both disks simultaneously).

The GPLv2 license was provided printed on paper in the package.

Network, Security, Technology

HTTP security, websockets and more

January 17, 2011 Daniel Stenberg 4 Comments

Together with friends in OWASP I’m happy to mention that we will do an event on January 31st on the topic “HTTP security, websockets and more” where I’ll talk. Starting at 17:30, the exact location is not decided yet and it’ll depend a bit on popularity, but it will be in Stockholm, Sweden.

The two other speakers to appear at the event are, apart from myself, John Wilander and Martin Holst-Swende. My part of the session will be about the WebSockets protocol, about the upcoming cookie RFC and some bits about the ongoing HTTPbis work.

Omegapoint will sponsor with something to eat and drink, and we do plan to go out and grab a beer afterwards and continue the discussion.

See you!

Network, Web

WebSockets now: handshake and masking

January 4, 2011 Daniel Stenberg

In August 2010 I blogged about the WebSockets state at the time. In some aspects nothing has changed, and in some other aspects a lot has changed. There’s still no WebSockets specification that approaches consensus (remember theÂ 4 weeks plan from July?).

Handshaking this or that way

We’ve been reading an endless debate through the last couple of months on how the handshake should be made and how to avoid that stupidÂ intermediariesÂ might get tricked by HTTP-looking websocket traffic. In the midst of that storm, a team of people posted the paper Transparent Proxies: Threat or Menace? which argued that HTTP+Upgrade would be insecure and that CONNECT should be used (Abarth’s early draft of the CONNECT handshake).

CONNECT to the server is not kosher HTTP and is not being appreciated by several people – CONNECT is meant to get sent to proxies and proxies are explicitly setup to a client.

The idea to use a separate and dedicated port is of course brought up every now and then but is mostly not considered. Most people seem to want this protocol to go over the “web” ports 80 and 443 and thus to be able to share the proxy environment used for HTTP.

Currently it seems as if we’re back to a HTTP+Upgrade handshake.

Masking the traffic

A lot of people also questioned the very binary outcome of the Transparet Proxies report mentioned above, and later on it seems the consensus that by “masking” WebSocket traffic it should be possible to avoid the risk that stupid intermediaries misinterpret the traffic as HTTP. The masking is currently being discussed to be XOR with a frame-specific key, so that a typical stream will change key multiple times but is still easy for a WebSocket-aware tool (say Wireshark and similar) to “demask” on purpose.

The last few weeks have been spent on discussing how the masking is done, if it is to become optional and if the masking should include the framing or not.

This is an open process

I’m not sure I’ve stressed this properly before: IETF is an open organization. Anyone can join in and share their views and opinions, but of course you need to argue technical merits.

daniel.haxx.se