Tag Archives: HTTP

curl 7.40.0: unix domain sockets and smb

curl and libcurl 7.40.0 was just released this morning. Here's a closer look at some of the perhaps more noteworthy changes. As usual, you can find the entire changelog on the curl web site.

HTTP over unix domain sockets

So just before the feature window closed for the pending 7.40.0 release of curl, Peter Wu's patch series was merged, bringing the ability for curl and libcurl to do HTTP over unix domain sockets. This is a feature that has been mentioned many times through the history of curl but never previously truly implemented. Peter also very nicely adjusted the test server and made two test cases that verify the functionality.

To use this with the curl command line tool, you pass the socket path to the new --unix-socket option and, assuming your local HTTP server listens on that socket, you'll get the response back just as with an ordinary TCP connection.

Doing the operation from libcurl means using the new CURLOPT_UNIX_SOCKET_PATH option.
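For illustration, here's a minimal libcurl sketch of such a request (the socket path and URL are made up and error handling is trimmed down); the command line equivalent would be roughly "curl --unix-socket /var/run/example-http.sock http://localhost/":

    #include <stdio.h>
    #include <curl/curl.h>

    int main(void)
    {
      CURL *curl = curl_easy_init();
      if(curl) {
        CURLcode res;

        /* talk to a local HTTP server listening on a unix domain socket
           instead of a TCP port (the socket path here is made up) */
        curl_easy_setopt(curl, CURLOPT_UNIX_SOCKET_PATH,
                         "/var/run/example-http.sock");

        /* the host name in the URL is still used for the Host: header */
        curl_easy_setopt(curl, CURLOPT_URL, "http://localhost/");

        res = curl_easy_perform(curl);
        if(res != CURLE_OK)
          fprintf(stderr, "request failed: %s\n", curl_easy_strerror(res));

        curl_easy_cleanup(curl);
      }
      return 0;
    }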

This feature is actually not limited to HTTP: you can do all the TCP-based protocols except FTP over the unix domain socket, but to my knowledge only HTTP is regularly used this way. The reason FTP isn't supported is of course its use of two connections, which would be even weirder to do like this.

SMB

SMB, also known as CIFS, is an old network protocol from the Microsoft world used to access files. curl and libcurl now support this protocol with SMB:// URLs thanks to work by Bill Nagel and Steve Holme.

Security Advisories

Last year we had a large number of security advisories published (eight to be precise), and this year we start out with two fresh ones already on the 8th day… The ones this time were of course discovered and researched already last year.

CVE-2014-8151 is about how we accidentally allowed an application to bypass the TLS server certificate check if a TLS Session-ID was already cached for a non-checked session – when using the Mac OS SecureTransport SSL backend.

CVE-2014-8150 is a URL request injection. When letting curl or libcurl speak over a HTTP proxy, it would copy the URL verbatim into the HTTP request going to the proxy, which means that if you craft the URL and insert CRLFs (carriage returns and linefeed characters) you can insert your own second request or even custom headers into the request that goes to the proxy.

You may enjoy taking a look at the curl vulnerabilities table.

Bugs bugs bugs

The release notes mention no less than 120 specific bug fixes, which in comparison to other releases is more than average.

Enjoy!

Stricter HTTP 1.1 framing good bye

I worked on a patch for Firefox bug 237623 to make sure Firefox would use a stricter check for “HTTP 1.1 framing”, checking that Content-Length is correct and that there are no broken chunked encoding pieces. I was happy to close a more than 10 year old bug when the fix landed in June 2014.

The fix landed and did not cause any grief all the way from June through to the actual live release (Nightlies, Aurora, Beta etc). The change finally shipped in Firefox 33, I had more or less already started to forget about it, and then things went south really fast.

The number of broken servers ended up too massive for us and we had to backpedal. The bulk of the problems can be split into these two categories:

  1. Servers that deliver gzipped content but send a Content-Length: for the uncompressed data. This seems to be commonly done with old mod_deflate and mod_fastcgi versions on Apache, but we also saw people using IIS reporting this symptom.
  2. Servers that use chunked encoding but skip the final zero-size chunk, so that the stream never properly ends. (Both cases are illustrated right after this list.)
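To make that concrete, here's a hand-written illustration (not captured from any real server) of roughly what the two categories look like on the wire:

    HTTP/1.1 200 OK
    Content-Encoding: gzip
    Content-Length: 52912      <- the size of the *uncompressed* body,
                                  while far fewer gzipped bytes follow

    HTTP/1.1 200 OK
    Transfer-Encoding: chunked

    400
    [1024 bytes of chunk data]
                               <- the stream just stops here; the
                                  terminating "0" chunk never arrives

In both cases a response that is in fact complete looks exactly like one that was cut off half-way, which is precisely what a strict framing check has to flag.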

We recognize that not everyone can get these servers fixed – even if all these servers should still be fixed! We now detect these HTTP 1.1 framing problems, but they only cause an error if a certain pref variable is set (network.http.enforce-framing.http1), and since that is disabled by default they will be silently ignored much like before. The Internet is a more broken and more sad place than I want to accept at times.

We haven’t fully worked out how to also make the download manager (ie the thing that downloads things directly to disk, without showing it in the browser) happy, which was the original reason for bug 237623…

Although the code may now no longer complain about HTTP 1.1 framing problems, it will at least mark the connection as not eligible for re-use, which will be a big boost compared to before since these broken framing cases really hurt persistent connection use. The partial transfer return codes for broken SPDY and HTTP/2 transfers remain though, and I hope to be able to remain stricter with these newer protocols.

This partial reversion will land ASAP and get merged into patch releases of Firefox 33 and later.

Finally, to top this off, here's a picture of an old HTTP 1.1 frame so that you know what we're talking about.

An old http1.1 frame

Firefox and partial content

Update: parts of the change mentioned in this blog post have subsequently been reverted since clearly I had too positive a view of the Internet.

One of the first bugs that fell into my lap when I started working for Mozilla not very long ago was bug 237623. Anyone involved in Mozilla knows a bug in that range is fairly old (we just recently passed one million filed bugs). This particular bug was filed in March 2004 and there are (right now) 26 other bugs marked as duplicates of it. Today, the fix for this problem landed.

The core of the problem is that when a HTTP server sends contents back to a client, it can send a header along indicating the size of the data in the response. The header is called “Content-Length:”. If the connection gets broken during transfer for whatever reason and the browser hasn’t received as much data as was initially claimed to be delivered, that’s a very good hint that something is wrong and the transfer was incomplete.

The perhaps most annoying way this could be seen is when you download a huge DVD image or something and for some reason the connection gets cut off after only a short time, way before the entire file is downloaded, but Firefox just silently accepts that as the end of the transfer and thinks everything was fine and dandy.

What complicates the issue is the eternal problem: not everything abides by the protocol. That said, if there are frequent violators of the protocol we can't strictly fail on every problem we detect; we must instead do our best to handle it anyway.

Is Content-Length a frequently violated HTTP response header?

Let’s see…

  1. Back in the HTTP 1.0 days, the Content-Length header was not very important as the connection was mostly shut down after each response anyway. Thus, clients/browsers swiftly learned to just wait for the disconnect.
  2. Back in the old days, there were cases of problems with “large files” (files larger than 2 or 4GB) which every now and then caused the Content-Length: header to turn into negative or otherwise confused values when it wrapped. That’s not really happening these days anymore.
  3. With HTTP 1.1 and its pervasive use of persistent connections it is important to get the size right, as otherwise the chain of requests gets messed up and we end up with tears and sad faces.
  4. In curl‘s HTTP parser we've always strictly abided by this header and bailed out hard on mismatches. This is a very rare error for users to get, and based on this (admittedly unscientific) data I believe that servers sending bad Content-Length headers are not widespread.
  5. It seems Chrome, at least in some aspects, is already much stricter about this header.

My fix for this problem takes a slightly careful approach and only enforces the strictness for HTTP 1.1 or later servers. But then as a bonus, it has grown to also signal failure if a chunked encoded transfer ends without the ending trailer or if a SPDY or http2 transfer gets prematurely stopped.
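As a very rough sketch of the logic (with made-up names and a much simplified model – the real Gecko code looks nothing like this), the end-of-response check amounts to something like:

    #include <stdbool.h>
    #include <stdint.h>

    /* simplified, made-up model of the state we have when a response ends */
    struct response_state {
      bool http_1_1_or_later;   /* talking to an HTTP/1.1 or later server? */
      bool chunked;             /* Transfer-Encoding: chunked in use */
      bool saw_final_chunk;     /* the terminating zero-size chunk arrived */
      int64_t content_length;   /* -1 if no Content-Length header was given */
      int64_t body_bytes;       /* body bytes actually received */
    };

    /* true if the response should be treated as a partial, failed transfer */
    bool framing_violation(const struct response_state *r)
    {
      if(!r->http_1_1_or_later)
        return false;                   /* only enforced for HTTP 1.1+ */
      if(r->chunked)
        return !r->saw_final_chunk;     /* stream ended without the 0-chunk */
      if(r->content_length >= 0)
        return r->body_bytes != r->content_length;
      return false;                     /* no framing info: read until close */
    }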

This is basically a 6-line patch at its core. The rest is fixing up old test cases, adding new tests etc.

As a counter-point, Eric Lawrence apparently worked on adding stricter checks in IE9 three years ago as he wrote about in Content-Length in the Real World. They apparently subsequently added the check again in IE10 which seems to have caused some problems for them. It remains to be seen how this change affects Firefox users out in the real world. I believe it’ll be fine.

This patch also introduces the error code for a few other similar network situations when the connection is closed prematurely and we know there is outstanding data that never arrived, and I got the opportunity to improve how Firefox behaves when downloading an image that gets an error before the complete image has been transferred. Previously (when a partial transfer wasn't an error), it would always throw away the image on an error and instead show the “image not found” picture. That really doesn't make sense, I believe, as a partial image is better than that default one – especially when a large portion of the image has been downloaded already.

Follow-up effects

Other effects of this change that might be discovered and cause some new fun reports: prematurely cut-off transfers of javascript or CSS will now discard the entire javascript/CSS file, where previously the partial file would be used.

Of course, I doubt these files get cut off as commonly as many other file types, but on a very slow and bad connection it may still happen, and the new behavior will make Firefox act as if the file wasn't loaded at all, instead of as before when it would happily use the portions of the file it had actually received. Partial CSS and partial javascript could of course lead to some “fun” effects of brokenness.

Bye bye RFC 2616

In August 2007 the IETF HTTPbis working group started work on an update to the HTTP 1.1 specification RFC 2616 (from June 1999), which itself was an update to RFC 2068 from 1996. I wasn't part of the effort back then so I didn't get to hear the back chatter or what exactly the expectations were on delivery time and schedule, but I'm pretty sure nobody thought it would take almost seven long years for the update to reach publication status.

On June 6 2014, when RFC 7230 – RFC 7235 were released, the single 176-page document turned into 6 documents with a much larger total size, and a whole slew of additional related documents were released at the same time.

2616 is deeply carved into my brain so it'll take some time until I unlearn it, plus we now need to point to one of those separate documents instead of just one generic number for the whole thing. Source code and documentation all over now need to be carefully updated to refer to the new documents instead.

And the HTTP/2 work continues to progress at high speed. More about that in a separate blog post soon.

More details on the road from RFC2616 until today can be found in Mark Nottingham’s RFC 2616 is dead.

Less plain-text is better. Right?

Every connection and every user on the Internet is being monitored and snooped on to at least some extent every now and then. Everything from the casual firesheep user in your coffee shop, an admin in your ISP, your parents/kids on your wifi network, your employer on the company network, your country's intelligence service in a national network hub or just a random rogue person somewhere in the middle of all this.

My involvement in HTTP makes me mostly view and participate in this discussion with that protocol primarily in mind, but the discussion goes well beyond HTTP and the concepts can (and will?) be applied to most Internet protocols in the future. You can follow some of these discussions in the httpbis group, the UTA group, the tcpcrypt list, on twitter and elsewhere.

IETF just published RFC 7258 which states:

Pervasive Monitoring Is a Widespread Attack on Privacy

Passive monitoring

Most networking surveillance can be done entirely passively by just running the correct software and listening in on the correct cable. Because most internet traffic is still plain-text and readable by anyone who wants to read it when the bytes come flying by. Like your postman can read your postcards.

Opportunistic?

Recently there’s been a fierce discussion going on both inside and outside of the IETF and other protocol and standards groups about doing “opportunistic encryption” (OE) and its merits and drawbacks. The term, which in itself is being debated and often said to be better called “opportunistic keying” (OK) instead, is about having protocols transparently (invisibly to the user) upgrade plain-text versions to TLS unauthenticated encrypted versions of the protocols. I'm emphasizing the word unauthenticated there because that's key to the debate. Recently I've been told that “opportunistic security” is the term to use instead…

In the way of real security?

Basically the argument against opportunistic approaches tends to go like this: by opportunistically upgrading plain text to unauthenticated encrypted communication, sysadmins and users in the world will consider that good enough and will then not switch to proper, strong and secure authenticated encryption technologies. The less good alternative will hamper the adoption of the secure alternative. Server admins might just as well buy a cert for 10 USD and use proper HTTPS. Also, listeners can still listen in on or man-in-the-middle unauthenticated connections if they capture everything from the start of the connection, including the initial key exchange. Or the passive listener will simply become an active party, and this unauthenticated way doesn't detect that. OE doesn't prevent snooping.

Isn’t it better than plain text?

The argument for opportunism here is that nothing will be shown to the user indicating that the connection is “upgraded” to something less bad than plain text. Browsers will not show the padlock, clients will not treat the connection as “secure”. It will just silently and transparently make passive monitoring of networks much harder, and it will force actors who truly want to snoop on specific traffic to up their game and probably switch to active monitoring in more cases – something that's much more expensive for the listener. It isn't about the cost of a cert. It is about setting up and keeping the cert up-to-date, about SNI not being widely enough adopted, and about the fact that we can see only 30% of all sites on the Internet today using HTTPS – for these reasons and others.

HTTP:// over TLS

In the httpbis working group in the IETF, the outcome of this debate is that a way is being defined for doing HTTP as specified with a HTTP:// URL – which we've learned is plain text – over TLS, as part of the http2 work. Alt-Svc is the way. (The header can also be used to just load balance HTTP etc, but I'll ignore that for now.)
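As a rough illustration (the exact syntax was still being hammered out in the drafts at this point, so take this as an approximation), a server answering a plain HTTP request could advertise such an alternative with a response header along these lines:

    Alt-Svc: h2=":443"; ma=86400

telling a capable client that the same content is also reachable with http2 on port 443, where it can be fetched over TLS – without the http:// URL the user sees ever changing.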

Mozilla and Firefox is basically the only browser team that initially stands behind the idea of implementing this. HTTP:// done over TLS will not be seen or considered any more secure than ordinary HTTP, and users will not be aware whether it happens or not. Only true HTTPS connections will get the padlock, secure cookies and the other goodies true HTTPS sites are known and expected to get and show.

HTTP:// over TLS will just silently send everything through TLS (assuming that it can actually negotiate such a connection), thus making passive monitoring of the network less easy.

Ideally, future http2 capable servers will only require a config entry to be set TRUE to make it possible for clients to do OE on them.

HTTPS is the secure protocol

HTTP:// over TLS is not secure. If you want security and privacy, you should use HTTPS. This said, MITMing HTTPS transfers is still a widespread practice in certain network setups…

TCPcrypt

I find this initiative rather interesting. If implemented, it removes the need for all these application level protocols to do anything about opportunistic approaches and it could instead be handled transparently at the TCP level! It still has a long way to go though before we will see anything like this fly in real life.

The future will tell

Is this just a fad that will get no adoption and go away or is it the beginning of something that will change how we do protocols in the future? Time will tell. Many harsh words are being exchanged over this topic in many a debate right now…

(I’m trying to stick to “HTTP:// over TLS” here when referring to doing HTTP OE/OK over TLS. This is partly because RFC2818 that describes how to do HTTPS uses the phrase “HTTP over TLS”…)

http2 explained

http2 front page

I’m hereby offering you all the first version of my document explaining http2, the protocol. It features explanations of the background, basic fundamentals, details of the wire format and something about existing implementations and what to expect in the future.

The full PDF currently boasts 27 pages at version 1.0, but I plan to keep up with http2 development going forward, and since I'm also kind of expecting at least some user feedback, I'll do subsequent updates to improve and extend the document over time. Of course, time will tell how well that will work.

The document is edited in libreoffice and that file is available on github, but ODT is really not a format suitable for patches and merges, so I hope we can sort out changes by filing issues and sending emails.

curl and proxy headers

Starting in the next curl release, 7.37.0, the curl tool supports the new command line option --proxy-header. (Completely merged at this commit.)

It works exactly like --header does, but will only include those headers in requests sent to a proxy, while the opposite is true for --header: those headers will only be sent in requests that go to the end server. But of course, if you use a HTTP proxy and do a normal GET for example, curl will include headers for both the proxy and the server in the request. The bigger difference is when using CONNECT to a proxy, which will then only use the proxy headers.

libcurl

For libcurl, the story is slightly different and more complicated since we keep things backwards compatible there. The new libcurl still works exactly like the former one by default.

CURLOPT_PROXYHEADER is the new proxy header option, and it is set up exactly like CURLOPT_HTTPHEADER is.

CURLOPT_HEADEROPT is then what an application uses to set how libcurl should use the two header options. Again, by default libcurl will keep working like before and use the CURLOPT_HTTPHEADER list in all HTTP requests. To change that behavior and use the new functionality instead, set CURLOPT_HEADEROPT to CURLHEADER_SEPARATE.

The two header lists will then be handled separately. An application can switch back to the old behavior, with a unified header list, by setting CURLOPT_HEADEROPT to CURLHEADER_UNIFIED.
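A minimal sketch of the separate mode described above (the proxy address and header contents are of course made up); the curl tool's --header and --proxy-header options map onto these two lists:

    #include <curl/curl.h>

    int main(void)
    {
      CURL *curl = curl_easy_init();
      if(curl) {
        struct curl_slist *server_hdrs = NULL, *proxy_hdrs = NULL;

        /* headers meant only for the end server */
        server_hdrs = curl_slist_append(server_hdrs, "X-For-Server: yes");

        /* headers meant only for the proxy */
        proxy_hdrs = curl_slist_append(proxy_hdrs, "X-For-Proxy: yes");

        curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/");
        curl_easy_setopt(curl, CURLOPT_PROXY, "http://proxy.example.com:3128");

        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, server_hdrs);
        curl_easy_setopt(curl, CURLOPT_PROXYHEADER, proxy_hdrs);

        /* opt in to the new behavior; the default (CURLHEADER_UNIFIED)
           keeps working exactly like older libcurl versions */
        curl_easy_setopt(curl, CURLOPT_HEADEROPT, CURLHEADER_SEPARATE);

        curl_easy_perform(curl);

        curl_slist_free_all(server_hdrs);
        curl_slist_free_all(proxy_hdrs);
        curl_easy_cleanup(curl);
      }
      return 0;
    }

With an https:// URL through an HTTP proxy as in this sketch, the proxy list ends up in the CONNECT request and the server list in the request that goes inside the tunnel.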

Reducing the Public Suffix pain?

Let me introduce you to what I consider one of the worst hacks we have in current and modern internet protocols: the Public Suffix List (PSL). This is a list (maintained by Mozilla) of domains that have some kind of administrative setup or arrangement that makes their sub-domains independent. For example, you can't be allowed to set cookies for “*.com” because .com is a TLD that has independent domains. But the same thing goes for “*.co.uk” and there's no hint anywhere about this – except for the Public Suffix List. Then take that simple little example and extrapolate to a domain system that grows by several new TLDs every month. The PSL is now several thousand entries long.

And cookies aren't the only thing this is used for. Another really common and perhaps even more important use case is wildcard matches in TLS server certificates. You should not be allowed to buy and use a cert for “*.co.uk”, but you can for “*.yourcompany.co.uk”…
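To illustrate why a list is needed at all, here's a tiny self-contained sketch where a hard-coded three-entry list stands in for the real PSL (real code would of course need the full list, for example via a library such as libpsl):

    #include <stdio.h>
    #include <string.h>

    /* stand-in for the real Public Suffix List, which holds thousands of
       entries and keeps changing */
    static const char *mini_psl[] = { "com", "uk", "co.uk", NULL };

    static int is_public_suffix(const char *domain)
    {
      for(int i = 0; mini_psl[i]; i++)
        if(!strcmp(domain, mini_psl[i]))
          return 1;
      return 0;
    }

    int main(void)
    {
      /* a page on www.example.co.uk asking to set cookies for... */
      const char *attempts[] = { "co.uk", "example.co.uk", NULL };

      for(int i = 0; attempts[i]; i++)
        printf("cookie domain \"%s\": %s\n", attempts[i],
               is_public_suffix(attempts[i])
                 ? "refused (public suffix)" : "allowed");
      return 0;
    }

The same kind of check is what should stop a wildcard cert for *.co.uk from being accepted while *.yourcompany.co.uk is fine.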

Not really official but still…

If you read the cookie RFC or the spec for how to do TLS wildcard certificate matching, you won't find any line stating crystal clear that the Public Suffix List is what you must use, and I'm sure different browsers solve this slightly differently, but in practice, and most unfortunately (if you ask me), you must either use the list or make your own to be fully compliant with how the web works in 2014.

curl, wget and the PSL

In curl and libcurl, we have so far not taken the PSL into account, which is by choice since I've not had any decent way to handle it, and there are lots of embedded and other use cases that simply won't be able to cope with that large a PSL chunk.

Wget hasn't had any PSL awareness either, but in recent weeks this has been brought up on the wget list and given more attention. Work has been initiated to do something about it, which has led to…

libpsl

Tim Rühsen took the baton and started the libpsl project and its associated mailing list, as a foundation for something for Wget to use to get PSL awareness.

I've mostly cheered the effort so far and said that I wouldn't mind building on this to enhance curl in the future if it just gets a suitable (liberal enough) license, and it seems to be going in that direction. For curl's sake, I would like to make this a conditional dependency so that people without particular size restrictions can use it, while people in more embedded and special-purpose situations can continue to build without PSL support.

If you’re interested in helping out in curl and libcurl in this area, feel most welcome!

dbound

Meanwhile, the IETF has set up a new mailing list called dbound for discussions around PSL and similar issues and it seems very timely!

HTTPbis design team meeting London

I’m writing this just hours after the HTTPbis design team meeting in London 2014 has ended.

Around 30 people attended the meeting in Mozilla's central London office. The fridge was filled with drinks, the shelves were full of snacks and goodies. The day could begin. This was the Saturday after the IETF89 week, so most people attending had already spent all or parts of the week before here in London doing other HTTP and network related work. The HTTPbis sessions at the IETF itself had been productive and had already pushed us forward.

We started at 9:30 and we quickly got to work. Mark Nottingham guided us through the day with usual efficiency.

We all basically hung out in a huge room, some in chairs, some on sofas and a bunch of people on the floor or just standing up. We had mikes passed around and the http2 discussions flowed back and forth depending on the topics and what people felt about them. Some of the issues that were nailed down this time and will end up detailed in the upcoming draft-11 are (strictly speaking, we only discussed the things and formed opinions, since by IETF guidelines we can't decide things at an offline meeting like this):

  • Priorities of streams will have a dependency graph or direction, making individual streams more or less important than others
  • A client can send headers without compression and tell the proxy that the headers shouldn't be compressed – used as a way to mitigate some of the compression security problems
  • There will be no TLS renegotiation allowed mid-session. Basically a client will have to tear down the connection and negotiate again if suddenly a need to use a client certificate arises.
  • Alt-Svc is the way forward, so ALTSVC will appear as a new frame in draft-11. This is the way to signal to an application that there is another “route” to the same content on the same server. This will allow for what is popularly known as “opportunistic encryption”, or at least one sort of that. In short, you can do “plain-text” HTTP over a TLS connection using this…
  • We decided that a server should support gzipped content from clients

Some other things were handled too, but I believe those are the main changes. When the afternoon started to turn long, beers and other beverages were brought out and we enjoyed a relaxing social finale to the day before we split up into smaller groups and headed out into the busy London night to get dinner…

Thanks everyone for a great day. I also appreciated meeting in real life several people I had never met before, only discussed with and read emails from online, and of course some old friends I hadn't seen in a long time!

Oh, there’s also a new rough time frame for http2 going forward. Nearest in time would be the draft-11 at the end of March and another interim in the beginning of June (Boston?).

As a reminder, here's what happened for draft-10, and here is http2 draft-10.

Out of all the people present today, I believe Mozilla was the company with the largest team (8 attendees) – funnily enough, none of us Mozillians there actually work in this office or even in this country.

HTTP2 – the next step

The HTTPbis working group of the IETF had an interim meeting in Zurich January 22nd to 24th. I participated remotely, listening in on the discussions over webex and following the jabber room while the meetings were going on, as HTTP2 protocol issues were addressed one by one, ironing out quirks and moving forward.

I won’t bore you with details why I wasn’t present in Zurich.

Here are a couple of quick and brief reflections from my perspective:

Listening in remotely like this does not at all adequately compensate for not being there. A room full of people discussing something is really hard to follow remotely and completely impossible to interact with. It is better than not being able to listen in at all, but it is certainly not a replacement for being there.

It is amazing how much faster people can come to conclusions and fix issues when they are in the same room. Issues that had been lingering in the tracker for a very long time could be dealt with and closed within minutes. Things like what to call the protocol in ALPN or whether to remove the ability to switch off flow control. Not all issues, of course…

HTTP2 draft-09, which will soon become draft-10 to reflect the updates from this meeting and more, is from my perspective quite far along in its process. It is clearly at a point that seems to be OK with most people, and the discussions are now just about details. Of course the devil is in the details and I'm not saying it can't take a long time to settle them, but the structure and main concepts of the protocol are probably now established.

There were not very many proxy or server people at the interim. Most of the people seemed to be client-side oriented and some service oriented. I'm client-side biased myself, but I hope this doesn't lead to us deciding on things that the “other side” will have problems with down the line.

Firefox nightly supports HTTP2 draft-09 (for https:// URLs) and twitter supports it server side. Enable it in the browser by entering “about:config” in the URL bar and changing the config entry called “network.http.spdy.enabled.http2draft” to true. Done.

Some of the biggest HTTP2 changes brought up compared to what draft-09 says include:

  • no more ability to switch off flow control
  • the prioritization field/concept “weighted dependency tree approach”
  • >= TLS 1.2 with ephemeral ciphers
  • MUST not use TLS compression
  • tolerant to TLS false start or at least must accept/buffer application layer data
  • padding

There were also a whole lot of discussions about TLS for http:// URLs, proxies, MITM for SSL, opportunistic encryption and more, but I believe most of those issues remained in the same position as before – I missed parts of the last afternoon so I may have missed some details. It'll all be revealed in draft-10 and on the mailing list, I'm sure!

Update: the minutes from the interim meeting.
