More HTTP framing attempts

Monday, March 2nd, 2015

Previously, in my exciting series “improving the HTTP framing checks in Firefox” we learned that I landed a patch, got it backed out, struggled to improve the checks and finally landed the fixed version only to eventually get that one backed out as well.

And now I’ve landed my third version. The amendment I did this time:

When receiving HTTP content that is content-encoded and compressed I learned that when receiving deflate compression there is basically no good way for us to know if the content gets prematurely cut off. They seem to lack the footer too often for it to make any sense in checking for that. gzip streams however end with a footer so they are easier to reliably detect when they are incomplete. (As was discovered before, the Content-Length: is far too often not updated by the server so it is instead wrongly showing the uncompressed size.)

This (deflate vs gzip) knowledge is now used by the patch, meaning that deflate compressed downloads can be cut off without the browser noticing…

Will this version of the fix actually stick? I don’t know. There’s lots of bad voodoo out there in the HTTP world and I’m putting my finger right in the middle of some of it with this change. I’m pretty sure I’ve not written my last blog post on this topic just yet… If it sticks this time, it should show up in Firefox 39.

bolt-cutter

Tightening Firefox’s HTTP framing – again

Thursday, February 12th, 2015

An old http1.1 frameCall me crazy, but I’m at it again. First a little resume from our previous episodes in this exciting saga:

Chapter 1: I closed the 10+ year old bug that made the Firefox download manager not detect failed downloads, simply because Firefox didn’t care if the HTTP 1.1 Content-Length was larger than what was actually saved – after the connection potentially was cut off for example. There were additional details, but that was the bigger part.

Chapter 2: After having been included all the way to public release, we got a whole slew of bug reports immediately when Firefox 33 shipped and we had to revert parts of the fix I did.

Chapter 3.

Will it land before it turns 11 years old? The bug was originally submitted 2004-03-16.

Since chapter two of this drama brought back the original bugs again we still have to do something about them. I fully understand if not that many readers of this can even keep up of all this back and forth and juggling of HTTP protocol details, but this time we’re putting back the stricter frame checks with a few extra conditions to allow a few violations to remain but detect and react on others!

Here’s how I addressed this issue. I wanted to make the checks stricter but still allow some common protocol violations.

In particular I needed to allow two particular flaws that have proven to be somewhat common in the wild and were the reasons for the previous fix being backed out again:

A – HTTP chunk-encoded responses that lack the final 0-sized chunk.

B – HTTP gzipped responses where the Content-Length is not the same as the actual contents.

So, in order to allow A + B and yet be able to detect prematurely cut off transfers I decided to:

  1. Detect incomplete chunks then the transfer has ended. So, if a chunk-encoded transfer ends on exactly a chunk boundary we consider that fine. Good: This will allow case (A) to be considered fine. Bad: It will make us not detect a certain amount of cut-offs.
  2. When receiving a gzipped response, we consider a gzip stream that doesn’t end fine according to the gzip decompressing state machine to be a partial transfer. IOW: if a gzipped transfer ends fine according to the decompressor, we do not check for size misalignment. This allows case (B) as long as the content could be decoded.
  3. When receiving HTTP that isn’t content-encoded/compressed (like in case 2) and not chunked (like in case 1), perform the size comparison between Content-Length: and the actual size received and consider a mismatch to mean a NS_ERROR_NET_PARTIAL_TRANSFER error.

Firefox BallPrefs

When my first fix was backed out, it was actually not removed but was just put behind a config string (pref as we call it) named “network.http.enforce-framing.http1“. If you set that to true, Firefox will behave as it did with my original fix applied. It makes the HTTP1.1 framing fairly strict and standard compliant. In order to not mess with that setting that now has been around for a while (and I’ve also had it set to true for a while in my browser and I have not seen any problems with doing it this way), I decided to introduce my new changes pref’ed behind a separate variable.

network.http.enforce-framing.soft” is the new pref that is set to true by default with my patch. It will make Firefox do the detections outlined in 1 – 3 and setting it to false will disable those checks again.

Now I only hope there won’t ever be any chapter 4 in this story… If things go well, this will appear in Firefox 38.

Chromium

But how do they solve these problems in the Chromium project? They have slightly different heuristics (with the small disclaimer that I haven’t read their code for this in a while so details may have changed). First of all, they do not allow a missing final 0-chunk. Then, they basically allow any sort of misaligned size when the content is gzipped.

Update: this patch was subsequently backed out again due to several bug reports about it. I have yet to analyze exactly what went wrong.

Http2 right now

Sunday, February 1st, 2015

I talked in the Mozilla devroom at FOSDEM 2015. Here are the slides from it. It was recorded on video and I will post a suitable link to that once it becomes available. The talk was meant to be 20 minutes, I think I did it on 22 or something.

curl 7.40.0: unix domain sockets and smb

Thursday, January 8th, 2015

curl and libcurl curl dot-to-dot7.40.0 was just released this morning. There’s a closer look at some of the perhaps more noteworthy changes. As usual, you can find the entire changelog on the curl web site.

HTTP over unix domain sockets

So just before the feature window closed for the pending 7.40.0 release of curl, Peter Wu’s patch series was merged that brings the ability to curl and libcurl to do HTTP over unix domain sockets. This is a feature that’s been mentioned many times through the history of curl but never previously truly implemented. Peter also very nicely adjusted the test server and made two test cases that verify the functionality.

To use this with the curl command line, you specify the socket path to the new –unix-domain option and assuming your local HTTP server listens on that socket, you’ll get the response back just as with an ordinary TCP connection.

Doing the operation from libcurl means using the new CURLOPT_UNIX_SOCKET_PATH option.

This feature is actually not limited to HTTP, you can do all the TCP-based protocols except FTP over the unix domain socket, but it is to my knowledge only HTTP that is regularly used this way. The reason FTP isn’t supported is of course its use of two connections which would be even weirder to do like this.

SMB

SMB is also known as CIFS and is an old network protocol from the Microsoft world access files. curl and libcurl now support this protocol with SMB:// URLs thanks to work by Bill Nagel and Steve Holme.

Security Advisories

Last year we had a large amount of security advisories published (eight to be precise), and this year we start out with two fresh ones already on the 8th day… The ones this time were of course discovered and researched already last year.

CVE-2014-8151 is a way we accidentally allowed an application to bypass the TLS server certificate check if a TLS Session-ID was already cached for a non-checked session – when using the Mac OS SecureTransport SSL backend.

CVE-2014-8150 is a URL request injection. When letting curl or libcurl speak over a HTTP proxy, it would copy the URL verbatim into the HTTP request going to the proxy, which means that if you craft the URL and insert CRLFs (carriage returns and linefeed characters) you can insert your own second request or even custom headers into the request that goes to the proxy.

You may enjoy taking a look at the curl vulnerabilities table.

Bugs bugs bugs

The release notes mention no less than 120 specific bug fixes, which in comparison to other releases is more than average.

Enjoy!

Stricter HTTP 1.1 framing good bye

Sunday, October 26th, 2014

I worked on a patch for Firefox bug 237623 to make sure Firefox would use a stricter check for “HTTP 1.1 framing”, checking that Content-Length is correct and that there’s no broken chunked encoding pieces. I was happy to close an over 10 years old bug when the fix landed in June 2014.

The fix landed and has not caused any grief all the way since June through to the actual live release (Nightlies, Aurora, Beta etc). This change finally shipped in Firefox 33 and I had more or less already started to forget about it, and now things went south really fast.

The amount of broken servers ended up too massive for us and we had to backpedal. The largest amount of problems can be split up in these two categories:

  1. Servers that deliver gzipped content and sends a Content-Length: for the uncompressed data. This seems to be commonly done with old mod_deflate and mod_fastcgi versions on Apache, but we also saw people using IIS reporting this symptom.
  2. Servers that deliver chunked-encoding but who skip the final zero-size chunk so that the stream actually never really ends.

We recognize that not everyone can have the servers fixed – even if all these servers should still be fixed! We now make these HTTP 1.1 framing problems get detected but only cause a problem if a certain pref variable is set (network.http.enforce-framing.http1), and since that is disabled by default they will be silently ignored much like before. The Internet is a more broken and more sad place than I want to accept at times.

We haven’t fully worked out how to also make the download manager (ie the thing that downloads things directly to disk, without showing it in the browser) happy, which was the original reason for bug 237623…

Although the code may now no longer alert anything about HTTP 1.1 framing problems, it will now at least mark the connection not due for re-use which will be a big boost compared to before since these broken framing cases really hurt persistent connections use. The partial transfer return codes for broken SPDY and HTTP/2 transfers remain though and I hope to be able to remain stricter with these newer protocols.

This partial reversion will land ASAP and get merged into patch releases of Firefox 33 and later.

Finally, to top this off. Here’s a picture of an old HTTP 1.1 frame so that you know what we’re talking about.

An old http1.1 frame

Firefox and partial content

Monday, June 16th, 2014

Update: parts of the change mentioned in this blog post has subsequently been reverted since clearly I had a too positive view of the Internet.

Firefox BallOne of the first bugs that fell into my lap when I started working for Mozilla not a very long time ago, was bug 237623. Anyone involved in Mozilla knows a bug in that range is fairly old (we just recently passed one million filed bugs). This particular bug was filed in March 2004 and there are (right now) 26 other bugs marked as duplicates of this. Today, the fix for this problem has landed.

The core of the problem is that when a HTTP server sends contents back to a client, it can send a header along indicating the size of the data in the response. The header is called “Content-Length:”. If the connection gets broken during transfer for whatever reason and the browser hasn’t received as much data as was initially claimed to be delivered, that’s a very good hint that something is wrong and the transfer was incomplete.

The perhaps most annoying way this could be seen is when you download a huge DVD image or something and for some reason the connection gets cut off after only a short time, way before the entire file is downloaded, but Firefox just silently accept that as the end of the transfer and think everything was fine and dandy.

What complicates the issue is the eternal problem: not everything abides to the protocol. This said, if there are frequent violators of the protocol we can’t strictly fail on each case of problem we detect but we must instead do our best to handle it anyway.

Is Content-Length a frequently violated HTTP response header?

Let’s see…

  1. Back in the HTTP 1.0 days, the Content-Length header was not very important as the connection was mostly shut down after each response anyway. Alas, clients/browsers would swiftly learn to just wait for the disconnect anyway.
  2. Back in the old days, there were cases of problems with “large files” (files larger than 2 or 4GB) which every now and then caused the Content-Length: header to turn into negative or otherwise confused values when it wrapped. That’s not really happening these days anymore.
  3. With HTTP 1.1 and its persuasive use of persistent connections it is important to get the size right, as otherwise the chain of requests get messed up and we end up with tears and sad faces
  4. In curl’s HTTP parser we’ve always been strictly abiding to this header and we’ve bailed out hard on mismatches. This is a very rare error for users to get and based on this (admittedly unscientific data) I believe that there is not a widespread use of servers sending bad Content-Length headers.
  5. It seems Chrome at least in some aspects is already much more strict about this header.

My fix for this problem takes a slightly careful approach and only enforces the strictness for HTTP 1.1 or later servers. But then as a bonus, it has grown to also signal failure if a chunked encoded transfer ends without the ending trailer or if a SPDY or http2 transfer gets prematurely stopped.

This is basically a 6-line patch at its core. The rest is fixing up old test cases, added new tests etc.

As a counter-point, Eric Lawrence apparently worked on adding stricter checks in IE9 three years ago as he wrote about in Content-Length in the Real World. They apparently subsequently added the check again in IE10 which seems to have caused some problems for them. It remains to be seen how this change affects Firefox users out in the real world. I believe it’ll be fine.

This patch also introduces the error code for a few other similar network situations when the connection is closed prematurely and we know there are outstanding data that never arrived, and I got the opportunity to improve how Firefox behaves when downloading an image and it gets an error before the complete image has been transferred. Previously (when a partial transfer wasn’t an error), it would always throw away the image on an error and instead show the “image not found” picture. That really doesn’t make sense I believe, as a partial image is better than that default one – especially when a large portion of the image has been downloaded already.

Follow-up effects

Other effects of this change that possibly might be discovered and cause some new fun reports: prematurely cut off transfers of javascript or CSS will discard the entire javascript/CSS file. Previously the partial file would be used.

Of course, I doubt that these are the files that are as commonly cut off as many other file types but still on a very slow and bad connection it may still happen and the new behavior will make Firefox act as if the file wasn’t loaded at all, instead of previously when it would happily used the portions of the files that it had actually received. Partial CSS and partial javascript of course could lead to some “fun” effects of brokenness.

Bye bye RFC 2616

Saturday, June 7th, 2014

In August 2007 the IETF HTTPbis work group started to make an update to the HTTP 1.1 specification RFC 2616 (from June 1999) which already was an update to RFC 2068 from 1996. I wasn’t part of the effort back then so I didn’t get to hear the back chatter or what exactly the expectations were on delivery time and time schedule, but I’m pretty sure nobody thought it would take almost seven long years for the update to reach publication status.

On June 6 2014 when RFC 7230 – RFC 7235 were released, the single 176 page document has turned into 6 documents with a total size that is now much larger, and there’s also a whole slew of additional related documents released at the same time.

2616 is deeply carved into my brain so it’ll take some time until I unlearn that, plus the fact that now we need to separate our pointers to one of those separate document instead of just one generic number for the whole thing. Source codes and documents all over now need to be carefully updated to instead refer to the new documents.

And the HTTP/2 work continues to progress at high speed. More about that in a separate blog post soon.

More details on the road from RFC2616 until today can be found in Mark Nottingham’s RFC 2616 is dead.

Less plain-text is better. Right?

Tuesday, May 13th, 2014

Every connection and every user on the Internet is being monitored and snooped at to at least some extent every now and then. Everything from the casual firesheep user in your coffee shop, an admin in your ISP, your parents/kids on your wifi network, your employer on the company network, your country’s intelligence service in a national network hub or just a random rogue person somewhere in the middle of all this.

My involvement in HTTP make me mostly view and participate in this discussion with this protocol primarily in mind, but the discussion goes well beyond HTTP and the concepts can (and will?) be applied to most Internet protocols in the future. You can follow some of these discussions in the httpbis group, the UTA group, the tcpcrypt list on twitter and elsewhere.

IETF just published RFC 7258 which states:

Pervasive Monitoring Is a Widespread Attack on Privacy

Passive monitoring

Most networking surveillance can be done entirely passively by just running the correct software and listening in on the correct cable. Because most internet traffic is still plain-text and readable by anyone who wants to read it when the bytes come flying by. Like your postman can read your postcards.

Opportunistic?

Recently there’s been a fierce discussion going on both inside and outside of IETF and other protocol and standards groups about doing “opportunistic encryption” (OE) and its merits and drawbacks. The term, which in itself is being debated and often is said to be better called “opportunistic keying” (OK) instead, is about having protocols transparently (invisible to the user) upgrade plain-text versions to TLS unauthenticated encrypted versions of the protocols. I’m emphasizing the unauthenticated word there because that’s a key to the debate. Recently I’ve been told that the term “opportunistic security” is the term to use instead…

In the way of real security?

Basically the argument against opportunistic approaches tends to be like this: by opportunistically upgrading plain-text to unauthenticated encrypted communication, sysadmins and users in the world will consider that good enough and they will then not switch to using proper, strong and secure authentication encryption technologies. The less good alternative will hamper the adoption of the secure alternative. Server admins should just as well buy a cert for 10 USD and use proper HTTPS. Also, listeners can still listen in on or man-in-the-middle unauthenticated connections if they capture everything from the start of the connection, including the initial key exchange. Or the passive listener will just change to become an active party and this unauthenticated way doesn’t detect that. OE doesn’t prevent snooping.

Isn’t it better than plain text?

The argument for opportunism here is that there will be nothing to the user that shows that it is “upgrading” to something less bad than plain text. Browsers will not show the padlock, clients will not treat the connection as “secure”. It will just silently and transparently make passive monitoring of networks much harder and it will force actors who truly want to snoop on specific traffic to up their game and probably switch to active monitoring for more cases. Something that’s much more expensive for the listener. It isn’t about the cost of a cert. It is about setting up and keeping the cert up-to-date, about SNI not being widely enough adopted and that we can see only 30% of all sites on the Internet today use HTTPS – for these reasons and others.

HTTP:// over TLS

In the httpbis work group in IETF the outcome of this debate is that there is a way being defined on how to do HTTP as specified with a HTTP:// URL – that we’ve learned is plain-text – over TLS, as part of the http2 work. Alt-Svc is the way. (The header can also be used to just load balance HTTP etc but I’ll ignore that for now)

Mozilla and Firefox is basically the only team that initially stands behind the idea of implementing this in a browser. HTTP:// done over TLS will not be seen nor considered any more secure than ordinary HTTP is and users will not be aware if that happens or not. Only true HTTPS connections will get the padlock, secure cookies and the other goodies true HTTPS sites are known and expected to get and show.

HTTP:// over TLS will just silently send everything through TLS (assuming that it can actually negotiate such a connection), thus making passive monitoring of the network less easy.

Ideally, future http2 capable servers will only require a config entry to be set TRUE to make it possible for clients to do OE on them.

HTTPS is the secure protocol

HTTP:// over TLS is not secure. If you want security and privacy, you should use HTTPS. This said, MITMing HTTPS transfers is still a widespread practice in certain network setups…

TCPcrypt

I find this initiative rather interesting. If implemented, it removes the need for all these application level protocols to do anything about opportunistic approaches and it could instead be handled transparently on TCP level! It still has a long way to go though before we will see anything like this fly in real life.

The future will tell

Is this just a fad that will get no adoption and go away or is it the beginning of something that will change how we do protocols in the future? Time will tell. Many harsh words are being exchanged over this topic in many a debate right now…

(I’m trying to stick to “HTTP:// over TLS” here when referring to doing HTTP OE/OK over TLS. This is partly because RFC2818 that describes how to do HTTPS uses the phrase “HTTP over TLS”…)

http2 explained

Saturday, April 26th, 2014

http2 front page

I’m hereby offering you all the first version of my document explaining http2, the protocol. It features explanations on the background, basic fundamentals, details on the wire format and something about existing implementations and what’s to expect for the future.

The full PDF currently boasts 27 pages at version 1.0, but I plan to keep up with the http2 development going further and I’m also kind of thinking that I will get at least some user feedback, and I’ll do subsequent updates to improve and extend the document over time. Of course time will tell how good that will work.

The document is edited in libreoffice and that file is available on github, but ODT is really not a format suitable for patches and merges so I hope we can sort out changes with filing issues and sending emails.

curl and proxy headers

Friday, April 4th, 2014

Starting in the next curl release, 7.37.0, the curl tool supports the new command line option –proxy-header. (Completely merged at this commit.)

It works exactly like –header does, but will only include the headers in requests sent to a proxy, while the opposite is true for –header: that will only be sent in requests that will go to the end server. But of course, if you use a HTTP proxy and do a normal GET for example, curl will include headers for both the proxy and the server in the request. The bigger difference is when using CONNECT to a proxy, which then only will use proxy headers.

libcurl

For libcurl, the story is slightly different and more complicated since we’re having things backwards compatible there. The new libcurl still works exactly like the former one by default.

CURLOPT_PROXYHEADER is the new option that is the new proxy header option that should be set up exactly like CURLOPT_HTTPHEADER is

CURLOPT_HEADEROPT is then what an application uses to set how libcurl should use the two header options. Again, by default libcurl will keep working like before and use the CURLOPT_HTTPHEADER list in all HTTP requests. To change that behavior and use the new functionality instead, set CURLOPT_HEADEROPT to CURLHEADER_SEPARATE.

Then, the header lists will be handled as separate. An application can then switch back to the old behavior with a unified header list by using CURLOPT_HEADEROPT set to CURLHEADER_UNIFIED.