Archive for the ‘Web’ Category

Firefox and partial content

Monday, June 16th, 2014

Firefox BallOne of the first bugs that fell into my lap when I started working for Mozilla not a very long time ago, was bug 237623. Anyone involved in Mozilla knows a bug in that range is fairly old (we just recently passed one million filed bugs). This particular bug was filed in March 2004 and there are (right now) 26 other bugs marked as duplicates of this. Today, the fix for this problem has landed.

The core of the problem is that when a HTTP server sends contents back to a client, it can send a header along indicating the size of the data in the response. The header is called “Content-Length:”. If the connection gets broken during transfer for whatever reason and the browser hasn’t received as much data as was initially claimed to be delivered, that’s a very good hint that something is wrong and the transfer was incomplete.

The perhaps most annoying way this could be seen is when you download a huge DVD image or something and for some reason the connection gets cut off after only a short time, way before the entire file is downloaded, but Firefox just silently accept that as the end of the transfer and think everything was fine and dandy.

What complicates the issue is the eternal problem: not everything abides to the protocol. This said, if there are frequent violators of the protocol we can’t strictly fail on each case of problem we detect but we must instead do our best to handle it anyway.

Is Content-Length a frequently violated HTTP response header?

Let’s see…

  1. Back in the HTTP 1.0 days, the Content-Length header was not very important as the connection was mostly shut down after each response anyway. Alas, clients/browsers would swiftly learn to just wait for the disconnect anyway.
  2. Back in the old days, there were cases of problems with “large files” (files larger than 2 or 4GB) which every now and then caused the Content-Length: header to turn into negative or otherwise confused values when it wrapped. That’s not really happening these days anymore.
  3. With HTTP 1.1 and its persuasive use of persistent connections it is important to get the size right, as otherwise the chain of requests get messed up and we end up with tears and sad faces
  4. In curl’s HTTP parser we’ve always been strictly abiding to this header and we’ve bailed out hard on mismatches. This is a very rare error for users to get and based on this (admittedly unscientific data) I believe that there is not a widespread use of servers sending bad Content-Length headers.
  5. It seems Chrome at least in some aspects is already much more strict about this header.

My fix for this problem takes a slightly careful approach and only enforces the strictness for HTTP 1.1 or later servers. But then as a bonus, it has grown to also signal failure if a chunked encoded transfer ends without the ending trailer or if a SPDY or http2 transfer gets prematurely stopped.

This is basically a 6-line patch at its core. The rest is fixing up old test cases, added new tests etc.

As a counter-point, Eric Lawrence apparently worked on adding stricter checks in IE9 three years ago as he wrote about in Content-Length in the Real World. They apparently subsequently added the check again in IE10 which seems to have caused some problems for them. It remains to be seen how this change affects Firefox users out in the real world. I believe it’ll be fine.

This patch also introduces the error code for a few other similar network situations when the connection is closed prematurely and we know there are outstanding data that never arrived, and I got the opportunity to improve how Firefox behaves when downloading an image and it gets an error before the complete image has been transferred. Previously (when a partial transfer wasn’t an error), it would always throw away the image on an error and instead show the “image not found” picture. That really doesn’t make sense I believe, as a partial image is better than that default one – especially when a large portion of the image has been downloaded already.

Follow-up effects

Other effects of this change that possibly might be discovered and cause some new fun reports: prematurely cut off transfers of javascript or CSS will discard the entire javascript/CSS file. Previously the partial file would be used.

Of course, I doubt that these are the files that are as commonly cut off as many other file types but still on a very slow and bad connection it may still happen and the new behavior will make Firefox act as if the file wasn’t loaded at all, instead of previously when it would happily used the portions of the files that it had actually received. Partial CSS and partial javascript of course could lead to some “fun” effects of brokenness.

Http2 interim meeting NYC

Sunday, June 8th, 2014

On June 5th, around thirty people sat down around a huge table in a conference room on the 4th floor in the Google offices in New York City, with a heavy rain pouring down outside.

It was time for another IETF http2 interim meeting. The attendees were all participants in the HTTPbis work group and came from a wide variety of companies and countries. The major browser vendors were represented there, and so were operators and big service providers and some proxy people. Most of the people who have been speaking up on the mailing list over the last year or so, unfortunately with a couple of people notably absent. (And before anyone asks, yes we are a group where the majority is old males like me.)

Most people present knew many of the others already, which helped to create a friendly familiar spirit and we quickly got started on the Thursday morning working our way through the rather long lits of issues to deal with. When we had our previous interim meeting in London, I think most of us though we would’ve been further along today but recent development and discussions on the list had actually brought back a lot of issues we though we were already done with and we now reiterated a whole slew of subjects. We weren’t allowed to take photographs indoors so you won’t see any pictures of this opportunity from me here.

Google offices building logo

We did close many issues and I’ll just quickly mention some of the noteworthy ones here…

Extensions

We started out with the topic of “extensions”. Should we revert the decision from Zurich (where it was decided that we shouldn’t allow extensions in http2) or was the current state of the protocol the right one? The arguments for allowing extensions included that we’d keep getting requests for new things to add unless we have a way and that some of the recent stuff we’ve added really could’ve been done as extensions instead. An argument against it is that it makes things much simpler and reliable if we just document exactly what the protocol has and is, and removing “optional” behavior from the protocol has been one of the primary mantas along the design process.

The discussion went back and forth for a long time, and after almost three hours we had kind of a draw. Nobody was firmly against “the other” alternative but the two sides also seemed to have roughly the same amount of support. Then it was yet again time for the coin toss to guide us. Martin brought out an Australian coin and … the next protocol draft will allow extensions. Again. This also forces implementation to have to read and skip all unknown frames it receives compared to the existing situation where no unknown frames can ever occur.

BLOCKED as an extension

A rather given first candidate for an extension was the BLOCKED frame. At the time BLOCKED was added to the protocol it was explicitly added into the spec because we didn’t have extensions – and it is now being lifted out into one.

ALTSVC as an extension

What received slightly more resistance was the move to move out the ALTSVC frame as well. It was argued that the frame isn’t mandatory to support and therefore easily can be made into an extension.

Simplified padding

Another small change of the wire format since draft-12 was the removal of the high byte for padding to simplify. It reduces the amount you can pad a single frame but you can easily pad more using other means if you really have to, and there were numbers presented that said that 255 bytes were enough with HTTP 1.1 already so probably it will be enough for version 2 as well.

Schedule

There will be a new draft out really soon: draft -13. Martin, our editor of the spec, says he’ll be able to ship it in a week. That is intended to be the last draft, intended for implementation and it will then be expected to get deployed rather widely to allow us all in the industry to see how it works and be able to polish details or wordings that may still need it.

We had numerous vendors and HTTP stack implementers in the room and when we discussed schedule for when various products will be able to see daylight. If we all manage to stick to the plans. we may just have plenty of products and services that support http2 by the September/October time frame. If nothing major is found in this latest draft, we’re looking at RFC status not too far into 2015.

Meeting summary

I think we’re closing in for real now and I have good hopes for the protocol and our progress to a really wide scale deployment across the Internet. The HTTPbis group is an awesome crowd to work with and I had a great time. Our hosts took good care of us and made sure we didn’t lack any services or supplies. Extra thanks go to those of you who bought me dinners and to those who took me out to good beer places!

My http2 document

Yeah, it will now become somewhat out of date and my plan is to update it once the next draft ships. I’ll also do another http2 presentation already this week so I hope to also post an updated slide set soonish. Stay tuned!

Wireshark

My plan is to cooperate with the other Wireshark hackers and help making sure we have the next draft version supported in Wireshark really soon after its published.

curl and nghttp2

Most of the differences introduced are in the binary format so nghttp2 will need to be updated again – it is the library curl uses for the wire format of http2. The curl parts will need some adjustments, for example for Content-Encoding gzip that no longer is implicit but there should be little to do in the curl code for this draft bump.

Less plain-text is better. Right?

Tuesday, May 13th, 2014

Every connection and every user on the Internet is being monitored and snooped at to at least some extent every now and then. Everything from the casual firesheep user in your coffee shop, an admin in your ISP, your parents/kids on your wifi network, your employer on the company network, your country’s intelligence service in a national network hub or just a random rogue person somewhere in the middle of all this.

My involvement in HTTP make me mostly view and participate in this discussion with this protocol primarily in mind, but the discussion goes well beyond HTTP and the concepts can (and will?) be applied to most Internet protocols in the future. You can follow some of these discussions in the httpbis group, the UTA group, the tcpcrypt list on twitter and elsewhere.

IETF just published RFC 7258 which states:

Pervasive Monitoring Is a Widespread Attack on Privacy

Passive monitoring

Most networking surveillance can be done entirely passively by just running the correct software and listening in on the correct cable. Because most internet traffic is still plain-text and readable by anyone who wants to read it when the bytes come flying by. Like your postman can read your postcards.

Opportunistic?

Recently there’s been a fierce discussion going on both inside and outside of IETF and other protocol and standards groups about doing “opportunistic encryption” (OE) and its merits and drawbacks. The term, which in itself is being debated and often is said to be better called “opportunistic keying” (OK) instead, is about having protocols transparently (invisible to the user) upgrade plain-text versions to TLS unauthenticated encrypted versions of the protocols. I’m emphasizing the unauthenticated word there because that’s a key to the debate. Recently I’ve been told that the term “opportunistic security” is the term to use instead…

In the way of real security?

Basically the argument against opportunistic approaches tends to be like this: by opportunistically upgrading plain-text to unauthenticated encrypted communication, sysadmins and users in the world will consider that good enough and they will then not switch to using proper, strong and secure authentication encryption technologies. The less good alternative will hamper the adoption of the secure alternative. Server admins should just as well buy a cert for 10 USD and use proper HTTPS. Also, listeners can still listen in on or man-in-the-middle unauthenticated connections if they capture everything from the start of the connection, including the initial key exchange. Or the passive listener will just change to become an active party and this unauthenticated way doesn’t detect that. OE doesn’t prevent snooping.

Isn’t it better than plain text?

The argument for opportunism here is that there will be nothing to the user that shows that it is “upgrading” to something less bad than plain text. Browsers will not show the padlock, clients will not treat the connection as “secure”. It will just silently and transparently make passive monitoring of networks much harder and it will force actors who truly want to snoop on specific traffic to up their game and probably switch to active monitoring for more cases. Something that’s much more expensive for the listener. It isn’t about the cost of a cert. It is about setting up and keeping the cert up-to-date, about SNI not being widely enough adopted and that we can see only 30% of all sites on the Internet today use HTTPS – for these reasons and others.

HTTP:// over TLS

In the httpbis work group in IETF the outcome of this debate is that there is a way being defined on how to do HTTP as specified with a HTTP:// URL – that we’ve learned is plain-text – over TLS, as part of the http2 work. Alt-Svc is the way. (The header can also be used to just load balance HTTP etc but I’ll ignore that for now)

Mozilla and Firefox is basically the only team that initially stands behind the idea of implementing this in a browser. HTTP:// done over TLS will not be seen nor considered any more secure than ordinary HTTP is and users will not be aware if that happens or not. Only true HTTPS connections will get the padlock, secure cookies and the other goodies true HTTPS sites are known and expected to get and show.

HTTP:// over TLS will just silently send everything through TLS (assuming that it can actually negotiate such a connection), thus making passive monitoring of the network less easy.

Ideally, future http2 capable servers will only require a config entry to be set TRUE to make it possible for clients to do OE on them.

HTTPS is the secure protocol

HTTP:// over TLS is not secure. If you want security and privacy, you should use HTTPS. This said, MITMing HTTPS transfers is still a widespread practice in certain network setups…

TCPcrypt

I find this initiative rather interesting. If implemented, it removes the need for all these application level protocols to do anything about opportunistic approaches and it could instead be handled transparently on TCP level! It still has a long way to go though before we will see anything like this fly in real life.

The future will tell

Is this just a fad that will get no adoption and go away or is it the beginning of something that will change how we do protocols in the future? Time will tell. Many harsh words are being exchanged over this topic in many a debate right now…

(I’m trying to stick to “HTTP:// over TLS” here when referring to doing HTTP OE/OK over TLS. This is partly because RFC2818 that describes how to do HTTPS uses the phrase “HTTP over TLS”…)

licensed to get shared

Tuesday, May 6th, 2014

As my http2 presentation is about to get its 16,000th viewer over at Slideshare I just have to take a moment and reflect over that fact.

Sixteen thousand viewers. I’ve uploaded slides there before over the years but no other presentation has gotten even close to this amount of attention even though some of them have been collecting views for years by now.

http2 presentation screenshot

I wrote my http2 explained document largely due to the popularity of my presentation and the stream of questions and curiosity that brought to life. Within just a couple of days, that 27 page document had been downloaded more than 2,000 times and by now over 5000 times. This is almost 7MB of PDF which I believe raises the bar for the ordinary casual browser to not download it without having an interest and intention to at least glance through it. Of course I realize a large portion of said downloads are never really read.

Someone suggested to me (possibly in jest) that I should convert these into ebooks and “charge 1 USD a piece to get some profit out of them”. I really won’t and I would have a struggle to do that. It has been said before but in my case it is indeed true: I stand on the shoulders of giants. I’ve just collected information and written down texts that mostly are ideas, suggestions and conclusions others have already made in various other forums, lists or documents. I wouldn’t feel right charging for that nor depriving anyone the rights and freedoms to create derivatives and continue building on what I’ve done. I’m just the curator and janitor here. Besides, I already have an awesome job at an awesome company that allows me to work full time on open source – every day.

The next phase started thanks to the open license. A friendly volunteer named Vladimir Lettiev showed up and translated the entire document into Russian and now suddenly the reach of the text is vastly expanded into a territory where it previously just couldn’t penetrate. With using people’s native languages, information can really trickle down to a much larger audience. Especially in regions that aren’t very Englishified.

http2 explained

Saturday, April 26th, 2014

http2 front page

I’m hereby offering you all the first version of my document explaining http2, the protocol. It features explanations on the background, basic fundamentals, details on the wire format and something about existing implementations and what’s to expect for the future.

The full PDF currently boasts 27 pages at version 1.0, but I plan to keep up with the http2 development going further and I’m also kind of thinking that I will get at least some user feedback, and I’ll do subsequent updates to improve and extend the document over time. Of course time will tell how good that will work.

The document is edited in libreoffice and that file is available on github, but ODT is really not a format suitable for patches and merges so I hope we can sort out changes with filing issues and sending emails.

curl and proxy headers

Friday, April 4th, 2014

Starting in the next curl release, 7.37.0, the curl tool supports the new command line option –proxy-header. (Completely merged at this commit.)

It works exactly like –header does, but will only include the headers in requests sent to a proxy, while the opposite is true for –header: that will only be sent in requests that will go to the end server. But of course, if you use a HTTP proxy and do a normal GET for example, curl will include headers for both the proxy and the server in the request. The bigger difference is when using CONNECT to a proxy, which then only will use proxy headers.

libcurl

For libcurl, the story is slightly different and more complicated since we’re having things backwards compatible there. The new libcurl still works exactly like the former one by default.

CURLOPT_PROXYHEADER is the new option that is the new proxy header option that should be set up exactly like CURLOPT_HTTPHEADER is

CURLOPT_HEADEROPT is then what an application uses to set how libcurl should use the two header options. Again, by default libcurl will keep working like before and use the CURLOPT_HTTPHEADER list in all HTTP requests. To change that behavior and use the new functionality instead, set CURLOPT_HEADEROPT to CURLHEADER_SEPARATE.

Then, the header lists will be handled as separate. An application can then switch back to the old behavior with a unified header list by using CURLOPT_HEADEROPT set to CURLHEADER_UNIFIED.

Presentation: what is http2

Wednesday, April 2nd, 2014

We had the 14th meetup with foss-sthlm yesterday and I talked about http2 for the almost 100 attendees in the audience. See my slides at slideshare:

Http2 from Daniel Stenberg

groups.google.com hates greylisting

Thursday, March 27th, 2014

Dear Google,

Here’s a Wikipedia article for you: Greylisting.

After you’ve read that, then consider the error message I always get for my groups.google.com account when you disable mail sending to me due to “bouncing”:

Bounce status Your email address is currently flagged as bouncing. For additional information or to correct this, view your email status here [link].

Following that link I get to read the reason:

“Google tried to deliver your message, but it was rejected by the server for the recipient domain haxx.se by [mailserver]. The error that the other server returned was: 451 4.7.1 Greylisting in action, please come back later”

See, even the error message spells out what it is all about!

Thanks to this feature of Google groups, I cannot participate in any such lists/groups for as long as I keep my greylisting activated since it’ll keep disabling mail delivery to me.

Enabling greylisting decreased my spam flood to roughly a third of the previous volume (and now I’m at 500-1000 spam emails/day) so I’m not ready to disable it yet. I just have to not use google groups.

Update: I threw in the towel and I now whitelist google.com servers to get around this problem…

http2 in curl

Sunday, March 9th, 2014

While the first traces of http2 support in curl was added already back in September 2013 it hasn’t been until recently it actually was made useful. There’s been a lot of http2 related activities in the curl team recently and in the late January 2014 we could run our first command line inter-op tests against public http2 (draft-09) servers on the Internet.

There’s a lot to be said about http2 for those not into its nitty gritty details, but I’ll focus on the curl side of this universe in this blog post. I’ll do separate posts and presentations on http2 “internals” later.

A quick http2 overview

http2 (without the minor version, as per what the IETF work group has decided on) is a binary protocol that allows many logical streams multiplexed over the same physical TCP connection, it features compressed headers in both directions and it has stream priorities and more. It is being designed to maintain the user concepts and paradigms from HTTP 1.1 so web sites don’t have to change contents and web authors won’t need to relearn a lot. The web will not break because of http2, it will just magically work a little better, a little smoother and a little faster.

In libcurl we build http2 support with the help of the excellent library called nghttp2, which takes care of all the binary protocol details for us. You’ll also have to build it with a new enough version of the SSL library of your choice, as http2 over TLS will require use of some fairly recent TLS extensions that not many older releases have and several TLS libraries still completely lack!

The need for an extension is because with speaking TLS over port 443 which HTTPS implies, the current and former web infrastructure assumes that we will speak HTTP 1.1 over that, while we now want to be able to instead say we want to talk http2. When Google introduced SPDY then pushed for a new extension called NPN to do this, which when taken through the standardization in IETF has been forked, changed and renamed to ALPN with roughly the same characteristics (I don’t know the specific internals so I’ll stick to how they appear from the outside).

So, NPN and especially ALPN are fairly recent TLS extensions so you need a modern enough SSL library to get that support. OpenSSL and NSS both support NPN and ALPN with a recent enough version, while GnuTLS only supports ALPN. You can build libcurl to use any of these threes libraries to get it to talk http2 over TLS.

http2 using libcurl

(This still describes what’s in curl’s git repository, the first release to have this level of http2 support is the upcoming 7.36.0 release.)

Users of libcurl who want to enable http2 support will only have to set CURLOPT_HTTP_VERSION to CURL_HTTP_VERSION_2_0 and that’s it. It will make libcurl try to use http2 for the HTTP requests you do with that handle.

For HTTP URLs, this will make libcurl send a normal HTTP 1.1 request with an offer to the server to upgrade the connection to version 2 instead. If it does, libcurl will continue using http2 in the clear on the connection and if it doesn’t, it’ll continue using HTTP 1.1 on it. This mode is what Firefox and Chrome will not support.

For HTTPS URLs, libcurl will use NPN and ALPN as explained above and offer to speak http2 and if the server supports it. there will be http2 sweetness from than point onwards. Or it selects HTTP 1.1 and then that’s what will be used. The latter is also what will be picked if the server doesn’t support ALPN and NPN.

Alt-Svc and ALTSVC are new things planned to show up in time for http2 draft-11 so we haven’t really thought through how to best support them and provide their features in the libcurl API. Suggestions (and patches!) are of course welcome!

http2 with curl

Hardly surprising, the curl command line tool also has this power. You use the –http2 command line option to switch on the libcurl behavior as described above.

Translated into old-style

To reduce transition pains and problems and to work with the rest of the world to the highest possible degree, libcurl will (decompress and) translate received http2 headers into http 1.1 style headers so that applications and users will get a stream of headers that look very much the way you’re used to and it will produce an initial response line that says HTTP 2.0 blabla.

Building (lib)curl to support http2

See the README.http2 file in the lib/ directory.

This is still a draft version of http2!

I just want to make this perfectly clear: http2 is not out “for real” yet. We have tried our http2 support somewhat at the draft-09 level and Tatsuhiro has worked on the draft-10 support in nghttp2. I expect there to be at least one more draft, but perhaps even more, before http2 becomes an official RFC. We hope to be able to stay on the frontier of http2 and deliver support for the most recent draft going forward.

PS. If you try any of this and experience any sort of problems, please speak to us on the curl-library mailing list and help us smoothen out whatever problem you got!

cURL

HTTP2 – the next step

Friday, January 24th, 2014

IETFThe HTTPbis working group of the IETF had an interim meeting in Zurich January 22nd to 24th. I participated from remote and I listened in on the discussions over webex and followed the jabber room while the meetings were going on, addressing HTTP2 protocol issues one by one ironing out quirks and progressing forward.

I won’t bore you with details why I wasn’t present in Zurich.

Here’s a couple of quick and brief reflections from my perspective:

Listening in from remote like this is not at all adequately compensating for not being there. A room full of people discussing something is really hard to follow from remote and completely impossible to interact with. It is better than not being able to listen in at all, but it is certainly not a replacement for being there.

It is amazing how much faster people can come to conclusions and fix issues when being in the same room. Issues that have been lingering in the tracker for a very long time could be dealt with and closed within minutes. Things like what to call the protocol in ALPN or to remove the ability to switched off flow control. Not all issues of course…

HTTP2 draft-09 that soon will become draft-10 to reflect the updates from this meeting and more, is from my perspective quite far in its process. It is clearly at a point that seems to be OK with most people and the discussions are now just about details. Of course the devil is in the details and I’m not saying it can’t take a long time to settle on them, but the structure and main concepts of the protocol are probably now established.

There were not very many proxy or server people at the interim. Most of the people seemed to be client-side oriented and some service oriented. I’m personally client-side biased myself but I hope this doesn’t lead to us deciding on things that the “other side” will have problems with down the line.

Firefox nightly supports HTTP2 draft-09 (for https:// URLs) and twitter supports it server side. Enable it in the browser by entering “about:config” in the URL bar and change the config entry called “network.http.spdy.enabled.http2draft” to true. Done.

Some of the biggest HTTP2 changes brought up compared to what draft-09 says include:

  • no more ability to switch off flow control
  • the prioritization field/concept “weighted dependency tree approach”
  • >= TLS 1.2 with ephemeral ciphers
  • MUST not use TLS compression
  • tolerant to TLS false start or at least must accept/buffer application layer data
  • padding

There was also a whole lot of discussions about TLS for http:// urls, proxys, MITM for SSL, opportunistic encryption and more but I believe most of those issues remained at the same position as before – I missed out on parts of the last afternoon so I may have missed some details. It’ll all be revealed in draft-10 and the mailing list I’m sure!

Update: the minutes from the interim meeting.

http2-drawing