Category Archives: Technology

Really everything related to technology

Changing networks with Firefox running

September 26, 2014 Daniel Stenberg 2 Comments

Short recap: I work on network code for Mozilla. Bug 939318 is one of “mine” – yesterday I landed a fix (a patch series with 6 individual patches) for this and I wanted to explain what goodness that should (might?) come from this!

diffstat

diffstat reports this on the complete patch series:

29 files changed, 920 insertions(+), 162 deletions(-)

The change set can be seen in mozilla-central here. But I guess a proper description is easier for most…

The bouncy road to inclusion

This feature set and associated problems with it has been one of the most time consuming things I’ve developed in recent years, I mean in relation to the amount of actual code produced. I’ve had it “landed” in the mozilla-inbound tree five times and yanked out again before it landed correctly (within a few hours), every time of course reverted again because I had bugs remaining in there. The bugs in this have been really tricky with a whole bunch of timing-dependent and race-like problems and me being unfamiliar with a large part of the code base that I’m working on. It has been a highly frustrating journey during periods but I’d like to think that I’ve learned a lot about Firefox internals partly thanks to this resistance.

As I write this, it has not even been 24 hours since it got into m-c so there’s of course still a risk there’s an ugly bug or two left, but then I also hope to fix the pending problems without having to revert and re-apply the whole series…

Many ways to connect to networks

In many network setups today, you get an environment and a network “experience” that is crafted for that particular place. For example you may connect to your work over a VPN where you get your company DNS and you can access sites and services you can’t even see when you connect from the wifi in your favorite coffee shop. The same thing goes for when you connect to that captive portal over wifi until you realize you used the wrong SSID and you switch over to the access point you were supposed to use.

For every one of these setups, you get different DHCP setups passed down and you get a new DNS server and so on.

These days laptop lids are getting closed (and the machine is put to sleep) at one place to be opened at a completely different location and rarely is the machine rebooted or the browser shut down.

Switching between networks

Switching from one of the networks to the next is of course something your operating system handles gracefully. You can even easily be connected to multiple ones simultaneously like if you have both an Ethernet card and wifi.

Enter browsers. Or in this case let’s be specific and talk about Firefox since this is what I work with and on. Firefox – like other browsers – will cache images, it will cache DNS responses, it maintains connections to sites a while even after use, it connects to some sites even before you “go there” and so on. All in the name of giving the users an as good and as fast experience as possible.

The combination of keeping things cached and alive, together with the fact that switching networks brings new perspectives and new “truths” offers challenges.

Realizing the situation is new

The changes are not at all mind-bending but are basically these three parts:

Make sure that we detect network changes, even if just the set of available interfaces change. Send an event for this.
Make sure the necessary parts of the code listens and understands this “network topology changed” event and acts on it accordingly
Consider coming back from “sleep” to be a network changed event since we just cannot be sure of the network situation anymore.

The initial work has been made for Windows only but it allows us to smoothen out any rough edges before we continue and make more platforms support this.

The network changed event can be disabled by switching off the new “network.notify.changed” preference. If you do end up feeling a need for that, I really hope you file a bug explaining the details so that we can work on fixing it!

Act accordingly

So what is acting properly? What if the network changes in a way so that your active connections suddenly can’t be used anymore due to the new rules and routing and what not? We attack this problem like this: once we get a “network changed” event, we “allow” connections to prove that they are still alive and if not they’re torn down and re-setup when the user tries to reload or whatever. For plain old HTTP(S) this means just seeing if traffic arrives or can be sent off within N seconds, and for websockets, SPDY and HTTP2 connections it involves sending an actual ping frame and checking for a response.

The internal DNS cache was a bit tricky to handle. I initially just flushed all entries but that turned out nasty as I then also killed ongoing name resolves that caused errors to get returned. Now I instead added logic that flushes all the already resolved names and it makes names “in transit” to get resolved again so that they are done on the (potentially) new network that then can return different addresses for the same host name(s).

This should drastically reduce the situation that could happen before when Firefox would basically just freeze and not want to do any requests until you closed and restarted it. (Or waited long enough for other timeouts to trigger.)

The ‘N seconds’ waiting period above is actually 5 seconds by default and there’s a new preference called “network.http.network-changed.timeout” that can be altered at will to allow some experimentation regarding what the perfect interval truly is for you.

Initially on Windows only

My initial work has been limited to getting the changed event code done for the Windows back-end only (since the code that figures out if there’s news on the network setup is highly system specific), and now when this step has been taken the plan is to introduce the same back-end logic to the other platforms. The code that acts on the event is pretty much generic and is mostly in place already so it is now a matter of making sure the event can be generated everywhere.

My plan is to start on Firefox OS and then see if I can assist with the same thing in Firefox on Android. Then finally Linux and Mac.

I started on Windows since Windows is one of the platforms with the largest amount of Firefox users and thus one of the most prioritized ones.

More to do

There’s separate work going on for properly detecting captive portals. You know the annoying things hotels and airports for example tend to have to force you to do some login dance first before you are allowed to use the internet at that location. When such a captive portal is opened up, that should probably qualify as a network change – but it isn’t yet.

cURL and libcurl, Security

Using APIs without reading docs

September 18, 2014 Daniel Stenberg 1 Comment

This morning, my debug session was interrupted for a brief moment when two friends independently of each other pinged me to inform me about a talk at the current SEC-T conference going on here in Stockholm right now. It was yet again time to bring up the good old fun called libcurl API bashing. Again from the angle that users who don’t read the API docs might end up using it wrong.

Updated: You can see Meredith Patterson’s talk here, and the libcurl parts start at 24:15.

The specific libcurl topic at hand once again mostly had the CURLOPT_VERIFYHOST option in focus, with basically is the same argument that was thrown at us two years ago when libcurl was said to be dangerous. It is not a boolean. It is an option that takes (or took) three different values, where 2 is the secure level and 0 is disabled.

(This picture is a screengrab from the live stream off youtube, I don’t have any link to a stored version of it yet. Click it for slightly higher resolution.)

Speaker Meredith L. Patterson actually spoke for quite a long time about curl and its options to verify server certificates. While I will agree that she has a few good points, it was still riddled with errors and I think she deliberately phrased things in a manner to make the talk good and snappy rather than to be factually correct and trying to understand why things are like they are.

The VERIFYHOST option apparently sounds as if it takes a boolean (accordingly), but it doesn’t. She says verifying a certificate has to be a Yes/No question so obviously it is a boolean. First, let’s be really technical: the libcurl options that take numerical values always accept a ‘long’ and all documentation specify which values you can pass in. None of them are boolean, not by actual type in the C language and not described like that in the man pages. There are however language bindings running on top of libcurl that may use booleans for the values that take 0 or 1, but there’s no guarantee we won’t add more values in a future to numerical options.

I wrote down a few quotes from her that I’d like to address.

“In order for it to do anything useful, the value actually has to be set to two”

I get it, she wants a fun presentation that makes the audience listen and grin cheerfully. But this is highly inaccurate. libcurl has it set to verify by default. An application doesn’t have to set it to anything. The only reason to set this value is if you’re not happy with checking the cert unconditionally, and then you’ve already wondered off the secure route.

“All it does when set to to two is to check that the common name in the cert matches the host name in the URL. That’s literally all it does.”

No, it’s not. It “only” verifies the host name curl connects to against the name hints in the server cert, yes, but that’s a lot more than just the common name field.

“there’s been 10 versions and they haven’t fixed this yet […] the docs still say they’re gonna fix this eventually […] I wanna know when eventually is”

Qualified BS and ignorance of details. Let’s see the actual code first: it ignores the 1 value and returns an error and thus leaves the internal default 2, Alas, code that sets 1 or 2 gets the same effect == verified certificate. Why is this a problem?

Then, she says she really wants to know when “eventually” is. (The docs say “Future versions will…”) So if she was so curious you’d think she would’ve tried to ask us? We’re an accessible bunch, on mailing lists, on IRC and on twitter. No she didn’t ask.

But perhaps most importantly: did she really consider why it returns an error for 1? Since libcurl silently accepted 1 as a value for something like 10 years, there are a lot of old installations “out there” in the wild, and by returning an error for 1 we try to make applications notice and adjust. By silently accepting 1 without errors, there would be no notice and people will keep using 1 in new applications as well and thus when running such an newly written application with an older libcurl – you’d be back to having the security problem again. So, we have the error there to improve the situation.

“a peer is someone like you […] a host is a server”

I’m a networking guy since 20+ years and I’m not used to people having a hard time to understand these terms. While perhaps there are rookies out in the world who don’t immediately understand some terms in the curl option names, should we really be criticized for that? I find that a hilarious critique. Also, these names were picked 13 years ago and we have them around for compatibility and API stability.

“why would you ever want to …”

Welcome to the real world. Why would an application author ever want to have these options to something else than just full check and no check? Because people and software development is a large world with many different desires and use case scenarios and curl is more widely used and abused than what many people consider. Lots of people have wanted something else than just a Yes/No to server cert verification. In fact, I’ve had many users ask for even more switches and fine-grained ways to fiddle with verification. Yes/No is a lay mans simplified view of certificate verification.

SEC-T curl slide

(This picture is the slide from the above picture, just zoomed and straightened out a bit.)

API age, stability and organic growth

We started working on libcurl in spring 1999, we added the CURLOPT_SSL_VERIFYPEER option in October 2000 and we added CURLOPT_SSL_VERIFYHOST in August 2001. All that quite a long time ago.

Then add thousands of hours, hundreds of hackers, thousands of applications, a user count that probably surpasses one billion users by now. Then also add the fact that option names are sticky in the way we write docs, examples pop up all over the internet and everyone who’s close to the project learns them by name and spirit and we quite simply grow attached to them and the way they work. Changing the name of an option is really painful and cause of a lot of confusion.

I’ve instead tried to more and more emphasize the functionality in the docs, to stress what the options do and how to do server cert verifications with curl the safe way.

I can’t force users to read docs. I can’t forbid users to blindly assume something and I’m not in control of, nor do I want to affect, the large population of third party bindings that exist for using on top of libcurl to cater for every imaginable programming language – and some of them may of course themselves have documentation problems and what not.

Would I change some of the APIs and names for options we have in libcurl if I would redo them today? Yes I would.

So what do we do about it?

I think this is the only really interesting question to take from all this. Everyone wants stable APIs. Everyone wants sensible and easy to understand APIs and as we can see they should also basically be possible to figure out without reading any documentation. And yet the API has to be powerful and flexible enough to be really useful for all those different applications.

At this point where we have these options that we do, when you’ve done your mud slinging and the finger of blame is firmly pointed at us. How exactly do you suggest we move forward to fix these claimed problems?

Taking it personally

Before anyone tells me to not take it personally: curl is my biggest hobby and a project I’ve spent many years and thousands of hours on. Of course I take it personally, otherwise I would’ve stopped working in the project a long time ago. This is personal to me. I give it my loving care and personal energy and then someone comes here and throw ill-founded and badly researched criticisms at me. I think criticizers of open source projects should learn to discuss the matters with the projects as their primary way instead of using it to make their conference presentations become more feisty.

Development, Open Source, Technology, Work

Snaxx delivers

September 17, 2014 Daniel Stenberg

Late in the year 1999 I quit my job. I handed over a signed paper where I wrote that I quit and then I started my new job first thing in the year 2000. I had a bunch of friends at the work I left and together with my closest friends (who coincidentally also switched jobs at roughly the same time) we decided we needed a way to keep in touch with friends that isn’t associated with our current employer.

The fix, the “employer independent” social thing to help us keep in touch with friends and colleagues in the industry, started on the last of February 2000. The 29th of February, since it was a leap year and that fact alone is a subject that itself must’ve been discussed at that meetup.

Snaxx was born.

Snaxx is getting a bunch of friends to a pub somewhere in Stockholm. Preferably a pub with lots of great beers and a sensible sound situation. That means as little music as possible and certainly no TVs or anything. We keep doing them at a pace of two or three per year or so.

Yesterday we had the 31st Snaxx and just under 30 guests showed up (that might actually have been the new all time high). We had many great beers, food and we argued over bug reporting, discussed source code formats, electric car charging, C64 nostalgia, mentioned Linux kernel debugging methods, how to transition from Erlang to javascript development and a whole load of other similarly very important topics. The Bishops Arms just happens to be a brand of pubs here that have a really sensible view on how to run pubs to be suitable for our events so yesterday we once again visited one of their places.

Thanks for a great time yesterday, friends! I’ll be setting up a date for number 32 soon. I figure it’ll be in the January 2015 time frame…If you want to get notified with an email, sign up yourself on the snaxx mailing list.

A few pictures from yesterday can be found on the Snaxx-31 G+ event page.

cURL and libcurl, Network, Open Source

Daladevelop hackathon

September 15, 2014 Daniel Stenberg

On Saturday the 13th of September, I took part in a hackathon in Falun Sweden organized by Daladevelop.

20-something hacker enthusiasts gathered in a rather large and comfortable room in this place, an almost three hour drive from my home. A number of talks and lectures were held through the day and the difficulty level ranged from newbie to more advanced. My own contribution was a talk about curl followed by one about HTTP/2. Blabbermouth as I am, I exhausted the friendly audience by talking a good total of almost 90 minutes straight. I got a whole range of clever and educated questions and I think and hope we all had a good time as a result.

The organizers ran a quiz for two-person teams. I teamed up with Andreas Olsson in team Emacs, and after having identified x86 assembly, written binary, spotted perl, named Ada Lovelace, used the term lightfoot and provided about 15 more answers we managed to get first prize and the honor of having beaten the others. Great fun!

Firefox, Mozilla, Network

HTTP/2 interop pains

September 2, 2014 Daniel Stenberg

At around 06:49 CEST on the morning of August 27 2014, Google deployed an HTTP/2 draft-14 implementation on their front-end servers that handle logins to Google accounts (and possibly others). Those at least take care of all the various login stuff you do with Google, G+, gmail, etc.

The little problem with that was just that their implementation of HTTP2 is in disagreement with all existing client implementations of that same protocol at that draft level. Someone immediately noticed this problem and filed a bug against Firefox.

The Firefox Nightly and beta versions have HTTP2 enabled by default and so users quickly started to notice this and a range of duplicate bug reports have been filed. And keeps being filed as more users run into this problem. As far as I know, Chrome does not have this enabled by default so much fewer Chrome users get this ugly surprise.

The Google implementation has a broken cookie handling (remnants from the draft-13 it looks like by how they do it). As I write this, we’re on the 7th day with this brokenness. We advice bleeding-edge users of Firefox to switch off HTTP/2 support in the mean time until Google wakes up and acts.

You can actually switch http2 support back on once you’ve logged in and it then continues to work fine. Below you can see what a lovely (wildly misleading) error message you get if you try http2 against Google right now with Firefox:

google-http2-draft14-cookies

This post is being debated on hacker news.

Updated: 20:14 CEST: There’s a fix coming, that supposedly will fix this problem on Thursday September 4th.

Update 2: In the morning of September 4th (my time), Google has reverted their servers to instead negotiate SPDY 3.1 and Firefox is fine with this.

Technology, Work

The “right” keyboard layout

August 20, 2014 Daniel Stenberg 6 Comments

I’ve never considered myself very picky about the particular keyboard I use for my machines. Sure, I work full-time and spare time in front of the same computer and thus I easily spend 2500-3000 hours a year in front of it but I haven’t thought much about it. I wish I had some actual stats on how many key-presses I do on my keyboard on an average day or year or so.

Then, one of these hot summer days this summer I left the roof window above my work place a little bit too much open when a very intense rain storm hit our neighborhood when I was away for a brief moment and to put it shortly, the huge amounts of water that poured in luckily only destroyed one piece of electronics for me: my trusty old keyboard. The keyboard I just randomly picked from some old computer without any consideration a bunch of years ago.

So the old was dead, I just picked another keyboard I had lying around.

But man, very soft rubber-style keys are very annoying to work with. Then I picked another with a weird layout and a control-key that required a little too much pressure to work for it to be comfortable. So, my race for a good enough keyboard had begun. Obviously I couldn’t just pick a random cheap new one and be happy with it.

Nordic key layout

That’s what they call it. It is even a Swedish layout, which among a few other details means it features å, ä and ö keys at a rather prominent place. See illustration. Those letters are used fairly frequently in our language. We have a few peculiarities in the Swedish layout that is downright impractical for programming, like how the {[]} – symbols all require AltGr pressed and slash, asterisk and underscore require Shift to be pressed etc. Still, I’v’e learned to program on such a layout so I’m quite used to those odd choices by now…

kb-nordic

Cursor keys

I want the cursor keys to be of “standard size”, have the correct location and relative positions. Like below. Also, the page up and page down keys should not be located close to the cursor keys (like many laptop keyboards do).

keyboard with marked cursorkeys

Page up and down

The page up and page down keys should instead be located in the group of six keys above the cursor keys. The group should have a little gap between it and the three keys (print screen, scroll lock and pause/break) above them so that finding the upper row is easy and quick without looking.

Backspace

I’m not really a good keyboard typist. I do a lot of mistakes and I need to use the backspace key quite a lot when doing so. Thus I’m a huge fan of the slightly enlarged backspace key layout so that I can find and hit that key easily. Also, the return key is a fairly important one so I like the enlarged and strangely shaped version of that as well. Pretty standard.

Further details

The Escape key should have a little gap below it so that I can find it easily without looking.

The Caps lock key is completely useless for locking caps is not something a normal person does, but it can be reprogrammed for other purposes. I’ve still refrained from doing so, mostly to not get accustomed to “weird” setups that makes it (even) harder for me to move between different keyboards at different places. Just recently I’ve configured it to work as ctrl – let’s see how that works out.

The F-keys are pretty useless. I use F5 sometimes to refresh web pages but as ctrl-r works just as well I don’t see a strong need for them in my life.

Numpad – a completely useless piece of the keyboard that I would love to get rid of – I never use any of those key. Never. Unfortunately I haven’t found any otherwise decent keyboards without the numpad.

Func KB-460

The Func KB-460 is the keyboard I ended up with this time in my search. It has some fun extra cruft such as two USB ports and a red backlight (that can be made to pulse). The backlight gave me extra points from my kids.

It is “mechanical” which obviously is some sort of thing among keyboards that has followers and is supposed to be very good. I remain optimistic about this particular model, even if there are a few minor things with it I haven’t yet gotten used to. I hope I’ll just get used to them.

This keyboard has Cherry MX Red linear switches.

How it could look

Based on my preferences and what keys I think I use, I figure an ideal keyboard layout for me could very well look like this:

my keyboard layout

Keyfreq

I have decided to go further and “scientifically” measure how I use my keyboard, which keys I use the most and similar data and metrics. Turns out the most common keylog program on Linux doesn’t log enough details, so I forked it and created keyfreq for this purpose. I’ll report details about this separately – soon.

I’m with Firefox OS!

August 13, 2014 Daniel Stenberg 8 Comments

Tablet

I have received a Firefox OS tablet as part of a development program. My plan is to use this device to try out stuff I work on and see how it behaves on Firefox OS “for real” instead of just in emulators or on other systems. While Firefox OS is a product of my employer Mozilla, I personally don’t work particularly much with Firefox OS specifically. I work on networking in general for Firefox, and large chunks of the networking stack is used in both the ordinary Firefox browser like on desktops as well as in Firefox OS. I hope to polish and improve networking on Firefox OS too over time.

Phone

The primary development device for Firefox OS is right now apparently the Flame phone, and I have one of these too now in my possession. I took a few photos when I unpacked it and crammed them into the same image, click it for higher res:

A brief explanation of Firefox OS

Firefox OS is an Android kernel (including drivers etc) and a bionic libc – simply the libc that Android uses. Linux-wise and slightly simplified, it runs a single application full-screen: Firefox, which then can run individual Firefox-apps that appears as apps on the phone. This means that the underlying fundamentals are shared with Android, while the layers over that are Firefox and then a world of HTML and javascript. Thus most of the network stack used for Firefox – that I work with – the http, ftp, dns, cookies and so forth is shared between Firefox for desktop and Firefox for Android and Firefox OS.

Firefox OS is made to use a small footprint to allow cheaper smartphones than Android itself can. Hence it is targeted to developing nations and continents.

Both my devices came with Firefox OS version 1.3 pre-installed.

The phone

The specs: Qualcomm Snapdragon 1.2GHZ dual-core processor, 4.5-inch 854×480 pixel screen, five-megapixel rear camera with auto-focus and flash, two-megapixel front-facing camera. Dual-SIM 3G, 8GB of onboard memory with a microSD slot, and a 1800 mAh capacity battery.

The Flame phone should be snappy enough although at times it seems to take a moment too long to populate a newly shown screen with icons etc. The screen surface is somehow not as smooth as my Nexus devices (we have the 4,5,7,10 nexuses in the house), leaving me with a constant feeling the screen isn’t cleaned.

Its dual-sim support is something that seems ideal for traveling etc to be able to use my home sim for incoming calls but use a local sim for data and outgoing calls… I’ve never had a phone featuring that before. I’ve purchased a prepaid SIM-card to use with this phone as my secondary device.

Some Good

I like the feel of the tablet. It feels like a solid and sturdy 10″ tablet, just like it should. I think the design language of Firefox OS for a newbie such as myself is pleasing and good-looking. The quad-core 1GHz thing is certainly fast enough CPU-wise to eat most of what you can throw at it.

These are really good devices to do web browsing on as the browser is a highly capable and fast browser.

Mapping: while of course there’s Google maps app, using the openstreetmap map is great on the device and Google maps in the browser is also a perfectly decent way to view maps. Using openstreetmap also of course has the added bonus that it feels great to see your own edits in your own neck of the woods!

I really appreciate that Mozilla pushes for new, more and better standardized APIs to enable all of this to get done in web applications. To me, this is one of the major benefits with Firefox OS. It benefits all of us who use the web.

Some Bad

Firefox OS feels highly US-centric (which greatly surprised me, seeing the primary markets for Firefox OS are certainly not in the US). As a Swede, I of course want my calendar to show Monday as the first day of the week. No can do. I want my digital clock to show me the time using 24 hour format (the am/pm scheme only confuses me). No can do. Tiny teeny details in the grand scheme of things, yes, but annoying. Possibly I’m just stupid and didn’t find how to switch these settings, but I did look for them on both my devices.

The actual Firefox OS system feels like a scaled-down Android where all apps are simpler and less fancy than Android. There’s a Facebook “app” for it that shows Facebook looking much crappier than it usually does in a browser or in the Android app – although on the phone it looked much better than on the tablet for some reason that I don’t understand.

I managed to get the device to sync my contacts from Google (even with my google 2-factor auth activated) but trying to sync my Facebook contacts just gave me a very strange error window in spite of repeated attempts, but again that worked on my phone!

I really miss a proper back button! Without it, we end up in this handicapped iphone-like world where each app has to provide a back button in its own UI or I have to hit the home button – which doesn’t just go back one step.

The tablet supports a gesture, pull up from the button of the screen, to get to the home screen while the phone doesn’t support that but instead has a dedicated home button which if pressed a long time shows up cards with all currently running apps. I’m not even sure how to do that latter operation on the tablet as it doesn’t’ have a home button.

The gmail web interface and experience is not very good on either of the devices.

Building Firefox OS

I’ve only just started this venture and dipped my toes in that water. All code is there in the open and you build it all with open tools. I might get back on this topic later if I get the urge to ventilate something from it… 🙂 I didn’t find any proper device specific setup for the tablet, but maybe I just don’t know its proper code word and I’ve only given it a quick glance so far. I’ll do my first builds and installs for the phone. Any day now!

My seven year old son immediately found at least one game on my dev phone (he actually found the market and downloaded it all by himself the first time he tried the device) that he really likes and now he wants to borrow this from time to time to play that game – in competition with the android phones and tablets we have here already. A pretty good sign I’d say.

Firefox OS is already a complete and competent phone operating system and app ecosystem. If you’re not coming from Android or Iphone it is a step up from everything else. If you do come from Android or Iphone I think you have to accept that this is meant for the lower end spectrum of smart-phones.

I think the smart-phone world can use more competition and Firefox OS brings exactly that.

firefox-os-bootscreen

Mozilla, Network, Web

Firefox and partial content

June 16, 2014 Daniel Stenberg

Update: parts of the change mentioned in this blog post has subsequently been reverted since clearly I had a too positive view of the Internet.

Firefox Ball One of the first bugs that fell into my lap when I started working for Mozilla not a very long time ago, was bug 237623. Anyone involved in Mozilla knows a bug in that range is fairly old (we just recently passed one million filed bugs). This particular bug was filed in March 2004 and there are (right now) 26 other bugs marked as duplicates of this. Today, the fix for this problem has landed.

The core of the problem is that when a HTTP server sends contents back to a client, it can send a header along indicating the size of the data in the response. The header is called “Content-Length:”. If the connection gets broken during transfer for whatever reason and the browser hasn’t received as much data as was initially claimed to be delivered, that’s a very good hint that something is wrong and the transfer was incomplete.

The perhaps most annoying way this could be seen is when you download a huge DVD image or something and for some reason the connection gets cut off after only a short time, way before the entire file is downloaded, but Firefox just silently accept that as the end of the transfer and think everything was fine and dandy.

What complicates the issue is the eternal problem: not everything abides to the protocol. This said, if there are frequent violators of the protocol we can’t strictly fail on each case of problem we detect but we must instead do our best to handle it anyway.

Is Content-Length a frequently violated HTTP response header?

Let’s see…

Back in the HTTP 1.0 days, the Content-Length header was not very important as the connection was mostly shut down after each response anyway. Alas, clients/browsers would swiftly learn to just wait for the disconnect anyway.
Back in the old days, there were cases of problems with “large files” (files larger than 2 or 4GB) which every now and then caused the Content-Length: header to turn into negative or otherwise confused values when it wrapped. That’s not really happening these days anymore.
With HTTP 1.1 and its persuasive use of persistent connections it is important to get the size right, as otherwise the chain of requests get messed up and we end up with tears and sad faces
In curl‘s HTTP parser we’ve always been strictly abiding to this header and we’ve bailed out hard on mismatches. This is a very rare error for users to get and based on this (admittedly unscientific data) I believe that there is not a widespread use of servers sending bad Content-Length headers.
It seems Chrome at least in some aspects is already much more strict about this header.

My fix for this problem takes a slightly careful approach and only enforces the strictness for HTTP 1.1 or later servers. But then as a bonus, it has grown to also signal failure if a chunked encoded transfer ends without the ending trailer or if a SPDY or http2 transfer gets prematurely stopped.

This is basically a 6-line patch at its core. The rest is fixing up old test cases, added new tests etc.

As a counter-point, Eric Lawrence apparently worked on adding stricter checks in IE9 three years ago as he wrote about in Content-Length in the Real World. They apparently subsequently added the check again in IE10 which seems to have caused some problems for them. It remains to be seen how this change affects Firefox users out in the real world. I believe it’ll be fine.

This patch also introduces the error code for a few other similar network situations when the connection is closed prematurely and we know there are outstanding data that never arrived, and I got the opportunity to improve how Firefox behaves when downloading an image and it gets an error before the complete image has been transferred. Previously (when a partial transfer wasn’t an error), it would always throw away the image on an error and instead show the “image not found” picture. That really doesn’t make sense I believe, as a partial image is better than that default one – especially when a large portion of the image has been downloaded already.

Follow-up effects

Other effects of this change that possibly might be discovered and cause some new fun reports: prematurely cut off transfers of javascript or CSS will discard the entire javascript/CSS file. Previously the partial file would be used.

Of course, I doubt that these are the files that are as commonly cut off as many other file types but still on a very slow and bad connection it may still happen and the new behavior will make Firefox act as if the file wasn’t loaded at all, instead of previously when it would happily used the portions of the files that it had actually received. Partial CSS and partial javascript of course could lead to some “fun” effects of brokenness.

cURL and libcurl, Mozilla, Network, Web

Http2 interim meeting NYC

June 8, 2014 Daniel Stenberg 5 Comments

On June 5th, around thirty people sat down around a huge table in a conference room on the 4th floor in the Google offices in New York City, with a heavy rain pouring down outside.

It was time for another IETF http2 interim meeting. The attendees were all participants in the HTTPbis work group and came from a wide variety of companies and countries. The major browser vendors were represented there, and so were operators and big service providers and some proxy people. Most of the people who have been speaking up on the mailing list over the last year or so, unfortunately with a couple of people notably absent. (And before anyone asks, yes we are a group where the majority is old males like me.)

Most people present knew many of the others already, which helped to create a friendly familiar spirit and we quickly got started on the Thursday morning working our way through the rather long lits of issues to deal with. When we had our previous interim meeting in London, I think most of us though we would’ve been further along today but recent development and discussions on the list had actually brought back a lot of issues we though we were already done with and we now reiterated a whole slew of subjects. We weren’t allowed to take photographs indoors so you won’t see any pictures of this opportunity from me here.

We did close many issues and I’ll just quickly mention some of the noteworthy ones here…

Extensions

We started out with the topic of “extensions”. Should we revert the decision from Zurich (where it was decided that we shouldn’t allow extensions in http2) or was the current state of the protocol the right one? The arguments for allowing extensions included that we’d keep getting requests for new things to add unless we have a way and that some of the recent stuff we’ve added really could’ve been done as extensions instead. An argument against it is that it makes things much simpler and reliable if we just document exactly what the protocol has and is, and removing “optional” behavior from the protocol has been one of the primary mantas along the design process.

The discussion went back and forth for a long time, and after almost three hours we had kind of a draw. Nobody was firmly against “the other” alternative but the two sides also seemed to have roughly the same amount of support. Then it was yet again time for the coin toss to guide us. Martin brought out an Australian coin and … the next protocol draft will allow extensions. Again. This also forces implementation to have to read and skip all unknown frames it receives compared to the existing situation where no unknown frames can ever occur.

BLOCKED as an extension

A rather given first candidate for an extension was the BLOCKED frame. At the time BLOCKED was added to the protocol it was explicitly added into the spec because we didn’t have extensions – and it is now being lifted out into one.

ALTSVC as an extension

What received slightly more resistance was the move to move out the ALTSVC frame as well. It was argued that the frame isn’t mandatory to support and therefore easily can be made into an extension.

Simplified padding

Another small change of the wire format since draft-12 was the removal of the high byte for padding to simplify. It reduces the amount you can pad a single frame but you can easily pad more using other means if you really have to, and there were numbers presented that said that 255 bytes were enough with HTTP 1.1 already so probably it will be enough for version 2 as well.

Schedule

There will be a new draft out really soon: draft -13. Martin, our editor of the spec, says he’ll be able to ship it in a week. That is intended to be the last draft, intended for implementation and it will then be expected to get deployed rather widely to allow us all in the industry to see how it works and be able to polish details or wordings that may still need it.

We had numerous vendors and HTTP stack implementers in the room and when we discussed schedule for when various products will be able to see daylight. If we all manage to stick to the plans. we may just have plenty of products and services that support http2 by the September/October time frame. If nothing major is found in this latest draft, we’re looking at RFC status not too far into 2015.

Meeting summary

I think we’re closing in for real now and I have good hopes for the protocol and our progress to a really wide scale deployment across the Internet. The HTTPbis group is an awesome crowd to work with and I had a great time. Our hosts took good care of us and made sure we didn’t lack any services or supplies. Extra thanks go to those of you who bought me dinners and to those who took me out to good beer places!

My http2 document

Yeah, it will now become somewhat out of date and my plan is to update it once the next draft ships. I’ll also do another http2 presentation already this week so I hope to also post an updated slide set soonish. Stay tuned!

Wireshark

My plan is to cooperate with the other Wireshark hackers and help making sure we have the next draft version supported in Wireshark really soon after its published.

curl and nghttp2

Most of the differences introduced are in the binary format so nghttp2 will need to be updated again – it is the library curl uses for the wire format of http2. The curl parts will need some adjustments, for example for Content-Encoding gzip that no longer is implicit but there should be little to do in the curl code for this draft bump.

Network

Bye bye RFC 2616

June 7, 2014 Daniel Stenberg

In August 2007 the IETF HTTPbis work group started to make an update to the HTTP 1.1 specification RFC 2616 (from June 1999) which already was an update to RFC 2068 from 1996. I wasn’t part of the effort back then so I didn’t get to hear the back chatter or what exactly the expectations were on delivery time and time schedule, but I’m pretty sure nobody thought it would take almost seven long years for the update to reach publication status.

On June 6 2014 when RFC 7230 – RFC 7235 were released, the single 176 page document has turned into 6 documents with a total size that is now much larger, and there’s also a whole slew of additional related documents released at the same time.

2616 is deeply carved into my brain so it’ll take some time until I unlearn that, plus the fact that now we need to separate our pointers to one of those separate document instead of just one generic number for the whole thing. Source codes and documents all over now need to be carefully updated to instead refer to the new documents.

And the HTTP/2 work continues to progress at high speed. More about that in a separate blog post soon.

More details on the road from RFC2616 until today can be found in Mark Nottingham’s RFC 2616 is dead.