Archive for the ‘Mozilla’ Category

Changing networks with Firefox running

Friday, September 26th, 2014

Short recap: I work on network code for Mozilla. Bug 939318 is one of “mine” – yesterday I landed a fix (a patch series with 6 individual patches) for this and I wanted to explain what goodness that should (might?) come from this!

diffstat

diffstat reports this on the complete patch series:

29 files changed, 920 insertions(+), 162 deletions(-)

The change set can be seen in mozilla-central here. But I guess a proper description is easier for most…

The bouncy road to inclusion

This feature set and associated problems with it has been one of the most time consuming things I’ve developed in recent years, I mean in relation to the amount of actual code produced. I’ve had it “landed” in the mozilla-inbound tree five times and yanked out again before it landed correctly (within a few hours), every time of course reverted again because I had bugs remaining in there. The bugs in this have been really tricky with a whole bunch of timing-dependent and race-like problems and me being unfamiliar with a large part of the code base that I’m working on. It has been a highly frustrating journey during periods but I’d like to think that I’ve learned a lot about Firefox internals partly thanks to this resistance.

As I write this, it has not even been 24 hours since it got into m-c so there’s of course still a risk there’s an ugly bug or two left, but then I also hope to fix the pending problems without having to revert and re-apply the whole series…

Many ways to connect to networks

Firefox Nightly screenshotIn many network setups today, you get an environment and a network “experience” that is crafted for that particular place. For example you may connect to your work over a VPN where you get your company DNS and you can access sites and services you can’t even see when you connect from the wifi in your favorite coffee shop. The same thing goes for when you connect to that captive portal over wifi until you realize you used the wrong SSID and you switch over to the access point you were supposed to use.

For every one of these setups, you get different DHCP setups passed down and you get a new DNS server and so on.

These days laptop lids are getting closed (and the machine is put to sleep) at one place to be opened at a completely different location and rarely is the machine rebooted or the browser shut down.

Switching between networks

Switching from one of the networks to the next is of course something your operating system handles gracefully. You can even easily be connected to multiple ones simultaneously like if you have both an Ethernet card and wifi.

Enter browsers. Or in this case let’s be specific and talk about Firefox since this is what I work with and on. Firefox – like other browsers – will cache images, it will cache DNS responses, it maintains connections to sites a while even after use, it connects to some sites even before you “go there” and so on. All in the name of giving the users an as good and as fast experience as possible.

The combination of keeping things cached and alive, together with the fact that switching networks brings new perspectives and new “truths” offers challenges.

Realizing the situation is new

The changes are not at all mind-bending but are basically these three parts:

  1. Make sure that we detect network changes, even if just the set of available interfaces change. Send an event for this.
  2. Make sure the necessary parts of the code listens and understands this “network topology changed” event and acts on it accordingly
  3. Consider coming back from “sleep” to be a network changed event since we just cannot be sure of the network situation anymore.

The initial work has been made for Windows only but it allows us to smoothen out any rough edges before we continue and make more platforms support this.

The network changed event can be disabled by switching off the new “network.notify.changed” preference. If you do end up feeling a need for that, I really hope you file a bug explaining the details so that we can work on fixing it!

Act accordingly

So what is acting properly? What if the network changes in a way so that your active connections suddenly can’t be used anymore due to the new rules and routing and what not? We attack this problem like this: once we get a “network changed” event, we “allow” connections to prove that they are still alive and if not they’re torn down and re-setup when the user tries to reload or whatever. For plain old HTTP(S) this means just seeing if traffic arrives or can be sent off within N seconds, and for websockets, SPDY and HTTP2 connections it involves sending an actual ping frame and checking for a response.

The internal DNS cache was a bit tricky to handle. I initially just flushed all entries but that turned out nasty as I then also killed ongoing name resolves that caused errors to get returned. Now I instead added logic that flushes all the already resolved names and it makes names “in transit” to get resolved again so that they are done on the (potentially) new network that then can return different addresses for the same host name(s).

This should drastically reduce the situation that could happen before when Firefox would basically just freeze and not want to do any requests until you closed and restarted it. (Or waited long enough for other timeouts to trigger.)

The ‘N seconds’ waiting period above is actually 5 seconds by default and there’s a new preference called “network.http.network-changed.timeout” that can be altered at will to allow some experimentation regarding what the perfect interval truly is for you.

Firefox BallInitially on Windows only

My initial work has been limited to getting the changed event code done for the Windows back-end only (since the code that figures out if there’s news on the network setup is highly system specific), and now when this step has been taken the plan is to introduce the same back-end logic to the other platforms. The code that acts on the event is pretty much generic and is mostly in place already so it is now a matter of making sure the event can be generated everywhere.

My plan is to start on Firefox OS and then see if I can assist with the same thing in Firefox on Android. Then finally Linux and Mac.

I started on Windows since Windows is one of the platforms with the largest amount of Firefox users and thus one of the most prioritized ones.

More to do

There’s separate work going on for properly detecting captive portals. You know the annoying things hotels and airports for example tend to have to force you to do some login dance first before you are allowed to use the internet at that location. When such a captive portal is opened up, that should probably qualify as a network change – but it isn’t yet.

daniel.haxx.se week #3

Monday, September 22nd, 2014

I won’t keep posting every video update here, but I mostly wanted to mention that I’ve kept posting a weekly video over at youtube basically explaining what’s going on right now within my dearest projects. Mostly curl and some Firefox stuff.

This week: libcurl server cert verification API got a bashing at SEC-T, is HTTP for UDP a good idea? How about adding HTTP cache support to libcurl? HTTP/2 is getting deployed as we speak. Interesting curl bug when used by XBMC. The patch series for Firefox bug 939318 is improving slowly – will it ever land?

Video perhaps?

Monday, September 8th, 2014

I decided to try to do a short video about my current work right now and make it available for you all. I try to keep it short (5-7 minutes) and I’m certainly no pro at it, but I will try to make a weekly one for a while and see if it gets any fun. I’m going to read your comments and responses to this very eagerly and that will help me decide how I will proceed on this experiment.

Enjoy.

HTTP/2 interop pains

Tuesday, September 2nd, 2014

At around 06:49 CEST on the morning of August 27 2014, Google deployed an HTTP/2 draft-14 implementation on their front-end servers that handle logins to Google accounts (and possibly others). Those at least take care of all the various login stuff you do with Google, G+, gmail, etc.

The little problem with that was just that their implementation of HTTP2 is in disagreement with all existing client implementations of that same protocol at that draft level. Someone immediately noticed this problem and filed a bug against Firefox.

The Firefox Nightly and beta versions have HTTP2 enabled by default and so users quickly started to notice this and a range of duplicate bug reports have been filed. And keeps being filed as more users run into this problem. As far as I know, Chrome does not have this enabled by default so much fewer Chrome users get this ugly surprise.

The Google implementation has a broken cookie handling (remnants from the draft-13 it looks like by how they do it). As I write this, we’re on the 7th day with this brokenness. We advice bleeding-edge users of Firefox to switch off HTTP/2 support in the mean time until Google wakes up and acts.

You can actually switch http2 support back on once you’ve logged in and it then continues to work fine. Below you can see what a lovely (wildly misleading) error message you get if you try http2 against Google right now with Firefox:

google-http2-draft14-cookies

This post is being debated on hacker news.

Updated: 20:14 CEST: There’s a fix coming, that supposedly will fix this problem on Thursday September 4th.

Update 2: In the morning of September 4th (my time), Google has reverted their servers to instead negotiate SPDY 3.1 and Firefox is fine with this.

Firefox OS Flatfish Bluedroid fix

Friday, August 29th, 2014

Hey, when I just built my own Firefox OS (b2g) image for my Firefox OS Tablet (flatfish) straight from the latest sources, I ran into this (known) problem:

Can't find necessary file(s) of Bluedroid in the backup-flatfish folder.
Please update the system image for supporting Bluedroid (Bug-986314),
so that the needed binary files can be extracted from your flatfish device.

So, as I struggled to figure out the exact instructions on how to proceed from this, I figured I should jot down what I did in the hopes that it perhaps will help a fellow hacker at some point:

  1. Download the 3 *.img files from the dropbox site that is referenced from bug 986314.
  2. Download the flash-flatfish.sh script from the same dropbox place
  3. Make sure you have ‘fastboot’ installed (I’m mentioning this here because it turned out I didn’t and yet I have already built and flashed my Flame phone successfully without having it). “apt-get install android-tools-fastboot” solved it for me. Note that if it isn’t installed, the flash-flatfish.sh script will claim that the device is not in fastboot mode and stop with an error message saying so.
  4. Finally: run the script “./flash-flatfish.sh [dir with the 3 .img files]“
  5. Once it has succeeded, the tablet reboots
  6. Remove the backup-flatfish directory in the build dir.
  7. Restart the flatfish build again and now it should get passed that Bluedroid nit

Enjoy!

My home setup

Monday, August 25th, 2014

I work in my home office which is upstairs in my house, perhaps 20 steps from my kitchen and the coffee refill. I have a largish desk with room for a number of computers. The photo below shows the three meter beauty. My two kids have their two machines on the left side while I use the right side of it for my desktop and laptop.

Daniel's home office

Many computers

The kids use my old desktop computer with a 20″ Dell screen and my old 15.6″ dual-core Asus laptop. My wife has her laptop downstairs and we have a permanent computer installed underneath the TV for media (an Asus VivoPC).

My desktop computer

I’m primarily developing C and C++ code and I’m frequently compiling rather large projects – repeatedly. I use a desktop machine for my ordinary development, equipped with a fairly powerful 3.5GHz quad-core Core-I7 CPU, I have my OS, my home dir and all source code put on an SSD. I have a larger HDD for larger and slower content. With ccache and friends, this baby can build Firefox really fast. I put my machine together from parts myself as I couldn’t find a suitable one focused on horse power but yet a “normal” 2D graphics card that works Fractal Designfine with Linux. I use a Radeon HD 5450 based ASUS card, which works fine with fully open source drivers.

I have two basic 24 inch LCD monitors (Benq and Dell) both using 1920×1200 resolution. I like having lots of windows up, nothing runs full-screen. I use KDE as desktop and I edit everything in Emacs. Firefox is my primary browser. I don’t shut down this machine, it runs a few simple servers for private purposes.

My machines (and my kids’) all run Debian Linux, typically of the unstable flavor allowing me to get new code reasonably fast.

Func KB-460 keyboardMy desktop keyboard is a Func KB-460, mechanical keyboard with some funky extra candy such as red backlight and two USB ports. Both my keyboard and my mouse are wired, not wireless, to take away the need for batteries or recharging etc in this environment. My mouse is a basic and old Logitech MX 310.

I have a crufty old USB headset with a mic, that works fine for hangouts and listening to music when the rest of the family is home. I have Logitech webcam thing sitting on the screen too, but I hardly ever use it for anything.

When on the move

I need to sometimes move around and work from other places. Going to conferences or even our regular Mozilla work weeks. Hence I also have a laptop that is powerful enough to build Firefox is a sane amount of time. I have Lenovo Thinkpad w540a Lenovo Thinkpad W540 with a 2.7GHz quad-core Core-I7, 16GB of RAM and 512GB of SSD. It has the most annoying touch pad on it. I don’t’ like that it doesn’t have the explicit buttons so for example both-clicking (to simulate a middle-click) like when pasting text in X11 is virtually impossible.

On this machine I also run a VM with win7 installed and associated development environment so I can build and debug Firefox for Windows on it.

I have a second portable. A small and lightweight netbook, an Eeepc S101, 10.1″ that I’ve been using when I go and just do presentations at places but recently I’ve started to simply use my primary laptop even for those occasions – primarily because it is too slow to do anything else on.

I do video conferences a couple of times a week and we use Vidyo for that. Its Linux client is shaky to say the least, so I tend to use my Nexus 7 tablet for it since the Vidyo app at least works decently on that. It also allows me to quite easily change location when it turns necessary, which it sometimes does since my meetings tend to occur in the evenings and then there’s also varying amounts of “family activities” going on!

Backup

For backup, I have a Synology NAS equipped with 2TB of disk in a RAIDSynology DS211j stashed downstairs, on the wired in-house gigabit ethernet. I run an rsync job every night that syncs the important stuff to the NAS and I run a second rsync that also mirrors relevant data over to a friends house just in case something terribly bad would go down. My NAS backup has already saved me really good at least once.

Printer

HP Officejet 8500ANext to the NAS downstairs is the house printer, also attached to the gigabit even if it has a wifi interface of its own. I just like increasing reliability to have the “fixed services” in the house on wired network.

The printer also has scanning capability which actually has come handy several times. The thing works nicely from my Linux machines as well as my wife’s windows laptop.

Internet

fiber cableI have fiber going directly into my house. It is still “just” a 100/100 connection in the other end of the fiber since at the time I installed this they didn’t yet have equipment to deliver beyond 100 megabit in my area. I’m sure I’ll upgrade this to something more impressive in the future but this is a pretty snappy connection already. I also have just a few milliseconds latency to my primary servers.

Having the fast uplink is perfect for doing good remote backups.

RouterĀ  and wifi

dlink DIR 635I have a lowly D-Link DIR 635 router and wifi access point providing wifi for the 2.4GHz and 5GHz bands and gigabit speed on the wired side. It was dead cheap it just works. It NATs my traffic and port forwards some ports through to my desktop machine.

The router itself can also update the dyndns info which ultimately allows me to use a fixed name to my home machine even without a fixed ip.

Frequent Wifi users in the household include my wife’s laptop, the TV computer and all our phones and tablets.

Telephony

Ping Communication Voice Catcher 201EWhen I installed the fiber I gave up the copper connection to my home and since then I use IP telephony for the “land line”. Basically a little box that translates IP to old phone tech and I keep using my old DECT phone. We basically only have our parents that still call this number and it has been useful to have the kids use this for outgoing calls up until they’ve gotten their own mobile phones to use.

It doesn’t cost very much, but the usage is dropping over time so I guess we’ll just give it up one of these days.

Mobile phones and tablets

I have a Nexus 5 as my daily phone. I also have a Nexus 7 and Nexus 10 that tend to be used by the kids mostly.

I have two Firefox OS devices for development/work.

I’m with Firefox OS!

Wednesday, August 13th, 2014

Tablet

I have received a Firefox OS tablet as part of a development program. My plan is to use this device to try out stuff I work on and see how it behaves on Firefox OS “for real” instead of just in emulators or on other systems. While Firefox OS is a product of my employer Mozilla, I personally don’t work particularly much with Firefox OS specifically. I work on networking in general for Firefox, and large chunks of the networking stack is used in both the ordinary Firefox browser like on desktops as well as in Firefox OS. I hope to polish and improve networking on Firefox OS too over time.

Firefox OS tablet

Phone

The primary development device for Firefox OS is right now apparently the Flame phone, and I have one of these too now in my possession. I took a few photos when I unpacked it and crammed them into the same image, click it for higher res:

Flame - Firefox OS phone

A brief explanation of Firefox OS

Firefox OS is an Android kernel (including drivers etc) and a bionic libc – simply the libc that Android uses. Linux-wise and slightly simplified, it runs a single application full-screen: Firefox, which then can run individual Firefox-apps that appears as apps on the phone. This means that the underlying fundamentals are shared with Android, while the layers over that are Firefox and then a world of HTML and javascript. Thus most of the network stack used for Firefox – that I work with – the http, ftp, dns, cookies and so forth is shared between Firefox for desktop and Firefox for Android and Firefox OS.

Firefox OS is made to use a small footprint to allow cheaper smartphones than Android itself can. Hence it is targeted to developing nations and continents.

Both my devices came with Firefox OS version 1.3 pre-installed.

The phone

The specs: Qualcomm Snapdragon 1.2GHZ dual-core processor, 4.5-inch 854×480 pixel screen, five-megapixel rear camera with auto-focus and flash, two-megapixel front-facing camera. Dual-SIM 3G, 8GB of onboard memory with a microSD slot, and a 1800 mAh capacity battery.

The Flame phone should be snappy enough although at times it seems to take a moment too long to populate a newly shown screen with icons etc. The screen surface is somehow not as smooth as my Nexus devices (we have the 4,5,7,10 nexuses in the house), leaving me with a constant feeling the screen isn’t cleaned.

Its dual-sim support is something that seems ideal for traveling etc to be able to use my home sim for incoming calls but use a local sim for data and outgoing calls… I’ve never had a phone featuring that before. I’ve purchased a prepaid SIM-card to use with this phone as my secondary device.

Some Good

I like the feel of the tablet. It feels like a solid and sturdy 10″ tablet, just like it should. I think the design language of Firefox OS for a newbie such as myself is pleasing and good-looking. The quad-core 1GHz thing is certainly fast enough CPU-wise to eat most of what you can throw at it.

These are really good devices to do web browsing on as the browser is a highly capable and fast browser.

Mapping: while of course there’s Google maps app, using the openstreetmap map is great on the device and Google maps in the browser is also a perfectly decent way to view maps. Using openstreetmap also of course has the added bonus that it feels great to see your own edits in your own neck of the woods!

I really appreciate that Mozilla pushes for new, more and better standardized APIs to enable all of this to get done in web applications. To me, this is one of the major benefits with Firefox OS. It benefits all of us who use the web.

Some Bad

Firefox OS feels highly US-centric (which greatly surprised me, seeing the primary markets for Firefox OS are certainly not in the US). As a Swede, I of course want my calendar to show Monday as the first day of the week. No can do. I want my digital clock to show me the time using 24 hour format (the am/pm scheme only confuses me). No can do. Tiny teeny details in the grand scheme of things, yes, but annoying. Possibly I’m just stupid and didn’t find how to switch these settings, but I did look for them on both my devices.

The actual Firefox OS system feels like a scaled-down Android where all apps are simpler and less fancy than Android. There’s a Facebook “app” for it that shows Facebook looking much crappier than it usually does in a browser or in the Android app – although on the phone it looked much better than on the tablet for some reason that I don’t understand.

I managed to get the device to sync my contacts from Google (even with my google 2-factor auth activated) but trying to sync my Facebook contacts just gave me a very strange error window in spite of repeated attempts, but again that worked on my phone!

I really miss a proper back button! Without it, we end up in this handicapped iphone-like world where each app has to provide a back button in its own UI or I have to hit the home button – which doesn’t just go back one step.

The tablet supports a gesture, pull up from the button of the screen, to get to the home screen while the phone doesn’t support that but instead has a dedicated home button which if pressed a long time shows up cards with all currently running apps. I’m not even sure how to do that latter operation on the tablet as it doesn’t’ have a home button.

The gmail web interface and experience is not very good on either of the devices.

Building Firefox OS

I’ve only just started this venture and dipped my toes in that water. All code is there in the open and you build it all with open tools. I might get back on this topic later if I get the urge to ventilate something from it… :-) I didn’t find any proper device specific setup for the tablet, but maybe I just don’t know its proper code word and I’ve only given it a quick glance so far. I’ll do my first builds and installs for the phone. Any day now!

More

My seven year old son immediately found at least one game on my dev phone (he actually found the market and downloaded it all by himself the first time he tried the device) that he really likes and now he wants to borrow this from time to time to play that game – in competition with the android phones and tablets we have here already. A pretty good sign I’d say.

Firefox OS is already a complete and competent phone operating system and app ecosystem. If you’re not coming from Android or Iphone it is a step up from everything else. If you do come from Android or Iphone I think you have to accept that this is meant for the lower end spectrum of smart-phones.

I think the smart-phone world can use more competition and Firefox OS brings exactly that.

firefox-os-bootscreen

I’m eight months in on my Mozilla adventure

Tuesday, August 5th, 2014

I started working for Mozilla in January 2014. Here’s some reflections from my first time as Mozilla employee.

Working from home

I’ve worked completely from home during some short periods before in my life so I had an idea what it would be like. So far, it has been even better than I had anticipated. It suits me so well it is almost scary! No commutes. No delays due to traffic. No problems ever with over-crowded trains or buses. No time wasted going to work and home again. And I’m around when my kids get home from school and it’s easy to receive deliveries all days. I don’t think I ever want to work elsewhere again… :-)

Another effect of my work place is also that I probably have become somewhat more active on social networks and IRC. If I don’t use those means, I may spent whole days without talking to any humans.

Also, I’m the only Mozilla developer in Sweden – although we have a few more employees in Sweden. (Update: apparently this is wrong and there’s’ also a Mats here!)

Daniel's home office

The freedom

I have freedom at work. I control and decide a lot of what I do and I get to do a lot of what I want at work. I can work during the hours I want. As long as I deliver, my employer doesn’t mind. The freedom isn’t just about working hours but I also have a lot of control and saying about what I want to work on and what I think we as a team should work on going further.

The not counting hours

For the last 16 years I’ve been a consultant where my customers almost always have paid for my time. Paid by the hour I spent working for them. For the last 16 years I’ve counted every single hour I’ve worked and made sure to keep detailed logs and tracking of whatever I do so that I can present that to the customer and use that to send invoices. Counting hours has been tightly integrated in my work life for 16 years. No more. I don’t count my work time. I start work in the morning, I stop work in the evening. Unless I work longer, and sometimes I start later. And sometimes I work on the weekend or late at night. And I do meetings after regular “office hours” many times. But I don’t keep track – because I don’t have to and it would serve no purpose!

The big code base

I work with Firefox, in the networking team. Firefox has about 10 million lines C and C++ code alone. Add to that everything else that is other languages, glue logic, build files, tests and lots and lots of JavaScript.

It takes time to get acquainted with such a large and old code base, and lots of the architecture or traces of the original architecture are also designed almost 20 years ago in ways that not many people would still call good or preferable.

Mozilla is using Mercurial as the primary revision control tool, and I started out convinced I should too and really get to learn it. But darn it, it is really too similar to git and yet lots of words are intermixed and used as command but they don’t do the same as for git so it turns out really confusing and yeah, I felt I got handicapped a little bit too often. I’ve switched over to use the git mirror and I’m now a much happier person. A couple of months in, I’ve not once been forced to switch away from using git. Mostly thanks to fancy scripts and helpers from fellow colleagues who did this jump before me and already paved the road.

C++ and code standards

I’m a C guy (note the absence of “++”). I’ve primarily developed in C for the whole of my professional developer life – which is approaching 25 years. Firefox is a C++ fortress. I know my way around most C++ stuff but I’m not “at home” with C++ in any way just yet (I never was) so sometimes it takes me a little time and reading up to get all the C++-ishness correct. Templates, casting, different code styles, subtleties that isn’t in C and more. I’m slowly adapting but some things and habits are hard to “unlearn”…

The publicness and Bugzilla

I love working full time for an open source project. Everything I do during my work days are public knowledge. We work a lot with Bugzilla where all (well except the security sensitive ones) bugs are open and public. My comments, my reviews, my flaws and my patches can all be reviewed, ridiculed or improved by anyone out there who feels like doing it.

Development speed

There are several hundred developers involved in basically the same project and products. The commit frequency and speed in which changes are being crammed into the source repository is mind boggling. Several hundred commits daily. Many hundred and sometimes up to a thousand new bug reports are filed – daily.

yet slowness of moving some bugs forward

Moving a particular bug forward into actually getting it land and included in pending releases can be a lot of work and it can be tedious. It is a large project with lots of legacy, traditions and people with opinions on how things should be done. Getting something to change from an old behavior can take a whole lot of time and massaging and discussions until they can get through. Don’t get me wrong, it is a good thing, it just stands in direct conflict to my previous paragraph about the development speed.

In the public eye

I knew about Mozilla before I started here. I knew Firefox. Just about every person I’ve ever mentioned those two brands to have known about at least Firefox. This is different to what I’m used to. Of course hardly anyone still fully grasp what I’m actually doing on a day to day basis but I’ve long given up on even trying to explain that to family and friends. Unless they really insist.

Vitriol and expectations of high standards

I must say that being in the Mozilla camp when changes are made or announced has given me a less favorable view on the human race. Almost anything or any chance is received by a certain amount of users that are very aggressively against the change. All changes really. “If you’ll do that I’ll be forced to switch to Chrome” is a very common “threat” – as if that would A) work B) be a browser that would care more about such “conservative loonies” (you should consider that my personal term for such people)). I can only assume that the Chrome team also gets a fair share of that sort of threats in the other direction…

Still, it seems a lot of people out there and perhaps especially in the Free Software world seem to hold Mozilla to very high standards. This is both good and bad. This expectation of being very good also comes from people who aren’t even Firefox users – we must remain the bright light in a world that goes darker. In my (biased) view that tends to lead to unfair criticisms. The other browsers can do some of those changes without anyone raising an eyebrow but when Mozilla does similar for Firefox, a shitstorm breaks out. Lots of those people criticizing us for doing change NN already use browser Y that has been doing NN for a good while already…

Or maybe I’m just not seeing these things with clear enough eyes.

How does Mozilla make money?

Yeps. This is by far the most common question I’ve gotten from friends when I mention who I work for. In fact, that’s just about the only question I get from a lot of people… (possibly because after that we get into complicated questions such as what exactly do I do there?)

curl and IETF

I’m grateful that Mozilla allows me to spend part of my work time working on curl.

I’m also happy to now work for a company that allows me to attend to IETF/httpbis and related activities much better than ever I’ve had the opportunity to in the past. Previously I’ve pretty much had to spend spare time and my own money, which has limited my participation a great deal. The support from Mozilla has allowed me to attend to two meetings so far during the year, in London and in NYC and I suspect there will be more chances in the future.

Future

I only just started. I hope to grab on to more and bigger challenges and tasks as I get warmer and more into everything. I want to make a difference. See you in bugzilla.

Firefox and partial content

Monday, June 16th, 2014

Firefox BallOne of the first bugs that fell into my lap when I started working for Mozilla not a very long time ago, was bug 237623. Anyone involved in Mozilla knows a bug in that range is fairly old (we just recently passed one million filed bugs). This particular bug was filed in March 2004 and there are (right now) 26 other bugs marked as duplicates of this. Today, the fix for this problem has landed.

The core of the problem is that when a HTTP server sends contents back to a client, it can send a header along indicating the size of the data in the response. The header is called “Content-Length:”. If the connection gets broken during transfer for whatever reason and the browser hasn’t received as much data as was initially claimed to be delivered, that’s a very good hint that something is wrong and the transfer was incomplete.

The perhaps most annoying way this could be seen is when you download a huge DVD image or something and for some reason the connection gets cut off after only a short time, way before the entire file is downloaded, but Firefox just silently accept that as the end of the transfer and think everything was fine and dandy.

What complicates the issue is the eternal problem: not everything abides to the protocol. This said, if there are frequent violators of the protocol we can’t strictly fail on each case of problem we detect but we must instead do our best to handle it anyway.

Is Content-Length a frequently violated HTTP response header?

Let’s see…

  1. Back in the HTTP 1.0 days, the Content-Length header was not very important as the connection was mostly shut down after each response anyway. Alas, clients/browsers would swiftly learn to just wait for the disconnect anyway.
  2. Back in the old days, there were cases of problems with “large files” (files larger than 2 or 4GB) which every now and then caused the Content-Length: header to turn into negative or otherwise confused values when it wrapped. That’s not really happening these days anymore.
  3. With HTTP 1.1 and its persuasive use of persistent connections it is important to get the size right, as otherwise the chain of requests get messed up and we end up with tears and sad faces
  4. In curl’s HTTP parser we’ve always been strictly abiding to this header and we’ve bailed out hard on mismatches. This is a very rare error for users to get and based on this (admittedly unscientific data) I believe that there is not a widespread use of servers sending bad Content-Length headers.
  5. It seems Chrome at least in some aspects is already much more strict about this header.

My fix for this problem takes a slightly careful approach and only enforces the strictness for HTTP 1.1 or later servers. But then as a bonus, it has grown to also signal failure if a chunked encoded transfer ends without the ending trailer or if a SPDY or http2 transfer gets prematurely stopped.

This is basically a 6-line patch at its core. The rest is fixing up old test cases, added new tests etc.

As a counter-point, Eric Lawrence apparently worked on adding stricter checks in IE9 three years ago as he wrote about in Content-Length in the Real World. They apparently subsequently added the check again in IE10 which seems to have caused some problems for them. It remains to be seen how this change affects Firefox users out in the real world. I believe it’ll be fine.

This patch also introduces the error code for a few other similar network situations when the connection is closed prematurely and we know there are outstanding data that never arrived, and I got the opportunity to improve how Firefox behaves when downloading an image and it gets an error before the complete image has been transferred. Previously (when a partial transfer wasn’t an error), it would always throw away the image on an error and instead show the “image not found” picture. That really doesn’t make sense I believe, as a partial image is better than that default one – especially when a large portion of the image has been downloaded already.

Follow-up effects

Other effects of this change that possibly might be discovered and cause some new fun reports: prematurely cut off transfers of javascript or CSS will discard the entire javascript/CSS file. Previously the partial file would be used.

Of course, I doubt that these are the files that are as commonly cut off as many other file types but still on a very slow and bad connection it may still happen and the new behavior will make Firefox act as if the file wasn’t loaded at all, instead of previously when it would happily used the portions of the files that it had actually received. Partial CSS and partial javascript of course could lead to some “fun” effects of brokenness.

Http2 interim meeting NYC

Sunday, June 8th, 2014

On June 5th, around thirty people sat down around a huge table in a conference room on the 4th floor in the Google offices in New York City, with a heavy rain pouring down outside.

It was time for another IETF http2 interim meeting. The attendees were all participants in the HTTPbis work group and came from a wide variety of companies and countries. The major browser vendors were represented there, and so were operators and big service providers and some proxy people. Most of the people who have been speaking up on the mailing list over the last year or so, unfortunately with a couple of people notably absent. (And before anyone asks, yes we are a group where the majority is old males like me.)

Most people present knew many of the others already, which helped to create a friendly familiar spirit and we quickly got started on the Thursday morning working our way through the rather long lits of issues to deal with. When we had our previous interim meeting in London, I think most of us though we would’ve been further along today but recent development and discussions on the list had actually brought back a lot of issues we though we were already done with and we now reiterated a whole slew of subjects. We weren’t allowed to take photographs indoors so you won’t see any pictures of this opportunity from me here.

Google offices building logo

We did close many issues and I’ll just quickly mention some of the noteworthy ones here…

Extensions

We started out with the topic of “extensions”. Should we revert the decision from Zurich (where it was decided that we shouldn’t allow extensions in http2) or was the current state of the protocol the right one? The arguments for allowing extensions included that we’d keep getting requests for new things to add unless we have a way and that some of the recent stuff we’ve added really could’ve been done as extensions instead. An argument against it is that it makes things much simpler and reliable if we just document exactly what the protocol has and is, and removing “optional” behavior from the protocol has been one of the primary mantas along the design process.

The discussion went back and forth for a long time, and after almost three hours we had kind of a draw. Nobody was firmly against “the other” alternative but the two sides also seemed to have roughly the same amount of support. Then it was yet again time for the coin toss to guide us. Martin brought out an Australian coin and … the next protocol draft will allow extensions. Again. This also forces implementation to have to read and skip all unknown frames it receives compared to the existing situation where no unknown frames can ever occur.

BLOCKED as an extension

A rather given first candidate for an extension was the BLOCKED frame. At the time BLOCKED was added to the protocol it was explicitly added into the spec because we didn’t have extensions – and it is now being lifted out into one.

ALTSVC as an extension

What received slightly more resistance was the move to move out the ALTSVC frame as well. It was argued that the frame isn’t mandatory to support and therefore easily can be made into an extension.

Simplified padding

Another small change of the wire format since draft-12 was the removal of the high byte for padding to simplify. It reduces the amount you can pad a single frame but you can easily pad more using other means if you really have to, and there were numbers presented that said that 255 bytes were enough with HTTP 1.1 already so probably it will be enough for version 2 as well.

Schedule

There will be a new draft out really soon: draft -13. Martin, our editor of the spec, says he’ll be able to ship it in a week. That is intended to be the last draft, intended for implementation and it will then be expected to get deployed rather widely to allow us all in the industry to see how it works and be able to polish details or wordings that may still need it.

We had numerous vendors and HTTP stack implementers in the room and when we discussed schedule for when various products will be able to see daylight. If we all manage to stick to the plans. we may just have plenty of products and services that support http2 by the September/October time frame. If nothing major is found in this latest draft, we’re looking at RFC status not too far into 2015.

Meeting summary

I think we’re closing in for real now and I have good hopes for the protocol and our progress to a really wide scale deployment across the Internet. The HTTPbis group is an awesome crowd to work with and I had a great time. Our hosts took good care of us and made sure we didn’t lack any services or supplies. Extra thanks go to those of you who bought me dinners and to those who took me out to good beer places!

My http2 document

Yeah, it will now become somewhat out of date and my plan is to update it once the next draft ships. I’ll also do another http2 presentation already this week so I hope to also post an updated slide set soonish. Stay tuned!

Wireshark

My plan is to cooperate with the other Wireshark hackers and help making sure we have the next draft version supported in Wireshark really soon after its published.

curl and nghttp2

Most of the differences introduced are in the binary format so nghttp2 will need to be updated again – it is the library curl uses for the wire format of http2. The curl parts will need some adjustments, for example for Content-Encoding gzip that no longer is implicit but there should be little to do in the curl code for this draft bump.