Category Archives: Technology

Really everything related to technology

New screen and new fuses

I got myself a new 27″ 4K screen to my work setup, a Dell P2715Q, and replaced one of my old trusty twenty-four inch friends with it.

I now work with the “Thinkpad 13″ on the left as my video conference machine (it does nothing else and it runs Windows!), the two mid screens are a 24″ and the new 27” and they are connected to my primary dev machine while the rightmost thing is my laptop for when I need to move.

Did everything run smoothly? Heck no.

When I first inserted the 4K screen without modifying anything else in the setup, it was immediately obvious that I really needed to upgrade my graphics card since it didn’t have muscles enough to drive the screen at 4K so the screen would then instead upscale a 1920×1200 image in a slightly blurry fashion. I couldn’t have that!

New graphics card

So when I was out and about later that day I more or less accidentally passed a Webhallen store, and I got myself a new card. I wanted to play it easy so I stayed with an AMD processor and went with ASUS Dual-Rx460-O2G. The key feature I wanted was to be able to drive one 4K screen and one at 1920×1200, and then I unfortunately had to give up on the ones with only passive cooling and I instead had to pick what sounds like a gaming card. (I hate shopping graphics cards.)As I was about to do surgery on the machine anyway. I checked and noticed that I could add more memory to the motherboard so I bought 16 more GB to a total of 32GB.

Blow some fuses

Later that night, when the house was quiet and dark I shut down my machine, inserted the new card, the new memory DIMMs and powered it back up again.

At least that was the plan. When I fired it back on, it said clock and my lamps around me all got dark and the machine didn’t light up at all. The fuse was blown! Man, wasn’t that totally unexpected?

I did some further research on what exactly caused the fuse to blow and blew a few more in the process, as I finally restored the former card and removed the memory DIMMs again and it still blew the fuse. Puzzled and slightly disappointed I went to bed when I had no more spare fuses.

I hate leaving the machine dead in parts on the floor with an uncertain future, but what could I do?

A new PSU

Tuesday morning I went to get myself a PSU replacement (Plexgear PS-600 Bronze), and once I had that installed no more fuses blew and I could start the machine again!

I put the new memory back in and I could get into the BIOS config with both screens working with the new card (and it detected 32GB ram just fine). But as soon as I tried to boot Linux, the boot process halted after just 3-4 seconds and seemingly just froze. Hm, I tested a few different kernels and safety mode etc but they all acted like that. Weird!

BIOS update

A little googling on the messages that appeared just before it froze gave me the idea that maybe I should see if there’s an update for my bios available. After all, I’ve never upgraded it and it was a while since I got my motherboard (more than 4 years).

I found a much updated bios image on ASUS support site, put it on a FAT-formatted USB-drive and I upgraded.

Now it booted. Of course the error messages I had googled for are still present, and I suppose they were there before too, I just hadn’t put any attention to them when everything was working dandy!

Displayport vs HDMI

I had the wrong idea that I should use the display port to get 4K working, but it just wouldn’t work. DP + DVI just showed up on one screen and I even went as far as trying to download some Ubuntu Linux driver package for Radeon RX460 that I found, but of course it failed miserably due to my Debian Unstable having a totally different kernel running and what not.

In a slightly desperate move (I had now wasted quite a few hours on this and my machine still wasn’t working), I put back the old graphics card – (with DVI + hdmi) only to note that it no longer works like it did (the DVI one didn’t find the correct resolution anymore). Presumably the BIOS upgrade or something shook the balance?

Back on the new card I booted with DVI + HDMI, leaving DP entirely, and now suddenly both screens worked!

HiDPI + LoDPI

Once I had logged in, I could configure the 4K screen to show at its full 3840×2160 resolution glory. I was back.

Now I only had to start fiddling with getting the two screens to somehow co-exist next to each other, which is a challenge in its own. The large difference in DPI makes it hard to have one config that works across both screens. Like I usually have terminals on both screens – which font size should I use? And I put browser windows on both screens…

So far I’ve settled with increasing the font DPI in KDE and I use two different terminal profiles depending on which screen I put the terminal on. Seems to work okayish. Some texts on the 4K screen are still terribly small, so I guess it is good that I still have good eye sight!

24 + 27

So is it comfortable to combine a 24″ with a 27″ ? Sure, the size difference really isn’t that notable. The 27 one is really just a few centimeters taller and the differences in width isn’t an inconvenience. The photo below shows how similar they look, size-wise:

Post FOSDEM 2017

I attended FOSDEM again in 2017 and it was as intense, chaotic and wonderful as ever. I met old friends, got new friends and I got to test a whole range of Belgian beers. Oh, and there was also a set of great open source related talks to enjoy!

On Saturday at 2pm I delivered my talk on curl in the main track in the almost frighteningly large room Janson. I estimate that it was almost half full, which would mean upwards 700 people in the audience. The talk itself went well. I got audible responses from the audience several times and I kept well within my given time with time over for questions. The trickiest problem was the audio from the people who asked questions because it wasn’t at all very easy to hear, while the audio is great for the audience and in the video recording. Slightly annoying because as everyone else heard, it made me appear half deaf. Oh well. I got great questions both then and from people approaching me after the talk. The questions and the feedback I get from a talk is really one of the things that makes me appreciate talking the most.

The video of the talk is available, and the slides can also be viewed.

So after I had spent some time discussing curl things and handing out many stickers after my talk, I managed to land in the cafeteria for a while until it was time for me to once again go and perform.

We’re usually a team of friends that hang out during FOSDEM and we all went over to the Mozilla room to be there perhaps 20 minutes before my talk was scheduled and wow, there was a huge crowd outside of that room already waiting by the time we arrived. When the doors then finally opened (about 10 minutes before my talk started), I had to zigzag my way through to get in, and there was a large amount of people who didn’t get in. None of my friends from the cafeteria made it in!

The Mozilla devroom had 363 seats, not a single one was unoccupied and there was people standing along the sides and the back wall. So, an estimated nearly 400 persons in that room saw me speak about HTTP/2 deployments numbers right now, how HTTP/2 doesn’t really work well under 2% packet loss situations and then a bit about how QUIC can solve some of that and what QUIC is and when we might see the first experiments coming with IETF-QUIC – which really isn’t the same as Google-QUIC was.

To be honest, it is hard to deliver a talk in twenty minutes and I  was only 30 seconds over my time. I got questions and after the talk I spent a long time talking with people about HTTP, HTTP/2, QUIC, curl and the future of Internet protocols and transports. Very interesting.

The video of my talk can be seen, and the slides are online too.

I’m not sure if I was just unusually unlucky in my choices, or if there really was more people this year, but I experienced that “FULL” sign more than usual this year.

I fully intend to return again next year. Who knows, maybe I’ll figure out something to talk about then too. See you there?

One URL standard please

Following up on the problem with our current lack of a universal URL standard that I blogged about in May 2016: My URL isn’t your URL. I want a single, unified URL standard that we would all stand behind, support and adhere to.

What triggers me this time, is yet another issue. A friendly curl user sent me this URL:

http://user@example.com:80@daniel.haxx.se

… and pasting this URL into different tools and browsers show that there’s not a wide agreement on how this should work. Is the URL legal in the first place and if so, which host should a client contact?

  • curl treats the ‘@’-character as a separator between userinfo and host name so ‘example.com’ becomes the host name, the port number is 80 followed by rubbish that curl ignores. (wget2, the next-gen wget that’s in development works identically)
  • wget extracts the example.com host name but rejects the port number due to the rubbish after the zero.
  • Edge and Safari say the URL is invalid and don’t go anywhere
  • Firefox and Chrome allow ‘@’ as part of the userinfo, take the ’80’ as a password and the host name then becomes ‘daniel.haxx.se’

The only somewhat modern “spec” for URLs is the WHATWG URL specification. The other major, but now somewhat aged, URL spec is RFC 3986, made by the IETF and published in 2005.

In 2015, URL problem statement and directions was published as an Internet-draft by Masinter and Ruby and it brings up most of the current URL spec problems. Some of them are also discussed in Ruby’s WHATWG URL vs IETF URI post from 2014.

What I would like to see happen…

Which group? A group!

Friends I know in the WHATWG suggest that I should dig in there and help them improve their spec. That would be a good idea if fixing the WHATWG spec would be the ultimate goal. I don’t think it is enough.

The WHATWG is highly browser focused and my interactions with members of that group that I have had in the past, have shown that there is little sympathy there for non-browsers who want to deal with URLs and there is even less sympathy or interest for URL schemes that the popular browsers don’t even support or care about. URLs cover much more than HTTP(S).

I have the feeling that WHATWG people would not like this work to be done within the IETF and vice versa. Since I’d like buy-in from both camps, and any other camps that might have an interest in URLs, this would need to be handled somehow.

It would also be great to get other major URL “consumers” on board, like authors of popular URL parsing libraries, tools and components.

Such a URL group would of course have to agree on the goal and how to get there, but I’ll still provide some additional things I want to see.

Update: I want to emphasize that I do not consider the WHATWG’s job bad, wrong or lost. I think they’ve done a great job at unifying browsers’ treatment of URLs. I don’t mean to belittle that. I just know that this group is only a small subset of the people who probably should be involved in a unified URL standard.

A single fixed spec

I can’t see any compelling reasons why a URL specification couldn’t reach a stable state and get published as *the* URL standard. The “living standard” approach may be fine for certain things (and in particular browsers that update every six weeks), but URLs are supposed to be long-lived and inter-operate far into the future so they really really should not change. Therefore, I think the IETF documentation model could work well for this.

The WHATWG spec documents what browsers do, and browsers do what is documented. At least that’s the theory I’ve been told, and it causes a spinning and never-ending loop that goes against my wish.

Document the format

The WHATWG specification is written in a pseudo code style, describing how a parser would “walk” over the string with a state machine and all. I know some people like that, I find it utterly annoying and really hard to figure out what’s allowed or not. I much more prefer the regular RFC style of describing protocol syntax.

IDNA

Can we please just say that host names in URLs should be handled according to IDNA2008 (RFC 5895)? WHATWG URL doesn’t state any IDNA spec number at all.

Move out irrelevant sections

“Irrelevant” when it comes to documenting the URL format that is. The WHATWG details several things that are related to URL for browsers but are mostly irrelevant to other URL consumers or producers. Like section “5. application/x-www-form-urlencoded” and “6. API”.

They would be better placed in a “URL considerations for browsers” companion document.

Working doesn’t imply sensible

So browsers accept URLs written with thousands of forward slashes instead of two. That is not a good reason for the spec to say that a URL may legitimately contain a thousand slashes. I’m totally convinced there’s no critical content anywhere using such formatted URLs and no soul will be sad if we’d restricted the number to a single-digit. So we should. And yeah, then browsers should reject URLs using more.

The slashes are only an example. The browsers have used a “liberal in what you accept” policy for a lot of things since forever, but we must resist to use that as a basis when nailing down a standard.

The odds of this happening soon?

I know there are individuals interested in seeing the URL situation getting worked on. We’ve seen articles and internet-drafts posted on the issue several times the last few years. Any year now I think we will see some movement for real trying to fix this. I hope I will manage to participate and contribute a little from my end.

QUIC is h2 over UDP

The third day of the QUIC interim passed and now that meeting has ended. It continued to work very well to attend from remote and the group manged to plow through an extensive set of issues. A lot of consensus was achieved and I personally now have a much better feel for the protocol and many of its details thanks to the many discussions.

The drafts are still a bit too early for us to start discussing inter-op for real. But there were mentions and hopes expressed that maybe maybe we might start to see some of that by mid 2017. When we did HTTP/2, we had about 10 different implementations by the time draft-04 was out. I suspect we will see a smaller set for QUIC simply because of it being much more complex.

The next interim is planned to occur in the beginning of June in Europe.

There is an official QUIC logo being designed, but it is not done yet so you still need to imagine one placed here.

QUIC needs HTTP/2 needs HTTP/1

QUIC is primarily designed to send and receive HTTP/2 frames and entire streams over UDP (not only, but this is where the bulk of the work has been put in so far). Sure, TLS encrypted and everything, but my point here is that it is being designed to transfer HTTP/2 frames. You remember how HTTP/2 is “just a new framing” layer that changes how HTTP is sent over the wire, but when “decoded” again in the receiving end it is in most important aspects still HTTP/1 there. You have to implement most of a HTTP/1 stack in order to support HTTP/2. Now QUIC adds another layer to that. QUIC is a new way to send HTTP/2 frames over the network.

A QUIC stack needs to handle most aspects of HTTP/2!

Of course, there are notable differences and changes to some underlying principles that makes QUIC a bit different. It isn’t exactly HTTP/2 over secure UDP. Let me give you a few examples…

Streams are more independent

Packets sent over the wire with UDP are independent from each other to a very large degree. In order to avoid Head-of-Line blocking (HoL), packets that are lost and re-transmitted will only block the particular streams to which the lost packets belong. The other streams can keep flowing, unaware and uncaring.

Thanks to the nature of the Internet and how packets are handled, it is not unusual for network packets to arrive in a slightly different order than they were sent, even when they aren’t exactly “lost”.

So, streams in HTTP/2 were entirely synced and the order the sender of frames use, will be the exact same order in which the frames arrive in the other end. Packet loss or not.

In QUIC, individual frames and entire streams may arrive in the receiver in a different order than what was used in the sender.

Stream ID gaps means open

When receiving a QUIC packet, there’s basically no way to know if there are packets missing that were intended to arrive but got lost and haven’t yet been re-transmitted.

If a frame is received that uses the new stream ID N (a stream not previously seen), the receiver is then forced to assume that all the other streams ID from our previously highest ID to N are all just missing and will arrive soon. They are then presumed to exist!

In HTTP/2, we could handle gaps in stream IDs much differently because of TCP. Then a gap is known to be deliberate.

Some h2 frames are done by QUIC

Since QUIC is designed with streams, flow control and more and is used to send HTTP/2 frames over them, some of the h2 frames aren’t needed but are instead handled by the transport layer within QUIC and won’t show up in the HTTP/2 layer.

HPACK goes QPACK?

HPACK is the header compression system used in HTTP/2. Among other things it features a dictionary that you manipulate with instructions and then subsequent header frames can refer to those dictionary indexes instead of sending the full header. Header frame one says “insert my user-agent string” and then header frame two can refer back to the index in the dictionary for where that identical user-agent string is stored.

Due to the out of order streams in QUIC, this dictionary treatment is harder. The second header frame could arrive before the first, so if it would refer to an index set in the first header frame, it would have to block the entire stream until that first header arrives.

HPACK also has a concept of just adding things to the dictionary without specifying the index, and since both sides are in perfect sync it works just fine. In QUIC, if we want to maintain the independence of streams and avoid blocking to the highest degree, we need to instead specify exact indexes to use and not assume perfect sync.

This (and more) are reasons why QPACK is being suggested as a replacement for HPACK when HTTP/2 header frames are sent over QUIC.

First QUIC interim – in Tokyo

The IETF working group QUIC has its first interim meeting in Tokyo Japan for three days. Day one is today, January 24th 2017.

As I’m not there physically, I attend the meeting from remote using the webex that’s been setup for this purpose, and I’ll drop in a little screenshot below from one of the discussions (click it for hires) to give you a feel for it. It shows the issue being discussed and the camera view of the room in Tokyo. I run the jabber client on a different computer which allows me to also chat with the other participants. It works really well, both audio and video are quite crisp and understandable.

Japan is eight hours ahead of me time zone wise, so this meeting  runs from 01:30 until 09:30 Central European Time. That’s less comfortable and it may cause me some troubles to attend the entire thing.

On QUIC

We started off at once with a lot of discussions on basic issues. Versioning and what a specific version actually means and entails. Error codes and how error codes should be used within QUIC and its different components. Should the transport level know about priorities or shouldn’t it? How is the security protocol decided?

Everyone who is following the QUIC issues on github knows that there are plenty of people with a lot of ideas and thoughts on these matters and this meeting shows this impression is real.

For a casual bystander, you might’ve been fooled into thinking that because Google already made and deployed QUIC, these issues should be if not already done and decided, at least fairly speedily gone over. But nope. I think there are plenty of indications already that the protocol outputs that will come in the end of this process, the IETF QUIC will differ from the Google QUIC in a fair number of places.

The plan is that the different QUIC drafts (there are at least 4 different planned RFCs as they’re divided right now) should all be “done” during 2018.

(At 4am, the room took lunch and I wrote this up.)

Lesser HTTPS for non-browsers

An HTTPS client needs to do a whole lot of checks to make sure that the remote host is fine to communicate with to maintain the proper high security levels.

In this blog post, I will explain why and how the entire HTTPS ecosystem relies on the browsers to be good and strict and thanks to that, the rest of the HTTPS clients can get away with being much more lenient. And in fact that is good, because the browsers don’t help the rest of the ecosystem very much to do good verification at that same level.

Let me me illustrate with some examples.

CA certs

The server’s certificate must have been signed by a trusted CA (Certificate Authority). A client then needs the certificates from all the CAs that are trusted. Who’s a trusted CA and how would a client get their certs to use for verification?

You can say that you trust the same set of CAs that your operating system vendor trusts (which I’ve always thought is a bit of a stretch but hey, I can very well understand the convenience in this). If you want to do this as an HTTPS client you need to use native APIs in Windows or macOS, or you need to figure out where the cert bundle is stored if you’re using Linux.

If you’re not using the native libraries on windows and macOS or if you can’t find the bundle in your Linux distribution, or you’re in one of a large amount of other setups where you can’t use someone else’s bundle, then you need to gather this list by yourself.

How on earth would you gather a list of hundreds of CA certs that are used for the popular web sites on the net of today? Stand on someone else’s shoulders and use what they’ve done? Yeps, and conveniently enough Mozilla has such a bundle that is licensed to allow others to use it…

Mozilla doesn’t offer the set of CA certs in a format that anyone else can use really, which is the primary reason why we offer Mozilla’s cert bundle converted to PEM format on the curl web site. The other parties that collect CA certs at scale (Microsoft for Windows, Apple for macOS, etc) do even less.

Before you ask, Google doesn’t maintain their own list for Chrome. They piggyback the CA store provided on the operating system it runs on. (Update: Google maintains its own list for Android/Chrome OS.)

Further constraints

But the browsers, including Firefox, Chrome, Edge and Safari all add additional constraints beyond that CA cert store, on what server certificates they consider to be fine and okay. They blacklist specific fingerprints, they set a last allowed date for certain CA providers to offer certificates for servers and more.

These additional constraints, or additional rules if you want, are never exported nor exposed to the world in ways that are easy for anyone to mimic (in other ways than that everyone of course can implement the same code logic in their ends). They’re done in code and they’re really hard for anyone not a browser to implement and keep up with.

This makes every non-browser HTTPS client susceptible to okaying certificates that have already been deemed not OK by security experts at the browser vendors. And in comparison, not many HTTPS clients can compare or stack up the amount of client-side TLS and security expertise that the browser developers can.

HSTS preload

HTTP Strict Transfer Security is a way for sites to tell clients that they are to be accessed over HTTPS only for a specified time into the future, and plain HTTP should then not be used for the duration of this rule. This setup removes the Man-In-The-Middle (MITM) risk on subsequent accesses for sites that may still get linked to via HTTP:// URLs or by users entering the web site names directly into the address bars and so on.

The browsers have a “HSTS preload list” which is a list of sites that people have submitted and they are HSTS sites that basically never time out and always will be accessed over HTTPS only. Forever. No risk for MITM even in the first access to these sites.

There are no such HSTS preload lists being provided for non-browser HTTPS clients so there’s no easy way for non-browsers to avoid the first access MITM even for these class of forever-on-HTTPS sites.

Update: The Chromium HSTS preload list is available in a JSON format.

SHA-1

I’m sure you’ve heard about the deprecation of SHA-1 as a certificate hashing algorithm and how the browsers won’t accept server certificates using this starting at some cut off date.

I’m not aware of any non-browser HTTPS client that makes this check. For services, API providers and others don’t serve “normal browsers” they can all continue to play SHA-1 certificates well into 2017 without tears or pain. Another ecosystem detail we rely on the browsers to fix for us since most of these providers want to work with browsers as well…

This isn’t really something that is magic or would be terribly hard for non-browsers to do, its just that it will make some users suddenly get errors for their otherwise working setups and that takes a firm attitude from the software provider that is hard to maintain. And you’d have to introduce your own cut-off date that you’d have to fight with your users about! 😉

TLS is hard to get right

TLS and HTTPS are full of tricky areas and dusty corners that are hard to get right. The more we can share tricks and rules the better it is for everyone.

I think the browser vendors could do much better to help the rest of the ecosystem. By making their meta data available to us in sensible formats mostly. For the good of the Internet.

Disclaimer

Yes I work for Mozilla which makes Firefox. A vendor and a browser that I write about above. I’ve been communicating internally about some of these issues already, but I’m otherwise not involved in those parts of Firefox.

DMARC helped me ditch gmail

I’ve been a gmail user for many years (maybe ten). Especially since the introduction of smart phones it has been a really convenient system to read email on the go. I rarely respond to email from my phone but I’ve done that occasionally too and it has worked adequately.

All this time I’ve used my own domain and email address and simply forwarded a subset of my email over to gmail, and I had gmail setup so that when I emailed out from it, it would use my own email address and not the @gmail.com one. Nothing fancy, just convenient. The gmail spam filter is also pretty decent so it helped me to filter off some amount of garbage too.

It was fine until DMARC

However, with the rise of DMARC over the recent years and with Google insisting on getting on that bandwagon, it has turned out to be really hard to keep forwarding email to gmail (since gmail considers forwarded emails using such headers fraudulent and it rejects them). So a fair amount of email simply never showed up in my gmail inbox (and instead caused the senders to get a bounce from a gmail address they didn’t even know I had).

I finally gave up and decided gmail doesn’t work for this sort of basic email setup anymore. DMARC and its siblings have quite simply made it impossible to work with emails this way, a way that has been functional for decades (I used similar approaches already back in the mid 90s on my first few jobs).

Similarly, DMARC has turned out to be a pain for mailing lists since they too forward email in a similar fashion and this causes the DMARC police to go berserk. Luckily, recent versions of mailman has options that makes it rewrite the From:-lines from senders that send emails from domains that have strict DMARC policies. That mitigates most of the problems for mailman lists. I love the title of this old mail on the subject: “Yahoo breaks every mailing list in the world including the IETF’s

I’m sure DMARC works for the providers in the sence that they block huge amounts of spam and fake users and that’s what it was designed for. The fact that it also makes ordinary old-school mail forwards really difficult and forces mailing list admins all over to upgrade mailman or just keep getting rejects since they use mailing list software that lacks the proper features, that’s probably all totally ignored. DMARC was as designed: it reduces spam at the big providers’ systems. Mission accomplished. The fact that they at the same time made world wide Internet email a lot less useful is probably not something they care about.

It’s done

gmail can read mails from remote inboxes, but it doesn’t support IMAP (only POP3) so simply switching to such a method wouldn’t even work. I just refuse to enable POP3 anywhere again.

Of course it isn’t an irreversible decision, but I’ve stopped the forward to gmail, cleared the inbox there and instead I’ve switched to Aqua mail on Android. It seems fairly feature complete and snappy. It isn’t quite as fancy and cool as the gmail client, but hopefully it will do its job.

The biggest drawback I’ve felt after a couple of weeks is the gmail spam filter. I do run spamassassin on my server and it catches the large bulk of all spams, but having the gmail spam system on top of that was able to block more silliness from my phone than spamassassin does alone.

My talks at FOSDEM 2017

I couldn’t even recall how many times I’ve done this already, but in 2017 I am once again showing up in the cold and grey city called Brussels and the lovely FOSDEM conference, to talk. (Yes, it is cold and grey every February, trust me.) So I had to go back and count, and it turns out 2017 will become my 8th straight visit to FOSDEM and I believe it is the 5th year I’ll present there.First, a reminder about what I talked about at FOSDEM 2016: An HTTP/2 update. There’s also a (rather low quality) video recording of the talk to see there.

I’m scheduled for two presentations in 2017, and this year I’m breaking new ground for myself as I’m doing one of them on the “main track” which is the (according to me) most prestigious track held in one of the biggest rooms – seating more than 1,400 persons.

You know what’s cool? Running on billions of devices

Room: Janson, time: Saturday 14:00

Thousands of contributors help building the curl software which runs on several billions of devices and are affecting every human in the connected world daily. How this came to happen, who contributes and how Daniel at the wheel keeps it all together. How a hacking ring is actually behind it all and who funds this entire operation.

So that was HTTP/2, what’s next?

Room: UD2.218A, time: Saturday 16:30

A shorter recap on what HTTP/2 brought that HTTP/1 couldn’t offer before we dig in and look at some numbers that show how HTTP/2 has improved (browser) networking and the web experience for people.

Still, there are scenarios where HTTP/1’s multiple connections win over HTTP/2 in performance tests. Why is that and what is being done about it? Wasn’t HTTP/2 supposed to be the silver bullet?

A closer look at QUIC, its promises to fix the areas where HTTP/2 didn’t deliver and a check on where it is today. Is QUIC perhaps actually HTTP/3 in everything but the name?

Depending on what exactly happens in this area over time until FOSDEM, I will spice it up with more details on how we work on these protocol things in Mozilla/Firefox.

This will become my 3rd year in a row that I talk in the Mozilla devroom to present the state of the HTTP protocol and web transport.

6 hours of bliss

I sent out the release announcement for curl 7.52.0 exactly 07:59 in the morning of December 21, 2016. A Wednesday. We typically  release curl on Wednesdays out of old habit. It is a good release day.

curl 7.52.0 was just as any other release. Perhaps with a slightly larger set of new features than what’s typical for us. We introduce TLS 1.3 support, we now provide HTTPS-proxy support and the command line tool has this option called –fail-early that I think users will start to appreciate once they start to discover it. We also  announced three fixed security vulnerabilities. And some other good things.

I pushed the code to git, signed and uploaded the tarballs, I updated the info on the web site and I sent off that release announcement email and I felt good. Release-time good. That short feeling of relief and starting over on a new slate that I often experience these release days. Release days make me happy.

Any bets?

It is not unusual for someone to find a bug really fast after a release has shipped. As I was feeling good, I had to joke in the #curl IRC channel (42 minutes after that email):

08:41 <bagder> any bets on when the first bug report on the new release shows up? =)

Hours passed and maybe, just maybe there was not going to be any quick bugs filed on this release?

But of course. I wouldn’t write this blog post if it all had been nice and dandy. At 14:03, I got the email. 6 hours and 4 minutes since I wrote the 7.52.0 announcement email.

The email was addressed to the curl project security email list and included a very short patch and explanation how the existing code is wrong and needs “this fix” to work correctly. And it was entirely correct!

Now I didn’t feel that sense of happiness anymore. For some reason it was now completely gone and instead I felt something that involved sensations like rage, embarrassment and general tiredness. How the [beep] could this slip through like this?

I’ve done releases in the past that were broken to various extents but this is a sort of a new record and an unprecedented event. Enough time had passed that I couldn’t just yank the package from the download page either. I had to take it through the correct procedures.

What happened?

As part of a general code cleanup during this last development round, I changed all the internals to use a proper internal API to get random data and if libcurl is built with a TLS library it uses its provided API to get secure and safe random data. As a move to improve our use of random internally. We use this internal API for getting the nonce in authentication mechanisms such as Digest and NTLM and also for generating the boundary string in HTTP multipart formposts and more. (It is not used for any TLS or SSH level protocol stuff though.)

I did the largest part of the random overhaul of this in commit f682156a4f, just a little over a month ago.

Of course I made sure that all test cases kept working and there were no valgrind reports or anything, the code didn’t cause any compiler warnings. It did not generate any reports in the many clang-analyzer or Coverity static code analyzer runs we’ve done since. We run clang-analyzer daily and Coverity perhaps weekly.

But there’s a valgrind report just here!

Kamil Dudka, who sent the 14:03 email, got a valgrind error and that’s what set him off – but how come he got that and I didn’t?

The explanation consists of the following two conditions that together worked to hide the problem for us quite successfully:

  1. I (and I suppose several of the other curl hackers) usually build curl and libcurl “debug enabled”. This allows me to run more tests, do more diagnostics and debug it easier when I run into problems. It also provides a system with “fake random” so that we can actually verify that functions that otherwise use real random values generate the correct output when given a known random value… and yeah, this debug system prevented valgrind from detecting any problem!
  2. In the curl test suite we once had a problem with valgrind generating reports on third party libraries etc which then ended up as false positives. We then introduced a “valgrind report parser” that would detect if the report concerns curl or something else. It turns out this parser doesn’t detect the errors if curl is compiled without the cc’s -g command line option. And of course… curl and libcurl both build without -g by default!

The patch?

The vulnerable function basically uses this simple prototype. It is meant to get an “int” worth of random value stored in the buffer ‘rnd’ points to. That’s 4 bytes.

randit(struct Curl_easy *data, unsigned int *rnd)

But due to circumstances I can’t explain on anything other than my sloppy programming, I managed to write the function store random value in the actual pointer instead of the buffer it points to. So when the function returns, there’s nothing stored in the buffer. No 4 bytes of random. Just the uninitialized value of whatever happened to be there, on the stack.

The patch that fixes this problem looks like this (with some names shortened to simplify but keep the idea):

- res = random(data, (char *)&rnd, sizeof(rnd));
+ res = random(data, (char *)rnd, sizeof(*rnd));

So yeah. I introduced this security flaw in 7.52.0. We had it fixed in 7.52.1, released roughly 48 hours later.

(I really do not need comments on what other languages that wouldn’t have allowed this mistake or otherwise would’ve brought us world peace a long time ago.)

Make it not happen again

The primary way to make this same mistake not happen again easily, is that I’m removing the valgrind report parsing function from the test suite and we will now instead assume that valgrind reports will be legitimate and if not, work on suppressing the false positives in a better way.

References

This flaw is officially known as CVE-2016-9594

The real commit that fixed this problem is here, or as stand-alone patch.

The full security advisory for this flaw is here: https://curl.haxx.se/docs/adv_20161223.html

Facepalm photo by Alex E. Proimos.

xkcd: 221

 

curl security audit

“the overall impression of the state of security and robustness
of the cURL library was positive.”

I asked for, and we were granted a security audit of curl from the Mozilla Secure Open Source program a while ago. This was done by Mozilla getting a 3rd party company involved to do the job and footing the bill for it. The auditing company is called Cure53.

good_curl_logoI applied for the security audit because I feel that we’ve had some security related issues lately and I’ve had the feeling that we might be missing something so it would be really good to get some experts’ eyes on the code. Also, as curl is one of the most used software components in the world a serious problem in curl could have a serious impact on tools, devices and applications everywhere. We don’t want that to happen.

Scans and tests and all

We run static analyzers on the code frequently with a zero warnings tolerance. The daily clang-analyzer scan hasn’t found a problem in a long time and the Coverity once-every-few-weeks occasionally finds something suspicious but we always fix those immediately.

We have  thousands of tests and unit tests that we run non-stop on the code on multiple platforms running multiple build combinations. We also use valgrind when running tests to verify memory use and check for potential memory leaks.

Secrecy

The audit itself. The report and the work on fixing the issues were all done on closed mailing lists without revealing to the world what was really going on. All as our fine security process describes.

There are several downsides with fixing things secretly. One of the primary ones is that we get much fewer eyes on the fixes and there aren’t that many people involved when discussing solutions or approaches to the issues at hand. Another is that our test infrastructure is made for and runs only public code so the code can’t really be fully tested until it is merged into the public git repository.

The report

We got the report on September 23, 2016 and it certainly gave us a lot of work.

The audit report has now been made public and is a very interesting work if you’re into security, C code and curl hacking. I find the report very clear, well written and it spells out each problem very accurately and even shows proof of concept code snippets and exploit examples to drive the points home.

Quoted from the report intro:

As for the approach, the test was rooted in the public availability of the source code belonging to the cURL software and the investigation involved five testers of the Cure53 team. The tool was tested over the course of twenty days in August and September of 2016 and main efforts were focused on examining cURL 7.50.1. and later versions of cURL. It has to be noted that rather than employ fuzzing or similar approaches to validate the robustness of the build of the application and library, the latter goal was pursued through a classic source code audit. Sources covering authentication, various protocols, and, partly, SSL/TLS, were analyzed in considerable detail. A rationale behind this type of scoping pointed to these parts of the cURL tool that were most likely to be prone and exposed to real-life attack scenarios. Rounding up the methodology of the classic code audit, Cure53 benefited from certain tools, which included ASAN targeted with detecting memory errors, as well as Helgrind, which was tasked with pinpointing synchronization errors with the threading model.

They identified no less than twenty-three (23) potential problems in the code, out of which nine were deemed security vulnerabilities. But I’d also like to emphasize that they did also actually say this:

At the same time, the overall impression of the state of security and robustness of the cURL library was positive.

Resolving problems

In the curl security team we decided to downgrade one of the 9 vulnerabilities to a “plain bug” since the required attack scenario was very complicated and the risk deemed small, and two of the issues we squashed into treating them as a single one. That left us with 7 security vulnerabilities. Whoa, that’s a lot. The largest amount we’ve ever fixed in a single release before was 4.

I consider handling security issues in the project to be one of my most important tasks; pretty much all other jobs are down-prioritized in comparison. So with a large queue of security work, a lot of bug fixing and work on features basically had to halt.

You can get a fairly detailed description of our work on fixing the issues in the fix and validation log. The report, the log and the advisories we’ve already posted should cover enough details about these problems and associated fixes that I don’t feel a need to write about them much further.

More problems

Just because we got our hands full with an audit report doesn’t mean that the world stops, right? While working on the issues one by one to have them fixed we also ended up getting an additional 4 security issues to add to the set, by three independent individuals.

All these issues gave me a really busy period and it felt great when we finally shipped 7.51.0 and announced all those eleven fixes to the world and I could get a short period of relief until the next tsunami hits.