Category Archives: Work

Work stuff

A curl delivery network

I’ve run my own public web sites on hardware I’ve administered myself for over twenty years now. I’ve hosted the curl web site myself since it’s inception.

The curl web site at curl.haxx.se has recently been delivering roughly 1.5 terabyte of data to the world per month. The CA bundle we convert to PEM from the Mozilla source code, is alone downloaded more than 100,000 times per day. Occasional blog entries I’ve posted here on my blog have climbed very fast on popular sites such as Hacker news and Reddit, and have resulted in intense visitor storms hitting this same server – sometimes reaching visitor counts above 200,000 “uniques” – most of them within the first few hours of the publication. At times, those visitor spikes have effectively brought the server to its knees.

Yes, my personal web site and the curl web site are both sharing the same physical server. It also hosts more than a dozen other sites and numerous services for our own pleasures and fun, providing services for a handful of different open source projects. So when the server has to cease doing work because it runs out of memory or hits other resource restraints, that causes interruptions all over. Oh yes, and my email doesn’t reach me.

Inconvenient and annoying.

The server

Haxx owns and runs this co-located server that we have a busload of web servers on – for the good of the projects and people that run things on it. This machine’s worst bottle neck is available RAM memory and perhaps I/O performance. Every time the server goes down to a crawl due to network traffic overload we discuss how we should upgrade it. Installing a new machine and transferring over all the sites and services is work. Work that none of us at Haxx are very happy to volunteer to do. So it hasn’t been done yet, and frankly the server handles the daily load just fine and without even a blink. Which is ninety nine point something percent of the time…

Haxx pays for a certain amount of network traffic so as long as we’re below some threshold we remain paying the same monthly fee. We don’t want to increase the traffic by magnitudes as that would cost more.

The specific machine, that sits deep inside a server room in Stockholm Sweden, is a five(?) years old Dell Poweredge E310, Intel Xeon X3440 2.53GHz with 8GB ram, This model is shown on the image at the top.

Alternatives that hasn’t helped

Why not a mirror system? We had a fair amount of curl site mirrors a few years ago, but it never worked well because they were always less reliable than the main site and they often turned stale and out of sync with the master site which eventually just hurt users.

They also trick visitors into bookmark or otherwise go back to the mirror site instead of the real one and there were always the annoying people who couldn’t resist but to fill the mirror with ads and stuff. Plus, they didn’t help much with with the storms to the main site.

Why not a cloud server? Because with the amount of services, servers and various things we do on our server, it would be inconvenient and expensive. But perhaps even more because we started out like this so we have invested time and energy into the infrastructure as it works right now. And I enjoy rowing my own boat!

The CDN

Fastly reached out and graciously offered to help us handle the load. Both on the account of traffic amounts but also to save our machine from struggling this hard the next time I’ll write something that tickles people’s curiosity (or rage) to that level when several thousands of visitors want to read the same article at the same time.

Starting now, the curl.haxx.se and the daniel.haxx.se web sites are fronted by Fastly. It should give web site visitors from all over the world faster response times and it will make the site more reliable and less likely to have problems due to traffic load going forward.

In case you’re not familiar with what a CDN is, a simplified explanation would say it is a globally distributed network of reverse proxy servers deployed in multiple data centers. These CDN servers front the Internet and will to the largest extent possible serve the visitors with the right content directly from their own caches instead of them reaching the actual lowly backend server I run that hosts the original content. Fastly has lots of servers across the globe for this purpose. Users who are a long way away from Sweden will probably be the ones who will notice this change the most, as you may suddenly find haxx.se content much closer (network round-trip wise) than before.

Standards

These new servers will host the sites over HTTPS just like before, and they will require TLS 1.2 and SNI. They will work over IPv6 and support HTTP/2.  Network standard wise, there shouldn’t be any step down – and honestly, I haven’t exactly been on the cutting edge of these technologies myself for these sites in the past.

Editing the site

We will keep editing and maintaining the site like before. It is made up of an old system with templates and include files that generate mostly static web pages. The site is mostly available on github and using that, you can build a local version for development and trying out changes before they land.

Hopefully, this move to Fastly will only make the site faster and more reliable. If you notice any glitches or experience any problems with the site, please let us know!

New screen and new fuses

I got myself a new 27″ 4K screen to my work setup, a Dell P2715Q, and replaced one of my old trusty twenty-four inch friends with it.

I now work with the “Thinkpad 13″ on the left as my video conference machine (it does nothing else and it runs Windows!), the two mid screens are a 24″ and the new 27” and they are connected to my primary dev machine while the rightmost thing is my laptop for when I need to move.

Did everything run smoothly? Heck no.

When I first inserted the 4K screen without modifying anything else in the setup, it was immediately obvious that I really needed to upgrade my graphics card since it didn’t have muscles enough to drive the screen at 4K so the screen would then instead upscale a 1920×1200 image in a slightly blurry fashion. I couldn’t have that!

New graphics card

So when I was out and about later that day I more or less accidentally passed a Webhallen store, and I got myself a new card. I wanted to play it easy so I stayed with an AMD processor and went with ASUS Dual-Rx460-O2G. The key feature I wanted was to be able to drive one 4K screen and one at 1920×1200, and then I unfortunately had to give up on the ones with only passive cooling and I instead had to pick what sounds like a gaming card. (I hate shopping graphics cards.)As I was about to do surgery on the machine anyway. I checked and noticed that I could add more memory to the motherboard so I bought 16 more GB to a total of 32GB.

Blow some fuses

Later that night, when the house was quiet and dark I shut down my machine, inserted the new card, the new memory DIMMs and powered it back up again.

At least that was the plan. When I fired it back on, it said clock and my lamps around me all got dark and the machine didn’t light up at all. The fuse was blown! Man, wasn’t that totally unexpected?

I did some further research on what exactly caused the fuse to blow and blew a few more in the process, as I finally restored the former card and removed the memory DIMMs again and it still blew the fuse. Puzzled and slightly disappointed I went to bed when I had no more spare fuses.

I hate leaving the machine dead in parts on the floor with an uncertain future, but what could I do?

A new PSU

Tuesday morning I went to get myself a PSU replacement (Plexgear PS-600 Bronze), and once I had that installed no more fuses blew and I could start the machine again!

I put the new memory back in and I could get into the BIOS config with both screens working with the new card (and it detected 32GB ram just fine). But as soon as I tried to boot Linux, the boot process halted after just 3-4 seconds and seemingly just froze. Hm, I tested a few different kernels and safety mode etc but they all acted like that. Weird!

BIOS update

A little googling on the messages that appeared just before it froze gave me the idea that maybe I should see if there’s an update for my bios available. After all, I’ve never upgraded it and it was a while since I got my motherboard (more than 4 years).

I found a much updated bios image on ASUS support site, put it on a FAT-formatted USB-drive and I upgraded.

Now it booted. Of course the error messages I had googled for are still present, and I suppose they were there before too, I just hadn’t put any attention to them when everything was working dandy!

Displayport vs HDMI

I had the wrong idea that I should use the display port to get 4K working, but it just wouldn’t work. DP + DVI just showed up on one screen and I even went as far as trying to download some Ubuntu Linux driver package for Radeon RX460 that I found, but of course it failed miserably due to my Debian Unstable having a totally different kernel running and what not.

In a slightly desperate move (I had now wasted quite a few hours on this and my machine still wasn’t working), I put back the old graphics card – (with DVI + hdmi) only to note that it no longer works like it did (the DVI one didn’t find the correct resolution anymore). Presumably the BIOS upgrade or something shook the balance?

Back on the new card I booted with DVI + HDMI, leaving DP entirely, and now suddenly both screens worked!

HiDPI + LoDPI

Once I had logged in, I could configure the 4K screen to show at its full 3840×2160 resolution glory. I was back.

Now I only had to start fiddling with getting the two screens to somehow co-exist next to each other, which is a challenge in its own. The large difference in DPI makes it hard to have one config that works across both screens. Like I usually have terminals on both screens – which font size should I use? And I put browser windows on both screens…

So far I’ve settled with increasing the font DPI in KDE and I use two different terminal profiles depending on which screen I put the terminal on. Seems to work okayish. Some texts on the 4K screen are still terribly small, so I guess it is good that I still have good eye sight!

24 + 27

So is it comfortable to combine a 24″ with a 27″ ? Sure, the size difference really isn’t that notable. The 27 one is really just a few centimeters taller and the differences in width isn’t an inconvenience. The photo below shows how similar they look, size-wise:

Post FOSDEM 2017

I attended FOSDEM again in 2017 and it was as intense, chaotic and wonderful as ever. I met old friends, got new friends and I got to test a whole range of Belgian beers. Oh, and there was also a set of great open source related talks to enjoy!

On Saturday at 2pm I delivered my talk on curl in the main track in the almost frighteningly large room Janson. I estimate that it was almost half full, which would mean upwards 700 people in the audience. The talk itself went well. I got audible responses from the audience several times and I kept well within my given time with time over for questions. The trickiest problem was the audio from the people who asked questions because it wasn’t at all very easy to hear, while the audio is great for the audience and in the video recording. Slightly annoying because as everyone else heard, it made me appear half deaf. Oh well. I got great questions both then and from people approaching me after the talk. The questions and the feedback I get from a talk is really one of the things that makes me appreciate talking the most.

The video of the talk is available, and the slides can also be viewed.

So after I had spent some time discussing curl things and handing out many stickers after my talk, I managed to land in the cafeteria for a while until it was time for me to once again go and perform.

We’re usually a team of friends that hang out during FOSDEM and we all went over to the Mozilla room to be there perhaps 20 minutes before my talk was scheduled and wow, there was a huge crowd outside of that room already waiting by the time we arrived. When the doors then finally opened (about 10 minutes before my talk started), I had to zigzag my way through to get in, and there was a large amount of people who didn’t get in. None of my friends from the cafeteria made it in!

The Mozilla devroom had 363 seats, not a single one was unoccupied and there was people standing along the sides and the back wall. So, an estimated nearly 400 persons in that room saw me speak about HTTP/2 deployments numbers right now, how HTTP/2 doesn’t really work well under 2% packet loss situations and then a bit about how QUIC can solve some of that and what QUIC is and when we might see the first experiments coming with IETF-QUIC – which really isn’t the same as Google-QUIC was.

To be honest, it is hard to deliver a talk in twenty minutes and I  was only 30 seconds over my time. I got questions and after the talk I spent a long time talking with people about HTTP, HTTP/2, QUIC, curl and the future of Internet protocols and transports. Very interesting.

The video of my talk can be seen, and the slides are online too.

I’m not sure if I was just unusually unlucky in my choices, or if there really was more people this year, but I experienced that “FULL” sign more than usual this year.

I fully intend to return again next year. Who knows, maybe I’ll figure out something to talk about then too. See you there?

First QUIC interim – in Tokyo

The IETF working group QUIC has its first interim meeting in Tokyo Japan for three days. Day one is today, January 24th 2017.

As I’m not there physically, I attend the meeting from remote using the webex that’s been setup for this purpose, and I’ll drop in a little screenshot below from one of the discussions (click it for hires) to give you a feel for it. It shows the issue being discussed and the camera view of the room in Tokyo. I run the jabber client on a different computer which allows me to also chat with the other participants. It works really well, both audio and video are quite crisp and understandable.

Japan is eight hours ahead of me time zone wise, so this meeting  runs from 01:30 until 09:30 Central European Time. That’s less comfortable and it may cause me some troubles to attend the entire thing.

On QUIC

We started off at once with a lot of discussions on basic issues. Versioning and what a specific version actually means and entails. Error codes and how error codes should be used within QUIC and its different components. Should the transport level know about priorities or shouldn’t it? How is the security protocol decided?

Everyone who is following the QUIC issues on github knows that there are plenty of people with a lot of ideas and thoughts on these matters and this meeting shows this impression is real.

For a casual bystander, you might’ve been fooled into thinking that because Google already made and deployed QUIC, these issues should be if not already done and decided, at least fairly speedily gone over. But nope. I think there are plenty of indications already that the protocol outputs that will come in the end of this process, the IETF QUIC will differ from the Google QUIC in a fair number of places.

The plan is that the different QUIC drafts (there are at least 4 different planned RFCs as they’re divided right now) should all be “done” during 2018.

(At 4am, the room took lunch and I wrote this up.)

2nd best in Sweden

“Probably the only person in the whole of Sweden whose code is used by all people in the world using a computer / smartphone / ATM / etc … every day. His contribution to the world is so large that it is impossible to understand the breadth.

(translated motivation from the Swedish original page)

Thank you everyone who nominated me. I’m truly grateful, honored and humbled. You, my community, is what makes me keep doing what I do. I love you all!

To list “Sweden’s best developers” (the list and site is in Swedish) seems like a rather futile task, doesn’t it? Yet that’s something the Swedish IT and technology news site Techworld has been doing occasionally for the last several years. With two, three year intervals since 2008.

Everyone reading this will of course immediately start to ponder on what developers they speak of or how they define developers and how on earth do you judge who the best developers are? Or even who’s included in the delimiter “Sweden” – is that people living in Sweden, born in Sweden or working in Sweden?

I’m certainly not alone in having chuckled to these lists when they have been published in the past, as I’ve never seen anyone on the list be even close to my own niche or areas of interest. The lists have even worked a little as a long-standing joke in places.

It always felt as if the people on the lists were found on another planet than mine – mostly just Java and .NET people. and they very rarely appeared to be developers who actually spend their days surrounded by code and programming. I suppose I’ve now given away some clues to some characteristics I think “a developer” should posses…

This year, their fifth time doing this list, they changed the way they find candidates, opened up for external nominations and had a set of external advisors. This also resulted in me finding several friends on the list that were never on it in the past.

Tonight I got called onto the stage during the little award ceremony and I was handed this diploma and recognition for landing at second place in the best developer in Sweden list.

img_20161201_192510

And just to keep things safe for the future, this is how the listing looks on the Swedish list page:

2nd-best-developer-2016

Yes I’m happy and proud and humbled. I don’t get this kind of recognition every day so I’ll take this opportunity and really enjoy it. And I’ll find a good spot for my diploma somewhere around the house.

I’ll keep a really big smile on my face for the rest of the day for sure!

best-dev-2016

(Photo from the award ceremony by Emmy Jonsson/IDG)

Update

The winner was Joel Ambrahansson, in the middle on the photo above, and on third place and on the right in the photo is Mina Nakicenovic.

curl security audit

“the overall impression of the state of security and robustness
of the cURL library was positive.”

I asked for, and we were granted a security audit of curl from the Mozilla Secure Open Source program a while ago. This was done by Mozilla getting a 3rd party company involved to do the job and footing the bill for it. The auditing company is called Cure53.

good_curl_logoI applied for the security audit because I feel that we’ve had some security related issues lately and I’ve had the feeling that we might be missing something so it would be really good to get some experts’ eyes on the code. Also, as curl is one of the most used software components in the world a serious problem in curl could have a serious impact on tools, devices and applications everywhere. We don’t want that to happen.

Scans and tests and all

We run static analyzers on the code frequently with a zero warnings tolerance. The daily clang-analyzer scan hasn’t found a problem in a long time and the Coverity once-every-few-weeks occasionally finds something suspicious but we always fix those immediately.

We have  thousands of tests and unit tests that we run non-stop on the code on multiple platforms running multiple build combinations. We also use valgrind when running tests to verify memory use and check for potential memory leaks.

Secrecy

The audit itself. The report and the work on fixing the issues were all done on closed mailing lists without revealing to the world what was really going on. All as our fine security process describes.

There are several downsides with fixing things secretly. One of the primary ones is that we get much fewer eyes on the fixes and there aren’t that many people involved when discussing solutions or approaches to the issues at hand. Another is that our test infrastructure is made for and runs only public code so the code can’t really be fully tested until it is merged into the public git repository.

The report

We got the report on September 23, 2016 and it certainly gave us a lot of work.

The audit report has now been made public and is a very interesting work if you’re into security, C code and curl hacking. I find the report very clear, well written and it spells out each problem very accurately and even shows proof of concept code snippets and exploit examples to drive the points home.

Quoted from the report intro:

As for the approach, the test was rooted in the public availability of the source code belonging to the cURL software and the investigation involved five testers of the Cure53 team. The tool was tested over the course of twenty days in August and September of 2016 and main efforts were focused on examining cURL 7.50.1. and later versions of cURL. It has to be noted that rather than employ fuzzing or similar approaches to validate the robustness of the build of the application and library, the latter goal was pursued through a classic source code audit. Sources covering authentication, various protocols, and, partly, SSL/TLS, were analyzed in considerable detail. A rationale behind this type of scoping pointed to these parts of the cURL tool that were most likely to be prone and exposed to real-life attack scenarios. Rounding up the methodology of the classic code audit, Cure53 benefited from certain tools, which included ASAN targeted with detecting memory errors, as well as Helgrind, which was tasked with pinpointing synchronization errors with the threading model.

They identified no less than twenty-three (23) potential problems in the code, out of which nine were deemed security vulnerabilities. But I’d also like to emphasize that they did also actually say this:

At the same time, the overall impression of the state of security and robustness of the cURL library was positive.

Resolving problems

In the curl security team we decided to downgrade one of the 9 vulnerabilities to a “plain bug” since the required attack scenario was very complicated and the risk deemed small, and two of the issues we squashed into treating them as a single one. That left us with 7 security vulnerabilities. Whoa, that’s a lot. The largest amount we’ve ever fixed in a single release before was 4.

I consider handling security issues in the project to be one of my most important tasks; pretty much all other jobs are down-prioritized in comparison. So with a large queue of security work, a lot of bug fixing and work on features basically had to halt.

You can get a fairly detailed description of our work on fixing the issues in the fix and validation log. The report, the log and the advisories we’ve already posted should cover enough details about these problems and associated fixes that I don’t feel a need to write about them much further.

More problems

Just because we got our hands full with an audit report doesn’t mean that the world stops, right? While working on the issues one by one to have them fixed we also ended up getting an additional 4 security issues to add to the set, by three independent individuals.

All these issues gave me a really busy period and it felt great when we finally shipped 7.51.0 and announced all those eleven fixes to the world and I could get a short period of relief until the next tsunami hits.

Mozilla’s search for a new logo

I’m employed by Mozilla. The same Mozilla that recently has announced that it is looking around for feedback on how to revamp its logo and graphical image.

It was with amusement I saw one of the existing suggestions for a new logo by using “://” (colon slash slash) the name:

mozilla-colon-slashslash-suggestion

… compared with the recently announced new curl logo:

good_curl_logo

Me being in both teams and being a general Internet protocol enthusiast I couldn’t be more happy if Mozilla would end up using a design so clearly based on the same underlying thoughts. After all,
Imitation is the sincerest of flattery as Charles Caleb Colton once so eloquently expressed it.

A workshop Monday

http workshopI decided I’d show up a little early at the Sheraton as I’ve been handling the interactions with hotel locally here in Stockholm where the workshop will run for the coming three days. Things were on track, if we ignore how they got the wrong name of the workshop on the info screens in the lobby, instead saying “Haxx Ab”…

Mark welcomed us with a quick overview of what we’re here for and quick run-through of the rough planning for the days. Our schedule is deliberately loose and open to allow for changes and adaptations as we go along.

Patrick talked about the 1 1/2 years of HTTP/2 working in Firefox so far, and we discussed a lot around the numbers and telemetry. What do they mean and why do they look like this etc. HTTP/2 is now at 44% of all HTTPS requests and connections using HTTP/2 are used for more than 8 requests on median (compared to slightly over 1 in the HTTP/1 case). What’s almost not used at all? HTTP/2 server push, Alt-Svc and HTTP 308 responses. Patrick’s presentation triggered a lot of good discussions. His slides are here.

RTT distribution for Firefox running on desktop and mobile, from Patrick’s slide set:

rtt-dist

The lunch was lovely.

Vlad then continued to talk about experiences from implementing and providing server push at Cloudflare. It and the associated discussions helped emphasize that we need better help for users on how to use server push and there might be reasons for browsers to change how they are stored in the current “secondary cache”. Also, discussions around how to access pushed resources and get information about pushes from javascript were briefly touched on.

After a break with some sweets and coffee, Kazuho continued to describe cache digests and how this concept can help making servers do better or more accurate server pushes. Back to more discussions around push and what it actually solved, how much complexity it is worth and so on. I thought I could sense hesitation in the room on whether this is really something to proceed with.

We intend to have a set of lightning talks after lunch each day and we have already have twelve such suggested talks listed in the workshop wiki, but the discussions were so lively and extensive that we missed them today and we even had to postpone the last talk of today until tomorrow. I can already sense how these three days will not be enough for us to cover everything we have listed and planned…

We ended the evening with a great dinner sponsored by Mozilla. I’d say it was a great first day. I’m looking forward to day 2!