Changing networks with Linux

January 16th, 2015 by Daniel Stenberg

A rather long time ago I blogged about my work to better deal with changing networks while Firefox is running, and the change was then pushed for Android and I subsequently pushed the same functionality for Firefox on Mac.

Today I’ve landed yet another change, which detects network changes on Firefox OS and Linux.

Firefox Nightly screenshotAs Firefox OS uses a Linux kernel, I ended up doing the same fix for both the Firefox OS devices as for Firefox on Linux desktop: I open a socket in the AF_NETLINK family and listen on the stream of messages the kernel sends when there are network updates. This way we’re told when the routing tables update or when we get a new IP address etc. I consider this way better than the NotifyIpInterfaceChange() API Windows provides, as this allows us to filter what we’re interested in. The windows API makes that rather complicated and in fact a lot of the times when we get the notification on windows it isn’t clear to me why!

The Mac API way is what I would consider even more obscure, but then I’m not at all used to their way of doing things and how you add things to the event handlers etc.

The journey to the landing of this particular patch was once again long and bumpy and full of sweat in this tradition that seem seems to be my destiny, and this time I ran into problems with the Firefox OS emulator which seems to have some interesting bugs that cause my code to not work properly and as a result of that our automated tests failed: occasionally data sent over a pipe or socketpair doesn’t end up in the receiving end. In my case this means that my signal to the child thread to die would sometimes not be noticed and thus the thread wouldn’t exit and die as intended.

I ended up implementing a work-around that makes it work even if the emulator eats the data by also checking a shared should-I-shutdown-now flag every once in a while. For more specific details on that, see the bug.

http2 explained 1.8

January 15th, 2015 by Daniel Stenberg

I’ve been updating my “http2 explained” document every nohttp2 logow and then since my original release of it back in April 2014. Today I put up version 1.8 which is one of the bigger updates in a while:

http2 explained

The HTTP/2 Last Call within the IETF ended yesterday and the wire format of the protocol has remained fixed for quite some time now so it seemed like a good moment.

I updated some graphs and images to make them look better and be more personal, I added some new short sections in 8.4 and I refreshed the language in several places. Also, now all links mentioned in footnotes and elsewhere should be properly clickable to make following them a more pleasant experience. And page numbers!

As always, do let me know if you find errors, have questions on the content or think I should add something!

My talks at FOSDEM 2015

January 14th, 2015 by Daniel Stenberg

fosdem

Sunday 13:00, embedded room (Lameere)

Tile: Internet all the things – using curl in your device

Embedded devices are very often network connected these days. Network connected embedded devices often need to transfer data to and from them as clients, using one or more of the popular internet protocols.

libcurl is the world’s most used and most popular internet transfer library, already used in every imaginable sort of embedded device out there. How did this happen and how do you use libcurl to transfer data to or from your device?

Note that this talk was originally scheduled to be at a different time!

Sunday, 09:00 Mozilla room (UD2.218A)

Title: HTTP/2 right now

HTTP/2 is the new version of the web’s most important and used protocol. Version 2 is due to be out very soon after FOSDEM and I want to inform the audience about what’s going on with the protocol, why it matters to most web developers and users and not the last what its status is at the time of FOSDEM.

My first year at Mozilla

January 13th, 2015 by Daniel Stenberg

January 13th 2014 I started my fiMozilla dinosaur head logorst day at Mozilla. One year ago exactly today.

It still feels like it was just a very short while ago and I keep having this sense of being a beginner at the company, in the source tree and all over.

One year of networking code work that really at least during periods has not progressed as quickly as I would’ve wished for, and I’ve had some really hair-tearing problems and challenges that have taken me sweat and tears to get through. But I am getting through and I’m enjoying every (oh well, let’s say almost every) moment.

During the year I’ve had the chance to meetup with my team mates twice (in Paris and in Portland) and I’ve managed to attend one IETF (in London) and two special HTTP2 design meetings (in London and NYC).

openhub.net counts 47 commits by me in Firefox and that feels like counting high. bugzilla has tracked activity by me in 107 bug reports through the year.

I’ve barely started. I’ll spend the next year as well improving Firefox networking, hopefully with a higher turnout this year. (I don’t mean to make this sound as if Firefox networking is just me, I’m just speaking for my particular part of the networking team and effort and I let the others speak for themselves!)

Onwards and upwards!

My table tennis racket sized phone

January 13th, 2015 by Daniel Stenberg

I upgraded my Nexus 5 to a Nexus 6 the other day. It is a biiiig phone, and just to show you how big I made a little picture showing all my Android phones so far using the correct relative sizes. It certainly isn’t very far away from a table tennis racket in size now. My Android track record so far goes like this: HTC Magic, HTC Desire HD, Nexus 4, Nexus 5 and now Nexus 6.

my-androids

As shown, this latest step is probably the biggest relative size change in a single go. If the next step would be as big, imagine the size that would require! (While you think about that, I’ve already done the math: the 6 is 159.3 mm tall, 15.5% taller than the 5’s’ 137.9mm, so adding 15.5% to the Nexus 6 ends up at 184 – only 16 mm shorter than a Nexus 7 in portrait mode… I don’t think I could handle that!)

After the initial size shock, I’m enjoying the large size. It is a bit of a clunker to cram down into my left front-side jeans pocket where I’m used to carry around my device. It is still doable, but not as easy as before and it easily get uncomfortable when sitting down. I guess I need to sit less or change my habit somehow.

This largest phone ever ironically switched to the smallest SIM card size so my micro-SIM had to be replaced with a nano-SIM.

Borked upgrade procedure

Not a single non-Google app got installed in my new device in the process. I strongly suspect it was that “touch the back of another device to copy from” thing that broke it because it didn’t work at all – and when it failed, it did not offer me to restore a copy from backup which I later learned it does if I skip the touch-back step. I ended up manually re-installing my additional 100 or so apps…

My daughter then switched from her Nexus 4 to my (by then) clean-wiped 5.  For her, we skipped that broken back-touch process and she got a nice backup from the 4 restored onto the 5. But she got another nasty surprise: basically over half of her contacts were just gone when she opened the contacts app on the 5, so we had to manually go through the contact list on the old device and re-add them into the new one. The way we did (not even do) it in the 90s…

The Android device installation (and data transfer) process is not perfect yet. Although my brother says he did his two upgrades perfectly smoothly…

curl 7.40.0: unix domain sockets and smb

January 8th, 2015 by Daniel Stenberg

curl and libcurl curl dot-to-dot7.40.0 was just released this morning. There’s a closer look at some of the perhaps more noteworthy changes. As usual, you can find the entire changelog on the curl web site.

HTTP over unix domain sockets

So just before the feature window closed for the pending 7.40.0 release of curl, Peter Wu’s patch series was merged that brings the ability to curl and libcurl to do HTTP over unix domain sockets. This is a feature that’s been mentioned many times through the history of curl but never previously truly implemented. Peter also very nicely adjusted the test server and made two test cases that verify the functionality.

To use this with the curl command line, you specify the socket path to the new –unix-domain option and assuming your local HTTP server listens on that socket, you’ll get the response back just as with an ordinary TCP connection.

Doing the operation from libcurl means using the new CURLOPT_UNIX_SOCKET_PATH option.

This feature is actually not limited to HTTP, you can do all the TCP-based protocols except FTP over the unix domain socket, but it is to my knowledge only HTTP that is regularly used this way. The reason FTP isn’t supported is of course its use of two connections which would be even weirder to do like this.

SMB

SMB is also known as CIFS and is an old network protocol from the Microsoft world access files. curl and libcurl now support this protocol with SMB:// URLs thanks to work by Bill Nagel and Steve Holme.

Security Advisories

Last year we had a large amount of security advisories published (eight to be precise), and this year we start out with two fresh ones already on the 8th day… The ones this time were of course discovered and researched already last year.

CVE-2014-8151 is a way we accidentally allowed an application to bypass the TLS server certificate check if a TLS Session-ID was already cached for a non-checked session – when using the Mac OS SecureTransport SSL backend.

CVE-2014-8150 is a URL request injection. When letting curl or libcurl speak over a HTTP proxy, it would copy the URL verbatim into the HTTP request going to the proxy, which means that if you craft the URL and insert CRLFs (carriage returns and linefeed characters) you can insert your own second request or even custom headers into the request that goes to the proxy.

You may enjoy taking a look at the curl vulnerabilities table.

Bugs bugs bugs

The release notes mention no less than 120 specific bug fixes, which in comparison to other releases is more than average.

Enjoy!

Can curl avoid to be in a future funnily named exploit that shakes the world?

December 15th, 2014 by Daniel Stenberg

During this year we’ve seen heartbleed and shellshock strike (and a  few more big flaws that I’ll skip for now). Two really eye opening recent vulnerabilities in projects with many similarities:

  1. Popular corner stones of open source stacks and internet servers
  2. Mostly run and maintained by volunteers
  3. Mature projects that have been around since “forever”
  4. Projects believed to be fairly stable and relatively trustworthy by now
  5. A myriad of features, switches and code that build on many platforms, with some parts of code only running on a rare few
  6. Written in C in a portable style

Does it sound like the curl project to you too? It does to me. Sure, this description also matches a slew of other projects but I lead the curl development so let me stay here and focus on this project.

cURLAre we in jeopardy? I honestly don’t know, but I want to explain what we do in our project in order to minimize the risk and maximize our ability to find problems on our own before they become serious attack vectors somewhere!

previous flaws

There’s no secret that we have let security problems slip through at times. We’re right now working toward our 143rd release during our around 16 years of life-time. We have found and announced 28 security problems over the years. Looking at these found problems, it is clear that very few security problems are discovered quickly after introduction. Most of them linger around for several years until found and fixed. So, realistically speaking based on history: there are security bugs still in the code, and they have probably been present for a while already.

code reviews and code standards

We try to review all patches from people without push rights in the project. It would probably be a good idea to review all patches before they go in for real, but that just wouldn’t work with the (lack of) man power we have in the project while we at the same time want to develop curl, move it forward and introduce new things and features.

We maintain code standards and formatting to keep code easy to understand and follow. We keep individual commits smallish for easier review now or in the future.

test cases

As simple as it is, we test that the basic stuff works. We don’t and can’t test everything but having test cases for most things give us the confidence to change code when we see problems as we then remain fairly sure things keep working the same way as long as the test go through. In projects with much less test coverage, you become much more conservative with what you dare to change and that also makes you more vulnerable.

We always want more test cases and we want to improve on how we always add test cases when we add new features and ideally we should also add new test cases when we fix bugs so that we know that we don’t introduce any such bug again in the future.

static code analyzes

We regularly scan our code base using static code analyzers. Both clang-analyzer and coverity are good tools, and they help us by pointing out code that look wrong or suspicious. By making sure we have very few or no such flaws left in the code, we minimize the risk. A static code analyzer is better than run-time tools for cases where they can check code flows that are hard to repeat in my local environment.

valgrind

bike helmetValgrind is an awesome tool to detect memory problems in run-time. Leaks or just stupid uses of memory or related functions. We have our test suite automatically use valgrind when it runs tests in case it is present and it helps us make sure that all situations we test for are also error-free from valgrind’s point of view.

autobuilds

Building and testing curl on a plethora of platforms non-stop is also useful to make sure we don’t depend on behaviors of particular library implementations or non-standard features and more. Testing it all is basically the only way to make sure everything keeps working over the years while we continue to develop and fix bugs. We would course be even better off with more platforms that would test automatically and with more developers keeping an eye on problems that show up there…

code complexity

Arguably, one of the best ways to avoid security flaws and bugs in general, is to keep the source code as simple as possible. Complex functions need to be broken down into smaller functions that are possible to read and understand. A good way to identify functions suitable for fixing is pmccabe,

essential third parties

curl and libcurl are usually built to use a whole bunch of third party libraries in order to perform all the functionality. In order to not have any of those uses turn into a source for trouble we must of course also participate in those projects and help them stay strong and make sure that we use them the proper way that doesn’t lead to any bad side-effects.

You can help!

All this takes time, energy and system resources. Your contributions and help will be appreciated where ever among these tasks that you can insert any. We could do more of all this, more often and more thorough if we only were more people involved!

libcurl multi_socket 3333 days later

December 10th, 2014 by Daniel Stenberg

.SE-logoOn October 25, 2005 I sent out the announcement about “libcurl funding from the Swedish IIS Foundation“. It was the beginning of what would eventually become the curl_multi_socket_action() function and its related API features. The API we provide for event-driven applications. This API is the most suitable one in libcurl if you intend to scale up your client up to and beyond hundreds or thousands of simultaneous transfers.

Thanks to this funding from IIS, I could spend a couple of months working full-time on implementing the ideas I had. They paid me the equivalent of 19,000 USD back then. IIS is the non-profit foundation that runs the .se TLD and they fund projects that help internet and internet usage, in particular in Sweden. IIS usually just call themselves “.se” (dot ess ee) these days.

Event-based programming isn’t generally the easiest approach so most people don’t easily take this route without careful consideration, and also if you want your event-based application to be portable among multiple platforms you also need to use an event-based library that abstracts the underlying function calls. These are all reasons why this remains a niche API in libcurl, used only by a small portion of users. Still, there are users and they seem to be able to use this API fine. A success in my eyes.

One dollar billPart of that improvement project to make libcurl scale and perform better, was also to introduce HTTP pipelining support. I didn’t quite manage that part with in the scope of that project but the pipelining support in libcurl was born in that period  (autumn 2006) but had to be improved several times over the years until it became decently good just a few years ago – and we’re just now (still) fixing more pipelining problems.

On December 10, 2014 there are exactly 3333 days since that initial announcement of mine. I’d like to highlight this occasion by thanking IIS again. Thanks IIS!

Current funding

These days I’m spending a part of my daytime job working on curl with my employer’s blessing and that’s the funding I have – most of my personal time spent is still spare time. I certainly wouldn’t mind seeing others help out, but the best funding is provided as pure man power that can help out and not by trying to buy my time to add your features. Also, I will decline all (friendly) offers to host the web site on your servers since we already have a fairly stable and reliable infrastructure sponsored.

I’m not aware of anyone else that are spending (much) paid work time on curl code, although I’m know there are quite a few who do it every now and then – especially to fix problems that occur in commercial products or services or to add features to such.

IIS still donates money to internet related projects in Sweden but I never applied for any funding from them again. Mostly because it has been hard to sync with my normal life and job situation. If you’re a Swede or just live in Sweden, do consider checking this out for your next internet adventure!

Why curl defaults to stdout

November 17th, 2014 by Daniel Stenberg

(Recap: I founded the curl project, I am still the lead developer and maintainer)

When asking curl to get a URL it’ll send the output to stdout by default. You can of course easily change this behavior with options or just using your shell’s redirect feature, but without any option it’ll spew it out to stdout. If you’re invoking the command line on a shell prompt you’ll immediately get to see the response as soon as it arrives.

I decided curl should work like this, and it was a natural decision I made already when I worked on the predecessors during 1997 or so that later would turn into curl.

On Unix systems there’s a common mantra that “everything is a file” but also in fact that “everything is a pipe”. You accomplish things on Unix by piping the output of one program into the input of another program. Of course I wanted curl to work as good as the other components and I wanted it to blend in with the rest. I wanted curl to feel like cat but for a network resource. And cat is certainly not the only pre-curl command that writes to stdout by default; they are plentiful.

And then: once I had made that decision and I released curl for the first time on March 20, 1998: the call was made. The default was set. I will not change a default and hurt millions of users. I rather continue to be questioned by newcomers, but now at least I can point to this blog post! :-)

About the wget rivalry

cURLAs I mention in my curl vs wget document, a very common comment to me about curl as compared to wget is that wget is “easier to use” because it needs no extra argument in order to download a single URL to a file on disk. I get that, if you type the full commands by hand you’ll use about three keys less to write “wget” instead of “curl -O”, but on the other hand if this is an operation you do often and you care so much about saving key presses I would suggest you make an alias anyway that is even shorter and then the amount of options for the command really doesn’t matter at all anymore.

I put that argument in the same category as the people who argue that wget is easier to use because you can type it with your left hand only on a qwerty keyboard. Sure, that is indeed true but I read it more like someone trying to come up with a reason when in reality there’s actually another one underneath. Sometimes that other reason is a philosophical one about preferring GNU software (which curl isn’t) or one that is licensed under the GPL (which wget is) or simply that wget is what they’re used to and they know its options and recognize or like its progress meter better.

I enjoy our friendly competition with wget and I seriously and honestly think it has made both our projects better and I like that users can throw arguments in our face like “but X can do Y”and X can alter between curl and wget depending on which camp you talk to. I also really like wget as a tool and I am the occasional user of it, just like most Unix users. I contribute to the wget project well, both with code and with general feedback. I consider myself a friend of the current wget maintainer as well as former ones.

Keyboard key frequency

November 12th, 2014 by Daniel Stenberg

A while ago I wrote about my hunt for a new keyboard, and in my follow-up conversations with friends around that subject I quickly came to the conclusion I should get myself better analysis and data on how I actually use a keyboard and the individual keys on it. And if you know me, you know I like (useless) statistics.

Func KB-460 keyboardSo, I tried out the popular and widely used Linux key-logger software ‘logkeys‘ and immediately figured out that it doesn’t really support the precision and detail level I wanted so I forked the project and modified the code to work the way I want it: keyfreq was born. Code on github. (I forked it because I couldn’t find any way to send back my modifications to the upstream project, I don’t really feel a need for another project.)

Then I fired up the logging process and it has been running in the background for a while now, logging every key stroke with a time stamp.

Counting key frequency and how it gets distributed very quickly turns into basically seeing when I’m active in front of the computer and it also gave me thoughts around what a high key frequency actually means in terms of activity and productivity. Does a really high key frequency really mean that I was working intensely or isn’t that purpose more a sign of mail sending time? When I debug problems or research details, won’t those periods result in slower key activity?

In the end I guess that over time, the key frequency chart basically says that if I have pressed a lot of keys during a period, I was working on something then. Hours or days with a very low average key frequency are probably times when I don’t work as much.

The weekend key frequency is bound to be slightly wrong due to me sometimes doing weekend hacking on other computers where I don’t log the keys since my results are recorded from a single specific keyboard only.

Conclusions

So what did I learn? Here are some conclusions and results from 1276614 keystrokes done over a period of the most recent 52 calendar days.

I have a 105-key keyboard, but during this period I only pressed 90 unique keys. Out of the 90 keys I pressed, 3 were pressed more than 5% of the time – each. In fact, those 3 keys are more than 20% of all keystrokes. Those keys are: <Space>, <Backspace> and the letter ‘e’.

<Space> stands out from all the rest as it has been used more than 10%.

Only 29 keys were used more than 1% of the presses, giving this a really long tail with lots of keys hardly ever used.

Over this logged time, I have registered key strokes during 46% of all hours. Counting only the hours in which I actually used the keyboard, the average number of key strokes were 2185/hour, 36 keys/minute.

The average week day (excluding weekend days), I registered 32486 key presses. The most active sinngle minute during this logging period, I hit 405 keys. The most active single hour I managed to do 7937 key presses. During weekends my activity is much lower, and then I average at 5778 keys/day (7.2% of all activity were weekends).

When counting most active hours over the day, there are 14 hours that have more than 1% activity and there are 5 with less than 1%, leaving 5 hours with no keyboard activity at all (02:00- 06:59). Interestingly, the hour between 23-24 at night is the single most busy hour for me, with 12.5% of all keypresses during the period.

Random “anecdotes”

Longest contiguous time without keys: 26.4 hours

Longest key sequence without backspace: 946

There are 7 keys I only pressed once during this period; 4 of them are on the numerical keypad and the other three are F10, F3 and <Pause>.

More

I’ll try to keep the logging going and see if things change over time or if there later might end up things that can be seen in the data when looked over a longer period.