Haxxup – cheap remote backup

The pain and guilty conscience that come from not having a proper backup concept in place are widely shared. I honestly don’t know anyone (and I mean it) who can say with a straight face that they have their (home, private) backups covered. We all know we should back up both locally and remotely: locally so we can recover quickly from the easy mistakes where we remove or ruin something, and remotely in case we get burgled or the house burns down.

The importance of private computer backups has only increased over time, as these days most of us have vast amounts of family pictures and videos stored as well, things that in the old days were stored (and lost) separately.

A growing problem with remote backups is of course that we all have ridiculous amounts of data to back up. Getting a commercial remote backup deal for, say, 300GB (and growing) isn’t cheap. And we’re also very often at a loss when it comes to finding a solution that works on Linux.

At Haxx, we too recognized and suffered from these problems, so we came up with a scheme to set up a distributed, networked backup among ourselves! Getting large hard drives to use locally is fairly cheap. We all have fairly good fixed-fee, no-bandwidth-limit internet connections (although admittedly the uplink speeds are lacking for those of us on typical ADSL).

We decided that among the four of us, each person gets an account on two of the others’ servers, and we can upload our backups there at our own pace and store whatever we want. We settled on two places for everyone to decrease the risk even further, for example if you urgently need to get something back while one of us has a network problem (not completely unheard of) or something else.

My current total backup is about 100GB and I have a 1mbit uplink. If I use the entire bandwidth for this, other things get a little sluggish, so I’ve capped the rsync job to 90KB/sec… My first run thus completed in roughly 13 days. Luckily I don’t add content at a very high pace, so the ordinary sync jobs from then on should be much smaller and should complete within hours. As long as I add less than ~3.5GB during a 24 hour period, it should be able to keep both remote copies in sync.
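For reference, a quick back-of-the-envelope check of those numbers (assuming decimal units, so 1 GB = 10^6 KB):

\[
\frac{100\,\mathrm{GB}}{90\,\mathrm{KB/s}} \approx 1.1\times10^{6}\,\mathrm{s} \approx 13\ \mathrm{days}
\qquad\qquad
\frac{90\,\mathrm{KB/s}\times 86400\,\mathrm{s/day}}{2\ \mathrm{remotes}} \approx 3.9\,\mathrm{GB/day}
\]

So staying below roughly 3.5GB of new data per day leaves a bit of margin while still keeping both remote copies up to date.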

curlyears plus equals one

Thirteen years ago I released the first version of curl to the world – on March 20 1998. curl is now a teenage project and there’s no slowdown or end in sight.

So what does a project like ours introduce after having existed for so long? The recent year has been full of activity in the project, and here’s a rundown of some of the stuff that has been going on:

We switched source code versioning system from CVS to git

gopher support got back into curl

support for RTMP was added

Two additional SSL libraries are now supported: PolarSSL and axTLS, making it a total of seven

70 people contributed code to the git repository, and the THANKS document now lists 854 names.

About 1000 commits were made, out of a total of almost 14000 (counted since Dec 1999), which makes this year slightly below average in terms of commit rate.

6 releases were shipped with a total of 179 bugfixes

1 security flaw was found and fixed

More than 4900 mails were posted to the curl mailing lists

We introduced a unit test system

Over 1400GB of data was downloaded from the curl web site

Towards the end of this period, 45% of our web visitors used Firefox, 23% used Chrome and less than 20% used IE. 75% were on Windows, 13% on Linux and 10% on Mac.


First time for Steering Board

Some years ago (2008 to be exact) we started the Rockbox Steering Board (RSB) in the Rockbox project: a board intended to be a core group that takes the final hard decisions when consensus is not reached among the developers, or when conflicts arise or whatever.

I was voted in as a member of the first RSB and I’ve been a member ever since. We have annual elections where we vote for 5 trusted persons to serve on the board and potentially make decisions for the project’s good.

But no real crises turned up. No discussion was so heated it wasn’t handled by the developers on the mailing list or over IRC. No decision was needed by the RSB. And time passed.

In February 2011, the first ever case for the RSB was brought to us by a member of the project who felt there was potentially some wrongdoing going on, or that something had been done against our established procedures.

The issue itself was not that easy to deal with, and it also quickly showed that all five of us RSB members are busy persons with lots of stuff going on at our own ends, so each round of discussion and decision-making took a really long time. In the end we really had to push ourselves to get a statement together and published before the pending release.

I think we did good in the end and I think we learned a little on how to do it better next time. But let’s hope it’ll take another few years until the RSB is brought out again… Thanks Jens, Marianne, Frank and Björn for a job well done!


localhost hack on Windows

There's no place like 127.0.0.1

Readers of my blog and friends in general know that I’m not really a Windows guy. I never use it and I never develop things explicitly for Windows – but I do my best to make sure my portable code also builds and runs on Windows. This blog post is about a new detail that I’ve just learned and that I think I can help shed some light on, to help my fellow hackers.

The other day I was contacted by a user of libcurl on Windows. He noticed that when he wanted to transfer data from the loopback device (where he had a service of his own) and accessed it using “localhost” in the URL passed to libcurl, he would spot a DNS request for the address of that host name, while when he used regular Windows tools he would not! After some mails back and forth, the details became clear:

Windows has its own default version of /etc/hosts (conveniently placed at “c:\WINDOWS\system32\drivers\etc\hosts” instead), and that file used to have an entry for “localhost” in it that would point to 127.0.0.1.

With the release of Windows 7, Microsoft removed the localhost entry from that hosts file. According to sources on the net, it might be related to them supporting IPv6 for real, but it’s not at all clear what the connection between those two actions would be.

getaddrinfo() on Windows has since then (it is unclear exactly at which point in time this started) been made to know about the specific string “localhost”, and is documented to always return “all loopback addresses on the local computer”.
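To illustrate what that documented behavior means in practice, here’s a minimal C sketch (my own illustration, not code from libcurl or c-ares) that asks getaddrinfo() for “localhost” and prints whatever comes back; on a Windows 7 box this should print the loopback addresses even though the hosts file has no localhost entry:

#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <arpa/inet.h>
#endif

int main(void)
{
  struct addrinfo hints, *res, *ai;

#ifdef _WIN32
  WSADATA wsa;
  WSAStartup(MAKEWORD(2, 2), &wsa); /* winsock must be initialized first */
#endif

  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;      /* ask for both IPv4 and IPv6 */
  hints.ai_socktype = SOCK_STREAM;

  if(getaddrinfo("localhost", NULL, &hints, &res) == 0) {
    for(ai = res; ai; ai = ai->ai_next) {
      char buf[INET6_ADDRSTRLEN];
      void *addr;
      if(ai->ai_family == AF_INET)
        addr = &((struct sockaddr_in *)ai->ai_addr)->sin_addr;
      else if(ai->ai_family == AF_INET6)
        addr = &((struct sockaddr_in6 *)ai->ai_addr)->sin6_addr;
      else
        continue;
      inet_ntop(ai->ai_family, addr, buf, sizeof(buf));
      printf("%s\n", buf);          /* expect 127.0.0.1 and/or ::1 */
    }
    freeaddrinfo(res);
  }

#ifdef _WIN32
  WSACleanup();
#endif
  return 0;
}

On Windows it needs to be linked with ws2_32; on other platforms it builds as-is.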

So a custom resolver such as c-ares, which doesn’t use Windows’ functions to resolve names but does it all by itself and has been made to look in the hosts file etc, now suddenly no longer finds “localhost” in a local file but ends up asking the DNS server for info about it… A situation that is far from ideal. Most DNS servers won’t have an entry for it and others might simply provide the wrong address.

I think we’ll have to give in and provide this hack in c-ares as well, just the way Windows itself does.
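Just to show the idea (a hypothetical sketch of mine, not actual c-ares code or its API): the resolver would short-circuit the name “localhost” and hand back the loopback addresses itself, before ever consulting the hosts file or sending a DNS query.

/* Hypothetical sketch of the "localhost" shortcut only - NOT actual c-ares
 * code or API. The idea: hand back the loopback addresses for "localhost"
 * directly, instead of checking the hosts file and then asking a DNS server
 * (which is what causes the stray DNS queries described above). */
#include <stdio.h>
#include <string.h>

/* fill in the loopback addresses if 'name' is "localhost"; return how many
   were stored, or 0 to tell the caller to resolve the name the normal way
   (a real implementation would compare case-insensitively) */
static size_t shortcut_localhost(const char *name,
                                 const char **addrs, size_t max)
{
  static const char *loopback[] = { "127.0.0.1", "::1" };
  size_t i, n = 0;

  if(strcmp(name, "localhost") != 0)
    return 0;

  for(i = 0; i < sizeof(loopback)/sizeof(loopback[0]) && n < max; i++)
    addrs[n++] = loopback[i];
  return n;
}

int main(void)
{
  const char *addrs[4];
  size_t n = shortcut_localhost("localhost", addrs, 4);
  size_t i;

  if(!n)
    printf("not localhost - fall through to hosts file and DNS\n");
  for(i = 0; i < n; i++)
    printf("%s\n", addrs[i]);
  return 0;
}

A real resolver would of course return proper address structs and honor the requested address family; the point is only where in the lookup order the shortcut sits.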

Oh, and as a bonus there’s even an additional hack mentioned in the getaddrinfo docs: On Windows Server 2003 and later if the pNodeName parameter points to a string equal to “..localmachine”, all registered addresses on the local computer are returned.

Today I am Chinese

Google thinks I’m Chinese

I’m going about my merry life and I use Google every day.

Today Google decided I’m in China: it redirects me to google.com.hk and shows me all the text in Chinese. It’s just more proof of how silly it is to try to use the IP address to figure out location (or, even worse, to try to guess language based on IP address).

Click on the image to get it in its full glory.

I haven’t changed anything locally, but it seems Google has updated (broken) their database somehow.

Just to be perfectly sure my browser isn’t playing any tricks behind my back, I snooped on the headers sent in the HTTP request, and there’s nothing notable:

GET /complete/search?output=firefox&client=firefox&hl=en-US&q=rockbox HTTP/1.1
Host: suggestqueries.google.com
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.6.13-1.fc13 Firefox/3.6.13
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Cookie: PREF=ID=dc410 [truncated]

Luckily, I know about the URL “google.com/ncr” (No Country Redirect) so I can still use it, but not through my browser’s search box…

Fosdem 2011: my libcurl talk on video

Kai Engert was good enough to capture all the talks in the security devroom at Fosdem 2011, and while I’m seeding the full torrent I’ve made my own talk available as a direct download from here:

Fosdem 2011: security-room at 14:15 by Daniel Stenberg

The file is about 107MB, 640×480 resolution and roughly 26 minutes of playing time, in WebM format.

libcurl, seven SSL libs and one SSH lib

I did a talk today at Fosdem with this title. The room only had 48 seats and it was completely packed, with people standing everywhere possible around the seated guys.

The English slides from my talk are below. It was also recorded on video, so I hope I’ll be able to post that once it becomes available online.

News flash! Tech terms used almost correctly!

Ok, The Social Network isn’t a new movie by any means at this time, but I happened to see it the other day. I’ll leave aside the entire story and whatever facts it did or didn’t portray in a correct manner.

But I did spot several basic technical terms used in the beginning that struck me as amazingly correctly used! The movie character Mark actually used wget to download images (at about 10:05 into the movie), and as you can see in my first screenshot, the initial keystrokes we get to see on the command line actually resemble a correct wget command line. You can click on these images to get slightly larger versions of the pics. I’m sorry I couldn’t get any higher quality ones, but I figure the point is still the same!

the-social-network-wget-cmdline

After having invoked wget, he gets, as is explained, many pictures downloaded, and what do you know: the screen output actually looks like it could’ve come from a wget run that has downloaded a couple of files:

the-social-network-wget-output

He also mentioned the terms ‘Apache’, ’emacs’ and ‘perl scripts’ in complete and correct sentences.

Where is the world heading?!

Update: Hrvoje Niksic, the original author of wget, helped out with some additional observations:

The options looked right to me, something like -r -A.jpg …

I was wondering about the historical accuracy of the progress bar, but it checks out. The movie takes place about a year and a half after the release of Wget 1.8, which added the feature. The department that takes care of these things did a good job. 🙂

Cookies and Websockets and HTTP headers

So yesterday we held a little HTTP-related event in Stockholm, arranged by OWASP Sweden. We talked a bit about cookies, websockets and recent HTTP headers.

Below are all the slides from the presentations that I, Martin Holst Swende and John Wilander did. (The entire event was done in Swedish.)

Martin Holst Swende’s talk:

John Wilander’s slides from his talk are here:

curl, open source and networking