Category Archives: Development

Cure coming for Wrap Rage?

That phenomenon you thought you were alone in experiencing, the rage and anger you feel when you’ve bought some new toy and it comes packaged in tight and nearly impenetrable plastic that demands a fair amount of violence and persistence to crack: it’s called Wrap Rage.

I’ve been told the packages (called blister packs or clam shells) are designed this way to show off the merchandise while at the same time preventing theft: it is hard for a customer to just extract something out of those things in your typical physical store.

Amazon’s Frustration-Free Packaging initiative is indeed a refreshing take on this and apparently an attempt to reverse the trend. Online stores really have no good reason to use this kind of armor around products, since there’s no risk of shoplifting. I wish others would follow, to make the manufacturers realize that there is a market for this. It has to be done by the manufacturers themselves; the stores cannot be expected to repackage products, due to warranties and whatnot.

It wouldn’t surprise me if you could even find cheaper ways to package products once you let go of some of the requirements that no longer apply for online stores. Visibility of the product once packaged is another thing that is pointless for online stores but, I would expect, very important to sales in physical stores. I’ve always thought it pretty pointless and expensive that every single package is made to double as a display model, there to attract customers to buy it. When you buy the thing online it’s no longer just pointless, it’s plain stupid.

Imagine a future when you can just open your new toy without getting bruises or scratch marks!

My million users

I’ve been working professionally with computers since 1991 and specifically as a developer since 1993. I’ve written one or two lines of code since then. How many users could there be out there using something that includes my code?

Open source

I’ve participated in a wide range of open source projects, so of course all direct users of those projects would count: curl, Rockbox and let’s include Subversion and others. I would guess that there are at least one million users of curl, quite likely even more of Subversion, and Rockbox may also reach a million users or so. It’s of course impossible to know for sure…

Lots of open source projects use libraries that I work on now or have worked on in the past, primarily libcurl and c-ares: Boinc, git, Bazaar, darcs and more. Millions of users, no doubt (Boinc alone has some 1.5 million users). The OLPC’s XO laptop comes with (lib)curl. I think most Linux distros these days ship with curl installed; how many Linux installations are there? libcurl is also rather popular when used from PHP, and there are many, many million PHP installations out there. I have code in wget too, also used by millions.

Closed source users of open source I’ve participated in

Adobe Acrobat Reader (for non-Windows platforms), Adobe’s Flash Player and various other Adobe products, Second Life, Google Earth and others. They’re bound to have several million users between them. curl is also included in Mac OS X.

There are also a lot of devices that use libcurl that are even harder to track: SanDisk makes MP3 players that use libcurl, Sony makes a video device that does, and Tilgin, Neuros and others make IPTV devices that do. libcurl is used in several “installers”, such as the one AOL provides for a specific router. There are many corporate users.

Closed source stuff I’ve worked with on my day-job

… is of course also used widely and all over, but being an embedded guy I mostly work on software in products, and most of the products I’ve worked on have been for various niche markets where I have little or no knowledge of how much the products (and thus my code) are actually used. I’ve left my fingerprints on several networking products, IPTV/digital TV set-top boxes, railroad equipment, a car ignition tester, 3G/telecom switches, RFID receivers, laser-based positioning systems and more.

How many millions?

Ok, let’s for the sake of the argument say that there are somewhere around 100 million devices out there with code from me included – I really have no idea how to make a sensible estimate. Let’s for simplicity also say that there are 100 million users of these devices. I would also guess that about half of the world’s population isn’t anywhere near using devices I may have programmed. Thus, if you use “devices” in general, there’s a probability of roughly 100 million/3 billion = 1/30 that you’re using something that includes code I’ve worked on…

In fact, that number is then valid for any random “device” user – but if you’re reading this on my blog I don’t expect you to be very random, rather a fairly specialized person, and then I would say the likelihood of you having at least something with my code in it is pretty much 100%…

Where would you say my biggest weaknesses in this reasoning are?

Metalink in curl bounty

The Metalink guys host a list of project ideas, and one of those ideas is to add metalink support to curl. I recently upped the stakes a bit by raising the bounty by an additional 200 USD, so the offer is now 500 USD for the person or team that delivers the feature as described.

My primary motivation for doing this is that I like the metalink idea and I’d like to help make sure it gets used more widely.

The Open Source Census Report

I’d never heard about the Open Source Census before I stumbled over a mention of their recent report somewhere. Their mission is to get “enterprises” to install their little client, which scans computers for open source products and reports the findings back to a central server.

Anyway, their current database consists of a “mere” 2,300 scanned machines, but that amounts to a total of 314,000 open source installations, with 768 different packages identified. The top 10 products found are:

  1. firefox 84.4%
  2. zlib 65.75%
  3. xerces 61.24%
  4. wget 61.12%
  5. xalan 58.19%
  6. prototype 57.03%
  7. activation 53.01%
  8. javamail 50.15%
  9. openssl 46.45%
  10. docbook-xml 46.27%

Ok, being an open source hacker and a geek, I figure there are two things we need to do here: 1) find out how our own projects rank among the others and 2) find out how the scanning is done and thus how good it is. Thankfully both are possible, since the entire data set can be downloaded for free and the client is fully open source.

find out how our own projects rank

“curl” was found on 18.19% of all computers. That makes it #81 on the list, just below virtualbox and wireshark, but immediately above jstl and busybox. This covers “All Versions” of each tool, and in curl’s case that means 22 different versions!

I found no other project I do anything noticeable in. Subversion is at #44.

how the scanning is done

It’s quite simple: the client looks for files matching a set of file name patterns and then pattern matches the contents of those files, extracting version numbers with regexes along the way. You can see the full set of patterns/rules in the XML file straight off their source code repository: project-rules.xml.
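To give a feel for what such a rule boils down to in practice, here’s a minimal sketch in C of that kind of check: read a candidate file and run a version-extracting regex over its contents. The pattern, buffer size and file handling here are my own made-up illustration, not anything taken from project-rules.xml or the actual client.

/* Minimal sketch of a content-based version check: read a file and
 * apply a regex with a capture group for the version number.
 * The "curl X.Y.Z" pattern below is purely illustrative. */
#include <stdio.h>
#include <stdlib.h>
#include <regex.h>

int main(int argc, char **argv)
{
  if(argc < 2) {
    fprintf(stderr, "usage: %s <file>\n", argv[0]);
    return 1;
  }

  FILE *f = fopen(argv[1], "rb");
  if(!f) {
    perror("fopen");
    return 1;
  }

  /* read at most 1 MB and NUL-terminate; regexec() works on C strings,
     so anything after an embedded NUL in a binary is ignored here */
  char *buf = malloc(1024*1024 + 1);
  if(!buf) {
    fclose(f);
    return 1;
  }
  size_t n = fread(buf, 1, 1024*1024, f);
  buf[n] = '\0';
  fclose(f);

  /* hypothetical rule: find "curl 7.19.0"-style version strings */
  regex_t re;
  regmatch_t m[2];
  regcomp(&re, "curl ([0-9]+\\.[0-9]+\\.[0-9]+)", REG_EXTENDED);

  if(regexec(&re, buf, 2, m, 0) == 0)
    printf("found version %.*s\n",
           (int)(m[1].rm_eo - m[1].rm_so), buf + m[1].rm_so);
  else
    printf("no match\n");

  regfree(&re);
  free(buf);
  return 0;
}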

how good is it

With specific patterns for binary contents like this, many versions of course need special human treatment, and that is error-prone. It could explain why the latest curl version (7.19.0) wasn’t reported at all. It will also cause renamed tools to remain undetected.

In my particular case I would of course also like to know how much libcurl is used, but they don’t seem to check for that (I found several projects besides the curl tool that I know use libcurl).

All this said, I didn’t actually try out the client myself so I haven’t verified it for real.

strcasecmp in Turkish

A friendly user submitted the (lib)curl bug report #2154627 which identified a problem with our URL parser. It doesn’t treat “file://” as a known protocol if the locale in use is Turkish.

This was the beginning of a minor world-moving revelation for me. Of course this is already known to mankind and I’m just behind, but really: lots of my fellow hacker friends had no idea either.

So “file” and “FILE” do not compare equal case-insensitively in a Turkish locale, because there ‘i’ is not the lowercase version of ‘I’: Turkish has both a dotless ‘ı’ (whose uppercase is ‘I’) and a dotted ‘i’ (whose uppercase is ‘İ’).

Back to strcasecmp: POSIX pretty much makes the function useless by saying that “The results are unspecified in other locales [than POSIX]”.

I’m a bit annoyed by this fact, as now I have to introduce my own function (which thus cannot use tolower() or toupper(), since they are also affected by the locale) and use that instead, because the strings in our code are clearly “English” strings where file and FILE truly are the same string when compared case-insensitively…
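For illustration, here’s a minimal sketch of such a locale-independent comparison, restricted to plain ASCII case folding. The names and details are mine for this example, not the function that actually went into curl:

/* A minimal locale-independent, ASCII-only case-insensitive comparison
 * sketch: it folds A-Z to a-z by arithmetic and never consults the
 * locale, so "file" == "FILE" even with LC_CTYPE set to Turkish. */
#include <stdio.h>

static int raw_tolower(int c)
{
  return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* returns non-zero if the two strings are equal, ignoring ASCII case */
static int raw_caseequal(const char *a, const char *b)
{
  while(*a && *b) {
    if(raw_tolower((unsigned char)*a) != raw_tolower((unsigned char)*b))
      return 0;
    a++;
    b++;
  }
  return *a == *b; /* equal only if both strings end here */
}

int main(void)
{
  printf("file vs FILE: %d\n", raw_caseequal("file", "FILE")); /* 1 */
  printf("file vs ftp : %d\n", raw_caseequal("file", "ftp"));  /* 0 */
  return 0;
}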

So THAT is the point of releases!

In the Rockbox project we’ve been using a rather sophisticated build system for many years that provides updated binary packages to the public after every single commit. We also automatically provide daily-built zips, manuals, fonts and other extras straight off the Subversion server.

I used to be in the camp that thought this is such a good system that it makes ordinary version-numbered releases somewhat unnecessary, since everyone can easily get recent downloads whenever they want anyway. We also had a general problem just getting a release done.

But as you all know by now, we shipped Rockbox 3.0 the other day. And man did it hit the news!

lifehacker.com, gizmodo.com, engadget.com, slashdot.org, golem.de, boingboing.net, reddit.com and others helped bring our web server to a crawl. In the 4 days following the release, we got roughly 160,000 more visits to our site than usual, 5 times the normal amount (200,000 visits compared to the “normal” 40,000).

Of course, as a pure open source project with no company or money involved anywhere, we don’t exactly need new users, but we do want more developers, and hopefully we reach a few new potential contributors when we become known to a larger number of people.

So I’m now officially convinced: doing this release was a good thing!

Shared Dictionary Compression over HTTP

Wei-Hsin Lee of Google posted about their effort to create a dictionary-based compression scheme for HTTP. I find the idea rather interesting, and it’ll be fun to see what the actual browser and server vendors will say about this.

The idea is basically to use “cookie rules” (domain, path, port number, max-age etc) to make sure a client gets a dictionary, and then the server can deliver responses that are diffs computed against the dictionary it has previously delivered to that client. For repeatedly served, similar content it should be able to achieve far better compression ratios than any existing HTTP compression in use.

I figure it should be seen as a relative to the “Delta encoding in HTTP” idea, although the SDCH idea seems somewhat more generically applicable.

Since they seem to be using the VCDIFF algorithm for SDCH, the recent open-vcdiff announcement of course is interesting too.

popen() in pthreaded program confuses gdb

I just thought I’d share a lesson I learned today:

I’ve been struggling for a long time at work with a gdb problem. When I set a breakpoint and then single-step from that point, it sometimes (often) decides to act as if I had done ‘continue’ and not ‘next’. It is highly annoying and makes debugging nasty problems really awkward.

Today I searched around on the topic, and after some experiments I can now confirm: if I remove all uses of popen(), the problem goes away! I found posts indicating that forking can confuse debugging of threaded programs, and since this program at work uses threads, I could immediately see that it uses both popen() and system(), both of which use fork() internally. (And yes, my removal of popen() also got rid of the system() calls.)
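For reference, here’s a minimal sketch of the combination described above: a pthread-using program that calls popen() from a thread. Whether it actually triggers the misbehaving single-step will of course depend on your gdb and glibc setup; this is just the shape of the code that bit me, not my actual work code.

/* Build with: gcc -g -pthread repro.c
 * Set a breakpoint in worker() and single-step across the popen()
 * call (which fork()s internally) to try to provoke the behavior. */
#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
  (void)arg;
  FILE *p = popen("echo hello from a child", "r"); /* fork() happens here */
  if(p) {
    char line[128];
    while(fgets(line, sizeof(line), p))
      fputs(line, stdout);
    pclose(p);
  }
  return NULL;
}

int main(void)
{
  pthread_t t;
  pthread_create(&t, NULL, worker, NULL);
  pthread_join(t, NULL);
  return 0;
}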

And now I can finally debug my crappy code again to become less crappy!

My work PC runs glibc 2.6.1, gcc 4.1.3 and gdb 6.6. But I doubt the specific versions matter much.

Standardized cookies never took off

David M. Kristol is one of the authors of RFC2109 and RFC2965, “HTTP State Management Mechanism”. RFC2109 is also known as the first attempt to standardize how cookies should be sent and received. Prior to that document, the only cookie spec was the very brief document released by Netscape in the old days and it certainly left many loose ends.

Mr Kristol has published a great, long document, HTTP Cookies: Standards, Privacy, and Politics, about the slow and eventually dwindling story of how the cookie standardization work within the IETF came about and how it proceeded.

Still today, none of those documents are used very much. The original Netscape way is still how cookies are done, and even though a lot of good will and great effort were spent on doing things right in these RFCs, I can’t honestly say that I see anything on the horizon that will push the web world towards changing cookies to become compliant.