Tag Archives: Open Source

Open source personal

I participate in a range of different open source projects. Of course I spend more time on some of them and only a very little time in most of them, but I’m currently listed as member of 18 projects on sourceforge and 16 on ohloh and I can easily figure out a bunch more than aren’t listed on either of those sites.

I’m just the kind of guy who tend to actually get the code and write up a patch for problems, and in fact also in many cases I’ll write an fresh application and publish it openly for the world (not that my typical programs get any particularly large audience but still). I’m not saying everyone has to be like this, I’m just describing me here.

It seems this is a troublesome concept for people to grasp.

I get a large amount of private mail where people talk about “your project” (as in a single one that I am supposed to understand which one they’re referring to) and just about all open source-related interview/questionnaire things I’ve filled in tend to assume My One Single Project. In the first case I can often guess which one they refer to by the phrasing of the mail, and in the second case I tend to answer for the project I’m involved the most in.

So I get this feed of private emails on projects I participate in, but I don’t like private emails about open source projects when people request and expect free support and help. If they want free support, I expect the people to post the questions publicly and open to allow others to reply and read both the question and the subsequent answer online, right there at the time they’re asked but also much later when searching for help on the same subject as then the answers will be around in mailing list archives etc.

These days I have a blanket reply form that I bounce back when I get private support mails and I will admit that most people respect that after having been told about the situation. Every now and then of course I get a violent refusal for sympathy and instead I get to learn I’m an arrogant bastard. This is also related to the fact that:

We (Haxx) run and offer commercial support around curl and libcurl, and for that purpose we have a dedicated support email address. Mail there if you’re willing to pay for support. That’s actually quite clearly spelled out everywhere where that address is displayed, but yet people seem to find that a good place to mail random questions and bug reports. Just today I got a very upset mail response after I mentioned the “paid support” part of the deal there expecting us (me?) to instantly fix bugs regardless since I’ve been told about them per email…

All in all, I’m not really complaining since I’m generally getting along fine with everyone and stuff around this.

Just everyone try to keep things apart: the projects, the people and the companies. They’re sometimes intertwined but sometimes not.

Open source in my day job

From people in the open source community and then especially friends and fellow hackers in projects I am involved in, I sometimes get questions on how my open source participations affect my “real job”.

I work as a consultant during the days and I do contract development for hire. I’ve been a consultant since 1996 and during this time I’ve participated in more than 25 “full-time” projects for almost as many customers.

I contribute to numerous open source projects and I’ve done it for many years. I lead and maintain several open source projects. I’ve committed many thousands of times to public source code repositories.

Does the contributions make me more attractive to potential customers? Not particularly is my rather sad experience. While some of my customers notice my track record (my CV does of course mention my most notable contributions) most of my day-job clients focus solely on other paid projects I’ve done and that exact technologies and products I worked with and created in the past. It may of course not be too strange as things I do and get paid for is then potentially “good enough for someone to pay me for” while the stuff I do for free in open source projects are… well, not paid for and thus it can’t be qualified by that ruler.

Do I get new assigments thanks to my open source projects? Yes, I do, but usually they tend to be on the smallish side and not of the bigger kinds my regular assignments at work are.

And the reality is of course also that the vast vast vaaast majority of all software consultants that people hire to do development have no public record of open source involvement so it could just be a result of that this is so rare the customers never had a reason to learn or adapt to using open source contributions as a “factor”.

Or maybe I’m just ignorant and haven’t figured out how my customers truly work.

Do I work with open source in my day job? Yes almost exclusively. I’ve been working with Linux in various embedded systems basically the last 8 years and working with Linux systems pretty much implies a wide range of open source development tools as well.

4 ohloh improvements I’d like

I am a stats junkie so I like my stats in large amounts. But I like the stats to be right and as accurate as possible, and when I look at what ohloh produces I like the concepts and ideas in general, I just think their implementation is lacking in a few vital areas that need improvement:

1. There are no dependencies or hierarchies between packages, so “I use this” counters get worthless since people mark end-user packages they use. Low-level support packages and libraries that are used indirectly don’t get many “use counts”

2. Doing very few commits in a very well used project with few authors gives you way way more points than doing a bus-load of commits in something less used with many fellow contributors. This makes the top-list of people very skewed as some of the top-64 people only did a few hundred commits ever. I doubt many mortals would consider someone who only ever did 300 commits to be a top community person. At the very moment I write this, the #1 ranked person has done 20 commits during 5 months…!

3. Too few versioning systems are supported, leaving out huge chunks of the open source world. Bazaar, mercurial and a few more are a bit too popular to be ignored without the results getting skewed.

4. I’d like to see the “number of users” of products as a percentage, as the total number of users they show include all contributors to all projects. Out of the 140,000 users (which undoubtedly include a lot of duplicates), it would surprise me if more than 10,000 have actually registered what products they use. I’ve tried to find the exact number but I failed. So 3,000 users don’t mean 3,000 out of 140,000 but 3,000 out of 10,000…

My Firefox Add-ons

I simply need to have this list somewhere so that I can find out my own add-ons again when I’m running Firefox away from home!

Adblock Plus – since ads are too annoying these days

DownThemAll – because I like to be able to get whole batches of images or similar at times

Fission – just a silly eye-candy thing

Forecastfox – I like weather forecasts!

FoxClocks – helps me keep track of the time my friends around the world have at different moments.

It’s All Text – makes web based editing/posting a more pleasurable experience by allowing me to edit such contents with emacs!

Live HTTP Headers is a must when you want to figure out how to repeat your browser’s actions with a set of curl commands.

Open in Browser allows me to open more stuff within the browser itself, even when the Content-Type is bad.

Right-Click-Link is great when you quickly want to browse to links you find in plain text sections.

Torbutton lets me quickly switch to anonymous browsing.

User Agent Switcher lets me trick stupid server-side scripts into beleiving I use a different browser or even operating system.

What great add-ons did I miss?

(Some nitpickers would say that I don’t run Firefox since I use Debian and then it is called Iceweasel, but while that is entirely true, Iceweasel is still the Firefox source code and the Add-ons are in fact still Firefox Add-ons even if they also run perfectly fine on Iceweasel.)

My million users

I’ve been working professionally with computers since 1991 and explicitly as a developer since 1993. I’ve written one or two lines of code since then. How many users could there be out there that are using something that includes my code?

Open source

I’ve participated in a wide range of open source projects, so of course all direct users of those projects would count: curl, Rockbox and let’s include subversion and others. I would guess that there are at least one million users of curl, quite likely more than so of subversion and Rockbox may also reach a million users or so. It’s of course impossible to know for sure…

Lots of open source projects use libraries that I work on now and have worked with in the past. Primarily libcurl and c-ares. Such as Boinc, git, bazaar, darcs. Millions of users, no doubt (Boinc alone has some 1.5 million users). The OLPC’s XO laptop comes with (lib)curl. I think most Linux distros these days come with curl installed. How many linux installations are there? libcurl is rather popular when used within PHP as well and there are many many million installations of PHP out there. I have code in wget, also used by millions.

Closed source users of open source I’ve participated in

Adobe acrobat reader (for non-windows platforms), Adobe’s flash player and various other Adobe products, Second life, Google Earth and others. They’re bound to have several million users. curl is included in Mac OS X.

There are also a lot of devices that use libcurl that are even harder to track: Sandisk makes mp3 players that use libcurl, Sony makes a video device that uses libcurl, Tilgin, Neuros and others make IPTV-devices that use libcurl. libcurl is used for multiple “installers” such as the one AOL provide for a specific router. There are many company users.

Closed source stuff I’ve worked with on my day-job

… is of course also used widely and all over, but me being an embedded guys I mostly work on software in products and most of the products I’ve worked within have been for various niche markets in which I have little or no knowledge about how much the products (and thus my code) are actually used. I’ve left my fingerprints on several networking products, IPTV/Digital TV settop boxes, railroad equipments, a car ignition tester, 3g/telecom switches, rfid receivers, laser-using positioning systems and more.

How many millions?

Ok, let’s for the sake of the argument say that there’s somewhere around 100 million devices with my code from me included – I really have no idea how to make a sensible estimate. Let’s for simplicity also say that there are 100 million users of these devices. I would also guess that about half of the world’s population isn’t near using devices I may have programmed. Thus, if you’re using “devices” in general there’s a probability of 3 billion/100 million = 1/30 that you’re using something that includes code that I’ve worked on…

In fact, that number is then valid for any random “device” user – if you’re reading this on my blog I don’t expect you to be very random but rather a specialized person and then I would say the likeliness of you having at least something with my code in it is almost 100% guaranteed…

Where would you say my biggest weaknesses in this reasoning are?

FNOSS hosts nordic foss blogs

There’s yet another blog aggregator on the internet now, and this time it’s fnoss.org which includes blogs from a bunch of “Nordic” (I would assume that means people from the northern parts of Europe) people writing about free software and related matters. I am one.

My blog is since previously also seen in the advogato aggregation.

This of course makes my blog get more read but like the rss feeds it also makes it harder for me to know how many readers/visitors I have since it’s all distributed. Not that this number matter very much anyway…

The Open Source Census Report

I’d never heard about the Open Source Census before when I fell over a mention of their recent report somewhere. Their mission is to get “enterprises” to install their little client which scans computers for open source products and reports the findings back to a central server.

Anyway, their current database consists of a “mere” 2300 machines scanned but that equals a total of 314,000 open source installations. 768 different packages are identified. The top-10 found products are:

  1. firefox 84.4%
  2. zlib 65.75%
  3. xerces 61.24%
  4. wget 61.12%
  5. xalan 58.19%
  6. prototype 57.03%
  7. activation 53.01%
  8. javamail 50.15%
  9. openssl 46.45%
  10. docbook-xml 46.27%

Ok, as an open source hacker and a geek, there are two things we need to do here: 1) find out how our own projects rank among the others and 2) how the scanning is done and thus how good it is. Thankfully all this is possible due to the entire data set being downloadable for free and the client being fully open source.

find out how our own projects rank

“curl” was found on 18.19% of all computers. That makes it #81 on the list, just below virtualbox and wireshark, but immediately above jstl and busybox. This includes “All Versions” of all tools, and for curl’s sake that was 22 different versions!

I found no other project I do anything noticeable in. Subversion is at #44.

how the scanning is done

It’s quite simple. It scans for file names based on a file name pattern and then it pattern matches contents of those files. It also extracts version numbers for the files using those regex patterns. You can see the full set of patterns/rules in the XML file straight off their source code repository: project-rules.xml.

how good is it

With this specific patterns for binary contents they of course need special human treatment for many versions and that is of course error-prone. That could explain why no curl version of the latest version (7.19.0) was reported. It will also cause renamed tools to remain undetected.

In my particular case I would of course also like to know how much libcurl is used, but they don’t seem to check for that (I found several projects besides the curl tool that I know use libcurl).

All this said, I didn’t actually try out the client myself so I haven’t verified it for real.

Copyleft and closed dual license ethics

There are a bunch of companies out there today that offer their products in a dual-license style, where you can download and use the GPL licensed version or buy the proprietary licensed version (often together with some kind of service deal) that you then can use without the “burden” of a GPL agreement. Popular known brands doing this include Trolltech/Qt (now Nokia), MySQL (now Sun), OO.o (Sun), Sleepycat (now Oracle) (Berkely DB is not strictly GPL but still copyleft) and VirtualBox (now Sun) etc.

It’s perfectly legal for them to do this, as the company is the copyright holder of all the files, they can just easily re-release everything under whatever license they want at their own discretion. The condition is of course that they are in fact copyright holders of everything, that the parts they don’t have copyright for are either licensed under an enough liberal license or that they can buy a similar relicense from third party lib authors.

It kills contributions from non-employees since doing a large chunk of code for these guys means that you would hand over copyright to a company whose entire business idea is to convert that to a proprietary license and make money from it. In a way you cannot do yourself since they can turn the GPL code into proprietary goods and you cannot. This may be a clue to why MySQL has less community contributors. The forced assigning of copyright over to a company could very well also be a contributing factor to OO.o’s problems to attract developers.

Companies “hide” the truth about this and try talking customers into the proprietary license. I’ve worked a bit with Qt and the wording they have used have often given companies the impression that they have to pay for the proprietary licensed version to be allowed to use the product in a commercial product. I’ve had to explain to several customers that as long as they just adhere to GPL they can use the free version just fine without paying anything. Trolltech also has this dubious condition tied to their commercial license: “The Commercial license does not allow the incorporation of code developed with the Open Source Edition of Qt into a commercial product.“[*] Needless to say, this will prevent companies from trying the open source licensed route first. I’m curious if they even have the legal right to make that claim.

This puts competitors at an arm’s distance of course since no other companies can take the code and conduct business the same way. Of course this is part of the reason why they gladly adapt GPL for this. Lots of actions by these companies make me feel that they aren’t real and true open source believers, but that they use this label a lot for marketing and for making sure competitors can’t do the same as they do.

The GPL version is without support for customers in another push to drive them to pay for the proprietary license instead of the GPL one. Of course, it being open source lets companies going the GPL route to fix their own problems since they have the source and all, but the push towards the proprietary license also narrows how many customers that will actively contribute anything back since there’s little chance they will do anything in a project with a proprietary license. I honestly can’t see many other possible legitimate reasons why companies wouldn’t do support for the GPL licensed versions.

I’ve not personally worked in any of these projects under such proprietary licenses, but I would love to hear experiences from people that have!

Obviously all this are not problems large enough to concern users. Quite possibly so because these companies do a good enough job and keep the GPL versioned versions of their software at a sufficiently good quality so that there just don’t appear any forked projects that take the GPL version and run with it in a different direction. Another explanation could be that there are good enough alternative projects to go with if you’re not happy with one of these dual-licensed ones.

A little related anecdote told to me by an MySQL employee (who’s name shall remain untold). He described how they still haven’t implemented a feature in MySQL that many people have requested, since they according to him don’t want to cram in more stuff in the existing branch but instead are releasing it in the next major release (due to release in 4-6 months or similar). In the next sentence he explained how they already have it implemented in the closed version for at least one paying customer… Any (other) true open source project would’ve made that change available as a patch/branch in the GPL version for the public.

I’m pretty sure I personally would release my patches as open source only if I would change any code for any of these products. But yeah, that would mean that they would never get incorporated into their “real” products…