Category Archives: Open Source

Open Source, Free Software, and similar

Metalink in curl bounty

The Metalink guys host a list of project ideas, and one of those ideas is to add metalink support to curl. I recently bumped the stakes a bit by raising the bounty by an additional 200 USD, so the offer is now 500 USD for the person or team that delivers the feature as described.

My primary motivation for doing this is that I like the metalink idea and I’d like to help make sure it gets used more widely.

curl 7.19.1

Trying hard to maintain the bimonthly release schedule we’ve kept up for quite some time now, we proudly announce the release of curl and libcurl 7.19.1.

This release includes at least 24 bug fixes and the following changes:

The Open Source Census Report

I’d never heard of the Open Source Census before I stumbled over a mention of their recent report somewhere. Their mission is to get “enterprises” to install their little client which scans computers for open source products and reports the findings back to a central server.

Anyway, their current database consists of a “mere” 2300 machines scanned but that equals a total of 314,000 open source installations. 768 different packages are identified. The top-10 found products are:

  1. firefox 84.4%
  2. zlib 65.75%
  3. xerces 61.24%
  4. wget 61.12%
  5. xalan 58.19%
  6. prototype 57.03%
  7. activation 53.01%
  8. javamail 50.15%
  9. openssl 46.45%
  10. docbook-xml 46.27%

Ok, as open source hackers and geeks, there are two things we need to do here: 1) find out how our own projects rank among the others and 2) find out how the scanning is done and thus how good it is. Thankfully all this is possible since the entire data set is downloadable for free and the client is fully open source.

find out how our own projects rank

“curl” was found on 18.19% of all computers. That makes it #81 on the list, just below virtualbox and wireshark, but immediately above jstl and busybox. This includes “All Versions” of all tools, and for curl’s sake that was 22 different versions!

I found no other project I do anything noticeable in. Subversion is at #44.

how the scanning is done

It’s quite simple. It scans for file names based on a file name pattern and then it pattern matches contents of those files. It also extracts version numbers for the files using those regex patterns. You can see the full set of patterns/rules in the XML file straight off their source code repository: project-rules.xml.
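The two-step approach described above (match on file name first, then pattern match the contents to pull out a version) can be sketched roughly like this. This is only an illustration under my own assumptions: the rule below, its field names and its regex are made up for the example and are not copied from project-rules.xml, and the real client may well work differently.

```python
import re
from fnmatch import fnmatch
from pathlib import Path

# Hypothetical rule, invented for illustration only.
# It is NOT taken from the census project's project-rules.xml.
RULES = [
    {
        "project": "curl",
        "filename_pattern": "curl*",
        # Look for a version string embedded in the (binary) file contents.
        "content_regex": re.compile(rb"curl (\d+\.\d+\.\d+)"),
    },
]


def scan(root):
    """Return (project, version, path) for each file that matches a rule."""
    findings = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        for rule in RULES:
            # Step 1: the file name must match the rule's file name pattern.
            if not fnmatch(path.name, rule["filename_pattern"]):
                continue
            try:
                data = path.read_bytes()
            except OSError:
                continue  # unreadable file, skip it
            # Step 2: the contents must match, and the match gives the version.
            match = rule["content_regex"].search(data)
            if match:
                version = match.group(1).decode()
                findings.append((rule["project"], version, str(path)))
    return findings
```

Note that a file has to pass the file name check before its contents are even looked at, so in this sketch a binary that has been renamed is never inspected.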

how good is it

With these specific patterns for binary contents they of course need special human treatment for many versions, which is of course error-prone. That could explain why the latest curl version (7.19.0) was not reported at all. It will also cause renamed tools to remain undetected.

In my particular case I would of course also like to know how much libcurl is used, but they don’t seem to check for that (I found several projects besides the curl tool that I know use libcurl).

All this said, I didn’t actually try out the client myself so I haven’t verified it for real.

ohloh vs statcvs

I’ve played a bit with statcvs lately and I generated reports for the curl repository. It turned out rather interesting (well, assuming you’re a statistics geek such as me) especially in comparison to the data and stats ohloh.net presents for the same code:

[the images have been lost in time, like tears in rain]

Executive summary:

  • I’ve done 82% of all code changes.
  • We seem to grow at roughly the same pace (both number of code lines and number of files) over the last years.
  • The lines of code per file count seems rather fixed.

Oh, that initial big bump in late 1999/early 2000 was due to a lot of “wrong” files such as configure, config.guess etc being committed and subsequently removed. It is a bit annoying to have there as it ruins the data somewhat, but I’ve not managed to fool statcvs into ignoring that part…

The NFSA 2008 went to…

The Nordic Free Software Award 2008 went to Mats Östling for programverket.org which is “a project operating with open software and open software development in the public sector. The purpose is to achieve more collaboration and more efficient IT application within the public sector”. Congratulations Mats!

The FSCONS official site (the award was handed out during that event) keeps up its tradition of being totally behind schedule and doesn’t even mention the winner yet…

I’m not sure two awards are enough to draw any conclusions from, but with Skolelinux last year and a public sector open source project this year it certainly gives a feeling for what the jury has prioritized so far.

Snaxx 19

In an attempt at making something social, to actually meet up with real-life physical people yet avoid common trivial subjects and only stay on-topic with technology, computing, work, beer and things related to that, we’re gathering at the next Snaxx on November 20th somewhere in Stockholm city, Sweden. The exact location has yet to be decided.

If you’re into technology, open source, good ales, talking about work on your spare time or possibly all of that at once – then you might just be one of us.

Welcome!

Copyleft and closed dual license ethics

There are a bunch of companies out there today that offer their products in a dual-license style, where you can download and use the GPL licensed version or buy the proprietary licensed version (often together with some kind of service deal) that you then can use without the “burden” of a GPL agreement. Well-known brands doing this include Trolltech/Qt (now Nokia), MySQL (now Sun), OO.o (Sun), Sleepycat (now Oracle) (Berkeley DB is not strictly GPL but still copyleft) and VirtualBox (now Sun) etc.

It’s perfectly legal for them to do this. As the company is the copyright holder of all the files, they can re-release everything under whatever license they want at their own discretion. The condition is of course that they are in fact copyright holders of everything, or that the parts they don’t hold the copyright for are either under a sufficiently liberal license or can be relicensed through a similar deal bought from the third party lib authors.

It kills contributions from non-employees, since writing a large chunk of code for these guys means that you hand over copyright to a company whose entire business idea is to convert that code to a proprietary license and make money from it, in a way you cannot do yourself: they can turn the GPL code into proprietary goods and you cannot. This may be a clue to why MySQL has fewer community contributors. The forced assignment of copyright to a company could very well also be a contributing factor to OO.o’s problems attracting developers.

Companies “hide” the truth about this and try talking customers into the proprietary license. I’ve worked a bit with Qt and the wording they have used has often given companies the impression that they have to pay for the proprietary licensed version to be allowed to use the product in a commercial product. I’ve had to explain to several customers that as long as they just adhere to the GPL they can use the free version just fine without paying anything. Trolltech also has this dubious condition tied to their commercial license: “The Commercial license does not allow the incorporation of code developed with the Open Source Edition of Qt into a commercial product.”[*] Needless to say, this will prevent companies from trying the open source licensed route first. I’m curious if they even have the legal right to make that claim.

This keeps competitors at arm’s length of course, since no other company can take the code and conduct business the same way. Of course this is part of the reason why they gladly adopt the GPL for this. Lots of actions by these companies make me feel that they aren’t real and true open source believers, but that they use this label a lot for marketing and for making sure competitors can’t do the same as they do.

The GPL version comes without support for customers, in another push to drive them to pay for the proprietary license instead. Of course, it being open source lets companies going the GPL route fix their own problems since they have the source and all, but the push towards the proprietary license also reduces how many customers will actively contribute anything back, since there’s little chance they will do anything in a project with a proprietary license. I honestly can’t see many other legitimate reasons why companies wouldn’t offer support for the GPL licensed versions.

I’ve not personally worked in any of these projects under such proprietary licenses, but I would love to hear experiences from people that have!

Obviously none of this is a problem large enough to concern users. Quite possibly that is because these companies do a good enough job and keep the GPL licensed versions of their software at a sufficiently good quality, so that no forked projects appear that take the GPL version and run with it in a different direction. Another explanation could be that there are good enough alternative projects to go with if you’re not happy with one of these dual-licensed ones.

A little related anecdote, told to me by a MySQL employee (whose name shall remain untold): he described how they still haven’t implemented a feature in MySQL that many people have requested, since they, according to him, don’t want to cram more stuff into the existing branch but instead are releasing it in the next major release (due for release in 4-6 months or similar). In the next sentence he explained how they already have it implemented in the closed version for at least one paying customer… Any (other) true open source project would’ve made that change available as a patch/branch in the GPL version for the public.

I’m pretty sure I would personally release my patches as open source only, if I were to change any code for any of these products. But yeah, that would mean that they would never get incorporated into their “real” products…

Nordic Free Software Award Nominee 2008

It seems I’m again (as I was last year) nominated for the Nordic Free Software Award.

They list thirteen nominees, of which four are organizations/companies. I’m proud to be mentioned in such swell company.

Unfortunately I cannot be present at the FSCONS itself this year (where the award is being handed over), so all the partying and celebrating the award winner will have to be done without me! 🙂

Another curl scan shows work to do

The nice guys at Coverity did a new scan on curl (the 7.19.0 source code) and they dug up a bunch of new flaws. The previous version they checked was 7.16.1, released some 20 months before. The new findings are not only because of how the code has changed in the meantime, but it seems their scanner has improved a bit since the last time as well!

Here’s a sample view of how libcurl might dereference a NULL pointer with a step-by-step explanation of the conditions that lead to the flaw:

They identify 22 flaws and I found it interesting to compare the top list of bad functions as reported by Coverity with the complexity list I showed the other day. First we need to ignore the 9 flaws Coverity found in the ‘curl’ tool code (i.e. not within the library). Then the 10 remaining functions with flaws marked by Coverity are:

  • Curl_getinfo (4 flaws, all the other ones have one each)
  • Curl_cookie_add (present in the complexity top-10 table)
  • FormAdd (present in the complexity top-10 table)
  • parsedate
  • ftp_parse_url_path
  • tftp_do
  • resolve_server
  • curl_easy_pause
  • add_closure
  • Curl_connect

See? Only two of them were present in that list. The Coverity tool does in fact also count the complexity for each function, and while it doesn’t match the values pmccabe shows exactly, they seem to agree in general about which functions are the most complex ones.
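For the number-minded, the tally above can be double-checked with a few lines of Python. This is just a sketch using the figures from this post; the variable names are my own, and the only thing known here about the complexity top-10 is which two of these functions appear in it.

```python
# Flaw counts per libcurl function, as listed above.
library_flaws = {
    "Curl_getinfo": 4,
    "Curl_cookie_add": 1,
    "FormAdd": 1,
    "parsedate": 1,
    "ftp_parse_url_path": 1,
    "tftp_do": 1,
    "resolve_server": 1,
    "curl_easy_pause": 1,
    "add_closure": 1,
    "Curl_connect": 1,
}

# The functions marked above as also present in the complexity top-10 table.
in_complexity_top10 = {"Curl_cookie_add", "FormAdd"}

total_flaws = 22  # everything Coverity reported
tool_flaws = 9    # flaws in the 'curl' tool code, not the library

# Flaws left for the library, counted two ways.
library_total = sum(library_flaws.values())
expected_library_total = total_flaws - tool_flaws

# Functions that show up in both lists.
overlap = sorted(in_complexity_top10 & set(library_flaws))

print(library_total, expected_library_total)
print(len(library_flaws), overlap)
```

Both ways of counting give the same number of library flaws, spread over 10 functions, and the overlap with the complexity list is just those two functions.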

Ok, now let’s go work on fixing all these problems…