Tag Archives: Development

cURL and libcurl, Development, Open Source

How much for a bug?

September 14, 2009 Daniel Stenberg 7 Comments

no bugs Warning: blog post with no clear conclusion!

I offer support deals to companies that want to get help with Open Source programs I’ve contributed to. The deals I’ve made so far have primarily involved libcurl, c-ares or libssh2, but that’s basically because those are projects in which I participate a lot in (and maintain) so people find me easily in relation to those projects.

I wouldn’t mind accepting service and support deals for other projects or software products either, as long as they are products I know and am fairly familiar with already and I am not scared of digging in and fixing things under the hood when that is required.

In fact, I could very well consider to offer to fix bugs in any Open Source software. Like a general: if you have a bug in an open source project that you really want fixed and you can’t do it yourself I might be your man. Of course this would be limited to some certain kinds of projects and programs, but it could still include a wide range of software. A lot more than the ones I happen to be involved in at any particular point in time.

But while “a bug” is a fairly easily defined term to a user who can’t make something work in a given program it can be anything from dead simple to downright impossible for a developer to fix. The fact that users many times cannot determine if a “bug” is hard or easy, if it’s a bug or a feature not working on purpose, makes such a business deal very hard to provide.

How to pay to get a bug fixed?

Fixed price per bug? Presumably only tricky bugs would be considered for this so it would require a fairly high fixed price. But then it’ll also never be used for simple bugs either since the fixed price would scare away such use cases. I don’t think a fixed-price scheme works very well for this. One dollar bill

Then we only have a variable price approach left. A common way for a consultant like me is to charge for my time spent on a project: I set an hourly rate, I fix the issue in N hours. I charge hourly rate * N. For smallish projects, this is less attractive to customers. If we have no previous relationship, there’s a trust issue where the customer might not just blindly accept that I worked 10 hours on a task they think sounds easy so they feel overcharged. Also, there’s the risk that I estimate the job to be 2 hours but end up spending 12. My conclusion is that per-hour pricing doesn’t work for this either.

A variable price approach based on something else than number of hours it took for me to fix the problem is therefore needed.

A bug fix is of course worth whatever someone is willing to pay for it. But we don’t know what they are prepared to pay. On the other end, a bug fix can get done by someone for the price he/she is willing to accept to get the job done. So where is the cross section of those two unknown graphs?

I don’t have the answer here. I’m very interested in feedback and suggestions though. If you would pay for a bug fix, how would you like to get the price set?

Development, Linux, Open Source

Going full-time Haxx

August 30, 2009 Daniel Stenberg

I realize not a lot of you who read my site or blog are aware of my actual real world day-job situation (nor should you have to care), but I still want to let you guys know that I’m ending my employment at CAG Contactor and my intention is to find my way forward with my own company, Haxx AB, as employee number 1.

Haxx has existed for over ten years already, but we’ve so far only used it for stuff on the side that wasn’t full-time nor competing with our day-jobs. Starting in October, I’ll now instead work only for and with Haxx.

I don’t expect much in my actual day to day business to change much as I intend to continue as a contract developer / consultant / hacker doing embedded, Linux, open source and network development as an expert and senior engineer.

So if you want my help, you can continue to contact me the same way as before, and I can offer my services like before! 😉 The only difference is in my end where I get more freedom and control.

This move on my behalf will affect some of you indirectly: I will move a lot of web and other internet-based services from servers owned and run by Contactor to servers owned by Haxx. So, expect a lot of my sites and contents to get some uptime glitches in the upcoming month in my struggle to get things up on the new place(s).

Development, Electronics, Rockbox, Web, Windows

Concepts of a new distributed build

July 5, 2009 Daniel Stenberg

It was time to make an overhaul of our distributed builds system for Rockbox. The one currently in place is quite fancy and it does build 106 builds in around 7-8 minutes, but during the years it has served us we have found a few areas where we want to improve.

The goals for the new system were primarily:

do all the builds faster
reverse the connection so that people can contribute clients easier
make a system that is more allowing for slower machines to contribute

The biggest weaknesses of the existing system:

The master uses ssh to the distributed clients, which forces them to have an accessible ssh server and port etc. It also makes it awkward for people behind NATs who wants to run more clients.
It only hands out a particular build to one client, so thus if a large build happens to get handed to a slow client towards the end of a build round, all the other clients will sit idle waiting for the last client to finish.
The build and the subsequent upload of results to the master are synchronous, so thus a client with a very slow uplink may spend a significant time on the upload before it can start the next build.

TheÂ new system is currently in development. It consists of a server that runs on one of our main servers, and there’s a client script that each volunteer contributor runs on their systems.

The clients connect to the master on a dedicated TCP port, specifying user name, password, name of the particular client instance, what particular architectures the client can build and how many bogomips the client boasts. While bogomips is a bogus way to measure anything, we’ve started out using it for a rough way to sort the the build clients based on speed.

The clients keep connected to the server all the time. There’s a ping message from the master every N second of idleness to make sure the connection is kept alive. As soon as the master wants the client to do a build, it sends a message to it detailing exactly how it should build it and using what SVN revision. The client will then do the build at once, upload the results using HTTP to a dedicated place and then tell the server the build is complete.

The server knows about all builds to do at aÂ commit, what we call a build round. It has a rough “score” or “weight” for each build that grades them in a slow to fast order. When a build round starts, the server will first sort all builds based on number of times they’ve been handed out and as secondary sort key the “weight” of it. Then it loops over the currently connected build clients and hand out builds from the sorted build table. The server then continues to do that until all clients have three builds each to build. As soon as a build is reported to have been completed by a client, that client will get the next build from the sorted build list.

If a client connects to the server and the server deems the client to be too old (since it does specify its version in the handshake message), it will be told to update to a specific version instead and come back then. This way the server can update all build clients when important things are fixed.

The clients will soon start to get assigned builds that already have been assigned to another client. This is not a problem but in fact our intention. The client that completes the build first will simply tell the server, and the server will then tell all the other clients that build that same build that they should cancel that particular build.

A client that joins the server in the middle of a build round will simply get a bunch of builds immediately and join in. A client that disconnects during a build round simply won’t complete its builds and other clients will instead do them. The system is also tolerant against the fact that bogomips is lame to compare computers with, and that the build “score” may not be very accurate or even that some server will have very slow or very fast upload speeds at unpredictable times.

The build master itself does not know when to start a new build round. It simply knows about the concept and it knows how to tell clients to complete a round. To make the master to start a new round, you need to connect to the server’s listening port and issue a special command and provide a password and then you can tell the server to start a build of a specific SVN revision. Or to queue up a build to be performed after the current one if there happens to be one in progress already.

When a full build round is complete, a hundred or so builds have been done, and full packages and log files are now in a directory on the build server, the server will simply trigger an external script that then takes care of updating our build table etc. In fact, every single completed build will optionally trigger an external script to allow web pages or stats pages to get updated as we go.

This build system is currently pretty Rockbox-specific as this is the project and development system we’re writing this for, but there’s really nothing in this that must be this way. I’m sure that if someone (you?) wants to adapt this for another project, I’d be more than happy to assist and to help ensuring that this becomes a more generic distributed build system. Just raise your hand and step forward!

At the time of this writing, (primarily) me and BjÃ¶rn are still ironing out quirks in this new system to hopefully get it going live real soon…

Electronics, Rockbox

Lyre

June 10, 2009 Daniel Stenberg

I’ve previously blogged about the initiative to build an own open hardware platform that can run Rockbox fine, and just today I noticed their new site is up and alive at:

http://lyre.sourceforge.net/

The hardware has changed quite significantly since the last blog entry of mine, and they’re now using a LPC3130 from NXP instead of the Atmel they had before, and I believe they’ve also changed codec/DAC etc. Me knowingly, Rockbox does not yet run on this newly produced board.

I should probably also add that this board is of course still quite far from being portable and there’s no news or info anywhere on how or if you can actually get one of these yourself yet.

Development, Electronics

USB converter woes

April 16, 2009 Daniel Stenberg 1 Comment

USB to rs232 converters are just never sold properly advertising what chip’s inside and right now I want to know if this one UART I’m working with perhaps is not playing fine with my existing converter cable.

I have this XScale PXA270 on a board, and it has only one full featured RS232 (FFUART) and I’m about to move things over to the lesser featured BTUART.

A theory is that my current USB converter that is based on a “Prolific PL2303” doesn’t play nicely on the serial port that isn’t a full RS232.

So I ran off and bought a new cable. I grabbed the only model I found in my local Kjell & Company store – it’s quite different looking than my existing but there’s no hint anywhere on the package or inside of it that says what chipset that empowers it.

A quick drive back home (I’m working from home in this assignment), I plugged it in and I got to see this depressingly familiar dmesg output:

usbcore: registered new interface driver usbserial
usbserial: USB Serial support registered for generic
usbcore: registered new interface driver usbserial_generic
usbserial: USB Serial Driver core
usbserial: USB Serial support registered for pl2303
pl2303 2-2.4:1.0: pl2303 converter detected
usb 2-2.4: pl2303 converter now attached to ttyUSB0
usbcore: registered new interface driver pl2303
pl2303: Prolific PL2303 USB to serial adaptor driver

So what now? I hate how (my) computers these days don’t have serial ports while the entire embedded world still very much uses them. I think I’ll go searching in my closet to see if I can find an old crap computer with a serial port to try.

Another theory is that the port simply is broken hw-wise on the dev board but that’s harder to check for me right now.

Update: it was (as usual) only my stupidity that prevented this from working. If I switch it over to the correct baudrate the usb converter does fine. But before I found that out, I did find a computer with a serial port and I did see it working on that too…

Development, Open Source

C Code Commandments

April 15, 2009 Daniel Stenberg 3 Comments

I’m an old school C programmers guy and I stay true to some of the older and commonly used rules present in many open source and similar projects. Since I sometimes rant about this to people, I thought I’d amuse my surrounding by stating them here for public use/ridicule. Of course heavily inspired by the great and superior The Ten Commandments for C Programmers. My commandments are not necessarily in any prio order.

Thy Code Shall Be Narrow

Only in very rare situations should code be allowed to be wider than 80 columns. I want my two or three windows next to each other horizontally and still see the code fine. Not to mention the occasional loading up in an editor in a 80 columns terminal and that is should be possibly print nicely (for reviews etc). Wide code is also harder to read I think, quite similarly to how very wide texts in web pages etc aren’t kind to your eyes either.

Thou Shall Not Use Long Symbol Names

To be able to keep the code easily readable by human eyes so that you quickly get an overview and understand things, you simply need to keep the function and variable names fairly short. Not to mention that the code gets harder to keep within 80 columns if you use ridiculously long names.

Comments Shall Be Plenty

Yes, this is something we know everyone says and few live up to. In statistical analyzes of my own C code I usually reach around 25-27% comments and I’m usually happy with that amount. Comments should explain what is otherwise not obvious in the code.

No Hiding What’s Really Happening

I’m not a fan of overloaded operators or snazzy macros that do fancy stuff without it being noticeable in the code. It should be clear when reading the code what it does. That’s also one of the reasons you don’t catch me doing a lot of C++ work…

Thou Shalt Hunt Down and Kill Compiler Warnings

Compiler warnings may be significant and in some cases they are not. Either way, it is our duty to silence them at all times. Firstly because it is often simpler to fix the code to not warn than to figure out if the warning is indeed right or not, but perhaps primarily because it makes it harder to see new warnings appearing if the old ones have been left there.

Write Portable Code Unless Forced by Evil

You may first believe that your code will live on forever on this single platform with this single compiler, but soon and very soon you will learn otherwise. Then you will cheer this rule as it makes you consider unaligned memory accesses, assuming byte-order of binary data or the size of your ‘long’ variable type.

Repeat Not, Use Functions

I see a lot of “copy and paste” programming in my daily life and I’ve learned that sooner or later such practices lead to sorrow. If you paste the same code on multiple places it not only makes it repetitive and boring to update it when an API or something changes, more seriously it increases the risk that you address bugs only on one out of many places or that the fix differ etc. It also makes the code larger and thus harder to follow and understand.

Thou Shalt Not Typedef Away Pointers

A really nasty habit to be seen in some source codes is when people use typedefs to define their own types that is simply a pointer to something. Like with ‘typedef struct whatever * whatever_t’. While I’m in general against excessive typedefing, I’m fine with them in many cases but not when used to hide pointers to look like “ordinary” types. It makes code harder to follow.

Defines, no fixed numbers

Code that relies on zero and non-zero can get away without this, but as soon as you start relying on more numbers in the code you must start using #defines or possibly enums to make them appear with names in the code. Using names is more clever than hardcoded numbers since you can avoid having to explain the number in a comment, and of course it’ll be easier to change the actual number in the code at a later point without it being a painful search-and-replace operation.

cURL and libcurl, Development

Some stats on curl development

January 22, 2009 Daniel Stenberg

Counting curl 6.0 and up to curl 7.19.3 we’ve done 78 releases during the 9.4 years it took.

In this time, we’ve mentioned 1259 bugfixes and 389 notable changes.

This makes one bugfix done every 2.7 days. One release done every 43rd day with an average of 16 bugfixes done in each. The longest interval ever between two curl releases was 139 days, back in 2000 when we worked to release the first version 7 release (known as 7.1).

To compare with how our work has been more recently, doing the same math limited to the 20 latest releases only (the 3.3 years since and including 7.15.0) shows that we’re still on 2.7 days per bugfix (although we know that the code base has grown steadily for years) but we’re now on 61 days between releases and 21 bugfixes/release…

All this info and more will be visible on a web page on the curl site soonish, I’m still working on polishing it up.

What other useful or useless but interesting numbers could be extracted from this?

Development, Linux, Network

10G and Direct Cache Access

December 18, 2008 Daniel Stenberg 6 Comments

As some of you might know, I currently work with a client doing 10G network stuff. 10G as in 10 gigabit/second Ethernet. That’s a lot of data. It’s actually so much data it’s hard to even generate network loads of this magnitude to be able to do good tests, as a typical server using SATA harddrives hardly fills a one gigabit pipe due to “slow” I/O: ordinary SATA drives don’t even reach 100MB/sec. You need RAID solutions or putting the entire thing in RAM first. So generating 10 gigabit network loads thus requires some extraordinary solutions.

Having a server that tries to “eat” a line speed 10G is a big challenge, and in fact we can’t do it as 1.25 GB/sec is just too much and yet we run a quad-core 3.00GHz Xeon thing here which is at least near the best “off-the-shelf” CPU/server you can get at the moment. Of course our software does a little bit more with the data than just receiving it as well.

Anyway, recently I’ve been experimenting with 10G cards from Myricom and when trying to maximize our performance with these beauties, I fell over the three-letter acronym DCA. Direct Cache Access. A terribly overused acronym consisting of often-used words make it hard to research and learn about! But here’s a great document describing some of the gory details:

Direct Cache Access for High Bandwidth Network I/O

Summary: it is an Intel technology for delivering data directly into the CPU’s cache, to reduce the bandwidth requirement to memory (note: it only decreases theÂ bandwidthÂ requirement at that moment, not the total requirement as it still needs to be read from memory into the cache, as noted in a comment below). Using this technique it should be possible to drastically reduce the time for getting the traffic. Support for this tech has been added to the Linux kernel as well since a while back.

It seems DCA is (only?) implemented in Intel’s 7300 chipset family which seems to only exist for Xeon 7300 and 7400. Too bad we don’t have one of these monsters so I haven’t been able to try this out for real yet…

Currently we can generate 10G network loads using two different approaches: one is uploading a specially crafted binary blob embedded with the FPGA image to a Xilinx-equipped board with a 10G MAC that then can do some fiddling with the packages (like increasing a counter) so that they aren’t all 100% identical. It makes a pretty good load test, even if the traffic isn’t at all shaped like the “real” traffic our product will receive. Our other approach has been even less good: upload a custom firmware to the network card and have that send the same Ethernet frame… This latter approach didn’t get better because it was a bit too complicated and badly documented on how to make a really good generator out of it. Even if I liked being able to upload custom code to my network card! 😉

Allow me to also mention that the problems with generating 10G is with small packet sizes, like 100 bytes or so as the main problem in the hardwares seem to the number of packets, not the payload part. Thus it is easier to do full line speed with 9000 bytes packets (jumbo frames) than the tiny ones we are likely to get when this product is in use by customers in the wild.

Update: this article was written in 2008. Please note that many things may have changed since then.

Development, Open Source

Open source in my day job

December 3, 2008 Daniel Stenberg

From people in the open source community and then especially friends and fellow hackers in projects I am involved in, I sometimes get questions on how my open source participations affect my “real job”.

I work as a consultant during the days and I do contract development for hire. I’ve been a consultant since 1996 and during this time I’ve participated in more than 25 “full-time” projects for almost as many customers.

I contribute to numerous open source projects and I’ve done it for many years. I lead and maintain several open source projects. I’ve committed many thousands of times to public source code repositories.

Does the contributions make me more attractive to potential customers? Not particularly is my rather sad experience. While some of my customers notice my track record (my CV does of course mention my most notable contributions) most of my day-job clients focus solely on other paid projects I’ve done and that exact technologies and products I worked with and created in the past. It may of course not be too strange as things I do and get paid for is then potentially “good enough for someone to pay me for” while the stuff I do for free in open source projects are… well, not paid for and thus it can’t be qualified by that ruler.

Do I get new assigments thanks to my open source projects? Yes, I do, but usually they tend to be on the smallish side and not of the bigger kinds my regular assignments at work are.

And the reality is of course also that the vast vast vaaast majority of all software consultants that people hire to do development have no public record of open source involvement so it could just be a result of that this is so rare the customers never had a reason to learn or adapt to using open source contributions as a “factor”.

Or maybe I’m just ignorant and haven’t figured out how my customers truly work.

Do I work with open source in my day job? Yes almost exclusively. I’ve been working with Linux in various embedded systems basically the last 8 years and working with Linux systems pretty much implies a wide range of open source development tools as well.

Development, Open Source

4 ohloh improvements I’d like

November 28, 2008 Daniel Stenberg

I am a stats junkie so I like my stats in large amounts. But I like the stats to be right and as accurate as possible, and when I look at what ohloh produces I like the concepts and ideas in general, I just think their implementation is lacking in a few vital areas that need improvement:

1. There are no dependencies or hierarchies between packages, so “I use this” counters get worthless since people mark end-user packages they use. Low-level support packages and libraries that are used indirectly don’t get many “use counts”

2. Doing very few commits in a very well used project with few authors gives you way way more points than doing a bus-load of commits in something less used with many fellow contributors. This makes the top-list of people very skewed as some of the top-64 people only did a few hundred commits ever. I doubt many mortals would consider someone who only ever did 300 commits to be a top community person. At the very moment I write this, the #1 ranked person has done 20 commits during 5 months…!

3. Too few versioning systems are supported, leaving out huge chunks of the open source world. Bazaar, mercurial and a few more are a bit too popular to be ignored without the results getting skewed.

4. I’d like to see the “number of users” of products as a percentage, as the total number of users they show include all contributors to all projects. Out of the 140,000 users (which undoubtedly include a lot of duplicates), it would surprise me if more than 10,000 have actually registered what products they use. I’ve tried to find the exact number but I failed. So 3,000 users don’t mean 3,000 out of 140,000 but 3,000 out of 10,000…

daniel.haxx.se

Tag Archives: Development

How much for a bug?

How to pay to get a bug fixed?

Going full-time Haxx

Concepts of a new distributed build

Lyre

USB converter woes

C Code Commandments

Thy Code Shall Be Narrow

Thou Shall Not Use Long Symbol Names

Comments Shall Be Plenty

No Hiding What’s Really Happening

Thou Shalt Hunt Down and Kill Compiler Warnings

Write Portable Code Unless Forced by Evil

Repeat Not, Use Functions

Thou Shalt Not Typedef Away Pointers

Defines, no fixed numbers

Some stats on curl development

10G and Direct Cache Access

Open source in my day job

4 ohloh improvements I’d like

curl, open source and networking