Category Archives: Development

Development, Open Source

11 years of curl

March 20, 2009 Daniel Stenberg 1 Comment

This very date, eleven years ago, the first curl release was brought to the world.

I’m still enjoying working with it, and I’m quite pleased with both the products and the community around the project.

I’ll stick around for a while more!

Happy birthday curl!

cURL and libcurl, Development

Some stats on curl development

January 22, 2009 Daniel Stenberg

Counting curl 6.0 and up to curl 7.19.3 we’ve done 78 releases during the 9.4 years it took.

In this time, we’ve mentioned 1259 bugfixes and 389 notable changes.

This makes one bugfix done every 2.7 days. One release done every 43rd day with an average of 16 bugfixes done in each. The longest interval ever between two curl releases was 139 days, back in 2000 when we worked to release the first version 7 release (known as 7.1).

To compare with how our work has been more recently, doing the same math limited to the 20 latest releases only (the 3.3 years since and including 7.15.0) shows that we’re still on 2.7 days per bugfix (although we know that the code base has grown steadily for years) but we’re now on 61 days between releases and 21 bugfixes/release…

All this info and more will be visible on a web page on the curl site soonish, I’m still working on polishing it up.

What other useful or useless but interesting numbers could be extracted from this?

cURL and libcurl, Development, Linux, Network, Web

IETF http-state group created

January 9, 2009 Daniel Stenberg

Over at the IETF another group was just created named http-state (with an associated mailing list) with the specific goal:

Ultimately, the purpose of this group is to create an updated HTTP State Management Mechanism RFC (aka cookies) that will supersede the Netscape spec, RFCs 2109, 2964, 2965 then add in real-world usage (e.g. HTTPOnly), and possibly add in additional features and possibly merge in draft-broyer-http-cookie-auth-00.txt and draft-pettersen-cookie-v2-03.txt.

I’ve joined the list and I hope to follow and participate in this, as I believe the current state of HTTP cookies is a rather sorry mess and the Netscape spec is still what closest describes how cookies work in the wild. Of course I’ll do it with my libcurl experience in my luggage.

While it perhaps would be cool to join the group in more formal way, there’s no way for me to participate in that IETF meeting in San Francisco in March.

Development, Open Source

emacs!

January 8, 2009 Daniel Stenberg

I haven’t said it here before, but I feel I really should. I’ve been an avid emacs user since I started to learn it back in 1991 on emacs 18. I worked at IBM with their RS/6000 machines at the time and I learned C on AIX with emacs as my editor.

To me there was no alternative at that time and I soon learned all the quirks, got used to things and appreciated all the beautiful parts – my fellow colleagues being emacs fans of course helped pushing me into that team. For the fun of it, I’ve checked when vim was started and I’ve learned that it got available for “unix” in 1992. And it probably was quite far away from a real emacs competitor at that time. The vi editor was of course there, but it had no C syntax indent support and it used and still uses that quirky “mode” approach that I’ve never liked.

I came from hacking the C64 to programming Amiga over to AIX and I was used to and liked full-screen editors without any particular mode switching necessary. Still today I find it a bit curious how vi (in the shape of vim) can be this popular given that (in my view) funny concept.

While I do see some of the benefits of XEmacs over GNU Emacs I always disable all menus, icons and toolbars so to me in real-life editing the differences don’t mean a lot, so I tend to go with the plain GNU Emacs.

So, I’ve pretty much used Emacs just about daily at work and in my spare time hacking since then. Even during the few times I’ve been locked in at a windows desktop I’ve managed to get a windows version installed to survive my days!

Thank you emacs team!

Development, IT Politics, Open Source, Security

SSL certs crash without trust

December 23, 2008 Daniel Stenberg

Eddy Nigg found out and blogged about how he could buy SSL certificates for a domain he clearly doesn’t own nor control. The cert is certified by Comodo who apparently has outsourced (parts of) there cert business to a separate company who obviously does very little or perhaps no verification at all of the buyers.

As a result, buyers could buy certificates from there for just about any domain/site name, and Comodo being a trusted CA in at least Firefox would thus make it a lot easier for phishers and other cyber-style criminals to setup fraudulent sites that even get the padlock in Firefox and looks almost perfectly legitimate!

The question is now what Mozilla should do. What Firefox users should expect their browser to do when HTTPS sites use Comodo-verified certs and how Comodo and their resellers are going to deal with everything…

Read the scary thread on the mozilla dev-tech-crypto list.

Update: if you’re on the paranoid/safe side you can disable trusting their certificates by doing this:

Select Preferences -> Advanced -> View Certificates -> Authorities. Search for
AddTrust AB -> AddTrust External CA Root and click “Edit”. Remove all Flags.

Development, Linux, Network

10G and Direct Cache Access

December 18, 2008 Daniel Stenberg 6 Comments

As some of you might know, I currently work with a client doing 10G network stuff. 10G as in 10 gigabit/second Ethernet. That’s a lot of data. It’s actually so much data it’s hard to even generate network loads of this magnitude to be able to do good tests, as a typical server using SATA harddrives hardly fills a one gigabit pipe due to “slow” I/O: ordinary SATA drives don’t even reach 100MB/sec. You need RAID solutions or putting the entire thing in RAM first. So generating 10 gigabit network loads thus requires some extraordinary solutions.

Having a server that tries to “eat” a line speed 10G is a big challenge, and in fact we can’t do it as 1.25 GB/sec is just too much and yet we run a quad-core 3.00GHz Xeon thing here which is at least near the best “off-the-shelf” CPU/server you can get at the moment. Of course our software does a little bit more with the data than just receiving it as well.

Anyway, recently I’ve been experimenting with 10G cards from Myricom and when trying to maximize our performance with these beauties, I fell over the three-letter acronym DCA. Direct Cache Access. A terribly overused acronym consisting of often-used words make it hard to research and learn about! But here’s a great document describing some of the gory details:

Direct Cache Access for High Bandwidth Network I/O

Summary: it is an Intel technology for delivering data directly into the CPU’s cache, to reduce the bandwidth requirement to memory (note: it only decreases theÂ bandwidthÂ requirement at that moment, not the total requirement as it still needs to be read from memory into the cache, as noted in a comment below). Using this technique it should be possible to drastically reduce the time for getting the traffic. Support for this tech has been added to the Linux kernel as well since a while back.

It seems DCA is (only?) implemented in Intel’s 7300 chipset family which seems to only exist for Xeon 7300 and 7400. Too bad we don’t have one of these monsters so I haven’t been able to try this out for real yet…

Currently we can generate 10G network loads using two different approaches: one is uploading a specially crafted binary blob embedded with the FPGA image to a Xilinx-equipped board with a 10G MAC that then can do some fiddling with the packages (like increasing a counter) so that they aren’t all 100% identical. It makes a pretty good load test, even if the traffic isn’t at all shaped like the “real” traffic our product will receive. Our other approach has been even less good: upload a custom firmware to the network card and have that send the same Ethernet frame… This latter approach didn’t get better because it was a bit too complicated and badly documented on how to make a really good generator out of it. Even if I liked being able to upload custom code to my network card! 😉

Allow me to also mention that the problems with generating 10G is with small packet sizes, like 100 bytes or so as the main problem in the hardwares seem to the number of packets, not the payload part. Thus it is easier to do full line speed with 9000 bytes packets (jumbo frames) than the tiny ones we are likely to get when this product is in use by customers in the wild.

Update: this article was written in 2008. Please note that many things may have changed since then.

Development, Open Source

Open source in my day job

December 3, 2008 Daniel Stenberg

From people in the open source community and then especially friends and fellow hackers in projects I am involved in, I sometimes get questions on how my open source participations affect my “real job”.

I work as a consultant during the days and I do contract development for hire. I’ve been a consultant since 1996 and during this time I’ve participated in more than 25 “full-time” projects for almost as many customers.

I contribute to numerous open source projects and I’ve done it for many years. I lead and maintain several open source projects. I’ve committed many thousands of times to public source code repositories.

Does the contributions make me more attractive to potential customers? Not particularly is my rather sad experience. While some of my customers notice my track record (my CV does of course mention my most notable contributions) most of my day-job clients focus solely on other paid projects I’ve done and that exact technologies and products I worked with and created in the past. It may of course not be too strange as things I do and get paid for is then potentially “good enough for someone to pay me for” while the stuff I do for free in open source projects are… well, not paid for and thus it can’t be qualified by that ruler.

Do I get new assigments thanks to my open source projects? Yes, I do, but usually they tend to be on the smallish side and not of the bigger kinds my regular assignments at work are.

And the reality is of course also that the vast vast vaaast majority of all software consultants that people hire to do development have no public record of open source involvement so it could just be a result of that this is so rare the customers never had a reason to learn or adapt to using open source contributions as a “factor”.

Or maybe I’m just ignorant and haven’t figured out how my customers truly work.

Do I work with open source in my day job? Yes almost exclusively. I’ve been working with Linux in various embedded systems basically the last 8 years and working with Linux systems pretty much implies a wide range of open source development tools as well.

Development, Open Source

4 ohloh improvements I’d like

November 28, 2008 Daniel Stenberg

I am a stats junkie so I like my stats in large amounts. But I like the stats to be right and as accurate as possible, and when I look at what ohloh produces I like the concepts and ideas in general, I just think their implementation is lacking in a few vital areas that need improvement:

1. There are no dependencies or hierarchies between packages, so “I use this” counters get worthless since people mark end-user packages they use. Low-level support packages and libraries that are used indirectly don’t get many “use counts”

2. Doing very few commits in a very well used project with few authors gives you way way more points than doing a bus-load of commits in something less used with many fellow contributors. This makes the top-list of people very skewed as some of the top-64 people only did a few hundred commits ever. I doubt many mortals would consider someone who only ever did 300 commits to be a top community person. At the very moment I write this, the #1 ranked person has done 20 commits during 5 months…!

3. Too few versioning systems are supported, leaving out huge chunks of the open source world. Bazaar, mercurial and a few more are a bit too popular to be ignored without the results getting skewed.

4. I’d like to see the “number of users” of products as a percentage, as the total number of users they show include all contributors to all projects. Out of the 140,000 users (which undoubtedly include a lot of duplicates), it would surprise me if more than 10,000 have actually registered what products they use. I’ve tried to find the exact number but I failed. So 3,000 users don’t mean 3,000 out of 140,000 but 3,000 out of 10,000…

Development, Technology

Cure coming for Wrap Rage?

November 17, 2008 Daniel Stenberg 4 Comments

This phenomena you thought you were alone to experience, the rage and anger you feel when you’ve bought some new toy and you get it packaged in tight and nearly un-enforceable plastic that demands a decent amount of violence and persistence to crack. It’s called Wrap Rage.

I’ve been told the packages (called blister packs or clam shells) are designed to be this way to be able to show off the merchandise while at the same time prevent thefts: it is hard for a customer to just extract something out of those things in your typical physical store.

Amazon’s initiative Frustration-Free packaging is indeed a refreshing take on this and apparently an attempt to reverse this development. Online stores really cannot have any good reasons to use this kind of armor around products since there’s no risk of stealing. I wish others will follow to make the manufacturers realize that there is a market for this. This needs to be done by manufacturers of stuff, the stores cannot be made to repackage stuff due to warranties and what not.

It wouldn’t surprise me if you could even find cheaper ways to package products once you let go of some of the requirements that no longer apply for online stores. Visibility of the products once packaged is another thing that is pointless for online stores but I would expect is very important to sales in physical stores. I’ve always thought it is pretty pointless and expensive that every single package is made to be able to be a display model. To be able to attract customers to buy it. When you buy the thing online it’s no longer just pointless, it’s plain stupid.

Imagine a future when you can just open your new toy without getting bruises or scratch marks!

Development, Open Source

My million users

November 12, 2008 Daniel Stenberg

I’ve been working professionally with computers since 1991 and explicitly as a developer since 1993. I’ve written one or two lines of code since then. How many users could there be out there that are using something that includes my code?

Open source

I’ve participated in a wide range of open source projects, so of course all direct users of those projects would count: curl, Rockbox and let’s include subversion and others. I would guess that there are at least one million users of curl, quite likely more than so of subversion and Rockbox may also reach a million users or so. It’s of course impossible to know for sure…

Lots of open source projects use libraries that I work on now and have worked with in the past. Primarily libcurl and c-ares. Such as Boinc, git, bazaar, darcs. Millions of users, no doubt (Boinc alone has some 1.5 million users). The OLPC’s XO laptop comes with (lib)curl. I think most Linux distros these days come with curl installed. How many linux installations are there? libcurl is rather popular when used within PHP as well and there are many many million installations of PHP out there. I have code in wget, also used by millions.

Closed source users of open source I’ve participated in

Adobe acrobat reader (for non-windows platforms), Adobe’s flash player and various other Adobe products, Second life, Google Earth and others. They’re bound to have several million users. curl is included in Mac OS X.

There are also a lot of devices that use libcurl that are even harder to track: Sandisk makes mp3 players that use libcurl, Sony makes a video device that uses libcurl, Tilgin, Neuros and others make IPTV-devices that use libcurl. libcurl is used for multiple “installers” such as the one AOL provide for a specific router. There are many company users.

Closed source stuff I’ve worked with on my day-job

… is of course also used widely and all over, but me being an embedded guys I mostly work on software in products and most of the products I’ve worked within have been for various niche markets in which I have little or no knowledge about how much the products (and thus my code) are actually used. I’ve left my fingerprints on several networking products, IPTV/Digital TV settop boxes, railroad equipments, a car ignition tester, 3g/telecom switches, rfid receivers, laser-using positioning systems and more.

How many millions?

Ok, let’s for the sake of the argument say that there’s somewhere around 100 million devices with my code from me included – I really have no idea how to make a sensible estimate. Let’s for simplicity also say that there are 100 million users of these devices. I would also guess that about half of the world’s population isn’t near using devices I may have programmed. Thus, if you’re using “devices” in general there’s a probability of 3 billion/100 million = 1/30 that you’re using something that includes code that I’ve worked on…

In fact, that number is then valid for any random “device” user – if you’re reading this on my blog I don’t expect you to be very random but rather a specialized person and then I would say the likeliness of you having at least something with my code in it is almost 100% guaranteed…

Where would you say my biggest weaknesses in this reasoning are?

daniel.haxx.se

Category Archives: Development

11 years of curl

Some stats on curl development

IETF http-state group created

emacs!

SSL certs crash without trust

10G and Direct Cache Access

Open source in my day job

4 ohloh improvements I’d like

Cure coming for Wrap Rage?

My million users

curl, open source and networking