
internally, we’re all multi now!

libcurl internals suddenly became a lot cleaner and neater to work with when we made all code assume and use the multi interface!

libcurl was initially created slightly after the birth of the curl tool. After the tool started to get some traction and use out in the world, requests and queries about a library with its powers started to come in. Soon enough, in the year 2000, we shipped the first release of libcurl. It featured a synchronous API (the “easy” interface) that performs the complete operation before it returns. I think we can now say that the blocking easy interface has been successful and that its ease of use has been appreciated by many users.
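
To illustrate (a minimal sketch of my own, not from the original announcement), a complete blocking transfer with the easy interface looks something like this:

#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  CURL *easy = curl_easy_init();
  CURLcode res;

  if(easy) {
    curl_easy_setopt(easy, CURLOPT_URL, "http://example.com/");
    res = curl_easy_perform(easy); /* blocks until the entire transfer is done */
    if(res != CURLE_OK)
      fprintf(stderr, "transfer failed: %s\n", curl_easy_strerror(res));
    curl_easy_cleanup(easy);
  }
  return 0;
}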

During 2002 the need for a non-blocking API had been identified, and we introduced the multi interface. The multi interface is something of a super-set, as it re-uses the same handles that are used with the easy interface, which makes it fairly easy for a typical application to move from the easy interface to the multi.

Basically since that day, we've struggled with a source code structure that had to handle the fact that we have both a blocking and a non-blocking API. In lots of places we had different code paths and choices made depending on which API was used. It made the source code hard to follow, and it occasionally introduced hard-to-track bugs that could make the multi and easy interfaces behave differently against the underlying network or protocol. It was clear very early on that it wasn't an ideal design choice, but it was a choice that was spread out across the code and it stuck.

During November 2012 I finally took on the code that we had kept #ifdef'ed since around 2005, which makes the blocking easy interface operation a wrapper function around the non-blocking multi interface functions. With this method, all internals can be considered non-blocking, and there is no need left to treat things differently depending on which API is used, because everything is now multi interface == non-blocking.
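
The principle, shown here as a rough sketch of my own (not the actual committed code), is to drive the easy handle with public multi interface calls and loop until the transfer completes:

#include <sys/select.h>
#include <curl/curl.h>

/* Sketch: a blocking "perform" implemented purely on top of the
   non-blocking multi interface. */
static CURLcode blocking_perform(CURL *easy)
{
  CURLM *multi = curl_multi_init();
  CURLcode result = CURLE_OK;
  int still_running = 1;

  if(!multi)
    return CURLE_OUT_OF_MEMORY;
  curl_multi_add_handle(multi, easy);

  while(still_running) {
    struct timeval timeout = {1, 0};
    fd_set r, w, e;
    int maxfd = -1;

    curl_multi_perform(multi, &still_running); /* never blocks */

    if(still_running) {
      FD_ZERO(&r);
      FD_ZERO(&w);
      FD_ZERO(&e);
      curl_multi_fdset(multi, &r, &w, &e, &maxfd);
      /* the blocking happens here instead: wait for socket activity */
      select(maxfd + 1, &r, &w, &e, &timeout);
    }
  }

  /* fish out the transfer's result code */
  {
    int pending;
    CURLMsg *msg = curl_multi_info_read(multi, &pending);
    if(msg && msg->msg == CURLMSG_DONE)
      result = msg->data.result;
  }

  curl_multi_remove_handle(multi, easy);
  curl_multi_cleanup(multi);
  return result;
}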

On January 17th 2013 the big patch was committed. 400 added lines, 800 removed over 54 modified files…


The curl year 2012


So what did happen in the curl project during 2012?

First some basic stats

We shipped 6 releases with 199 identified bug fixes and some 40 other changes. That makes an average of 33 bug fixes per release, a release shipped every 61 days, and a little over one bug fix done every second day. All this was done with about 1000 commits to the git repository, roughly the same amount of git activity as in 2010 and 2011. We merged commits from 72 different authors, a slight increase from the 62 in 2010 and the 68 in 2011.

On our main development mailing list, the curl-library list, we now have 1300 subscribers, and during 2012 it got about 3500 postings from almost 500 different From addresses. Unsurprisingly, I posted by far the largest number of mails there (847), with the number two poster being Günter Knauf who posted 151 times. Four more members posted more than 100 times: Steve Holme (145), Dan Fandrich (131), Marc Hoersken (130) and Yang Tse (107). Last year I sent 1175 mails to the same list…

Notable events

I’ve walked through the biggest changes and fixes, and here are the particular ones I think stood out during this otherwise rather calm and laid-back curl year. Possibly in a rough order of importance…

  1. We started the year with two security vulnerability announcements, regarding an SSL weakness and an injection flaw. They were reported in 2011 though, and we didn’t get any further security alerts during 2012, which I think is good. Or a sign that nobody has been looking closely enough…
  2. We got two interesting additions in the SSL backend department almost simultaneously. We got native Windows support with the use of the schannel subsystem, and we got native Mac OS X support with the use of Darwin SSL. Thanks to these, we can now offer SSL-enabled libcurl on those operating systems without relying on third-party SSL libraries.
  3. The VERIFYHOST debacle took off with “security researchers” throwing accusations and insults, ending with us shipping a curl release with the bug removed. It unfortunately led to some follow-up problems in, for example, the PHP binding.
  4. During the autumn, the brokenness of WSAPoll was identified, and we now build libcurl without it. As a result, libcurl now works better on Windows!
  5. In an attempt to allow libcurl-using applications to avoid select() and its problems, we introduced the new public function curl_multi_wait. It avoids the FD_SETSIZE limit and makes it harder to screw up… (there’s a sketch after this list)
  6. The overly bloated User-Agent string for the curl tool was dramatically shortened when we cut out all the subsystems/libraries and their version numbers from the string. Now there’s only curl and its version number left. Nice and clean.
  7. In July we finally introduced metalink support in the curl tool with the curl 7.27.0 release. It’s been one of those things we’ve discussed for ages that finally came through and became reality.
  8. With the brand new HTTP CONNECT support in the test suite, we suddenly could get much improved test cases that do SSL or just tunnel through an HTTP proxy with the CONNECT request. It of course helps us avoid regressions and otherwise improve curl and libcurl.
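
As a small illustration of item 5 above (a sketch of my own using only the documented public calls), an application can now drive its transfers without any fd_set at all:

#include <curl/curl.h>

/* Run all transfers on the multi handle to completion, without
   curl_multi_fdset() + select() and thus without the FD_SETSIZE limit. */
static void drive(CURLM *multi)
{
  int still_running = 1;

  while(still_running) {
    int numfds = 0;
    curl_multi_perform(multi, &still_running);
    if(still_running)
      /* sleeps until there is activity on some transfer, or 1000 ms */
      curl_multi_wait(multi, NULL, 0, 1000, &numfds);
  }
}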

What didn’t happen

  1. I made an attempt to get the spindly hacking going, but I’ve mostly failed with that effort. I have personally not had enough time and energy to work on it, and the interest from the rest of the world seems lukewarm at best.
  2. HTTP pipelining. Linus Nielsen Feltzing has a patch series in the works with much improved pipelining support for libcurl. I’ll write a separate post about it once it gets merged. Obviously we failed to merge it before the end of the year.
  3. Some of my friends like to mock me about curl not being completely IPv6 friendly due to its lack of support for Happy Eyeballs, and of course they’re right. Making curl simply do two connects on IPv6-enabled machines should be a fairly small change, but I still haven’t managed to get around to actually implementing it…
  4. DANE is SSL certificate verification with records from DNS, thanks to DNSSEC. Firefox has some experiments going and Chrome already supports it. This is a technology that truly can improve HTTPS going forward and allow us to avoid the annoyingly weak and broken CA model…

I won’t promise that any of these will happen during 2013 but I can promise there will be efforts…

The Future

I wrote a separate post a short while ago about the HTTP2 progress, and I expect 2013 to bring much more details and discussions in that area. Will we get SRV record support soon? Or perhaps even URI records? Will some of the recent discussions about new HTTP auth schemes develop into something that will reach the internet in the coming year?

In libcurl we will switch to an internal design that is purely non-blocking, with a lot of if-then-else source code removed that checked which interface was being used. I’ll make a follow-up post with details about that as soon as it actually happens.

Our Responsibility

curl and libcurl are considered pillars of the internet world by now. This year I’ve heard from several independent sources how people consider curl support to be an important driver for internet technology: as long as curl doesn’t support something, it hasn’t really reached everyone, and things won’t get adopted for real in the Internet community until curl supports them. As the father of the project this makes me proud and humble, but I also feel the responsibility of making sure that we continue to do the right thing the right way.

I also realize that this position of ours is not automatically glued to us; we need to keep up the good work to make it stick.


Say hello to Moo

I decided it was about time to upgrade my main development machine to something modern and snappy. It has been 5.5 years since I bought my current work horse, a thing equipped with a dual-core AMD Athlon 64 X2 5600+ (2.8GHz). I’m using my machine primarily for development. I never game. I decided to go for the higher end of what’s available, to get something to live with for several years to come.

Motherboard: Asus P8Z77-M. Micro-ATX. Intel Z77 chipset.

CPU: Intel Core i7 3770K 3.5GHz, Socket 1155. This is a 22nm monster featuring 8MB of L3 cache.

Memory: G.Skill TridentX DDR3 PC19200/2400MHz CL10 2x8GB. 16GB of RAM.

HDD: Seagate Barracuda ST3000DM001 64MB 3TB.

Chassis: Fractal Design Define R3 USB3. See picture. Rather big, and fits a lot more drives and stuff than what I have now…

SSD: OCZ Vertex 4 256GB

CPU cooler: Cooler Master Hyper 412S

Graphics: ASUS Radeon HD5450 512MB (very simple and cheap thing but supports 2560×1600 which the MB doesn’t do)

PSU: Plexgear PS-500 500W

(a prisjakt list with the full setup)

All in all, this has two 120mm chassis fans, one 135mm fan on the big CPU cooler, and one fan in the PSU. I hope they won’t cause too much noise or other problems for me. The rather low-end graphics card should keep the total power consumption (and thus heat production) at a decent level.

I purchased all the individual parts separately, as I dislike that I can’t get a machine this optimized prebuilt from anywhere – I’d basically have to pay around 50% more, and even then I wouldn’t get the exact set of pieces I’d like. This way I also avoid the highly disturbing Microsoft tax that prebuilt systems come with.

Unfortunately I got some bad luck included too: when I first put everything together and pressed the power button, nothing happened. Well, a single LED turned on but nothing else happened. It took me a while and some sweat to figure out where the problem lay, and once I had replaced the broken motherboard the machine started properly and I could proceed to install it.

Once my new machine (which now goes under the name Moo) gets settled, my old box will become my daughter’s new machine, as her existing tired old PIII machine isn’t really fun to do a lot with.

Embedded Linux Contest

During our embedded Linux hacking event in Stockholm on October 20th, I ran a little contest for the ones who wanted to participate. I created it entirely by myself so that as many people as possible could participate, without things like them knowing me or me knowing them limiting the fun.

For your amusement I include the full contest here. If you want to try it out, make sure you don’t attempt to google for any answers or otherwise use a machine/computer for help.

[contest images: the introduction and an example question]

Here I just want to mention that, as shown in the above example question, ‘ace‘ is the correct character sequence, and the letters should then be kept in that order in the final question. Also note that a character sequence can legally contain a dash. You will get 16 similar sequences of 1 to 3 letters, and those 16 sequences should be moved around to form the 17th question.

[contest image]

… at this point I fired off all the questions one by one, at about 15-20 seconds per question. In this blog post I’ll take a shortcut and instead show you the final page I made that showed all questions at once, which I then left displayed for the remainder of the competition time. Click the image to get a full resolution version that is perfectly readable:

all questions at once

the winners of the contest

My takeaway from this contest is that it was harder than I anticipated and took longer to crack than I thought. I gave away a few additional clues and hints as time went by, but in the end I believe several people were very close to breaking it at almost the same time. In the end, Klas and Jonas presented the correct answer first and won the bottle of Champagne. I’m sure you appreciate their efforts after having tried this yourself!

The answers? Are you really sure? The correct answers and the final question with its answer are available.

I had a great time creating the competition and I believe the competitors appreciated it.

Additional trivia: I created the letter sequences for the other alternatives by writing other English phrases and chopping them up, so that they came from actual English and hence were possibly more believable.

ptest because “make test” is insufficient

Thanks to autoconf and automake, we have an established, more or less standardized, way to build and install tools, libraries and other software. We build them with ‘make’ and we install them with ‘make install‘. This works great, and it works equally fine when we build stuff cross-compiled.

For testing, however, the established concept and procedure is not as good. For testing we have ‘make test’ or ‘make check’, which typically first builds whatever needs to get built for the tests to run and then runs all tests.

This is not good enough

Why? Because in lots of use cases we build software using a cross-compiler, on a build system that can’t run the resulting executables. Therefore we need to first build the tests, then install the tests (somewhere that is reachable from the target system) and finally execute them. These steps need to be possible to run independently, since at least the building and installing will sometimes happen on a different host than the execution of the tests (a sketch follows below).
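
For instance, in makefile terms the split could look something like this (the target and variable names here are made up for illustration; they are not an established convention):

# step 1: build the test programs - runs on the build host
buildtest:
	$(MAKE) -C tests all

# step 2: install the tests where the target system can reach them
installtest: buildtest
	$(MAKE) -C tests install DESTDIR=$(TESTDIR)

# step 3: execute the test suite - runs on the target system
runtest:
	cd $(TESTDIR) && ./run-ptest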

Introducing ptest

Within the Yocto project, Björn Stenberg has pushed for ptest to be the basis of this new reform and concept. The responses he’s gotten so far have been positive, and there’s a pending updated patch to be posted to the upstream oe-core list soon.

The work does not end there

Even if – or when – this gets incorporated into OpenEmbedded and Yocto (and I really think it is a matter of when, since I believe we can work out all the flaws and quirks until virtually everyone involved is happy), the bulk of the changes really should be done upstream, in hundreds and thousands of open source packages. We (as upstream open source projects) need to start doing testing in at least two separate steps: one step that builds everything that needs to be built for the tests, and a second step that runs the suite. In a cross-compiled scenario, the two steps could then be executed first on the host system and then on the target system.

I expect that this will mean a whole bunch of patches and scripts having to be maintained within OpenEmbedded for a while, as we try to get things merged into upstream projects. I also foresee that a certain percentage of all projects just won’t accept this new approach and will reject all patches in this vein.

Output format

I think the most controversial part of these suggested “universal” changes is the common test suite output format. The common format is of course required so that we can “supervise” the output and results from any package without having to know any specifics.

While the ptest output format follows the automake test output syntax, I expect that many projects that have selected a particular output format will rather stick with it. Hopefully we can then make those projects introduce a separate make target or option that runs the test suite with the standard output format.
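
For reference, that automake-style output is simply one result keyword plus a test name per line, roughly like this (test names invented for the example):

PASS: test-connect
PASS: test-resolve
FAIL: test-timeout
SKIP: test-ipv6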

One little step forward

Building full-fledged Linux distributions cross-compiled that are completely tested on target will remain being hard work for a while more. But we are improving things, one step at a time.

Of course, the name ‘ptest’ is just what the system is currently called by Björn within the Yocto/OE environment. It is not supposed to be a catchy name for this idea outside of there. The ‘P’ refers to package – as opposed to, for example, system test – and makes the name less generic than simply ‘test’.

Three static code analyzers compared

I’m a fan of static code analysis. With the use of fancy scanner tools we can get detailed reports about source code mishaps and quite decently pinpoint which source code is suspicious and may contain bugs. In the old days we used various lint versions, but they were all annoying and very often just puked out far too many warnings and errors to be really useful.

By coincidence, I ended up getting analyses done (by helpful volunteers) on the curl 7.26.0 source base with three different tools. An excellent opportunity for me to compare them all and to share the outcome and my insights with you, my friends. Perhaps I should add that the analyzed code base is 100% pure C89 compatible C code.

Some general observations

First, each of the three tools detected several issues the other two didn’t spot. I would say this indicates that these tools still have a lot to improve, and also that it actually is worth running multiple tools against the same source code for extra precaution.

Secondly, the libcurl source code has some known peculiarities that admittedly are hard for static analyzers to figure out without raising false positives. For example, we have several macros that look like functions, and on several platforms and build combinations they evaluate to nothing, which causes dead code. Another example is that we have several cases of vararg-style functions, and these functions are documented to work in ways that the analyzers don’t always figure out (both clang-analyzer and Coverity have problems with these). The macro case is illustrated below.
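
To illustrate the macro case (a made-up example merely shaped like libcurl’s debug macros, not the actual code), consider a function-looking macro that expands to nothing in some builds:

#include <stdio.h>

#ifdef DEBUGBUILD
#define DEBUGF(x) x   /* debug builds: the statement is compiled in */
#else
#define DEBUGF(x)     /* other builds: the statement vanishes entirely */
#endif

static void example(int state)
{
  /* with DEBUGF() expanding to nothing, an analyzer may see 'state' as
     unused and the surrounding code path as dead */
  DEBUGF(fprintf(stderr, "entering state %d\n", state));
}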

Thirdly, the same lesson we knew from the lint days is still true. Tools that generate too many false positives are really hard to work with, since going through hundreds of issues that after analysis turn out to be nothing makes your eyes sore and your head hurt.

Fortify

The first report I got was done with Fortify. I had heard about this commercial tool before, but I had never seen any results from a run; now I did. The report I got was a PDF containing 629 pages, listing 1924 possible issues among the 130,000 lines of code in the project.

[screenshot: Fortify report on curl]

Fortify claimed 843 possible buffer overflows. I quickly got bored trying to find even one that could lead to an actual problem. It turns out Fortify has a very short attention span and warns very easily about lots of places where a quick glance by a human tells us there’s nothing to be worried about. Having hundreds and hundreds of these is really tedious and hard to work with.

If we’re kind, we call them all false positives. But sometimes it is more than that; some of the alerts are plain bugs, like when it warns about a buffer overflow on this line, claiming that it may write beyond the buffer. All variables are ‘int’, and as we know, sscanf() writes an integer to the passed-in variable for each %d instance.

sscanf(ptr, "%d.%d.%d.%d", &int1, &int2, &int3, &int4);

I ended up finding and correcting two flaws detected with Fortify, both were cases where memory allocation failures weren’t handled properly.

LLVM, clang-analyzer

Given the exact same code base, clang-analyzer reported 62 potential issues. clang is an awesome and free tool. It really stands out in the way it clearly and very descriptively explains exactly how the code is executed and which code paths are taken when it reaches the passage it thinks might be problematic.

clang-analyzer report

The reports from clang-analyzer are in HTML, with a single file for each issue, and it generates a nice-looking source code rendering with embedded comments about which flow was followed all the way down to the problem. A little snippet from a genuine issue in the curl code is shown in the screenshot included above.

Coverity

Given the exact same code base, Coverity detected and reported 118 issues. In this case I got the report from a friend as a text file, which I’m sure is just one of its output formats. Similar to Fortify, this is a proprietary tool.

Coverity curl report

As you can see in the example screenshot, it does provide a rather fancy and descriptive analysis of the exact code flow that leads to the problem it suggests exists in the code. The function referenced in this shot is a very large function with a state machine featuring many states.

Out of the 118 issues, many were actually the same error reached via different code paths. The report made me fix at least 4 genuine problems, but those fixes will probably silence over 20 warnings.

Coverity runs scans on open source code regularly, as I’ve mentioned before, including curl, so I’ve appreciated their tool before as well.

Conclusion

From this test of a single source base, I rank them in this order:

  1. Coverity – very accurate reports and few false positives
  2. clang-analyzer – awesome reports, missed slightly too many issues and reported slightly too many false positives
  3. Fortify – the good parts drown in all those hundreds of false positives

Three out of one hundred

If I’m not part of the solution, I’m part of the problem and I don’t want to be part of the problem. More specifically, I’m talking about female presence in tech and in particular in open source projects.

I’ve been an open source and free software hacker, contributor and maintainer for almost 20 years. I’m the perfect stereotype too: a white, hetero, 40+ years old male living in a suburb of a western European city. (I just lack a beard.) I’ve done more than 20,000 commits in public open source code repositories. In the projects I maintain and have a leading role in – and for the sake of this argument I’ll limit the discussion to curl, libssh2 and c-ares – we’re certainly no better than the ordinary average male-dominated open source projects. We’re basically only men (boys?) talking to other men, and virtually all the documentation, design and coding is done by male contributors (to a significant degree).

Sure, we have female contributors in all these projects, but for example in the curl case we have over 850 named contributors, and while I’m certainly not sure who is a woman and who is not when I get contributions, there are only about 10 names in the list that are typically western female names. Let’s say there are 20, or even 30. Out of a total of 850, the proportions are devastating no matter what. It might be up to 3%. Three. THREE. I know women are under-represented in technology in general and in open source in particular, but I think 3% is even lower than the already low open source average. (Although some reports claim the share of female developers in FOSS is as low as just above 1%; geekfeminism says 1-5%.)

Numbers

Three percent. (In a project that’s been alive and kicking for thirteen years…) At this level, after this long a time, there’s already a bad precedent, and it of course doesn’t make it easier to change now. It is also three percent when we count all contributors alike. If we counted the number of women in leading roles in these projects, the share would be even smaller.

It could be worth noting that we don’t really have any recent reliable stats for the “real world” female share either. Most sources that I find on the Internet, and that people quote in talks, tend to repeat old numbers that were extracted using debatable means and questions. The comparisons of female participation in FOSS vs commercial software that I’ve seen repeated many times are very often based on stats that are really not comparable. If someone has reliable and somewhat fresh data, please point it out to me!

Ghosh, R. A.; Glott, R.; Krieger, B.; Robles, G. 2002. Free/Libre and Open Source Software: Survey and Study. Part IV: Survey of Developers. Maastricht: International Institute of Infonomics/Merit.

A design problem of “the system”

I would blame “the system”. I work in embedded systems professionally as a consultant and contract developer, and I’ve worked as a professional developer for some 20 years. In my niche, there aren’t even 10% female developers. A while ago I went through my past assignments in order to find the last female developer I’ve worked with in a project, physically located in the same office. The last time I met a fellow developer at work who was female was early 2007. I’ve worked in 17 (seventeen!) projects since then without even once having a single female developer colleague. I usually work in smaller projects with like 5-10 people, so one woman in 18 projects makes it something like one out of 130 or so. I’m not saying this is a number to draw any conclusions from, since it’s just me and my guesstimates. It does however hint that the problem is far beyond “just” FOSS. It is a tech problem. Engineering? Software? Embedded software? Software development? I don’t know, but I know it is present both in my professional life and in my volunteer open source work.

Geekfeminism says the share is 10-30% in the “tech industry”. My experience says the share gets smaller and smaller the closer to “the metal” and low-level programming you get – but I don’t have an explanation for it.

Fixing the problems

What are we (am I) doing wrong? Am I at fault? Is the way I talk, or the way we run these projects, in some subtle – or obvious – way not friendly enough, or downright hostile, to women? What can or should we change in these projects to make us less hostile? The sad reality is that I don’t think we have any such fatal flaws in our projects that create the obstacles. I don’t think many women ever show up near enough the projects to even get mistreated in the first place.

I have a son and I have a daughter – they’re both still young and unaware of these kinds of differences and problems. I hope I will be able to motivate and push and raise them equally. I don’t want to live in a world where my daughter will have a hard time getting into tech just because she’s a girl.

Haxx, the second year

Last year I posted my report of what I and my fellows did at Haxx after the first year of true and real independence. As I have probably mentioned before, we registered our company in 1997, but it was just a side project for over a decade.

Now, as we’re slowly approaching two years, it is time to look back at what we’ve done during the past twelve months and what we’re doing right now.

We have established ourselves even more firmly as expert developers within embedded systems. Over and over again we’re being hired by teams that are themselves hired by companies to provide services or products. During the last twelve months, we’ve written software and software designs for a huge medical equipment company, a small video equipment manufacturer, a major international telecom, a market-leading embedded systems provider and a global chip manufacturer. We’ve debugged simulation software, designed video streaming servers, done video subtitling magic, poked on Linux kernel code and done old-school 8051 and 16-bit x86 assembly. I’ve also managed to give an Embedded Linux development (in user-space) training course – twice. All this in just the past year!

Haxx was at (and presented at) FSCONS in Gothenburg, we went to (and presented at) FOSDEM in Brussels, and we went to the Rockbox devcon in London. We did lots of work within the foss-sthlm community.

Oh, and we’ve revamped our logo and graphical design.

Haxx consists of three full-time employed senior expert embedded systems consultants. We’ve all been in the industry for over twenty years: Daniel Stenberg, Björn Stenberg and Linus Nielsen Feltzing.

We continuously work with partners in the area to reach out to new and existing customers. As we’re very small and rather spend our time on working in our actual assignments we appreciate the help with sales and marketing. If you’re in the Stockholm area and ever end up needing devoted and skilled embedded software hackers, call us!

I’m gonna do my very best to make sure we get another great year! I’ll report back and tell you how it went.

curl meetup at Fosdem 2012

The FOSDEM 2012 dates were recently revealed (4-5 February 2012).

A pint of Guinness

I’d be happy to arrange a get-together for libcurl hackers at FOSDEM this year. To me, Brussels, Belgium seems central enough in Europe to be able to attract a bunch of us:

  • libcurl application users/authors
  • libcurl binding hackers
  • libcurl contributors
  • … and everyone else who’s doing related activities or who’s just interested

Potential subjects to discuss at such a meeting:

  • what’s the most important stuff libcurl still lacks?
  • what’s the least documented/understood parts of libcurl?
  • are there shared problems several/many libcurl bindings have to solve?
  • can we improve how we work/develop libcurl and bindings?
  • what kind of beer is best at a curl meetup?
  • [fill in your own curl related subject]

I would like at least 4-5 people to voice interest for it to be worthwhile for me to actually try to arrange anything. Please speak up on the libcurl mailing list, tweet me or mail me privately! The more people that are interested, the more planning and stuff we’ll do for it.

Haxx – the first year

Last year I left my former employment and focused on Haxx full-time. My brother joined me a few months afterward (January 2010). Today, on October 1 2010, we celebrate the official one year anniversary of Haxx AB as an employer.

The history of Haxx goes much further back than that. Linus Nielsen Feltzing and I first registered the company Haxx back in October 1997, and we then used it primarily as a way to market ourselves and do business on the side of our “real” jobs – a way to charge for the things we wanted to do that didn’t conflict with our day jobs. And of course we also bought the domain and could set up our “permanent” email addresses etc, which turned out great, since I’ve used the same email address ever since and I hope I never need to change it again!

The first year of Haxx has been nothing but great fun and a major success.

As we’re contract developers and consultants, we of course need to make sure that our employees are sold to customers to a high degree, with as few gaps as possible. Our projects typically go on from a few months up to a year or two. During this year, both Björn and I have worked with several end customers, and we’ve thus both managed to change assignments several times without any of the switches causing any gaps – at all. Our services seem to be in high demand.

Being only two employees brings challenges in how to deal with sales, financial accounting and so on, as we’re just a few guys and our expertise is development! We have found a few great partners that “sell” us (and of course we pay them a certain percentage, but that’s a price we need to accept, and it’s only fair anyway since it lets us keep doing what we’re good at and what we love), and we buy the bookkeeping etc from another company that specializes in doing it for companies like ours.

We’re looking forward to many more years of great fun. We also hope to be able to grow the company slightly over time, so if you’re a kick-ass embedded open source guy with networking experience and some 10+ years in the business, and you live in the Stockholm, Sweden area, do get in touch! As I’ve mentioned before, we’re gonna start our second year with Linus onboard.

I’ll get back with an update next year! 🙂