Tag Archives: cURL and libcurl

curl presentation video

I held a 38-minute talk (in English) at the FSCONS 2007 conference about curl and libcurl, and I’ve now realized that the recording from that event is available online in several formats.

You can get the pure Ogg Theora video files by using these links:

The slides from the presentation are still available.

fsfe.org hosts the complete collection of videos from that conference.

I haven’t yet had the time and opportunity to watch it myself. I figure I’ll do that soon, to see and learn from my own mistakes and odd habits when talking in public… and to try not to get too disturbed by my own accent!

What is this yassl really?

yassl is said to be Yet Another SSL library, and I’ve been told that it is, for example, the preferred library in the mysql camp. I got interested in it several years ago when I first learned about it, since I thought it was fun to see an alternative to OpenSSL that still offers the same API.
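
“The same API” here means the OpenSSL API. As a rough illustration, plain OpenSSL-era client setup code like the sketch below is what yassl’s compatibility layer is supposed to let you build and run unchanged (these are ordinary OpenSSL calls of that time; nothing here is yassl-specific):

    #include <openssl/ssl.h>

    /* plain OpenSSL-API initialization of the era; with a complete
       compatibility layer this should compile and link against yassl
       just as well as against OpenSSL itself */
    int init_tls(void)
    {
      SSL_CTX *ctx;

      SSL_library_init();
      SSL_load_error_strings();

      ctx = SSL_CTX_new(SSLv23_client_method());
      if(!ctx)
        return 1; /* allocation or setup failure */

      SSL_CTX_free(ctx);
      return 0;
    }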

Since then, I’ve amused myself by trying to build and run curl with it roughly every six months. I’ve made (lib)curl build fine with yassl (curl’s configure script even detects that the OpenSSL API it found is emulated by yassl), but I’ve never seen it run the entire curl test suite through without failing at least one test!

I asked the mysql guy how yassl has worked for them, but he kind of shrugged and admitted that they hadn’t tried it much (and I don’t really know who he spoke for, the entire team or just him and his closest friends), but he said it worked for them.

Today I noticed that yassl version 1.9.6 was out, so I downloaded it, built it and tried it against curl. This time curl completely fails to build with it…

Let me also point out that it’s not as if I haven’t told the yassl team (person?) about these problems in the past. I have, and there have been adjustments meant to address the problems I’ve seen. I just can’t make curl use it successfully… libcurl can still be built and run with OpenSSL, GnuTLS or NSS, so it’s not like we lack SSL library alternatives.

The same team/person seems to be behind another SSL library called Cyassl, aimed at smaller-footprint systems, and I’ve heard whispers about people trying to get libcurl to build against that. It will surely be interesting to see where that leads!

Metalink in curl bounty

The Metalink guys host a list of project ideas, and one of them is to add metalink support to curl. I recently upped the stakes a bit by raising the bounty by an additional 200 USD, so the offer is now 500 USD for the person or team that implements the feature as described.

My primary motivation for doing this is that I like the metalink idea and I’d like to help make sure it gets used more widely.

curl 7.19.1

Trying hard to maintain the bimonthly release schedule we’ve kept up for quite some time now, we proudly announce the release of curl and libcurl 7.19.1.

This release includes at least 24 bug fixes and the following changes:

The Open Source Census Report

I had never heard of the Open Source Census before I stumbled over a mention of their recent report somewhere. Their mission is to get “enterprises” to install their little client, which scans computers for open source products and reports the findings back to a central server.

Anyway, their current database covers a “mere” 2300 scanned machines, but that equals a total of 314,000 open source installations, with 768 different packages identified. The top-10 found products are:

  1. firefox 84.4%
  2. zlib 65.75%
  3. xerces 61.24%
  4. wget 61.12%
  5. xalan 58.19%
  6. prototype 57.03%
  7. activation 53.01%
  8. javamail 50.15%
  9. openssl 46.45%
  10. docbook-xml 46.27%

Ok, as open source hackers and geeks there are two things we need to do here: 1) find out how our own projects rank among the others, and 2) figure out how the scanning is done and thus how good it is. Thankfully both are possible, since the entire data set can be downloaded for free and the client is fully open source.

find out how our own projects rank

“curl” was found on 18.19% of all computers. That makes it #81 on the list, just below virtualbox and wireshark, but immediately above jstl and busybox. This counts “All Versions” of each tool, and in curl’s case that meant 22 different versions!

I found no other project I do anything noticeable in. Subversion is at #44.

how the scanning is done

It’s quite simple: the client looks for files matching a set of file name patterns, then pattern-matches the contents of those files and extracts version numbers using regexes. You can see the full set of patterns/rules in the XML file straight off their source code repository: project-rules.xml.
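
To give a feel for what such a rule boils down to, here is a small sketch in C using POSIX regexes: scan a chunk of file content and pull the version number out of a capture group. The pattern is made up for illustration and is not taken from project-rules.xml.

    #include <regex.h>
    #include <stdio.h>
    #include <string.h>

    /* hypothetical rule: match a "curl 7.19.0"-style banner in the
       file contents and capture the version part */
    static int extract_version(const char *content, char *version,
                               size_t vlen)
    {
      regex_t re;
      regmatch_t m[2];
      int rc = 1;

      if(regcomp(&re, "curl[ /-]([0-9]+\\.[0-9]+(\\.[0-9]+)?)",
                 REG_EXTENDED))
        return -1;

      if(regexec(&re, content, 2, m, 0) == 0) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if(len >= vlen)
          len = vlen - 1;
        memcpy(version, content + m[1].rm_so, len);
        version[len] = '\0';
        rc = 0; /* found a version */
      }
      regfree(&re);
      return rc;
    }

    int main(void)
    {
      char version[32];
      if(extract_version("curl 7.18.2 (x86_64-pc-linux-gnu)",
                         version, sizeof(version)) == 0)
        printf("found version %s\n", version);
      return 0;
    }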

how good is it

With such specific patterns matched against file contents, many individual versions of course need hand-crafted rules, and that is error-prone. It could explain why the latest curl release (7.19.0) wasn’t reported at all. It will also leave renamed tools undetected.

In my particular case I would of course also like to know how much libcurl is used, but they don’t seem to check for that (I found several projects besides the curl tool that I know use libcurl).

All this said, I didn’t actually try out the client myself so I haven’t verified it for real.

ohloh vs statcvs

I’ve played a bit with statcvs lately and generated reports for the curl repository. It turned out rather interesting (well, assuming you’re a statistics geek like me), especially in comparison to the data and stats ohloh.net presents for the same code:

Executive summary:

  • I’ve done 82% of all code changes.
  • We seem to grow at roughly the same pace (both in number of code lines and number of files) over the last few years.
  • The lines-of-code-per-file count seems rather constant.

Oh, that initial big bump in late 1999/early 2000 was due to a lot of “wrong” files, such as configure, config.guess etc, being committed and subsequently removed. It is a bit annoying to have there, as it skews the data somewhat, but I’ve not managed to fool statcvs into ignoring that part…

strcasecmp in Turkish

A friendly user submitted the (lib)curl bug report #2154627 which identified a problem with our URL parser. It doesn’t treat “file://” as a known protocol if the locale in use is Turkish.

This was the beginning of a minor world-moving revelation for me. Of course this is already known to mankind and I’m just behind, but really: lots of my fellow hacker friends had no idea either.

So “file” and “FILE” are not the same word compared case insensitively in Turkish, because there ‘i’ is not the lowercase version of ‘I’: the lowercase of ‘I’ is the dotless ‘ı’, and the uppercase of ‘i’ is the dotted ‘İ’.

Back to strcasecmp: POSIX pretty much makes the function useless by saying that “The results are unspecified in other locales [than POSIX]”.
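
Here is a minimal demonstration of the effect, assuming a Turkish single-byte locale is installed (the locale name varies between systems, and on glibc at least, strcasecmp() folds case through the current locale in single-byte locales):

    #include <locale.h>
    #include <stdio.h>
    #include <strings.h>

    int main(void)
    {
      /* assumed locale name; availability varies between systems */
      if(!setlocale(LC_CTYPE, "tr_TR.ISO-8859-9")) {
        puts("Turkish locale not installed");
        return 1;
      }

      /* prints a nonzero value here: 'I' no longer folds to 'i' */
      printf("strcasecmp(\"file\", \"FILE\") = %d\n",
             strcasecmp("file", "FILE"));
      return 0;
    }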

I’m a bit annoyed by this fact, as it means I now have to introduce my own function (which thus cannot use tolower() or toupper(), since they too are affected by the locale) and use that instead, because the strings in our code are plain “English”/ASCII strings where file and FILE truly are the same string when compared case insensitively…
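
A minimal sketch of such a locale-independent replacement, folding ASCII letters only (the function names here are made up for illustration; they are not curl’s actual internals):

    /* fold A-Z to a-z without consulting the locale */
    static char raw_tolower(char in)
    {
      if(in >= 'A' && in <= 'Z')
        return (char)(in - 'A' + 'a');
      return in;
    }

    /* case-insensitive comparison that behaves the same in every locale */
    int raw_casecompare(const char *a, const char *b)
    {
      while(*a && *b) {
        char la = raw_tolower(*a);
        char lb = raw_tolower(*b);
        if(la != lb)
          return (unsigned char)la - (unsigned char)lb;
        a++;
        b++;
      }
      return (unsigned char)raw_tolower(*a) - (unsigned char)raw_tolower(*b);
    }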

Another curl scan shows work to do

The nice guys at Coverity did a new scan on curl (the 7.19.0 source code) and dug up a bunch of new flaws. The previous version they checked was 7.16.1, released some 20 months earlier. The new findings are not only due to how the code has changed in the meantime; it seems their scanner has improved a bit since last time as well!

Here’s a sample view of how libcurl might dereference a NULL pointer, with a step-by-step explanation of the conditions that lead to the flaw:
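
The pattern behind this class of findings is typically a pointer that is NULL-checked on one code path but dereferenced unconditionally on another. A made-up minimal example of what such a scanner flags (not the actual libcurl code):

    #include <string.h>

    struct session {
      char *hostname; /* may legitimately be NULL */
    };

    size_t host_name_length(struct session *s)
    {
      /* this check tells the analyzer that s->hostname can be NULL... */
      if(s->hostname && !strcmp(s->hostname, "localhost"))
        return 9;

      /* ...so this unconditional dereference gets flagged as a
         possible NULL pointer dereference */
      return strlen(s->hostname);
    }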

They identified 22 flaws, and I found it interesting to compare the top list of flawed functions as reported by Coverity with the complexity list I showed the other day. First we need to ignore the 9 flaws Coverity found in the ‘curl’ tool code (i.e. not within the library). The 10 remaining functions with flaws marked by Coverity are:

  • Curl_getinfo (4 flaws; all the others have one each)
  • Curl_cookie_add (present in the complexity top-10 table)
  • FormAdd (present in the complexity top-10 table)
  • parsedate
  • ftp_parse_url_path
  • tftp_do
  • resolve_server
  • curl_easy_pause
  • add_closure
  • Curl_connect

See? Only two of them were present in that list. The Coverity tool does in fact also count the complexity of each function, and while its values don’t exactly match what pmccabe shows, the two seem to agree in general about which functions are the most complex ones.

Ok, now let’s go work on fixing all these problems…

Curl Cyclomatic Complexity

I was at the OWASP Sweden meeting last night and spoke about open source and security. One of the other speakers was Simon Josefsson, who in his talk showed a nice table listing the functions in his project sorted by “complexity”. Functions above a certain score are considered “high risk”, as they are hard to read and follow and thus may be subject to security problems.
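
For reference, the score here is McCabe’s cyclomatic complexity: the number of linearly independent paths through a function’s control-flow graph,

    M = E - N + 2P

where E is the number of edges, N the number of nodes and P the number of connected components (1 for a single function). In practice it works out to the number of decision points (if, while, case and so on) plus one.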

The kind man he is, Simon already publishes a Curl Cyclomatic Complexity Report, nicely identifying a bunch of functions whose complexity we should really consider reducing. The top-10 “bad” functions are:

Function            Score  Statements  Lines  Source file
ssh_statemach_act     254         880   1582  lib/ssh.c
Curl_http             204         395    886  lib/http.c
readwrite_headers     129         269    709  lib/transfer.c
Curl_cookie_add       118         247    502  lib/cookie.c
FormAdd               105         210    421  lib/formdata.c
multi_runsingle        94         251    606  lib/multi.c
dprintf_formatf        92         233    395  lib/mprintf.c
Curl_proxyCONNECT      74         212    443  lib/http.c
readwrite_data         73         127    319  lib/transfer.c
ftp_state_use_port     60         195    387  lib/ftp.c

I intend to use this as an indication of which functions within libcurl to work on. My plan is primarily to break each of these functions down into smaller ones, to make them easier to read and follow. It would be cool to get every single function below 50, but I’m not sure that’s feasible or even really a good idea.
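
A minimal sketch of the kind of refactor I mean, with hypothetical states and function names (not actual libcurl code): a big switch-based state machine, like ssh_statemach_act, can have each case moved out into its own small function, so each piece gets a low score even though the total logic stays the same.

    /* hypothetical protocol states, for illustration only */
    typedef enum { ST_INIT, ST_SEND, ST_RECV, ST_DONE } teststate;

    /* each former switch-case becomes a small function with its own,
       much lower, complexity score */
    static teststate state_init(void) { /* setup work */   return ST_SEND; }
    static teststate state_send(void) { /* send request */ return ST_RECV; }
    static teststate state_recv(void) { /* read response */ return ST_DONE; }

    /* the dispatcher that remains is trivial to read and follow */
    static teststate statemach_act(teststate s)
    {
      switch(s) {
      case ST_INIT: return state_init();
      case ST_SEND: return state_send();
      case ST_RECV: return state_recv();
      default:      return ST_DONE;
      }
    }

    int main(void)
    {
      teststate s = ST_INIT;
      while(s != ST_DONE)
        s = statemach_act(s);
      return 0;
    }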