Category Archives: Open Source

Open Source, Free Software, and similar

A libcurl postergirl?

google for libcurl

If you click the image you’ll see a full-resolution screendump of my recent search for “libcurl” on Google. Where did that (image of a) girl come from? Judging from where it appears on the results page, right next to the information about the cURL project, you can’t help but assume that she’s somehow related to the project.

That’s of course not true. When moving the mouse over the image I get a tooltip with a funny “hair curling” URL and that’s also where a click on the image takes me.

A mighty weird way of presenting a search result if you ask me!

I like a good firmware bump

So I have this TV that I got for Christmas 2009. As it happens, the guys at Philips clearly kept fixing the software and removing bugs after that moment. No surprise there really. I’ve been an embedded software developer for some twenty years by now. I know that software never gets “done” and that what ships in products is only what seems to be “good enough” at some point in time. Sometimes of course not even that good.

So the other day I took a photo of my TV firmware version. It shows how the firmware was made in April 2009. I did it during a discussion with a friend who happens to have the exact same TV as I do, and it then of course turns out he has a different (newer) firmware.

Oh right, I wonder if I can upgrade to a newer one? Once I had mastered the maze of the Philips web site, I eventually found a download link and PDFs that told me how to do it. The list of fixes since my version was extensive and I noticed a few flaws mentioned that I have actually experienced!

The TV firmware download was a whopping 43MB. I realize this is because it is a full-fledged Linux system with kernel and God knows what else they’ve crammed in there. I decided to give it a closer check! The result of that was a little disappointing. It is quite clearly encrypted after some basic initial header.

hexdump -C firmware image

The data that starts at offset 0x220 is not x86 instructions, and in fact nothing in the beginning of the file looks like x86 code (I just ran a quick “objdump -D --target binary -m i386” on the file). Of course, I don’t know what architecture my TV runs so perhaps even checking for x86 is wrong. I know MIPS is popular in DVD players, set-top boxes and related graphics stuff but… nah, I decided it really wasn’t worth the effort so I stopped investigating. I have no real intention of hacking on it anyway.

So I instead proceeded to the actual procedure of upgrading the thing.

Unzip the zip file and put the file in the root dir of a FAT32-formatted usb-stick. The instructions of course didn’t say it needs to be FAT32, but that’s what I used and it worked. I just smirk in the dark at how a manufacturer like this simply assumes that we all have FAT32 on our usb-sticks…

But I digress. When I inserted the upgrade USB, the TV switched itself off, was dark for a short while and then turned itself on again and showed the firmware upgrade screen.

The process was very fast, just like 30-40 seconds or something like that and then it was done and asked me to remove the “media” and restart. Of course we know that a usb stick is “media” so I removed it from the TV set.

The instructions were very clear that to “restart” the TV I must only press the ON/OFF button on the remote once and only once. So I was careful to do just that… 😉

Nothing strange happened, and after a brief moment of black screen the regular and familiar interface appeared.

I jumped into the firmware version menu to check it out and yes, it shows an updated version now:

I did a quick check to see if I could detect my previous quirks now, but they may really be gone. They were related to sound through HDMI and some graphical “glitches” when feeding the TV with full HD from a laptop.

So, with this firmware that was shipped many months after I got my TV, I seem to have gotten a better product.

I haven’t yet tested this new version to any significant degree, so I don’t know yet if I’ve gotten some nasty new side-effects from it, as these kinds of firmware upgrades sometimes really cause you pain when something that used to work so well suddenly turns out to no longer work that well.

Rockbox Devcon 2011

Rockbox

Hordes of hackers in similar-looking t-shirts with funny logos having the b in front of the K (see below for some sort of explanation) were seen on the streets of London on Friday June 3rd 2011.

Thanks a lot to Google UK who hosted our Rockbox developers conference, this time in central London.

We had some short-term visitors, but we were 16-18 reverse-engineering-happy people in a single room most of the weekend, where we hacked away on code, whined about the number of outstanding patches and bugs and generally made a large number of bad jokes and Monty Python references.

The happy core team was caught in a picture:

Rockbox team Devcon 2011

On the Saturday we plowed through a lengthy list of discussion points to really make the most of all of us gathering physically. Among the outcomes: we decided we want to change to git, we think a lot of the future of Rockbox lies in the Android app, we keep the Archos support, and more. The Android builds are going to get into the build system ASAP and we’re going to set up a system where only trusted build clients participate in building the Android builds that get distributed to users – this since applications on phones run a much greater risk of causing harm if some “bad guy” were to try to infect our system with stupid things.

Dominik “bluebrother” Riebling brought up a very interesting point that none of us had noticed: we have two different logos in use in the project: one with the K in front of the b (like the one on the web page) and one with the K behind the b – which is used in the SVG logos and on just about all Rockbox t-shirts made so far! If you zoom in on the t-shirts in the group picture you’ll see!

We will also start allowing GPLv3 code into Rockbox in order to be able to use espeak, but all our code will remain GPLv2 or later. I could only find a single USB header file left that comes from the Linux source tree and has a GPLv2 only license.

Even more than this was discussed but I figure the rest of the details will be posted properly on rockbox.org for those seriously interested.

All in all, it was a very enjoyable weekend with a lot of fun and great friends. We stayed at a hotel just a few blocks from the devcon office, which was really convenient even though its wakeup routine was a bit non-standard. Peter “petur” D’Hoye took a lot of pictures as usual.

We also managed to break the Tower of Rockbox record.

Daniel "Bagder" Stenberg Rockbox Devcon 2011

The group picture was taken by a Google person who helped us out and whose name I don’t know, and the one of me was taken by Peter D’Hoye.

Rockbox bridge and tower

Keeping to the tradition and subtle arts of Rockbox Towers, but with a twist to celebrate the place where we held Rockbox devcon 2011, we decided to build a Rockbox bridge.

We started out by gathering all devices we had in the room that can run Rockbox and distributed them on the construction floor area. As the Android app runs fine on tablets now there’s actually a rather good way to get some solid base into the construction…

Many Rockbox devices

Once all the material was known, construction started, with a large number of eager engineers contributing good and bad ideas and, at times, very shaky hands:

constructing a Rockbox bridge

(wods, scorche, gevaerts and paumary)

The result, involving an iRiver beneath the bridge catching the digital flow, became what might be the longest Rockbox construction done so far:

Rockbox bridge

Rockbox bridge closeup

After the bridge, work started on the real stuff: building the tallest Rockbox tower ever made. After a couple of accidents and crashes, the tireless team managed to break the previous 104 cm record and the new Rockbox tower record is now officially 117 cm:

Rockbox devcon 2011 tower 117cm

(Pictures in this post were all taken by Peter D’Hoye.)

foss-sthlm, the sixth, the controversial one

I’m happy to say that I was the main planner and organizer of yet another foss-sthlm meetup, the sixth. Last week we attracted about one hundred eager FOSS hackers to attend this meeting, for which I had found and cooperated with a somewhat controversial sponsor.

We have since had a discussion within foss-sthlm due to this event’s sponsor: what kinds of companies are acceptable as sponsors for FOSS events and what kinds are not? It is obvious that we have a lot of different opinions here, and several people have expressed that they deliberately didn’t go to meetup #6 simply because of the sponsor’s involvement in and relationship to the Swedish defence industry. Do you think a defence manufacturer or weapon systems creator can also have its good sides and sponsor good activities, or should we distance ourselves from them?

I’m honestly still interested in more opinions on this. We have not formed a policy around this subject because I simply don’t think that’s a good idea. We’re not a formal organization and we all have our different views. I think our members will hang around and participate as long as we stay within a reasonable “boundary”. If we got involved with the wrong kind of companies, a larger portion of the group would boycott the meeting or just plainly leave foss-sthlm. But why would we ever do that? I’m the one who has mostly been in touch with sponsors, and I would certainly not get involved with and ask for money from companies that I believe have crossed the magic line in the sand.

The meeting was perhaps the most techy and most advanced of them all so far, and I found the talks very inspiring and educational. I’ve not had the time or energy to put up a page with pictures or descriptions of them, and I think I’ll just skip it. You should probably try harder to attend next time instead!

US patent 6,098,180

(I am not a lawyer, this is not legal advice and these are not legal analyses, just my personal observations and ramblings. Please correct me where I’m wrong or add info if you have any!)

At 3:45 pm on March 18th 2011, the company Content Delivery Solutions LLC filed a complaint in a court in Texas, USA. The list of defendants includes several big and well-known names of the Internet:

  • Akamai
  • AOL
  • AT&T
  • CD Networks
  • Globalscape
  • Google
  • Limelight Networks
  • Peer 1 Network
  • Research In Motion
  • Savvis
  • Verizon
  • Yahoo!

The complaint was later amended with an additional patent (filed on April 18th), making it list three patents that these companies are claimed to violate (I can’t find the amended version online though). Two of the patents (6,393,471 and 6,058,418) are basically about marketing data and how to use client info to present ads. The third is about file transfer resumes.

I was contacted by a person involved in the case at one of the defendants’. This unspecified company makes one or more products that use “curl“. I don’t actually know if they use the command line tool or the library – but I figure that’s not too important here. curl gets all its superpowers from libcurl anyway.

This Patent Troll thus basically claims that curl violates a patent on resumed file transfers!

The patent in question, the one that curl would supposedly violate, is US patent 6,098,180, which basically claims to protect this idea:

A system is provided for the safe transfer of large data files over an unreliable network link in which the connection can be interrupted for a long period of time.

The patent describes several ways it may detect how to continue the transfer after such a break. As curl only does transfer resumes based on a file name and an offset, as told by the user/application, that is the only method of the patent they could claim curl violates.
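For concreteness, here is a minimal sketch in C of what offset-based resuming looks like from libcurl’s side (the URL and file name are made up for illustration): the application hands libcurl the offset with CURLOPT_RESUME_FROM_LARGE, and libcurl turns that into a Range: request header for HTTP or a REST command for FTP.

#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  if(curl) {
    /* append to the partially downloaded file we already have */
    FILE *out = fopen("bigfile.part", "ab");
    if(out) {
      fseek(out, 0, SEEK_END);
      curl_off_t resume_at = (curl_off_t)ftell(out);

      curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/bigfile");
      curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);
      /* resume at this byte offset: Range: for HTTP, REST for FTP */
      curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE, resume_at);
      curl_easy_perform(curl);
      fclose(out);
    }
    curl_easy_cleanup(curl);
  }
  return 0;
}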

The patent goes into detail about how a client first sends a “signature” and then, after an interruption when the file transfer is about to continue, asks the server about details of what to send in the continuation. With a very vivid imagination, that could possibly equal the response to an FTP SIZE command or the Content-Length: response to an HTTP GET or HEAD request.

A more normal reader would rather say that no modern file transfer protocol works as described in that patent, and we should go with “defendant is not infringing, nothing to see here, move on”.

But for the sake of the argument, let’s pretend that the patent actually describes a method of file transfer resuming that curl uses.

The ‘180 patent (that’s how it is referred to within the court documents) was filed on February 18th 1997 (and issued on August 1, 2000). Apparently we need to find prior art that was around no later than February 17th 1996, that is to say one year before the filing of the stupid thing. (This I’ve been told; I had no idea it could work like this and it seems shockingly weird to me.)

What existing tools and protocols did resumed transfers in February 1996 based on a file name and a file offset?

Lots!

Thank you all friends for the pointers and references you’ve brought to me.

  • The FTP spec RFC 959 was published in October 1985. FTP has a REST command that tells the server at what offset to “restart” the transfer. This was being used by FTP clients long before 1996; one example is the well-known Kermit FTP client that did offset-based resumed file transfers in 1995.
  • The HTTP header Range: introduces this kind of offset-based resumed transfer, although with a slightly fancier twist. The Range: header was discussed before the magic date, as also can be seen on the internet already in this old mailing list post from December 1995.
  • One of the protocols that those of us who used modems and BBSes in the old days remember is zmodem. Zmodem was developed in 1986 and there’s this zmodem spec from 1988 describing how to do file transfer resumes.
  • A slightly more modern protocol that I’ve unfortunately found no pre-cut-off history for is rsync; I could only find the release mail for rsync 1.0 from June 1996. Still long before the patent was filed obviously, and also clearly showing that the one-year margin is silly, as for all we know they could’ve come up with the patent idea after reading the rsync release notes and yet rsync can’t be counted as prior art.
  • Someone suggested GetRight as a client doing this, but GetRight 1.0 wasn’t released until February 1997, so unfortunately that doesn’t help our case even if it seems to have done this at the time.
  • curl itself does not pre-date the patent filing. curl was first released in March 1998, and its predecessor was started around summer-time 1997. I don’t have any remaining proof of that, and it wasn’t before “the date” anyway so I don’t think it matters much now.

At the time of this writing I don’t know where this will end up or what’s going to happen. Time will tell.

This Software patent obviously is a concern mostly to US-based companies and those selling products in the US. I am neither a US citizen nor do I have or run any companies based in the US. However, since curl and libcurl are widely used products that are being used by several hundred companies already, I want to help bring out as much light as possible onto this problem.

The patent itself is of course utterly stupid and silly and should never have been granted, as it describes trivially thought-out ideas and concepts that had been thought of and implemented decades before this patent was filed, although I’ll claim that the exact way explained in the patent is not frequently used. The protocol using a method closest to the patent’s description is possibly zmodem.

I guess I don’t have to mention what I think about software patents.

I’m convinced that most or all download tools and browsers these days know how to resume a previously interrupted transfer this way. Why wouldn’t these guys also approach one of the big guys (with thick wallets) who also use this procedure? Surely we can think of a few additional major players with file tools that can resume file transfers and who weren’t targeted in this suit!

I don’t know why. Clearly they’ve not backed down from attacking some of the biggest tech and software companies.

patent drawing

(Illustration from the ‘180 patent.)

libcurl’s name resolving

Recently we’ve put in some efforts into remodeling libcurl’s code that handles name resolves, and then in particular the two asynchronous name resolver backends that we support: c-ares and threaded.

Name resolving in general in libcurl

libcurl can be built to do name resolves using different means. The primary difference between them is that they are either synchronous or asynchronous. The synchronous way makes the operation block during name resolves, and there’s no “decent” way to abort a resolve if it takes longer than the program wants to allow (other than using signals, and that’s not what we consider a decent way).

Asynch resolving in libcurl

This is done in one of two ways: by building libcurl with c-ares support or by building libcurl and telling it to use threads to solve the problem. libcurl can be built using either mechanism on just about all platforms, but on Windows the build defaults to using the threaded resolver.
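As a side note, an application can check at run-time which flavor a given libcurl build got, using curl_version_info(). A minimal sketch, assuming the usual version-info fields:

#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  curl_version_info_data *info = curl_version_info(CURLVERSION_NOW);

  if(info->features & CURL_VERSION_ASYNCHDNS)
    /* the ares string is only set when libcurl was built with c-ares */
    printf("asynchronous resolver: %s\n", info->ares ? "c-ares" : "threaded");
  else
    printf("synchronous (blocking) resolver\n");

  return 0;
}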

The c-ares solution

c-ares’ primary benefit is that it is an asynchronous name resolver library, so it can do name resolves without blocking and without requiring a new thread. This makes it use fewer resources and remain a perfect choice even if you scale your application up to, and beyond, an insane number of simultaneous connections. Its primary drawback is that since it isn’t based on the system’s default name resolver functions, it doesn’t behave exactly like them, and that causes trouble at times.

The threaded solution

By making sure the system functions are still used, this approach makes name resolving work exactly as with the synchronous solution, but thanks to the threading it doesn’t block. The downside is of course that it uses a new thread for every name resolve, which in some cases can become quite a large number, and creating and killing threads at a high rate is much more costly than sticking with a single thread.

Pluggable

Now we’ve made sure that we have an internal API that both our asynchronous name resolvers implement, and all code internally uses this API. It makes the code a lot cleaner than the previous #ifdef maze for the different approaches, and it has the side-effect that it should allow much easier pluggable backends in case someone would like to make libcurl support another asynchronous name resolver or system.

This is all brand new in the master branch so please try it out and help us polish the initial quirks that may still exist in the code.

There is no current plan to allow this plugging to happen run-time or using any kind of external plugins. I don’t see any particular benefit for us to do that, but it would give us a lot more work and responsibilities.


HTTP transfer compression

HTTP is a protocol that looks simple in its simplest form, and its readability can easily fool you into believing an implementation is straightforward and quickly done.

That’s not the reality though. HTTP is a very big protocol with lots of corners and twisting mazes that one can get lost in. Even after having been the primary author of curl for 13+ years, there are still lots of HTTP things I don’t master.

To name an example of an area with little-known quirks, there’s a funny situation when it comes to how HTTP supports, and doesn’t support, compression of data and compression of data in transfer.

No header compression

A little flaw in HTTP in regards to compression is that there’s no way to compress headers, in either direction. No matter what we do, we must send the header text as-is, and both requests and responses are sometimes very big these days. Especially taking into account how cookies are always inserted into requests if they match. Anyway, this flaw is nothing we can do anything about in HTTP 1.1, so we need to live with it.

On the other side, compression of the response body is supported.

Compressing data

Compression of data can be done in two ways: either the actual transfer is compressed or the body data is compressed. The difference is subtle, but when the body data is compressed there’s really nothing that mandates that the client has to uncompress it for the end user, and if the transfer is compressed the receiver must uncompress it in order to deal with the transfer properly.

For reasons that are unknown to me, HTTP clients and servers started out supporting compression only using the Content-Encoding style. It means that the client tells the server what kind of content encodings it supports (using Accept-Encoding:) and the server then sends the response data using one of the supported encodings. The client then decides on its own that if it gets the content in one of the compressed formats it said it can handle, it will automatically uncompress it on arrival.
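In libcurl terms, this content-encoding style is what you get by setting the encoding option to an empty string, which makes libcurl offer every encoding it was built to support and uncompress the response body automatically. A minimal sketch with a made-up URL (the option is called CURLOPT_ACCEPT_ENCODING in later libcurl versions, CURLOPT_ENCODING in older ones):

#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  if(curl) {
    curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/");
    /* send Accept-Encoding: with every encoding this build supports and
       let libcurl uncompress the Content-Encoded body on arrival */
    curl_easy_setopt(curl, CURLOPT_ACCEPT_ENCODING, "");
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);
  }
  return 0;
}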

The HTTP protocol designers however intended this kind of automatic compression and subsequent uncompression to be done using Transfer-Encoding, as the end result is then completely transparent and the uncompress action is implied and intended by the protocol design. This is done by the client telling the server what transfer encodings it supports with the TE: header, and the server adding a Transfer-Encoding: header in the response telling how the transfer is encoded.

HTTP 1.1 introduced a mandatory encoding that all servers can use whenever they feel like it: chunked encoding, so all HTTP 1.1 clients already have to deal with Transfer-Encoding to some degree.

Surely curl is better than all those other guys, right?

Not really. Not yet anyway.

curl has a long history of copying its behavior from what the browsers do, in order to allow users to basically script anything imaginable that is HTTP-like with curl. In this vein, we implemented compression support the same way as all the browsers did it: the content encoding style. (I have reason to believe that at least Opera actually supports or used to support compressed Transfer-Encoding.)

Starting now (code pushed to the git repo just after the 7.21.5 release), we’ve taken steps to improve things. We’re changing gears and introducing support for asking for and using compressed Transfer-Encoding. This will start out as an optional feature/flag (--tr-encoding / CURLOPT_TRANSFER_ENCODING) so that we can see how servers in the wild behave and make sure we deal with them properly. Then possibly we can switch the default in the future to always ask for compressed transfers. At least for the command line tool.
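From the libcurl side, a minimal sketch of asking for compressed Transfer-Encoding looks like this (made-up URL; the new option takes a plain on/off long and makes libcurl send a TE: header and decode the compressed transfer transparently):

#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  if(curl) {
    curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/");
    /* the library-side equivalent of the tool's --tr-encoding flag */
    curl_easy_setopt(curl, CURLOPT_TRANSFER_ENCODING, 1L);
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);
  }
  return 0;
}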

We know from the few tests we are aware of that there is at least one known little problem, or shall we call it a little detail to keep an eye on, with introducing compressed Transfer-Encoding. As was so finely reported several years ago in the Opera blog post Browser sniffing gone wrong (again): Cars.com, there are cases where this may cause the server to send data that gets compressed twice (using both Content and Transfer Encoding), and that needs to be taken care of properly by the client.

At the time of this writing, I’ve not yet taken care of the double-compression case in the code, but I intend to get to it shortly.

I’m otherwise very interested in hearing what kind of experience people will have from this. What servers and sites will support this as documented and intended?

Shipping curl 7.21.5

I don’t usually post anything here when we do curl releases, pretty much because we do them bimonthly on a fairly steady schedule so there should be little surprise to anyone interested by the time they get public.

But hey, this is hard work and just to remind you all what’s going on I thought I’d throw in a mention of what we’ve spent the last two months doing. curl and libcurl 7.21.5 is released today.

The five notable changes introduced this time include:

The CURLOPT_SOCKOPTFUNCTION callback can now return information back to libcurl saying that the socket libcurl operates on is already connected. This is useful for applications that do a lot of fiddling on their own and possibly provide their own socket to start with, using CURLOPT_OPENSOCKETFUNCTION.
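A minimal sketch of how an application might use this; the callback and function names are made up and the code that actually creates and connects the socket is left out:

#include <curl/curl.h>

/* tell libcurl that the socket it was handed is already connected,
   so it should skip its own connect phase */
static int sockopt_cb(void *clientp, curl_socket_t curlfd,
                      curlsocktype purpose)
{
  (void)clientp;
  (void)curlfd;
  (void)purpose;
  return CURL_SOCKOPT_ALREADY_CONNECTED;
}

void setup_handle(CURL *curl)
{
  curl_easy_setopt(curl, CURLOPT_SOCKOPTFUNCTION, sockopt_cb);
}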

curl the tool got support for the --netrc-file option, which allows a user to point out a specific .netrc file instead of always being forced to use the fixed $HOME/.netrc one.
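The tool option presumably maps to libcurl’s existing CURLOPT_NETRC_FILE option; a minimal sketch with a made-up URL and path:

#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  if(curl) {
    curl_easy_setopt(curl, CURLOPT_URL, "ftp://example.com/file");
    /* require netrc-style credentials, read from a specific file */
    curl_easy_setopt(curl, CURLOPT_NETRC, (long)CURL_NETRC_REQUIRED);
    curl_easy_setopt(curl, CURLOPT_NETRC_FILE, "/home/user/.netrc-other");
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);
  }
  return 0;
}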

Brand new support for building libcurl with the cyassl library for SSL/TLS support. Previously curl only had support for the older OpenSSL emulation API that cyassl used to provide, but starting now we’re using cyassl directly and it is now a proper SSL citizen among the seven SSL libraries curl supports.

Since the previous release, in which we shipped the first TLS-SRP support (requiring GnuTLS), the OpenSSL project has accepted patches that introduce TLS-SRP into their official version as well, and accordingly we have received patches that now allow users to use TLS-SRP with libcurl built against (a new enough) OpenSSL too.

We have started to re-use two error codes a bit differently within libcurl, so that it now can return CURLE_NOT_BUILT_IN (4) when an application tries to use a feature that is missing or was explicitly disabled at build-time, and CURLE_UNKNOWN_OPTION (48) when the application has passed in an option that isn’t known or recognized.
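A quick sketch of what checking for these return codes could look like in an application; the particular option used here is just an example:

#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  if(curl) {
    /* check the setopt return code instead of ignoring it */
    CURLcode rc = curl_easy_setopt(curl, CURLOPT_TRANSFER_ENCODING, 1L);

    if(rc == CURLE_UNKNOWN_OPTION)
      printf("this libcurl does not know the option\n");
    else if(rc == CURLE_NOT_BUILT_IN)
      printf("the feature is missing or was disabled at build-time\n");
    else if(rc != CURLE_OK)
      printf("setopt failed: %s\n", curl_easy_strerror(rc));

    curl_easy_cleanup(curl);
  }
  return 0;
}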

And we’re counting more than 40 bugfixes worth mentioning. The most important ones are possibly:

If using the multi interface doing RTSP, libcurl could crash when trying to re-use a previous connection.

POP3 didn’t do TLS properly, it issued the wrong command to start TLS and it didn’t send the password correctly once it did switch to TLS!

When using the multi interface, there could be times when the timeout didn’t trigger so it wouldn’t close lingering connections even when asked to do so.

SFTP and SCP with the multi_socket interface were not working correctly and would very easily end up with stalled transfers due to the application being told to wait for the wrong action (or none at all).

If told to use the CCC command (which is used with FTP-SSL when the client asks the server to switch off from an SSL connection back to plain TCP again), curl would disable SSL on the connection but then use the wrong socket reader function and crash.

… but of course, if you’ve suffered from a particular bug in a previous release I’m sure you’ll consider the exact bug fix that corrects your problem to be the most important one!

Not to forget: the great people, apart from yours truly, who have contributed code and insights since the previous release. Without them, the above list of changes and bugfixes just wouldn’t exist. The friends we have to thank are (in no particular order):

Mike Crowe, Kamil Dudka, Julien Chaffraix, Hoi-Ho Chan, Ben Noordhuis, Dan Fandrich, Henry Ludemann, Karl M, Manuel Massing, Marcus Sundberg, Stefan Krause, Todd A Ouska, Saqib Ali, Andre Guibert de Bruet, Tor Arntsen, Vincent Torri, Dave Reisner, Chris Smowton, Tinus van den Berg, Hongli Lai, Gisle Vanem, Andrei Benea, Mehmet Bozkurt

… and now back to working towards the next release. To be expected in roughly two months. Repeat.