Category Archives: cURL and libcurl

curl and/or libcurl related

curl vs libcurl

In my mini-series of A vs B articles, the time has come for curl vs libcurl.

For me, the differences are very clear and obvious, but I get such a steady stream of questions from users and random people that I thought it was about time to make an effort to, once and for all, make a page with the facts stated. A fixed home for curl vs libcurl knowledge.

So I did. And now I have mentioned it to you. Enjoy! If you have additional content you think belongs there, or if you think anything is unclear or wrong, don’t hesitate to let me know!

cURL

Daniel’s currency exchange is no more

For quite a number of years I maintained a little web service to provide currency exchange rates in a handy format and in a way that was friendly for machines and other machine-exchangers. My personal favorite feature was the “easy conversion” helper that would provide an “easy to calculate in head” formula for going back and forth between two currencies based on their current rates. Like “multiply by 5 and divide by 2” etc.
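As an illustration only, an “easy conversion” helper of this kind could be sketched in Python as below. The function name easy_conversion, the brute-force search and the cap of 10 on both numbers are assumptions made for this sketch; nothing here describes how the original service actually computed its formulas.

```python
def easy_conversion(rate, max_factor=10):
    """Find small integers (multiply, divide) whose ratio is close to rate.

    Hypothetical helper: both numbers are capped at max_factor so that
    the result stays easy to calculate in your head.
    """
    best = None
    for divide in range(1, max_factor + 1):
        for multiply in range(1, max_factor + 1):
            error = abs(multiply / divide - rate)
            # Strict comparison keeps the first (smallest) pair on ties.
            if best is None or error < best[0]:
                best = (error, multiply, divide)
    _, multiply, divide = best
    return multiply, divide


multiply, divide = easy_conversion(2.5)
print(f"multiply by {multiply} and divide by {divide}")
```

For a rate of 2.5 this gives the “multiply by 5 and divide by 2” example from the text. Rates that are not close to any small fraction will only be approximated.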

This service goes all the way back to 1997 when I started to work on getting exchange rates downloaded as a service to the IRC bot I ran in #amiga on efnet (even before the split when ircnet was created). Back then I was primarily working on the IRC bot named Dancer. In 1997 I started the work on a tool to fetch rates. The tool would become curl, and the web site to access the rates was initially hosted by Frontec, the company I worked for back then.

The URL changed a few more times but it has been available at http://daniel.haxx.se/currency for the last few years until a few weeks ago. Well, technically the URL still works but the service does not.

So a few weeks ago the primary site I scraped for this info changed its format and I decided not to play cat and mouse anymore. I was already bending the rules by not reading their terms of service, as I feared I wouldn’t be allowed to use their data like this. Also, I really don’t have any use for this service myself, so I decided to do myself a service and stop wasting spare time on one of these projects that don’t give me enough personal satisfaction. I’m sure that if there is demand for a service like the one I have now closed down, there will be someone else out there ready to fire one up and serve users.

So long, and thanks for all the currency exchange fun.

My talk Optimera Sthlm

30 minutes is a tricky period to fill with content when you do a talk, and yesterday I did my best at confusing/informing the audience at the OPTIMERA STHLM conference about transport layer performance: where time is spent or lost today in TCP, what to think about to get things to behave faster, that RTT is not getting better even though bandwidth is growing really fast these days, and a little about some future technologies like WebSockets, SPDY, SCTP and MPTCP.

Note: this talk is entirely in Swedish.

My slides for this are also viewable on slideshare.net.

OPTIMERA STHLM

Our friends at .SE are once again putting together an interesting conference-style day with talks, and this time the title of it is “OPTIMERA STHLM” (yes they use all caps) and it is all about optimizing web-related things.

I’ve been invited and I will do a 30 minute talk during that day about the transport layer and stuff on top of the transport layer. In other words it’ll include things to consider for TCP, DNS, HTTP, handling sockets, libcurl and a quick look at things such as Websockets, SPDY, MPTCP and SCTP.

The full day’s program is now available on the linked page. Enjoy!

professional libcurl hackers look this way!

In my company, Haxx, we work as consultants and we do contract development for customers who pay for our skill, time and dedication. We help them develop stuff.

We’re a small company, with basically two full-time employees. Most of our working days, we are involved with a single customer each who pays for our full-time involvement during a number of months. This is all good and fine. We love our jobs and we love our customers. We’re in it for the fun.

Now, these days we can see that the economy is slowly but surely gaining ground again and is getting up to speed. We hear more and more requests for help and potential assignments are starting to pour in. That’s great and all. Except that we’re only two guys and can’t accept very many projects…

Recently we’ve experienced a noticeable increase in the number of requests for support and other development help that involves curl and libcurl. Since I am the originator and maintainer of curl, it is really no surprise that these companies contact me and us about it. I’m always very happy to see that there are companies and persons who are willing to pay for support of open source, and in many cases pay for extending and bug fixing libcurl and have those fixes go back to the mainline sources without complaints.

Since we have to turn down a lot of requests, I’m interested in finding those of you who are interested in helping out with such work. Are you interested in helping out customers with curl related problems? Customers often come to us when they’ve got stuck on something they can’t easily solve themselves, and they turn to us as experts in general, and experts on curl and libcurl in particular. And we are.

Before you think this is a great idea and you send me an email introducing yourself and your greatness in this area, please be aware that I will require proof of your qualifications. Most preferably, that proof is at least one good patch posted to the libcurl mailing list and accepted into the mainline libcurl code, but I’m open to accepting slightly less ideal proofs as well if you can explain why you failed to provide the ideal ones. Of course you will also need to be able to communicate in English without problems. Your geographical location, gender, race, religion, skin-color and shoe size are completely uninteresting.

I’m looking for someone interested in contract development, not full-time employment. We still do these kinds of jobs on a case by case basis and there may be one every two days, one per week or sometimes even less frequently. I want to grow my network of people I know and trust to deliver quality code and services for these kinds of projects.

Can you help us?

curl and speced cookie order

I just posted this on the curl-library list, but I feel it deserves a separate mention here.

As I’ve mentioned before, I’m involved in the IETF http-state working group which is working to document how cookies are used in the wild today. The idea is to create a spec that new implementations can follow and that existing implementations can use to become more interoperable.

(If you’re interested in these matters, I can only urge you to join the http-state mailing list and participate in the discussions.)

The subject of how to order cookies in outgoing HTTP Cookie: headers has been much debated over recent months, and I’ve also blogged about it. Now the issue has been closed, and the decision went quite opposite to my standpoint. The spec will say that servers SHOULD NOT rely on the order (yeah right, some obviously already do, and with it specified like this even more soon will), while recommending that clients sort the cookies in a given way that is close to the way current Firefox does it[*].

This has the unfortunate side-effect that to become fully compatible with how the browsers do cookies, we will need to sort our cookies a bit more than what we just recently introduced. That in itself really isn’t very hard since once we introduced qsort() it is easy to sort on more/other keys.

The biggest problem we get with this is that the sorting uses the creation time of the cookies. libcurl and curl and others mostly use the Netscape cookie files to store cookies and keep state between invocations, and that file format doesn’t include creation time info! It is a simple text-based file format with TAB-separated columns, and the last (7th) column is the cookie’s content.

In order to support the correct sorting between sessions, we need to invent a way to pass on the creation time. My thinking is that we do this in a way that allows older libcurls to still understand the file, just without seeing the creation time, while newer versions will be able to get it. This would be possible by extending the expires field (the 5th), as it is a numerical value that the existing code parses as a number, stopping at the first non-digit character. We could easily add a separator character and store the creation time after it. Like:

Old expire time:

2345678

New expire+creation time:

2345678/1234567

This format might even work with other readers of this file format if they make similar assumptions about the data, but the truth is that while we picked the format in the first place to be able to exchange cookies with a well known and well used browser, no current browser uses that format anymore. I assume there are still a bunch of other tools that do, like wget and friends.
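To make the backwards-compatibility argument concrete, here is a minimal Python sketch of how an old and a new reader might treat the extended field. The function names legacy_expires and parse_expires_field are hypothetical, and the code only illustrates the “stop at the first non-digit” idea described above; it is not libcurl’s actual parser.

```python
DIGITS = "0123456789"


def legacy_expires(field):
    """Mimic an older reader: take leading digits, stop at the first non-digit."""
    digits = ""
    for ch in field:
        if ch not in DIGITS:
            break
        digits += ch
    return int(digits) if digits else 0


def parse_expires_field(field):
    """Newer reader: return (expires, creation); creation is None if absent."""
    expires_text, sep, creation_text = field.partition("/")
    expires = legacy_expires(expires_text)
    creation = None
    # Only accept a creation time made up purely of digits.
    if sep and creation_text and all(ch in DIGITS for ch in creation_text):
        creation = int(creation_text)
    return expires, creation


print(legacy_expires("2345678/1234567"))
print(parse_expires_field("2345678/1234567"))
print(parse_expires_field("2345678"))
```

The older reader sees the same expire time whether or not the creation time is appended, which is the property the proposal depends on.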

Update: as my friend Micah Cowan explains: we can in fact use the order of the cookie file as “creation time” hint and use that as basis for sorting. Then we don’t need to modify the file format. We just need to make sure to store them in time-order… Internally we will need to keep a line number or something per cookie so that we can use that for sorting.

[*] – I believe it sorts on path length, domain length and time of creation, but as soon as the -06 draft goes online it will be easy to read the exact phrasing. The existing -05 draft exists at: http://tools.ietf.org/html/draft-ietf-httpstate-cookie-05
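The file-order idea from the update can be sketched roughly as follows. This is an illustration under stated assumptions: the Cookie tuple, number_cookies and sort_for_header are made-up names, and the sort puts longer paths first and earlier lines first, which is a guess at the draft’s intent rather than a quote from it. Domain length is left out because the exact rule is not known here.

```python
from collections import namedtuple

# line_number stands in for the creation time that the file does not store.
Cookie = namedtuple("Cookie", "name path line_number")


def number_cookies(entries):
    """Attach the position in the file to each (name, path) entry."""
    return [Cookie(name, path, n) for n, (name, path) in enumerate(entries)]


def sort_for_header(cookies):
    """Assumed order: longest path first, then oldest (lowest line) first."""
    return sorted(cookies, key=lambda c: (-len(c.path), c.line_number))


entries = [("a", "/"), ("b", "/foo/bar"), ("c", "/"), ("d", "/foo/bar")]
ordered = sort_for_header(number_cookies(entries))
print([c.name for c in ordered])
```

For this to survive between sessions, the cookies would also have to be written back to the file in creation order, as the update notes.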

Apple – only 391 days behind

In the curl project, we take security seriously. We work hard to make sure we don’t open up for security problems of any kind, and when we fail, we work hard at analyzing the problem and coming up with a proper fix as swiftly as possible to leave our “customers” as little exposed as possible.

Recently I’ve been surprised and slightly shocked by the fact that a lot of open source operating systems didn’t release any security upgrades for our most recent security flaw until well over a month after we first publicized it. I’m not sure why they all reacted so slowly. Possibly it is because vendor-sec isn’t quite working, as its members were informed prior to the public notification, and of course I don’t really expect many security guys to be subscribed to the curl mailing lists. Slow distros include Debian and Mandriva, while Red Hat did great.

Today however, I got a mail from Apple (and no, I don’t know why they send these mails to me but I guess they think I need them or something) with the subject “APPLE-SA-2010-03-29-1 Security Update 2010-002 / Mac OS X v10.6.3”. Aha! Did Apple now also finally update their curl version, you might think?

They did. But they did not fix this problem. They fixed two previous problems universally known as CVE-2009-0037 and CVE-2009-2417. Look at the date of that first one. March 3, 2009. Yes, a whopping 391 days after the problem was first made public, Apple sends out the security update. Cool. At least they eventually fixed the problem…

curl goes git

Just a few days ago the curl project turned twelve years old, and I decided that it was time for us to ditch our trusty old CVS setup and switch over to use git instead for source code control.

Why Switch at All

I’ve been very content with CVS over the years and in our small project we don’t really have any particularly weird or high demands on the version control software.

Lately (like in recent years) I’ve dipped my toes into various projects that have been using git, and more and more over time I’ve learned to appreciate the little goodies that git does that CVS simply cannot. I’m not even speaking about branches or merges etc., which git does a whole lot better and easier than CVS. I’m in fact even more in love with git’s way of easing the handling of diffs sent by email and its great way of keeping track of authors separately from the committer etc. git am and git commit --author are simply two very handy tools missing in CVS.

Why Git

So if we want to switch from CVS to another tool, what would we choose? That wasn’t really the question in my case so I didn’t answer it. In my case, it was rather that I’ve been using git in several projects and it is used in some of the biggest projects I work with, so it was some of git’s features I wanted. I didn’t consider any of the other distributed version tools as quite frankly: they wouldn’t be much better for me than what CVS already is. I want to reduce the number of different tools I need, and I’m quite sure anyway that git would be one of the top contenders even if I did an actual comparison.

So the choice to go git was quite selfish and done by me, but I felt that quite a few guys in the curl community supported this decision and very few actually believed remaining with CVS was a better idea.

The fact that git itself uses libcurl for its HTTP access of course also proves its good taste! 🙂

How did the conversion go

Very easily and swiftly. First, as I mentioned above, we never used branches much so we basically had a linear development with a set of tags. I did an rsync of the full repo to get a local copy to work with, then I ran ‘git cvsimport’ on that to create a new repo. I did run it a couple of times to make sure I had done a correct mapping of all CVS user names to their git equivalents. Converting >10 years of CVS commits took roughly 10 minutes on my desktop machine, so it wasn’t even that tedious.

Once I had a local repo created with all authors looking good, I simply followed the instructions on github.com on how to add a remote origin to a local branch and when I pushed to that, git sent off all commits ever made to curl to the remote repo now exposed to the world from github.com.

When that part was done, I did a quick read of the ‘git help daemon’ docs and 30 seconds later I had a local repo set up that is a mirror of the github one, so that users can still opt to get the code from haxx.se.

Unchanged work flow

Git allows different ways of working with the code, but I’ve decided that at least as a start we won’t change the way we work. I’ll offer all committers push rights to the master branch on the repository and we will simply all push to that, as our head development branch.

We will prefer patches made with git format-patch sent to the mailing list, but as before you can still produce patches by diffing source code using extracted tarballs or whatever approach you prefer.

All details on how to get the code for curl using git are available online.

140 foss hackers

At last, the first meeting of our recently started foss-sthlm effort took place. The amount of attention and attendance we achieved by far surpassed our wildest assumptions, and around 130-140 persons interested in open source showed up (we don’t know the exact number). The facilities in Kista where we held the event were graciously lent to us by Stockholm University (DSV) and they were very good. MSC and Nohup were our two sponsors who donated great sub sandwiches and drinks to all of us. Thank you!

I’m glad we managed to offer this event completely free of charge to the attendees, and hey, we had quality speakers on really interesting subjects that I think the audience appreciated: PostgreSQL, Upstart, Open Source Sweden, Rockbox and Debian packaging.

I did a 20 minute talk about curl – in Swedish. The slides are available (they are thus in Swedish too), see below, and hopefully there will soon be video available online with my presentation.

I also hope that we will gather all the slides at one single point to offer on the foss-sthlm web site, so check that out later on if you want the lot!

And here are Björn’s slides:

a big curl forward

We’re proudly presenting a major new release of curl and libcurl and we call it 7.20.0.

The primary reason we decided to bump the minor number this time was that we introduce a range of new protocols, but we also did some other rather big pieces of work. This is the biggest update to curl and libcurl that has been made in recent years. Let me mention some of the other noteworthy changes and bugfixes:

We fixed a potential security issue, that would occur if an application requested to download compressed HTTP content and told libcurl to automatically uncompress it (CURLOPT_ENCODING) as then libcurl could wrongly call the write callback (CURLOPT_WRITEFUNCTION) with a larger buffer than what is documented to be the maximum size.

TFTP was finally converted to a “proper” protocol internally. By that I mean that it can now be used with the multi interface in an asynchronous way and it has far less special treatments. It is now “just another protocol” basically and that is a good thing. Also, the BLKSIZE problem with TFTP that has haunted us for a while was fixed so I really think this is the best version ever for TFTP in libcurl.

In several different places in the code, older versions of libcurl didn’t properly call the progress callback while waiting for some special event to happen. This made the curl tool’s progress meter less responsive, but perhaps more importantly it prevented apps that use libcurl from aborting the transfer during those phases. The affected periods included the FTP connection phase (including the initial FTP commands and responses), waiting for the TCP connect to complete and resolving host names using c-ares.

The DNS cache was found to have at least two bugs that could make entries linger in the database eternally and, in another case, too long. For apps that use a lot of connections to a lot of hosts, these problems could result in some serious performance penalties as the DNS cache lookups got slower and slower over time.

Users of the funny FTP server drftpd will appreciate that (lib)curl now supports the PRET command, which is needed when getting data off such servers in passive mode. It’s a bit of a hack, but what can we do? We didn’t invent it nor can we help that it’s a popular thing to use! 😉
