Tag Archives: cURL and libcurl

daniel.haxx.se week #3

I won’t keep posting every video update here, but I mostly wanted to mention that I’ve kept posting a weekly video over at youtube basically explaining what’s going on right now within my dearest projects. Mostly curl and some Firefox stuff.

This week: libcurl server cert verification API got a bashing at SEC-T, is HTTP for UDP a good idea? How about adding HTTP cache support to libcurl? HTTP/2 is getting deployed as we speak. Interesting curl bug when used by XBMC. The patch series for Firefox bug 939318 is improving slowly – will it ever land?

Using APIs without reading docs

This morning, my debug session was interrupted for a brief moment when two friends independently of each other pinged me to inform me about a talk at the current SEC-T conference going on here in Stockholm right now. It was yet again time to bring up the good old fun called libcurl API bashing. Again from the angle that users who don’t read the API docs might end up using it wrong.

Updated: You can see Meredith Patterson’s talk here, and the libcurl parts start at 24:15.

The specific libcurl topic at hand once again mostly had the CURLOPT_VERIFYHOST option in focus, with basically is the same argument that was thrown at us two years ago when libcurl was said to be dangerous. It is not a boolean. It is an option that takes (or took) three different values, where 2 is the secure level and 0 is disabled.

SEC-T on curl API

(This picture is a screengrab from the live stream off youtube, I don’t have any link to a stored version of it yet. Click it for slightly higher resolution.)

Speaker Meredith L. Patterson actually spoke for quite a long time about curl and its options to verify server certificates. While I will agree that she has a few good points, it was still riddled with errors and I think she deliberately phrased things in a manner to make the talk good and snappy rather than to be factually correct and trying to understand why things are like they are.

The VERIFYHOST option apparently sounds as if it takes a boolean (accordingly), but it doesn’t. She says verifying a certificate has to be a Yes/No question so obviously it is a boolean. First, let’s be really technical: the libcurl options that take numerical values always accept a ‘long’ and all documentation specify which values you can pass in. None of them are boolean, not by actual type in the C language and not described like that in the man pages. There are however language bindings running on top of libcurl that may use booleans for the values that take 0 or 1, but there’s no guarantee we won’t add more values in a future to numerical options.

I wrote down a few quotes from her that I’d like to address.

“In order for it to do anything useful, the value actually has to be set to two”

I get it, she wants a fun presentation that makes the audience listen and grin cheerfully. But this is highly inaccurate. libcurl has it set to verify by default. An application doesn’t have to set it to anything. The only reason to set this value is if you’re not happy with checking the cert unconditionally, and then you’ve already wondered off the secure route.

“All it does when set to to two is to check that the common name in the cert matches the host name in the URL. That’s literally all it does.”

No, it’s not. It “only” verifies the host name curl connects to against the name hints in the server cert, yes, but that’s a lot more than just the common name field.

“there’s been 10 versions and they haven’t fixed this yet […] the docs still say they’re gonna fix this eventually […] I wanna know when eventually is”

Qualified BS and ignorance of details. Let’s see the actual code first: it ignores the 1 value and returns an error and thus leaves the internal default 2, Alas, code that sets 1 or 2 gets the same effect == verified certificate. Why is this a problem?

Then, she says she really wants to know when “eventually” is. (The docs say “Future versions will…”) So if she was so curious you’d think she would’ve tried to ask us? We’re an accessible bunch, on mailing lists, on IRC and on twitter. No she didn’t ask.

But perhaps most importantly: did she really consider why it returns an error for 1? Since libcurl silently accepted 1 as a value for something like 10 years, there are a lot of old installations “out there” in the wild, and by returning an error for 1 we try to make applications notice and adjust. By silently accepting 1 without errors, there would be no notice and people will keep using 1 in new applications as well and thus when running such an newly written application with an older libcurl – you’d be back to having the security problem again. So, we have the error there to improve the situation.

“a peer is someone like you […] a host is a server”

I’m a networking guy since 20+ years and I’m not used to people having a hard time to understand these terms. While perhaps there are rookies out in the world who don’t immediately understand some terms in the curl option names, should we really be criticized for that? I find that a hilarious critique. Also, these names were picked 13 years ago and we have them around for compatibility and API stability.

“why would you ever want to …”

Welcome to the real world. Why would an application author ever want to have these options to something else than just full check and no check? Because people and software development is a large world with many different desires and use case scenarios and curl is more widely used and abused than what many people consider. Lots of people have wanted something else than just a Yes/No to server cert verification. In fact, I’ve had many users ask for even more switches and fine-grained ways to fiddle with verification. Yes/No is a lay mans simplified view of certificate verification.

SEC-T curl slide

(This picture is the slide from the above picture, just zoomed and straightened out a bit.)

API age, stability and organic growth

We started working on libcurl in spring 1999, we added the CURLOPT_SSL_VERIFYPEER option in October 2000 and we added CURLOPT_SSL_VERIFYHOST in August 2001. All that quite a long time ago.

Then add thousands of hours, hundreds of hackers, thousands of applications, a user count that probably surpasses one billion users by now. Then also add the fact that option names are sticky in the way we write docs, examples pop up all over the internet and everyone who’s close to the project learns them by name and spirit and we quite simply grow attached to them and the way they work. Changing the name of an option is really painful and cause of a lot of confusion.

I’ve instead tried to more and more emphasize the functionality in the docs, to stress what the options do and how to do server cert verifications with curl the safe way.

I can’t force users to read docs. I can’t forbid users to blindly assume something and I’m not in control of, nor do I want to affect, the large population of third party bindings that exist for using on top of libcurl to cater for every imaginable programming language – and some of them may of course themselves have documentation problems and what not.

Would I change some of the APIs and names for options we have in libcurl if I would redo them today? Yes I would.

So what do we do about it?

I think this is the only really interesting question to take from all this. Everyone wants stable APIs. Everyone wants sensible and easy to understand APIs and as we can see they should also basically be possible to figure out without reading any documentation. And yet the API has to be powerful and flexible enough to be really useful for all those different applications.

At this point where we have these options that we do, when you’ve done your mud slinging and the finger of blame is firmly pointed at us. How exactly do you suggest we move forward to fix these claimed problems?

Taking it personally

Before anyone tells me to not take it personally: curl is my biggest hobby and a project I’ve spent many years and thousands of hours on. Of course I take it personally, otherwise I would’ve stopped working in the project a long time ago. This is personal to me. I give it my loving care and personal energy and then someone comes here and throw ill-founded and badly researched criticisms at me. I think criticizers of open source projects should learn to discuss the matters with the projects as their primary way instead of using it to make their conference presentations become more feisty.

Daladevelop hackathon

On Saturday the 13th of September, I took part in a hackathon in Falun Sweden organized by Daladevelop.

20-something hacker enthusiasts gathered in a rather large and comfortable room in this place, an almost three hour drive from my home. A number of talks and lectures were held through the day and the difficulty level ranged from newbie to more advanced. My own contribution was a talk about curl followed by one about HTTP/2. Blabbermouth as I am, I exhausted the friendly audience by talking a good total of almost 90 minutes straight. I got a whole range of clever and educated questions and I think and hope we all had a good time as a result.

The organizers ran a quiz for two-person teams. I teamed up with Andreas Olsson in team Emacs, and after having identified x86 assembly, written binary, spotted perl, named Ada Lovelace, used the term lightfoot and provided about 15 more answers we managed to get first prize and the honor of having beaten the others. Great fun!

Video perhaps?

I decided to try to do a short video about my current work right now and make it available for you all. I try to keep it short (5-7 minutes) and I’m certainly no pro at it, but I will try to make a weekly one for a while and see if it gets any fun. I’m going to read your comments and responses to this very eagerly and that will help me decide how I will proceed on this experiment.

Enjoy.

Credits in the curl project

Friends!

When we receive patches, improvements, suggestions, advice and whatever that lead to a change in curl or libcurl, I make an effort to log the contributor’s name in association with that change. Ideally, I add a line in the commit message. We use “Reported-by: <full name>” quite frequently but also other forms of “…-by: <full name>” too like when there was an original patch by someone or testing and similar. It shouldn’t matter what the nature of the contribution is, if it helped us it is a contribution and we say thanks!

curl-give-credits

I want all patch providers and all of us who have push rights to use this approach so that we give credit where credit is due. Giving credit is the only payment we can offer in this project and we should do it with generosity.

The green bars on the right show the results from the question how good we are at giving credit in the project from the 2014 curl survey, where 5 is really good and 1 is really bad. Not too shabby, but I’d say we can do even better! (59% checked the top score, 15% checked the 3′)

I have a script called contributors.sh that extracts all contributors since a tag (typically the previous release) and I use that to get a list of names to thank in the RELEASE-NOTES file for the pending curl release. Easy and convenient.

After every release (which means every 8th week) I then copy the list of names from RELEASE-NOTES into docs/THANKS. So all contributors get remembered and honored after having helped us in one way or another.

When there’s no name

When contributors don’t provide a real name but only a nick name like foobar123, user_5678 and so on I tend to consider that as request to not include the person’s name anywhere and hence I tend to not include it in the THANKS or RELEASE-NOTES. This also sometimes the result of me not always wanting to bother by asking people over and over again for their real name in case they want to be given proper and detailed credit for what they’ve provided to us.

Unfortunately, a notable share of all contributions we get to the project are provided by people “hiding” behind a made up handle. I’m fine with that as long as it truly is what the helpers’ actually want.

So please, if you help us out, we will happily credit you, but please tell us your name!

keep-calm-and-improve-curl

I’m eight months in on my Mozilla adventure

I started working for Mozilla in January 2014. Here’s some reflections from my first time as Mozilla employee.

Working from home

I’ve worked completely from home during some short periods before in my life so I had an idea what it would be like. So far, it has been even better than I had anticipated. It suits me so well it is almost scary! No commutes. No delays due to traffic. No problems ever with over-crowded trains or buses. No time wasted going to work and home again. And I’m around when my kids get home from school and it’s easy to receive deliveries all days. I don’t think I ever want to work elsewhere again… 🙂

Another effect of my work place is also that I probably have become somewhat more active on social networks and IRC. If I don’t use those means, I may spent whole days without talking to any humans.

Also, I’m the only Mozilla developer in Sweden – although we have a few more employees in Sweden. (Update: apparently this is wrong and there’s’ also a Mats here!)

Daniel's home office

The freedom

I have freedom at work. I control and decide a lot of what I do and I get to do a lot of what I want at work. I can work during the hours I want. As long as I deliver, my employer doesn’t mind. The freedom isn’t just about working hours but I also have a lot of control and saying about what I want to work on and what I think we as a team should work on going further.

The not counting hours

For the last 16 years I’ve been a consultant where my customers almost always have paid for my time. Paid by the hour I spent working for them. For the last 16 years I’ve counted every single hour I’ve worked and made sure to keep detailed logs and tracking of whatever I do so that I can present that to the customer and use that to send invoices. Counting hours has been tightly integrated in my work life for 16 years. No more. I don’t count my work time. I start work in the morning, I stop work in the evening. Unless I work longer, and sometimes I start later. And sometimes I work on the weekend or late at night. And I do meetings after regular “office hours” many times. But I don’t keep track – because I don’t have to and it would serve no purpose!

The big code base

I work with Firefox, in the networking team. Firefox has about 10 million lines C and C++ code alone. Add to that everything else that is other languages, glue logic, build files, tests and lots and lots of JavaScript.

It takes time to get acquainted with such a large and old code base, and lots of the architecture or traces of the original architecture are also designed almost 20 years ago in ways that not many people would still call good or preferable.

Mozilla is using Mercurial as the primary revision control tool, and I started out convinced I should too and really get to learn it. But darn it, it is really too similar to git and yet lots of words are intermixed and used as command but they don’t do the same as for git so it turns out really confusing and yeah, I felt I got handicapped a little bit too often. I’ve switched over to use the git mirror and I’m now a much happier person. A couple of months in, I’ve not once been forced to switch away from using git. Mostly thanks to fancy scripts and helpers from fellow colleagues who did this jump before me and already paved the road.

C++ and code standards

I’m a C guy (note the absence of “++”). I’ve primarily developed in C for the whole of my professional developer life – which is approaching 25 years. Firefox is a C++ fortress. I know my way around most C++ stuff but I’m not “at home” with C++ in any way just yet (I never was) so sometimes it takes me a little time and reading up to get all the C++-ishness correct. Templates, casting, different code styles, subtleties that isn’t in C and more. I’m slowly adapting but some things and habits are hard to “unlearn”…

The publicness and Bugzilla

I love working full time for an open source project. Everything I do during my work days are public knowledge. We work a lot with Bugzilla where all (well except the security sensitive ones) bugs are open and public. My comments, my reviews, my flaws and my patches can all be reviewed, ridiculed or improved by anyone out there who feels like doing it.

Development speed

There are several hundred developers involved in basically the same project and products. The commit frequency and speed in which changes are being crammed into the source repository is mind boggling. Several hundred commits daily. Many hundred and sometimes up to a thousand new bug reports are filed – daily.

yet slowness of moving some bugs forward

Moving a particular bug forward into actually getting it land and included in pending releases can be a lot of work and it can be tedious. It is a large project with lots of legacy, traditions and people with opinions on how things should be done. Getting something to change from an old behavior can take a whole lot of time and massaging and discussions until they can get through. Don’t get me wrong, it is a good thing, it just stands in direct conflict to my previous paragraph about the development speed.

In the public eye

I knew about Mozilla before I started here. I knew Firefox. Just about every person I’ve ever mentioned those two brands to have known about at least Firefox. This is different to what I’m used to. Of course hardly anyone still fully grasp what I’m actually doing on a day to day basis but I’ve long given up on even trying to explain that to family and friends. Unless they really insist.

Vitriol and expectations of high standards

I must say that being in the Mozilla camp when changes are made or announced has given me a less favorable view on the human race. Almost anything or any chance is received by a certain amount of users that are very aggressively against the change. All changes really. “If you’ll do that I’ll be forced to switch to Chrome” is a very common “threat” – as if that would A) work B) be a browser that would care more about such “conservative loonies” (you should consider that my personal term for such people)). I can only assume that the Chrome team also gets a fair share of that sort of threats in the other direction…

Still, it seems a lot of people out there and perhaps especially in the Free Software world seem to hold Mozilla to very high standards. This is both good and bad. This expectation of being very good also comes from people who aren’t even Firefox users – we must remain the bright light in a world that goes darker. In my (biased) view that tends to lead to unfair criticisms. The other browsers can do some of those changes without anyone raising an eyebrow but when Mozilla does similar for Firefox, a shitstorm breaks out. Lots of those people criticizing us for doing change NN already use browser Y that has been doing NN for a good while already…

Or maybe I’m just not seeing these things with clear enough eyes.

How does Mozilla make money?

Yeps. This is by far the most common question I’ve gotten from friends when I mention who I work for. In fact, that’s just about the only question I get from a lot of people… (possibly because after that we get into complicated questions such as what exactly do I do there?)

curl and IETF

I’m grateful that Mozilla allows me to spend part of my work time working on curl.

I’m also happy to now work for a company that allows me to attend to IETF/httpbis and related activities much better than ever I’ve had the opportunity to in the past. Previously I’ve pretty much had to spend spare time and my own money, which has limited my participation a great deal. The support from Mozilla has allowed me to attend to two meetings so far during the year, in London and in NYC and I suspect there will be more chances in the future.

Future

I only just started. I hope to grab on to more and bigger challenges and tasks as I get warmer and more into everything. I want to make a difference. See you in bugzilla.

libressl vs boringssl for curl

I tried to use two OpenSSL forks yesterday with curl. I built both from source first (of course, as I wanted the latest and greatest) an interesting thing in itself since both projects have modified the original build system so they’re now three different ways.

libressl 2.0.0 installed and built flawlessly with curl and I’ve pushed a change that shows LibreSSL instead of OpenSSL when doing curl -V etc.

boringssl didn’t compile from git until I had manually fixed a minor nit, and then it has no “make install” target at all so I had manually copy the libs and header files to a place suitable for curl’s configure to detect. Then the curl build failed because boringssl isn’t API compatiable with some of the really old DES stuff – code we use for NTLM. I asked Adam Langley about it and he told me that calling code using DES “needs a tweak” – but I haven’t yet walked down that road so I don’t know how much of a nuisance that actually is or isn’t.

Summary: as an openssl replacement, libressl wins this round over boringssl with 3 – 0.

improving the curl docs, step 1

As I mentioned before, the curl documentation needs improvement. As a first step I converted the man page for curl_easy_setopt into no less than 210(!) individual man pages. One new for each option the function supports.

The man page was originally (a few days ago) almost 3000 lines, and now with them all split up we end up with a lot more text. This because the new format encourages more text per option and each page now has to detail itself more. This should also make each option much easier to google/search and to link to when we help users understand the options.

I’ve made some server-side scripts to generate html versions of them all, I generate a list of all options we have and the examples we host on the web site now have all mentions of the options linked directly to these new pages.

The curl_easy_setopt man page will then get most explanations cut out and mainly be used as an index with the options grouped into logical sections to help users find the options they want to use. I could cut out almost 2500 lines.

The new man pages add about 7500 lines of documentation (excluding the headers in each file)…

curl the next few years

Roadmap of things Daniel Stenberg and Steve Holme want to work on next. It is intended to serve as a guideline for others for information, feedback and possible participation.

If you agree, disagree or would like to add stuff you want to work on, please join us on the curl-library list! This “roadmap” is likely to change over time. We’ll keep the updated ROADMAP in git.

New stuff – libcurl

  1. http2 test suite

  2. http2 multiplexing/pipelining

  3. SPDY

  4. SRV records

  5. HTTPS to proxy

  6. make sure there’s an easy handle passed in to curl_formadd(), curl_formget() and curl_formfree() by adding replacement functions and deprecating the old ones to allow custom mallocs and more

  7. HTTP Digest authentication via Windows SSPI

  8. GSSAPI authentication in the email protocols

  9. add support for third-party SASL libraries such as Cyrus SASL – may need to move existing native and SSPI based authentication into vsasl folder after reworking HTTP and SASL code

  10. SASL authentication in LDAP

  11. Simplify the SMTP email interface so that programmers don’t have to construct the body of an email that contains all the headers, alternative content, images and attachments – maintain raw interface so that programmers that want to do this can

  12. Allow the email protocols to return the capabilities before authenticating. This will allow an application to decide on the best authentication mechanism

  13. Allow Windows threading model to be replaced by Win32 pthreads port

  14. Implement a dynamic buffer size to allow SFTP to use much larger buffers and possibly allow the size to be customizable by applications. Use less memory when handles are not in use?

New stuff – curl

  1. Embed a language interpreter (lua?). For that middle ground where curl isn’t enough and a libcurl binding feels “too much”. Build-time conditional of course.

  2. Simplify the SMTP command line so that the headers and multi-part content don’t have to be constructed before calling curl

Improve

  1. build for windows (considered hard by many users)

  2. curl -h output (considered overwhelming to users)

  3. we have  > 160 command line options, is there a way to redo things to simplify or improve the situation as we are likely to keep adding features/options in the future too

  4. docs (considered “bad” by users but how do we make it better?)

  5. authentication framework (consider merging HTTP and SASL authentication to give one API for protocols to call)

  6. Perform some of the clean up from the TODO document, removing old definitions and such like that are currently earmarked to be removed years ago

Remove

  1. cmake support (nobody maintains it)

  2. makefile.vc files as there is no point in maintaining two sets of Windows makefiles. Note: These are currently being used by the Windows autobuilds