Category Archives: Technology

Really everything related to technology

curl and speced cookie order

I just posted this on the curl-library list, but I feel it deserves to be mentioned here separately as well.

As I’ve mentioned before, I’m involved in the IETF http-state working group which is working to document how cookies are used in the wild today. The idea is to create a spec that new implementations can follow and that existing implementations can use to become more interoperable.

(If you’re interested in these matters, I can only urge you to join the http-state mailing list and participate in the discussions.)

The question of how to order cookies in outgoing HTTP Cookie: headers has been much debated over recent months and I’ve also blogged about it. Now the issue has been closed and the decision went quite contrary to my standpoint: the spec will say that while servers SHOULD NOT rely on the order (yeah right, some obviously already do, and with it specified like this even more will soon do the same), it will recommend that clients sort the cookies in a given way that is close to how current Firefox does it[*].

This has the unfortunate side-effect that to become fully compatible with how the browsers do cookies, we will need to sort our cookies on more keys than the ones we just recently introduced. That in itself really isn’t very hard, since once we introduced qsort() it is easy to sort on more or other keys.

The biggest problem we get with this is that the sorting uses the creation time of the cookies. libcurl and curl and others mostly use the Netscape cookie file format to store cookies and keep state between invocations, and that file format doesn’t include creation time info! It is a simple text-based file format with TAB-separated columns, and the last (7th) column holds the cookie’s content.
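For reference, a line in that file format looks roughly like this (an illustrative example, not from a real cookie jar, and shown with spaces where the real file uses TABs). The columns are domain, a flag for whether subdomains match, path, a secure flag, expiry time, cookie name and cookie content:

.example.com   TRUE   /   FALSE   2345678   sessionid   abc123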

In order to support the correct sorting between sessions, we need to invent a way to pass on the creation time. My thinking is that we do this in a way that allows older libcurls to still understand the file, just without seeing the creation time, while newer versions will be able to get it. This would be possible by extending the expires field (the 6th), as it is a numerical value that the existing code parses as a number, stopping at the first non-digit character. We could easily add a separator character and store the creation time after it. Like:

Old expire time:

2345678

New expire+creation time:

2345678/1234567
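As a rough sketch of why this would remain backwards compatible (purely illustrative, not the actual libcurl parser): code that reads the expires column as a number stops at the first non-digit, so it simply never sees the appended creation time, while newer code can look for the separator and read both values.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  const char *field = "2345678/1234567"; /* the extended 6th column */
  char *end;
  long expires = strtol(field, &end, 10); /* old readers get 2345678 and stop */
  long created = 0;
  if(*end == '/')
    created = strtol(end + 1, NULL, 10);  /* new readers also get 1234567 */
  printf("expires=%ld created=%ld\n", expires, created);
  return 0;
}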

This format might even work with other readers of this file format if they make similar assumptions about the data, but the truth is that while we picked the format in the first place to be able to exchange cookies with a well known and well used browser, no current browser uses that format anymore. I assume there are still a bunch of other tools that do, like wget and friends.

Update: as my friend Micah Cowan explains: we can in fact use the order of the cookie file as a “creation time” hint and use that as the basis for sorting. Then we don’t need to modify the file format. We just need to make sure to store the cookies in time-order… Internally we will need to keep a line number or something per cookie so that we can use that for sorting.
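A minimal sketch of what such a sort could look like (illustrative only, not the actual libcurl code, and assuming the keys mentioned in the footnote below: longest path first, then longest domain, then oldest creation order, where the creation order can simply be the cookie's line number in the file):

#include <stdlib.h>
#include <string.h>

struct cookie {
  char *path;
  char *domain;
  long creation;  /* e.g. the line number in the cookie file */
};

static int cookie_cmp(const void *a, const void *b)
{
  const struct cookie *ca = a, *cb = b;
  size_t pa = strlen(ca->path), pb = strlen(cb->path);
  size_t da, db;
  if(pa != pb)
    return (pa < pb) ? 1 : -1;  /* longer path first */
  da = strlen(ca->domain);
  db = strlen(cb->domain);
  if(da != db)
    return (da < db) ? 1 : -1;  /* longer domain first */
  if(ca->creation != cb->creation)
    return (ca->creation < cb->creation) ? -1 : 1;  /* oldest first */
  return 0;
}

/* used as: qsort(cookies, ncookies, sizeof(struct cookie), cookie_cmp); */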

[*] – I believe it sorts on path length, domain length and time of creation, but as soon as the -06 draft goes online it will be easy to read the exact phrasing. The current -05 draft is at: http://tools.ietf.org/html/draft-ietf-httpstate-cookie-05

Apple – only 391 days behind

In the curl project, we take security seriously. We work hard to make sure we don’t open up for security problems of any kind, and once we fail, we work hard at analyzing the problem and coming up with a proper fix as swiftly as possible, to leave our “customers” as little exposed as possible.

Recently I’ve been surprised and slightly shocked by the fact that a lot of open source operating systems didn’t release any security upgrades for our most recent security flaw until well over a month after we first publicized it. I’m not sure why they all reacted so slowly. Possibly it is because vendor-sec isn’t quite working as it should, since they were informed prior to the public notification, and of course I don’t really expect many security people to be subscribed to the curl mailing lists. Slow distros include Debian and Mandriva, while Redhat did great.

Today however, I got a mail from Apple (and no, I don’t know why they send these mails to me but I guess they think I need them or something) with the subject “APPLE-SA-2010-03-29-1 Security Update 2010-002 / Mac OS X v10.6.3”. Aha! Did Apple now also finally update their curl version, you might think?

They did. But they did not fix this problem. They fixed two previous problems universally known as CVE-2009-0037 and CVE-2009-2417. Look at the date of that first one. March 3, 2009. Yes, a whopping 391 days after the problem was first made public, Apple sends out the security update. Cool. At least they eventually fixed the problem…

An FTP hash command

Anthony Bryan strikes again. This time his name is attached to a new standards draft for how to get a hash checksum of a given file when using the FTP protocol. draft-bryan-ftp-hash-00 was published just a few days ago.

The idea is basically to introduce a spec for a new command named ‘HASH’ that a client can issue to a server to get a hash checksum for a given file, in order to know that the file has the exact same contents you want before you even start downloading it or otherwise consider acting on it.

The spec details how you can ask for different hash algorithms, how the server announces its support for this in its FEAT response etc.
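Very roughly, the idea looks something like this on the wire (an illustrative mock-up only; the exact command syntax, FEAT output and reply codes are what the draft actually specifies and may well differ):

C> FEAT
S> 211-Extensions supported:
S>  HASH SHA-256;SHA-1;MD5
S> 211 End
C> HASH file.ext
S> 213 SHA-256 <hex digest> file.ext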

I’ve already provided some initial feedback on this draft, and I’ll try to assist Anthony a bit more to get this draft pushed onwards.

a big curl forward

We’re proudly presenting a major new release of curl and libcurl and we call it 7.20.0.

The primary reason we decided to bump the minor number this time was that we introduce a range of new protocols, but we also did some other rather big work. This is the biggest update to curl and libcurl made in recent years. Let me mention some of the other noteworthy changes and bugfixes:

We fixed a potential security issue that would occur if an application requested to download compressed HTTP content and told libcurl to automatically uncompress it (CURLOPT_ENCODING): libcurl could then wrongly call the write callback (CURLOPT_WRITEFUNCTION) with a larger buffer than the documented maximum size.
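For context, this is roughly the kind of setup that was affected (a minimal sketch of an application using the options mentioned above; the URL is just a placeholder):

#include <curl/curl.h>

/* write callback; libcurl documents that a single invocation never gets
   more than CURL_MAX_WRITE_SIZE bytes */
static size_t write_cb(char *ptr, size_t size, size_t nmemb, void *userdata)
{
  (void)ptr;
  (void)userdata;
  return size * nmemb; /* pretend we consumed it all */
}

int main(void)
{
  CURL *curl = curl_easy_init();
  if(curl) {
    curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/");
    curl_easy_setopt(curl, CURLOPT_ENCODING, "gzip"); /* auto-decompress */
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);
  }
  return 0;
}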

TFTP was finally converted to a “proper” protocol internally. By that I mean that it can now be used with the multi interface in an asynchronous way and it needs far less special treatment. It is now “just another protocol” basically, and that is a good thing. Also, the BLKSIZE problem with TFTP that has haunted us for a while was fixed, so I really think this is the best version ever for TFTP in libcurl.

In several different places in the code, older versions of libcurl didn’t properly call the progress callback while waiting for some special event to happen. This made the curl tool’s progress meter less responsive, but perhaps more importantly it prevented apps that use libcurl from aborting the transfer during those phases. The affected periods included the FTP connection phase (including the initial FTP commands and responses), waiting for the TCP connect to complete, and resolving host names using c-ares.
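To illustrate why this matters, here is a minimal sketch of how an application can abort a transfer from the progress callback (illustrative application code, not libcurl internals); if libcurl never calls the callback during a phase, the app has no way to abort during that phase:

#include <curl/curl.h>

/* returning non-zero from the progress callback tells libcurl to abort */
static int progress_cb(void *clientp, double dltotal, double dlnow,
                       double ultotal, double ulnow)
{
  int *abort_requested = clientp;
  (void)dltotal; (void)dlnow; (void)ultotal; (void)ulnow;
  return *abort_requested;
}

int main(void)
{
  int abort_requested = 0;
  CURL *curl = curl_easy_init();
  if(curl) {
    curl_easy_setopt(curl, CURLOPT_URL, "ftp://example.com/file");
    curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L);
    curl_easy_setopt(curl, CURLOPT_PROGRESSFUNCTION, progress_cb);
    curl_easy_setopt(curl, CURLOPT_PROGRESSDATA, &abort_requested);
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);
  }
  return 0;
}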

The DNS cache was found to have at least two bugs that could make entries linger in the cache forever, or in another case for too long. For apps that use a lot of connections to a lot of hosts, these problems could result in some serious performance punishment as the DNS cache lookups got slower and slower over time.

Users of the funny ftp server drftpd will appreciate that (lib)curl now supports the PRET command, which is needed when getting data off such servers in passive mode. It’s a bit of a hack, but what can we do? We didn’t invent it, nor can we help that it’s a popular thing to use! 😉
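For the curious, the PRET dance looks roughly like this (an illustrative mock-up; the exact reply texts vary by server): the client tells the server which transfer is coming before it asks for the passive data connection.

C> PRET RETR path/to/file
S> 200 OK
C> PASV
S> 227 Entering Passive Mode (192,0,2,1,20,21)
C> RETR path/to/file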


The big protocols

OWASP Sweden once again arranged an interesting meeting, this time with three talks.

The title of the meeting, held on January 21st here in Stockholm, called the protocols “the big ones” (but in Swedish), but I have no idea what kind of measurement they used, what the small ones are, or what other “big protocols” there might be! 😉

First we got to hear Håvard Eidnes tell us about BGP. That protocol seems to suffer from its share of security problems, both in the protocol itself and perhaps even more in the actual implementations. One of the bigger recent BGP-related incidents that was brought up was how internal routes were leaked to the outside from Pakistan in February 2008, which ended up blocking the entire world’s access to Youtube. This talk also gave us some insight into the “wild west” of international routing and the lack of control and proper knowledge of who’s allowed to route what to where.

Then there was a session by Rickard Bellgrim about DNSSEC, and even though I’ve heard talks about this protocol in the past, I couldn’t help but feel again that man, they have a lot of terminology in that world, which makes even a basic description fairly hard to keep up with in parts. And man, do they have a lot of signing and keys and fingerprints and trust going on… Of course DNSSEC is the answer to lots of existing problems with DNS, and DNSSEC certainly opens up a range of new fun. The idea of somehow replacing the need for CA certs by storing keys in DNS is interesting, but even though it is technically working and sound, I fear the browser vendors and the CAs of the SSL world won’t be very fast to turn the wheels and roll in that direction. DNSSEC certainly makes name resolving a lot more complicated, and I wonder if c-ares should ever get into that game… And BTW, DNSSEC of course doesn’t take away the fact that specific implementations may still be vulnerable to security flaws.

The last talk of the evening, held by Fredrik Hesse, was about SSL, or rather TLS. He gave us a pretty detailed insight into how the protocol works, and then a fairly detailed overview of the flaws discovered during the last year or so: primarily MD5 and rogue CA certs, the null-prefix cert names and the TLS renegotiation bug. I felt good about already knowing just about everything he told us. I can also boast about having corrected the speaker afterward, at the pub where we had our post-talk beers, as he was evidently very OpenSSL-focused when he spoke about what SSL libraries can and cannot do.

A great evening. And with good beers too. Thanks to the organizers!

httpstate cookie domain pains

Back in 2008, I wrote about when Mozilla started to publish their “public suffixes list” effort, and I was a bit skeptical.

Well, the problem with domains in cookies of course didn’t suddenly go away, and today, when we in the IETF httpstate working group are documenting how cookies work, we need to somehow describe how user agents tend to handle these domains and how they should handle them. It’s really painful.

The problem, in short, is that a site called ‘www.example.com’ is allowed to set a cookie for ‘example.com’ but not for ‘com’ alone. That’s somewhat easy to understand. It gets more complicated at once when we consider the UK, where ‘www.example.co.uk’ is fine but must not be allowed to set a cookie for ‘co.uk’.
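A naive tail-matching check illustrates the problem (a sketch only, not curl's actual cookie code): it accepts 'example.com' and rejects 'com', but it also happily accepts 'co.uk', because nothing in the algorithm knows that 'co.uk' is special.

#include <stdbool.h>
#include <string.h>
#include <strings.h>  /* strcasecmp (POSIX) */

static bool naive_domain_ok(const char *host, const char *cookie_domain)
{
  size_t hl = strlen(host), dl = strlen(cookie_domain);
  if(dl > hl)
    return false;
  if(strcasecmp(host + (hl - dl), cookie_domain))
    return false;
  /* require a label boundary and at least one dot in the cookie domain */
  return (hl == dl || host[hl - dl - 1] == '.') &&
         (strchr(cookie_domain, '.') != NULL);
}

/* naive_domain_ok("www.example.com", "example.com")  -> true  (fine)
   naive_domain_ok("www.example.com", "com")          -> false (fine)
   naive_domain_ok("www.example.co.uk", "co.uk")      -> true  (the problem!) */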

So how does a user agent know that co.uk is magic?

Firefox (and Chrome, I believe) uses the suffix list mentioned above, IE is claimed to have its own (unpublished) version, and Opera uses a trick where it tries to check whether the domain name resolves to an IP address or not. (Although Yngve says Opera will soon do online lookups against the suffix list as well.) But if you want to avoid those tricks, Adam Barth’s description on the http-state mailing list really is creepy.

For me, being a libcurl hacker, I of course can’t help but think about Adam’s words in his last sentence: “If a user agent doesn’t care about security, then it can skip the public suffix check”.

Well. Ehm. We do care about security in the curl project, but we still (currently) skip the public suffix check. I know, it is a bit of a contradiction, but I guess I’m just too stuck in my opinion that the public suffixes list is a terrible solution – and yet I can’t figure out anything that would work better and offer the same level of “safety”.

I’m thinking perhaps we should give in. Or what should we do? And how should we in the httpstate group document how existing and new user agents should behave to be optimally compliant and secure?

(in case you do take notice of details: yes the mailing list is named http-state while the working group is called httpstate without a dash!)

My Debian Black-out – the price of bleeding edge

Ok, I admit it. I run Debian Unstable so I know I deserve to get hit really bad at times when things turn really ugly. It is called unstable for a reason.

The other day I decided it was about time I did a dist-upgrade. When I did that, I got a remark that I’d better restart my gnome session as otherwise apps would crash. So I logged out and… I couldn’t log in again. In fact, neither my keyboard nor my mouse (both on USB) worked anymore! I sighed and rebooted (for the first time in many months), only to find out that 1) it didn’t fix the problem, both input devices were still non-functional, and perhaps even more importantly 2) the wifi network didn’t work either, so I couldn’t log in to the machine from one of my other computers either!

Related to this story is the fact that I’ve been running an older kernel, 2.6.26, since that was the last version that built my madwifi drivers correctly. With later kernels I was supposed to use ath5k for my Atheros card, but I’ve not been very successful with ath5k and thus remained on the latest kernel I had a working madwifi for.

I rebooted again and tried a more recent kernel (2.6.30). Yeah, then the keyboard and mouse worked again, but ath5k didn’t get the wifi up properly. I think I was basically just lacking the proper tools to check the wifi network and set the desired ssid etc, but without network that’s a bit of a pain. Also, when I logged in on my normal gnome setup, it mentioned something about a broken panel and logged me out again! 🙁

Grrr. Of course I could switch to my backup – my laptop – but it was still highly annoying to end up locked out of my own computer.

Today I bought myself a 20 meter cat5e cable and wired my desktop so I can reach the network with the existing setup. I dist-upgraded again (now at kernel 2.6.31) and when I tried to log in it just worked. Phew. Back in business. I think I’ll leave the cable connected now that I’ve already done the work for it.

The lesson? Eeeh… when things break, fix them!

Spammers now subscribe

For several years I’ve been setting the mailing lists I admin to only accept posts from subscribers, in order to avoid having to deal with very large amounts of spam posts.

While that is slightly awkward to users of the list, the huge benefit for me as admin has been the deciding factor.

Recently, however, I’ve noticed that this way of preventing spam on the mailing lists has started to fail more and more frequently.

Now I see a rapid growth in spam from users who actually subscribe first and then post their spam to the list. Of course, sometimes spammers just happen to fake the From address of a list member – like when a spammer fakes my address and sends spam to a list I am subscribed to – but it’s quite obvious that we also see the actual original spammers join lists and send spam as well.

It makes me sad, since I figure the next step I need to take on the mailing lists I admin is to either spam-check the incoming mails with a tool like spamassassin (and risk false positives or not trapping all spam), and/or start setting new members as moderated so that I have to acknowledge their first post to the list in order to make sure they’re not spammers.

Or is there some other good option that I haven’t thought of?

null-prefix domino


At the end of July 2009, Scott Cantor contacted us in the curl project and pointed out a security flaw in libcurl (in code that was using OpenSSL to verify server certificates). Having read his explanation, I recalled that I had witnessed the discussion on the NSS list about this problem just a few days earlier (which resulted in their August 1st security advisory). The problem is basically that the cert can at times contain a name with an embedded zero in the middle, while most source code assumes plain C-style strings that end with a zero. This turns out to be exploitable, and is explained in great detail in this document (PDF).
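A sketch of the class of bug (illustrative only, with made-up names; not taken from any of the affected projects): the certificate carries the name as an explicit length plus bytes, but code that treats it as a C string stops at the embedded zero and matches the wrong name.

#include <stdio.h>
#include <string.h>

int main(void)
{
  /* the name as stored in the cert: explicit length, embedded zero */
  const char cert_name[] = "www.bank.com\0.evil.com";
  size_t cert_name_len = sizeof(cert_name) - 1; /* 22 bytes */
  const char *wanted = "www.bank.com";

  /* the sloppy check: strcmp() stops at the embedded zero and matches */
  if(strcmp(cert_name, wanted) == 0)
    printf("C-string check: match (vulnerable!)\n");

  /* a length-aware check using the size from the certificate does not */
  if(cert_name_len != strlen(wanted) ||
     memcmp(cert_name, wanted, cert_name_len) != 0)
    printf("length-aware check: no match (correct)\n");

  return 0;
}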

I started to work on a patch, and in the meantime I talked to Simon Josefsson of the GnuTLS team to see if GnuTLS was fine or not, only to have him confirm that GnuTLS did indeed have the same problem.

So I contacted vendor-sec, and then on the morning of August 5 I thought I’d just make a quick check of how the other HTTPS client implementations do their cert checks.

Wget: vulnerable

neon: vulnerable

serf: vulnerable

So, Internet Explorer and Firefox were vulnerable. NSS and GnuTLS were. (OpenSSL wasn’t, but then it doesn’t provide this verification feature by itself.) (lib)curl, wget, neon and serf were all vulnerable. If that isn’t a large share of the existing HTTPS clients, then what is? I also think this shows that it would be good for all of us if OpenSSL had this functionality, as even if it had been vulnerable we could have fixed a busload of different applications by repairing a single library. Now we instead need to hunt down all apps that use OpenSSL and that verify certificate names.

Quite clearly we (as implementers) have all made the same silly assumptions, and quite likely we’ve influenced each other into writing this sloppy code. SSL and certificates get hit over and over again by these kinds of painful flaws and setbacks. Darn, getting things right really is very, very hard…

(Disclaimer: I immediately notified the neon and serf projects but to my knowledge they have not yet released any fixed versions.)

Mini 2440 Lyre

On ebay there’s a fancy S3C2440-based board named Mini 2440, with a 3.5″ touch LCD attached, on sale for 85 USD. 64MB RAM, a 400MHz CPU, NAND flash and more. Lots of stuff for the money.


The guys in the Lyre project seem to have adopted this as yet another hardware platform to attempt to run Rockbox on. After their Atmel AT91SAM target was ditched, they went the ARMopendous route, and now this board seems to have entered the picture. This third hardware platform is called the Lyre prototype 2.

You should note that this Mini 2440 board has no battery or anything like that, and thus is not really meant to be a portable device in its current shape.

“Bob” seems to have initial Rockbox code running on this device, and well-established Rockbox hackers JdGordon and domonoky have both ordered their own kits so the future looks bright.