Pointless respecifying FTP URI

Tuesday, May 24th, 2011

There’s this person wiIETFthin IETF who seems to possess endless energy and a never-ending wish to clean up tiny details within the IETF processes. He continuously digs up specifications that need to be registered or submitted again somewhere due to some process. Often under loud protests from fellow IETFers since it steals time and energy from people on the lists for discussions and reviews – only to satisfy some administrative detail. Time and energy perhaps better spent on things like new protocols and interesting new technologies.

This time, he has deemed that the FTP a FILE URI specs need to be registered properly, and alas he has submitted his first suggested update of the FTP URI.

From my work with curl I have of course figured out a few problems with RFC1738 that I don’t think we should just repeat in a new version of the spec. It turns out I’m not alone in thinking this work isn’t really good like this, and I posted a second mail to clarify my points.

We’re not working on fixing the problems with FTP URIs that are present in RFC1738 so just rephrasing those into a new spec is a bad idea.

We could possibly start the work on fixing the problems, but so far I’ve seen no such will or efforts and I don’t plan on pushing for that myself either.

Please tell me or the ftpext2 group where I or the others are wrong or right or whatever!

US patent 6,098,180

Wednesday, May 11th, 2011

(I am not a lawyer, this is not legal advice and these are not legal analyses, just my personal observations and ramblings. Please correct me where I’m wrong or add info if you have any!)

At 3:45 pm on March  18th 2011, the company Content Delivery Solutions LLC filed a complaint in a court in Texas, USA. The defendants are several bigwigs and the list includes several big and known names of the Internet:

  • Akamai
  • AOL
  • AT&T
  • CD Networks
  • Globalscape
  • Google
  • Limelight Networks
  • Peer 1 Network
  • Research In Motion
  • Savvis
  • Verizon
  • Yahoo!

The complaint was later amended with an additional patent (filed on April 18th), making it list three patents that these companies are claimed to violate (I can’t find the amended version online though). Two of the patents ( 6,393,471 and 6,058,418) are for marketing data and how to use client info to present ads basically. The third is about file transfer resumes.

I was contacted by a person involved in the case at one of the defendants’. This unspecified company makes one or more products that use “curl“. I don’t actually know if they use the command line tool or the library – but I figure that’s not too important here. curl gets all its superpowers from libcurl anyway.

This Patent Troll thus basically claims that curl violates a patent on resumed file transfers!

The patent in question that would be one that curl would violate is the US patent 6,098,180 which basically claims to protect this idea:

A system is provided for the safe transfer of large data files over an unreliable network link in which the connection can be interrupted for a long period of time.

The patent describes several ways in how it may detect how it should continue the transfer from such a break. As curl only does transfer resumes based on file name and an offset, as told by the user/application, that could be the only method that they can say curl would violate of their patent.

The patent goes into detail in how a client first sends a “signature” and after an interruption when the file transfer is about to continue, the client would ask the server about details of what to send in the continuation. With a very vivid imagination, that could possibly equal the response to a FTP SIZE command or the Content-Length: response in a HTTP GET or HEAD request.

A more normal reader would rather say that no modern file transfer protocol works as described in that patent and we should go with “defendant is not infringing, move on nothing to see here”.

But for the sake of the argument, let’s pretend that the patent actually describes a method of file transfer resuming that curl uses.

The ‘180 (it is referred to with that name within the court documents) patent was filed at February 18th 1997 (and issued on August 1, 2000). Apparently we need to find prior art that was around no later than February 17th 1996, that is to say one year before the filing of the stupid thing. (This I’ve been told, I had no idea it could work like this and it seems shockingly weird to me.)

What existing tools and protocols did resumed transfers in February 1996 based on a file name and a file offset?

Lots!

Thank you all friends for the pointers and references you’ve brought to me.

  • The FTP spec RFC 959 was published in October 1985. FTP has a REST command that tells at what offset to “restart” the transfer at. This was being used by FTP clients long before 1996, and an example is the known Kermit FTP client that did offset-based file resumed transfer in 1995.
  • The HTTP header Range: introduces this kind of offset-based resumed transfer, although with a slightly fancier twist. The Range: header was discussed before the magic date, as also can be seen on the internet already in this old mailing list post from December 1995.
  • One of the protocols from the old days that those of us who used modems and BBSes in the old days remember is zmodem. Zmodem was developed in 1986 and there’s this zmodem spec from 1988 describing how to do file transfer resumes.
  • A slightly more modern protocol that I’ve unfortunately found no history for before our cut-off date is rsync, as I could only find the release mail for rsync 1.0 from June 1996. Still long before the patent was filed obviously, and also clearly showing that the one year margin is silly as for all we know they could’ve come up with the patent idea after reading the rsync releases notes and still rsync can’t be counted as prior art.
  • Someone suggested GetRight as a client doing this, but GetRight wasn’t released in 1.0 until Febrary 1997 so unfortunately that didn’t help our case even if it seems to have done it at the time.
  • curl itself does not pre-date the patent filing. curl was first released in March 1998, and the predecessor was started around summer-time 1997. I don’t have any remaining proofs of that, and it still wasn’t before “the date” so I don’t think it matters much now.

At the time of this writing I don’t know where this will end up or what’s going to happen. Time will tell.

This Software patent obviously is a concern mostly to US-based companies and those selling products in the US. I am neither a US citizen nor do I have or run any companies based in the US. However, since curl and libcurl are widely used products that are being used by several hundred companies already, I want to help bring out as much light as possible onto this problem.

The patent itself is of course utterly stupid and silly and it should never have been accepted as it describes trivially thought out ideas and concepts that have been thought of and implemented already decades before this patent was filed or granted although I claim that the exact way explained in the patent is not frequently used. Possibly the protocol using a method that is closed to the description of the patent is zmodem.

I guess I don’t have to mention what I think about software patents.

I’m convinced that most or all download tools and browsers these days know how to resume a previously interrupted transfer this way. Why wouldn’t these guys also approach one of the big guys (with thick wallets) who also use this procedure? Surely we can think of a few additional major players with file tools that can resume file transfers and who weren’t targeted in this suit!

I don’t know why. Clearly they’ve not backed down from attacking some of the biggest tech and software companies.

patent drawing

(Illustration from the ‘180 patent.)

“Hacking me”

Thursday, December 23rd, 2010

If you ever wonder how clever it was of me to make an FTP tool that used the default anonymous password “curl_by_daniel@…” once upon a time and you want to know why I changed that to ftp@example.com instead? Here’s a golden snippet to just absorb and enjoy:

Date: Thu, 23 Dec 2010 22:56:00
From: iHack3r <miithehacker@gmail.com>
To: info@haxx.se
Subject: Hacking me
Parts/Attachments:
1 Shown    8 lines  Text (charset: ISO-8859-1)
2   OK    ~7 lines  Text (charset: ISO-8859-1)
—————————————-
To the idiot named Daniel, Please stop brute force attacking my FTP client.
I do not appreciate it, i have an anonymous account set up for the general
public to access my files that i want them to access, QUIT trying to hack
the admin because 1. DISABLED unless i am leaving to go somewhere without my
computer 2: THE PASSWORD is random letters and numbers.
-iHack3r
Date: Thu, 23 Dec 2010 22:56:00 From: iHack3r <hidden> To: info@[my company] Subject: Hacking me To the idiot named Daniel, Please stop brute force attacking my FTP client. I do not appreciate it, i have an anonymous account set up for the general public to access my files that i want them to access, QUIT trying to hack the admin because 1. DISABLED unless i am leaving to go somewhere without my computer 2: THE PASSWORD is random letters and numbers. -iHack3r

The password was changed at Feb 13 2007 in the curl version 7.16.2, but there are a surprisingly large amount of older curls still around out there…

Update: as the person responded again after having read this blog post and still didn’t get it, I felt the urge to speak up in even more clear terms:

I didn’t have anything to do with any “hacker attack” on any site. Not yours, and not anyone else’s. The fact that almost-my-email address appeared in your logs is because I wrote the FTP client. It is a general FTP client that is being used by a very very large amount of people all over the world. If I ever would attack a site, why on earth would I send along my real name or email address?

Byte ranges for FTP

Monday, December 20th, 2010

In the IETF ftpext2 working group there have been some talks around clients’ and servers’ ability to do and support “ranged” file transfers, that is transferring only a piece of any given file. FTP supports the REST command and has done so since the dawn of man (RFC765 – June 1980), and using that command, a client can set the starting point for a transfer but there is no way to set the end point. HTTP has supported the Range: header since the first HTTP 1.1 spec back in January 1997, and that supports both a start and an end point. The HTTP header does in fact support multiple ranges within the same header, but let’s not overdo it here!

Currently, to avoid getting an entire file a client would simply close the data connection when it has got all the data it wants. The unfortunate reality is that some servers don’t notice clients doing this, so in order for this to work reliably a client also has to send ABOR, and after this command has been sent there is no way for the client to reliably figure out the state of the control connection so it has to get closed as well (which is crap in case more files are to be transferred to or from the same host). It primarily becomes unreliable because when ABOR is sent, the client gets one or two responses back due to a race condition between the closing and the actual end of transfer etc, and it isn’t possible to tell exactly how to continue.

A solution for the future is being worked on. I’ve joined up the effort to write a spec that will suggest a new FTP command that sets the end point for a transfer in the same vein REST sets the start point. For the moment, we’ve named our suggested command RANG (as short for range). “We” in this context means Tatsuhiro Tsujikawa, Anthony Bryan and myself but we of course hope to get further valuable feedback by the great ftpext2 people.

There already are use cases that want range request for FTP. The people behind metalinks for example want to download the same file from many servers, and then it makes sense to be able to download little pieces from different sources.

The people who found the libcurl bugs I linked to above use libcurl as part of the Fedora/Redhat installer Anaconda, and if I understand things right they use this feature to just get the beginning of some files to check them out and avoid having to download the full file before it knows it truly wants it. Thus it saves lots of bandwidth.

In short, the use-cases for ranged FTP retrievals are quite likely pretty much the same ones as they are for HTTP!

The first RANG draft is now available.

An FTP hash command

Monday, March 29th, 2010

Anthony Bryan strikes again. This time his name is attached to a new standards draft for how to get a hash checksum of a given file when using the FTP protocol. draft-bryan-ftp-hash-00 was published just a few days ago.

The idea is basically to introduce a spec for a new command named ‘HASH’ that a client can issue to a server to get a hash checksum for a given file in order to know that the file has the exact same contents you want before you even start downloading it or otherwise consider it for actions.

The spec details how you can ask for different hash algorithms, how the server announces its support for this in its FEAT response etc.

I’ve already provided some initial feedback on this draft, and I’ll try to assist Anthony a bit more to get this draft pushed onwards.

an application protocol monster

Friday, January 8th, 2010

During the autumn 2009 I was sponsored by a company to work on some new protocols for libcurl to support – IMAP, POP3, SMTP and their TLS-powered versions. It took me a little while to get things working the way I’d like them to due work load elsewhere and other irrelevant distractions.

In the next curl and libcurl release, to be called version 7.20.0, which if everything runs fine and according to plan might happen at the end of January 2010 or possibly a little later, libcurl will truly have converted from a file transfer library into a full fledged application protocol layer monster.

The current incarnation of my development libcurl supports the following 18(!) protocols:

tftp ftp telnet dict ldap ldaps http file https ftps scp sftp imap imaps pop3 pop3s smtp smtps

While SSL versions of protocols are arguably not separate protocols, the 12 protocols in the list without SSL are still many in my view.

The fundamentals of libcurl remain the same though: specify a URL to operate against. Send or receive data. Sometimes both send and receive in the same request.

Internally in libcurl, I converted a big portion of the FTP-specific code into a more generic “pingpong”-generic code which is now designed to work similarly with all these new protocols that share many similarities with FTP. They are all sending commands to the server and get responses back, in similar ways.

As before, the ability to disable specific protocols when building libcurl remains so for those who don’t particularly care about these new ones and want to maintain building a library that is as small and lean as possible, there should be little extra weight due to this recent development. I’ve been pondering but I’ve not yet figured out the most perfect way to deal with such build options in the code so they are still a bit too #if and #ifdef intensive for my taste…

Meanwhile, Chris Conroy has been busy in his end implementing support for RTSP that also soon might be ripe for inclusion in the main code.

Not too surprisingly perhaps, the curl tool also then gets these protocol abilities so it becomes even more than before the Swiss army knife tool for internet protocols and then of course explicitly application layer ones.

cURL

FTP vs HTTP, it talks!

Friday, August 29th, 2008

Anthony Bryan and I had a talk the other day regarding FTP vs HTTP etc, and the outcome is available as this podcast.

FTP vs HTTP, really!

Monday, August 25th, 2008

Since I’m doing my share of both FTP and HTTP hacking in the curl project, I quite often see and sometimes get the questions about what the actual differences are between FTP and HTTP, which is the “best” and isn’t it so that … is the faster one?

FTP vs HTTP is my attempt at a write-up covering most differences to users of the protocols without going into too technical details. If you find flaws or have additional info you think should be included, please let me know!

The document includes comparisons between the protocols in these areas:

  • Age
  • Upload
  • ASCII/binary
  • Headers
  • Pipelining
  • FTP Command/Response
  • Two Connections
  • Active and Passive
  • Firewalls
  • Encrypted Control Connections
  • Authentications
  • Download
  • Ranges/resume
  • Persistent Connections
  • Chunked Encoding
  • Compression
  • FXP
  • IPv6
  • Name based virtual hosting
  • Proxy Support
  • Transfer Speed

With your help it could become a good resource to point curious minds to in the future…