Anthony Bryan strikes again. This time his name is attached to a new standards draft for how to get a hash checksum of a given file when using the FTP protocol. draft-bryan-ftp-hash-00 was published just a few days ago.
The idea is basically to introduce a spec for a new command named ‘HASH’ that a client can issue to a server to get a hash checksum for a given file, so that the client can verify that the file has the exact contents it expects before it even starts downloading it or otherwise acting on it.
The spec details how you can ask for different hash algorithms, how the server announces its support for this in its FEAT response, and so on.
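To make the idea concrete, here's a small client-side sketch: checking a FEAT reply for advertised HASH algorithms and verifying a downloaded file against a server-reported digest. The exact FEAT line syntax (`HASH SHA-256*;SHA-1;MD5`, with `*` marking the preferred algorithm) is my reading of the draft and should be treated as illustrative, not normative.

```python
# Sketch only: the FEAT reply format below is an assumption about the
# draft's syntax, not a verified implementation of it.
import hashlib

def parse_hash_feat(feat_reply: str) -> list[str]:
    """Extract advertised hash algorithms from a FEAT reply.
    Assumes a line like ' HASH SHA-256*;SHA-1;MD5'."""
    for line in feat_reply.splitlines():
        line = line.strip()
        if line.upper().startswith("HASH "):
            # Strip the '*' marker the draft may use for the default algorithm
            return [a.strip().rstrip("*") for a in line[5:].split(";")]
    return []

def verify(local_path: str, server_hex: str, algo: str = "sha256") -> bool:
    """Compare a server-reported digest with a locally computed one."""
    h = hashlib.new(algo)
    with open(local_path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest().lower() == server_hex.lower()
```

A client would call `parse_hash_feat()` on the multi-line FEAT response, pick an algorithm both sides support, issue `HASH filename`, and then feed the reply's hex digest to `verify()` after the transfer completes.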
I’ve already provided some initial feedback on this draft, and I’ll try to assist Anthony a bit more to get this draft pushed onwards.
Does HASH allow getting a hash of a part of the file? It would be great to verify the partial bit before doing a resume…
I agree that partial hashes have some benefits, and I would assume – without doing any deeper checks or thinking – that the regular “resume functionality” (the REST command) could be used even for the HASH command.
Doing partial hashes does have some drawbacks though: it will be next to impossible to have them pre-calculated, and pre-calculation would be valuable in order to avoid the DoS risk when clients start asking for hashes of many files.
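For the sake of discussion, a partial hash in the REST-style sense could look something like this: digest only the byte range `[offset, offset + length)` of the file. This is purely speculative on my part; the draft does not define ranged hashing, and the function below is just a sketch of the idea.

```python
# Speculative sketch of a ranged hash (not in the draft): digest only
# the bytes [offset, offset + length) of the file.
import hashlib

def partial_hash(path: str, offset: int, length: int, algo: str = "sha256") -> str:
    h = hashlib.new(algo)
    remaining = length
    with open(path, "rb") as f:
        f.seek(offset)
        while remaining > 0:
            chunk = f.read(min(65536, remaining))
            if not chunk:
                break  # file is shorter than the requested range
            h.update(chunk)
            remaining -= len(chunk)
    return h.hexdigest()
```

A resuming client could hash bytes `[0, N)` of its local partial file and compare against the server's hash of the same range before issuing `REST N`. It also shows why the DoS concern above is real: a server cannot precompute a digest for every possible range, so each such request costs a full read of that range.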
This is of course subject for debate and discussion, nothing’s set in stone yet.
Aloha!
(Since we’ve already talked about precomputed hashes, deprecating MD5 and so on, I’ll skip that now.) Another thing I really think should be in the draft is a section explaining the threat model: what is the purpose of the hash, that is, what is the protection it provides supposed to protect against?
The thing is, AFAIK this hash will not be a proper shared-secret MAC, but a simple digest. This means that it will NOT protect against malicious changes to the file on the server, NOR protect against a MITM situation. But it will add some protection against random errors induced during transfer. Which might be handled by TCP checksums anyway… I might not understand the purpose of the draft…
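The digest-vs-MAC distinction the commenter makes can be shown in a few lines. A plain digest detects accidental corruption, but anyone who can change the file can also recompute its digest; a keyed MAC (e.g. HMAC) would require a shared secret the attacker lacks. The key below is hypothetical, just for illustration; FTP HASH involves no such secret.

```python
# Illustrating the point above: plain digest vs keyed MAC.
import hashlib
import hmac

data = b"file contents"
tampered = b"file contents!"

# Plain digest: detects the change, but an attacker serving 'tampered'
# would simply serve its (valid) digest alongside it.
assert hashlib.sha256(data).hexdigest() != hashlib.sha256(tampered).hexdigest()

# Keyed MAC: without the shared secret, the attacker cannot produce a
# tag that verifies. ('shared-secret' is hypothetical, not part of FTP.)
key = b"shared-secret"
tag = hmac.new(key, data, hashlib.sha256).digest()
forged = hmac.new(b"wrong-key", tampered, hashlib.sha256).digest()
assert not hmac.compare_digest(tag, forged)
```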
One big point to me is to quickly tell if the file is likely to be the same as another file in another location, as an identical hash should imply that with a fairly high level of certainty.
TCP checksums are only 16 bits and have many known weaknesses, so I think this can complement them, but I don’t think that’s one of the primary purposes.
Aloha!
The draft points out that it tries to standardize what has been implemented several times independently, so there must be a need for this; I agree on that. And you’re right that it allows you to check whether the file is the same as another file in another location.
But I think this should be clarified in the draft – what the checksum is intended for and what it is not intended for. Security isn’t it.
There will be fun and hilarity when people suddenly start getting checksums instead of progress indicators when they type “hash” in common ftp clients.
haha, I wasn’t even aware of that command…!
An updated version (-01) is available: http://tools.ietf.org/html/draft-bryan-ftp-hash