Category Archives: Web

web stuff

Please hide my email

... I don't want my employer/wife/friends to see that I've contributed something cool to an open source project, or perhaps that I said something stupid 10 years ago.

I host and co-host a bunch of different mailing list archives for projects on web sites, and I never cease to be stumped by how many people try hard to avoid being seen on the internet. I can understand the cases where users accidentally leak information they intended to keep private (although removal from an archive is then not a fix, since the information has already leaked to the world). But I can never understand the large crowd that tries to hide previous contributions to open source projects because they think their current or future employers may notice and have a (bad) opinion about it.

I don't have the slightest sympathy for the claim that they get a lot of spam because their email address is in my archives. I only host very public lists, so the person's address was already posted publicly to hundreds of recipients and in most cases also to several other mailing list archives.

People are weird!

Estimated-Content-Length

Greg Dean posted an interesting idea on the ietf-http-wg mailing list, suggesting that a new response header be added to HTTP (Estimated-Content-Length:) to allow servers to give a rough estimate of the content length in situations where the server doesn't actually know the exact size before it starts sending data.

In the current world, HTTP servers can only report the exact size to the client or no size at all and then the client will have to just deal with the response becoming any size at all. It then has no way to know even roughly how large the data is or how long the transfer is going to take.
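To make the idea concrete, here is a minimal sketch of how a client could use such a header for a progress display. Estimated-Content-Length is only a proposal, and the sketch assumes it would carry a plain decimal byte count just like Content-Length does; the function names are made up for illustration.

```python
from typing import Mapping, Optional, Tuple


def expected_size(headers: Mapping[str, str]) -> Tuple[Optional[int], bool]:
    """Return (size_in_bytes, is_exact) based on the response headers.

    An exact Content-Length always wins. Estimated-Content-Length is a
    proposed header, and its value format here is an assumption.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    for name, exact in (("content-length", True),
                        ("estimated-content-length", False)):
        value = lowered.get(name, "")
        if value.isascii() and value.isdigit():
            return int(value), exact
    return None, False


def progress_text(received: int, headers: Mapping[str, str]) -> str:
    """Format a progress string; estimates are marked with a leading '~'."""
    size, exact = expected_size(headers)
    if not size:
        # No usable size at all: only the byte count can be shown.
        return f"{received} bytes"
    percent = min(100, received * 100 // size)
    return f"{percent}%" if exact else f"~{percent}%"
```

With only an estimate available, `progress_text(2500, {"Estimated-Content-Length": "5000"})` gives `~50%`, while a response with neither header falls back to showing the raw byte count.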

The responses to Greg's post, from several people, seem mostly positive thus far.

Shared Dictionary Compression over HTTP

Wei-Hsin Lee of Google posted about their effort to create a dictionary-based compression scheme for HTTP. I find the idea rather interesting, and it'll be fun to see what the actual browser and server vendors will say about this.

The idea is basically to use "cookie rules" (domain, path, port number, max-age etc) to make sure a client gets a dictionary, and then the server can deliver responses that are diffs computed against the dictionary it has previously delivered to the client. For repeated, similar content it should be able to achieve a lot better compression ratios than any other HTTP compression in use today.
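The effect of a shared dictionary is easy to try out. The sketch below is not SDCH and does not use VCDIFF; it swaps in zlib's preset dictionary support from the Python standard library, which rests on the same principle: both ends hold the same dictionary, so content that repeats the dictionary compresses to very little. The dictionary and page contents are made-up sample data.

```python
import zlib

# Hypothetical boilerplate that every page on a site shares. In SDCH terms
# this plays the role of the dictionary the client downloaded earlier.
DICTIONARY = (
    b'<html><head><title>Rockbox daily builds</title>'
    b'<link rel="stylesheet" href="/style.css"></head><body>'
    b'<div id="menu"><a href="/">home</a> <a href="/download/">download</a> '
    b'<a href="/manual/">manual</a></div><h1>'
)


def compress_with_dictionary(payload: bytes, dictionary: bytes) -> bytes:
    """Server side: compress payload against the shared dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(payload) + compressor.flush()


def decompress_with_dictionary(blob: bytes, dictionary: bytes) -> bytes:
    """Client side: needs the very same dictionary to decode."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# A response that is mostly the shared boilerplate plus a little new text.
page = DICTIONARY + b"Build 35 finished</h1></body></html>"

plain = zlib.compress(page, 9)
shared = compress_with_dictionary(page, DICTIONARY)

assert decompress_with_dictionary(shared, DICTIONARY) == page
print(len(page), len(plain), len(shared))
```

The three printed numbers are the raw size, the size with ordinary compression and the size when the dictionary is shared; the last one should be clearly the smallest for content like this.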

I figure it should be seen as a relative to the "Delta encoding in HTTP" idea, although the SDCH idea seems somewhat more generically applicable.

Since they seem to be using the VCDIFF algorithm for SDCH, the recent open-vcdiff announcement of course is interesting too.

A bad move. A really bad move.

So I wrote this little perl script to perform a lot of repeated binary Rockbox builds. It does something like 35 builds, zips them up and gives them proper names in a dedicated output directory. Perfect for things such as release builds.

Then I wrote a similar one to build manuals and offer them too. I then made the results available on the Rockbox 3.0RC (release candidate) page of mine.

Cool, methinks, and since I'll be away for a week starting Wednesday, I think I should make the scripts available in case someone else wants to play with them and possibly make a release while I'm gone.

I did

mv buildall.pl webdirectory/buildall.pl.txt

... thinking that I don't want it to try to execute as a perl script on the server so I rename it to a .txt extension. But did this work? No. Did it cause total havoc? Yes.

First, Apache apparently still thinks these files are perl scripts (== cgi scripts) on my server, even if they got an additional extension. I really really didn't expect this.
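Apache's mod_mime is documented to treat a file name as having several extensions, and each of them can pick up a handler. The sketch below is a simplified model of that lookup, not Apache's actual code, and the handler table is an assumption about what a setup like mine might contain (something along the lines of "AddHandler cgi-script .pl").

```python
from typing import Dict, List

# Assumed server configuration: extensions mapped to a CGI handler.
HANDLER_MAP: Dict[str, str] = {"pl": "cgi-script", "cgi": "cgi-script"}


def handlers_for(filename: str, handler_map: Dict[str, str]) -> List[str]:
    """Return the handlers triggered by any extension in the file name.

    Everything after the first dot counts as an extension, so the position
    of ".pl" within the name does not matter. Matching is case-insensitive.
    """
    extensions = filename.split(".")[1:]
    return [handler_map[ext.lower()] for ext in extensions
            if ext.lower() in handler_map]


print(handlers_for("buildall.pl.txt", HANDLER_MAP))  # still matches .pl
print(handlers_for("buildall-pl.txt", HANDLER_MAP))  # no .pl extension left
```

In this model, a rename that removes the dot before "pl" (or the extension altogether) is what stops the file from being treated as a script.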

Then, my scripts are doing a command chain similar to "mkdir dir; cd dir; rm -rf *". It works great when invoked in the correct directory. It works less fine when the web server invokes this because someone clicked on the file I just made available to the world.

Recursive deletion of all files the web server user was allowed to erase.
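For what it's worth, a chain joined with ";" runs every command no matter whether the previous one succeeded, so the "rm -rf *" happily runs in whatever directory the process happens to be in. A more defensive way to get an empty output directory is to never delete relative to the current directory at all. This is a minimal sketch in Python rather than the perl of the real scripts, and the function name is made up.

```python
import shutil
from pathlib import Path


def recreate_output_dir(parent: Path, name: str) -> Path:
    """Empty and recreate parent/name using explicit paths only."""
    parent = parent.resolve()
    target = (parent / name).resolve()
    # Refuse names like ".", ".." or "a/b" that would make the target the
    # parent itself or anything other than a direct child of it.
    if target == parent or target.parent != parent:
        raise ValueError(f"refusing to clean {target}")
    if target.exists():
        # rmtree raises on failure instead of silently carrying on.
        shutil.rmtree(target)
    target.mkdir()
    return target
```

Since the path to delete is spelled out and checked, it no longer matters which directory the script was started from or who started it.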

Did I immediately suspect foul play and evil doings by outsiders? Yes. Did it take quite a while to repair the damage from backups? Yes. Did it feel painful to realize that I myself was to blame for this entire incident and not at all any outside or evil perpetrator? Yes yes yes.

But honestly, in the end I felt good that it wasn't a security hole somewhere that caused it since I hate spending all that time to track it down and fix it. And thanks to a very fine backup system, I had most of the site and things back up and running after roughly one hour off-line time.

Getting cacerts for your tools

As the primary curl author, I'm finding the comments on the blog entry "Teaching wget About Root Certificates" interesting. The entry is about how you can get cacerts for wget by downloading them from curl's web site, and people quickly point out that getting cacerts from an untrusted third party place of course is an ideal situation for an MITM "attack".

Of course you can't trust any files off an HTTP site or an HTTPS site without a "trusted" certificate, but thinking that the curl project would run one of those just to let random people load PEM files from our site seems a bit weird. Thus, we also provide the scripts we do all this with so that you can run them yourself with whatever input data you need, preferably something you trust. The more paranoid you are, the harder that gets of course.

On Fedora, curl does come with ca certs (at least I'm told recent Fedoras do). Even if it doesn't, you can point curl to whatever cacert you like, and since most default installs of curl use OpenSSL like wget does, you could tell curl to use the same cacert your wget install uses.
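Pointing both tools at the same bundle can look something like the sketch below. It only builds the command lines, using curl's --cacert option and wget's --ca-certificate option; the bundle path and URL are made-up placeholders, so substitute a PEM file you actually trust.

```python
from typing import List

# Hypothetical location of a PEM bundle you trust.
CA_BUNDLE = "/etc/ssl/my-trusted-cacert.pem"


def curl_command(url: str, ca_bundle: str) -> List[str]:
    """curl takes the CA bundle via --cacert."""
    return ["curl", "--cacert", ca_bundle, url]


def wget_command(url: str, ca_bundle: str) -> List[str]:
    """wget takes the CA bundle via --ca-certificate."""
    return ["wget", f"--ca-certificate={ca_bundle}", url]


# To actually run one of them: subprocess.run(command, check=True)
print(" ".join(curl_command("https://example.com/", CA_BUNDLE)))
print(" ".join(wget_command("https://example.com/", CA_BUNDLE)))
```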

This last thing gets a little more complicated when one of the two gets compiled with an SSL library that doesn't easily support PEM (read: NSS). In the case of curl in recent Fedora, they do build it with NSS, but with an additional patch that still allows it to read PEM files.

FTP vs HTTP, really!

Since I'm doing my share of both FTP and HTTP hacking in the curl project, I quite often see, and sometimes get, questions about what the actual differences are between FTP and HTTP, which is the "best" and isn't it so that ... is the faster one?

FTP vs HTTP is my attempt at a write-up that covers most of the differences for users of the protocols without going into overly technical details. If you find flaws or have additional info you think should be included, please let me know!

The document includes comparisons between the protocols in these areas:

  • Age
  • Upload
  • ASCII/binary
  • Headers
  • Pipelining
  • FTP Command/Response
  • Two Connections
  • Active and Passive
  • Firewalls
  • Encrypted Control Connections
  • Authentications
  • Download
  • Ranges/resume
  • Persistent Connections
  • Chunked Encoding
  • Compression
  • FXP
  • IPv6
  • Name based virtual hosting
  • Proxy Support
  • Transfer Speed

With your help it could become a good resource to point curious minds to in the future...

Site deadness

When I got to work this morning I immediately noticed that one of the servers that hosts a lot of services for open source projects I tend to play around with (curl, Rockbox and more) had died. It responded to pings but didn't allow my usual login via ssh. It also hosts this blog.

I called our sysadmin guy who works next to the server and he reported that the screen mentioned inode problems on an ext3 filesystem on sda1. Power-cycling the machine did no good: afterwards it simply didn't even see the hard drive...

I changed our slave DNS for rockbox.org and made it point to a backup web server in the meantime, just to make people aware of the situation.

Some 12 hours after the discovery of the situation, Linus Nielsen Feltzing had the system back up again and it's looking more or less identical to how it was yesterday. The backup procedure proved itself to be working flawlessly. Linus inserted a new disk, partitioned it like the previous one, restored the whole backup, fixed the boot (lilo) and wham (ignoring some minor additional fiddling) the server was up and running again.

Thanks Linus!