Two fun quotes for the curious who don’t feel like reading the whole post:
1. On the subject of making wget deal with multiple simultanous connections/requests: The obvious solution to that is to use c-ares, which does exactly that: handle DNS queries asynchronously. Actually, I didn’t know this until just now, but c-ares was split off from ares to meet the needs of the curl developers.
2. Following the first reasoning, they can indeed get away with even less work if they base that work on an existing solution: While I’ve talked about not reinventing the wheel, using existing packages to save us the trouble of having to maintain portable async code, higher-level buffered-IO and network comm code, etc, I’ve been neglecting one more package choice. There is, after all, already a Free Software package that goes beyond handling asynchronous network operations, to specifically handle asynchronous _web_ operations; I’m speaking, of course, of libcurl. There would seem to be some obvious motivation for simply using libcurl to handle all asynchronous web traffic, and wrapping it with the logic we need to handle retries, recursion, timestamping, traversing, selecting which files to download, etc. Besides async web code, of course, we’d also automatically get support for a number of various protocols (SFTP, for example) that have been requested in Wget.
I am of course happy to see that the consideration exists – even if this won’t go further than just expressed in a mail. I did ventilate this idea to the wget people back in 2001, and even though we’re now more than six years down the road since then the situation is now even more clear: libcurl is a much more capable and proven transport layer solution and it supports much more protocols than wget is/does.
Me biased? naaah…