Category Archives: Web

web stuff

The web shop timeout mystery

Another one of the things in the modern world I’ve not yet understood:

Why on earth do some web-based shops time out your session and automatically clear your “shopping cart” if you just leave it around for a few hours or days? Why, why, why? What harm does it do them if I don’t hurry on to purchase?

I love being able to press ‘buy’ on lots of stuff (that then gets added to the “cart”) and then ponder for a few days whether I want more stuff, whether I selected the right models, alter a few things and similar. So when they time out on me like this, it’s like a slap in the face and I need to start all over again. It’s simply crazy that I have to back up my list of things to buy, just in case they flush me before I’m done!

Yes, I’m aware that some sites offer “saved lists” and the like if you’re registered and logged in, but I don’t want to have to do that.

I can imagine that things run out of stock at times, or that the prices of merchandise in my cart change, but they could still solve that in other ways than just clearing everything.

HTTP Status Report

Mark Nottingham held a very interesting one-hour talk on the status of HTTP and the work on HTTPbis at a recent QCon conference, and luckily for us HTTP geeks there’s a great video/presentation from it.

curl is mentioned at least twice in the slides, but unfortunately the second mention gets a fact wrong: it says curl uses “Pragma: no-cache”, which is no longer true. It used to do that, but we stopped sending that header in curl a while ago.
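If you want to verify what headers your own curl version sends, the verbose option prints every outgoing request header prefixed with ‘>’; for example (example.com standing in for any site):

  curl -v http://example.com/ -o /dev/null

A reasonably recent curl should show no “Pragma: no-cache” in that output.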

I’m a subscriber to the httpbis mailing list and a casual contributor, but his summary and overview of the current state was nonetheless refreshing, as I haven’t been able to keep up with all the details and I haven’t been tracking that working group from its start either.

Not social enough

There’s this concept that’s very popular these days: social networking web sites. I’ve always been intrigued by the six degrees of separation idea, so I joined Facebook and gave it a try. Result: yawn.

Of course I realize everything depends on who you are, how your social network works and so on, but for me the Facebook experiment has only proven what I already suspected: I’m not “social enough” to care about all my friends’ teeny weeny little issues and expressions. I don’t have many friends added (35 at this particular moment) but already at this low number I get terribly uncomfortable after reading too many personal goings-on. And I’m not interested in everyone’s top lists, what IKEA furniture they would be or which character from the Muppet Show they resemble the most. I’m not going to use Facebook much until something changes.

Twitter is another one of the more trendy sites and services. It is very chaotic and most of the stuff posted there is utter crap. But there are some interesting people to follow, and I do my best to follow tradition and contribute my own junk: My Twitter feed. More seriously, I use and view Twitter as the chatter around the coffee machine in a virtual office. You can select who to listen to. You can say whatever you feel like, and the ones who might care could be reading it… The good part – for me of course – is that I can stay all geeky and techy and avoid that facebookish stuff I don’t like. Oh, and if you’re a friend in this manner, do tell me so that I can follow you!

LinkedIn is different. Here’s a site with a different goal and perspective, and keeping in touch with people I’ve been involved with professionally is a totally different matter. This makes a lot of sense to me, and it has actually proven to pay off – several times. Being a contract developer of course also makes me value having a large network to reach out to, so that I keep getting interesting assignments on a regular basis! My LinkedIn page.

User data probably for sale

It’s time for a little “doomsday prophecy”.

Already seen happen

As was reported last year in Sweden, mobile operators here sell customer data (Swedish article) to companies willing to pay. Even though this might be illegal (Swedish article), all the major Swedish mobile phone operators do it. The second article mentions that the operators consider the practice allowed under the contract every customer has signed, but that’s far from obvious in everybody else’s eyes and may in fact not be legal.

For the non-Swedes: one mobile phone user found himself surfing to a web site that displayed his own phone number embedded in the page! This was only possible because the site had bought that info from the operator.
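As far as I understand it, the mechanism is that the operator’s proxy injects a header carrying the subscriber’s number into outgoing HTTP requests, so a participating site sees something like this (header name and number are made-up examples; the details vary between operators):

  GET / HTTP/1.1
  Host: shop.example
  X-MSISDN: +46701234567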

While the focus has been on the phone number itself – and I find that a pretty serious privacy breach on its own – there’s so much more that imaginative operators will very likely soon offer to companies who pay enough.

Legislation going the wrong way

There’s this EU “directive” from a few years back:

Directive 2006/24/EC of the European Parliament and of the Council of 15 March 2006 on the retention of data generated or processed in connection with the provision of publicly available electronic communications services or of public communications networks and amending Directive 2002/58/EC

It basically says that Internet operators must store information about users’ connections made on the net and keep it around for a certain period. Sweden hasn’t implemented this yet, but I hear other EU member states already have…

(The US also has some similar legislation being suggested.)

It certainly doesn’t help those of us who believe in maintaining a level of privacy!

What soon could happen

It’s hardly a secret that operators run network supervision equipment on their customer networks, and thus analyze and snoop on the network data sent and received by each and every customer. They do this for network management reasons and to comply with legislation like the directive mentioned above. (Disclaimer: I’ve worked on and developed code for a client that makes and sells products for exactly this purpose.)

Anyway, it is thus easy for the operators to, for example, spot common URLs their users visit. They can spot what services (bittorrent, video sites, Internet radio, banks, porn etc) a user frequents. Given a particular company’s interest, it would certainly be easy to check users’ visit logs for specific competitors, or whatever, and sell that info.

If operators can sell the phone numbers of their individual users, what stops them from selling all this other info – given a proper stash of money from the ones who want to know? I’m convinced this will happen sooner or later, unless we get proper legislation that forbids it… In Sweden this selling of info is most likely to be done by the mobile network operators and not the regular Internet providers, simply because the mobile operators have that end-user contract to lean on which they claim gives them this right. The same style of contract and terminology is not used for regular Internet subscriptions (I believe).

So here’s my suggestion for ThinkGeek to expand somewhat on their great shirt:

(image: the “i read your everything” shirt)

(yeah, I have one of those boring ones with only the first line on it…)

More suggested HTTP fun

I’ve previously expressed my deep dislike of where the HTML5 work is going, and just yesterday two new internet-drafts appeared on ietf.org that stirred up discussions all around. They’re claimed to be “part of our effort to remove from HTML5 sections that are more appropriate elsewhere” but I’m thinking they’re rather inappropriate everywhere…

The first one, named Content-Type Processing Model, touches a subject I’ve been over before: the stupidity of having web browsers guess the content type based on what the content looks like. IE introduced the “I really mean it” property; the HTML5 team wants to standardize the guessing. Personally, I think the world of the web would become a better place if browsers instead became stricter and followed more closely what the servers actually say the content is; users would then complain to the site admins when things are wrong, and things would get fixed.
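For reference, the IE mechanism I’m referring to is (as far as I know) the X-Content-Type-Options response header, with which a server can declare that its Content-Type is authoritative and opt out of the sniffing:

  HTTP/1.1 200 OK
  Content-Type: text/plain
  X-Content-Type-Options: nosniff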

Guessing content types allows for sloppy behavior, it makes it harder to write browsers for the web, and it still carries a significant risk of guessing wrong.

The second draft advocates the new HTTP header “Origin”, which according to the authors would help guard servers against CSRF (“Cross-Site Request Forgery”). The main author says 3% of users on the Internet get their Referer header stripped while virtually none get Origin stripped. I claim this is a bogus argument, since Referer gets stripped because it is a known and established header and Origin is not. I also completely fail to see the goodness of this, and judging from several of the other responses on the ietf-http-wg mailing list I am not alone…
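The idea, for those who haven’t read the draft: the browser attaches the origin of the page that triggered the request, and a server can then refuse cross-site state-changing requests from origins it doesn’t recognize. A forged request would look something like this (all names made up):

  POST /transfer HTTP/1.1
  Host: bank.example
  Origin: http://attacker.example
  Content-Type: application/x-www-form-urlencoded

  amount=1000&to=12345

and bank.example would reject it since the Origin isn’t its own.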

IETF http-state group created

Over at the IETF another working group was just created, named http-state (with an associated mailing list), with this specific goal:

Ultimately, the purpose of this group is to create an updated HTTP State Management Mechanism RFC (aka cookies) that will supersede the Netscape spec, RFCs 2109, 2964, 2965 then add in real-world usage (e.g. HTTPOnly), and possibly add in additional features and possibly merge in draft-broyer-http-cookie-auth-00.txt and draft-pettersen-cookie-v2-03.txt.

I’ve joined the list and I hope to follow and participate in this, as I believe the current state of HTTP cookies is a rather sorry mess and the Netscape spec is still what most closely describes how cookies work in the wild. Of course I’ll do it with my libcurl experience in my luggage.
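HTTPOnly, mentioned in the charter, is a good example of real-world usage that none of the existing specs describe: an attribute in the Set-Cookie response header that tells the browser to keep the cookie away from scripts (cookie name and value here are made up):

  Set-Cookie: sid=31d4d96e407aad42; Path=/; HttpOnly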

While it would perhaps be cool to join the group in a more formal way, there’s no way for me to attend that IETF meeting in San Francisco in March.

Fun with executable extensions in viewvc

A few years ago I wrote a silly little perl script (let’s call it script.pl) that would fetch a page from a site that returns a “random URL off the internet”. I needed a range of URLs for a test program of mine, and just making up a thousand or so URLs is tricky. Thus I wrote this script that I would run to grab a range of URLs on each invocation, and then run again later to append more to the log file. It wasn’t a fancy script, but it solved my task.

The script was part of a project I was funded to work on – improving libcurl back in 2005/2006 – so adding and committing the script to CVS felt only natural and served a good purpose: it allowed others to repeat what I did.

Fast forward to late 2008. The script is now browsable via viewvc on a site that… eh, doesn’t have “.pl” disabled as a CGI extension in its config! The result of course is that each time someone tries to view the script using the web interface, the web server invokes the script locally!
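I don’t know what that site’s config actually looks like, but the classic way to end up here with Apache is a server-wide handler mapping along these lines:

  AddHandler cgi-script .cgi .pl

combined with CGI execution enabled for the directory viewvc serves from. Dropping .pl from that list, or adding a RemoveHandler .pl for the directory in question, would make the server show the script instead of running it.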

All of a sudden I got a mail from someone, apparently an admin of the site this old script was fetching from, mentioning that a machine on our network was hammering his site with many requests per second (38 requests/second apparently) and asking me to stop. It turned out a search engine crawler had indexed the viewvc output several times, and now some eight processes were running script.pl, all looping around getting a page, outputting the URL, getting another page…

While I think 38 requests/second is a bit low to even be considered a DoS, it certainly wasn’t intended nor friendly, and I was greatly surprised as I slowly realized how it all came to end up like this! Man, I suck! It reminds me of my other extension mess from just a few months ago…

Maybe I’ll learn how to do things right in the future when I grow up!

Avatars by gravatar

I’m using one of those fancy WordPress plugins on this blog that uses gravatar for the avatar images that appear next to your name when you post a comment. So if you comment here on daniel.haxx.se and want to see a fancy personal image next to your wise words, skip over to gravatar.com and put up a picture of yourself that will then be associated with your email address.

This system does not reveal your email address to any outsider, as the avatar is fetched from their service simply by sending a one-way hash of your address.
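For the curious: gravatar documents the hash as an MD5 checksum of the trimmed, lowercased address, and the avatar URL is built from its hex digits. A minimal sketch in C using OpenSSL (the address is of course just an example):

  #include <stdio.h>
  #include <string.h>
  #include <ctype.h>
  #include <openssl/md5.h>

  int main(void)
  {
    const char *email = "someone@example.com"; /* made-up address */
    unsigned char digest[MD5_DIGEST_LENGTH];
    char lower[256];
    size_t i;

    /* lowercase the address (a real version would also trim whitespace) */
    for(i = 0; email[i] && i < sizeof(lower) - 1; i++)
      lower[i] = (char)tolower((unsigned char)email[i]);
    lower[i] = '\0';

    MD5((const unsigned char *)lower, strlen(lower), digest);

    printf("http://www.gravatar.com/avatar/");
    for(i = 0; i < MD5_DIGEST_LENGTH; i++)
      printf("%02x", digest[i]);
    printf("\n");
    return 0;
  }

(build with -lcrypto)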

This isn’t really anything new here, it’s been like this for a while but I figured I should explain it better to the few who might not have realized this yet…

Filling our pipes

At around 13:43 GMT on Friday the 5th of December 2008, the network that hosts a lot of services – like this site, the curl site, the rockbox site, the c-ares site, CVS repositories, mailing lists, my own email and a set of other open source related stuff – became the target of a vicious and intense DDoS attack. The attack was in progress until about 17:00 GMT on Sunday the 7th. The target network is owned and run by CAG Contactor.

Tens of thousands of machines on the internet suddenly started trying to access a single host within the network. The IP they targeted has in fact never been publicly used in the time we’ve owned it (just a bit under two years) and it has never had any public services.

We have no clue whatsoever why someone would do this against us. We don’t have any particular services that anyone would gain anything by killing. We’re just very puzzled.

Our “ISP”, the guys we buy bandwidth and related services from, said the attack used up about 1 gigabit/sec worth of bandwidth, and with our “mere” 10 megabit/sec connection it was of course impossible to offer any services while this was going on.

It turns out our ISP made the biggest blunder, and that is the main cause of the length of this outage: we could immediately spot that the target was a single IP in our class C network. We asked them to block all traffic to this IP as far out as possible, to stop such packets from entering their network. And they did. For a short while there was silence and sense again.
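In iptables terms, the filter we asked for amounts to a single drop rule, with 192.0.2.1 here standing in for the actual target IP:

  iptables -I FORWARD -d 192.0.2.1 -j DROP

The crucial part is where it runs: dropping the packets at our end of an already saturated 10 megabit/sec link achieves nothing, while dropping them at the ISP’s border keeps the flood out of our pipe entirely.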

For some reason that block “fell off” and our network got swamped again, and it then remained unusable for another 48 hours or so. We know this since our sysadmin investigated our firewall logs at midday Sunday, and they all revealed that same target IP as destination. Since we only have a during-office-hours support deal with our network guys (we’re just a consultancy with no services that really need 24-hour support), they simply didn’t care much about our problem but said they would deal with it Monday morning. So our sysadmin shut down our firewall to save our own network from logging overload and whatnot.

Given the explanations I’ve gotten over the phone (I have yet to see and analyze logs from this), it sounds like some sort of SYN flood, with connection attempts against many different TCP ports.

Four to five hours after the firewall was shut down, the machines outside of our firewall (but still on our network) suddenly became accessible again. The attack had stopped, and we have not seen any traces of it since. The firewall is still down though; the first guy into the office Monday morning will switch it on again and then – hopefully – all services should be back to normal.

My Firefox Add-ons

I simply need to have this list somewhere so that I can find my own add-ons again when I’m running Firefox away from home!

Adblock Plus – since ads are too annoying these days

DownThemAll – because I like to be able to get whole batches of images or similar at times

Fission – just a silly eye-candy thing

Forecastfox – I like weather forecasts!

FoxClocks – helps me keep track of what time it is for my friends around the world at any given moment.

It’s All Text – makes web based editing/posting a more pleasurable experience by allowing me to edit such contents with emacs!

Live HTTP Headers is a must when you want to figure out how to repeat your browser’s actions with a set of curl commands (see the example after this list).

Open in Browser allows me to open more stuff within the browser itself, even when the Content-Type is bad.

Right-Click-Link is great when you quickly want to browse to links you find in plain text sections.

Torbutton lets me quickly switch to anonymous browsing.

User Agent Switcher lets me trick stupid server-side scripts into believing I use a different browser or even operating system.
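Speaking of the Live HTTP Headers and curl combination mentioned above: once you’ve seen what the browser sent, repeating the request is mostly a matter of handing curl the same headers. A made-up example (URL, cookie and agent string are all hypothetical):

  curl --user-agent "Mozilla/5.0 (X11; Linux)" \
       --referer "http://example.com/start" \
       --cookie "sid=abc123" \
       http://example.com/page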

What great add-ons did I miss?

(Some nitpickers would say that I don’t run Firefox since I use Debian and then it is called Iceweasel, but while that is entirely true, Iceweasel is still the Firefox source code and the Add-ons are in fact still Firefox Add-ons even if they also run perfectly fine on Iceweasel.)