Category Archives: Technology

Really everything related to technology

The web shop timeout mystery

Another one of the things in the modern world I’ve not yet understood:

Why on earth do some web-based shops time out your shopping and automatically clear your “shopping cart” if you just leave it around for a few hours or days? Why why why? What harm does it do them if I don’t hurry on to purchase?

I love being able to press ‘buy’ on lots of stuff (which is then added to the “cart”) and then ponder for a few days whether I want more stuff, whether I selected the right models, alter a few things and similar. So when they time out on me like this, it’s like a blow in the face and I need to start over again. It’s simply crazy that I have to back up my list of things to buy just in case they’ll flush me before I’m done!

Yes, I’m aware that some sites offer “save lists” etc if you’re registered and logged in and all, but I don’t want to have to do that.

I can imagine that at times things run out of stock or they even change the prices of merchandise that’s in my cart, but they could still solve that in other ways than just clearing everything.

bittorrent vs HTTP

A while ago I put together my document FTP vs HTTP that compares data transfers done using those two protocols. Similarities and differences.

Today I’m taking the next step in this little series and I offer you Bittorrent vs HTTP! This document discusses differences in areas such as:

  • Transfer Speed
  • Streaming
  • Uplink
  • Firewalls
  • Redundancy
  • Server Load
  • Encryption
  • Protocol Standards

As usual, I’m all ears for your valuable input and help on making it more accurate and more detailed than I manage to myself. Point out my mistakes, my weird use of words or whatever. Post a comment here or email me.


HTTP Status Report

Mark Nottingham gave a very interesting one-hour talk on the status of HTTP and the work on HTTPbis at a QCon conference recently, and luckily for us HTTP geeks there’s this great video/presentation from that.

curl is mentioned at least twice in the slides. Unfortunately the second mention gets a fact wrong: it says curl uses “Pragma: no-cache”, which isn’t true anymore. It used to do that, but we stopped doing it in curl a while ago.

I’m a subscriber to the httpbis mailing list and a casual contributor, but nonetheless his summary and overview of the state was refreshing as I’ve not been able to keep up with all the details and I haven’t been tracking that working group from its start either.

Rockbox gsoc2009

So finally it went public that this year Rockbox will be mentoring five students to reach their individual goals and get their projects turned into realities.

Two of the projects are new codecs, one is a new port, one is USB HID work and finally there’s this “make Rockbox an instrument” project.

Personally I’m admin for the Rockbox gsoc effort for the third year, and this year I’m also co-mentoring a student (Robert Keevil) in his project to bring Rockbox to the Sansa View.

Let’s make this a great gsoc year!

USB converter woes

USB to RS232 converters are just never sold with proper information about what chip’s inside, and right now I want to know whether the one UART I’m working with perhaps is not playing nicely with my existing converter cable.

I have this XScale PXA270 on a Toradex Colibri board, and it has only one full-featured RS232 port (FFUART). I’m about to move things over to the lesser-featured BTUART.

A theory is that my current USB converter that is based on a “Prolific PL2303” doesn’t play nicely on the serial port that isn’t a full RS232.

So I ran off and bought a new cable. I grabbed the only model I found in my local Kjell & Company store. It looks quite different from my existing one, but there’s no hint anywhere on the package or inside of it that says what chipset powers it.

After a quick drive back home (I’m working from home in this assignment), I plugged it in and got to see this depressingly familiar dmesg output:

usbcore: registered new interface driver usbserial
usbserial: USB Serial support registered for generic
usbcore: registered new interface driver usbserial_generic
usbserial: USB Serial Driver core
usbserial: USB Serial support registered for pl2303
pl2303 2-2.4:1.0: pl2303 converter detected
usb 2-2.4: pl2303 converter now attached to ttyUSB0
usbcore: registered new interface driver pl2303
pl2303: Prolific PL2303 USB to serial adaptor driver

So what now? I hate how (my) computers these days don’t have serial ports while the entire embedded world still very much uses them. I think I’ll go searching in my closet to see if I can find an old crap computer with a serial port to try.

Another theory is that the port simply is broken hw-wise on the dev board but that’s harder to check for me right now.

Update: it was (as usual) only my stupidity that prevented this from working. If I switch it over to the correct baudrate the usb converter does fine. But before I found that out, I did find a computer with a serial port and I did see it working on that too…

User data probably for sale

It’s time for a little “doomsday prophecy”.

Already seen happen

As was reported last year in Sweden, mobile operators here sell customer data (Swedish article) to companies who are willing to pay. Even though this might be illegal (Swedish article), all the major Swedish mobile phone operators do this. This second article mentions that the operators think this practice is allowed according to the contract every customer has signed, but that’s far from obvious in everybody else’s eyes and may in fact not be legal.

For the non-Swedes: one mobile phone user found himself surfing to a web site that would display his phone number embedded on the site! This was only possible due to the site buying this info from the operator.

While the focus on what data they sell has been on the phone number itself (and I do find that a pretty serious privacy breach in itself), there’s so much more that the imaginative operators very likely soon will offer to companies who pay enough.

Legislation going the wrong way

There’s this EU “directive” from a few years back:

Directive 2006/24/EC of the European Parliament and of the Council of 15 March 2006 on the retention of data generated or processed in connection with the provision of publicly available electronic communications services or of public communications networks and amending Directive 2002/58/EC

It basically says that Internet operators must store information about the connections users make on the net and keep it around for a certain period. Sweden hasn’t yet ratified this but I hear other EU member states already have it implemented…

(The US also has some similar legislation being suggested.)

It certainly doesn’t help us who believe in maintaining a level of privacy!

What soon could happen

It’s hardly a secret that operators run network supervision equipment on their customer networks and thus analyze and snoop on network data sent and received by each and every customer. They do this for network management reasons and because of legislation like what I mentioned above. (Disclaimer: I’ve worked and developed code for a client that makes and sells products for exactly this purpose.)

Anyway, it is thus easy for the operators to for example spot common URLs their users visit. They can spot what services (bittorrent, video sites, Internet radio, banks, porn etc) a user frequents. Given a particular company’s interest, it could certainly be easy to check for specific competitors in users’ visitor logs or whatever and sell that info.

If operators can sell the phone numbers of their individual users, what stops them from selling all this other info, given a proper stash of money from the ones who want to know? I’m convinced this will happen sooner or later, unless we get proper legislation that forbids the operators from doing this… In Sweden this sale of info is most likely to be done by the mobile network operators and not the regular Internet providers, simply because the mobile ones have this end user contract to lean on that they claim gives them this right. That same style of contract and terminology is not used for regular Internet subscriptions (I believe).

So here’s my suggestion for Think Geek to expand somewhat on their great shirt:

i-read-your-everything

(yeah, I have one of those boring ones with only the first line on it…)

Haxx for you

So our company is named Haxx and it has been named like this for more than a decade, but the name is considered by some people to be a mark of evil or something.

In my closest circle of friends we’ve kind of “always” liked using silly names and we’ve long had a fascination with double Xes. Once upon a time in the early 90s we teamed up under the name Frexx and we did some funky programs on the Amiga. Most notably a programming language called FPL and the text editor FrexxEd.

When we then during the second half of the 90s needed to start an actual company to easier cater for our “spare time businesses” we wanted a new name but still one in a similar spirit. Being big friends and practitioners of writing “quick hacks” (“hack” in the sense that it is a quickly done program/script that perhaps isn’t always written very solidly or nice but works for the moment) to solve our own problems both at work and at home, we found Haxx to be a perfect name for us – Hack in pluralis, spelled with double-x.

Already at the time we took the name we knew about this bad habit at places that seemed to lump Hackers with Crackers or similar so we knew there would be a risk that some could assume us to be something else based on our name, but what the heck, we liked the name and we are and were hackers and we do and did a lot of hacks. Haxx it was. Haxx it is.

These days we get some minor problems due to this. Some companies (let’s not name any specific ones but you know the kind) have black-listed haxx.se web sites (presumably because of the name ‘haxx’ in the domain name), mails from us or from the mailing lists we host more easily get filtered as spam for some people, and we get our share of strange suggestions etc.

I guess the upside of it is that we get our chances to whine about people and systems who decide to filter contents purely based on the presence of a single 4-letter word, either in a domain name or in web page or mail contents, and that is actually hilariously stupid.


HD is the thing

Thomson apparently brought out the new mp3hd format for music the other day. “HD” is apparently the thing we need to have included when a new term is announced. Why does the world need another lossless music format?

It seems they’ve introduced a crafty dual-format thing where they stuff MPEG-4 SLS lossless encoded data in a new id3 XHD3 tag within the mp3 and then stuff a “regular” mp3 as the normal data in there. This way it is supposed to still work fine with existing and older mp3 players. Of course the total size of all id3v2 tags is limited to 256MB, which could be a limiting factor for it.
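For the curious, here is a minimal sketch of where that 256MB figure comes from. The ID3v2 header stores the tag size as four bytes that each carry only 7 bits (a so-called syncsafe integer), so the size field is 28 bits wide. The function name is my own and this only decodes that size field. It says nothing about how mp3hd lays out its data inside the tag.

```python
def decode_syncsafe(size_bytes):
    """Decode the 4-byte syncsafe size field of an ID3v2 header.

    Each byte contributes its low 7 bits; the top bit must be zero.
    """
    if len(size_bytes) != 4 or any(b & 0x80 for b in size_bytes):
        raise ValueError("not a 4-byte syncsafe integer")
    value = 0
    for b in size_bytes:
        value = (value << 7) | b
    return value


# The largest size the header can express: all 28 payload bits set.
MAX_ID3V2_TAG_SIZE = decode_syncsafe(bytes([0x7F] * 4))

# Express the limit in megabytes (1MB = 1024 * 1024 bytes here).
MAX_ID3V2_TAG_SIZE_MB = (MAX_ID3V2_TAG_SIZE + 1) // (1024 * 1024)
```

So anything stuffed into the tag, lossless data included, has to fit within that one 28-bit size.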

As usual, you can find a thriving discussion on this topic on hydrogenaudio.

Rockbox should of course be able to use the mp3 parts at first, and if this truly is an existing, established lossless codec there’s a chance it might be able to play that part in the future.

libssh2 upped a notch

There has been some well-founded criticism against libssh2 for a long time for its bad transfer performance when doing SCP and SFTP based transfers. Tests have proved it to be significantly slower than the openssh based alternatives in comparisons done in similar conditions. We’re talking down to a tenth(!) of the speed for SFTP.

Luckily I have an unnamed (by agreement) sponsor who pays me for improving this.

Giving it some love

I basically started out reading the SFTP code and cleaned it up as I went over it, and I added some clarifying comments etc. I found some irregularities that I fixed. Soon I could spot an obvious performance boost, like perhaps 3-4 times the previous speed. But since SFTP was painfully slow originally, this was still very crappy compared to openssh.

I then switched over to plain SCP tests. SCP is basically just an “scp” command sent over SSH and then streaming the data over a plain SSH “channel”, while SFTP is a whole additional protocol layer on top. Thus SCP is more low-level, on the actual SSH level, and the foundation on which SFTP runs anyway so getting SCP faster was fundamental.

Make it speedier

My initial tests with libssh2 1.0 showed libssh2 to download data at roughly 25% of the speed of openssh when SCPing a 1GB file from an openssh server running on localhost. The openssh client shows roughly 40MB/sec on my test box.

Also, just checking my CPU load meter while doing the libssh2 transfers showed that it certainly wasn’t hitting the roof or anything. It was barely even noticeable! Of course something was really wrong but what was it?

SSH has a lower protocol layer that does the entire encryption thing, the transport layer, but on top of that is the “channel layer” that is packet based for sending data back and forth over the transport layer. This channel thing has a receive window concept, much like TCP itself has, which tells the remote side how much data it is allowed to send until it gets further notice.

libssh2 1.0 had a very conservative windowing logic. It started with a default window size of 64KB and it upped it at every read with the same amount that was read (which then could be 1K to 16KB something depending on the app).

My remake of this was to simplify the logic, read data from the network more evenly distributed over time, update the window size much less frequently and increase the window size by orders of magnitude! I found that when using a window size of 38MB (600 times the previous default size!!) things started flying.
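To get a feel for why updating the window less often matters, here is a toy model that just counts how many window updates a receiver would send during a transfer. It is a sketch under my own assumptions and not the libssh2 code: the function name, the “refill when below half” rule and the 16KB read size are all made up for illustration.

```python
def count_window_adjusts(total_bytes, read_size, initial_window,
                         refill_threshold=None):
    """Count window updates sent by a receiver in a toy model.

    refill_threshold=None: bump the window after every read by the
    amount that was just read.
    Otherwise: only update once the remaining window drops below
    refill_threshold, and then top it back up to initial_window.
    """
    window = initial_window
    received = 0
    adjusts = 0
    while received < total_bytes:
        chunk = min(read_size, total_bytes - received, window)
        received += chunk
        window -= chunk
        if refill_threshold is None:
            window += chunk
            adjusts += 1
        elif window < refill_threshold:
            window = initial_window
            adjusts += 1
    return adjusts


KB = 1024
MB = 1024 * KB
FILE_SIZE = 1024 * MB  # the 1GB test file

# Old style: 64KB window, one update per 16KB read.
eager = count_window_adjusts(FILE_SIZE, 16 * KB, 64 * KB)

# New style: 38MB window, refilled only when it drops below half.
lazy = count_window_adjusts(FILE_SIZE, 16 * KB, 38 * MB,
                            refill_threshold=19 * MB)
```

In this model the first policy sends one update per read, tens of thousands for the 1GB file, while the second sends only a few dozen. The larger window also lets the remote side keep sending for much longer before it has to wait for further notice.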

Improved

With these modifications, libssh2 transfers SCP at close to 40MB/sec! SFTP is still left behind at a “mere” 14MB/sec on the same test setup but it has its own set of problems and solutions. The discussion on the libssh2 list is now more about how to sensibly size the window to work the best way for different situations.

SFTP is a protocol that works more on file operations. The client sends OPEN, READ and CLOSE requests and the server replies with status and data. The READ request asks for N bytes starting at offset Z so a simple implementation like libssh2 asks for chunk after chunk in a serial manner, increasing the offset as it loops over the range. This causes a back-and-forth effect that certainly does not make optimized use of the network bandwidth.

SFTP ping pong

openssh has a nifty approach to enhance throughput for SFTP: it sends off and handles multiple outstanding READ requests in parallel so that it can better keep things busy (and the reverse when doing uploads). That concept is slightly harder to do with an API like the one libssh2 offers but it is of course still quite doable. I suspect that we might achieve results somewhat faster by simply using multiple connections, as then we can keep using this simplistic approach but still use the full bandwidth. (Yes, I realize multiple connections may not be feasible for all applications.)
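The ping-pong cost can be illustrated with a back-of-the-envelope model. This is only a rough sketch with made-up numbers, not a measurement of libssh2 or openssh: it assumes READ requests go out in batches of “outstanding” requests, that each batch costs one round trip, and that the data itself moves at a fixed bandwidth.

```python
import math


def transfer_time(file_size, chunk_size, rtt, bandwidth, outstanding=1):
    """Estimate the download time in seconds in a simplified model.

    file_size and chunk_size are in bytes, rtt in seconds and
    bandwidth in bytes per second. outstanding is how many READ
    requests are in flight at once (1 means strictly serial).
    """
    chunks = math.ceil(file_size / chunk_size)
    batches = math.ceil(chunks / outstanding)
    return batches * rtt + file_size / bandwidth


# Hypothetical link: 10ms round trip, 10MB/sec, 32KB per READ request.
SIZE = 100 * 1024 * 1024
serial = transfer_time(SIZE, 32 * 1024, 0.010, 10 * 1024 * 1024)
pipelined = transfer_time(SIZE, 32 * 1024, 0.010, 10 * 1024 * 1024,
                          outstanding=16)
```

With a single request in flight, the round trips dominate the total time in this model. Keeping several requests outstanding spreads that waiting over many chunks, which is the effect the parallel READ approach is after.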

Previous tests we’ve done with SFTP uploads using multiple connections have proven libssh2 to be on par or even better than competitors on both Windows and Mac.

Please test

I’ll leave it like this for now. I’ll be very happy if people could test this version and report findings so that we make sure this is working and stable enough to release soonish. We’ll need to do something that offers window size controlling to apps, but we’ll discuss that further on the mailing list. Join in!
