What, me over-analyzing?

I got fiber installed to my house a few months ago. 100/100 mbit is very nice.

My first speed checks using the Swedish bredbandskollen service were a bit disappointing since it only showed something like 50mbit down and 80mbit up. I decided to ignore that fact for the moment as things were new and I had some other more annoying issues.

I noticed that sometimes, on some specific sites, I had problems getting HTTP to work at all! The TCP connection would get established and data would get sent, but it would stall somewhere and then get disconnected. This showed up when, for example, my wife tried to download a Spotify client and my phone had trouble downloading some of my favorite podcasts. Some, not all. Possibly one out of ten or twenty sites showed the problem. Most of the sites I use frequently worked flawlessly and I would only ever see the problem with HTTP.

Puzzling!

I could avoid the problem by setting up an SSH connection to my server and running that as a SOCKS proxy, and so I could still get service even from the sites I apparently couldn’t quite use. I tried to collect the problematic sites and I ran traceroute etc on all the ones that failed, to get data that would help me and my operator pinpoint the problem. I reported this problem to my ISP really early on, but they too were puzzled and it never got far on their end.
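
The workaround itself was nothing more exotic than OpenSSH’s built-in dynamic port forwarding. A sketch of the kind of command involved (host name and port are placeholders, not my actual setup), after which the applications get pointed at localhost:1080 as their SOCKS proxy:

  ssh -D 1080 daniel@myserver.example.com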

I was almost convinced they had some kind of traffic shaping thing in the middle that was broken for HTTP somehow.

Time to fix it once and for all

Finally one day I stopped being nice to the support people and demanded that they send someone over to fix it. It wasn’t good enough that everything looked OK from their remote checks. I clearly had problems, and since I could escape them by switching over to my old ADSL I knew they weren’t due to my computers’ configs or my own firewalls or routers.

I was also convinced my ISP would just send me some cable guy even though the problem wasn’t really in the cable, but sure, it would be a necessary first step towards finding the real error.

They sent a cable guy, and it took him like three minutes to detect a bad signal level on the fiber, meaning that the problem was certainly not in my house anyway. He then drove down to the “station” that terminates my fiber, and after having “polished” that end of the fiber connection too he called me and said that he couldn’t spot any problem anymore and asked me to verify…

Now my checks showed me 80-90 mbit in both directions and the sites that used to give me problems all worked just fine.

It was the cable all along. Bad signal level. “A mystery it worked at all for you” my cable person said.

I’m left with my over-analyzed problem, having suspected all sorts of high-level stuff. But why did this only affect some sites? How come I could circumvent it with SOCKS? Gah, my brain hurts trying to answer these questions…

Anyway, now it all works and the family is happy again.

SSL verification still often disabled

Back in 2002 I realized that having libcurl not do SSL server verification by default basically meant that everyone writing libcurl apps would inherit that flaw, simply because most people always just let the defaults remain unless they really have to read up on what something does and then modify them. If things work, things will just remain. So when we shipped libcurl 7.10 on the first of October that year, libcurl started verifying server certs by default.

Fast forward about ten years.

Surely SSL clients everywhere now do the right thing?

One day a couple of months ago, I was referred to this bug report for the pyssl module in Python, which identifies that it doesn’t verify server certs by default! The default SSL handler in Python doesn’t verify the certificate properly. This makes all Python programs that use it without special attention vulnerable to man-in-the-middle attacks.

So let’s look at the state of another popular language: PHP. A plain standard PHP program opens an ssl:// or tls:// stream. Unless the author of said program knows and understands these things, it too runs without verifying server certs. If a program instead decides to use the PHP/CURL binding for HTTPS or similar, it will use libcurl’s default, which verifies server certs (as I explained above).
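
To illustrate the libcurl side of this: the verification options have defaulted to being enabled since 7.10, so an application only needs to touch them to point out a particular CA bundle or – unwisely – to switch verification off. A minimal sketch, where the URL and the CA bundle path are just placeholders:

  #include <curl/curl.h>

  int main(void)
  {
    CURL *curl = curl_easy_init();
    if(curl) {
      curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/");

      /* already the libcurl defaults since 7.10, set here only to make the
         verification explicit */
      curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 1L);
      curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 2L);

      /* optionally point at a CA bundle in PEM format, for example the
         converted Firefox collection mentioned below (placeholder path) */
      curl_easy_setopt(curl, CURLOPT_CAINFO, "/etc/ssl/certs/cacert.pem");

      curl_easy_perform(curl);
      curl_easy_cleanup(curl);
    }
    return 0;
  }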

But not everything is gloomy. Some parts of our community have decided to do the right thing:

I was told (and shown proof) that Ruby now does the right thing, but I don’t know how recent that change is and thus how many older Ruby programs still suffer.

The same problem existed with Perl’s major HTTPS-using module, LWP, for a very long time. The Perl camp, however, already modified LWP to do verification by default with libwww-perl 6.00, released in March 2011.

Side-note: in the curl project we make it easy for everyone on the Internet to use Firefox’s excellent CA cert bundle to verify server certs by providing the Firefox CA cert collection converted to PEM – the preferred format for OpenSSL, GnuTLS and others.

Conclusion:

Even today, lots and lots of applications and scripts will remain insecure – even though their authors probably think they’re fairly safe when they switch to an HTTPS or SSL-using protocol – and might be subject to man-in-the-middle attacks without even being able to spot them. I think it is pretty sad.

this vs that and ssh through proxy

Taken from the web stats for daniel.haxx.se during September 2012. The top-10 search phrases used to end up on a page on this site:

  1. ssh proxy (198)
  2. curl vs wget (145)
  3. ftp vs http (92)
  4. wget vs curl (91)
  5. ssh through proxy (72)
  6. http vs ftp (67)
  7. curl wget (55)
  8. wget curl (53)
  9. http ftp (46)
  10. difference between ftp and http (45)

The top-3 most visited pages on my site during the same month were:

  1. SSH Through or Over Proxy (viewed 4800 times)
  2. curl vs Wget (viewed 3000 times)
  3. FTP vs HTTP (viewed 2300 times)

I guess this tells me something. I’m not sure what…

Three years of Haxx

On October first, another full year of work at Haxx had passed since I last summed up the past year (see my previous posts about Haxx’s first year and second year). Three years working for Haxx full-time, and it has been another great year with lots of fun, challenges and us enjoying being independent.

During this year I ended my previous engagement with that large chip company and got a new assignment for the same customer both Björn and Linus were working for at the time. It has been a big adventure for me as I dove straight into unknown territory, and I’ve spent my work days since then as a product manager, making an embedded Linux distribution. In this role I’ve travelled to the US, China and South Korea during the year, and I’m serving as a member of an advisory board in a related organization on behalf of my customer! I recently agreed to extend this contract to at least April 2013. Partly due to this new assignment I’ve not worked very much on foss-sthlm activities recently, but after the summer I’ve really made an effort to get this back up to speed.

Later during the year, Linus changed assignment to a new customer when we signed a sort of partnership contract with a leading global embedded software company, and he has since done a whole series of little projects for them. After the summer Linus has grabbed a couple of curl-related projects, partly still in progress.

Björn stuck around at the same customer during the entire year, and he’s been working as an engineer and developer in the team that actually makes the product I am a manager for.

This year we made more Haxx merchandise. Towels, stickers and jackets have now been sent out into the world to make our name more visible in a few weird corners of the universe.

We visited FSCONS 2011 and FOSDEM 2012, two really nice conferences for FOSS fans like us, and we got to meet a lot of friends and like-minded people there.

We continue to see a demand on the market for highly skilled embedded developers, including for embedded Linux and open source related activities. We wouldn’t mind extending our merry team, so we decided to document a list of requirements for what it takes to get hired by us. So far not a single person has applied…

Snaxx 27

Going strong after 12 years in the making. For the 27th time we’re gathering friends in the Stockholm, Sweden area who are interested in technology, open source, beers, slightly inaccurate Monty Python quotes, reverse engineering electronics and similar very important topics. We might also have a beer or two and talk rubbish.

On October 31st 2012 we invite each and every one of our tech-oriented friends to visit

Snaxx-27

We figured the 27th time would be the perfect time to do something new, so we now host the information on the fine snaxx.se domain.

Introducing curl_multi_wait

Facebook contributes a fix to libcurl’s multi interface to overcome the problem with more than 1024 file descriptors.

When we introduced the multi interface to libcurl about (what feels like) one hundred years ago, we kept things simple in some ways. One way it shows: an application that wants to do many transfers in parallel asks libcurl to do them, and then it extracts the set of file descriptors (sockets!) to wait for from libcurl (using curl_multi_fdset) as plain fd_sets. fd_set is the variable type made for select(). This API choice pretty much forced applications to use select(), and select() has its fair share of problems, where possibly the biggest one is that it has problems with file descriptors > 1024.
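
For reference, a stripped-down sketch of what that classic curl_multi_fdset() + select() driving loop tends to look like (no error checking, easy handles assumed to have been added elsewhere):

  #include <curl/curl.h>
  #include <sys/select.h>

  static void drive_transfers(CURLM *multi)
  {
    int still_running = 0;

    curl_multi_perform(multi, &still_running);
    while(still_running) {
      fd_set rd, wr, ex;
      int maxfd = -1;
      long timeout_ms = -1;
      struct timeval tv = {1, 0}; /* fall back to one second */

      FD_ZERO(&rd);
      FD_ZERO(&wr);
      FD_ZERO(&ex);

      curl_multi_fdset(multi, &rd, &wr, &ex, &maxfd);
      curl_multi_timeout(multi, &timeout_ms);
      if(timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
      }

      /* the problematic part: select() cannot deal with descriptors >= FD_SETSIZE */
      select(maxfd + 1, &rd, &wr, &ex, &tv);

      curl_multi_perform(multi, &still_running);
    }
  }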

Later on we introduced an enhanced version of the multi interface for libcurl that allows an application to use whatever method it pleases. I tend to refer to that variation as the multi_socket API after its main function curl_multi_socket_action. That’s the high performance, event-driven API.

As you may be aware, event-driven code makes things a bit more complicated at times, so many people still prefer to use the older and simpler multi interface and thus remained forced to use select(). But now that era has ended. Now…

curl_multi_wait() is introduced!

This poll(3)-like function basically works as a replacement for curl_multi_fdset() + select(). Starting in libcurl 7.28.0 (strictly speaking in commit de24d7bd4c03ea3), this is a function that any application can use for this purpose, and thus avoid the problem with many file descriptors.

This new function doesn’t use any struct from the “real” poll() or associated headers to make sure that it works even for systems without a real poll() implementation. It instead uses private curl versions of both the struct and the defines used. An application can of course also tell curl_multi_wait to wait for a set of private file descriptors, just like poll() or select().
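
The same driving loop rewritten to use curl_multi_wait() becomes shorter and has no fd_set anywhere – again just a minimal sketch without error checking:

  #include <curl/curl.h>

  static void drive_transfers(CURLM *multi)
  {
    int still_running = 0;

    curl_multi_perform(multi, &still_running);
    while(still_running) {
      int numfds = 0;

      /* wait at most 1000 ms for activity on any of libcurl's sockets; an
         application's own extra descriptors could be passed in the second
         and third arguments (here left empty) */
      curl_multi_wait(multi, NULL, 0, 1000, &numfds);

      curl_multi_perform(multi, &still_running);
    }
  }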

The patch set that brought this function was provided by Sara Golemon, a friend from a related project.

ptest because “make test” is insufficient

Thanks to autoconf and automake we have an established, more or less standardized way to build and install tools, libraries and other software. We build them with ‘make’ and we install them with ‘make install‘. This works great, and it works equally fine even when we build stuff cross-compiled.

For testing, however, the established concept and procedure is not as good. For testing we have ‘make test’ or ‘make check’, which typically first builds whatever needs to be built for the tests to run and then runs all the tests.

This is not good enough

Why? Because in lots of use-cases we build software using a cross-compiler on a build system that can’t run the executables. Therefore we need to first build the tests, then install the tests (somewhere that is reachable from the target system) and finally execute them. These steps need to be possible to run independently since at least the building and installing will sometimes happen on a different host than the execution of the tests.

Introducing ptest

Within the Yocto project, Björn Stenberg has pushed for ptest to be the basis of this new reform and concept. The responses he’s gotten so far have been positive, and there’s a pending updated patch to be posted to the upstream oe-core list soon.

The work does not end there

Even if or when this gets incorporated into OpenEmbedded and Yocto – and I really think it is a matter of when, since I believe we can work out all the flaws and quirks until virtually everyone involved is happy – the bulk of the changes really should be done upstream, in hundreds and thousands of open source packages. We (as upstream open source projects) need to start doing testing in at least two different steps: one step that builds everything that needs to be built for the tests, and a second step that runs the suite. In a cross-compiled scenario the two steps could then get executed first on the host system and then on the target system.
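
To make the split concrete, here is roughly how it could look from the outside – the make target names here are made up for illustration, not something any project ships today:

  # on the (cross-compiling) build host: build the tests but don't run them
  make buildtest
  # stage the built tests somewhere the target system can reach
  make install-tests DESTDIR=/srv/nfs/target-root

  # later, on the target system: only execute the already-built suite
  make runtest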

I expect that this will mean a whole bunch of patches and scripts having to be maintained within OpenEmbedded for a while, while we try to get things merged into upstream projects, and I also foresee that a certain percentage of all projects just won’t accept this new approach and will reject all patches in this vein.

Output format

I think the most controversial part of these suggested “universal” changes is the common test suite output format. The common format is of course required so that we can “supervise” the output and results from any package without having to know any specifics.

While the ptest output format follows the automake test output syntax, I expect many projects that have selected a particular output format would rather stick with that. Hopefully we can then make such projects introduce a separate make target or option that runs the test suite with the standard output format.

One little step forward

Building full-fledged, cross-compiled Linux distributions that are completely tested on target will remain hard work for a while longer. But we are improving things, one step at a time.

Of course, the name ‘ptest’ is what the system is currently called by Björn within the Yocto/OE environment. It is not supposed to be a catchy name for this idea outside of there. The ‘P’ refers to package, as opposed to for example system test, and serves to make it less generic than simply ‘test’.

Screen scraping expert witness

This is a slightly edited version of a genuine email I received in May 2012:

Dear Mr. Stenberg –

I recently came across the text you co-authored with Michael Schrenk, Webbots, Spiders, and Screen Scrapers, and was wondering if you might be interested in being a paid expert witness in a lawsuit we’re handling.

One of the major claims in the suit is unauthorized computer access in the form of a massive, multi-year campaign of screen scraping, and we’re looking for a qualified expert who can make the activity make sense to a jury (in the unlikely event that this matter reaches trial; fewer than 2% of cases do, in federal court).

We’re in Los Angeles, California, as is the case (and naturally would cover travel expenses, an hourly or per diem expert witness fee, etc). If you’re interested (or even if you’re not), please let me know? You can reach me via email or at (xyz) xyz-xyzx.

Many thanks,
[withheld]

Link to the book.

I responded to this mail saying that I’d rather not due to the distance and travel it’d require, but I never heard back from them again so I have no idea whatever happened in this case or who got to be the expert in the end…

Fixed name to dynamic IP with CNAME

Notice: this is neither advanced nor secret trickery. It is just something I’ve found that even tech-savvy people around me haven’t done, so it’s worth highlighting.

When I upgraded to fiber from ADSL, I had to give up the fixed IPv4 address I’d been using for around 10 years and switch to a dynamic DNS service.

In this situation I see lots of people use dyndns.org and similar services that dynamically assign your current IP to a fixed host name, so that you and your computer-illiterate friends can still reach your home site.

I suggest a minor variation of this that still avoids having to run your own dyndns server on your server. It only requires that you already admin and control a DNS domain, but who doesn’t these days?

  1. Get that dyndns host name at the free provider that makes the name hold your current IP. Let’s call it home.dyndns.org.
  2. In your own DNS zone for your domain (example.com), add an entry for your host ‘home.example.com‘ as a CNAME pointing over to home.dyndns.org (see the zone file sketch right after this list).
  3. Now you can ping or ssh or whatever to ‘home.example.com‘ and it will keep resolving to your home IP even when that IP changes.
  4. Smile, and keep using that host name in your own domain for your dynamic IP.
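
The CNAME itself is a one-liner in the zone. A sketch in ordinary zone file syntax, using the placeholder names from the steps above:

  ; in the example.com zone
  home    IN  CNAME   home.dyndns.org.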
