Category Archives: Technology

Really everything related to technology

curl 7.49.0 goodies coming

Here’s a closer look at three new features that we’re shipping in curl and libcurl 7.49.0, to be released on May 18th 2016.

connect to this instead

If you’re one of the users who thought --resolve and doing Host: header tricks with --header weren’t good enough, you’ll appreciate that we’re adding yet another option for you to fiddle with the connection procedure. Another “Swiss army knife style” option for you who know what you’re doing.

With --connect-to you basically provide an internal alias for a certain name + port to instead internally use another name + port to connect to.

Instead of connecting to HOST1:PORT1, connect to HOST2:PORT2

It is very similar to --resolve which is a way to say: when connecting to HOST1:PORT1 use this ADDR2:PORT2. --resolve effectively prepopulates the internal DNS cache and makes curl completely avoid the DNS lookup and instead feeds it with the IP address you’d like it to use.

--connect-to doesn’t avoid the DNS lookup, but it will make sure that a different host name and destination port pair is used than what was found in the URL. A typical use case for this would be to make sure that your curl request asks a specific server out of several in a pool of many, where each has a unique name but you normally reach them with a single URL who’s host name is otherwise load balanced.

--connect-to can be specified multiple times to add mappings for multiple names, so that even following HTTP redirects to other host names etc can be handled. You don’t even necessarily have to redirect the first used host name.

The libcurl option name for for this feature is CURLOPT_CONNECT_TO.

Michael Kaufmann brought this feature.

http2 prior knowledge

In our ongoing quest to provide more and better HTTP/2 support in a world that is slowly but steadily doing more and more transfers over the new version of the protocol, curl now offers --http2-prior-knowledge.

As the name might hint, this is a way to tell curl that you have “prior knowledge” that the URL you specifies goes to a host that you know supports HTTP/2. The term prior knowledge is in fact used in the HTTP/2 spec (RFC 7540) for this scenario.

Normally when given a HTTP:// or a HTTPS:// URL, there will be no assumption that it supports HTTP/2 but curl when then try to upgrade that from version HTTP/1. The command line tool tries to upgrade all HTTPS:// URLs by default even, and libcurl can be told to do so.

libcurl wise, you ask for a prior knowledge use by setting CURLOPT_HTTP_VERSION to CURL_HTTP_VERSION_2_PRIOR_KNOWLEDGE.

Asking for http2 prior knowledge when the server does in fact not support HTTP/2 will give you an error back.

Diego Bes brought this feature.

TCP Fast Open

TCP Fast Open is documented in RFC 7413 and is basically a way to pass on data to the remote machine earlier in the TCP handshake – already in the SYN and SYN-ACK packets. This of course as a means to get data over faster and reduce latency.

The --tcp-fastopen option is supported on Linux and OS X only for now.

This is an idea and technique that has been around for a while and it is slowly getting implemented and supported by servers. There have been some reports of problems in the wild when “middle boxes” that fiddle with TCP traffic see these packets, that sometimes result in breakage. So this option is opt-in to avoid the risk that it causes problems to users.

A typical real-world case where you would use this option is when  sending an HTTP POST to a site you don’t have a connection already established to. Just note that TFO relies on the client having had contact established with the server before and having a special TFO “cookie” stored and non-expired.

TCP Fast Open is so far only used for clear-text TCP protocols in curl. These days more and more protocols switch over to their TLS counterparts (and there’s room for future improvements to add the initial TLS handshake parts with TFO). A related option to speed up TLS handshakes is --false-start (supported with the NSS or the secure transport backends).

With libcurl, you enable TCP Fast Open with CURLOPT_TCP_FASTOPEN.

Alessandro Ghedini brought this feature.

Absorbing 1,000 emails per day

Some people say email is dead. Some people say there are “email killers” and bring up a bunch of chat and instant messaging services. I think those people communicate far too little to understand how email can scale.

I receive up to around 1,000 emails per day. I average on a little less but I do have spikes way above.

Why do I get a thousand emails?

Primarily because I participate on a lot of mailing lists. I run a handful of open source projects myself, each with at least one list. I follow a bunch more projects; more mailing lists. We have a whole set of mailing lists at work (Mozilla) and I participate and follow several groups in the IETF. Lists and lists. I discuss things with friends on a few private mailing lists. I get notifications from services about things that happen (commits, bugs submitted, builds that break, things that need to get looked at). Mails, mails and mails.

Don’t get me wrong. I prefer email to web forums and stuff because email allows me to participate in literally hundreds of communities from a single spot in an asynchronous manner. That’s a good thing. I would not be able to do the same thing if I had to use one of those “email killers” or web forums.

Unwanted email

I unsubscribe from lists that I grow tired from. I stamp down on spam really hard and I run aggressive filters and blacklists that actually make me receive rather few spam emails these days, percentage wise. There are nowadays about 3,000 emails per month addressed to me that my mail server accepts that are then classified as spam by spamassassin. I used to receive a lot more before we started using better blacklists. (During some periods in the past I received well over a thousand spam emails per day.) Only 2-3 emails per day out of those spam emails fail to get marked as spam correctly and subsequently show up in my inbox.

Flood management

My solution to handling this steady high paced stream of incoming data is prioritization and putting things in different bins. Different inboxes.

  1. Filter incoming email. Save the email into its corresponding mailbox. At this very moment, I have about 30 named inboxes that I read. I read them in order, top to bottom as they’re sorted in roughly importance order (to me).
  2. Mails that don’t match an existing mailing list or topic that get stored into the 28 “topic boxes” run into another check: is the sender a known “friend” ? That’s a loose term I use, but basically means that the mail is from an email address that I have had conversations with before or that I know or trust etc. Mails from “friends” get the honor of getting put in mailbox 0. The primary one. If the mail comes from someone not listed as friend, it’ll end up in my “suspect” mailbox. That’s mailbox 1.
  3. Some of the emails get the honor of getting forwarded to a cloud email service for which I have an app in my phone so that I can get a sense of important mail that arrive. But I basically never respond to email using my phone or using a web interface.
  4. I also use the “spam level” in spams to save them in different spam boxes. The mailbox receiving the highest spam level emails is just erased at random intervals without ever being read (unless I’m tracking down a problem or something) and the “normal” spam mailbox I only check every once in a while just to make sure my filters are not hiding real mails in there.

Reading

I monitor my incoming mails pretty frequently all through the day – every day. My wife calls me obsessed and maybe I am. But I find it much easier to handle the emails a little at a time rather than to wait and have it pile up to huge lumps to deal with.

I receive mail at my own server and I read/write my email using Alpine, a text based mail client that really excels at allowing me to plow through vast amounts of email in a short time – something I can’t say that any UI or web based mail client I’ve tried has managed to do at a similar degree.

A snapshot from my mailbox from a while ago looked like this, with names and some topics blurred out. This is ‘INBOX’, which is the main and highest prioritized one for me.

alpine screenshot

I have my mail client to automatically go to the next inbox when I’m done reading this one. That makes me read them in prio order. I start with the INBOX one where supposedly the most important email arrives, then I check the “suspect” one and then I go down the topic inboxes one by one (my mail client moves on to the next one automatically). Until either I get overwhelmed and just return to the main box for now or I finish them all up.

I tend to try to deal with mails immediately, or I mark them as ‘important’ and store them in the main mailbox so that I can find them again easily and quickly.

I try to only keep mails around in my mailbox that concern ongoing topics, discussions or current matters of concern. Everything else should get stored away. It is hard work to maintain the number of emails there at a low number. As you all know.

Writing email

I averaged at less than 200 emails written per month during 2015. That’s 6-7 per day.

That makes over 150 received emails for every email sent.

fcurl is fread and friends for URLs

This whole family of functions, fopen, fread, fwrite, fgets, fclose and more are defined in the C standard since C89. You can’t really call yourself a C programmer without knowing them and probably even using them in at least a few places.

The charm with these is that they’re standard, they’re easy to use and they’re available everywhere where there’s a C compiler.

A basic example that just reads a file from disk and writes it to stdout could look like this:

FILE *file;

file = fopen("hello.txt", "r");
if(file) {
  char buffer [256];
  while(1) {
    size_t rc = fread(buffer, sizeof(buffer),
                1, file);
    if(rc > 0)
      fwrite(buffer, rc, 1, stdout);
    else
      break;
  }
  fclose(file);
}

Imagine you’d like to switch this example, or one of your actual real world programs that use the fopen() family of functions to read or write files, and instead read and write files from and to the Internet instead using your favorite Internet protocols. How would you do that without having to change your code a lot and do a major refactoring job?

Enter fcurl

I’ve started to work on a library that provides a look-alike API with matching functions and behaviors, but that allows fopen() to instead specify a URL instead of a file name. I call it fcurl. (Much inspired by the libcurl example fopen.c, which I wrote the first version of already back in 2002!)

It is of course open source and is powered by libcurl.

The project is in its early infancy. I think it would be interesting to try it out and I’ve mentioned the idea to a few people that have shown interest. I really can’t make this happen all on my own anyway so while I’ve created a first embryo, it will take some time before it gets truly useful. Help from others would be greatly appreciated of course.

Using this API, a version of the above example that instead reads data from a HTTPS site instead of a local file could look like:

FCURL *file;

file = fcurl_open("https://daniel.haxx.se/",
                  "r");
if(file) {
  char buffer [256];
  while(1) {
    size_t rc = fcurl_read(buffer,         
                           sizeof(buffer), 1, 
                           file);
    if(rc > 0)
      fwrite(buffer, rc, 1, stdout);
    else
      break;
  }
  fcurl_close(file);
}

And it could even actually also read a local file using the file:// sheme.

Drop-in replacement

The idea here is to make the alternative functions have new names but as far as possible accept the same input arguments, return the same return codes and so on.

If we do it right, you could possibly even convert an existing program with just a set of #defines at the top without even having to change the code!

Something like this:

#define FILE FCURL
#define fopen(x,y) fcurl_open(x, y)
#define fclose(x) fcurl_close(x)

I think it is worth considering a way to provide an official macro set like that for those who’d like to switch easily (?) and quickly.

Fun things to consider

1. for non-scheme input, use normal fopen?

An interesting take is probably to make fcurl_open() treat input specified without a “scheme://” to be a local file, and then passed to fopen() instead under the hood. That would then enable even more code to switch to fcurl since all the existing use cases with local file names would just continue to work.

2. LD_PRELOAD

An interesting area of deeper research around this could be to provide a way to LD_PRELOAD replacements for the functions so that not even any source code would need be changed and already built existing binaries could be given this functionality.

3. fopencookie

There’s also the GNU libc’s fopencookie concept to figure out if that is something for fcurl to support/use. BSD and OS X have something similar called funopen.

4. merge in official libcurl

If this turns out useful, appreciated and good. We could consider moving the API in under the curl project’s umbrella and possibly eventually even making it part of the actual libcurl. But hey, we’re far away from that and I’m not saying that is even the best idea…

Your input is valuable

Please file issues or pull-requests. Let’s see where we can take this!

HTTP/2 in April 2016

On April 12 I had the pleasure of doing another talk in the Google Tech Talk series arranged in the Google Stockholm offices. I had given it the title “HTTP/2 is upon us, and here’s what you need to know about it.” in the invitation.

The room seated 70 persons but we had the amazing amount of over 300 people in the waiting line who unfortunately didn’t manage to get a seat. To those, and to anyone else who cares, here’s the video recording of the event.

If you’ve seen me talk about HTTP/2 before, you might notice that I’ve refreshed the material somewhat since before.

POWERMASTR 10: KOM OK

My phone just lighted up. POWERMASTER 10 told me something. It said “POWERMASTR 10: KOM OK”.

Over the last fewSMS conversation screenshot months, I’ve received almost 30 weird text messages from a “POWERMASTER 10”, originating from a Swedish phone number in a number range reserved for “devices”. Yeps, I’m showing the actual number below in the screenshot because I think it doesn’t matter and if for the unlikely event that the owner of +467190005601245 would see this, he/she might want to change his/her alarm config.

Powermaster 10 is probably a house alarm control panel made by Visonic. It is also clearly localized and sends messages in Swedish.

As this habit has been going on for months already, one can only suspect that the user hasn’t really found the SMS feedback to be a really valuable feature. It also makes me wonder what the feedback it sends really means.

The upside of this story is that you seem to be a very happy person when you have one of these control panels, as this picture from their booklet shows. Alarm systems, control panels, text messages. Why wouldn’t you laugh?!

powermaster10-laughing

Summers are for HTTP

stockholm castle and ship
Stockholm City, as photographed by Michael Caven

In July 2015, 40-something HTTP implementers and experts of the world gathered in the city of Münster, Germany, to discuss nitty gritty details about the HTTP protocol during four intense days. Representatives for major browsers, other well used HTTP tools and the most popular HTTP servers were present. We discussed topics like how HTTP/2 had done so far, what we thought we should fix going forward and even some early blue sky talk about what people could potentially see being subjects to address in a future HTTP/3 protocol.

You can relive the 2015 version somewhat from my daily blog entries from then that include a bunch of details of what we discussed: day one, two, three and four.

http workshopThe HTTP Workshop was much appreciated by the attendees and it is now about to be repeated. In the summer of 2016, the HTTP Workshop is again taking place in Europe, but this time as a three-day event slightly further up north: in the capital of Sweden and my home town: Stockholm. During 25-27 July 2016, we intend to again dig in deep.

If you feel this is something for you, then please head over to the workshop site and submit your proposal and show your willingness to attend. This year, I’m also joining the Program Committee and I’ve signed up for arranging some of the local stuff required for this to work out logistically.

The HTTP Workshop 2015 was one of my favorite events of last year. I’m now eagerly looking forward to this year’s version. It’ll be great to meet you here!

Stockholm
The city of Stockholm in summer sunshine

HTTP redirects

I find that many web minded people working client-side or even server-side have neglected to learn the subtle details of the redirects of today. Here’s my attempt at writing another text about it that the ones who should read it still won’t.

Nothing here, go there!

The “redirect” is a fundamental part of the HTTP protocol. The concept was present and is documented already in the first spec (RFC 1945), published in 1996, and it has remained well used ever since.

A redirect is exactly what it sounds like. It is the sredirect-signerver sending back an instruction to the client – instead of giving back the contents the client wanted. The server basically says “go look over [here] instead for that thing you asked for“.

But not all redirects are alike. How permanent is the redirect? What request method should the client use in the next request?

All redirects also need to send back a Location: header with the new URI to ask for, which can be absolute or relative.

Permanent or Temporary

Is the redirect meant to last or just remain for now? If you want a GET to resource A permanently redirect users to resource B with another GET, send back a 301. It also means that the user-agent (browser) is meant to cache this and keep going to the new URI from now on when the original URI is requested.

The temporary alternative is 302. Right now the server wants the client to send a GET request to B, but it shouldn’t cache this but keep trying the original URI when directed to it.

Note that both 301 and 302 will make browsers do a GET in the next request, which possibly means changing method if it started with a POST (and only if POST). This changing of the HTTP method to GET for 301 and 302 responses is said to be “for historical reasons”, but that’s still what browsers do so most of the public web will behave this way.

In practice, the 303 code is very similar to 302. It will not be cached and it will make the client issue a GET in the next request. The differences between a 302 and 303 are subtle, but 303 seems to be more designed for an “indirect response” to the original request rather than just a redirect.

These three codes were the only redirect codes in the HTTP/1.0 spec.

GET or POST?

All three of these response codes, 301 and 302/303, will assume that the client sends a GET to get the new URI, even if the client might’ve sent a POST in the first request. This is very important, at least if you do something that doesn’t use GET.

If the server instead wants to redirect the client to a new URI and wants it to send the same method in the second request as it did in the first, like if it first sent POST it’d like it to send POST again in the next request, the server would use different response codes.

To tell the client “the URI you sent a POST to, is permanently redirected to B where you should instead send your POST now and in the future”, the server responds with a 308. And to complicate matters, the 308 code is only recently defined (the spec was published in June 2014) so older clients may not treat it correctly! If so, then the only response code left for you is…

The (older) response code to tell a client to send a POST also in the next request but temporarily is 307. This redirect will not be cached by the client though so it’ll again post to A if requested to. The 307 code was introduced in HTTP/1.1.

Oh, and redirects work the exact same way in HTTP/2 as they do in HTTP/1.1.

The helpful table version

Permanent Temporary
Switch to GET 301 302 and 303
Keep original method 308 307

It’s a gap!

Yes. The 304, 305, and 306 codes are not used for redirects at all.

What about other HTTP methods?

I decided to simplify the explanation above. In all places where it says POST above, you can replace it with any non-GET method. They’re just slightly less common on the browser centric web.

curl and redirects

I couldn’t write a text like this without spicing it up with some curl details!

First, curl and libcurl don’t follow redirects by default. You need to ask curl to do it with -L (or –location) or libcurl with CURLOPT_FOLLOWLOCATION.

It turns out that there are web services out there in the world that want a POST sent, are responding with HTTP redirects that use a 301, 302 or 303 response code and still want the HTTP client to send the next request as a POST. As explained above, browsers won’t do that and neither will curl – by default.

Since these setups exist, and they’re actually not terribly rare, curl offers options to alter its behavior.

You can tell curl to not change the non-GET request method to GET after a 30x response by using the dedicated options for that:
–post301, –post302 and –post303. If you are instead writing a libcurl based application, you control that behavior with the CURLOPT_POSTREDIR option.

Here’s how a simple HTTP/1.1 redirect can look like. Note the 301, this is “permanent”:
curl-shows-redirect

“Subject: Urgent Warning”

Back in December I got a desperate email from this person. A woman who said her Instagram had been hacked and since she found my contact info in the app she mailed me and asked for help. I of course replied and said that I have nothing to do with her being hacked but I also have nothing to do with Instagram other than that they use software I’ve written.

Today she writes back. Clearly not convinced I told the truth before, and now she strikes back with more “evidence” of my wrongdoings.

Dear Daniel,

I had emailed you a couple months ago about my “screen dumps” aka screenshots and asked for your help with restoring my Instagram account since it had been hacked, my photos changed, and your name was included in the coding. You claimed to have no involvement whatsoever in developing a third party app for Instagram and could not help me salvage my original Instagram photos, pre-hacked, despite Instagram serving as my Photography portfolio and my career is a Photographer.

Since you weren’t aware that your name was attached to Instagram related hacking code, I thought you might want to know, in case you weren’t already aware, that your name is also included in Spotify terms and conditions. I came across this information using my Spotify which has also been hacked into and would love your help hacking out of Spotify. Also, I have yet to figure out how to unhack the hackers from my Instagram so if you change your mind and want to restore my Instagram to its original form as well as help me secure my account from future privacy breaches, I’d be extremely grateful. As you know, changing my passwords did nothing to resolve the problem. Please keep in mind that Facebook owns Instagram and these are big companies that you likely don’t want to have a trail of evidence that you are a part of an Instagram and Spotify hacking ring. Also, Spotify is a major partner of Spotify so you are likely familiar with the coding for all of these illegally developed third party apps. I’d be grateful for your help fixing this error immediately.

Thank you,

[name redacted]

P.S. Please see attached screen dump for a screen shot of your contact info included in Spotify (or what more likely seems to be a hacked Spotify developed illegally by a third party).

Spotify credits screenshot

Here’s the Instagram screenshot she sent me in a previous email:

Instagram credits screenshot

I’ve tried to respond with calm and clear reasonable logic and technical details on why she’s seeing my name there. That clearly failed. What do I try next?

Tales from my inbox, part++

“Josh” sent me an email. Pardon the language here but I decided to show the mail body unaltered:

From: Josh Yanez <a gmail address>
Date: Wed, 6 Jan 2016 22:27:13 -0800
To: daniel
Subject: Hey fucker

I got all your fucking info either you turn yourself in or ill show it to the police. You think I'm playing try me I got all your stupid little coding too.

Sent from my iPhone

This generates so many questions

  1. I’ve had threats mailed to be before (even done over phone) so this is far from the first time. The few times I’ve bothered to actually try to understand what these people are hallucinating about, it usually turns out that they’ve discovered that someone has hacked them or targeted them in some sort of attack and curl was used and I am the main author so I’m the bad guy.
  2. He has all my “info” and my “stupid little coding too” ? What “coding” could that be? What is all my info?
  3. Is this just a spam somehow that wants me to reply? It is directed to me only and I’ve not heard of anyone else who got a mail similar to this.
  4. The lovely “Sent from my iPhone” signature is sort of hilarious too after such an offensive message.

Very aware this could just as well suck me into a deep and dark hole of sadness, I was just too curious to resist so I responded. Unfortunately I didn’t get anything further back so the story thus ends here, a bit abrupt. :-(