Tag Archives: HTTP2

HTTP/2 connection coalescing

Section 9.1.1 in RFC7540 explains how HTTP/2 clients can reuse connections. This is my lengthy way of explaining how this works in reality.

Many connections in HTTP/1

With HTTP/1.1, browsers are typically using 6 connections per origin (host name + port). They do this to overcome the problems in HTTP/1 and how it uses TCP – as each connection will do a fair amount of waiting. Plus each connection is slow at start and therefore limited to how much data you can get and send quickly, you multiply that data amount with each additional connection. This makes the browser get more data faster (than just using one connection).

6 connections

Add sharding

Web sites with many objects also regularly invent new host names to trigger browsers to use even more connections. A practice known as “sharding”. 6 connections for each name. So if you instead make your site use 4 host names you suddenly get 4 x 6 = 24 connections instead. Mostly all those host names resolve to the same IP address in the end anyway, or the same set of IP addresses. In reality, some sites use many more than just 4 host names.

24 connections

The sad reality is that a very large percentage of connections used for HTTP/1.1 are only ever used for a single HTTP request, and a very large share of the connections made for HTTP/1 are so short-lived they actually never leave the slow start period before they’re killed off again. Not really ideal.

One connection in HTTP/2

With the introduction of HTTP/2, the HTTP clients of the world are going toward using a single TCP connection for each origin. The idea being that one connection is better in packet loss scenarios, it makes priorities/dependencies work and reusing that single connections for many more requests will be a net gain. And as you remember, HTTP/2 allows many logical streams in parallel over that single connection so the single connection doesn’t limit what the browsers can ask for.

Unsharding

The sites that created all those additional host names to make the HTTP/1 browsers use many connections now work against the HTTP/2 browsers’ desire to decrease the number of connections to a single one. Sites don’t want to switch back to using a single host name because that would be a significant architectural change and there are still a fair number of HTTP/1-only browsers still in use.

Enter “connection coalescing”, or “unsharding” as we sometimes like to call it. You won’t find either term used in RFC7540, as it merely describes this concept in terms of connection reuse.

Connection coalescing means that the browser tries to determine which of the remote hosts that it can reach over the same TCP connection. The different browsers have slightly different heuristics here and some don’t do it at all, but let me try to explain how they work – as far as I know and at this point in time.

Coalescing by example

Let’s say that this cool imaginary site “example.com” has two name entries in DNS: A.example.com and B.example.com. When resolving those names over DNS, the client gets a list of IP address back for each name. A list that very well may contain a mix of IPv4 and IPv6 addresses. One list for each name.

You must also remember that HTTP/2 is also only ever used over HTTPS by browsers, so for each origin speaking HTTP/2 there’s also a corresponding server certificate with a list of names or a wildcard pattern for which that server is authorized to respond for.

In our example we start out by connecting the browser to A. Let’s say resolving A returns the IPs 192.168.0.1 and 192.168.0.2 from DNS, so the browser goes on and connects to the first of those addresses, the one ending with “1”. The browser gets the server cert back in the TLS handshake and as a result of that, it also gets a list of host names the server can deal with: A.example.com and B.example.com. (it could also be a wildcard like “*.example.com”)

If the browser then wants to connect to B, it’ll resolve that host name too to a list of IPs. Let’s say 192.168.0.2 and 192.168.0.3 here.

Host A: 192.168.0.1 and 192.168.0.2
Host B: 192.168.0.2 and 192.168.0.3

Now hold it. Here it comes.

The Firefox way

Host A has two addresses, host B has two addresses. The lists of addresses are not the same, but there is an overlap – both lists contain 192.168.0.2. And the host A has already stated that it is authoritative for B as well. In this situation, Firefox will not make a second connect to host B. It will reuse the connection to host A and ask for host B’s content over that single shared connection. This is the most aggressive coalescing method in use.

one connection

The Chrome way

Chrome features a slightly less aggressive coalescing. In the example above, when the browser has connected to 192.168.0.1 for the first host name, Chrome will require that the IPs for host B contains that specific IP for it to reuse that connection.  If the returned IPs for host B really are 192.168.0.2 and 192.168.0.3, it clearly doesn’t contain 192.168.0.1 and so Chrome will create a new connection to host B.

Chrome will reuse the connection to host A if resolving host B returns a list that contains the specific IP of the connection host A is already using.

The Edge and Safari ways

They don’t do coalescing at all, so each host name will get its own single connection. Better than the 6 connections from HTTP/1 but for very sharded sites that means a lot of connections even in the HTTP/2 case.

curl also doesn’t coalesce anything (yet).

Surprises and a way to mitigate them

Given some comments in the Firefox bugzilla, the aggressive coalescing sometimes causes some surprises. Especially when you have for example one IPv6-only host A and a second host B with both IPv4 and IPv4 addresses. Asking for data on host A can then still use IPv4 when it reuses a connection to B (assuming that host A covers host B in its cert).

In the rare case where a server gets a resource request for an authority (or scheme) it can’t serve, there’s a dedicated error code 421 in HTTP/2 that it can respond with and the browser can then  go back and retry that request on another connection.

Starts out with 6 anyway

Before the browser knows that the server speaks HTTP/2, it may fire up 6 connection attempts so that it is prepared to get the remote site at full speed. Once it figures out that it doesn’t need all those connections, it will kill off the unnecessary unused ones and over time trickle down to one. Of course, on subsequent connections to the same origin the client may have the version information cached so that it doesn’t have to start off presuming HTTP/1.

A third day of deep HTTP inspection

The workshop roomThis fine morning started off with some news: Patrick is now our brand new official co-chair of the IETF HTTPbis working group!

Subodh then sat down and took us off on a presentation that really triggered a long and lively discussion. “Retry safety extensions” was his name of it but it involved everything from what browsers and HTTP clients do for retrying with no response and went on to also include replaying problems for 0-RTT protocols such as TLS 1.3.

Julian did a short presentation on http headers and his draft for JSON in new headers and we quickly fell down a deep hole of discussions around various formats with ups and downs on them all. The general feeling seems to be that JSON will not be a good idea for headers in spite of a couple of good characteristics, partly because of its handling of duplicate field entries and how it handles or doesn’t handle numerical precision (ie you can send “100” as a monstrously large floating point number).

Mike did a presentation he called “H2 Regrets” in which he covered his work on a draft for support of client certs which was basically forbidden due to h2’s ban of TLS renegotiation, he brought up the idea of extended settings and discussed the lack of special handling dates in HTTP headers (why we send 29 bytes instead of 4). Shows there are improvements to be had in the future too!

Martin talked to us about Blind caching and how the concept of this works. Put very simply: it is a way to make it possible to offer cached content for clients using HTTPS, by storing the data in a 3rd host and pointing out that data to the client. There was a lengthy discussion around this and I think one of the outstanding questions is if this feature is really giving as much value to motivate the rather high cost in complexity…

The list of remaining Lightning Talks had grown to 10 talks and we fired them all off at a five minutes per topic pace. I brought up my intention and hope that we’ll do a QUIC library soon to experiment with. I personally particularly enjoyed EKR’s TLS 1.3 status summary. I heard appreciation from others and I agree with this that the idea to feature lightning talks was really good.

With this, the HTTP Workshop 2016 was officially ended. There will be a survey sent out about this edition and what people want to do for the next/future ones, and there will be some sort of  report posted about this event from the organizers, summarizing things.

Attendees numbers

http workshopThe companies with most attendees present here were: Mozilla 5, Google 4, Facebook, Akamai and Apple 3.

The attendees were from the following regions of the world: North America 19, Europe 15, Asia/pacific 6.

38 participants were male and 2 female.

23 of us were also at the 2015 workshop, 17 were newcomers.

15 people did lightning talks.

I believe 40 is about as many as you can put in a single room and still have discussions. Going larger will make it harder to make yourself heard as easily and would probably force us to have to switch to smaller groups more and thus not get this sort of great dynamic flow. I’m not saying that we can’t do this smaller or larger, just that it would have to make the event different.

Some final words

I had an awesome few days and I loved all of it. It was a pleasure organizing this and I’m happy that Stockholm showed its best face weather wise during these days. I was also happy to hear that so many people enjoyed their time here in Sweden. The hotel and its facilities, including food and coffee etc worked out smoothly I think with no complaints at all.

Hope to see again on the next HTTP Workshop!

Workshop day two

HTTP Workshop At 5pm we rounded off another fully featured day at the HTTP workshop. Here’s some of what we touched on today:

Moritz started the morning with an interesting presentation about experiments with running the exact same site and contents on h1 vs h2 over different kinds of networks, with different packet loss scenarios and with different ICWND set and more. Very interesting stuff. If he makes his presentation available at some point I’ll add a link to it.

I then got the honor to present the state of the TCP Tuning draft (which I’ve admittedly been neglecting a bit lately), the slides are here. I made it brief but I still got some feedback and in general this is a draft that people seem to agree is a good idea – keep sending me your feedback and help me improve it. I just need to pull myself together now and move it forward. I tried to be quick to leave over to…

Jana, who was back again to tell us about QUIC and the state of things in that area. His presentation apparently was a subset of slides he presented last week in the Berlin IETF. One interesting take-away for me, was that they’ve noticed that the amount of connections for which they detect UDP rate limiting on, has decreased with 2/3 during the last year!

Here’s my favorite image from his slide set. Apparently TCP/2 is not a name for QUIC that everybody appreciates! ;-)

call-it-tcp2-one-more-time

While I think the topic of QUIC piqued the interest of most people in the room and there were a lot of questions, thoughts and ideas around the topic we still managed to get the lunch break pretty much in time and we could run off and have another lovely buffet lunch. There’s certainly no risk for us loosing weight during this event…

After lunch we got ourselves a series of Lightning talks presented for us. Seven short talks on various subjects that people had signed up to do

One of the lightning talks that stuck with me was what I would call the idea about an extended Happy Eyeballs approach that I’d like to call Even Happier Eyeballs: make the client TCP connect to all IPs in a DNS response and race them against each other and use the one that responds with a SYN-ACK first. There was interest expressed in the room to get this concept tested out for real in at least one browser.

We then fell over into the area of HTTP/3 ideas and what the people in the room think we should be working on for that. It turned out that the list of stuff we created last year at the workshop was still actually a pretty good list and while we could massage that a bit, it is still mostly the same as before.

Anne presented fetch and how browsers use HTTP. Perhaps a bit surprising that soon brought us over into the subject of trailers, how to support that and voilá, in the end we possibly even agreed that we should perhaps consider handling them somehow in browsers and even for javascript APIs… ( nah, curl/libcurl doesn’t have any particular support for trailers, but will of course get that if we’ll actually see things out there start to use it for real)

I think we deserved a few beers after this day! The final workshop day is tomorrow.

A workshop Monday

http workshopI decided I’d show up a little early at the Sheraton as I’ve been handling the interactions with hotel locally here in Stockholm where the workshop will run for the coming three days. Things were on track, if we ignore how they got the wrong name of the workshop on the info screens in the lobby, instead saying “Haxx Ab”…

Mark welcomed us with a quick overview of what we’re here for and quick run-through of the rough planning for the days. Our schedule is deliberately loose and open to allow for changes and adaptations as we go along.

Patrick talked about the 1 1/2 years of HTTP/2 working in Firefox so far, and we discussed a lot around the numbers and telemetry. What do they mean and why do they look like this etc. HTTP/2 is now at 44% of all HTTPS requests and connections using HTTP/2 are used for more than 8 requests on median (compared to slightly over 1 in the HTTP/1 case). What’s almost not used at all? HTTP/2 server push, Alt-Svc and HTTP 308 responses. Patrick’s presentation triggered a lot of good discussions. His slides are here.

RTT distribution for Firefox running on desktop and mobile, from Patrick’s slide set:

rtt-dist

The lunch was lovely.

Vlad then continued to talk about experiences from implementing and providing server push at Cloudflare. It and the associated discussions helped emphasize that we need better help for users on how to use server push and there might be reasons for browsers to change how they are stored in the current “secondary cache”. Also, discussions around how to access pushed resources and get information about pushes from javascript were briefly touched on.

After a break with some sweets and coffee, Kazuho continued to describe cache digests and how this concept can help making servers do better or more accurate server pushes. Back to more discussions around push and what it actually solved, how much complexity it is worth and so on. I thought I could sense hesitation in the room on whether this is really something to proceed with.

We intend to have a set of lightning talks after lunch each day and we have already have twelve such suggested talks listed in the workshop wiki, but the discussions were so lively and extensive that we missed them today and we even had to postpone the last talk of today until tomorrow. I can already sense how these three days will not be enough for us to cover everything we have listed and planned…

We ended the evening with a great dinner sponsored by Mozilla. I’d say it was a great first day. I’m looking forward to day 2!

curl wants to QUIC

The interesting Google transfer protocol that is known as QUIC is being passed through the IETF grinding machines to hopefully end up with a proper “spec” that has been reviewed and agreed to by many peers and that will end up being a protocol that is thoroughly documented with a lot of protocol people’s consensus. Follow the IETF QUIC mailing list for all the action.

I’d like us to join the fun

Similarly to how we implemented HTTP/2 support early on for curl, I would like us to get “on the bandwagon” early for QUIC to be able to both aid the protocol development and serve as a testing tool for both the protocol and the server implementations but then also of course to get us a solid implementation for users who’d like a proper QUIC capable client for data transfers.

implementations

The current version (made entirely by Google and not the output of the work they’re now doing on it within the IETF) of the QUIC protocol is already being widely used as Chrome speaks it with Google’s services in preference to HTTP/2 and other protocol options. There exist only a few other implementations of QUIC outside of the official ones Google offers as open source. Caddy offers a separate server implementation for example.

the Google code base

For curl’s sake, it can’t use the Google code as a basis for a QUIC implementation since it is C++ and code used within the Chrome browser is really too entangled with the browser and its particular environment to become very good when converted into a library. There’s a libquic project doing exactly this.

for curl and others

The ideal way to implement QUIC for curl would be to create “nghttp2” alternative that does QUIC. An ngquic if you will! A library that handles the low level protocol fiddling, the binary framing etc. Done that way, a QUIC library could be used by more projects who’d like QUIC support and all people who’d like to see this protocol supported in those tools and libraries could join in and make it happen. Such a library would need to be written in plain C and be suitably licensed for it to be really interesting for curl use.

a needed QUIC library

I’m hoping my post here will inspire someone to get such a project going. I will not hesitate to join in and help it get somewhere! I haven’t started such a project myself because I think I already have enough projects on my plate so I fear I wouldn’t be a good leader or maintainer of a project like this. But of course, if nobody else will do it I will do it myself eventually. If I can think of a good name for it.

some wishes for such a library

  • Written in C, to offer the same level of portability as curl itself and to allow it to get used as extensions by other languages etc
  • FOSS-licensed suitably
  • It should preferably not “own” the socket but also work in-memory and to allow applications to do many parallel connections etc.
  • Non-blocking. It shouldn’t wait for things on its own but let the application do that.
  • Should probably offer both client and server functionality for maximum use.
  • What else?

No websockets over HTTP/2

There is no websockets for HTTP/2.

By this, I mean that there’s no way to negotiate or upgrade a connection to websockets over HTTP/2 like there is for HTTP/1.1 as expressed by RFC 6455. That spec details how a client can use Upgrade: in a HTTP/1.1 request to switch that connection into a websockets connection.

Note that websockets is not part of the HTTP/1 spec, it just uses a HTTP/1 protocol detail to switch an HTTP connection into a websockets connection. Websockets over HTTP/2 would similarly not be a part of the HTTP/2 specification but would be separate.

(As a side-note, that Upgrade: mechanism is the same mechanism a HTTP/1.1 connection can get upgraded to HTTP/2 if the server supports it – when not using HTTPS.)

chinese-socket

Draft

There’s was once a draft submitted that describes how websockets over HTTP/2 could’ve been done. It didn’t get any particular interest in the IETF HTTP working group back then and as far as I’ve seen, there has been very little general interest in any group to pick up this dropped ball and continue running. It just didn’t go any further.

This is important: the lack of websockets over HTTP/2 is because nobody has produced a spec (and implementations) to do websockets over HTTP/2. Those things don’t happen by themselves, they actually require a bunch of people and implementers to believe in the cause and work for it.

Websockets over HTTP/2 could of course have the benefit that it would only be one stream over the connection that could serve regular non-websockets traffic at the same time in many other streams, while websockets upgraded on a HTTP/1 connection uses the entire connection exclusively.

Instead

So what do users do instead of using websockets over HTTP/2? Well, there are several options. You probably either stick to HTTP/2, upgrade from HTTP/1, use Web push or go the WebRTC route!

If you really need to stick to websockets, then you simply have to upgrade to that from a HTTP/1 connection – just like before. Most people I’ve talked to that are stuck really hard on using websockets are app developers that basically only use a single connection anyway so doing that HTTP/1 or HTTP/2 makes no meaningful difference.

Sticking to HTTP/2 pretty much allows you to go back and use the long-polling tricks of the past before websockets was created. They were once rather bad since they would waste a connection and be error-prone since you’d have a connection that would sit idle most of the time. Doing this over HTTP/2 is much less of a problem since it’ll just be a single stream that won’t be used that much so it isn’t that much of a waste. Plus, the connection may very well be used by other streams so it will be less of a problem with idle connections getting killed by NATs or firewalls.

The Web Push API was brought by W3C during 2015 and is in many ways a more “webby” way of doing push than the much more manual and “raw” method that websockets is. If you use websockets mostly for push notifications, then this might be a more convenient choice.

Also introduced after websockets, is WebRTC. This is a technique introduced for communication between browsers, but it certainly provides an alternative to some of the things websockets were once used for.

Future

Websockets over HTTP/2 could still be done. The fact that it isn’t done just shows that there isn’t enough interest.

Non-TLS

Recall how browsers only speak HTTP/2 over TLS, while websockets can also be done over plain TCP. In fact, the only way to upgrade a HTTP connection to websockets is using the HTTP/1 Upgrade: header trick, and not the ALPN method for TLS that HTTP/2 uses to reduce the number of round-trips required.

If anyone would introduce websockets over HTTP/2, they would then probably only be possible to be made over TLS from within browsers.

curl 7.49.0 goodies coming

Here’s a closer look at three new features that we’re shipping in curl and libcurl 7.49.0, to be released on May 18th 2016.

connect to this instead

If you’re one of the users who thought --resolve and doing Host: header tricks with --header weren’t good enough, you’ll appreciate that we’re adding yet another option for you to fiddle with the connection procedure. Another “Swiss army knife style” option for you who know what you’re doing.

With --connect-to you basically provide an internal alias for a certain name + port to instead internally use another name + port to connect to.

Instead of connecting to HOST1:PORT1, connect to HOST2:PORT2

It is very similar to --resolve which is a way to say: when connecting to HOST1:PORT1 use this ADDR2:PORT2. --resolve effectively prepopulates the internal DNS cache and makes curl completely avoid the DNS lookup and instead feeds it with the IP address you’d like it to use.

--connect-to doesn’t avoid the DNS lookup, but it will make sure that a different host name and destination port pair is used than what was found in the URL. A typical use case for this would be to make sure that your curl request asks a specific server out of several in a pool of many, where each has a unique name but you normally reach them with a single URL who’s host name is otherwise load balanced.

--connect-to can be specified multiple times to add mappings for multiple names, so that even following HTTP redirects to other host names etc can be handled. You don’t even necessarily have to redirect the first used host name.

The libcurl option name for for this feature is CURLOPT_CONNECT_TO.

Michael Kaufmann brought this feature.

http2 prior knowledge

In our ongoing quest to provide more and better HTTP/2 support in a world that is slowly but steadily doing more and more transfers over the new version of the protocol, curl now offers --http2-prior-knowledge.

As the name might hint, this is a way to tell curl that you have “prior knowledge” that the URL you specifies goes to a host that you know supports HTTP/2. The term prior knowledge is in fact used in the HTTP/2 spec (RFC 7540) for this scenario.

Normally when given a HTTP:// or a HTTPS:// URL, there will be no assumption that it supports HTTP/2 but curl when then try to upgrade that from version HTTP/1. The command line tool tries to upgrade all HTTPS:// URLs by default even, and libcurl can be told to do so.

libcurl wise, you ask for a prior knowledge use by setting CURLOPT_HTTP_VERSION to CURL_HTTP_VERSION_2_PRIOR_KNOWLEDGE.

Asking for http2 prior knowledge when the server does in fact not support HTTP/2 will give you an error back.

Diego Bes brought this feature.

TCP Fast Open

TCP Fast Open is documented in RFC 7413 and is basically a way to pass on data to the remote machine earlier in the TCP handshake – already in the SYN and SYN-ACK packets. This of course as a means to get data over faster and reduce latency.

The --tcp-fastopen option is supported on Linux and OS X only for now.

This is an idea and technique that has been around for a while and it is slowly getting implemented and supported by servers. There have been some reports of problems in the wild when “middle boxes” that fiddle with TCP traffic see these packets, that sometimes result in breakage. So this option is opt-in to avoid the risk that it causes problems to users.

A typical real-world case where you would use this option is when  sending an HTTP POST to a site you don’t have a connection already established to. Just note that TFO relies on the client having had contact established with the server before and having a special TFO “cookie” stored and non-expired.

TCP Fast Open is so far only used for clear-text TCP protocols in curl. These days more and more protocols switch over to their TLS counterparts (and there’s room for future improvements to add the initial TLS handshake parts with TFO). A related option to speed up TLS handshakes is --false-start (supported with the NSS or the secure transport backends).

With libcurl, you enable TCP Fast Open with CURLOPT_TCP_FASTOPEN.

Alessandro Ghedini brought this feature.

HTTP/2 in April 2016

On April 12 I had the pleasure of doing another talk in the Google Tech Talk series arranged in the Google Stockholm offices. I had given it the title “HTTP/2 is upon us, and here’s what you need to know about it.” in the invitation.

The room seated 70 persons but we had the amazing amount of over 300 people in the waiting line who unfortunately didn’t manage to get a seat. To those, and to anyone else who cares, here’s the video recording of the event.

If you’ve seen me talk about HTTP/2 before, you might notice that I’ve refreshed the material somewhat since before.

Summers are for HTTP

stockholm castle and ship
Stockholm City, as photographed by Michael Caven

In July 2015, 40-something HTTP implementers and experts of the world gathered in the city of Münster, Germany, to discuss nitty gritty details about the HTTP protocol during four intense days. Representatives for major browsers, other well used HTTP tools and the most popular HTTP servers were present. We discussed topics like how HTTP/2 had done so far, what we thought we should fix going forward and even some early blue sky talk about what people could potentially see being subjects to address in a future HTTP/3 protocol.

You can relive the 2015 version somewhat from my daily blog entries from then that include a bunch of details of what we discussed: day one, two, three and four.

http workshopThe HTTP Workshop was much appreciated by the attendees and it is now about to be repeated. In the summer of 2016, the HTTP Workshop is again taking place in Europe, but this time as a three-day event slightly further up north: in the capital of Sweden and my home town: Stockholm. During 25-27 July 2016, we intend to again dig in deep.

If you feel this is something for you, then please head over to the workshop site and submit your proposal and show your willingness to attend. This year, I’m also joining the Program Committee and I’ve signed up for arranging some of the local stuff required for this to work out logistically.

The HTTP Workshop 2015 was one of my favorite events of last year. I’m now eagerly looking forward to this year’s version. It’ll be great to meet you here!

Stockholm
The city of Stockholm in summer sunshine