Tag Archives: HTTP2

Post FOSDEM 2017

I attended FOSDEM again in 2017 and it was as intense, chaotic and wonderful as ever. I met old friends, got new friends and I got to test a whole range of Belgian beers. Oh, and there was also a set of great open source related talks to enjoy!

On Saturday at 2pm I delivered my talk on curl in the main track in the almost frighteningly large room Janson. I estimate that it was almost half full, which would mean upwards 700 people in the audience. The talk itself went well. I got audible responses from the audience several times and I kept well within my given time with time over for questions. The trickiest problem was the audio from the people who asked questions because it wasn’t at all very easy to hear, while the audio is great for the audience and in the video recording. Slightly annoying because as everyone else heard, it made me appear half deaf. Oh well. I got great questions both then and from people approaching me after the talk. The questions and the feedback I get from a talk is really one of the things that makes me appreciate talking the most.

The video of the talk is available, and the slides can also be viewed.

So after I had spent some time discussing curl things and handing out many stickers after my talk, I managed to land in the cafeteria for a while until it was time for me to once again go and perform.

We’re usually a team of friends that hang out during FOSDEM and we all went over to the Mozilla room to be there perhaps 20 minutes before my talk was scheduled and wow, there was a huge crowd outside of that room already waiting by the time we arrived. When the doors then finally opened (about 10 minutes before my talk started), I had to zigzag my way through to get in, and there was a large amount of people who didn’t get in. None of my friends from the cafeteria made it in!

The Mozilla devroom had 363 seats, not a single one was unoccupied and there was people standing along the sides and the back wall. So, an estimated nearly 400 persons in that room saw me speak about HTTP/2 deployments numbers right now, how HTTP/2 doesn’t really work well under 2% packet loss situations and then a bit about how QUIC can solve some of that and what QUIC is and when we might see the first experiments coming with IETF-QUIC – which really isn’t the same as Google-QUIC was.

To be honest, it is hard to deliver a talk in twenty minutes and I  was only 30 seconds over my time. I got questions and after the talk I spent a long time talking with people about HTTP, HTTP/2, QUIC, curl and the future of Internet protocols and transports. Very interesting.

The video of my talk can be seen, and the slides are online too.

I’m not sure if I was just unusually unlucky in my choices, or if there really was more people this year, but I experienced that “FULL” sign more than usual this year.

I fully intend to return again next year. Who knows, maybe I’ll figure out something to talk about then too. See you there?

QUIC is h2 over UDP

The third day of the QUIC interim passed and now that meeting has ended. It continued to work very well to attend from remote and the group manged to plow through an extensive set of issues. A lot of consensus was achieved and I personally now have a much better feel for the protocol and many of its details thanks to the many discussions.

The drafts are still a bit too early for us to start discussing inter-op for real. But there were mentions and hopes expressed that maybe maybe we might start to see some of that by mid 2017. When we did HTTP/2, we had about 10 different implementations by the time draft-04 was out. I suspect we will see a smaller set for QUIC simply because of it being much more complex.

The next interim is planned to occur in the beginning of June in Europe.

There is an official QUIC logo being designed, but it is not done yet so you still need to imagine one placed here.

QUIC needs HTTP/2 needs HTTP/1

QUIC is primarily designed to send and receive HTTP/2 frames and entire streams over UDP (not only, but this is where the bulk of the work has been put in so far). Sure, TLS encrypted and everything, but my point here is that it is being designed to transfer HTTP/2 frames. You remember how HTTP/2 is “just a new framing” layer that changes how HTTP is sent over the wire, but when “decoded” again in the receiving end it is in most important aspects still HTTP/1 there. You have to implement most of a HTTP/1 stack in order to support HTTP/2. Now QUIC adds another layer to that. QUIC is a new way to send HTTP/2 frames over the network.

A QUIC stack needs to handle most aspects of HTTP/2!

Of course, there are notable differences and changes to some underlying principles that makes QUIC a bit different. It isn’t exactly HTTP/2 over secure UDP. Let me give you a few examples…

Streams are more independent

Packets sent over the wire with UDP are independent from each other to a very large degree. In order to avoid Head-of-Line blocking (HoL), packets that are lost and re-transmitted will only block the particular streams to which the lost packets belong. The other streams can keep flowing, unaware and uncaring.

Thanks to the nature of the Internet and how packets are handled, it is not unusual for network packets to arrive in a slightly different order than they were sent, even when they aren’t exactly “lost”.

So, streams in HTTP/2 were entirely synced and the order the sender of frames use, will be the exact same order in which the frames arrive in the other end. Packet loss or not.

In QUIC, individual frames and entire streams may arrive in the receiver in a different order than what was used in the sender.

Stream ID gaps means open

When receiving a QUIC packet, there’s basically no way to know if there are packets missing that were intended to arrive but got lost and haven’t yet been re-transmitted.

If a frame is received that uses the new stream ID N (a stream not previously seen), the receiver is then forced to assume that all the other streams ID from our previously highest ID to N are all just missing and will arrive soon. They are then presumed to exist!

In HTTP/2, we could handle gaps in stream IDs much differently because of TCP. Then a gap is known to be deliberate.

Some h2 frames are done by QUIC

Since QUIC is designed with streams, flow control and more and is used to send HTTP/2 frames over them, some of the h2 frames aren’t needed but are instead handled by the transport layer within QUIC and won’t show up in the HTTP/2 layer.

HPACK goes QPACK?

HPACK is the header compression system used in HTTP/2. Among other things it features a dictionary that you manipulate with instructions and then subsequent header frames can refer to those dictionary indexes instead of sending the full header. Header frame one says “insert my user-agent string” and then header frame two can refer back to the index in the dictionary for where that identical user-agent string is stored.

Due to the out of order streams in QUIC, this dictionary treatment is harder. The second header frame could arrive before the first, so if it would refer to an index set in the first header frame, it would have to block the entire stream until that first header arrives.

HPACK also has a concept of just adding things to the dictionary without specifying the index, and since both sides are in perfect sync it works just fine. In QUIC, if we want to maintain the independence of streams and avoid blocking to the highest degree, we need to instead specify exact indexes to use and not assume perfect sync.

This (and more) are reasons why QPACK is being suggested as a replacement for HPACK when HTTP/2 header frames are sent over QUIC.

My talks at FOSDEM 2017

I couldn’t even recall how many times I’ve done this already, but in 2017 I am once again showing up in the cold and grey city called Brussels and the lovely FOSDEM conference, to talk. (Yes, it is cold and grey every February, trust me.) So I had to go back and count, and it turns out 2017 will become my 8th straight visit to FOSDEM and I believe it is the 5th year I’ll present there.First, a reminder about what I talked about at FOSDEM 2016: An HTTP/2 update. There’s also a (rather low quality) video recording of the talk to see there.

I’m scheduled for two presentations in 2017, and this year I’m breaking new ground for myself as I’m doing one of them on the “main track” which is the (according to me) most prestigious track held in one of the biggest rooms – seating more than 1,400 persons.

You know what’s cool? Running on billions of devices

Room: Janson, time: Saturday 14:00

Thousands of contributors help building the curl software which runs on several billions of devices and are affecting every human in the connected world daily. How this came to happen, who contributes and how Daniel at the wheel keeps it all together. How a hacking ring is actually behind it all and who funds this entire operation.

So that was HTTP/2, what’s next?

Room: UD2.218A, time: Saturday 16:30

A shorter recap on what HTTP/2 brought that HTTP/1 couldn’t offer before we dig in and look at some numbers that show how HTTP/2 has improved (browser) networking and the web experience for people.

Still, there are scenarios where HTTP/1’s multiple connections win over HTTP/2 in performance tests. Why is that and what is being done about it? Wasn’t HTTP/2 supposed to be the silver bullet?

A closer look at QUIC, its promises to fix the areas where HTTP/2 didn’t deliver and a check on where it is today. Is QUIC perhaps actually HTTP/3 in everything but the name?

Depending on what exactly happens in this area over time until FOSDEM, I will spice it up with more details on how we work on these protocol things in Mozilla/Firefox.

This will become my 3rd year in a row that I talk in the Mozilla devroom to present the state of the HTTP protocol and web transport.

1,000,000 sites run HTTP/2

… out of the top ten million sites that is. So there’s at least that many, quite likely a few more.

This is according to w3techs who runs checks daily. Over the last few months, there have been about 50,000 new sites per month switching it on.

ht2-10-percent

It also shows that the HTTP/2 ratio has increased from a little over 1% deployment a year ago to the 10% today.

HTTP/2 gets more used the more  popular site it is. Among the top 1,000 sites on the web, more than 20% of them use HTTP/2. HTTP/2 also just recently (September 9) overcame SPDY among the top-1000 most popular sites.

h2-sep28

On September 7, Amazon announced their CloudFront service having enabled HTTP/2, which could explain an adoption boost over the last few days. New CloudFront users get it enabled by default but existing users actually need to go in and click a checkbox to make it happen.

As the web traffic of the world is severely skewed toward the top ones, we can be sure that a significantly larger share than 10% of the world’s HTTPS traffic is using version 2.

Recent usage stats in Firefox shows that HTTP/2 is used in half of all its HTTPS requests!

http2

HTTP/2 connection coalescing

Section 9.1.1 in RFC7540 explains how HTTP/2 clients can reuse connections. This is my lengthy way of explaining how this works in reality.

Many connections in HTTP/1

With HTTP/1.1, browsers are typically using 6 connections per origin (host name + port). They do this to overcome the problems in HTTP/1 and how it uses TCP – as each connection will do a fair amount of waiting. Plus each connection is slow at start and therefore limited to how much data you can get and send quickly, you multiply that data amount with each additional connection. This makes the browser get more data faster (than just using one connection).

6 connections

Add sharding

Web sites with many objects also regularly invent new host names to trigger browsers to use even more connections. A practice known as “sharding”. 6 connections for each name. So if you instead make your site use 4 host names you suddenly get 4 x 6 = 24 connections instead. Mostly all those host names resolve to the same IP address in the end anyway, or the same set of IP addresses. In reality, some sites use many more than just 4 host names.

24 connections

The sad reality is that a very large percentage of connections used for HTTP/1.1 are only ever used for a single HTTP request, and a very large share of the connections made for HTTP/1 are so short-lived they actually never leave the slow start period before they’re killed off again. Not really ideal.

One connection in HTTP/2

With the introduction of HTTP/2, the HTTP clients of the world are going toward using a single TCP connection for each origin. The idea being that one connection is better in packet loss scenarios, it makes priorities/dependencies work and reusing that single connections for many more requests will be a net gain. And as you remember, HTTP/2 allows many logical streams in parallel over that single connection so the single connection doesn’t limit what the browsers can ask for.

Unsharding

The sites that created all those additional host names to make the HTTP/1 browsers use many connections now work against the HTTP/2 browsers’ desire to decrease the number of connections to a single one. Sites don’t want to switch back to using a single host name because that would be a significant architectural change and there are still a fair number of HTTP/1-only browsers still in use.

Enter “connection coalescing”, or “unsharding” as we sometimes like to call it. You won’t find either term used in RFC7540, as it merely describes this concept in terms of connection reuse.

Connection coalescing means that the browser tries to determine which of the remote hosts that it can reach over the same TCP connection. The different browsers have slightly different heuristics here and some don’t do it at all, but let me try to explain how they work – as far as I know and at this point in time.

Coalescing by example

Let’s say that this cool imaginary site “example.com” has two name entries in DNS: A.example.com and B.example.com. When resolving those names over DNS, the client gets a list of IP address back for each name. A list that very well may contain a mix of IPv4 and IPv6 addresses. One list for each name.

You must also remember that HTTP/2 is also only ever used over HTTPS by browsers, so for each origin speaking HTTP/2 there’s also a corresponding server certificate with a list of names or a wildcard pattern for which that server is authorized to respond for.

In our example we start out by connecting the browser to A. Let’s say resolving A returns the IPs 192.168.0.1 and 192.168.0.2 from DNS, so the browser goes on and connects to the first of those addresses, the one ending with “1”. The browser gets the server cert back in the TLS handshake and as a result of that, it also gets a list of host names the server can deal with: A.example.com and B.example.com. (it could also be a wildcard like “*.example.com”)

If the browser then wants to connect to B, it’ll resolve that host name too to a list of IPs. Let’s say 192.168.0.2 and 192.168.0.3 here.

Host A: 192.168.0.1 and 192.168.0.2
Host B: 192.168.0.2 and 192.168.0.3

Now hold it. Here it comes.

The Firefox way

Host A has two addresses, host B has two addresses. The lists of addresses are not the same, but there is an overlap – both lists contain 192.168.0.2. And the host A has already stated that it is authoritative for B as well. In this situation, Firefox will not make a second connect to host B. It will reuse the connection to host A and ask for host B’s content over that single shared connection. This is the most aggressive coalescing method in use.

one connection

The Chrome way

Chrome features a slightly less aggressive coalescing. In the example above, when the browser has connected to 192.168.0.1 for the first host name, Chrome will require that the IPs for host B contains that specific IP for it to reuse that connection.  If the returned IPs for host B really are 192.168.0.2 and 192.168.0.3, it clearly doesn’t contain 192.168.0.1 and so Chrome will create a new connection to host B.

Chrome will reuse the connection to host A if resolving host B returns a list that contains the specific IP of the connection host A is already using.

The Edge and Safari ways

They don’t do coalescing at all, so each host name will get its own single connection. Better than the 6 connections from HTTP/1 but for very sharded sites that means a lot of connections even in the HTTP/2 case.

curl also doesn’t coalesce anything (yet).

Surprises and a way to mitigate them

Given some comments in the Firefox bugzilla, the aggressive coalescing sometimes causes some surprises. Especially when you have for example one IPv6-only host A and a second host B with both IPv4 and IPv4 addresses. Asking for data on host A can then still use IPv4 when it reuses a connection to B (assuming that host A covers host B in its cert).

In the rare case where a server gets a resource request for an authority (or scheme) it can’t serve, there’s a dedicated error code 421 in HTTP/2 that it can respond with and the browser can then  go back and retry that request on another connection.

Starts out with 6 anyway

Before the browser knows that the server speaks HTTP/2, it may fire up 6 connection attempts so that it is prepared to get the remote site at full speed. Once it figures out that it doesn’t need all those connections, it will kill off the unnecessary unused ones and over time trickle down to one. Of course, on subsequent connections to the same origin the client may have the version information cached so that it doesn’t have to start off presuming HTTP/1.

A third day of deep HTTP inspection

The workshop roomThis fine morning started off with some news: Patrick is now our brand new official co-chair of the IETF HTTPbis working group!

Subodh then sat down and took us off on a presentation that really triggered a long and lively discussion. “Retry safety extensions” was his name of it but it involved everything from what browsers and HTTP clients do for retrying with no response and went on to also include replaying problems for 0-RTT protocols such as TLS 1.3.

Julian did a short presentation on http headers and his draft for JSON in new headers and we quickly fell down a deep hole of discussions around various formats with ups and downs on them all. The general feeling seems to be that JSON will not be a good idea for headers in spite of a couple of good characteristics, partly because of its handling of duplicate field entries and how it handles or doesn’t handle numerical precision (ie you can send “100” as a monstrously large floating point number).

Mike did a presentation he called “H2 Regrets” in which he covered his work on a draft for support of client certs which was basically forbidden due to h2’s ban of TLS renegotiation, he brought up the idea of extended settings and discussed the lack of special handling dates in HTTP headers (why we send 29 bytes instead of 4). Shows there are improvements to be had in the future too!

Martin talked to us about Blind caching and how the concept of this works. Put very simply: it is a way to make it possible to offer cached content for clients using HTTPS, by storing the data in a 3rd host and pointing out that data to the client. There was a lengthy discussion around this and I think one of the outstanding questions is if this feature is really giving as much value to motivate the rather high cost in complexity…

The list of remaining Lightning Talks had grown to 10 talks and we fired them all off at a five minutes per topic pace. I brought up my intention and hope that we’ll do a QUIC library soon to experiment with. I personally particularly enjoyed EKR’s TLS 1.3 status summary. I heard appreciation from others and I agree with this that the idea to feature lightning talks was really good.

With this, the HTTP Workshop 2016 was officially ended. There will be a survey sent out about this edition and what people want to do for the next/future ones, and there will be some sort of  report posted about this event from the organizers, summarizing things.

Attendees numbers

http workshopThe companies with most attendees present here were: Mozilla 5, Google 4, Facebook, Akamai and Apple 3.

The attendees were from the following regions of the world: North America 19, Europe 15, Asia/pacific 6.

38 participants were male and 2 female.

23 of us were also at the 2015 workshop, 17 were newcomers.

15 people did lightning talks.

I believe 40 is about as many as you can put in a single room and still have discussions. Going larger will make it harder to make yourself heard as easily and would probably force us to have to switch to smaller groups more and thus not get this sort of great dynamic flow. I’m not saying that we can’t do this smaller or larger, just that it would have to make the event different.

Some final words

I had an awesome few days and I loved all of it. It was a pleasure organizing this and I’m happy that Stockholm showed its best face weather wise during these days. I was also happy to hear that so many people enjoyed their time here in Sweden. The hotel and its facilities, including food and coffee etc worked out smoothly I think with no complaints at all.

Hope to see again on the next HTTP Workshop!

Workshop day two

HTTP Workshop At 5pm we rounded off another fully featured day at the HTTP workshop. Here’s some of what we touched on today:

Moritz started the morning with an interesting presentation about experiments with running the exact same site and contents on h1 vs h2 over different kinds of networks, with different packet loss scenarios and with different ICWND set and more. Very interesting stuff. If he makes his presentation available at some point I’ll add a link to it.

I then got the honor to present the state of the TCP Tuning draft (which I’ve admittedly been neglecting a bit lately), the slides are here. I made it brief but I still got some feedback and in general this is a draft that people seem to agree is a good idea – keep sending me your feedback and help me improve it. I just need to pull myself together now and move it forward. I tried to be quick to leave over to…

Jana, who was back again to tell us about QUIC and the state of things in that area. His presentation apparently was a subset of slides he presented last week in the Berlin IETF. One interesting take-away for me, was that they’ve noticed that the amount of connections for which they detect UDP rate limiting on, has decreased with 2/3 during the last year!

Here’s my favorite image from his slide set. Apparently TCP/2 is not a name for QUIC that everybody appreciates! 😉

call-it-tcp2-one-more-time

While I think the topic of QUIC piqued the interest of most people in the room and there were a lot of questions, thoughts and ideas around the topic we still managed to get the lunch break pretty much in time and we could run off and have another lovely buffet lunch. There’s certainly no risk for us loosing weight during this event…

After lunch we got ourselves a series of Lightning talks presented for us. Seven short talks on various subjects that people had signed up to do

One of the lightning talks that stuck with me was what I would call the idea about an extended Happy Eyeballs approach that I’d like to call Even Happier Eyeballs: make the client TCP connect to all IPs in a DNS response and race them against each other and use the one that responds with a SYN-ACK first. There was interest expressed in the room to get this concept tested out for real in at least one browser.

We then fell over into the area of HTTP/3 ideas and what the people in the room think we should be working on for that. It turned out that the list of stuff we created last year at the workshop was still actually a pretty good list and while we could massage that a bit, it is still mostly the same as before.

Anne presented fetch and how browsers use HTTP. Perhaps a bit surprising that soon brought us over into the subject of trailers, how to support that and voilá, in the end we possibly even agreed that we should perhaps consider handling them somehow in browsers and even for javascript APIs… ( nah, curl/libcurl doesn’t have any particular support for trailers, but will of course get that if we’ll actually see things out there start to use it for real)

I think we deserved a few beers after this day! The final workshop day is tomorrow.

A workshop Monday

http workshopI decided I’d show up a little early at the Sheraton as I’ve been handling the interactions with hotel locally here in Stockholm where the workshop will run for the coming three days. Things were on track, if we ignore how they got the wrong name of the workshop on the info screens in the lobby, instead saying “Haxx Ab”…

Mark welcomed us with a quick overview of what we’re here for and quick run-through of the rough planning for the days. Our schedule is deliberately loose and open to allow for changes and adaptations as we go along.

Patrick talked about the 1 1/2 years of HTTP/2 working in Firefox so far, and we discussed a lot around the numbers and telemetry. What do they mean and why do they look like this etc. HTTP/2 is now at 44% of all HTTPS requests and connections using HTTP/2 are used for more than 8 requests on median (compared to slightly over 1 in the HTTP/1 case). What’s almost not used at all? HTTP/2 server push, Alt-Svc and HTTP 308 responses. Patrick’s presentation triggered a lot of good discussions. His slides are here.

RTT distribution for Firefox running on desktop and mobile, from Patrick’s slide set:

rtt-dist

The lunch was lovely.

Vlad then continued to talk about experiences from implementing and providing server push at Cloudflare. It and the associated discussions helped emphasize that we need better help for users on how to use server push and there might be reasons for browsers to change how they are stored in the current “secondary cache”. Also, discussions around how to access pushed resources and get information about pushes from javascript were briefly touched on.

After a break with some sweets and coffee, Kazuho continued to describe cache digests and how this concept can help making servers do better or more accurate server pushes. Back to more discussions around push and what it actually solved, how much complexity it is worth and so on. I thought I could sense hesitation in the room on whether this is really something to proceed with.

We intend to have a set of lightning talks after lunch each day and we have already have twelve such suggested talks listed in the workshop wiki, but the discussions were so lively and extensive that we missed them today and we even had to postpone the last talk of today until tomorrow. I can already sense how these three days will not be enough for us to cover everything we have listed and planned…

We ended the evening with a great dinner sponsored by Mozilla. I’d say it was a great first day. I’m looking forward to day 2!

curl wants to QUIC

The interesting Google transfer protocol that is known as QUIC is being passed through the IETF grinding machines to hopefully end up with a proper “spec” that has been reviewed and agreed to by many peers and that will end up being a protocol that is thoroughly documented with a lot of protocol people’s consensus. Follow the IETF QUIC mailing list for all the action.

I’d like us to join the fun

Similarly to how we implemented HTTP/2 support early on for curl, I would like us to get “on the bandwagon” early for QUIC to be able to both aid the protocol development and serve as a testing tool for both the protocol and the server implementations but then also of course to get us a solid implementation for users who’d like a proper QUIC capable client for data transfers.

implementations

The current version (made entirely by Google and not the output of the work they’re now doing on it within the IETF) of the QUIC protocol is already being widely used as Chrome speaks it with Google’s services in preference to HTTP/2 and other protocol options. There exist only a few other implementations of QUIC outside of the official ones Google offers as open source. Caddy offers a separate server implementation for example.

the Google code base

For curl’s sake, it can’t use the Google code as a basis for a QUIC implementation since it is C++ and code used within the Chrome browser is really too entangled with the browser and its particular environment to become very good when converted into a library. There’s a libquic project doing exactly this.

for curl and others

The ideal way to implement QUIC for curl would be to create “nghttp2” alternative that does QUIC. An ngquic if you will! A library that handles the low level protocol fiddling, the binary framing etc. Done that way, a QUIC library could be used by more projects who’d like QUIC support and all people who’d like to see this protocol supported in those tools and libraries could join in and make it happen. Such a library would need to be written in plain C and be suitably licensed for it to be really interesting for curl use.

a needed QUIC library

I’m hoping my post here will inspire someone to get such a project going. I will not hesitate to join in and help it get somewhere! I haven’t started such a project myself because I think I already have enough projects on my plate so I fear I wouldn’t be a good leader or maintainer of a project like this. But of course, if nobody else will do it I will do it myself eventually. If I can think of a good name for it.

some wishes for such a library

  • Written in C, to offer the same level of portability as curl itself and to allow it to get used as extensions by other languages etc
  • FOSS-licensed suitably
  • It should preferably not “own” the socket but also work in-memory and to allow applications to do many parallel connections etc.
  • Non-blocking. It shouldn’t wait for things on its own but let the application do that.
  • Should probably offer both client and server functionality for maximum use.
  • What else?