Tag Archives: Firefox

HTTPS proxy with curl

Starting in version 7.52.0 (due to ship December 21, 2016), curl will support HTTPS proxies when doing network transfers, and by doing this it joins the small exclusive club of HTTP user-agents consisting of Firefox, Chrome and not too many others.

Yes you read this correctly. This is different than the good old HTTP proxy.

HTTPS proxy means that the client establishes a TLS connection to the proxy and then communicates over that, which is different to the normal and traditional HTTP proxy approach where the clients speak plain HTTP to the proxy.

Talking HTTPS to your proxy is a privacy improvement as it prevents people from snooping on your proxy communication. Even when using HTTPS over a standard HTTP proxy, there’s typically a setting up phase first that leaks information about where the connection is being made, user credentials and more. Not to mention that an HTTPS proxy makes HTTP traffic “safe” to and from the proxy. HTTPS to the proxy also enables clients to speak HTTP/2 more easily with proxies. (Even though HTTP/2 to the proxy is not yet supported in curl.)

In the case where a client wants to talk HTTPS to a remote server, when using a HTTPS proxy, it sends HTTPS through HTTPS.

Illustrating this concept with images. When using a traditional HTTP proxy, we connect initially to the proxy with HTTP in the clear, and then from then on the HTTPS makes it safe:

HTTP proxyto compare with the HTTPS proxy case where the connection is safe already in the first step:

HTTPS proxyThe access to the proxy is made over network A. That network has traditionally been a corporate network or within a LAN or something but we’re seeing more and more use cases where the proxy is somewhere on the Internet and then “Network A” is really huge. That includes use cases where the proxy for example compresses images or otherwise reduces bandwidth requirements.

Actual HTTPS connections from clients to servers are still done end to end encrypted even in the HTTP proxy case. HTTP traffic to and from the user to the web site however, will still be HTTPS protected to the proxy when a HTTPS proxy is used.

A complicated pull request

This awesome work was provided by Dmitry Kurochkin, Vasy Okhin, and Alex Rousskov. It was merged into master on November 24 in this commit.

Doing this sort of major change in the TLS area in curl code is a massive undertaking, much so because of the fact that curl supports getting built with one out of 11 or 12 different TLS libraries. Several of those are also system-specific so hardly any single developer can even build all these backends on his or hers own machines.

In addition to the TLS backend maze, curl and library also offers a huge amount of different options to control the TLS connection and handling. You can switch on and off features, provide certificates, CA bundles and more. Adding another layer of TLS pretty much doubles the amount of options since now you can tweak everything both in the TLS connection to the proxy as well as the one to the remote peer.

This new feature is supported with the OpenSSL, GnuTLS and NSS backends to start with.

Consider it experimental for now

By all means, go ahead and use it and torture the code and file issues for everything bad you see, but I think we make ourselves a service by considering this new feature set to be a bit experimental in this release.

New options

There’s a whole forest of new command line and libcurl options to control all the various aspects of the new TLS connection this introduces. Since it is a totally separate connection it gets a whole set of options that are basically identical to the server connection but with a –proxy prefix instead. Here’s a list:

  --proxy-cacert 
  --proxy-capath
  --proxy-cert
  --proxy-cert-type
  --proxy-ciphers
  --proxy-crlfile
  --proxy-insecure
  --proxy-key
  --proxy-key-type
  --proxy-pass
  --proxy-ssl-allow-beast
  --proxy-sslv2
  --proxy-sslv3
  --proxy-tlsv1
  --proxy-tlsuser
  --proxy-tlspassword
  --proxy-tlsauthtype

curl and TLS 1.3

Draft 18 of the TLS version 1.3 spec was publiSSL padlockshed at the end of October 2016.

Already now, both Firefox and Chrome have test versions out with TLS 1.3 enabled. Firefox 52 will have it by default, and while Chrome will ship it, I couldn’t figure out exactly when we can expect it to be there by default.

Over the last few days we’ve merged TLS 1.3 support to curl, primarily in this commit by Kamil Dudka. Both the command line tool and libcurl will negotiate TLS 1.3 in the next version (7.52.0 – planned release date at the end of December 2016) if built with a TLS library that supports it and told to do it by the user.

The two current TLS libraries that will speak TLS 1.3 when built with curl right now is NSS and BoringSSL. The plan is to gradually adjust curl over time as the other libraries start to support 1.3 as well. As always we will appreciate your help in making this happen!

Right now, there’s also the minor flux in that servers and clients may end up running implementations of different draft versions of the TLS spec which contributes to a layer of extra fun!

Three TLS current 1.3 test servers to play with: https://enabled.tls13.com/ , https://www.allizom.org/ and https://tls13.crypto.mozilla.org/. I doubt any of these will give you any guarantees of functionality.

TLS 1.3 offers a few new features that allow clients such as curl to do subsequent TLS connections much faster, with only 1 or even 0 RTTs, but curl has no code for any of those features yet.

1,000,000 sites run HTTP/2

… out of the top ten million sites that is. So there’s at least that many, quite likely a few more.

This is according to w3techs who runs checks daily. Over the last few months, there have been about 50,000 new sites per month switching it on.

ht2-10-percent

It also shows that the HTTP/2 ratio has increased from a little over 1% deployment a year ago to the 10% today.

HTTP/2 gets more used the more  popular site it is. Among the top 1,000 sites on the web, more than 20% of them use HTTP/2. HTTP/2 also just recently (September 9) overcame SPDY among the top-1000 most popular sites.

h2-sep28

On September 7, Amazon announced their CloudFront service having enabled HTTP/2, which could explain an adoption boost over the last few days. New CloudFront users get it enabled by default but existing users actually need to go in and click a checkbox to make it happen.

As the web traffic of the world is severely skewed toward the top ones, we can be sure that a significantly larger share than 10% of the world’s HTTPS traffic is using version 2.

Recent usage stats in Firefox shows that HTTP/2 is used in half of all its HTTPS requests!

http2

HTTP/2 connection coalescing

Section 9.1.1 in RFC7540 explains how HTTP/2 clients can reuse connections. This is my lengthy way of explaining how this works in reality.

Many connections in HTTP/1

With HTTP/1.1, browsers are typically using 6 connections per origin (host name + port). They do this to overcome the problems in HTTP/1 and how it uses TCP – as each connection will do a fair amount of waiting. Plus each connection is slow at start and therefore limited to how much data you can get and send quickly, you multiply that data amount with each additional connection. This makes the browser get more data faster (than just using one connection).

6 connections

Add sharding

Web sites with many objects also regularly invent new host names to trigger browsers to use even more connections. A practice known as “sharding”. 6 connections for each name. So if you instead make your site use 4 host names you suddenly get 4 x 6 = 24 connections instead. Mostly all those host names resolve to the same IP address in the end anyway, or the same set of IP addresses. In reality, some sites use many more than just 4 host names.

24 connections

The sad reality is that a very large percentage of connections used for HTTP/1.1 are only ever used for a single HTTP request, and a very large share of the connections made for HTTP/1 are so short-lived they actually never leave the slow start period before they’re killed off again. Not really ideal.

One connection in HTTP/2

With the introduction of HTTP/2, the HTTP clients of the world are going toward using a single TCP connection for each origin. The idea being that one connection is better in packet loss scenarios, it makes priorities/dependencies work and reusing that single connections for many more requests will be a net gain. And as you remember, HTTP/2 allows many logical streams in parallel over that single connection so the single connection doesn’t limit what the browsers can ask for.

Unsharding

The sites that created all those additional host names to make the HTTP/1 browsers use many connections now work against the HTTP/2 browsers’ desire to decrease the number of connections to a single one. Sites don’t want to switch back to using a single host name because that would be a significant architectural change and there are still a fair number of HTTP/1-only browsers still in use.

Enter “connection coalescing”, or “unsharding” as we sometimes like to call it. You won’t find either term used in RFC7540, as it merely describes this concept in terms of connection reuse.

Connection coalescing means that the browser tries to determine which of the remote hosts that it can reach over the same TCP connection. The different browsers have slightly different heuristics here and some don’t do it at all, but let me try to explain how they work – as far as I know and at this point in time.

Coalescing by example

Let’s say that this cool imaginary site “example.com” has two name entries in DNS: A.example.com and B.example.com. When resolving those names over DNS, the client gets a list of IP address back for each name. A list that very well may contain a mix of IPv4 and IPv6 addresses. One list for each name.

You must also remember that HTTP/2 is also only ever used over HTTPS by browsers, so for each origin speaking HTTP/2 there’s also a corresponding server certificate with a list of names or a wildcard pattern for which that server is authorized to respond for.

In our example we start out by connecting the browser to A. Let’s say resolving A returns the IPs 192.168.0.1 and 192.168.0.2 from DNS, so the browser goes on and connects to the first of those addresses, the one ending with “1”. The browser gets the server cert back in the TLS handshake and as a result of that, it also gets a list of host names the server can deal with: A.example.com and B.example.com. (it could also be a wildcard like “*.example.com”)

If the browser then wants to connect to B, it’ll resolve that host name too to a list of IPs. Let’s say 192.168.0.2 and 192.168.0.3 here.

Host A: 192.168.0.1 and 192.168.0.2
Host B: 192.168.0.2 and 192.168.0.3

Now hold it. Here it comes.

The Firefox way

Host A has two addresses, host B has two addresses. The lists of addresses are not the same, but there is an overlap – both lists contain 192.168.0.2. And the host A has already stated that it is authoritative for B as well. In this situation, Firefox will not make a second connect to host B. It will reuse the connection to host A and ask for host B’s content over that single shared connection. This is the most aggressive coalescing method in use.

one connection

The Chrome way

Chrome features a slightly less aggressive coalescing. In the example above, when the browser has connected to 192.168.0.1 for the first host name, Chrome will require that the IPs for host B contains that specific IP for it to reuse that connection.  If the returned IPs for host B really are 192.168.0.2 and 192.168.0.3, it clearly doesn’t contain 192.168.0.1 and so Chrome will create a new connection to host B.

Chrome will reuse the connection to host A if resolving host B returns a list that contains the specific IP of the connection host A is already using.

The Edge and Safari ways

They don’t do coalescing at all, so each host name will get its own single connection. Better than the 6 connections from HTTP/1 but for very sharded sites that means a lot of connections even in the HTTP/2 case.

curl also doesn’t coalesce anything (yet).

Surprises and a way to mitigate them

Given some comments in the Firefox bugzilla, the aggressive coalescing sometimes causes some surprises. Especially when you have for example one IPv6-only host A and a second host B with both IPv4 and IPv4 addresses. Asking for data on host A can then still use IPv4 when it reuses a connection to B (assuming that host A covers host B in its cert).

In the rare case where a server gets a resource request for an authority (or scheme) it can’t serve, there’s a dedicated error code 421 in HTTP/2 that it can respond with and the browser can then  go back and retry that request on another connection.

Starts out with 6 anyway

Before the browser knows that the server speaks HTTP/2, it may fire up 6 connection attempts so that it is prepared to get the remote site at full speed. Once it figures out that it doesn’t need all those connections, it will kill off the unnecessary unused ones and over time trickle down to one. Of course, on subsequent connections to the same origin the client may have the version information cached so that it doesn’t have to start off presuming HTTP/1.

A workshop Monday

http workshopI decided I’d show up a little early at the Sheraton as I’ve been handling the interactions with hotel locally here in Stockholm where the workshop will run for the coming three days. Things were on track, if we ignore how they got the wrong name of the workshop on the info screens in the lobby, instead saying “Haxx Ab”…

Mark welcomed us with a quick overview of what we’re here for and quick run-through of the rough planning for the days. Our schedule is deliberately loose and open to allow for changes and adaptations as we go along.

Patrick talked about the 1 1/2 years of HTTP/2 working in Firefox so far, and we discussed a lot around the numbers and telemetry. What do they mean and why do they look like this etc. HTTP/2 is now at 44% of all HTTPS requests and connections using HTTP/2 are used for more than 8 requests on median (compared to slightly over 1 in the HTTP/1 case). What’s almost not used at all? HTTP/2 server push, Alt-Svc and HTTP 308 responses. Patrick’s presentation triggered a lot of good discussions. His slides are here.

RTT distribution for Firefox running on desktop and mobile, from Patrick’s slide set:

rtt-dist

The lunch was lovely.

Vlad then continued to talk about experiences from implementing and providing server push at Cloudflare. It and the associated discussions helped emphasize that we need better help for users on how to use server push and there might be reasons for browsers to change how they are stored in the current “secondary cache”. Also, discussions around how to access pushed resources and get information about pushes from javascript were briefly touched on.

After a break with some sweets and coffee, Kazuho continued to describe cache digests and how this concept can help making servers do better or more accurate server pushes. Back to more discussions around push and what it actually solved, how much complexity it is worth and so on. I thought I could sense hesitation in the room on whether this is really something to proceed with.

We intend to have a set of lightning talks after lunch each day and we have already have twelve such suggested talks listed in the workshop wiki, but the discussions were so lively and extensive that we missed them today and we even had to postpone the last talk of today until tomorrow. I can already sense how these three days will not be enough for us to cover everything we have listed and planned…

We ended the evening with a great dinner sponsored by Mozilla. I’d say it was a great first day. I’m looking forward to day 2!

HTTP Workshop 2016, day -1

http workshop The HTTP Workshop 2016 will take place in Stockholm starting tomorrow Monday, as I’ve mentioned before. Today we’ll start off slowly by having a few pre workshop drinks and say hello to old and new friends.

I did a casual count, and out of the 40 attendees coming, I believe slightly less than half are newcomers that didn’t attend the workshop last year. We’ll see browser people come, more independent HTTP implementers, CDN representatives, server and intermediary developers as well as some friends from large HTTP operators/sites. I personally view my attendance to be primarily with my curl hat on rather than my Firefox one. Firmly standing in the client side trenches anyway.

Visitors to Stockholm these days are also lucky enough to arrive when the weather is possibly as good as it can get here with the warmest period through the summer so far with lots of sun and really long bright summer days.

News this year includes the @http_workshop twitter account. If you have questions or concerns for HTTP workshoppers, do send them that way and they might get addressed or at least noticed.

I’ll try to take notes and post summaries of each workshop day here. Of course I will fully respect our conference rules about what to reveal or not.

stockholm castle and ship

HTTP/2 in April 2016

On April 12 I had the pleasure of doing another talk in the Google Tech Talk series arranged in the Google Stockholm offices. I had given it the title “HTTP/2 is upon us, and here’s what you need to know about it.” in the invitation.

The room seated 70 persons but we had the amazing amount of over 300 people in the waiting line who unfortunately didn’t manage to get a seat. To those, and to anyone else who cares, here’s the video recording of the event.

If you’ve seen me talk about HTTP/2 before, you might notice that I’ve refreshed the material somewhat since before.

decent durable defect density displayed

Here’s an encouraging graph from our regular Coverity scans of the curl source code, showing that we’ve maintained a fairly low “defect density” over the last two years, staying way below the average density level.
defect density over timeClick the image to view it slightly larger.

Defect density is simply the number of found problems per 1,000 lines of code. As a little (and probably unfair) comparison, right now when curl is flat on 0, Firefox is at 0.47, c-ares at 0.12 and libssh2 at 0.21.

Coverity is still the primary static code analyzer for C code that I’m aware of. None of the flaws Coverity picked up in curl during the last two years were detected by clang-analyzer for example.

HTTP/2 adoption, end of 2015

http2 front imageWhen I asked my surrounding in March 2015 to guess the expected HTTP/2 adoption by now, we as a group ended up with about 10%. OK, the question was vaguely phrased and what does it really mean? Let’s take a look at some aspects of where we are now.

Perhaps the biggest flaw in the question was that it didn’t specify HTTPS. All the browsers of today only implement HTTP/2 over HTTPS so of course if every HTTPS site in the world would support HTTP/2 that would still be far away from all the HTTP requests. Admittedly, browsers aren’t the only HTTP clients…

During the fall of 2015, both nginx and Apache shipped release versions with HTTP/2 support. nginx made it slightly harder for people by forcing users to select either SPDY or HTTP/2 (which was a technical choice done by them, not really enforced by the protocols) and also still telling users that SPDY is the safer choice.

Let’s Encrypt‘s finally launching their public beta in the early December also helps HTTP/2 by removing one of the most annoying HTTPS obstacles: the cost and manual administration of server certs.

Amount of Firefox responses

This is the easiest metric since Mozilla offers public access to the metric data. It is skewed since it is opt-in data and we know that certain kinds of users are less likely to enable this (if you’re more privacy aware or if you’re using it in enterprise environments for example). This also then measures the share by volume of requests; making the popular sites get more weight.

Firefox 43 counts no less than 22% of all HTTP responses as HTTP/2 (based on data from Dec 8 to Dec 16, 2015).

Out of all HTTP traffic Firefox 43 generates, about 63% is HTTPS which then makes almost 35% of all Firefox HTTPS requests are HTTP/2!

Firefox 43 is also negotiating HTTP/2 four times as often as it ends up with SPDY.

Amount of browser traffic

One estimate of how large share of browsers that supports HTTP/2 is the caniuse.com number. Roughly 70% on a global level. Another metric is the one published by KeyCDN at the end of October 2015. When they enabled HTTP/2 by default for their HTTPS customers world wide, the average number of users negotiating HTTP/2 turned out to be 51%. More than half!

Cloudflare however, claims the share of supported browsers are at a mere 26%. That’s a really big difference and I personally don’t buy their numbers as they’re way too negative and give some popular browsers very small market share. For example: Chrome 41 – 49 at a mere 15% of the world market, really?

I think the key is rather that it all boils down to what you measure – as always.

Amount of the top-sites in the world

Netcraft bundles SPDY with HTTP/2 in their October report, but it says that “29% of SSL sites within the thousand most popular sites currently support SPDY or HTTP/2, while 8% of those within the top million sites do.” (note the “of SSL sites” in there)

That’s now slightly old data that came out almost exactly when Apache first release its HTTP/2 support in a public release and Nginx hadn’t even had it for a full month yet.

Facebook eventually enabled HTTP/2 in November 2015.

Amount of “regular” sites

There’s still no ideal service that scans a larger portion of the Internet to measure adoption level. The httparchive.org site is about to change to a chrome-based spider (from IE) and once that goes live I hope that we will get better data.

W3Tech’s report says 2.5% of web sites in early December – less than SPDY!

I like how isthewebhttp2yet.com looks so far and I’ve provided them with my personal opinions and feedback on what I think they should do to make that the preferred site for this sort of data.

Using the shodan search engine, we could see that mid December 2015 there were about 115,000 servers on the Internet using HTTP/2.  That’s 20,000 (~24%) more than isthewebhttp2yet site says. It doesn’t really show percentages there, but it could be interpreted to say that slightly over 6% of HTTP/1.1 sites also support HTTP/2.

On Dec 3rd 2015, Cloudflare enabled HTTP/2 for all its customers and they claimed they doubled the number of HTTP/2 servers on the net in that single move. (The shodan numbers seem to disagree with that statement.)

Amount of system lib support

iOS 9 supports HTTP/2 in its native HTTP library. That’s so far the leader of HTTP/2 in system libraries department. Does Mac OS X have something similar?

I had expected Window’s wininet or other HTTP libs to be up there as well but I can’t find any details online about it. I hear the Android HTTP libs are not up to snuff either but since okhttp is now part of Android to some extent, I guess proper HTTP/2 in Android is not too far away?

Amount of HTTP API support

I hear very little about HTTP API providers accepting HTTP/2 in addition or even instead of HTTP/1.1. My perception is that this is basically not happening at all yet.

Next-gen experiments

If you’re using a modern Chrome browser today against a Google service you’re already (mostly) using QUIC instead of HTTP/2, thus you aren’t really adding to the HTTP/2 client side numbers but you’re also not adding to the HTTP/1.1 numbers.

QUIC and other QUIC-like (UDP-based with the entire stack in user space) protocols are destined to grow and get used even more as we go forward. I’m convinced of this.

Conclusion

Everyone was right! It is mostly a matter of what you meant and how to measure it.

Future

Recall the words on the Chromium blog: “We plan to remove support for SPDY in early 2016“. For Firefox we haven’t said anything that absolute, but I doubt that Firefox will support SPDY for very long after Chrome drops it.