Tag Archives: Firefox

Lesser HTTPS for non-browsers

An HTTPS client needs to do a whole lot of checks to make sure that the remote host is fine to communicate with to maintain the proper high security levels.

In this blog post, I will explain why and how the entire HTTPS ecosystem relies on the browsers to be good and strict and thanks to that, the rest of the HTTPS clients can get away with being much more lenient. And in fact that is good, because the browsers don’t help the rest of the ecosystem very much to do good verification at that same level.

Let me me illustrate with some examples.

CA certs

The server’s certificate must have been signed by a trusted CA (Certificate Authority). A client then needs the certificates from all the CAs that are trusted. Who’s a trusted CA and how would a client get their certs to use for verification?

You can say that you trust the same set of CAs that your operating system vendor trusts (which I’ve always thought is a bit of a stretch but hey, I can very well understand the convenience in this). If you want to do this as an HTTPS client you need to use native APIs in Windows or macOS, or you need to figure out where the cert bundle is stored if you’re using Linux.

If you’re not using the native libraries on windows and macOS or if you can’t find the bundle in your Linux distribution, or you’re in one of a large amount of other setups where you can’t use someone else’s bundle, then you need to gather this list by yourself.

How on earth would you gather a list of hundreds of CA certs that are used for the popular web sites on the net of today? Stand on someone else’s shoulders and use what they’ve done? Yeps, and conveniently enough Mozilla has such a bundle that is licensed to allow others to use it…

Mozilla doesn’t offer the set of CA certs in a format that anyone else can use really, which is the primary reason why we offer Mozilla’s cert bundle converted to PEM format on the curl web site. The other parties that collect CA certs at scale (Microsoft for Windows, Apple for macOS, etc) do even less.

Before you ask, Google doesn’t maintain their own list for Chrome. They piggyback the CA store provided on the operating system it runs on. (Update: Google maintains its own list for Android/Chrome OS.)

Further constraints

But the browsers, including Firefox, Chrome, Edge and Safari all add additional constraints beyond that CA cert store, on what server certificates they consider to be fine and okay. They blacklist specific fingerprints, they set a last allowed date for certain CA providers to offer certificates for servers and more.

These additional constraints, or additional rules if you want, are never exported nor exposed to the world in ways that are easy for anyone to mimic (in other ways than that everyone of course can implement the same code logic in their ends). They’re done in code and they’re really hard for anyone not a browser to implement and keep up with.

This makes every non-browser HTTPS client susceptible to okaying certificates that have already been deemed not OK by security experts at the browser vendors. And in comparison, not many HTTPS clients can compare or stack up the amount of client-side TLS and security expertise that the browser developers can.

HSTS preload

HTTP Strict Transfer Security is a way for sites to tell clients that they are to be accessed over HTTPS only for a specified time into the future, and plain HTTP should then not be used for the duration of this rule. This setup removes the Man-In-The-Middle (MITM) risk on subsequent accesses for sites that may still get linked to via HTTP:// URLs or by users entering the web site names directly into the address bars and so on.

The browsers have a “HSTS preload list” which is a list of sites that people have submitted and they are HSTS sites that basically never time out and always will be accessed over HTTPS only. Forever. No risk for MITM even in the first access to these sites.

There are no such HSTS preload lists being provided for non-browser HTTPS clients so there’s no easy way for non-browsers to avoid the first access MITM even for these class of forever-on-HTTPS sites.

Update: The Chromium HSTS preload list is available in a JSON format.

SHA-1

I’m sure you’ve heard about the deprecation of SHA-1 as a certificate hashing algorithm and how the browsers won’t accept server certificates using this starting at some cut off date.

I’m not aware of any non-browser HTTPS client that makes this check. For services, API providers and others don’t serve “normal browsers” they can all continue to play SHA-1 certificates well into 2017 without tears or pain. Another ecosystem detail we rely on the browsers to fix for us since most of these providers want to work with browsers as well…

This isn’t really something that is magic or would be terribly hard for non-browsers to do, its just that it will make some users suddenly get errors for their otherwise working setups and that takes a firm attitude from the software provider that is hard to maintain. And you’d have to introduce your own cut-off date that you’d have to fight with your users about! ;-)

TLS is hard to get right

TLS and HTTPS are full of tricky areas and dusty corners that are hard to get right. The more we can share tricks and rules the better it is for everyone.

I think the browser vendors could do much better to help the rest of the ecosystem. By making their meta data available to us in sensible formats mostly. For the good of the Internet.

Disclaimer

Yes I work for Mozilla which makes Firefox. A vendor and a browser that I write about above. I’ve been communicating internally about some of these issues already, but I’m otherwise not involved in those parts of Firefox.

HTTPS proxy with curl

Starting in version 7.52.0 (due to ship December 21, 2016), curl will support HTTPS proxies when doing network transfers, and by doing this it joins the small exclusive club of HTTP user-agents consisting of Firefox, Chrome and not too many others.

Yes you read this correctly. This is different than the good old HTTP proxy.

HTTPS proxy means that the client establishes a TLS connection to the proxy and then communicates over that, which is different to the normal and traditional HTTP proxy approach where the clients speak plain HTTP to the proxy.

Talking HTTPS to your proxy is a privacy improvement as it prevents people from snooping on your proxy communication. Even when using HTTPS over a standard HTTP proxy, there’s typically a setting up phase first that leaks information about where the connection is being made, user credentials and more. Not to mention that an HTTPS proxy makes HTTP traffic “safe” to and from the proxy. HTTPS to the proxy also enables clients to speak HTTP/2 more easily with proxies. (Even though HTTP/2 to the proxy is not yet supported in curl.)

In the case where a client wants to talk HTTPS to a remote server, when using a HTTPS proxy, it sends HTTPS through HTTPS.

Illustrating this concept with images. When using a traditional HTTP proxy, we connect initially to the proxy with HTTP in the clear, and then from then on the HTTPS makes it safe:

HTTP proxyto compare with the HTTPS proxy case where the connection is safe already in the first step:

HTTPS proxyThe access to the proxy is made over network A. That network has traditionally been a corporate network or within a LAN or something but we’re seeing more and more use cases where the proxy is somewhere on the Internet and then “Network A” is really huge. That includes use cases where the proxy for example compresses images or otherwise reduces bandwidth requirements.

Actual HTTPS connections from clients to servers are still done end to end encrypted even in the HTTP proxy case. HTTP traffic to and from the user to the web site however, will still be HTTPS protected to the proxy when a HTTPS proxy is used.

A complicated pull request

This awesome work was provided by Dmitry Kurochkin, Vasy Okhin, and Alex Rousskov. It was merged into master on November 24 in this commit.

Doing this sort of major change in the TLS area in curl code is a massive undertaking, much so because of the fact that curl supports getting built with one out of 11 or 12 different TLS libraries. Several of those are also system-specific so hardly any single developer can even build all these backends on his or hers own machines.

In addition to the TLS backend maze, curl and library also offers a huge amount of different options to control the TLS connection and handling. You can switch on and off features, provide certificates, CA bundles and more. Adding another layer of TLS pretty much doubles the amount of options since now you can tweak everything both in the TLS connection to the proxy as well as the one to the remote peer.

This new feature is supported with the OpenSSL, GnuTLS and NSS backends to start with.

Consider it experimental for now

By all means, go ahead and use it and torture the code and file issues for everything bad you see, but I think we make ourselves a service by considering this new feature set to be a bit experimental in this release.

New options

There’s a whole forest of new command line and libcurl options to control all the various aspects of the new TLS connection this introduces. Since it is a totally separate connection it gets a whole set of options that are basically identical to the server connection but with a –proxy prefix instead. Here’s a list:

  --proxy-cacert 
  --proxy-capath
  --proxy-cert
  --proxy-cert-type
  --proxy-ciphers
  --proxy-crlfile
  --proxy-insecure
  --proxy-key
  --proxy-key-type
  --proxy-pass
  --proxy-ssl-allow-beast
  --proxy-sslv2
  --proxy-sslv3
  --proxy-tlsv1
  --proxy-tlsuser
  --proxy-tlspassword
  --proxy-tlsauthtype

curl and TLS 1.3

Draft 18 of the TLS version 1.3 spec was publiSSL padlockshed at the end of October 2016.

Already now, both Firefox and Chrome have test versions out with TLS 1.3 enabled. Firefox 52 will have it by default, and while Chrome will ship it, I couldn’t figure out exactly when we can expect it to be there by default.

Over the last few days we’ve merged TLS 1.3 support to curl, primarily in this commit by Kamil Dudka. Both the command line tool and libcurl will negotiate TLS 1.3 in the next version (7.52.0 – planned release date at the end of December 2016) if built with a TLS library that supports it and told to do it by the user.

The two current TLS libraries that will speak TLS 1.3 when built with curl right now is NSS and BoringSSL. The plan is to gradually adjust curl over time as the other libraries start to support 1.3 as well. As always we will appreciate your help in making this happen!

Right now, there’s also the minor flux in that servers and clients may end up running implementations of different draft versions of the TLS spec which contributes to a layer of extra fun!

Three TLS current 1.3 test servers to play with: https://enabled.tls13.com/ , https://www.allizom.org/ and https://tls13.crypto.mozilla.org/. I doubt any of these will give you any guarantees of functionality.

TLS 1.3 offers a few new features that allow clients such as curl to do subsequent TLS connections much faster, with only 1 or even 0 RTTs, but curl has no code for any of those features yet.

1,000,000 sites run HTTP/2

… out of the top ten million sites that is. So there’s at least that many, quite likely a few more.

This is according to w3techs who runs checks daily. Over the last few months, there have been about 50,000 new sites per month switching it on.

ht2-10-percent

It also shows that the HTTP/2 ratio has increased from a little over 1% deployment a year ago to the 10% today.

HTTP/2 gets more used the more  popular site it is. Among the top 1,000 sites on the web, more than 20% of them use HTTP/2. HTTP/2 also just recently (September 9) overcame SPDY among the top-1000 most popular sites.

h2-sep28

On September 7, Amazon announced their CloudFront service having enabled HTTP/2, which could explain an adoption boost over the last few days. New CloudFront users get it enabled by default but existing users actually need to go in and click a checkbox to make it happen.

As the web traffic of the world is severely skewed toward the top ones, we can be sure that a significantly larger share than 10% of the world’s HTTPS traffic is using version 2.

Recent usage stats in Firefox shows that HTTP/2 is used in half of all its HTTPS requests!

http2

HTTP/2 connection coalescing

Section 9.1.1 in RFC7540 explains how HTTP/2 clients can reuse connections. This is my lengthy way of explaining how this works in reality.

Many connections in HTTP/1

With HTTP/1.1, browsers are typically using 6 connections per origin (host name + port). They do this to overcome the problems in HTTP/1 and how it uses TCP – as each connection will do a fair amount of waiting. Plus each connection is slow at start and therefore limited to how much data you can get and send quickly, you multiply that data amount with each additional connection. This makes the browser get more data faster (than just using one connection).

6 connections

Add sharding

Web sites with many objects also regularly invent new host names to trigger browsers to use even more connections. A practice known as “sharding”. 6 connections for each name. So if you instead make your site use 4 host names you suddenly get 4 x 6 = 24 connections instead. Mostly all those host names resolve to the same IP address in the end anyway, or the same set of IP addresses. In reality, some sites use many more than just 4 host names.

24 connections

The sad reality is that a very large percentage of connections used for HTTP/1.1 are only ever used for a single HTTP request, and a very large share of the connections made for HTTP/1 are so short-lived they actually never leave the slow start period before they’re killed off again. Not really ideal.

One connection in HTTP/2

With the introduction of HTTP/2, the HTTP clients of the world are going toward using a single TCP connection for each origin. The idea being that one connection is better in packet loss scenarios, it makes priorities/dependencies work and reusing that single connections for many more requests will be a net gain. And as you remember, HTTP/2 allows many logical streams in parallel over that single connection so the single connection doesn’t limit what the browsers can ask for.

Unsharding

The sites that created all those additional host names to make the HTTP/1 browsers use many connections now work against the HTTP/2 browsers’ desire to decrease the number of connections to a single one. Sites don’t want to switch back to using a single host name because that would be a significant architectural change and there are still a fair number of HTTP/1-only browsers still in use.

Enter “connection coalescing”, or “unsharding” as we sometimes like to call it. You won’t find either term used in RFC7540, as it merely describes this concept in terms of connection reuse.

Connection coalescing means that the browser tries to determine which of the remote hosts that it can reach over the same TCP connection. The different browsers have slightly different heuristics here and some don’t do it at all, but let me try to explain how they work – as far as I know and at this point in time.

Coalescing by example

Let’s say that this cool imaginary site “example.com” has two name entries in DNS: A.example.com and B.example.com. When resolving those names over DNS, the client gets a list of IP address back for each name. A list that very well may contain a mix of IPv4 and IPv6 addresses. One list for each name.

You must also remember that HTTP/2 is also only ever used over HTTPS by browsers, so for each origin speaking HTTP/2 there’s also a corresponding server certificate with a list of names or a wildcard pattern for which that server is authorized to respond for.

In our example we start out by connecting the browser to A. Let’s say resolving A returns the IPs 192.168.0.1 and 192.168.0.2 from DNS, so the browser goes on and connects to the first of those addresses, the one ending with “1”. The browser gets the server cert back in the TLS handshake and as a result of that, it also gets a list of host names the server can deal with: A.example.com and B.example.com. (it could also be a wildcard like “*.example.com”)

If the browser then wants to connect to B, it’ll resolve that host name too to a list of IPs. Let’s say 192.168.0.2 and 192.168.0.3 here.

Host A: 192.168.0.1 and 192.168.0.2
Host B: 192.168.0.2 and 192.168.0.3

Now hold it. Here it comes.

The Firefox way

Host A has two addresses, host B has two addresses. The lists of addresses are not the same, but there is an overlap – both lists contain 192.168.0.2. And the host A has already stated that it is authoritative for B as well. In this situation, Firefox will not make a second connect to host B. It will reuse the connection to host A and ask for host B’s content over that single shared connection. This is the most aggressive coalescing method in use.

one connection

The Chrome way

Chrome features a slightly less aggressive coalescing. In the example above, when the browser has connected to 192.168.0.1 for the first host name, Chrome will require that the IPs for host B contains that specific IP for it to reuse that connection.  If the returned IPs for host B really are 192.168.0.2 and 192.168.0.3, it clearly doesn’t contain 192.168.0.1 and so Chrome will create a new connection to host B.

Chrome will reuse the connection to host A if resolving host B returns a list that contains the specific IP of the connection host A is already using.

The Edge and Safari ways

They don’t do coalescing at all, so each host name will get its own single connection. Better than the 6 connections from HTTP/1 but for very sharded sites that means a lot of connections even in the HTTP/2 case.

curl also doesn’t coalesce anything (yet).

Surprises and a way to mitigate them

Given some comments in the Firefox bugzilla, the aggressive coalescing sometimes causes some surprises. Especially when you have for example one IPv6-only host A and a second host B with both IPv4 and IPv4 addresses. Asking for data on host A can then still use IPv4 when it reuses a connection to B (assuming that host A covers host B in its cert).

In the rare case where a server gets a resource request for an authority (or scheme) it can’t serve, there’s a dedicated error code 421 in HTTP/2 that it can respond with and the browser can then  go back and retry that request on another connection.

Starts out with 6 anyway

Before the browser knows that the server speaks HTTP/2, it may fire up 6 connection attempts so that it is prepared to get the remote site at full speed. Once it figures out that it doesn’t need all those connections, it will kill off the unnecessary unused ones and over time trickle down to one. Of course, on subsequent connections to the same origin the client may have the version information cached so that it doesn’t have to start off presuming HTTP/1.

A workshop Monday

http workshopI decided I’d show up a little early at the Sheraton as I’ve been handling the interactions with hotel locally here in Stockholm where the workshop will run for the coming three days. Things were on track, if we ignore how they got the wrong name of the workshop on the info screens in the lobby, instead saying “Haxx Ab”…

Mark welcomed us with a quick overview of what we’re here for and quick run-through of the rough planning for the days. Our schedule is deliberately loose and open to allow for changes and adaptations as we go along.

Patrick talked about the 1 1/2 years of HTTP/2 working in Firefox so far, and we discussed a lot around the numbers and telemetry. What do they mean and why do they look like this etc. HTTP/2 is now at 44% of all HTTPS requests and connections using HTTP/2 are used for more than 8 requests on median (compared to slightly over 1 in the HTTP/1 case). What’s almost not used at all? HTTP/2 server push, Alt-Svc and HTTP 308 responses. Patrick’s presentation triggered a lot of good discussions. His slides are here.

RTT distribution for Firefox running on desktop and mobile, from Patrick’s slide set:

rtt-dist

The lunch was lovely.

Vlad then continued to talk about experiences from implementing and providing server push at Cloudflare. It and the associated discussions helped emphasize that we need better help for users on how to use server push and there might be reasons for browsers to change how they are stored in the current “secondary cache”. Also, discussions around how to access pushed resources and get information about pushes from javascript were briefly touched on.

After a break with some sweets and coffee, Kazuho continued to describe cache digests and how this concept can help making servers do better or more accurate server pushes. Back to more discussions around push and what it actually solved, how much complexity it is worth and so on. I thought I could sense hesitation in the room on whether this is really something to proceed with.

We intend to have a set of lightning talks after lunch each day and we have already have twelve such suggested talks listed in the workshop wiki, but the discussions were so lively and extensive that we missed them today and we even had to postpone the last talk of today until tomorrow. I can already sense how these three days will not be enough for us to cover everything we have listed and planned…

We ended the evening with a great dinner sponsored by Mozilla. I’d say it was a great first day. I’m looking forward to day 2!

HTTP Workshop 2016, day -1

http workshop The HTTP Workshop 2016 will take place in Stockholm starting tomorrow Monday, as I’ve mentioned before. Today we’ll start off slowly by having a few pre workshop drinks and say hello to old and new friends.

I did a casual count, and out of the 40 attendees coming, I believe slightly less than half are newcomers that didn’t attend the workshop last year. We’ll see browser people come, more independent HTTP implementers, CDN representatives, server and intermediary developers as well as some friends from large HTTP operators/sites. I personally view my attendance to be primarily with my curl hat on rather than my Firefox one. Firmly standing in the client side trenches anyway.

Visitors to Stockholm these days are also lucky enough to arrive when the weather is possibly as good as it can get here with the warmest period through the summer so far with lots of sun and really long bright summer days.

News this year includes the @http_workshop twitter account. If you have questions or concerns for HTTP workshoppers, do send them that way and they might get addressed or at least noticed.

I’ll try to take notes and post summaries of each workshop day here. Of course I will fully respect our conference rules about what to reveal or not.

stockholm castle and ship

HTTP/2 in April 2016

On April 12 I had the pleasure of doing another talk in the Google Tech Talk series arranged in the Google Stockholm offices. I had given it the title “HTTP/2 is upon us, and here’s what you need to know about it.” in the invitation.

The room seated 70 persons but we had the amazing amount of over 300 people in the waiting line who unfortunately didn’t manage to get a seat. To those, and to anyone else who cares, here’s the video recording of the event.

If you’ve seen me talk about HTTP/2 before, you might notice that I’ve refreshed the material somewhat since before.

decent durable defect density displayed

Here’s an encouraging graph from our regular Coverity scans of the curl source code, showing that we’ve maintained a fairly low “defect density” over the last two years, staying way below the average density level.
defect density over timeClick the image to view it slightly larger.

Defect density is simply the number of found problems per 1,000 lines of code. As a little (and probably unfair) comparison, right now when curl is flat on 0, Firefox is at 0.47, c-ares at 0.12 and libssh2 at 0.21.

Coverity is still the primary static code analyzer for C code that I’m aware of. None of the flaws Coverity picked up in curl during the last two years were detected by clang-analyzer for example.