
How to DoH-only with Firefox

Firefox has supported DNS-over-HTTPS (aka DoH) since version 62.

You can instruct your Firefox to only use DoH and never fall back to the native resolver; this is the mode we call trr-only. Without any other ability to resolve host names, this is a little tricky, so this guide is here to help you. (This situation might improve in the future.)

In trr-only mode, nobody on your local network or at your ISP can snoop on your name resolves. The SNI part of HTTPS connections is still clear text though, so eavesdroppers on the path can still figure out which hosts you connect to.

There's a name in my URI

A primary problem for trr-only is that we usually want to use a host name in the URI for the DoH server (we typically need it to be a name so that we can verify the server's certificate against it), but we can't resolve that host name until DoH is set up to work. A catch-22.

There are currently two ways around this problem:

  1. Tell Firefox the IP address of the name that you use in the URI. We call it the "bootstrapAddress". See further below.
  2. Use a DoH server that is provided on an IP-number URI. This is rather unusual. There's for example one at 1.1.1.1.
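If you want to verify that such an IP-number DoH endpoint actually answers before pointing Firefox at it, here's a hedged sketch with curl. It assumes the Cloudflare resolver at 1.1.1.1 still exposes its JSON lookup interface on the /dns-query path:

$ # assumes 1.1.1.1 still serves JSON answers on /dns-query
$ curl -s -H 'accept: application/dns-json' \
  'https://1.1.1.1/dns-query?name=daniel.haxx.se&type=A'

Getting a JSON blob with A records back means the endpoint is reachable and resolving for you.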

Setup and use trr-only

There are three prefs to focus on (they're all explained elsewhere):

network.trr.mode - set this to the number 3.

network.trr.uri - set this to the URI of the DoH server you want to use. This should be a server you trust and want to hand over your name resolves to. The Cloudflare one we've previously used in DoH tests with Firefox is https://mozilla.cloudflare-dns.com/dns-query.

network.trr.bootstrapAddress - when you use a host name in the URI for the network.trr.uri pref, you must set this pref to an IP address that the host name resolves to for you. It is important that you pick an IP address that the name you use actually would resolve to.

Example

Let's pretend you want to go full trr-only and use a DoH server at https://example.com/dns. (it's a pretend URI, it doesn't work).

Figure out the bootstrapAddress with dig. Resolve the host name from the URI:

$ dig +short example.com
93.184.216.34

or if you prefer to be classy and use the IPv6 address (only do this if IPv6 is actually working for you)

$ dig -t AAAA +short example.com
2606:2800:220:1:248:1893:25c8:1946

dig might give you a whole list of addresses back, and then you can pick any one of them in the list. Only pick one address though.

Go to "about:config" and paste the copied IP address into the value field for network.trr.bootstrapAddress. Now TRR / DoH should be able to get going. When you can see web pages, you know it works!

DoH-only means only DoH

If you happen to start Firefox behind a captive portal while in trr-only mode, the connections to the DoH server will fail and no name resolves can be performed.

In those situations, Firefox's captive portal detector would normally trigger and show you the login page etc, but when no names can be resolved, the captive portal can't respond with a fake answer to the name lookup and redirect you to the login, so it won't get anywhere. It gets stuck. And currently, there's no good visual indication anywhere that this is what happens.

You simply can't get out of a captive portal with trr-only. You then probably have to temporarily switch mode, log in to the portal and switch the mode back to 3 again.

If you "unlock" the captive portal with another browser/system, Firefox's regular retries while in trr-only will soon detect that and things should start working again.

quic wg interim Kista

The IETF QUIC working group had its fifth interim meeting the other day, this time in Kista, Sweden hosted by Ericsson. For me as a Stockholm resident, this was ridiculously convenient. Not entirely coincidentally, this was also the first quic interim I attended in person.

We were thirty-something people gathered in a room without windows, with another dozen or so participants joining remotely. This being a meeting in a series, most people already knew each other from before, so the atmosphere was relaxed and friendly. Many of the participants have also been involved in other protocol development and standards work before. Many familiar faces.

Schedule

As QUIC is supposed to be done "soon", the emphasis is now very much on closing issues, postponing some stuff to "QUICv2" and making sure to get decisions on outstanding question marks.

Kazuho did a quick run-through with some info from the interop days prior to the meeting.

After MT's initial explanation of where we're at for the upcoming draft-13, Ian took us on a deep dive into the Stream 0 Design Team report. This is a pretty radical change to how the wire format of the quic protocol is laid out, and to how the TLS handshake is handled.

The existing draft-12 approach (shown in one slide) is suggested to instead become a new layout (shown in a second slide).

What's perhaps the most interesting takeaway here is that the new format doesn't use TLS records anymore - but it simplifies a lot of other things. Not using TLS records while still doing TLS means that a QUIC implementation needs to get data from the TLS layer using APIs that existing TLS libraries don't typically provide. PicoTLS, Minq, BoringSSL and NSS already have or will soon provide the necessary APIs. Slightly behind, OpenSSL should offer them in a nightly build soon, but the impression is that an actual OpenSSL release with them is still a bit away.

EKR continued the theme. He talked about the quic handshake flow and among other things explained how 0-RTT and early data work. Taken out of that context, I consider one of his slides fairly funny because it makes the handshake look far from simple to me. But it shows the communication in the different layers, how the acks go, etc.

HTTP

Mike then presented the state of HTTP over quic. The frames are no longer that similar to the HTTP/2 versions. Work is done to ensure that the HTTP layer doesn't need to refer or "grab" stream IDs from the transport layer.

There was a rather lengthy discussion around how to handle "placeholder streams" like the ones Firefox uses over HTTP/2 to create "anchors" on which to make dependencies but are never actually used over the wire. The nature of the quic transport makes those impractical and we talked about what alternatives there are that could still offer similar functionality.

The subject of priorities and dependencies and if the relative complexity of the h2 model should be replaced by something simpler came up (again) but was ultimately pushed aside.

QPACK

Alan presented the state of QPACK, the HTTP header compression algorithm for hq (HTTP over QUIC). It is not wire compatible with HPACK anymore and there have been some recent improvements and clarifications done.

Alan also did a great step-by-step walk-through of how QPACK works with adding headers to the dynamic table and how it works with its indices etc. It was very clarifying, I thought.

The discussion about the static table for the compression basically ended with us agreeing that we should just agree on a fairly small fixed table without a way to negotiate the table. Mark said he'd try to get some updated header data from some server deployments to get another data set than just the one from WPT (which is from a single browser).

Interop-testing of QPACK implementations can be done by encoding + shuffling + decoding a HAR file and comparing the results with the source data. Just do it - and talk to Alan!

And the first day was over. A fully packed day.

ECN

Magnus started off with some heavy stuff, talking about Explicit Congestion Notification in QUIC, how it is intended to work and some remaining issues.

He also got into the subject of ACK frequency and how the current model isn't ideal in every situation, which he illustrated with an image from his slide set.

Interestingly, it turned out that several of the implementers already basically had implemented Magnus' proposal of changing the max delay to min(RTT/4, 25 ms) independently of each other!

mvfst deployment

Subodh took us on a journey with some great insights from Facebook's internal deployment of mvfst, their QUIC implementation. Getting some real-life feedback is useful, and with over 100 billion requests/day, it seems they did give this a good run.

Since their usage and stack for this is a bit use case specific I'm not sure how relevant or universal their performance numbers are. They showed roughly the same CPU and memory use, with a 70% RPS rate compared to h2 over TLS 1.2.

He also entertained us with some "fun issues" from bugs and debugging sessions they've done and learned from. Awesome.

The story highlights the need for more tooling around QUIC to help developers and deployers.

Load balancers

Martin talked about load balancers and servers, and how they could or should communicate to work correctly with routing and connection IDs.

The room didn't seem overly thrilled about this work and mostly offered other ways to achieve the same results.

Implicit Open

The last session of the day and the entire meeting had MT going through a few things that still needed discussion or closure: stateless reset and the rather big bike shed issue, implicit open. The latter being the question of whether opening a stream with ID N + 1 implicitly also opens the stream with ID N. I believe we ended with a slight preference for the implicit approach and this will be taken to the list for a consensus call.

Frame type extensibility

How should the QUIC protocol allow extensibility? The oldest still-open issue in the project can be solved or satisfied in numerous different ways, and the discussion waved back and forth for a while, debating the merits and downsides of various approaches, until the group more or less agreed on a fairly simple and straightforward approach where extensions will announce support for a feature which then may or may not involve one or more new frame types (to be kept in a registry).

We proceeded to discuss other issues all until "closing time", which was set to 16:00 today. This was just two days of pushing forward but it still felt quite intense, and my personal impression is that a lot of good progress was made here that took the protocol a good step forward.

The facilities were lovely and Ericsson was a great host for us. The Thursday afternoon cakes were great! Thank you!

Coming up

There's an IETF meeting in Montreal in July and there's a planned next QUIC interim probably in New York in September.

Inside Firefox’s DOH engine

DNS over HTTPS (DOH) is a feature where a client shortcuts the standard native resolver and instead asks a dedicated DOH server to resolve names.

Compared to regular unprotected DNS lookups done over UDP or TCP, DOH increases privacy, security and sometimes even performance. It also makes it easy to use a name server of your choice for a particular application instead of the one configured globally (often by someone else) for your entire system.

DNS over HTTPS is quite simply the same regular DNS packets (RFC 1035 style) normally sent in clear-text over UDP or TCP but instead sent with HTTPS requests. Your typical DNS server provider (like your ISP) might not support this yet.

To get the finer details of this concept, check out Lin Clark's awesome cartoon explanation of DNS and DOH.

This new Firefox feature is planned to get ready and ship in Firefox release 62 (early September 2018). You can already test it in Firefox Nightly by setting the preferences manually as described below.

This article will explain some of the tweaks, inner details and the finer workings of the Firefox TRR implementation (TRR == Trusted Recursive Resolver) that speaks DOH.

Preferences

All preferences (go to "about:config") for this functionality are located under the "network.trr" prefix.

network.trr.mode - set which resolver mode you want.

0 - Off (default). Use standard native resolving only (don't use TRR at all)
1 - Race native against TRR. Do them both in parallel and go with the one that returns a result first.
2 - TRR first. Use TRR first, and only if the name resolve fails use the native resolver as a fallback.
3 - TRR only. Only use TRR. Never use the native (after the initial setup).
4 - Shadow mode. Runs the TRR resolves in parallel with the native for timing and measurements but uses only the native resolver results.
5 - Explicitly off. Also off, but selected off by choice and not default.

network.trr.uri - (default: none) set the URI for your DOH server. That's the URL Firefox will issue its HTTP request to. It must be an HTTPS URL (non-HTTPS URIs will simply be ignored). If "useGET" is enabled, Firefox will append "?ct&dns=...." to the URI when it makes its HTTP requests. For the default POST requests, they will be issued to exactly the specified URI.

"mode" and "uri" are the only two prefs required to set to activate TRR. The rest of them listed below are for tweaking behavior.

We list some publicly known DOH servers here. If you prefer to, it is easy to set up and run your own.

network.trr.credentials - (default: none) set credentials that will be used in the HTTP requests to the DOH end-point. It is the right side content, the value, sent in the Authorization: request header. Handy if you for example want to run your own public server and yet limit who can use it.

network.trr.wait-for-portal - (default: true) this boolean tells Firefox to first wait for the captive portal detection to signal "okay" before TRR is used.

network.trr.allow-rfc1918 - (default: false) set this to true to allow RFC 1918 private addresses in TRR responses. When set false, any such response will be considered a wrong response that won't be used.

network.trr.useGET - (default: false) When the browser issues a request to the DOH server to resolve host names, it can do that using POST or GET. By default Firefox will use POST, but by toggling this you can enforce GET to be used instead. The DOH spec says a server MUST support both methods. (A sketch after the last pref below shows what such a GET-style request can look like.)

network.trr.confirmationNS - (default: example.com) At startup, Firefox will first check an NS entry to verify that TRR works, before it gets enabled for real and used for name resolves. This preference sets which domain to check. The verification only checks for a positive answer, it doesn't actually care what the response data says.

network.trr.bootstrapAddress - (default: none) by setting this field to the IP address of the host name used in "network.trr.uri", you can bypass using the system native resolver for it. This avoids that initial (native) name resolve for the host name mentioned in the network.trr.uri pref.

network.trr.blacklist-duration - (default: 60) is the number of seconds a name will be kept in the TRR blacklist until it expires and can be tried again. The default duration is one minute. (Update: this has been cut down from previous longer defaults.)

network.trr.request-timeout - (default: 3000) is the number of milliseconds a request to and corresponding response from the DOH server is allowed to spend until considered failed and discarded.

network.trr.early-AAAA - (default: false) For each normal name resolve, Firefox issues one HTTP request for A entries and another for AAAA entries. The responses come back separately and can come in any order. If the A records arrive first, Firefox will - as an optimization - continue and use those addresses without waiting for the second response. If the AAAA records arrive first, Firefox will only continue and use them immediately if this option is set to true.
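Since the dns= parameter is just a base64url-encoded DNS query packet, you can fire off a GET-style lookup by hand. A hedged sketch, using the www.example.com type A query string from the DOH spec's examples against the Cloudflare endpoint mentioned earlier (the exact requirements of that endpoint may of course change over time):

$ # the dns= value is the base64url-encoded DNS query for www.example.com, type A
$ curl -s -H 'accept: application/dns-message' \
  'https://mozilla.cloudflare-dns.com/dns-query?ct&dns=q80BAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB' \
  | xxd | head -3

The response body is a raw DNS answer packet in the same wire format.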

Split-horizon and blacklist

With regular DNS, it is common to have clients in different places get different results back. This can be done since the servers know from where the request comes (which also enables quite a degree of spying) and they can then respond accordingly. When switching to another resolver with TRR, you may experience that you don't always get the same set of addresses back. At times, this causes problems.

As a precaution, Firefox features a system that detects if a name can't be resolved at all with TRR, and can then fall back and try again with just the native resolver (the so-called TRR-first mode). Ending up in this scenario is of course slower and leaks the name over clear-text UDP, but this safety mechanism exists to avoid users ending up in a black hole where certain sites can't be accessed. Names that cause such TRR failures are then put in an internal dynamic blacklist so that subsequent uses of those names automatically avoid DNS-over-HTTPS for a while (see the blacklist-duration pref to control that period). Of course this fall-back is not in use if TRR-only mode is selected.

In addition, if a host's address is retrieved via TRR and Firefox subsequently fails to connect to that host, it will redo the resolve without DOH and retry the connect again just to make sure that it wasn't a split-horizon situation that caused the problem.

When a host name is added to the TRR blacklist, its domain also gets checked in the background to see if that whole domain perhaps should be blacklisted to ensure a smoother ride going forward.

Additionally, "localhost" and all names in the ".local" TLD are sort of hard-coded as blacklisted and will never be resolved with TRR. (Unless you run TRR-only...)

TTL as a bonus!

With the implementation of DNS-over-HTTPS, Firefox now gets the TTL (Time To Live, how long a record is valid) value for each DNS address record and can store and use that for expiry time in its internal DNS cache. Having accurate lifetimes improves the cache as it then knows exactly how long the name is meant to work and means less guessing and heuristics.

When using the native name resolver functions, this time-to-live data is normally not provided, and Firefox does in fact not use the TTL on platforms other than Windows, and on Windows it has to perform some rather awkward quirks to get the TTL from DNS for each record.

Server push

Still left to see how useful this will become in real life, but DOH servers can push new or updated DNS records to Firefox. HTTP/2 server push means responses to requests the client didn't send, but that the server thinks the client might appreciate anyway, as if it had sent requests for those resources.

These pushed DNS records will be treated as regular name resolve responses and feed the Firefox in-memory DNS cache, making subsequent resolves of those names happen instantly.

Bootstrap

You specify the DOH service as a full URI with a name that needs to be resolved, and in a cold start Firefox won't know the IP address of that name and thus needs to resolve it first (or use the provided address you can set with network.trr.bootstrapAddress). Firefox will then use the native resolver for that, until TRR has proven itself to work by resolving the network.trr.confirmationNS test domain. Firefox will also by default wait for the captive portal check to signal "OK" before it uses TRR, unless you tell it otherwise.

As a result of this bootstrap procedure, and if you're not in TRR-only mode, you might still get a few native name resolves done at initial Firefox startups. Just telling you this so you don't panic if you see a few show up.

CNAME

The code is aware of CNAME records and will "chase" them down and use the final A/AAAA entry with its TTL as if there were no CNAMEs present and store that in the in-memory DNS cache. This initial approach, at least, does not cache the intermediate CNAMEs nor does it care about the CNAME TTL values.
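For comparison, this is the kind of chain that gets chased. The records below are made up for illustration, using reserved example names and addresses:

$ # made-up output, for illustration only
$ dig +noall +answer www.example.org
www.example.org.     300  IN  CNAME  cdn.example.net.
cdn.example.net.      60  IN  A      192.0.2.10

In a case like this, Firefox would store 192.0.2.10 for www.example.org using the 60 second TTL of the final A record, and the intermediate CNAME would not be cached at all.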

Firefox currently allows no more than 64(!) levels of CNAME redirections.

about:networking

Enter that address in the Firefox URL bar to reach the debug screen with a bunch of networking information. If you then click the DNS entry in the left menu, you'll get to see the contents of Firefox's in-memory DNS cache. The TRR column says true or false for each name if that was resolved using TRR or not. If it wasn't, the native resolver was used instead for that name.

Private Browsing

When in private browsing mode, DOH behaves similarly to regular name resolves: it keeps DNS cache entries separate from the regular ones, and the TRR blacklist is then only kept in memory and not persisted to disk. The DNS cache is flushed when the last PB session is exited.

Tools

I wrote up dns2doh, a little tool to create DOH requests and responses with, that can be used to build your own toy server and to generate requests to send with curl or similar.

It allows you to manually issue a type A (regular IPv4 address) DOH request like this:

$ dns2doh --A --onlyq --raw daniel.haxx.se | \
curl --data-binary @- \
https://dns.cloudflare.com/.well-known/dns \
-H "Content-Type: application/dns-udpwireformat"

I also wrote doh, which is a small stand-alone tool (based on libcurl) that issues requests for the A and AAAA records of a given host name from the given DOH URI.

Why HTTPS

Some people giggle and think of this as a massive layer violation. Maybe it is, but doing DNS over HTTPS makes a lot of sense compared to for example using plain TLS:

  1. We get transparent and proxy support "for free"
  2. We get multiplexing and the use of persistent connections from the get go (this can be supported by DNS-over-TLS too, depending on the implementation)
  3. Server push is a potential real performance booster
  4. Browsers often already have a lot of existing HTTPS connections to the same CDNs that might offer DOH.

Further explained in Patrick McManus' The Benefits of HTTPS for DNS.

It still leaks the SNI!

Yes, the Server Name Indication field in the TLS handshake is still clear-text, but we hope to address that as well in the future with efforts like encrypted SNI.

Bugs?

File bug reports in Bugzilla! (in "Core->Networking:DNS" please)

If you want to enable logging and see what TRR is doing, set the MOZ_LOG environment variable to the module and level "nsHostResolver:5". The TRR implementation source code in Firefox lives in netwerk/dns.
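For example, from a shell you can start Firefox like this and then watch the log file grow (the file path is of course just an example):

$ MOZ_LOG=nsHostResolver:5 MOZ_LOG_FILE=/tmp/trr.log firefox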

Caveats

Credits

While I have written most of the Firefox TRR implementation, I've been greatly assisted by Patrick McManus, Valentin Gosu, Nick Hurley and others in the Firefox Necko team.

DOH in curl?

Since I am also the lead developer of curl, people have asked. The work on DOH for curl has not really started yet, but I've collected some thoughts on how DNS-over-HTTPS could be implemented in curl, and the doh tool I mentioned above has the basic function blocks already written.

Other efforts to enhance DNS security

There have been other DNS-over-HTTPS protocols and efforts. Recently there was one offered by at least Google that was a JSON style API. That's different.

There's also DNS-over-TLS which shares some of the DOH characteristics, but lacks for example the nice ability to work through proxies, do multiplexing and share existing connections with standard web traffic.

DNScrypt is an older effort that encrypts regular DNS packets and sends them over UDP or TCP.

Inspect curl’s TLS traffic

For a long time now, the venerable network analyzer tool Wireshark has provided a way to decrypt and inspect TLS traffic when sent and received by Firefox and Chrome.

You do this by making the browser tell Wireshark the SSL secrets:

  1. Set the environment variable named SSLKEYLOGFILE to a file name of your choice before you start the browser, as sketched after this list.
  2. Set the same file name path in the Master-Secret field in Wireshark: go to Preferences->Protocols->SSL and edit the path there.
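Step 1 can look like this in a shell (the file name is just an example; make sure Wireshark is pointed at the very same path):

$ export SSLKEYLOGFILE=$HOME/tls-keys.txt
$ firefox &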

Having done this simple operation, you can now inspect your browser's HTTPS traffic in Wireshark. Just super handy and awesome.

Just remember that if you record TLS traffic and want to save it for analyzing later, you need to also save the file with the secrets so that you can decrypt that traffic capture at a later time as well.

curl

Adding curl to the mix: curl can be built with a dozen different TLS libraries and not just a single one as the browsers do. That complicates matters a bit.

The NSS library, for example, which is the TLS library curl is typically built with on Red Hat and CentOS, handles the SSLKEYLOGFILE magic all by itself, so by extension you have been able to do this trick with curl for a long time - as long as you use curl built with NSS. A pretty good argument for using that build, really.

Since curl version 7.57.0 the SSLKEYLOGFILE feature can also be enabled when built with GnuTLS, BoringSSL or OpenSSL. In the latter two libs, the feature is powered by new APIs in those libraries, while in GnuTLS it is done with the library's own logic, similar to how NSS does it. Since OpenSSL is by far the most popular TLS backend for curl, this feature is now brought to users much more widely.

In curl 7.58.0 (due to ship on January 24, 2018), this feature is built by default also for curl with OpenSSL, while in 7.57.0 you need to define ENABLE_SSLKEYLOGFILE to enable it for OpenSSL and BoringSSL.

And what's even cooler? This feature is at the same time also brought to every single application out there that is built against this or later versions of libcurl. In one single blow, a whole world suddenly opens up to make it easier for you to debug, diagnose and analyze your applications' TLS traffic when powered by libcurl!

As in the description above for browsers, you:

  1. set the environment variable SSLKEYLOGFILE to a file name to store the secrets in
  2. tell Wireshark to use that same file to find the TLS secrets (Preferences->Protocols->SSL), as described above
  3. run the libcurl-using application (such as curl) and Wireshark will be able to inspect TLS-based protocols just fine!
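A minimal sketch of steps 1 and 3 with the curl tool, assuming your curl build has the key log feature enabled:

$ SSLKEYLOGFILE=/tmp/tls-keys.txt curl -s https://daniel.haxx.se/ -o /dev/null
$ # Wireshark can then decrypt a capture of that transfer using /tmp/tls-keys.txt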

trace options

Of course, as a lightweight alternative, you may opt to use the --trace or --trace-ascii options with the curl tool and be fully satisfied with that. Using those command line options, curl will log everything sent and received at the protocol layer without the TLS applied. With HTTPS you'll see all the HTTP traffic, for example.
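For example, this stores the full decrypted protocol exchange in dump.txt for you to read afterward:

$ curl --trace-ascii dump.txt https://daniel.haxx.se/ -o /dev/null
$ less dump.txt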

Credits

Most of the curl work to enable this feature was done by Peter Wu and Ray Satiro.

Firefox Quantum

Next week, Mozilla will release Firefox 57. Also referred to as Firefox Quantum, from the project name we've used for all the work that has been put into making this the most awesome Firefox release ever. This is underscored by the fact that I've gotten mailed release-swag for the first time during my four years so far as a Mozilla employee.

Firefox 57 is the major milestone hundreds of engineers have worked really hard toward during the last year or so, and most of the efforts have been focused on performance. Or perhaps perceived end user snappiness. Early comments I've read and heard also hint that it is indeed quite notable. I think every single Mozilla engineer (and most non-engineers as well) has contributed to at least some part of this, and of course many have done a lot. My personal contributions to 57 are not much to write home about, but are mostly a stream of minor things that combined at least move the notch forward.

[edited out some secrets I accidentally leaked here.] I'm a proud Mozillian and being part of a crowd that has put together something as grand as Firefox 57 is an honor and a privilege.

Releasing a product to hundreds of millions of end users across the world is interesting. People get accustomed to things, get emotional and don't particularly like change very much. I'm sure Firefox 57 will also get a fair share of sour feedback and comments written in uppercase. That's inevitable. But sometimes, in order to move forward and do good stuff, we have to make some tough decisions for the greater good that not everyone will agree with.

This is however not the end of anything. It is rather the beginning of a new Firefox. The work on future releases goes on; we will continue to improve the web experience for users all over the world. Firefox 58 will have even more goodies, and I know there is much more good stuff planned for the releases coming in 2018 too...

Onwards and upwards!

(Update: as I feared in this text, I got a lot of negativism, vitriol and criticism in the comments to this post. So much that I decided to close down comments for this entry and delete the worst entries.)

Talk: web transport, today and tomorrow

At the Netnod spring meeting 2017 in Stockholm on the 5th of April I did a talk with the title of this post.

Why was HTTP/2 introduced, how well has HTTP/2 been deployed and used, did it deliver on its promises, and where does HTTP/2 not perform as well? Then a quick (haha) overview of what QUIC is and how it intends to fix some of the shortcomings of HTTP/2 and TCP. In 28 minutes.

One URL standard please

Following up on the problem with our current lack of a universal URL standard that I blogged about in May 2016: My URL isn't your URL. I want a single, unified URL standard that we would all stand behind, support and adhere to.

What triggers me this time, is yet another issue. A friendly curl user sent me this URL:

http://user@example.com:80@daniel.haxx.se

... and pasting this URL into different tools and browsers show that there's not a wide agreement on how this should work. Is the URL legal in the first place and if so, which host should a client contact?

  • curl treats the '@'-character as a separator between userinfo and host name, so 'example.com' becomes the host name, the port number is 80, followed by rubbish that curl ignores; the sketch after this list shows how to check this yourself. (wget2, the next-gen wget that's in development, works identically)
  • wget extracts the example.com host name but rejects the port number due to the rubbish after the zero.
  • Edge and Safari say the URL is invalid and don't go anywhere
  • Firefox and Chrome allow '@' as part of the userinfo, take the '80' as a password and the host name then becomes 'daniel.haxx.se'
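If you want to check what your own curl version does with it, here's a quick sketch; the verbose output shows which host it actually tries to connect to:

$ curl -v --connect-timeout 2 "http://user@example.com:80@daniel.haxx.se" -o /dev/null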

The only somewhat modern "spec" for URLs is the WHATWG URL specification. The other major, but now somewhat aged, URL spec is RFC 3986, made by the IETF and published in 2005.

In 2015, URL problem statement and directions was published as an Internet-draft by Masinter and Ruby and it brings up most of the current URL spec problems. Some of them are also discussed in Ruby's WHATWG URL vs IETF URI post from 2014.

What I would like to see happen...

Which group? A group!

Friends I know in the WHATWG suggest that I should dig in there and help them improve their spec. That would be a good idea if fixing the WHATWG spec would be the ultimate goal. I don't think it is enough.

The WHATWG is highly browser focused, and the interactions I have had with members of that group in the past have shown that there is little sympathy there for non-browsers that want to deal with URLs, and even less sympathy or interest for URL schemes that the popular browsers don't even support or care about. URLs cover much more than HTTP(S).

I have the feeling that WHATWG people would not like this work to be done within the IETF and vice versa. Since I'd like buy-in from both camps, and any other camps that might have an interest in URLs, this would need to be handled somehow.

It would also be great to get other major URL "consumers" on board, like authors of popular URL parsing libraries, tools and components.

Such a URL group would of course have to agree on the goal and how to get there, but I'll still provide some additional things I want to see.

Update: I want to emphasize that I do not consider the WHATWG's job bad, wrong or lost. I think they've done a great job at unifying browsers' treatment of URLs. I don't mean to belittle that. I just know that this group is only a small subset of the people who probably should be involved in a unified URL standard.

A single fixed spec

I can't see any compelling reasons why a URL specification couldn't reach a stable state and get published as *the* URL standard. The "living standard" approach may be fine for certain things (and in particular for browsers that update every six weeks), but URLs are supposed to be long-lived and interoperate far into the future, so they really, really should not change. Therefore, I think the IETF documentation model could work well for this.

The WHATWG spec documents what browsers do, and browsers do what is documented. At least that's the theory I've been told, and it causes a spinning and never-ending loop that goes against my wish.

Document the format

The WHATWG specification is written in a pseudo code style, describing how a parser would "walk" over the string with a state machine and all. I know some people like that, I find it utterly annoying and really hard to figure out what's allowed or not. I much more prefer the regular RFC style of describing protocol syntax.

IDNA

Can we please just say that host names in URLs should be handled according to IDNA2008 (RFC 5895)? WHATWG URL doesn't state any IDNA spec number at all.

Move out irrelevant sections

"Irrelevant" when it comes to documenting the URL format that is. The WHATWG details several things that are related to URL for browsers but are mostly irrelevant to other URL consumers or producers. Like section "5. application/x-www-form-urlencoded" and "6. API".

They would be better placed in a "URL considerations for browsers" companion document.

Working doesn't imply sensible

So browsers accept URLs written with thousands of forward slashes instead of two. That is not a good reason for the spec to say that a URL may legitimately contain a thousand slashes. I'm totally convinced there's no critical content anywhere using such formatted URLs, and no soul will be sad if we restricted the number to a single digit. So we should. And yeah, then browsers should reject URLs using more.

The slashes are only an example. The browsers have used a "liberal in what you accept" policy for a lot of things since forever, but we must resist using that as a basis when nailing down a standard.

The odds of this happening soon?

I know there are individuals interested in seeing the URL situation getting worked on. We've seen articles and internet-drafts posted on the issue several times the last few years. Any year now I think we will see some movement for real trying to fix this. I hope I will manage to participate and contribute a little from my end.

HTTPS proxy with curl

Starting in version 7.52.0 (due to ship December 21, 2016), curl will support HTTPS proxies when doing network transfers, and by doing this it joins the small exclusive club of HTTP user-agents consisting of Firefox, Chrome and not too many others.

Yes you read this correctly. This is different than the good old HTTP proxy.

HTTPS proxy means that the client establishes a TLS connection to the proxy and then communicates over that, which is different to the normal and traditional HTTP proxy approach where the clients speak plain HTTP to the proxy.

Talking HTTPS to your proxy is a privacy improvement as it prevents people from snooping on your proxy communication. Even when using HTTPS over a standard HTTP proxy, there's typically a setup phase first that leaks information about where the connection is being made, user credentials and more. Not to mention that an HTTPS proxy makes HTTP traffic "safe" to and from the proxy. HTTPS to the proxy also enables clients to speak HTTP/2 more easily with proxies. (Even though HTTP/2 to the proxy is not yet supported in curl.)

In the case where a client wants to talk HTTPS to a remote server while using an HTTPS proxy, it sends HTTPS through HTTPS.

Illustrating this concept: when using a traditional HTTP proxy, we connect initially to the proxy with HTTP in the clear, and from then on the HTTPS makes it safe (HTTP proxy diagram). Compare that with the HTTPS proxy case, where the connection is safe already in the first step (HTTPS proxy diagram).

The access to the proxy is made over network A. That network has traditionally been a corporate network or a LAN or something, but we're seeing more and more use cases where the proxy is somewhere on the Internet and then "Network A" is really huge. That includes use cases where the proxy for example compresses images or otherwise reduces bandwidth requirements.

Actual HTTPS connections from clients to servers are still done end-to-end encrypted even in the HTTP proxy case. Plain HTTP traffic between the user and the web site will, however, still be HTTPS-protected between the client and the proxy when an HTTPS proxy is used.

A complicated pull request

This awesome work was provided by Dmitry Kurochkin, Vasy Okhin, and Alex Rousskov. It was merged into master on November 24 in this commit.

Doing this sort of major change in the TLS area of the curl code is a massive undertaking, not least because curl supports getting built with one out of 11 or 12 different TLS libraries. Several of those are also system-specific, so hardly any single developer can even build all these backends on his or her own machines.

In addition to the TLS backend maze, the curl tool and library also offer a huge amount of different options to control the TLS connection and handling. You can switch features on and off, provide certificates, CA bundles and more. Adding another layer of TLS pretty much doubles the number of options, since now you can tweak everything both in the TLS connection to the proxy as well as in the one to the remote peer.

This new feature is supported with the OpenSSL, GnuTLS and NSS backends to start with.

Consider it experimental for now

By all means, go ahead and use it and torture the code and file issues for everything bad you see, but I think we make ourselves a service by considering this new feature set to be a bit experimental in this release.

New options

There's a whole forest of new command line and libcurl options to control all the various aspects of the new TLS connection this introduces. Since it is a totally separate connection, it gets a whole set of options that are basically identical to the ones for the server connection but with a --proxy prefix instead. Here's a list, with a small usage sketch after it:

  --proxy-cacert 
  --proxy-capath
  --proxy-cert
  --proxy-cert-type
  --proxy-ciphers
  --proxy-crlfile
  --proxy-insecure
  --proxy-key
  --proxy-key-type
  --proxy-pass
  --proxy-ssl-allow-beast
  --proxy-sslv2
  --proxy-sslv3
  --proxy-tlsv1
  --proxy-tlsuser
  --proxy-tlspassword
  --proxy-tlsauthtype
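As a usage sketch, here's what talking to an HTTPS proxy could look like. The proxy host name and CA file are hypothetical, and this assumes a proxy that actually terminates TLS on its listening port:

$ # proxy.example.com and proxy-ca.pem are made-up placeholders
$ curl --proxy https://proxy.example.com:443/ \
  --proxy-cacert proxy-ca.pem \
  https://daniel.haxx.se/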

curl and TLS 1.3

Draft 18 of the TLS version 1.3 spec was published at the end of October 2016.

Already now, both Firefox and Chrome have test versions out with TLS 1.3 enabled. Firefox 52 will have it by default, and while Chrome will ship it, I couldn't figure out exactly when we can expect it to be there by default.

Over the last few days we've merged TLS 1.3 support to curl, primarily in this commit by Kamil Dudka. Both the command line tool and libcurl will negotiate TLS 1.3 in the next version (7.52.0 - planned release date at the end of December 2016) if built with a TLS library that supports it and told to do it by the user.

The two TLS libraries that will speak TLS 1.3 when built with curl right now are NSS and BoringSSL. The plan is to gradually adjust curl over time as the other libraries start to support 1.3 as well. As always we will appreciate your help in making this happen!

Right now, there's also the minor flux in that servers and clients may end up running implementations of different draft versions of the TLS spec which contributes to a layer of extra fun!

Three current TLS 1.3 test servers to play with: https://enabled.tls13.com/ , https://www.allizom.org/ and https://tls13.crypto.mozilla.org/. I doubt any of these will give you any guarantees of functionality.
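If your curl is built with one of those backends, a quick hedged test against one of the servers above could look like this; check the verbose output for which TLS version actually got negotiated:

$ curl -v --tlsv1.3 https://enabled.tls13.com/ -o /dev/null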

TLS 1.3 offers a few new features that allow clients such as curl to do subsequent TLS connections much faster, with only 1 or even 0 RTTs, but curl has no code for any of those features yet.

1,000,000 sites run HTTP/2

... out of the top ten million sites that is. So there's at least that many, quite likely a few more.

This is according to w3techs who runs checks daily. Over the last few months, there have been about 50,000 new sites per month switching it on.


It also shows that the HTTP/2 ratio has increased from a little over 1% deployment a year ago to 10% today.

HTTP/2 gets used more the more popular a site is. Among the top 1,000 sites on the web, more than 20% of them use HTTP/2. HTTP/2 also just recently (September 9) overtook SPDY among the 1,000 most popular sites.


On September 7, Amazon announced their CloudFront service having enabled HTTP/2, which could explain an adoption boost over the last few days. New CloudFront users get it enabled by default but existing users actually need to go in and click a checkbox to make it happen.

As the web traffic of the world is severely skewed toward the top ones, we can be sure that a significantly larger share than 10% of the world's HTTPS traffic is using version 2.

Recent usage stats in Firefox shows that HTTP/2 is used in half of all its HTTPS requests!
