Tag Archives: IETF

One URL standard please

Following up on the problem with our current lack of a universal URL standard that I blogged about in May 2016: My URL isn’t your URL. I want a single, unified URL standard that we would all stand behind, support and adhere to.

What triggers me this time, is yet another issue. A friendly curl user sent me this URL:

http://user@example.com:80@daniel.haxx.se

… and pasting this URL into different tools and browsers show that there’s not a wide agreement on how this should work. Is the URL legal in the first place and if so, which host should a client contact?

  • curl treats the ‘@’-character as a separator between userinfo and host name so ‘example.com’ becomes the host name, the port number is 80 followed by rubbish that curl ignores. (wget2, the next-gen wget that’s in development works identically)
  • wget extracts the example.com host name but rejects the port number due to the rubbish after the zero.
  • Edge and Safari say the URL is invalid and don’t go anywhere
  • Firefox and Chrome allow ‘@’ as part of the userinfo, take the ’80’ as a password and the host name then becomes ‘daniel.haxx.se’

The only somewhat modern “spec” for URLs is the WHATWG URL specification. The other major, but now somewhat aged, URL spec is RFC 3986, made by the IETF and published in 2005.

In 2015, URL problem statement and directions was published as an Internet-draft by Masinter and Ruby and it brings up most of the current URL spec problems. Some of them are also discussed in Ruby’s WHATWG URL vs IETF URI post from 2014.

What I would like to see happen…

Which group? A group!

Friends I know in the WHATWG suggest that I should dig in there and help them improve their spec. That would be a good idea if fixing the WHATWG spec would be the ultimate goal. I don’t think it is enough.

The WHATWG is highly browser focused and my interactions with members of that group that I have had in the past, have shown that there is little sympathy there for non-browsers who want to deal with URLs and there is even less sympathy or interest for URL schemes that the popular browsers don’t even support or care about. URLs cover much more than HTTP(S).

I have the feeling that WHATWG people would not like this work to be done within the IETF and vice versa. Since I’d like buy-in from both camps, and any other camps that might have an interest in URLs, this would need to be handled somehow.

It would also be great to get other major URL “consumers” on board, like authors of popular URL parsing libraries, tools and components.

Such a URL group would of course have to agree on the goal and how to get there, but I’ll still provide some additional things I want to see.

Update: I want to emphasize that I do not consider the WHATWG’s job bad, wrong or lost. I think they’ve done a great job at unifying browsers’ treatment of URLs. I don’t mean to belittle that. I just know that this group is only a small subset of the people who probably should be involved in a unified URL standard.

A single fixed spec

I can’t see any compelling reasons why a URL specification couldn’t reach a stable state and get published as *the* URL standard. The “living standard” approach may be fine for certain things (and in particular browsers that update every six weeks), but URLs are supposed to be long-lived and inter-operate far into the future so they really really should not change. Therefore, I think the IETF documentation model could work well for this.

The WHATWG spec documents what browsers do, and browsers do what is documented. At least that’s the theory I’ve been told, and it causes a spinning and never-ending loop that goes against my wish.

Document the format

The WHATWG specification is written in a pseudo code style, describing how a parser would “walk” over the string with a state machine and all. I know some people like that, I find it utterly annoying and really hard to figure out what’s allowed or not. I much more prefer the regular RFC style of describing protocol syntax.

IDNA

Can we please just say that host names in URLs should be handled according to IDNA2008 (RFC 5895)? WHATWG URL doesn’t state any IDNA spec number at all.

Move out irrelevant sections

“Irrelevant” when it comes to documenting the URL format that is. The WHATWG details several things that are related to URL for browsers but are mostly irrelevant to other URL consumers or producers. Like section “5. application/x-www-form-urlencoded” and “6. API”.

They would be better placed in a “URL considerations for browsers” companion document.

Working doesn’t imply sensible

So browsers accept URLs written with thousands of forward slashes instead of two. That is not a good reason for the spec to say that a URL may legitimately contain a thousand slashes. I’m totally convinced there’s no critical content anywhere using such formatted URLs and no soul will be sad if we’d restricted the number to a single-digit. So we should. And yeah, then browsers should reject URLs using more.

The slashes are only an example. The browsers have used a “liberal in what you accept” policy for a lot of things since forever, but we must resist to use that as a basis when nailing down a standard.

The odds of this happening soon?

I know there are individuals interested in seeing the URL situation getting worked on. We’ve seen articles and internet-drafts posted on the issue several times the last few years. Any year now I think we will see some movement for real trying to fix this. I hope I will manage to participate and contribute a little from my end.

First QUIC interim – in Tokyo

The IETF working group QUIC has its first interim meeting in Tokyo Japan for three days. Day one is today, January 24th 2017.

As I’m not there physically, I attend the meeting from remote using the webex that’s been setup for this purpose, and I’ll drop in a little screenshot below from one of the discussions (click it for hires) to give you a feel for it. It shows the issue being discussed and the camera view of the room in Tokyo. I run the jabber client on a different computer which allows me to also chat with the other participants. It works really well, both audio and video are quite crisp and understandable.

Japan is eight hours ahead of me time zone wise, so this meeting  runs from 01:30 until 09:30 Central European Time. That’s less comfortable and it may cause me some troubles to attend the entire thing.

On QUIC

We started off at once with a lot of discussions on basic issues. Versioning and what a specific version actually means and entails. Error codes and how error codes should be used within QUIC and its different components. Should the transport level know about priorities or shouldn’t it? How is the security protocol decided?

Everyone who is following the QUIC issues on github knows that there are plenty of people with a lot of ideas and thoughts on these matters and this meeting shows this impression is real.

For a casual bystander, you might’ve been fooled into thinking that because Google already made and deployed QUIC, these issues should be if not already done and decided, at least fairly speedily gone over. But nope. I think there are plenty of indications already that the protocol outputs that will come in the end of this process, the IETF QUIC will differ from the Google QUIC in a fair number of places.

The plan is that the different QUIC drafts (there are at least 4 different planned RFCs as they’re divided right now) should all be “done” during 2018.

(At 4am, the room took lunch and I wrote this up.)

My first 20 years of HTTP

During the autumn 1996 I took my first swim in the ocean known as HTTP. Twenty years ago now.

I had previously worked with writing an IRC bot in C, and IRC is a pretty simple text based protocol over TCP so I could use some experiences from that when I started to look into HTTP. That IRC bot was my first real application distributed to the world that was using TCP/IP. It was portable to most unixes and Amiga and it was open source.

1996 was the year the movie Independence Day premiered and the single hit song that plagued the world more than others that year was called Macarena. AOL, Webcrawler and Netscape were the most popular websites on the Internet. There were less than 300,000 web sites on the Internet (compared to some 900 million today).

I decided I should spice up the bot and make it offer a currency exchange rate service so that people who were chatting could ask the bot what 200 SEK is when converted to USD or what 50 AUD might be in DEM. – Right, there was no Euro currency yet back then!

I simply had to fetch the currency rates at a regular interval and keep them in the same server that ran the bot. I just needed a little tool to download the rates over HTTP. How hard can that be? I googled around (this was before Google existed so that was not the search engine I could use!) and found a tool named ‘httpget’ that made pretty much what I wanted. It truly was tiny – a few hundred nokia-1610lines of code.

I don’t have an exact date saved or recorded for when this happened, only the general time frame. You know, we had no smart phones, no Google calendar and no digital cameras. I sported my first mobile phone back then, the sexy Nokia 1610 – viewed in the picture on the right here.

The HTTP/1.0 RFC had just recently came out – which was the first ever real spec published for HTTP. RFC 1945 was published in May 1996, but I was blissfully unaware of the youth of the standard and I plunged into my little project. This was the first published HTTP spec and it says:

HTTP has been in use by the World-Wide Web global information initiative since 1990. This specification reflects common usage of the protocol referred too as "HTTP/1.0". This specification describes the features that seem to be consistently implemented in most HTTP/1.0 clients and servers.

Many years after that point in time, I have learned that already at this time when I first searched for a HTTP tool to use, wget already existed. I can’t recall that I found that in my searches, and if I had found it maybe history would’ve made a different turn for me. Or maybe I found it and discarded for a reason I can’t remember now.

I wasn’t the original author of httpget; Rafael Sagula was. But I started contributing fixes and changes and soon I was the maintainer of it. Unfortunately I’ve lost my emails and source code history from those earliest years so I cannot easily show my first steps. Even the oldest changelogs show that we very soon got help and contributions from users.

The earliest saved code archive I have from those days, is from after we had added support for Gopher and FTP and renamed the tool ‘urlget’. urlget-3.5.zip was released on January 20 1998 which thus was more than a year later my involvement in httpget started.

The original httpget/urlget/curl code was stored in CVS and it was licensed under the GPL. I did most of the early development on SunOS and Solaris machines as my first experiments with Linux didn’t start until 97/98 something.

sparcstation-ipc

The first web page I know we have saved on archive.org is from December 1998 and by then the project had been renamed to curl already. Roughly two years after the start of the journey.

RFC 2068 was the first HTTP/1.1 spec. It was released already in January 1997, so not that long after the 1.0 spec shipped. In our project however we stuck with doing HTTP 1.0 for a few years longer and it wasn’t until February 2001 we first started doing HTTP/1.1 requests. First shipped in curl 7.7. By then the follow-up spec to HTTP/1.1, RFC 2616, had already been published as well.

The IETF working group called HTTPbis was started in 2007 to once again refresh the HTTP/1.1 spec, but it took me a while until someone pointed out this to me and I realized that I too could join in there and do my part. Up until this point, I had not really considered that little me could actually participate in the protocol doings and bring my views and ideas to the table. At this point, I learned about IETF and how it works.

I posted my first emails on that list in the spring 2008. The 75th IETF meeting in the summer of 2009 was held in Stockholm, so for me still working  on HTTP only as a spare time project it was very fortunate and good timing. I could meet a lot of my HTTP heroes and HTTPbis participants in real life for the first time.

I have participated in the HTTPbis group ever since then, trying to uphold the views and standpoints of a command line tool and HTTP library – which often is not the same as the web browsers representatives’ way of looking at things. Since I was employed by Mozilla in 2014, I am of course now also in the “web browser camp” to some extent, but I remain a protocol puritan as curl remains my first “child”.

curl wants to QUIC

The interesting Google transfer protocol that is known as QUIC is being passed through the IETF grinding machines to hopefully end up with a proper “spec” that has been reviewed and agreed to by many peers and that will end up being a protocol that is thoroughly documented with a lot of protocol people’s consensus. Follow the IETF QUIC mailing list for all the action.

I’d like us to join the fun

Similarly to how we implemented HTTP/2 support early on for curl, I would like us to get “on the bandwagon” early for QUIC to be able to both aid the protocol development and serve as a testing tool for both the protocol and the server implementations but then also of course to get us a solid implementation for users who’d like a proper QUIC capable client for data transfers.

implementations

The current version (made entirely by Google and not the output of the work they’re now doing on it within the IETF) of the QUIC protocol is already being widely used as Chrome speaks it with Google’s services in preference to HTTP/2 and other protocol options. There exist only a few other implementations of QUIC outside of the official ones Google offers as open source. Caddy offers a separate server implementation for example.

the Google code base

For curl’s sake, it can’t use the Google code as a basis for a QUIC implementation since it is C++ and code used within the Chrome browser is really too entangled with the browser and its particular environment to become very good when converted into a library. There’s a libquic project doing exactly this.

for curl and others

The ideal way to implement QUIC for curl would be to create “nghttp2” alternative that does QUIC. An ngquic if you will! A library that handles the low level protocol fiddling, the binary framing etc. Done that way, a QUIC library could be used by more projects who’d like QUIC support and all people who’d like to see this protocol supported in those tools and libraries could join in and make it happen. Such a library would need to be written in plain C and be suitably licensed for it to be really interesting for curl use.

a needed QUIC library

I’m hoping my post here will inspire someone to get such a project going. I will not hesitate to join in and help it get somewhere! I haven’t started such a project myself because I think I already have enough projects on my plate so I fear I wouldn’t be a good leader or maintainer of a project like this. But of course, if nobody else will do it I will do it myself eventually. If I can think of a good name for it.

some wishes for such a library

  • Written in C, to offer the same level of portability as curl itself and to allow it to get used as extensions by other languages etc
  • FOSS-licensed suitably
  • It should preferably not “own” the socket but also work in-memory and to allow applications to do many parallel connections etc.
  • Non-blocking. It shouldn’t wait for things on its own but let the application do that.
  • Should probably offer both client and server functionality for maximum use.
  • What else?

TCP tuning for HTTP

I’m the author of a brand new internet-draft that I submitted just the other day. The title is TCP Tuning for HTTP,  and the intent is to gather a set of current best practices for HTTP implementers; to share and distribute knowledge we’ve gathered over the years. Clients, servers and intermediaries. For HTTP/1.1 as well as HTTP/2.

I’m now awaiting, expecting and looking forward to feedback, criticisms and additional content for this document so that it can become the resource I’d like it to be.

How to contribute to this?

  1.  ideally, send your feedback to the HTTPbis mailing list,
  2. or submit an issue or pull-request on github for the draft.md
  3. or simply email me your comments: daniel <at> haxx.se

I’ve been participating first passively and more and more actively over the years within the IETF, mostly in the HTTPbis working group. I think open protocols and open standards are important and I like being part of making them reality. I have the utmost respect and admiration for those who are involved in putting the RFCs together and thus improve the world we live in, step by step.

For a long while I’ve been wanting  to step up and “pull my weight” too,  to become a better participant in this area, and I’m happy to now finally take this step. Hopefully this is just the first step of many more to come.

(Psssst: While gathering feedback and updating the git version, the current work in progress version of the draft is always visible here.)

RFC 7540 is HTTP/2

HTTP/2 is the new protocol for the web, as I trust everyone reading my blog are fully aware of by now. (If you’re not, read http2 explained.)

Today RFC 7540 was published, the final outcome of the years of work put into this by the tireless heroes in the HTTPbis working group of the IETF. Closely related to the main RFC is the one detailing HPACK, which is the header compression algorithm used by HTTP/2 and that is now known as RFC 7541.

The IETF part of this journey started pretty much with Mike Belshe’s posting of draft-mbelshe-httpbis-spdy-00 in February 2012. Google’s SPDY effort had been going on for a while and when it was taken to the httpbis working group in IETF, where a few different proposals on how to kick off the HTTP/2 work were debated.

HTTP team working in LondonThe first “httpbis’ified” version of that document (draft-ietf-httpbis-http2-00) was then published on November 28 2012 and the standardization work began for real. HTTP/2 was of course discussed a lot on the mailing list since the start, on the IETF meetings but also in interim meetings around the world.

In Zurich, in January 2014 there was one that I only attended remotely. We had the design team meeting in London immediately after IETF89 (March 2014) in the Mozilla offices just next to Piccadilly Circus (where I took the photos that are shown in this posting). We had our final in-person meetup with the HTTP team at Google’s offices in NYC in June 2014 where we ironed out most of the remaining issues.

In between those two last meetings I published my first version of http2 explained. My attempt at a lengthy and very detailed description of HTTP/2, including describing problems with HTTP/1.1 and motivations for HTTP/2. I’ve since published eleven updates.

HTTP team in London, debating protocol detailsThe last draft update of HTTP/2 that contained actual changes of the binary format was draft-14, published in July 2014. After that, the updates were in the language and clarifications on what to do when. There are some functional changes (added in -16 I believe) for like when which sort of frames are accepted that changes what a state machine should do, but it doesn’t change how the protocol looks on the wire.

RFC 7540 was published on May 15th, 2015

I’ve truly enjoyed having had the chance to be a part of this. There are a bunch of good people who made this happen and while I am most certainly forgetting key persons, some of the peeps that have truly stood out are: Mark, Julian, Roberto, Roy, Will, Tatsuhiro, Patrick, Martin, Mike, Nicolas, Mike, Jeff, Hasan, Herve and Willy.

http2 logo

The state and rate of HTTP/2 adoption

http2 logoThe protocol HTTP/2 as defined in the draft-17 was approved by the IESG and is being implemented and deployed widely on the Internet today, even before it has turned up as an actual RFC. Back in February, already upwards 5% or maybe even more of the web traffic was using HTTP/2.

My prediction: We’ll see >10% usage by the end of the year, possibly as much as 20-30% a little depending on how fast some of the major and most popular platforms will switch (Facebook, Instagram, Tumblr, Yahoo and others). In 2016 we might see HTTP/2 serve a majority of all HTTP requests – done by browsers at least.

Counted how? Yeah the second I mention a rate I know you guys will start throwing me hard questions like exactly what do I mean. What is Internet and how would I count this? Let me express it loosely: the share of HTTP requests (by volume of requests, not by bandwidth of data and not just counting browsers). I don’t know how to measure it and we can debate the numbers in December and I guess we can all end up being right depending on what we think is the right way to count!

Who am I to tell? I’m just a person deeply interested in protocols and HTTP/2, so I’ve been involved in the HTTP work group for years and I also work on several HTTP/2 implementations. You can guess as well as I, but this just happens to be my blog!

The HTTP/2 Implementations wiki page currently lists 36 different implementations. Let’s take a closer look at the current situation and prospects in some areas.

Browsers

Firefox and Chome have solid support since a while back. Just use a recent version and you’re good.

Internet Explorer has been shown in a tech preview that spoke HTTP/2 fine. So, run that or wait for it to ship in a public version soon.

There are no news about this from Apple regarding support in Safari. Give up on them and switch over to a browser that keeps up!

Other browsers? Ask them what they do, or replace them with a browser that supports HTTP/2 already.

My estimate: By the end of 2015 the leading browsers with a market share way over 50% combined will support HTTP/2.

Server software

Apache HTTPd is still the most popular web server software on the planet. mod_h2 is a recent module for it that can speak HTTP/2 – still in “alpha” state. Give it time and help out in other ways and it will pay off.

Nginx has told the world they’ll ship HTTP/2 support by the end of 2015.

IIS was showing off HTTP/2 in the Windows 10 tech preview.

H2O is a newcomer on the market with focus on performance and they ship with HTTP/2 support since a while back already.

nghttp2 offers a HTTP/2 => HTTP/1.1 proxy (and lots more) to front your old server with and can then help you deploy HTTP/2 at once.

Apache Traffic Server supports HTTP/2 fine. Will show up in a release soon.

Also, netty, jetty and others are already on board.

HTTPS initiatives like Let’s Encrypt, helps to make it even easier to deploy and run HTTPS on your own sites which will smooth the way for HTTP/2 deployments on smaller sites as well. Getting sites onto the TLS train will remain a hurdle and will be perhaps the single biggest obstacle to get even more adoption.

My estimate: By the end of 2015 the leading HTTP server products with a market share of more than 80% of the server market will support HTTP/2.

Proxies

Squid works on HTTP/2 support.

HAproxy? I haven’t gotten a straight answer from that team, but Willy Tarreau has been actively participating in the HTTP/2 work all the time so I expect them to have work in progress.

While very critical to the protocol, PHK of the Varnish project has said that Varnish will support it if it gets traction.

My estimate: By the end of 2015, the leading proxy software projects will start to have or are already shipping HTTP/2 support.

Services

Google (including Youtube and other sites in the Google family) and Twitter have ran HTTP/2 enabled for months already.

Lots of existing services offer SPDY today and I would imagine most of them are considering and pondering on how to switch to HTTP/2 as Chrome has already announced them going to drop SPDY during 2016 and Firefox will also abandon SPDY at some point.

My estimate: By the end of 2015 lots of the top sites of the world will be serving HTTP/2 or will be working on doing it.

Content Delivery Networks

Akamai plans to ship HTTP/2 by the end of the year. Cloudflare have stated that they “will support HTTP/2 once NGINX with it becomes available“.

Amazon has not given any response publicly that I can find for when they will support HTTP/2 on their services.

Not a totally bright situation but I also believe (or hope) that as soon as one or two of the bigger CDN players start to offer HTTP/2 the others might feel a bigger pressure to follow suit.

Non-browser clients

curl and libcurl support HTTP/2 since months back, and the HTTP/2 implementations page lists available implementations for just about all major languages now. Like node-http2 for javascript, http2-perl, http2 for Go, Hyper for Python, OkHttp for Java, http-2 for Ruby and more. If you do HTTP today, you should be able to switch over to HTTP/2 relatively easy.

More?

I’m sure I’ve forgotten a few obvious points but I might update this as we go as soon as my dear readers point out my faults and mistakes!

How long is HTTP/1.1 going to be around?

My estimate: HTTP 1.1 will be around for many years to come. There is going to be a double-digit percentage share of the existing sites on the Internet (and who knows how many that aren’t even accessible from the Internet) for the foreseeable future. For technical reasons, for philosophical reasons and for good old we’ll-never-touch-it-again reasons.

The survey

Finally, I asked friends on twitter, G+ and Facebook what they think the HTTP/2 share would be by the end of 2015 with the help of a little poll. This does of course not make it into any sound or statistically safe number but is still just a collection of what a set of random people guessed. A quick poll to get a rough feel. This is how the 64 responses I received were distributed:

http2 share at end of 2015

Evidently, if you take a median out of these results you can see that the middle point is between 5-10 and 10-15. I’ll make it easy and say that the poll showed a group estimate on 10%. Ten percent of the total HTTP traffic to be HTTP/2 at the end of 2015.

I didn’t vote here but I would’ve checked the 15-20 choice, thus a fair bit over the median but only slightly into the top quarter..

In plain numbers this was the distribution of the guesses:

0-5% 29.1% (19)
5-10% 21.8% (13)
10-15% 14.5% (10)
15-20% 10.9% (7)
20-25% 9.1% (6)
25-30% 3.6% (2)
30-40% 3.6% (3)
40-50% 3.6% (2)
more than 50% 3.6% (2)

HTTP/2 is at 5%

http2 logoHere follow some numbers extracted from my recent HTTP/2 presentation.

First: HTTP/2 is not finalized yet and it is not yet in RFC status, even though things are progressing nicely within the IETF. With some luck we reach RFC status within Q1 this year.

On January 13th 2015, Firefox 35 was released with HTTP/2 enabled by default. Firefox was already running it enabled before that in beta and development versions.

Chrome has also been sporting HTTP/2 support in development versions since many moths back where it could easily be manually enabled. Chrome 40 was the first main release shipped with HTTP/2 enabled by default, but it has so far only been enabled for a very small fraction of the user-base.

On January 28th 2015, Google reported to me by email that they saw HTTP/2 being used in 5% of their global traffic (que all relevant disclaimers that this is not statistically safe numbers). This, close after a shaky period with Google having had their HTTP/2 services disabled through parts of the Christmas holidays (due to bugs) – and as explained above, there’s been no time for any mainstream browser to use HTTP/2 by default for very long!

Further data points: Mozilla collects telemetry data from Firefox users who opted-in to it, and it collects numbers on “HTTP Protocol Version Used on Response”. On February 10, it reports that Firefox 35 users have got their responses to report HTTP/2 in 9% of all responses (out of more than 340 billion reported responses). The Telemetry for Firefox Nightly 38 even reports HTTP/2 in 14% of all responses (based on a much smaller sample collection), which I guess could very well be because users on such a bleeding edge version are more experimental by nature.

In these Firefox stats we see that recently, the number of HTTP/2 responses outnumber the HTTP/1.0 responses 9 to 1.