Tag Archives: HTTP

I’m leaving Mozilla

It’s been five great years, but now it is time for me to move on and try something else.

During these five years I’ve met and interacted with a large number of awesome people at Mozilla, lots of new friends! I got the chance to work from home and yet work with a global team on a widely used product, all done with open source. I have worked on internet protocols during work-hours (in addition to my regular spare-time working with them) and its been great! Heck, lots of the HTTP/2 development and the publication of that was made while I was employed by Mozilla and I fondly participated in that. I shall forever have this time ingrained in my memory as a very good period of my life.

I had already before I joined the Firefox development understood some of the challenges of making a browser in the modern era, but that understanding has now been properly enriched with lots of hands-on and code-digging in sometimes decades-old messy C++, a spaghetti armada of threads and the wild wild west of users on the Internet.

A very big thank you and a warm bye bye go to everyone of my friends at Mozilla. I won’t be far off and I’m sure I will have reasons to see many of you again.

My last day as officially employed by Mozilla is December 11 2018, but I plan to spend some of my remaining saved up vacation days before then so I’ll hand over most of my responsibilities way before.

The future is bright but unknown!

I don’t yet know what to do next.

I have some ideas and communications with friends and companies, but nothing is firmly decided yet. I will certainly entertain you with a totally separate post on this blog once I have that figured out! Don’t worry.

Will it affect curl or other open source I do?

I had worked on curl for a very long time already before joining Mozilla and I expect to keep doing curl and other open source things even going forward. I don’t think my choice of future employer should have to affect that negatively too much, except of course in periods.

With me leaving Mozilla, we’re also losing Mozilla as a primary sponsor of the curl project, since that was made up of them allowing me to spend some of my work days on curl and that’s now over.

Short-term at least, this move might increase my curl activities since I don’t have any new job yet and I need to fill my days with something…

What about toying with HTTP?

I was involved in the IETF HTTPbis working group for many years before I joined Mozilla (for over ten years now!) and I hope to be involved for many years still. I still have a lot of things I want to do with curl and to keep curl the champion of its class I need to stay on top of the game.

I will continue to follow and work with HTTP and other internet protocols very closely. After all curl remains the world’s most widely used HTTP client.

Can I enter the US now?

No. That’s unfortunately not related, and I’m not leaving Mozilla because of this problem and I unfortunately don’t expect my visa situation to change because of this change. My visa counter is now showing more than 214 days since I applied.

HTTP/3

The protocol that’s been called HTTP-over-QUIC for quite some time has now changed name and will officially become HTTP/3. This was triggered by this original suggestion by Mark Nottingham.

The QUIC Working Group in the IETF works on creating the QUIC transport protocol. QUIC is a TCP replacement done over UDP. Originally, QUIC was started as an effort by Google and then more of a “HTTP/2-encrypted-over-UDP” protocol.

When the work took off in the IETF to standardize the protocol, it was split up in two layers: the transport and the HTTP parts. The idea being that this transport protocol can be used to transfer other data too and its not just done explicitly for HTTP or HTTP-like protocols. But the name was still QUIC.

People in the community has referred to these different versions of the protocol using informal names such as iQUIC and gQUIC to separate the QUIC protocols from IETF and Google (since they differed quite a lot in the details). The protocol that sends HTTP over “iQUIC” was called “hq” (HTTP-over-QUIC) for a long time.

Mike Bishop scared the room at the QUIC working group meeting in IETF 103 when he presented this slide with what could be thought of almost a logo…

On November 7, 2018 Dmitri of Litespeed announced that they and Facebook had successfully done the first interop ever between two HTTP/3 implementations. Mike Bihop’s follow-up presentation in the HTTPbis session on the topic can be seen here. The consensus in the end of that meeting said the new name is HTTP/3!

No more confusion. HTTP/3 is the coming new HTTP version that uses QUIC for transport!

curl up 2019 will happen in Prague

The curl project is happy to invite you to the city of Prague, the Czech Republic, where curl up 2019 will take place.

curl up is our annual curl developers conference where we gather and talk Internet protocols, curl’s past, current situation and how to design its future. A weekend of curl.

Previous years we’ve gathered twenty-something people for an intimate meetup in a very friendly atmosphere. The way we like it!

In a spirit to move the meeting around to give different people easier travel, we have settled on the city of Prague for 2019, and we’ll be there March 29-31.

Sign up now!

Symposium on the Future of HTTP

This year, we’re starting off the Friday afternoon with a Symposium dedicated to “the future of HTTP” which is aimed to be less about curl and more about where HTTP is and where it will go next. Suitable for a slightly wider audience than just curl fans.

That’s Friday the 29th of March, 2019.

Program and talks

We are open for registrations and we would love to hear what you would like to come and present for us – on the topics of HTTP, of curl or related matters. I’m sure I will present something too, but it becomes a much better and more fun event if we distribute the talking as much as possible.

The final program for these days is not likely to get set until much later and rather close in time to the actual event.

The curl up 2019 wiki page is where you’ll find more specific details appear over time. Just go back there and see.

Helping out and planning?

If you want to follow the planning, help out, offer improvements or you have questions on any of this? Then join the curl-meet mailing list, which is dedicated for this!

Free or charge thanks to sponsors

We’re happy to call our event free, or “almost free” of charge and we can do this only due to the greatness and generosity of our awesome sponsors. This year we say thanks to Mullvad, Sticker Mule, Apiary and Charles University.

There’s still a chance for your company to help out too! Just get in touch.

curl up 2019 with logos

Would you like some bold with those headers?

Displaying HTTP headers for a URL on the screen is one of those things people commonly use curl for.

curl -I example.com

To help your eyes separate header names from the corresponding values, I’ve been experimenting with a change that makes the header names get shown using a bold type face and the header values to the right of the colons to use the standard font.

Sending a HEAD request to the curl site could look like this:

This seemingly small change required an unexpectedly large surgery.

Now I want to turn this into a discussion if this is good enough, if we need more customization, how to make the code act on windows and perhaps how an option to explicitly enable/disable this should be named.

If you have ideas for any of that or other things around this feature, do comment in the PR.

The feature window for the next curl release is already closed so this change will not be considered for real until curl 7.61.0 at the earliest. Due for release in July 2018. So lots of time left to really “bike shed” all the details!

Update: the PR was merged into master on May 21st.

curl another host

Sometimes you want to issue a curl command against a server, but you don’t really want curl to resolve the host name in the given URL and use that, you want to tell it to go elsewhere. To the “wrong” host, which in this case of course happens to be the right host. Because you know better.

Don’t worry. curl covers this as well, in several different ways…

Fake the host header

The classic and and easy to understand way to send a request to the wrong HTTP host is to simply send a different Host: header so that the server will provide a response for that given server.

If you run your “example.com” HTTP test site on localhost and want to verify that it works:

curl --header "Host: example.com" http://127.0.0.1/

curl will also make cookies work for example.com in this case, but it will fail miserably if the page redirects to another host and you enable redirect-following (–location) since curl will send the fake Host: header in all further requests too.

The –header option cleverly cancels the built-in provided Host: header when a custom one is provided so only the one passed in from the user gets sent in the request.

Fake the host header better

We’re using HTTPS everywhere these days and just faking the Host: header is not enough then. An HTTPS server also needs to get the server name provided already in the TLS handshake so that it knows which cert etc to use. The name is provided in the SNI field. curl also needs to know the correct host name to verify the server certificate against (server certificates are rarely registered for an IP address). curl extracts the name to use in both those case from the provided URL.

As we can’t just put the IP address in the URL for this to work, we reverse the approach and instead give curl the proper URL but with a custom IP address to use for the host name we set. The –resolve command line option is our friend:

curl --resolve example.com:443:127.0.0.1 https://example.com/

Under the hood this option populates curl’s DNS cache with a custom entry for “example.com” port 443 with the address 127.0.0.1, so when curl wants to connect to this host name, it finds your crafted address and connects to that instead of the IP address a “real” name resolve would otherwise return.

This method also works perfectly when following redirects since any further use of the same host name will still resolve to the same IP address and redirecting to another host name will then resolve properly. You can even use this option multiple times on the command line to add custom addresses for several names. You can also add multiple IP addresses for each name if you want to.

Connect to another host by name

As shown above, –resolve is awesome if you want to point curl to a specific known IP address. But sometimes that’s not exactly what you want either.

Imagine you have a host name that resolves to a number of different host names, possibly a number of front end servers for the same site/service. Not completely unheard of. Now imagine you want to issue your curl command to one specific server out of the front end servers. It’s a server that serves “example.com” but the individual server is called “host-47.example.com”.

You could resolve the host name in a first step before curl is used and use –resolve as shown above.

Or you can use –connect-to, which instead works on a host name basis. Using this, you can make curl replace a specific host name + port number pair with another host name + port number pair before the name is resolved!

curl --connect-to example.com:443:host-47.example.com:443 https://example.com/

Crazy combos

Most options in curl are individually controlled which means that there’s rarely logic that prevents you from using them in the awesome combinations that you can think of.

— resolve, — connect-to and — header can all be used in the same command line!

Connect to a HTTPS host running on localhost, use the correct name for SNI and certificate verification, but then still ask for a separate host in the Host: header? Sure, no problem:

curl --resolve example.com:443:127.0.0.1 https://example.com/ --header "Host: diff.example.com"

All the above with libcurl?

When you’re done playing with the curl options as described above and want to convert your command lines to libcurl code instead, your best friend is called –libcurl.

Just append –libcurl example.c to your command line, and curl will generate the C code template for you in that given file name. Based on that template, making use of  that code correctly is usually straight-forward and you’ll get all the options to read up in a convenient way.

Good luck!

Update: thanks to @Manawyrm, I fixed the ndash issues this post originally had.

Easier HTTP requests with h2c

I spend a large portion of my days answering questions and helping people use curl and libcurl. With more than 200 command line options it certainly isn’t always easy to find the correct ones, in combination with the Internet and protocols being pretty complicated things at times… not to mention the constant problem of bad advice. Like code samples on stackoverflow that repeats non-recommended patterns.

The notorious -X abuse is a classic example, or why not the widespread disease called too much use of the –insecure option (at a recent count, there were more than 118,000 instances of “curl –insecure” uses in code hosted by github alone).

Sending HTTP requests with curl

HTTP (and HTTPS) is by far the most used protocol out of the ones curl supports. curl can be used to issue just about any HTTP request you can think of, even if it isn’t always immediately obvious exactly how to do it.

h2c to the rescue!

h2c is a new command line tool and associated web service, that when passed a complete HTTP request dump, converts that into a corresponding curl command line. When that curl command line is then run, it will generate exactly(*) the HTTP request you gave h2c.

h2c stands for “headers to curl”.

Many times you’ll read documentation somewhere online or find a protocol/API description showing off a full HTTP request. “This is what the request should look like. Now send it.” That is one use case h2c can help out with.

Example use

Here we have an HTTP request that does Basic authentication with the POST method and a small request body. Do you know how to tell curl to send it?

The request:

POST /receiver.cgi HTTP/1.1
Host: example.com
Authorization: Basic aGVsbG86eW91Zm9vbA==
Accept: */*
Content-Length: 5
Content-Type: application/x-www-form-urlencoded

hello

I save the request above in a text file called ‘request.txt’ and ask h2c to give the corresponding curl command line:

$ ./h2c < request.txt
curl --http1.1 --header User-Agent: --user "hello:youfool" --data-binary "hello" https://example.com/receiver.cgi

If we add “–trace-ascii dump” to that command line, run it, and then inspect the dump file after curl has completed, we can see that it did indeed issue the HTTP request we asked for!

Web Site

Maybe you don’t want to install another command line tool written by me in your system. The solution is the online version of h2c, which is hosted on a separate portion of the official curl web site:

https://curl.haxx.se/h2c/

The web site lets you paste a full HTTP request into a text form and the page then shows the corresponding curl command line for that request.

h2c “as a service”

Inception alert: you can also use the web version of h2c by sending over a HTTP request to it using curl. You’ll then get nothing but the correct curl command line output on stdout.

To send off the same file we used above:

curl --data-urlencode http@request.txt https://curl.haxx.se/h2c/

or of course if you rather want to pass your HTTP request to curl on stdin, that’s equally easy:

cat request.txt | curl --data-urlencode http@- https://curl.haxx.se/h2c/

Early days, you can help!

h2c was created just a few days ago. I’m sure there are bugs, issues and quirks to iron out. You can help! Files issues or submit pull-requests!

(*) = barring bugs, there are still some edge cases where the exact HTTP request won’t be possible to repeat, but where we instead will attempt to do “the right thing”.

“OPTIONS *” with curl

(Note: this blog post as been updated as the command line option changed after first publication, based on comments to this very post!)

curl is arguably a “Swiss army knife” of HTTP fiddling. It is one of the available tools in the toolbox with a large set of available switches and options to allow us to tweak and modify our HTTP requests to really test, debug and torture our HTTP servers and services.

That’s the way we like it.

In curl 7.55.0 it will take yet another step into this territory when we finally introduce a way for users to send “OPTION *” and similar requests to servers. It has been requested occasionally by users over the years but now the waiting is over. (brought by this commit)

“OPTIONS *” is special and peculiar just because it is one of the few specified requests you can do to a HTTP server where the path part doesn’t start with a slash. Thus you cannot really end up with this based on a URL and as you know curl is pretty much all about URLs.

The OPTIONS method was introduced in HTTP 1.1 already back in RFC 2068, published in January 1997 (even before curl was born) and with curl you’ve always been able to send an OPTIONS request with the -X option, you just were never able to send that single asterisk instead of a path.

In curl 7.55.0 and later versions, you can remove the initial slash from the path part that ends up in the request by using –request-target. So to send an OPTION * to example.com for http and https URLs, you could do it like:

$ curl --request-target "*" -X OPTIONS http://example.com
$ curl --request-target "*" -X OPTIONS https://example.com/

In classical curl-style this also opens up the opportunity for you to issue completely illegal or otherwise nonsensical paths to your server to see what it does on them, to send totally weird options to OPTIONS and similar games:

$ curl --request-target "*never*" -X OPTIONS http://example.com

$ curl --request-target "allpasswords" http://example.com

Enjoy!

curl: read headers from file

Starting in curl 7.55.0 (since this commit), you can tell curl to read custom headers from a file. A feature that has been asked for numerous times in the past, and the answer has always been to write a shell script to do it. Like this:

#!/bin/sh
while read line; do
  args="$args -H '$line'";
done
curl $args $URL

That’s now a response of the past (or for users stuck on old curl versions). We can now instead tell curl to read headers itself from a file using the curl standard @filename way:

$ curl -H @headers https://example.com

… and this also works if you want to just send custom headers to the proxy you do CONNECT to:

$ curl --proxy-headers @headers --proxy proxy:8080 https://example.com/

(this is a pure curl tool change that doesn’t affect libcurl, the library)

curling over HTTP proxy

Starting in curl 7.55.0 (this commit), curl will no longer try to ask HTTP proxies to perform non-HTTP transfers with GET, except for FTP. For all other protocols, curl now assumes you want to tunnel through the HTTP proxy when you use such a proxy and protocol combination.

Protocols and proxies

curl supports 23 different protocols right now, if we count the S-versions (the TLS based alternatives) as separate protocols.

curl also currently supports seven different proxy types that can be set independently of the protocol.

One type of proxy that curl supports is a so called “HTTP proxy”. The official HTTP standard includes a defined way how to speak to such a proxy and ask it to perform the request on the behalf of the client. curl supports using that over either HTTP/1.1 or HTTP/1.0, where you’d typically only use the latter version if you the first really doesn’t work with your ancient proxy.

HTTP proxy

All that is fine and good. But HTTP proxies were really only defined to handle HTTP, and to some extent HTTPS. When doing plain HTTP transfers over a proxy, the client will send its request to the proxy like this:

GET http://curl.haxx.se/ HTTP/1.1
Host: curl.haxx.se
Accept: */*
User-Agent: curl/7.55.0

… but for HTTPS, which should provide end to end encryption, a client needs to ask the proxy to instead tunnel through the proxy so that it can do TLS all the way, without any middle man, to the server:

CONNECT curl.haxx.se:443 HTTP/1.1
Host: curl.haxx.se:443
User-Agent: curl/7.55.0

When successful, the proxy responds with a “200” which means that the proxy has established a TCP connection to the remote server the client asked it to connect to, and the client can then proceed and do the TLS handshake with that server. When the TLS handshake is completed, a regular GET request is then sent over that established and secure TLS “tunnel” to the server. A GET request that then looks like one that is sent without proxy:

GET / HTTP/1.1
Host: curl.haxx.se
User-Agent: curl/7.55.0
Accept: */*

FTP over HTTP proxy

Things get more complicated when trying to perform transfers over the HTTP proxy using schemes that aren’t HTTP. As already described above, HTTP proxies are basically designed only for doing HTTP over them, but as they have this concept of tunneling through to the remote server it doesn’t have to be limited to just HTTP.

Also, historically, for decades people have deployed HTTP proxies that recognize FTP URLs, and transparently handle them for the client so the client can almost believe it is HTTP while the proxy has to speak FTP to the remote server in the other end and convert it back to HTTP to the client. On such proxies (Squid and Apache both support this mode for example), this sort of request is possible:

GET ftp://ftp.funet.fi/ HTTP/1.1
Host: ftp.funet.fi
User-Agent: curl/7.55.0
Accept: */*

curl knows this and if you ask curl for FTP over an HTTP proxy, it will assume you have one of these proxies. It should be noted that this method of course limits what you can do FTP-wise and for example FTP upload is usually not working and if you ask curl to do FTP upload over and HTTP proxy it will do that with a HTTP PUT.

HTTP proxy tunnel

curl features an option (–proxytunnel) that lets the user forcible tell the client to not assume that the proxy speaks this protocol and instead use the CONNECT method with establishing a tunnel through the proxy to the remote server.

It should of course be noted that very few deployed HTTP proxies in the wild allow clients to CONNECT to whatever port they like. HTTP proxies tend to only allow connecting to port 443 as that is the official HTTPS port, and if you ask for another port it will respond back with a 4xx response code refusing to comply.

Not HTTP not FTP over HTTP proxy

So HTTP, HTTPS and FTP are sent over the HTTP proxy fine. That leaves us with nineteen more protocols. What happens with them when you ask curl to perform them over a HTTP proxy?

Now we have finally reached the change that has just been merged in curl and changes what curl does.

Before 7.55.0

curl would send all protocols as a regular GET to the proxy if asked to use a HTTP proxy without seeing the explicit proxy-tunnel option. This came from how FTP was done and grew from there without many people questioning it. Of course it wouldn’t ever work, but also very few people would actually attempt it because of that.

From 7.55.0

All protocols that aren’t HTTP, HTTPS or FTP will enable the tunnel-through mode automatically when a HTTP proxy is used. No more sending funny GET requests to proxies when they won’t work anyway. Also, it will prevent users from accidentally leak credentials to proxies that were intended for the server, which previously could happen if you omitted the tunnel option with a few authentication setups.

HTTP/2 proxy

Sorry, curl doesn’t support that yet. Patches welcome!

HTTP Workshop s03e02

(Season three, episode two)

Previously, on the HTTP Workshop. Yesterday ended with a much appreciated group dinner and now we’re back energized and eager to continue blabbing about HTTP frames, headers and similar things.

Martin from Mozilla talked on “connection management is hard“. Parts of the discussion was around the HTTP/2 connection coalescing that I’ve blogged about before. The ORIGIN frame is a draft for a suggested way for servers to more clearly announce which origins it can answer for on that connection which should reduce the frequency of 421 needs. The ORIGIN frame overrides DNS and will allow coalescing even for origins that don’t otherwise resolve to the same IP addresses. The Alt-Svc header, a suggested CERTIFICATE frame and how does a HTTP/2 server know for which origins it can do PUSH for?

A lot of positive words were expressed about the ORIGIN frame. Wildcard support?

Willy from HA-proxy talked about his Memory and CPU efficient HPACK decoding algorithm. Personally, I think the award for the best slides of the day goes to Willy’s hand-drawn notes.

Lucas from BBC talked about usage data for iplayer and how much data and number of requests they serve and how their largest share of users are “non-browsers”. Lucas mentioned their work on writing a libcurl adaption to make gstreamer use it instead of libsoup. Lucas talk triggered a lengthy discussion on what needs there are and how (if at all) you can divide clients into browsers and non-browser.

Wenbo from Google spoke about Websockets and showed usage data from Chrome. The median websockets connection time is 20 seconds and 10% something are shorter than 0.5 seconds. At the 97% percentile they live over an hour. The connection success rates for Websockets are depressingly low when done in the clear while the situation is better when done over HTTPS. For some reason the success rate on Mac seems to be extra low, and Firefox telemetry seems to agree. Websockets over HTTP/2 (or not) is an old hot topic that brought us back to reiterate issues we’ve debated a lot before. This time we also got a lovely and long side track into web push and how that works.

Roy talked about Waka, a HTTP replacement protocol idea and concept that Roy’s been carrying around for a long time (he started this in 2001) and to which he is now coming back to do actual work on. A big part of the discussion was focused around the wakli compression ideas, what the idea is and how it could be done and evaluated. Also, Roy is not a fan of content negotiation and wants it done differently so he’s addressing that in Waka.

Vlad talked about his suggestion for how to do cross-stream compression in HTTP/2 to significantly enhance compression ratio when, for example, switching to many small resources over h2 compared to a single huge resource over h1. The security aspect of this feature is what catches most of people’s attention and the following discussion. How can we make sure this doesn’t leak sensitive information? What protocol mechanisms exist or can we invent to help out making this work in a way that is safer (by default)?

Trailers. This is again a favorite topic that we’ve discussed before that is resurfaced. There are people around the table who’d like to see support trailers and we discussed the same topic in the HTTP Workshop in 2016 as well. The corresponding issue on trailers filed in the fetch github repo shows a lot of the concerns.

Julian brought up the subject of “7230bis” – when and how do we start the work. What do we want from such a revision? Fixing the bugs seems like the primary focus. “10 years is too long until update”.

Kazuho talked about “HTTP/2 attack mitigation” and how to handle clients doing many parallel slow POST requests to a CDN and them having an origin server behind that runs a new separate process for each upload.

And with this, the day and the workshop 2017 was over. Thanks to Facebook for hosting us. Thanks to the members of the program committee for driving this event nicely! I had a great time. The topics, the discussions and the people – awesome!