I got this neat t-shirt in the mail yesterday. 3scale runs a sort of marketing campaign right now and they give away this shirt to the ones who participate, and they were kind enough to send one to me!
Category Archives: Open Source
Open Source, Free Software, and similar
The last HTTP Workshop day
This workshop has been really intense days so far and this last and forth Workshop day did not turn out differently. We started out the morning with the presentation: Caching, Intermediation and the Modern Web by Martin Thomson (Mozilla) describing his idea of a “blind cache” and how it could help to offer caching in a HTTPS world. It of course brought a lot of discussions and further brainstorming on the ideas and how various people in the room thought the idea could be improved or changed.
Immediately following that, Martin continued with a second presentation describing for us a suggested new encryption format for HTTP based on the JWE format and how it could possible be used.
The room then debated connection coalescing (with HTTP/2) for a while and some shared their experiences and thoughts on the topic. It is an area where over-sharing based on the wrong assumptions certainly can lead to tears and unhappiness but it seems the few in the room who actually have implemented this seemed to have considered most of the problems people could foresee.
Support of Trailers in HTTP was brought up and we discussed its virtues for a while vs the possible problems with supporting it and what possible caveats could be, and we also explored the idea of using HTTP/2 push instead of trailers to allow servers to send meta-data that way, and that then also doesn’t necessarily have to follow after the transfer but can in fact be sent during transfer!
Resumed uploads is a topic that comes back every now and then and that has some interest. (It is probably one of the most frequently requested protocol features I get asked about.) It was brought up as something we should probably discuss further, and especially when discussing the next generation HTTP.
At some point in the future we will start talking about HTTP/3. We had a long discussion with the whole team here on what HTTP/3 could entail and we also explored general future HTTP and HTTP/2 extensions and more. A massive list of possible future work was created. The list ended up with something like 70 different things to discuss or work on, but of course most of those things will never actually become reality.
With so much possible or potential work ahead, we need to involve more people that want to and can consider writing specs and to show how easy it apparently can be, Martin demoed how to write a first I-D draft using the fancy Internet Draft Template Repository. Go check it out!
Poul-Henning Kamp brought up the topic of “CO2 usage of the Internet” and argued for that current and future protocol work need to consider the environmental impact and how “green” protocols are. Ilya Grigorik (Google) showed off numbers from http archive.org‘s data and demoed how easy it is to use the big query feature to extract useful information and statistical info out of the vast amount of data they’ve gathered there. Brad Fitspatrick (Google) showed off his awesome tool h2i and how we can use it to poke on and test HTTP/2 server implementations in a really convenient and almost telnet-style command line using way.
Finally, Mark Nottingham (Akamai) showed off his redbot.org service that runs HTTP against a site, checks its responses and reports with details exactly what it responds and why and provide a bunch of analysis and informational based on that.
Such an eventful day really had to be rounded off with a bunch of beers and so we did. The HTTP Workshop of the summer 2015 ended. The event was great. The attendees were great. The facilities and the food were perfect. I couldn’t ask for more. Thanks for arranging such a great happening!
I’ll round off showing off my laptop lid after the two new stickers of the week were applied. (The HTTP Workshop one and an Apache one I got from Roy):
… I’ll get up early tomorrow morning and fly back home.
HTTPS and HTTP/2 plans for my sites
I produce a fair amount of open source code. I make that code available online. curl is probably the most popular package.
People ask me how they can trust that they are actually downloading what I put up there. People ask me when my source code can be retrieved over HTTPS. Signatures and hashes don’t add a lot against attacks when they all also are fetched over HTTP…
HTTPS
I really and truly want to offer HTTPS (only) for all my sites. I and my friends run a whole busload of sites on the same physical machine and IP address (www.haxx.se, daniel.haxx.se, curl.haxx.se, c-ares.haxx.se, cool.haxx.se, libssh2.org and many more) so I would like a solution that works for all of them.
I can do this by buying certs, either a lot of individual ones or a few wildcard ones and then all servers would be covered. But the cost and the inconvenience of needing a lot of different things to make everything work has put me off. Especially since I’ve learned that there is a better solution in the works!
Let’s Encrypt will not only solve the problem for us from a cost perspective, but they also promise to solve some of the quirks on the technical side as well. They say they will ship certificates by September 2015 and that has made me wait for that option rather than rolling up my sleeves to solve the problem with my own sweat and money. Of course there’s a risk that they are delayed, but I’m not running against a hard deadline myself here.
HTTP/2
Related, I’ve been much involved in the HTTP/2 development and I host my “http2 explained” document on my still non-HTTPS site. I get a lot of questions (and some mocking) about why my HTTP/2 documentation isn’t itself available over HTTP/2. I would really like to offer it over HTTP/2.
Since all the browsers only do HTTP/2 over HTTPS, a prerequisite here is that I get HTTPS up and running first. See above.
Once HTTPS is in place, I want to get HTTP/2 going as well. I still run good old Apache here so it might be done using mod_h2 or perhaps with a fronting nghttp2 proxy. We’ll see.
HTTP Workshop 2015, day -1
I’ve traveled to a rainy and gray Münster, Germany, today and checked in to my hotel for the coming week and the HTTP Workshop. Tomorrow is the first day and I’m looking forward to it probably a little too much.
There is a whole bunch of attendees coming. Simply put, most of the world’s best brains and the most eager implementers of the HTTP stacks that are in use today and will be in use tomorrow (with a bunch of notable absentees of course but you know you’ll be missed). I’m happy and thrilled to be able to take part during this coming week.
I lead the curl project and this is how it works
I did this 50 minute talk on May 21 2015 for a Swedish company. With tongue in cheek subtitled “from hobby to world domination”. I think it turned out pretty decent and covers what the project is, how we work on it and what I do to make it run. Some of the questions are not easy to hear but in general it works out fine. Enjoy!
daniel weekly
My series of weekly videos, in lack of a better name called daniel weekly, reached episode 35 today. I’m celebrating this fact by also adding an RSS-feed for those of you who prefer to listen to me in an audio-only version.
As an avid podcast listener myself, I can certainly see how this will be a better fit to some. Most of these videos are just me talking anyway so losing the visual shouldn’t be much of a problem.
A typical episode
I talk about what I work on in my open source projects, which means a lot of curl stuff and occasional stuff from my work on Firefox for Mozilla. I also tend to mention events I attend and HTTP/networking developments I find interesting and grab my attention. Lots of HTTP/2 talk for example. I only ever express my own personal opinions.
It is generally an extremely geeky and technical video series.
Every week I mention a (curl) “bug of the week” that allows me to joke or rant about the bug in question or just mention what it is about. In episode 31 I started my “command line options of the week” series in which I explain one or a few curl command line options with some amount of detail. There are over 170 options so the series is bound to continue for a while. I’ve explained ten options so far.
I’ve set a limit for myself and I make an effort to keep the episodes shorter than 20 minutes. I’ve not succeed every time.
Analytics
The 35 episodes have been viewed over 17,000 times in total. Episode two is the most watched individual one with almost 1,500 views.
Right now, my channel has 190 subscribers.
The top-3 countries that watch my videos: USA, Sweden and UK.
Share of viewers that are female: 3.7%
server push to curl
The next step in my efforts to complete curl‘s HTTP/2 implementation, after having made sure downloading and uploading transfers in parallel work, was adding support for HTTP/2 server push.
A quick recap
HTTP/2 Server push is a way for the server to initiate the transfer of a resource. Like when the client asks for resource X, the server can deem that the client most probably also wants to have resource Y and Z and initiate their transfers.
The server then sends a PUSH_PROMISE to the client for the new resource and hands over a set of “request headers” that a GET for that resource could have used, and then it sends the resource in a way that it would have done if it was requested the “regular” way.
The push promise frame gives the client information to make a decision if the resource is wanted or not and it can then immediately deny this transfer if it considers it unwanted. Like in a browser case if it already has that file in its local cache or similar. If not denied, the stream has an initial window size that allows the server to send a certain amount of data before the client has to give the stream more allowance to continue.
It is also suitable to remember that server push is a new protocol feature in HTTP/2 and as such it has not been widely used yet and it remains to be seen exactly how it will become used the best way and what will turn out popular and useful. We have this “immaturity” in mind when designing this support for libcurl.
Enter libcurl
When setting up a transfer over HTTP/2 with libcurl you do it with the multi interface to make it able to work multiplexed. That way you can set up and perform any number of transfers in parallel, and if they happen to use the same host they can be done multiplexed but if they use different hosts they will use separate connections.
To the application, transfers pretty much look the same and it can remain agnostic to whether the transfer is multiplexed or not, it is just another transfer.
With the libcurl API, the application creates an “easy handle” for each transfer and it sets options in that handle for the upcoming transfer. before it adds that to the “multi handle” and then libcurl drives all those individual transfers at the same time.
Server-initiated transfers
Starting in the future version 7.44.0 – planned release date in August, the plan is to introduce the API support for server push. It couldn’t happen sooner because I missed the merge window for 7.43.0 and then 7.44.0 is simply the next opportunity. The wiki link here is however updated and reflects what is currently being implemented.
An application sets a callback to allow server pushed streams. The callback gets called by libcurl when a PUSH_PROMISE is received by the client side, and the callback can then tell libcurl if the new stream should be allowed or not. It could be as simple as this:
static int server_push_callback(CURL *parent, CURL *easy, size_t num_headers, struct curl_pushheaders *headers, void *userp) { char *headp; size_t i; FILE *out; /* here's a new stream, save it in a new file for each new push */ out = fopen("push-stream", "wb"); /* write to this file */ curl_easy_setopt(easy, CURLOPT_WRITEDATA, out); headp = curl_pushheader_byname(headers, ":path"); if(headp) fprintf(stderr, "The PATH is %s\n", headp); return CURL_PUSH_OK; }
The callback would instead return CURL_PUSH_DENY if the stream isn’t desired. If no callback is set, no pushes will be accepted.
An interesting effect of this API is that libcurl now creates and adds easy handles to the multi handle by itself when the callback okeys it, so there will be more easy handles to cleanup at the end of the operations than what the application added. Each pushed transfer needs get cleaned up by the application that “inherits” the ownership of the transfer and the easy handle for it.
PUSH_PROMISE headers
The headers passed along in that frame will contain the mandatory “special” request ones (“:method”, “:path”, “:scheme” and “:authority”) but other than those it really isn’t certain which other headers servers will provide and how this will work. To prepare for this fact, we provide two accessor functions for the push callback to access all PUSH_PROMISE headers libcurl received:
- curl_pushheader_byname() lets the callback get the contents of a specific header. I imagine that “:path” for example is one of those that most typical push callbacks will want to take a closer look at.
- curl_pushheader_bynum() allows the function to iterate over all received headers and do whatever it needs to do, it gets the full header by index.
These two functions are also somewhat special and new in the libcurl world since they are only possible to use from within this particular callback and they are invalid and wrong to use in any and all other contexts.
HTTP/2 headers are compressed on the wire using HPACK compression, but when access from this callback all headers use the familiar HTTP/1.1 style of “name:value”.
Work in progress
As I mentioned above already, this is work in progress and I welcome all and any comments or suggestions on how this API can be improved or tweaked to even better fit your needs. Implementing features such as these usually turn out better when there are users trying them out before they are written in stone.
To try it out, build a libcurl from the http2-push branch:
https://github.com/bagder/curl/commits/http2-push
And while there are docs and an example in that branch already, you may opt to read the wiki version of the docs:
https://github.com/bagder/curl/wiki/HTTP-2-Server-Push
The best way to send your feedback on this is to post to the curl-library mailing list, but if you find obvious bugs or want to provide patches you can also opt to file issues or pull-requests on github.
picturing curl’s future
There will be more stuff over time in the cURL project. Exactly which stuff and how long time it takes for everything, we don’t know. It depends largely on who works on what and how much time said persons can spend on implementing the stuff they work on…
I suspect we might be able to do things slightly faster over time, which is why the red arrow isn’t just a straight line.
I drew this little picture inspired from discussions with friends after a talk I did about curl and how development works in an open source project such as this. We know we will work on things that will improve the products but we don’t see exactly what very far in advance. I tweeted this picture a few days ago, and it turned out very popular.
2015 curl user poll analysis
My full 30 page document with all details and analyses of the curl user poll 2015 is now available. It shows details of all the questions, most of them with a comparison with last year’s survey. The write-ins are also full of good advice, wisdom and some signs of ignorance or unawareness.
I hope all curl hackers and others generally interested in the project can use my “report” to learn something about our users and our user’s view of the project and our products.
Let’s use this to guide us going forward.
status update: http2 multiplexed uploads
I wrote a previous update about my work on multiplexing in curl. This is a follow-up to describe the status as of today.
I’ve successfully used the http2-upload.c code to upload 600 parallel streams to the test server and they were all sent off fine and the responses received were stored fine. MAX_CONCURRENT_STREAMS on the server was set to 100.
This is using curl git master as of right now (thus scheduled for inclusion in the pending curl 7.43.0 release). I’m not celebrating just yet, but it is looking pretty good. I’ll continue testing.
Commit b0143a2a3 was crucial for this, as I realized we didn’t store and use the read callback in the easy handle but in the connection struct which is completely wrong when many easy handles are using the same connection! I don’t recall the exact reason why I put the data in that struct (I went back and read the commit messages etc) but I think this setup is correct conceptually and code-wise, so if this leads to some side-effects I think we need to just fix it.
Next up: more testing, and then taking on the concept of server push to make libcurl able to support it. It will certainly be a subject for future blog posts…