We’re running a short poll asking people about where and how we should organize curl up 2020 – our annual curl developers conference. I’m not making any promises, but getting people’s opinions will help us when preparing for next year.
curl supports some twenty-three protocols (depending on exactly how you count).
In order to properly test and verify curl’s implementations of each of these protocols, we have a test suite. In the test suite we have a set of handcrafted servers that speak the server-side of these protocols. The more used a protocol is, the more important it is to have it thoroughly tested.
We believe in having test servers that are “stupid” and that offer buttons, levers and thresholds for us to control and manipulate how they act and how they respond for testing purposes. The control of what to send should be dictated as much as possible by the test case description file. If we want a server to send back a slightly broken protocol sequence to check how curl supports that, the server must be open for this.
In order to do this with a large degree of freedom and without restrictions, we’ve found that using “real” server software for this purpose is usually not good enough. Testing the broken and bad cases are typically not easily done then. Actual server software tries hard to do the right thing and obey standards and protocols, while we rather don’t want the server to make any decisions by itself at all but just send exactly the bytes we ask it to. Simply put.
Of course we don’t always get what we want and some of these protocols are fairly complicated which offer challenges in sticking to this policy all the way. Then we need to be pragmatic and go with what’s available and what we can make work. Having test cases run against a real server is still better than no test cases at all.
“SOCKS is an Internet protocol that exchanges network packets between a client and server through a proxy server. Practically, a SOCKS server proxies TCP connections to an arbitrary IP address, and provides a means for UDP packets to be forwarded.
Recently we fixed a bug in how curl sends credentials to a SOCKS5 proxy as it turned out the protocol itself only supports user name and password length of 255 bytes each, while curl normally has no such limits and could pass on credentials with virtually infinite lengths. OK, that was silly and we fixed the bug. Now curl will properly return an error if you try such long credentials with your SOCKS5 proxy.
As a general rule, fixing a bug should mean adding at least one new test case, right? Up to this time we had been testing the curl SOCKS support by firing up an ssh client and having that setup a SOCKS proxy that connects to the other test servers.
curl -> ssh with SOCKS proxy -> test server
Since this setup doesn’t support SOCKS5 authentication, it turned out complicated to add a test case to verify that this bug was actually fixed.
This test problem was fixed by the introduction of a newly written SOCKS proxy server dedicated for the curl test suite (which I simply named socksd). It does the basic SOCKS4 and SOCKS5 protocol logic and also supports a range of commands to control how it behaves and what it allows so that we can now write test cases against this server and ask the server to misbehave or otherwise require fun things so that we can make really sure curl supports those cases as well.
It also has the additional bonus that it works without ssh being present so it will be able to run on more systems and thus the SOCKS code in curl will now be tested more widely than before.
curl -> socksd -> test server
Going forward, we should also be able to create even more SOCKS tests with this and make sure to get even better SOCKS test coverage.
In January 2002, we added support for a global DNS cache in libcurl. All transfers set to use it would share and use the same global cache.
We rather quickly realized that having a global cache without locking was error-prone and not really advisable, so already in March 2004 we added comments in the header file suggesting that users should not use this option.
It remained in the code and time passed.
In the autumn of 2018, another fourteen years later, we finally addressed the issue when we announced a plan for this options deprecation. We announced a date for when it would become deprecated and disabled in code (7.62.0), and then six months later if no major incidents or outcries would occur, we said we would delete the code completely.
That time has now arrived. All code supporting a global DNS cache in curl has been removed. Any libcurl-using program that sets this option from now on will simply not get a global cache and instead proceed with the default handle-oriented cache, and the documentation is updated to clearly indicate that this is the case. This change will ship in curl 7.65.0 due to be released in May 2019 (merged in this commit).
If a program still uses this option, the only really noticeable effect should be a slightly worse name resolving performance, assuming the global cache had any point previously.
Programs that want to continue to have a DNS cache shared between multiple handles should use the share interface, which allows shared DNS cache and more – with locking. This API has been offered by libcurl since 2003.
HTTP/1.1 Pipelining is the protocol feature where the client sends off a second HTTP/1.1 request already before the answer to the previous request has arrived (completely) from the server. It is defined in the original HTTP/1.1 spec and is a way to avoid waiting times. To reduce latency.
HTTP/1.1 Pipelining was badly supported by curl for a long time in the sense that we had a series of known bugs and it was a fragile feature without enough tests. Also, pipelining is fairly tricky to debug due to the timing sensitivity so very often enabling debug outputs or similar completely changes the nature of the behavior and things are not reproducing anymore!
HTTP pipelining was never enabled by default by the large desktop browsers due to all the issues with it, like broken server implementations and the likes. Both Firefox and Chrome dropped pipelining support entirely since a long time back now. curl did in fact over time become more and more lonely in supporting pipelining.
The bad state of HTTP pipelining was a primary driving factor behind HTTP/2 and its multiplexing feature. HTTP/2 multiplexing is truly and really “pipelining done right”. It is way more solid, practical and solves the use case in a better way with better performance and fewer downsides and problems. (curl enables multiplexing by default since 7.62.0.)
In 2019, pipelining should be abandoned and HTTP/2 should be used instead.
Starting with this commit, to be shipped in release 7.65.0, curl no longer has any code that supports HTTP/1.1 pipelining. It has been disabled in the code since 7.62.0 already so applications and users that use a recent version already should not notice any difference.
Pipelining was always offered on a best-effort basis and there was never any guarantee that requests would actually be pipelined, so we can remove this feature entirely without breaking API or ABI promises. Applications that ask libcurl to use pipelining can still do that, it just won’t have any effect.
The 2019 HTTP Workshop ended today. In total over the years, we have now done 12 workshop days up to now. This day was not a full day and we spent it on only two major topics that both triggered long discussions involving large parts of the room.
One out of every thousand cookie header values is 10K or larger in size and even at the 50% percentile, the size is 480 bytes. They’re a disaster on so many levels. The additional features that have been added during the last decade are still mostly unused. Mike suggests that maybe the only way forward is to introduce a replacement that avoids the issues, and over longer remove cookies from the web: HTTP state tokens.
A lot of people in the room had opinions and thoughts on this. I don’t think people in general have a strong love for cookies and the way they currently work, but the how-to-replace-them question still triggered lots of concerns about issues from routing performance on the server side to the changed nature of the mechanisms that won’t encourage web developers to move over. Just adding a new mechanism without seeing the old one actually getting removed might not be a win.
We should possibly “worsen” the cookie experience over time to encourage switch over. To cap allowed sizes, limit use to only over HTTPS, reduce lifetimes etc, but even just that will take effort and require that the primary cookie consumers (browsers) have a strong will to hurt some amount of existing users/sites.
(Related: Mike is also one of the authors of the RFC6265bis draft in progress – a future refreshed cookie spec.)
Mike Bishop did an excellent presentation of HTTP/3 for HTTP people that possibly haven’t kept up fully with the developments in the QUIC working group. From a plain HTTP view, HTTP/3 is very similar feature-wise to HTTP/2 but of course sent over a completely different transport layer. (The HTTP/3 draft.)
Most of the questions and discussions that followed were rather related to the transport, to QUIC. Its encryption, it being UDP, DOS prevention, it being “CPU hungry” etc. Deploying HTTP/3 might be a challenge for successful client side implementation, but that’s just nothing compared the totally new thing that will be necessary server-side. Web developers should largely not even have to care…
One tidbit that was mentioned is that in current Firefox telemetry, it shows about 0.84% of all requests negotiates TLS 1.3 early data (with about 12.9% using TLS 1.3)
Thought-worthy quote of the day comes from Willy: “everything is a buffer”
There’s no next workshop planned but there might still very well be another one arranged in the future. The most suitable interval for this series isn’t really determined and there might be reasons to try tweaking the format to maybe change who will attend etc.
The fact that almost half the attendees this time were newcomers was certainly good for the community but that not a single attendee traveled here from Asia was less good.
Thanks to the organizers, the program committee who set this up so nicely and the awesome sponsors!
Yesterday we plowed through a large and varied selection of HTTP topics in the Workshop. Today we continued. At 9:30 we were all in that room again. Day two.
With the audience warmed up like this, Anne van Kasteren took us back to reality with an old favorite topic in the HTTP Workshop: websockets. Not a lot of love for websockets in the room… but this was the first of several discussions during the day where a desire or quest for bidirectional HTTP streams was made obvious.
Woo Xie did a presentation with help from Alan Frindell about Extending h2 for Bidirectional Messaging and how they propose a HTTP/2 extension that adds a new frame to create a bidirectional stream that lets them do messaging over HTTP/2 fine. The following discussion was slightly positive but also contained alternative suggestions and references to some of the many similar drafts for bidirectional and p2p connections over http2 that have been done in the past.
Lucas Pardue and Nick Jones did a presentation about HTTP/2 Priorities, based a lot of research previously done and reported by Pat Meenan. Lucas took us through the history of how the priorities ended up like this, their current state and numbers and also the chaos and something about a possible future, the h3 way of doing prio and mr Meenan’s proposed HTTP/3 prio.
Nick’s second half of the presentation then took us through Cloudflare’s Edge Driven HTTP/2 Prioritisation work/experiments and he showed how they could really improve how prioritization works in nginx by making sure the data is written to the socket as late as possible. This was backed up by audience references to the TAPS guidelines on the topic and a general recollection that reducing the number connections is still a good idea and should be a goal! Server buffering is hard.
Asbjørn Ulsberg presented his case for a new request header: prefer-push. When used, the server can respond to the request with a series of pushed resources and thus save several round-rips. This triggered sympathy in the room but also suggestions of alternative approaches.
Alan Frindell presented Partial POST Replay. It’s a rather elaborate scheme that makes their loadbalancers detect when a POST to one of their servers can’t be fulfilled and they instead replay that POST to another backend server. While Alan promised to deliver a draft for this, the general discussion was brought up again about POST and its “replayability”.
Willy Tarreau followed up with a very similar topic: Retrying failed POSTs. In this this context RFC 2310 – The Safe Response Header Field was mentioned and that perhaps something like this could be considered for requests? The discussion certainly had similarities and overlaps with the SEARCH/POST discussion of yesterday.
Mike West talked about Fetch Metadata Request Headers which is a set of request headers explaining for servers where and what for what purpose requests are made by browsers. He also took us through a brief explained of Origin Policy, meant to become a central “resource” for a manifest that describes properties of the origin.
Mark Nottingham presented Structured Headers (draft). This is a new way of specifying and parsing HTTP headers that will make the lives of most HTTP implementers easier in the future. (Parts of the presentation was also spent debugging/triaging the most weird symptoms seen when his Keynote installation was acting up!) It also triggered a smaller side discussion on what kind of approaches that could be taken for HPACK and QPACK to improve the compression ratio for headers.
Willy Tarreau talked about a race problem he ran into with closing HTTP/2 streams and he explained how he worked around it with a trailing ping frame and suggested that maybe more users might suffer from this problem.
The oxygen level in the room was certainly not on an optimal level at this point but that didn’t stop us. We knew we had a few more topics to get through and we all wanted to get to the boat ride of the evening on time. So…
Hooman Beheshti polled the room to get a feel for what people think about Early hints. Are people still on board? Turns out it is mostly appreciated but not supported by any browser and a discussion and explainer session followed as to why this is and what general problems there are in supporting 1xx headers in browsers. It is striking that most of us HTTP people in the room don’t know how browsers work! Here I could mention that Cory said something about the craziness of this, but I forget his exact words and I blame the fact that they were expressed to me on a boat. Or perhaps that the time is already approaching 1am the night after this fully packed day.
The forth season of my favorite HTTP series is back! The HTTP Workshop skipped over last year but is back now with a three day event organized by the very best: Mark, Martin, Julian and Roy. This time we’re in Amsterdam, the Netherlands.
35 persons from all over the world walked in the room and sat down around the O-shaped table setup. Lots of known faces and representatives from a large variety of HTTP implementations, client-side or server-side – but happily enough also a few new friends that attend their first HTTP Workshop here. The companies with the most employees present in the room include Apple, Facebook, Mozilla, Fastly, Cloudflare and Google – having three or four each in the room.
Patrick Mcmanus started off the morning with his presentation on HTTP conventional wisdoms trying to identify what have turned out as successes or not in HTTP land in recent times. It triggered a few discussions on the specific points and how to judge them. I believe the general consensus ended up mostly agreeing with the slides. The topic of unshipping HTTP/0.9 support came up but is said to not be possible due to its existing use. As a bonus, Anne van Kesteren posted a new bug on Firefox to remove it.
Mark Nottingham continued and did a brief presentation about the recent discussions in HTTPbis sessions during the IETF meetings in Prague last week.
Martin Thomson did a presentation about HTTP authority. Basically how a client decides where and who to ask for a resource identified by a URI. This triggered an intense discussion that involved a lot of UI and UX but also trust, certificates and subjectAltNames, DNS and various secure DNS efforts, connection coalescing, DNSSEC, DANE, ORIGIN frame, alternative certificates and more.
Mike West explained for the room about the concept for Signed Exchanges that Chrome now supports. A way for server A to host contents for server B and yet have the client able to verify that it is fine.
Tommy Pauly then talked to his slides with the title of Website Fingerprinting. He covered different areas of a browser’s activities that are current possible to monitor and use for fingerprinting and what counter-measures that exist to work against furthering that development. By looking at the full activity, including TCP flows and IP addresses even lots of our encrypted connections still allow for pretty accurate and extensive “Page Load Fingerprinting”. We need to be aware and the discussion went on discussing what can or should be done to help out.
Roberto Peon presented his new idea for “Generic overlay networks”, a suggested way for clients to get resources from one out of several alternatives. A neighboring idea to Signed Exchanges, but still different. There was an interested to further and deepen this discussion and Roberto ended up saying he’d at write up a draft for it.
Max Hils talked about Intercepting QUIC and how the ability to do this kind of thing is very useful in many situations. During development, for debugging and for checking what potentially bad stuff applications are actually doing on your own devices. Intercepting QUIC and HTTP/3 can thus also be valuable but at least for now presents some challenges. (Max also happened to mention that the project he works on, mitmproxy, has more stars on github than curl, but I’ll just let it slide…)
Poul-Henning Kamp showed us vtest – a tool and framework for testing HTTP implementations that both Varnish and HAproxy are now using. Massaged the right way, this could develop into a generic HTTP test/conformance tool that could be valuable for and appreciated by even more users going forward.
Asbjørn Ulsberg showed us several current frameworks that are doing GET, POST or SEARCH with request bodies and discussed how this works with caching and proposed that SEARCH should be defined as cacheable. The room mostly acknowledged the problem – that has been discussed before and that probably the time is ripe to finally do something about it. Lots of users are already doing similar things and cached POST contents is in use, just not defined generically. SEARCH is a already registered method but could get polished to work for this. It was also suggested that possibly POST could be modified to also allow for caching in an opt-in way and Mark volunteered to author a first draft elaborating how it could work.
Indonesian and Tibetan food for dinner rounded off a fully packed day.
Thanks Cory Benfield for sharing your notes from the day, helping me get the details straight!
We’re a very homogeneous group of humans. Most of us are old white men, basically all clones and practically indistinguishable from each other. This is not diverse enough!
(I will update this blog post with more links to videos and PDFs to presentations as they get published, so come back later in case your favorite isn’t linked already.)
The third curl developers conference, curl up 2019, is how history. We gathered in the lovely Charles University in central Prague where we sat down in an excellent class room. After the HTTP symposium on the Friday, we spent the weekend to dive in deeper in protocols and curl details.
I started off the Saturday by The state of the curl project (youtube). An overview of how we’re doing right now in terms of stats, graphs and numbers from different aspects and then something about what we’ve done the last year and a quick look at what’s not do good and what we could work on going forward.
James Fuller took the next session and his Newbie guide to contributing to libcurl presentation. Things to consider and general best practices to that could make your first steps into the project more likely to be pleasant!
Long term curl hacker Dan Fandrich (also known as “Daniel two” out of the three Daniels we have among our top committers) followed up with Writing an effective curl test where the detailed what different tests we have in curl, what they’re for and a little about how to write such tests.
After that I was back behind the desk in the classroom that we used for this event and I talked The Deprecation of legacy crap (Youtube). How and why we are removing things, some things we are removing and will soon remove and finally a little explainer on our new concept and handling of “experimental” features.
Our local hero organizer James Fuller then spoiled us completely when we got around to have dinner at a monastery with beer brewing monks and excellent food. Good food, good company and curl related dinner subjects. That’s almost heaven defined!
Daylight saving time morning and you could tell. I’m sure it was not at all related to the beers from the night before…
James Fuller fired off the day by talking to us about Curlpipe (github), a DSL for building http execution pipelines.
Robin Marx then put in the next gear and entertained us another hour with a protocol deep dive titled HTTP/3 (QUIC): the details (slides). For me personally this was a exactly what I needed as Robin clearly has kept up with more details and specifics in the QUIC and HTTP/3 protocols specifications than I’ve managed and his talk help the rest of the room get at least little bit more in sync with current development.
Jakub Nesetril and Lukáš Linhart from Apiary then talked us through what they’re doing and thinking around web based APIs and how they and their customers use curl: Real World curl usage at Apiary.
Jakub Klímek explained to us in very clear terms about current and existing problems in his talk IRIs and IDNs: Problems of non-ASCII countries. Some of the problems involve curl and while most of them have their clear explanations, I think we have to lessons to learn from this: URLs are still as messy and undocumented as ever before and that we might have some issues to fix in this area in curl.
To bring my fellow up to speed on the details of the new API introduced the last year I then made a presentation called The new URL API.
I ended up doing seven presentations during this single weekend. Not all of them stellar or delivered with elegance but I hope they were still valuable to some. I did not steal someone else’s time slot as I would gladly have given up time if we had other speakers wanted to say something. Let’s aim for more non-Daniel talkers next time!
A weekend like this is such a boost for inspiration, for morale and for my ego. All the friendly faces with the encouraging and appreciating comments will keep me going for a long time after this.
Thank you to our awesome and lovely event sponsors – shown in the curl up logo below! Without you, this sort of happening would not happen.
curl up 2020
I will of course want to see another curl up next year. There are no plans yet and we don’t know where to host. I think it is valuable to move it around but I think it is even more valuable that we have a friend on the ground in that particular city to help us out. Once this year’s event has sunken in properly and a month or two has passed, the case for and organization of next year’s conference will commence. Stay tuned, and if you want to help hosting us do let me know!
This year’s version of curl up started a little differently: With an afternoon of HTTP presentations. The event took place the same week the IETF meeting has just ended here in Prague so we got the opportunity to invite people who possibly otherwise wouldn’t have been here… Of course this was only possible thanks to our awesome sponsors, visible in the image above!
Lukáš Linhart from Apiary started out with “Web APIs: The Past, The Present and The Future”. A journey trough XML-RPC, SOAP and more. One final conclusion might be that we’re not quite done yet…
James Fuller from MarkLogic talked about “The Defenestration of Hypermedia in HTTP”. How HTTP web technologies have changed over time while the HTTP paradigms have survived since a very long time.
I talked about DNS-over-HTTPS. A presentation similar to the one I did before at FOSDEM, but in a shorter time so I had to talk a little faster!
Mike Bishop from Akamai (editor of the HTTP/3 spec and a long time participant in the HTTPbis work) talked about “The evolution of HTTP (from HTTP/1 to HTTP/3)” from HTTP/0.9 to HTTP/3 and beyond.
Robin Marx then rounded off the series of presentations with his tongue in cheek “HTTP/3 (QUIC): too big to fail?!” where we provided a long list of challenges for QUIC and HTTP/3 to get deployed and become successful.
We ended this afternoon session with a casual Q&A session with all the presenters discussing various aspects of HTTP, the web, REST, APIs and the benefits and deployment challenges of QUIC.
I think most of us learned things this afternoon and we could leave the very elegant Charles University room enriched and with more food for thoughts about these technologies.
We ended the evening with snacks and drinks kindly provided by Apiary.
(This event was not streamed and not recorded on video, you had to be there in person to enjoy it.)
The 180th public curl release is a patch release: 7.64.1. There’s been 49 days since 7.64.0 shipped. The first release since our 21st birthday last week. (Full changelog.)
the 180th release 2 changes 49 days (total: 7,677) 116 bug fixes (total: 5,029) 184 commits (total: 24,111) 0 new public libcurl functions (total: 80) 2 new curl_easy_setopt() options (total: 267) 1 new curl command line option (total: 221) 49 contributors, 25 new (total: 1,929) 25 authors, 10 new (total: 669) 0 security fixes (total: 87)
This is a patch release but we still managed to introduce some fun news in this version. We ship brand new alt-svc support which we encourage keen and curious users to enable in their builds and test out. We strongly discourage anyone from using that feature in production as we reserve ourselves the right to change it before removing the EXPERIMENTAL label. As mentioned in the blog post linked above, alt-svc is the official way to bootstrap into HTTP/3 so this is a fundamental stepping stone for supporting that protocol version in a future curl.
We also introduced brand new support for the Amiga-specific TLS backend AmiSSL, which is a port of OpenSSL to that platform.
With over a hundred bug-fixes landed in this period there are a lot to choose from, but some of the most most fun and important ones from my point of view include the following.
connection check crash
This was a rather bad regression that occasionally caused crashes when libcurl would scan its connection cache for a live connection to reuse. Most likely to trigger with the Schannel backend.
connection sharing crash
The example source code that uses a shared connection cache among many threads was another crash regression. It turned out a thread could accidentally get hold of a connection already in private use by another thread…
“Expire in…” logs removed
Having the harmless but annoying text there was a mistake to begin with. It was a debug-only line that accidentally was pushed and not discovered in time. It’s history now.
curl -M manual removed
The tutorial-like manual piece that was previously included in the -M (or –manual) built-in command documentation, is no longer included. The output shown is now just the curl.1 man page. The reason for this is that the tutorial has gone a bit stale and there is now better updated and better explained documentation elsewhere. Primarily perhaps in everything curl. The online version of that document will eventually also be removed.
TLS terminology cleanups
We now refer to the Windows TLS backend as “Schannel” and the Apple macOS one as “Secure Transport” in all curl code and documentation. Those are the official names and those are the names people in general know them as. No more use of the former names that sometimes made people confused.
Shaving off bytes and mallocs
We rearranged the layout of a few structs and changed to using bitfields instead of booleans and more. This way, we managed to shrink two of the primary internal structs by 5% and 11% with no functionality change or loss.
Similarly, we removed a few mallocs, even in the common code path, so now the number of allocs for my regular test download of 4GB data over a localhost HTTP server claims fewer allocs than ever before.
We estimate that there will be a 7.65.0 release to ship 56 days from now. Then we will remove some deprecated features, perhaps add something new and quite surely fix a whole bunch of more bugs. Who know what fun we will come up with at curl up this coming weekend?
Keep reporting. Keep posting pull-requests. We love them and you!