Category Archives: Open Source

Open Source, Free Software, and similar

picturing curl’s future

development graph

There will be more stuff over time in the cURL project. Exactly which stuff, and how long everything will take, we don’t know. It depends largely on who works on what and how much time those people can spend on implementing the stuff they work on…

I suspect we might be able to do things slightly faster over time, which is why the red arrow isn’t just a straight line.

I drew this little picture inspired by discussions with friends after a talk I gave about curl and how development works in an open source project such as this. We know we will work on things that will improve the products, but we can’t see exactly what very far in advance. I tweeted this picture a few days ago, and it turned out to be very popular.

2015 curl user poll analysis

My full 30-page document with all details and analyses of the curl user poll 2015 is now available. It shows details of all the questions, most of them with a comparison with last year’s survey. The write-ins are also full of good advice, wisdom and some signs of ignorance or unawareness.

I hope all curl hackers and others generally interested in the project can use my “report” to learn something about our users and our users’ view of the project and our products.

Let’s use this to guide us going forward.

keep-calm-and-improve-curl

status update: http2 multiplexed uploads

I wrote a previous update about my work on multiplexing in curl. This is a follow-up to describe the status as of today.

I’ve successfully used the http2-upload.c code to upload 600 parallel streams to the test server and they were all sent off fine and the responses received were stored fine. MAX_CONCURRENT_STREAMS on the server was set to 100.

This is using curl git master as of right now (thus scheduled for inclusion in the pending curl 7.43.0 release). I’m not celebrating just yet, but it is looking pretty good. I’ll continue testing.

Commit b0143a2a3 was crucial for this, as I realized we stored and used the read callback in the connection struct rather than in the easy handle, which is completely wrong when many easy handles are using the same connection! I don’t recall the exact reason why I originally put the data in that struct (I went back and read the commit messages etc) but I think the new setup is correct conceptually and code-wise, so if this leads to some side-effects I think we just need to fix them.

Next up: more testing, and then taking on the concept of server push to make libcurl able to support it. It will certainly be a subject for future blog posts…


RFC 7540 is HTTP/2

HTTP/2 is the new protocol for the web, as I trust everyone reading my blog is fully aware by now. (If you’re not, read http2 explained.)

Today RFC 7540 was published, the final outcome of the years of work put into this by the tireless heroes in the HTTPbis working group of the IETF. Closely related to the main RFC is the one detailing HPACK, which is the header compression algorithm used by HTTP/2 and that is now known as RFC 7541.

The IETF part of this journey started pretty much with Mike Belshe’s posting of draft-mbelshe-httpbis-spdy-00 in February 2012. Google’s SPDY effort had been going on for a while when it was taken to the httpbis working group in the IETF, where a few different proposals on how to kick off the HTTP/2 work were debated.

HTTP team working in London

The first “httpbis’ified” version of that document (draft-ietf-httpbis-http2-00) was then published on November 28, 2012 and the standardization work began for real. HTTP/2 was of course discussed a lot on the mailing list from the start, at the IETF meetings but also in interim meetings around the world.

There was one in Zurich in January 2014 that I only attended remotely. We had the design team meeting in London immediately after IETF89 (March 2014) in the Mozilla offices just next to Piccadilly Circus (where I took the photos that are shown in this posting). We had our final in-person meetup with the HTTP team at Google’s offices in NYC in June 2014 where we ironed out most of the remaining issues.

In between those last two meetings I published my first version of http2 explained, my attempt at a lengthy and very detailed description of HTTP/2, including describing problems with HTTP/1.1 and motivations for HTTP/2. I’ve since published eleven updates.

HTTP team in London, debating protocol details

The last draft update of HTTP/2 that contained actual changes to the binary format was draft-14, published in July 2014. After that, the updates were to the language and clarifications on what to do when. There were some functional changes (added in -16, I believe), such as which sorts of frames are accepted when, that change what a state machine should do, but they don’t change how the protocol looks on the wire.

RFC 7540 was published on May 15th, 2015

I’ve truly enjoyed having had the chance to be a part of this. There are a bunch of good people who made this happen and while I am most certainly forgetting key persons, some of the peeps that have truly stood out are: Mark, Julian, Roberto, Roy, Will, Tatsuhiro, Patrick, Martin, Mike, Nicolas, Mike, Jeff, Hasan, Herve and Willy.


curl user poll 2015

Update: the poll is now closed. The responses can be found here.

Now is the time. If you use curl or libcurl from time to time, please consider helping us out by providing your feedback and opinions on a few things:

https://goo.gl/FyToBn

It’ll take you a couple of minutes and it’ll help us a lot when making decisions going forward.

The poll is hosted by Google and that short link above will take you to:

https://docs.google.com/forms/d/1uQNYfTmRwF9RX5-oq_HV4VyeT1j7cxXpuBIp8uy5nqQ/viewform

HTTP/2 in curl, status update

I’m right now working on adding proper multiplexing to libcurl’s HTTP/2 code. So far we’ve only done a single stream per connection and while that works fine and is HTTP/2, applications will still want more when switching to HTTP/2, as the multiplexing part is one of the key components and selling features of the new protocol version.

Pipelining means multiplexed

As a starting point, I’m using the “enable HTTP pipelining” switch to tell libcurl it should consider multiplexing. It makes libcurl work as before by default. If you use the multi interface and enable pipelining, libcurl will try to re-use established connections and just add streams over them rather than creating new connections. Yes, this means that A) you need to use the multi interface to get the full HTTP/2 stuff and B) the curl tool won’t be able to take advantage of it since it doesn’t use the multi interface! (An old outstanding idea is to move the tool to use the multi interface and this would be yet another reason why that could be a good idea.)

We still have some decisions to make about how we want libcurl to act by default – especially when we can expect applications to use both HTTP/1.1 and HTTP/2 at the same time. Since we don’t know if the server supports HTTP/2 until after a certain point in the negotiation, we need to decide what to do when we issue N transfers at once to the same server that might speak HTTP/2… Right now, we get the best HTTP/2 behavior by telling libcurl we only want one connection per host but that is probably not ideal for an application that might use a mix of HTTP/1.1 and HTTP/2 servers.

Downsides with abusing pipelining

There are some drawbacks with using that pipelining switch to allow multiplexing since users may very well want HTTP/2 multiplexing but not HTTP/1.1 pipelining since the latter is just riddled with interop problems.

Also, re-using the same options for limited connections to host names etc for both HTTP/1.1 and HTTP/2 may not at all be what real-world applications want or need.

One easy handle, one stream

libcurl API wise, each HTTP/2 stream is its own easy handle. It makes it simple and keeps the API paradigm very much the same as it works for all the other protocols. It comes very naturally for the libcurl application author. If you set up three easy handles, all identifying a resource on the same server, and you tell libcurl to use HTTP/2, it makes perfect sense that all three transfers are made using a single connection.

Multiplexed data means that when reading from the socket, data arrives that belongs to more streams than just a single one, so we need to feed the received data into the different “data buckets” for the involved streams. It gives us a little internal challenge: we get easy handles with no socket activity to trigger a read, but there is data to take care of in the incoming buffer. I’ve solved this so far with a special trigger that says there is data to take care of, so the handle should do a read anyway, which will then get the data from the buffer.
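To make the “data buckets” idea a bit more concrete, here is a small sketch, written in Python for brevity rather than in libcurl’s C. It is not libcurl’s code: the `Demux` class, the `(stream_id, payload)` frame tuples and the `pending` set are all names made up for this illustration. It shows one socket read delivering data for several streams, with a “pending” marker standing in for the special trigger that tells a handle to read even though its socket showed no new activity.

```python
from collections import defaultdict


class Demux:
    """Toy demultiplexer for one connection shared by many streams.

    Hypothetical illustration only; libcurl's real implementation is in C
    and is structured differently.
    """

    def __init__(self):
        # One "data bucket" (byte buffer) per stream id.
        self.buckets = defaultdict(bytearray)
        # Streams with buffered data that nobody has read yet. This plays
        # the role of the "there is data to take care of" trigger.
        self.pending = set()

    def on_socket_read(self, frames):
        """Sort the frames from a single socket read into per-stream buckets."""
        for stream_id, payload in frames:
            self.buckets[stream_id] += payload
            self.pending.add(stream_id)

    def read(self, stream_id):
        """Drain the bucket for one stream without touching the socket."""
        data = bytes(self.buckets[stream_id])
        self.buckets[stream_id].clear()
        self.pending.discard(stream_id)
        return data


demux = Demux()
# A single read triggered by stream 1 also carries data for streams 3 and 5.
demux.on_socket_read([(1, b"aa"), (3, b"bb"), (1, b"cc"), (5, b"dd")])

first = demux.read(1)                  # data for stream 1, in arrival order
still_waiting = sorted(demux.pending)  # streams that must read without socket activity
```

After the single read, streams 3 and 5 have data waiting even though nothing new will show up on the socket for them, which is exactly the situation the extra trigger has to cover.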

Server push

HTTP/2 supports server push. That’s a stream that gets initiated from the server side without the client specifically asking for it: a resource the server deems likely that the client wants since it asked for a related resource, or similar. My idea is to support server push with the application setting up a transfer with an easy handle and associated options, but the URL would only identify the server so that it knows on which connection it would accept a push. We would then introduce a new option to libcurl that tells it that this is an easy handle that should be used for the next server pushed stream on this connection.

Of course there are a few outstanding issues with this idea. Possibly we should allow an easy handle to get created when a new stream shows up so that we can better deal with a dynamic number of new streams being pushed.

It’d be great to hear from users who have ideas on how to use server push in a real-world application and how you’d imagine it could be used with libcurl.

Work in progress code

My work in progress code for this drive can be found in two places.

First, I do the libcurl multiplexing development in the separate http2-multiplex branch in the regular curl repo:

https://github.com/bagder/curl/tree/http2-multiplex

Then, I put all my test setup and test client work in a separate repository just in case you want to keep up and reproduce my testing and experiments:

https://github.com/bagder/curl-http2-dev

Feedback?

All comments, questions, praise or complaints you may have on this are best sent to the curl-library mailing list. If you are planning on writing an HTTP/2 capable application or otherwise have thoughts or ideas about the API for this, please join in and tell me what you think. It is much better to get the discussions going early and work on different design ideas now, before anything is set in stone, rather than waiting for us to ship something semi-stable, as the closer to an actual release we get, the harder it’ll be to change the API.

Not quite working yet

As I write this, I’m repeatedly doing 99 parallel HTTP/2 streams with no data corruption… But there’s a lot more to be done before I’ll call it a victory.

talking curl on the changelog

The changelog is the name of a weekly podcast on which the hosts discuss open source and stuff.

Last Friday I was invited to participate and I joined hosts Adam and Jerod for an hour-long episode about curl. It all started as a response to my post on curl 17 years, so we really got into how things started out and how curl has developed through the years, how much time I’ve spent on it and whether I could mention a really great moment in time that stood out over the years.

The day before, they released the little separate teaser we made about the little known --remote-name-all command line option that basically makes curl default to do -O on all given URLs.

The full length episode can be experienced in all its glory here: https://changelog.com/153/

Summing up the birthday festivities

I blogged about curl’s 17th birthday on March 20th 2015. I’ve done similar posts in the past and they normally pass by mostly undetected and hardly discussed. This time, something else happened.

Primarily, the blog post quickly became the single most viewed blog entry I’ve ever written – and I’ve been doing it for many many years. Already in the first day it was up, I counted more than 65,000 views.

The blog post got more comments than any other blog post I’ve ever done. They have probably stopped coming by now, but there are 60 of them, almost every one of them saying congratulations and/or thanks.

The posting also got discussed on both hacker news and reddit, totaling more than 260 comments. Most of those in positive spirit.

The initial tweet I made about my blog post is the most retweeted and starred tweet I’ve ever posted. At least 87 retweets and 49 favorites (it might even grow a bit more over time). Others subsequently also tweeted the link hundreds of times. I got numerous replies and friendly call-outs on twitter saying “congrats” and “thanks” in many variations.

Spontaneously (i.e. not initiated or requested by me but most probably because of a comment on hacker news), I also suddenly started to get donations from the curl web site’s donation web page (to paypal). Within 24 hours from my post, I had received 35 donations from friendly fans who donated a total sum of 445 USD. A quick count revealed that the total number of donations ever through the history of curl’s lifetime was 43 before this day. In one day we had basically gotten as many as we had gotten in the first 17 years.

Interesting data from this donation “race”: I got donations varying from 1 USD (yes one dollar) to 50 USD and the average donation was then 12.7 USD.
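For anyone who wants to check the arithmetic, here is a tiny sketch using only the numbers quoted above. The variable names are mine, and the “share” figure is simply derived from the 35 and 43 counts rather than something I measured separately.

```python
donations = 35          # donations received within 24 hours of the post
total_usd = 445         # their combined value in USD
earlier_donations = 43  # all donations ever received before that day

# Average size of a donation during the "race".
average = total_usd / donations

# Fraction of all donations ever that arrived in this single day.
share_of_all = donations / (donations + earlier_donations)

print(f"average donation: {average:.1f} USD")
print(f"share of all donations ever: {share_of_all:.0%}")
```

So the one day accounts for a bit less than half of all donations received up to that point, which is what “basically as many” amounts to.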

Let me end this summary by thanking everyone who in various ways made the curl birthday extra fun by being nice and friendly and some even donating some of their hard earned money. I am honestly touched by the attention and all the warmth and positiveness. Thank you for proving internet comments can be this good!

curl, 17 years old today

Today we celebrate the fact that it is exactly 17 years since the first public release of curl. I have always been the lead developer and maintainer of the project.

Birthdaycake

When I released that first version in the spring of 1998, we had only a handful of users and a handful of contributors. curl was just a little tool and we were still a few years out before libcurl would become a thing of its own.

The tool we had been working on for a while was still called urlget in the beginning of 1998, but as we had just recently added FTP upload capabilities that name no longer fit and I decided cURL would be more suitable. I picked ‘cURL’ because the word contains URL and already then the tool worked primarily with URLs, and I thought that it was fun to partly make it a real English word “curl” but also that you could pronounce it “see URL” as the tool would display the contents of a URL.

Much later, someone (I forget who) came up with the “backronym” Curl URL Request Library which of course is totally awesome.

17 years are 6209 days. During this time we’ve done more than 150 public releases containing more than 2600 bug fixes!
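The day count is easy to verify. This little sketch assumes the first public release was on March 20, 1998, which follows from this being posted exactly 17 years later on March 20, 2015. The variable names are just for illustration.

```python
from datetime import date

first_release = date(1998, 3, 20)  # first public release of curl (assumed date, see above)
birthday = date(2015, 3, 20)       # the 17th birthday

# Subtracting two dates gives a timedelta; .days is the whole-day count.
days = (birthday - first_release).days

# Cross-check by hand: 17 ordinary years plus one day per leap year in between.
leap_days = sum(
    1
    for year in range(1998, 2015)
    if year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
)
by_hand = 17 * 365 + leap_days

print(days, by_hand)
```

Both ways of counting agree, with the four leap days coming from 2000, 2004, 2008 and 2012.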

We started out GPL licensed, switched to MPL and then landed in MIT. We started out using RCS for version control, switched to CVS and then git. But it has stayed written in good old C the entire time.

The term “Open Source” was coined in 1998 when the Open Source Initiative was started just the month before curl was born, which was preceded by just a few days by the announcement from Netscape that they would free their browser code and make an open browser.

We’ve hosted parts of our project on servers run by the various companies I’ve worked for and we’ve been on and off various free services. Things come and go. Virtually nothing stays the same so we better just move with the rest of the world. These days we’re on github a lot. Who knows how long that will last…

We have grown to support a ridiculous number of protocols and curl can be built to run on virtually every modern operating system and CPU architecture.

The list of helpful souls who have contributed to make curl into what it is now has grown at a steady pace all through the years and it now holds more than 1200 names.

Employments

In 1998, I was employed by a company named Frontec Tekniksystem. I would later leave that company and today there’s nothing left in Sweden using that name as it was sold and most employees later fled to other places. After Frontec I joined Contactor for many years until I started working for my own company, Haxx (which we started on the side many years before that), during 2009. Today, I am employed by my fourth company during curl’s lifetime: Mozilla. All through this project’s lifetime, I’ve kept my work situation separate and I believe I haven’t allowed it to disturb our project too much. Mozilla is however the first one that actually allows me to spend a part of my time on curl and still get paid for it!

The Netscape announcement, which was made 2 months before curl was born, later led to Mozilla and the Firefox browser. Where I work now…

Future

I’m not one of those who spend time gazing toward the horizon dreaming of future grandness and making up plans on how to get there. I work on stuff right now to work tomorrow. I have no idea what we’ll do and work on a year from now. I know a bunch of things I want to work on next, but I’m not sure I’ll ever get to them or whether they will actually ship or if they perhaps will be replaced by other things in that list before I get to them.

The world, the Internet and transfers are all constantly changing and we’re adapting. No long-term dreams other than sticking to the very simple and single plan: we do file-oriented internet transfers using application layer protocols.

Rough estimates say we may have a billion users already. Chances are, if things don’t change too drastically without us being able to keep up, that we will have even more in the future.

1000 million users

It has to feel good, right?

I will of course point out that I did not take curl to this point on my own, but that aside, the ego boost this level of success brings is beyond imagination. Thinking about how my code has ended up in so many places, and is driving so many little pieces of modern network technology, is truly mind-boggling. When I specifically sit down or get a reason to think about it, at least.

Most days however, I tear my hair when fixing bugs, or I try to rephrase my emails to not sound old and bitter (even though I can very well be that) when I once again try to explain things to users who can be extremely unfriendly and whining. I spend late evenings on curl when my wife and kids are asleep. I escape my family and rob them of my company to improve curl even on weekends and vacations. Alone in the dark (mostly) with my text editor and debugger.

There’s no glory and there’s no eternal bright light shining down on me. I have not climbed up onto a level where I have a special status. I’m still the same old me, hacking away on code for the project I like and that I want to be as good as possible. Obviously I love working on curl so much I’ve been doing it for over seventeen years already and I don’t plan on stopping.

Celebrations!

Yeps. I’ll get myself an extra drink tonight and I hope you’ll join me. But only one, we’ll get back to work again afterward. There are bugs to fix, tests to write and features to add. Join in the fun! My backlog is only growing…