It isn’t actually going away. It’s just been thrown over the fence to the Apache project and Subversion itself to host and maintain going forward.
When the Subversion project started in the early year 2000, I was there. I joined the project and participated in the early days of its development as I really believed in creating an “improved CVS” and I thought I could contribute to it.
While I was involved with the project, I noticed the lack of a decent mailing list archive for the discussions and set one up under the name svn.haxx.se as a service for myself and for the entire community. I had the server and the means to do it, so why not?
After some years I drifted away from the project. It was doing excellently and I was never any significant contributor. Then git and some of the other distributed version control systems came along and in my mind they truly showed the world how version control should be done…
The mailing list archive however I left, and I had even added more subversion related lists to it over time. It kept chugging along without me having to do much. Mails flew in, got archived and were made available for the world to search for and link to. Today it has over 390,000 emails archived from over twenty years of rather active open source development on multiple mailing lists. It is fascinating that no less than 46 persons have written more than a thousand emails each on those lists during these two decades.
The physical machine that runs the website is going to be shut down and taken out of service soon, and instead of just shutting down this service I’ve worked with the good people in the Subversion project and the hosting of that site and archive has now been taken over by the Apache project instead. It is no longer running on my machine. If you discover any issues with it, you need to talk to them.
Today, January 20 2021, I updated the DNS to instead have the host name svn.haxx.se point to Apache’s web server. I believe the plan is to keep the site as an archive of past emails and not add any new emails to it as of now.
I hereby sign off my twenty years of service as an svn email archive janitor. It was a pleasure to serve you.
I founded the curl project early 1998 but had already then been working on the code since November 1996. The source code was always open, free and available to the world. The term “open source” actually wasn’t even coined until early 1998, just weeks before curl was born.
In the beginning of course, the first few years or so, this project wasn’t seen or discovered by many and just grew slowly and silently in a dusty corner of the Internet.
Already when I shipped the first versions I wanted the code to be open and freely available. For years I had seen the cool free software put out the in the world by others and I wanted my work to help build this communal treasure trove.
When I started this journey I didn’t really know what I wanted with curl’s license and exactly what rights and freedoms I wanted to give away and it took a few years and attempts before it landed.
The early versions were GPL licensed, but as I learned about resistance from proprietary companies and thought about it further, I changed the license to be more commercially friendly and to match my conviction better. I ended up with MIT after a brief experimental time using MPL. (It was easy to change the license back then because I owned all the copyrights at that point.)
To be exact: we actually have a slightly modified MIT license with some very subtle differences. The reason for the changes have been forgotten and we didn’t get those commits logged in the “big transition” to Sourceforge that we did in late 1999… The end result is that this is now often recognized as “the curl license”, even though it is in effect the MIT license.
The license says everyone can use the code for whatever purpose and nobody is required to ship any source code to anyone, but they cannot claim they wrote it themselves and the license/use of the code should be mentioned in documentation or another relevant location.
As licenses go, this has to be one of the most frictionless ones there is.
Open source relies on a solid copyright law and the copyright owners of the code are the only ones who can license it away. For a long time I was the sole copyright owner in the project. But as I had decided to stick to the license, I saw no particular downsides with allowing code and contributors (of significant contributions) to retain their copyrights on the parts they brought. To not use that as a fence to make contributions harder.
Today, in early 2021, I count 1441 copyright strings in the curl source code git repository. 94.9% of them have my name.
I never liked how some projects require copyright assignments or license agreements etc to be able to submit code or patches. Partly because of the huge administrative burden it adds to the project, but also for the significant friction and barrier to entry they create for new contributors and the unbalance it creates; some get more rights than others. I’ve always worked on making it easy and smooth for newcomers to start contributing to curl. It doesn’t happen by accident.
In many ways, running a spare time open source project is easy. You just need a steady income from a “real” job and sufficient spare time, and maybe a server to host stuff on for the online presence.
The challenge is of course to keep developing it, adding things people want, to help users with problems and to address issues timely. Especially if you happen to be lucky and the user amount increases and the project grows in popularity.
I ran curl as a spare time project for decades. Over the years it became more and more common that users who submitted bug reports or asked for help about things were actually doing that during their paid work hours because they used curl in a commercial surrounding – which sometimes made the situation almost absurd. The ones who actually got paid to work with curl were asking the unpaid developers to help them out.
I changed employers several times. I started my own company and worked as my own boss for a while. I worked for Mozilla on network stuff in Firefox for five years. But curl remained a spare time project because I couldn’t figure out how to turn it into a job without risking the project or my economy.
Earning a living
For many years it was a pipe dream for me to be able to work on curl as a real job. But how do I actually take the step from a spare time project to doing it full time? I give away all the code for free, and it is a solid and reliable product.
The initial seeds were planted when I met and got to know Larry (wolfSSL CEO) and some of the other good people at wolfSSL back in the early 2010s. This, because wolfSSL is a company that write open source libraries and offer commercial support for them – proving that it can work as a business model. Larry always told me he thought there was a possibility waiting here for me with curl.
Apart from the business angle, if I would be able to work more on curl it could really benefit the curl project, and then of course indirectly everyone who uses it.
It was still a step to take. When I gave up on Mozilla in 2018, it just took a little thinking before I decided to try it. I joined wolfSSL to work on curl full time. A dream came true and finally curl was not just something I did “on the side”. It only took 21 years from first curl release to reach that point…
I’m living the open source dream, working on the project I created myself.
Food for free code
We sell commercial support for curl and libcurl. Companies and users that need a helping hand or swift assistance with their problems can get it from us – and with me here I dare to claim that there’s no company anywhere else with the same ability. We can offload engineering teams with their curl issues. Up to 24/7 level!
We also offer custom curl development, debugging help, porting to new platforms and basically any other curl related activity you need. See more on the curl product page on the wolfSSL site.
curl (mostly in the shape of libcurl) runs in ten billion installations: some five, six billion mobile phones and tablets – used by several of the most downloaded apps in existence, in virtually every website and Internet server. In a billion computer games, a billion Windows machines, half a billion TVs, half a billion game consoles and in a few hundred million cars… curl has been made to run on 82 operating systems on 22 CPU architectures. Very few software components can claim a wider use.
“Isn’t it easier to list companies that are not using curl?”
Wide use and being recognized does not bring food on the table. curl is also totally free to download, build and use. It is very solid and stable. It performs well, is documented, well tested and “battle hardened”. It “just works” for most users.
Pay for support!
How to convince companies that they should get a curl support contract with me?
Paying customers get to influence what I work on next. Not only distant road-mapping but also how to prioritize short term bug-fixes etc. We have a guaranteed response-time.
You get your issues first in line to get fixed. Customers also won’t risk getting their issues added the known bugs document and put in the attic to be forgotten. We can help customers make sure their application use libcurl correctly and in the best possible way.
I try to emphasize that by getting support from us, customers can take away some of those tasks from their own engineers and because we are faster and better on curl related issues, that is a pure net gain economically. For all of us.
This is not an easy sell.
Sure, curl is used by thousands of companies everywhere, but most of them do it because it’s free (in all meanings of the word), functional and available. There’s a real challenge in identifying those that actually use it enough and value the functionality enough that they realize they want to improve their curl foo.
Most of our curl customers purchased support first when they faced a complicated issue or problem they couldn’t fix themselves – this fact gives me this weird (to the wider curl community) incentive to not fix some problems too fast, because it then makes it work against my ability to gain new customers!
We need paying customers for this to be sustainable. When wolfSSL has a sustainable curl business, I get paid and the work I do in curl benefits all the curl users; paying as well as non-paying.
There’s clearly business in releasing open source under a strong copyleft license such as GPL, and as long as you keep the copyrights, offer customers to purchase that same code under another more proprietary- friendly license. The code is still open source and anyone doing totally open things can still use it freely and at no cost.
We’ve shipped tiny-curl to the world licensed under GPLv3. Tiny-curl is a curl branch with a strong focus on thetiny part: the idea is to provide a libcurl more suitable for smaller systems, the ones that can’t even run a full Linux but rather use an RTOS.
Consider it a sort of experiment. Are users interested in getting a smaller curl onto their products and are they interested in paying for licensing. So far, tiny-curl supports two separate RTOSes for which we haven’t ported the “normal” curl to.
Keeping things separate
Maybe you don’t realize this, but I work hard to keep separate things compartmentalized. I am not curl, curl is not wolfSSL and wolfSSL is not me. But we all overlap greatly!
I work for wolfSSL. I work on curl. wolfSSL offers commercial curl support.
One idea that we haven’t explored much yet is the ability to make and offer “reserved features” to paying customers only. This of course as another motivation for companies to become curl support customers.
Such reserved features would still have to be sensible for the curl project and most likely we would provide them as specials for paying customers for a period of time and then merge them into the “real” open source curl project. It is very important to note that this will not in any way make the “regular curl” worse or a lesser citizen in any way. It would rather be a like a separate product, a curl+ with extra stuff on top of vanilla curl.
Since we haven’t ventured into this area yet, we haven’t worked out all the details. Chances are we will wander into this territory soon.
I do occasional speaking gigs on curl and HTTP related topics but even if I charge for them this activity never brings much more than some extra pocket money. I do it because it’s fun and educational.
It has been suggested that I should create a web shop to sell curl branded merchandise in, like t-shirts, mugs, etc but I think that grossly over-estimates the user interest and how much margin I could put on mundane things just because they’d have a curl logo glued on them. Also, I would have a difficult time mentally to sell curl things and claim the profit personally. I rather keep giving away curl stash (mostly stickers) for free as a means to market the project and long term encourage users into buying support.
We receive money to the curl project through donations, most of them via our opencollective account. It is important to note that even if I’m a key figure in the project, this is not my money and it’s not my project. Donated money is spent on project related expenses, which so far primarily is our bug bounty program. We’ve avoided to spend donated money on direct curl development, and especially such that I could provide or benefit from myself, as that would totally blur the boundaries. I’m not ruling out taking that route in a future though. As long as and only if it is to the project’s benefit.
Donations via GitHub to me personally sponsors me personally and ends up in my pockets. That’s not curl money but I spend it mostly on curl development, equipment etc and it makes me able to not have to think twice when sending curl stickers to fans and friends all over the world. It contributes to food on my table and I like to think that an occasional beer I drink is sponsored by friends out there!
The future I dream of
We get a steady number of companies paying for support at a level that allows us to also pay for a few more curl engineers than myself.
I started learning how to program in my teens, well over thirty years ago and I’ve worked as a software engineer and developer since the early 1990s. My first employment as a developer was in 1993. I’ve since worked for and with lots of companies and I’ve worked on a huge amount of (proprietary) software products and devices over many years. Meaning: I certainly didn’t start my life open source. I had to earn it.
When I was 20 years old I did my (then mandatory) military service in Sweden. After having endured that, I applied to the university while at the same time I was offered a job at IBM. I hesitated, but took the job. I figured I could always go to university later – but life took other turns and I never did. I didn’t do a single day of university. I haven’t regretted it.
I learned to code in the mid 80s on a Commodore 64 and software development has been one of my primary hobbies ever since. One thing it taught me well, that I still carry with me, is to spend a few hours per day in front of my home computer.
And then I shipped curl
In the spring of 1998 I renamed my little pet project of the time again and I released the first ever curl release. I have told this story many times, but since then I have spent two hours or so of my spare time on that project – every day for over twenty years. While still working as a software engineer by day.
Over time, curl gradually grew popular and attracted more users. There was no sudden moment in time where I struck gold and everything took off. It was just slowly gaining ground while me and my fellow project members kept improving and polishing curl. At some point in time I happened to notice that curl and libcurl would appear in more and more acknowledgements and in open source license collections in products and devices.
It was still just a spare time project.
Proprietary Software for years
I’d like to emphasize that I worked as a contract and consultant developer for many years (over 20!), primarily on proprietary software and custom solutions, before I managed to land myself a position where I could primarily write open source as part of my job.
In 2014 I joined Mozilla and got the opportunity to work on the open source project Firefox for a living – and doing it entirely from my home. This was the first time in my career I actually spent most of my days on code that was made public and available to the world. They even allowed me to spend a part of my work hours on curl, even if that didn’t really help them and curl was not a fundamental part of any Mozilla work or products. It was still great.
I landed that job for Mozilla a lot thanks to my many years and long experience with portable network coding and running a successful open source project at this level.
My work setup with Mozilla made it possible for me to spend even more time on curl, apart from the (still going) two daily spare time hours. Nobody at Mozilla cared much about (my work with) curl and no one there even asked me about it. I worked on Firefox for a living.
For anyone wanting to do open source as part of their work, getting a job at a company that already does a lot of open source is probably the best path forward. Even if that might not be easy either, and it might also mean that you would have to accept working on some open source projects that you might not yourself be completely sold on.
In late 2018 I quit Mozilla, in part because I wanted to try to work with curl “for real” (and part other reasons that I’ll leave out here). curl was then already over twenty years old and was used more than ever before.
I now work for wolfSSL. We sell curl support and related services to companies. Companies pay wolfSSL, wolfSSL pays me a salary and I get food on the table. This works as long as we can convince enough companies that this is a good idea.
The vast majority of curl users out there of course don’t pay anything and will never pay anything. We just need a small number of companies to do it – and it seems to be working. We help customers use curl better, we make curl better for them and we make them ship better products this way. It’s a win win. And I can work on open source all day long thanks to this.
My open source life-style
A normal day in the work week, I get up before 7 in the morning and I have breakfast with my family members: my wife and my two kids. Then I grab my first cup of coffee for the day and take the thirteen steps up the stairs to my “office”.
I sit down in front of my main development (Linux) machine with two 27″ screens and get to work.
What work and in what order?
I lead the curl project. It means many questions and decisions fall down to me to have an opinion about or say on, and it’s a priority for me to make sure that I unblock such situations as soon as possible so that developers wanting to do things with curl can continue doing that.
Thus, I read and respond to email about curl all hours I’m awake and have network access. Of course incoming messages actually rarely require immediate responses and then I can queue them up and instead do them later. I also try to read and assess all new incoming curl issues as soon as possible to see if there’s something urgent I should deal with immediately, or otherwise figure out how to deal with them going forward.
I usually have a set of bugs or features to work on so when there’s no alarming email or GitHub issue left, I context-switch over to the curl source code tree and the particular branch in which I work on right now. I typically have 20-30 different branches of development of various stages and maturity going on. If I get stuck on something, or if I create a pull-request for one of them that needs time to get all the CI jobs done, I switch over to one of the others.
Customers and their needs of course have priority when I decide what to work on. The exception would perhaps be security vulnerabilities or other really serious bugs being reported, but thankfully they are rare. But after that, I go by ear and work on what I think is fun and what I think users might appreciate.
If I want to go forward with something, for my own sake or for a customer’s, and that entails touching or improving other software in other projects, then I don’t shy away from submitting pull requests for them – or at least filing an issue.
Spare time open source
Yes, I still spend my spare time hours on open source, mostly curl. This means I often end up spending 50-55 hours per week on curl and curl related activities. But I don’t count or measure work hours and I rarely have to report any to anyone. This is a work of love.
Lots of people will say that they don’t have time because of life, family, kids etc. I have of course been very fortunate over the years to have had the opportunity and ability to spend all this time on what I want to do, but let’s not forget that people in general spend lots of time on their hobbies; on watching TV, on playing computer games and on socializing with friends and why not: to sleep. If you cut down on all of those things (yes, including the sleeping) there could very well be opportunities. It’s often a question of priorities. I’ve made spare time development a priority in my life.
Any company that uses curl or libcurl – and they are plenty – could benefit from buying support from us instead of wasting their own time and resources. We at wolfSSL are probably much better at curl already and we can find and fix the issues much faster, which ends up cheaper and better long-term.
The top photo is taken by Anja Stenberg, my wife. It’s me in a local forest, summer 2020.
About four years ago I announced that curl was 100% compliant with the CII Best Practices criteria. curl was one of the first projects on that train to reach a 100% – primarily of course because we were early joiners and participants of the Best Practices project.
The point of that was just to highlight and underscore that we do everything we can in the curl project to act as a responsible open source project and citizen of the larger ecosystem. You should be able to trust curl, in every aspect.
Going above and beyond basic
Subsequently, the best practices project added higher levels of compliance. Basically adding a bunch of requirements so if you want to grade yourself at silver or even gold level there are a whole series of additional requirements to meet. At the time those were added, I felt they were asking for quite a lot of specifics that we didn’t provide in the curl project and with a bit of a sigh I felt I had to accept the fact that we would remain on “just” 100% compliance and only reaching a part of the way toward Silver and Gold. A little disheartened of course because I always want curl to be in the top.
So maybe Silver?
I had left the awareness of that entry listing in a dusty corner of my brain and hadn’t considered it much lately, when I noticed the other day that it was announced that the Linux kernel project reached gold level best practice.
That’s a project with around 50 times more developers and commits than curl for an average release (and even a greater multiplier for amount of code) so I’m not suggesting the two projects are comparable in any sense. But it made me remember our entry on CII Best Practices web site.
I came back, updated a few fields that seemed to not be entirely correct anymore and all of a sudden curl quite unexpectedly had a 100% compliance at Silver level!
If Silver was achievable, what’s actually left for gold?
Sure enough, soon there were only a few remaining criteria left “unmet” and after some focused efforts in the project, we had created the final set of documents with information that were previously missing. When we now finally could fill in links to those docs in the final few entries, project curl found itself also scoring a 100% at gold level.
Best Practices: Gold Level
What does it mean for us? What does it mean for you, our users?
For us, it is a double-check and verification that we’re doing the right things and that we are providing the right information in the project and we haven’t forgotten anything major. We already knew that we were doing everything open source in a pretty good way, but getting a bunch of criteria that insisted on a number of things also made us go the extra way and really provide information for everything in written form. Some of what previously really only was implied, discussed in IRC or read between lines in various pull requests.
I’m proud to lead the curl project and I’m proud of all our maintainers and contributors.
For users, having curl reach gold level makes it visible that we’re that kind of open source project. We’re part of this top clique of projects. We care about every little open source detail and this should instill trust and confidence in our users. You can trust curl. We’re a golden open source project. We’re with you all the way.
The final criteria we checked off
Which was the last criteria of them all for curl to fulfill to reach gold?
The project MUST document its code review requirements, including how code review is conducted, what must be checked, and what is required to be acceptable (link)
This criteria is now fulfilled by the brand new document CODE_REVIEW.md.
We’re working on the next release. We always do. Stop the slacking now and get back to work!
I’m honored to – once again – be a recipient of this award Google hands out to open source contributors, annually. I was previously awarded this in 2011.
I don’t get a lot of awards. Getting this token of appreciation feels awesome and I’m humbled and grateful I was not only nominated but also actually selected as recipient. Thank you, Google!
Nine years ago I got 350 USD credits in the Google store and I got my family a set of jackets using them – my kids have grown significantly since then, so to them those black beauties are now just a distant memory, but I still actually wear mine from time to time!
This time, the reward comes with a 250 USD “payout” (that’s the gift mentioned in the mail above), as a real money transfer that can be spent on other things than just Google merchandise!
I’ve decided to accept the reward and the money and I intend to spend it on beer and curl stickers for my friends and fans. As I prefer to view it:
Du har genom ett omfattande arbete vaskats fram som en värdig mottagare av årets Polhemspris. Det har skett genom en nomineringskommitté och slutligen ett råd med bred sammansättning. Priset delas ut av Kungen den 19 oktober på Tekniska muséet.
My attempt of an English translation:
You have been selected as a worthy recipient of this year's Polhem prize through extensive work.It has been through a nomination committee and finally a council of broad composition.The prize is awarded by the King on October 19th at the Technical Museum.
A gold medal
At the award ceremony in October 2017 I received the gold medal at the most fancy ceremony I could ever wish for, where I was given the most prestigious award I couldn’t have imagined myself even being qualified for, handed over by no other than the Swedish King.
An entire evening with me in focus, where I was the final grand finale act and where my life’s work was the primary reason for all those people being dressed up in fancy clothes!
Things have settled down since. The gold medal has started to get a little dust on it where it lies here next to me on my work desk. I still glance at it every once in a while. It still feels surreal. It’s a fricking medal in pure gold with my name on it!
I almost forget the money part of the prize. I got a lot of money as well, but in retrospect it is really the honors, that evening and the gold medal that stick best in my memory. Money is just… well, money.
So did the award and prize make my life any different? Yes sure, a little, and I’ll tell you how.
What’s all that time spent on?
My closest surrounding of friends and family got a better understanding of what I’ve actually been doing all these long hours, all these years and more than one phrase in the style of “oh, so you actually did something useful?!” have been uttered.
Certainly I’ve tried to explain to them before, but nothing works as good as a gold medal from an award committee to say that what I do is actually appreciated “out there” and it has made a serious impact on the world.
I think I’m considered a little less weird now when I keep spending night hours in front of my computer when the house is otherwise dark and silent. Well, maybe still weird, but at least my weirdness has proven to result in something useful for mankind and that’s more than many other sorts of weird do… We all have hobbies.
What is curl?
Family and friends have gotten a rudimentary level of understanding of what curl is and what it does. I’m not suggesting they fully grasp it or know what an “internet protocol” is now, but at least a lot of people understand that it works with “internet transfers”. It’s not like people were totally uninterested before, but when I was given this prize – by a jury of engineers no less – that says this is a significant invention and accomplishment with a value that “can not be overestimated“, it made them more interested. The little video that was produced helped:
Some mysteries remain
People in general still have a hard time to grasp the reach of the project, how much time I’ve spent so far on it, how I can find motivation to keep up the work and not the least how this is all given away for free for everyone.
The simple fact that these are all questions that I’ve been asked I think is a small reward in itself. I think the fact that I was awarded this prize for my work on Open Source is awesome and I feel honored to be a person who introduces this way of thinking to some of the people who previously would think that you have to sell proprietary things or earn a lot of money for your products in order to impact and change society as a whole.
Not widely known
The Polhem prize is not widely known in Sweden among the general populace and thus neither is the fact that I won it. Only a very special subset of people know about this. Of course it is even less known outside of Sweden and in fact the information about the prize given in English is very sparse.
Next year’s winner
The other day I received my invitation to participate in this year’s award ceremony on November 14. Of course I’ll happily accept that and I will be there and celebrate the winner this year!
The curl project
How did the prize affect the project itself, the project that I was awarded for having cared for this long?
It hasn’t affected it much at all (as far as I can tell). The project has moved along like before and we’ve worked on fixing bugs and added features and cool things over time after my award just as we did before it. That’s how it has felt like. Business as usual.
If anything, I think I might have gotten some renewed energy and interest in the project and the commit author statistics actually show that my commit frequency has gone up since around the time I got the award. Our gitstats show that I’ve done more than half of the commits every single month the last year, most of this time even more than 70% of the commits.
I may have served twenty years here, but I’m not done yet!
This award tradition that was started in 2007 was put on a hiatus after 2010 (I believe) and there has not been any awards handed out since, and we have not properly shown our appreciation for the free software heroes of the Nordic region ever since.
— “Have you ever detected anyone trying to add a backdoor to curl?”
— “Have you ever been pressured by an organization or a person to add suspicious code to curl that you wouldn’t otherwise accept?”
— “If a crime syndicate would kidnap your family to force you to comply, what backdoor would you be be able to insert into curl that is the least likely to get detected?” (The less grim version of this question would instead offer huge amounts of money.)
I’ve been asked these questions and variations of them when I’ve stood up in front of audiences around the world and talked about curl and how it is one of the most widely used software components in the world, counting way over three billion instances.
Back door (noun)
— a feature or defect of a computer system that allows surreptitious unauthorized access to data.
So how is it?
No. I’ve never seen a deliberate attempt to add a flaw, a vulnerability or a backdoor into curl. I’ve seen bad patches and I’ve seen patches that brought bugs that years later were reported as security problems, but I did not spot any deliberate attempt to do bad in any of them. But if done with skills, certainly I wouldn’t have noticed them being deliberate?
If I had cooperated in adding a backdoor or been threatened to, then I wouldn’t tell you anyway and I’d thus say no to questions about it.
How to be sure
There is only one way to be sure: review the code you download and intend to use. Or get it from a trusted source that did the review for you.
If you have a version you trust, you really only have to review the changes done since then.
Possibly there’s some degree of safety in numbers, and as thousands of applications and systems use curl and libcurl and at least some of them do reviews and extensive testing, one of those could discover mischievous activities if there are any and report them publicly.
Infected machines or owned users
The servers that host the curl releases could be targeted by attackers and the tarballs for download could be replaced by something that carries evil code. There’s no such thing as a fail-safe machine, especially not if someone really wants to and tries to target us. The safeguard there is the GPG signature with which I sign all official releases. No malicious user can (re-)produce them. They have to be made by me (since I package the curl releases). That comes back to trusting me again. There’s of course no safe-guard against me being forced to signed evil code with a knife to my throat…
If one of the curl project members with git push rights would get her account hacked and her SSH key password brute-forced, a very skilled hacker could possibly sneak in something, short-term. Although my hopes are that as we review and comment each others’ code to a very high degree, that would be really hard. And the hacked person herself would most likely react.
Downloading from somewhere
I think the highest risk scenario is when users download pre-built curl or libcurl binaries from various places on the internet that isn’t the official curl web site. How can you know for sure what you’re getting then, as you couldn’t review the code or changes done. You just put your trust in a remote person or organization to do what’s right for you.
Trusting other organizations can be totally fine, as when you download using Linux distro package management systems etc as then you can expect a certain level of checks and vouching have happened and there will be digital signatures and more involved to minimize the risk of external malicious interference.
Pledging there’s no backdoor
Some people argue that projects could or should pledge for every release that there’s no deliberate backdoor planted so that if the day comes in the future when a three-letter secret organization forces us to insert a backdoor, the lack of such a pledge for the subsequent release would function as an alarm signal to people that something is wrong.
That takes us back to trusting a single person again. A truly evil adversary can of course force such a pledge to be uttered no matter what, even if that then probably is more mafia level evilness and not mere three-letter organization shadiness anymore.
I would be a bit stressed out to have to do that pledge every single release as if I ever forgot or messed it up, it should lead to a lot of people getting up in arms and how would such a mistake be fixed? It’s little too irrevocable for me. And we do quite frequent releases so the risk for mistakes is not insignificant.
Also, if I would pledge that, is that then a promise regarding all my code only, or is that meant to be a pledge for the entire code base as done by all committers? It doesn’t scale very well…
Additionally, I’m a Swede living in Sweden. The American organizations cannot legally force me to backdoor anything, and the Swedish versions of those secret organizations don’t have the legal rights to do so either (caveat: I’m not a lawyer). So, the real threat is not by legal means.
What backdoor would be likely?
It would be very hard to add code, unnoticed, that sends off data to somewhere else. Too much code that would be too obvious.
A backdoor similarly couldn’t really be made to split off data from the transfer pipe and store it locally for other systems to read, as that too is probably too much code that is too different than the current code and would be detected instantly.
No, I’m convinced the most likely backdoor code in curl is a deliberate but hard-to-detect security vulnerability that let’s the attacker exploit the program using libcurl/curl by some sort of specific usage pattern. So when triggered it can trick the program to send off memory contents or perhaps overwrite the local stack or the heap. Quite possibly only one step out of several steps necessary for a successful attack, much like how a single-byte-overwrite can lead to root access.
Any past security problems on purpose?
We’ve had almost 70 security vulnerabilities reported through the project’s almost twenty years of existence. Since most of them were triggered by mistakes in code I wrote myself, I can be certain that none of those problems were introduced on purpose. I can’t completely rule out that someone else’s patch modified curl along the way and then by extension maybe made a vulnerability worse or easier to trigger, could have been made on purpose. None of the security problems that were introduced by others have shown any sign of “deliberateness”. (Or were written cleverly enough to not make me see that!)
Maybe backdoors have been planted that we just haven’t discovered yet?
At the time of each commit, check how many unique authors that had a change committed within the previous 120, 90, 60, 30 and 7 days. Run the script on the curl git repository and then plot a graph of the data, ranging from 2010 until today. This is just under 10,000 commits.
(click for the full resolution version)
git-authors-active.pl is the little stand-alone script I wrote and used for this – should work fine for any git repository. I then made the graph from that using libreoffice.
bus factor: the minimum number of team members that have to suddenly disappear from a project before the project stalls due to lack of knowledgeable or competent personnel.
Projects should strive to survive
If a project is worth using and deploying today and if it is a project worth sending patches to right now, it is also a project that should position itself to survive a loss of key individuals. However unlikely or unfortunate such an event would be.
This number is really impossible to figure out without tools and tools really cannot take “general knowledge” into account, or “this person answers a lot of email on the list”, or this person has 48k in reputation on stack overflow already for responding to questions about the project.
The bus factor as evaluated by a tool pretty much has to be about amount of code, size of code or number of code changes, which may or may not be a good indicator of who knows what about the code. Those who author and commit changes probably have a good idea but a real problem is that you can’t reverse that view and say that just because you didn’t commit or change something, you don’t know. Do you know more about the code if you did many commits? Do you know more about the code if you changed more lines of code?
We can’t prove or assume lack of knowledge or interest by an absence of commits, edits or changes. And yet we can’t calculate bus factor if there’s no tool or way to calculate it.
A look at curl
curl is soon 20 years old and boasts 22k something commits. I’m the author of about 57% of them, and the second-most committer (who’s not involved anymore) has about 12%. That makes two committers having done 15.3k commits out of the 22k. If we for simplicity calculate bus factor based on commit numbers, we’d need 8580 commits from others and I would stop completely, to reach bus factor >2 (when the 2 top committers have less than 50% of the commits), which at the current commit rate equals in about 5 years. And it would take about 3 years to just push the factor above 1. So even when new people joins the project, they have a really hard time to significantly change the bus factor…
The image above shows the relative share of commits done in the curl project’s git source code repository (as a share of the total amount) by the top 4 commiters from January 1 2010 to July 5 2017 (click for higher resolution). The top dotted line shows the combined share of all four (at 82% right now) and the dark blue line is my share. You can see how my commit share has shrunk from 72% down to 57% over these last 7.5 years. If this trend holds, I’ll have less than 50% of the total commits done in curl in 3-4 years.
At the same time, the thicker light blue line that climbs up into the right is the total number of authors in the git repository, which recently surpassed 500 as you can see. (The line uses the right Y-axes)
We’re approaching 1600 individually named contributors thanked in the project and every release we do (we ship one every 8 weeks) has around 40 contributors, out of which typically around half are newcomers. The long tail is very long and the amount of drive-by just-once contributors is high. Also note how the number 1600 is way higher than the 500 something that has authored commits. Lots of people contribute in other ways.
When we ask our users “why don’t you contribute (more) to the project?” (which we do annually) what do they answer? They say its because 1) everything works, 2) I don’t have time 3) things get fixed fast enough 4) I don’t know the programming language 5) I don’t have the energy.
First as the 6th answer (at 5% 2017) comes “other” where some people actually say they wouldn’t know where to start and so on.
All of this taken together: there are no visible signs of us suffering from having a low bus factor. Lots of signs that people can do things when they want to if I don’t do it. Lots of signs that the code and concepts are understood.
Lots of signs that a low bus factor is not a big problem here. Or perhaps rather that the bus factor isn’t really as low as any tool would calculate it.
What if I…
Do I know who would pick up the project and move on if I die today? No. We’re a 100% volunteer-driven project. We create one of the world’s most widely used software components (easily more than three billion instances and counting) but we don’t know who’ll be around tomorrow to work on it. I can’t know because that’s not how the project works.
Given the extremely wide use of our stuff, given the huge contributor base, given the vast amounts of documentation and tests I think it’ll work out.
Just because you have a large bus factor doesn’t necessarily make the project a better place to ask questions. We’ve seen projects in the past where N persons involved are all from the same company and when that company removes its support for that project those people all go away. High bus factor, no people to ask.
Finally, let me just add that I would of course love to have many more committers and contributors in the curl project, and I think we would be an even better project if we did. But that’s a separate issue.