Tag Archives: Open Source

bye bye hosting c-ares web

At some point during 2003, my friend Bjørn Reese (from Dancer) and I were discussing back and forth and planning to maybe create our own asynchronous DNS/name resolver library. We felt that the synchronous APIs provided by gethostname() and getaddrinfo() were too limiting in for example curl. We could really use something that would not block the caller.

While thinking about this and researching what was already out there, I found the ares library written by Greg Hudson. It was an effort that was almost exactly what we had been looking for. I decided I would not make a new library but rather join the ares project and help polish that further to perfect it – for curl and for whoever else who wants such functionality.

It was soon made clear to me that the original author of this library did not want the patches I deemed were necessary, including changes to make it more portable to Windows and beyond. I felt I had no choice but to fork the project and instead I created c-ares. It would show its roots but not be the same. The c could be for curl, but it also made it into an English word like “cares” which was enough for me.

The first c-ares release I did was called version 1.0.0, published in February 2004.

The ares project did not have a website, but I am of the opinion that a proper open source project needs one, to provide downloads and not the least its documentation etc. A home. I created a basic c-ares website and since then I have hosted it on my server on the behalf of the c-ares project.

The was available as c-ares.haxx.se for many years but was recently moved over to c-ares.org. It has never been a traffic magnet so quite easy to manage.

In the backseat

In recent years, I have not participated much in the c-ares development. I have had my hands full with curl while the c-ares project has been in a pretty good shape and has been cared for in an excellent manner by Brad House and others.

I have mostly just done the occasional website admin stuff and releases.

Transition

Starting now, the c-ares website is no longer hosted by me. A twenty years streak is over and the website is now instead hosted on GitHub. I own the domain name and I run the DNS for it, but that is all.

The plan is that Brad is also going to take over the release duty. Brad is awesome.

My BDFL guiding principles

The thing about me being a BDFL for curl is that it has the D in there. I have the means and ability to push for or veto just about anything I like or don’t like in the project, should I decide to. In my public presentations about curl I emphasize that I truly try to be a benevolent dictator, but then I also presume quite a few dictators would say and believe so whether that is true or not to the outside world.

I think we can say with some certainty that dictatorships are not the ideal way of running a country, and it might also go for Open Source projects.

In curl we remain using this model because it works and changing it to something else is a large and complicated process which we have not wanted to get to because we have not had any strong reason. There is anecdotal evidence that this way of running the project works somewhat.

A significant difference between being a dictator for an Open Source project compared to a country is however the ease with which every citizen could just leave one day and start a new clone country, with all the same citizens and the same layout, just without the dictator. I’m easily replaceable and made into past tense if I would abuse my role.

So there is this inherent force to push me to do good for the project even if I am a “dictator”.

As a BDFL of curl…

This is what I think the curl project should focus on. What I want the curl project to be. These are the ten commandments I think should remain our guiding principles. I think this is what makes curl. My benevolent guidelines.

My ten guiding principles for curl

  1. Be open and friendly
  2. Ship rock-solid products
  3. Be a leader in Open Source
  4. Maintain a security first focus
  5. Provide top-notch documentation
  6. Remain independent
  7. Respond timely
  8. Keep up with the world
  9. Stay on the bleeding edge
  10. Respect feedback

This list of ten areas are perhaps things every open source projects want to focus and excel in, but I think that is irrelevant here. These are then ten key and core focus points for me when I work on curl. We should be best-in-class in each and every one of them.

Let me elaborate

1. Be open and friendly to all contributors, new and old

I think open source projects have a lot to gain by making efforts in being friendly and approachable. We were all newcomers into a project once. We gain more contributors, better, by remaining a friendly and open project.

This does not mean that we should tolerate abuse or trolling.

I try to lead this by example. I do not always succeed.

2. Ship rock-solid products for the universe to depend upon

Reviews, tests, analyzers, fuzzers, audits, bug bounty programs etc are means to make sure the code runs smoothly everywhere. Studying protocol specs and inter-operating with servers and other clients on the Internet ensure that the products we make work as expect by billions of end users. On this planet and beyond. If our products cannot carry the world on their shoulders, we fail.

Rock-solid also means we are reliable over time – we do not break users’ scripts and applications. We maintain ABI/API compatibility. The command line options the curl tool introduces are supported until the end of time.

3. Be a leader in Open Source, follow every best practice

As true believers in the powers of Open Source we lead by example. We are here to show that you not only can do all development and everything in the open and using open source practices, but that it also makes the project thrive and deliver state of the art outcomes.

4. Always keep users secure, maintain a security first focus

Provide features and functionality with user and protocol security in focus. Address security concerns and reports without delays. Document every past mistake in thorough detail. Help users do secure and safe Internet transfers – by default.

5. Provide industry-leading quality documentation

A key to successful usage of our products, to give users the means and ability to use our project fully, we need to document how it works and how to use it. Everything needs to be documented with clarity and detail enough so that users understand and are empowered.

No comparable software project, open or proprietary, can compete with the quality and amount of documentation we provide.

6. Roam free, independent from all companies and organizations

curl shall forever remain independent. curl is not part of any umbrella organization, it is not owned or controlled by any company. It makes us entirely independent and free to do what we think is best for the community, for our users and for Internet transfers in general. Its license shall remain set and it ensures that curl remains free. Copyright holders are individuals, there is no assignment or licensing of copyrights involved.

7. Respond timely on issues and questions

We shall strive to respond to issues, reports and questions sent to the project within a reasonable time. To show that we care and to help users solve their problems. We want users to solve their problems sooner rather than later.

It does not mean that we always can fix the problems or give a good answer immediately. Sometimes we just have to say that we can’t fix it for now.

8. Remain the internet transfer choice, keep up with the world

In the curl project we should keep up with protocol development, updates and changes. The way we do Internet transfers changes over time and curl needs to keep up to remain relevant.

We also need to write our protocol implementations sensibly, knowing that we are being watched and our way of doing things are often copied, referred to and relied upon by other Internet clients.

This also implies that we are never done and that we can always improve. In every aspect of the project.

9. Offer bleeding edge protocol support to aid early adopters

When new protocols or ways to do protocols are introduced to the world, curl can play a great role in providing and offering early support of such protocols. This has through the years helped countless of other implementers or even protocol authors of these protocols and with this, we help improving the world around us. It also helps us get early feedback on our implementation and thus ship better code earlier.

10. Listen to and respect community feedback

I might be a dictator, but this dictatorship would not work if I and the rest of the curl maintainers did not listen to what the users and the greater curl community have to say. We need to stay agile and have a sense of what people want our products to do and to not do. Now and in the future.

An open source project can always get forked the second it makes a bad turn and somehow gives up on, sells out or betrays its users. Me being a dictator does not protect us from that. We need to stay responsive, listening and caring. We are here for our users.

Flag?

If curl was an evil empire, I figure we would sport this flag:

3,000 contributors

Thank you everyone who has helped out in making curl into what it is today.

We make an effort to note the names of and say thanks to every single individual who ever reported bugs, fixed problems, ran tests, wrote code, polished the website, spell-fixed documentation, assisted debug sessions, helped interpret protocol standards, reported security problems or co-authored code etc.

In 2005 I decided to go back through the project history and make sure all names that had been involved up to that point in time would also be mentioned in the THANKS file. This is the reason for the visible bump in the graph.

Since then, we add the names of all the helpers. We say thanks and give credits in commit messages and we have scripts to help us collect them and mention them as contributors. We probably miss occasional ones but I hope and believe that most of all the awesome people that ever helped us are recorded accordingly and given credit.

We are nothing without out dear contributors.

Today, this list of people we are thankful for, reached 3,000 entries when Alex Klyubin’s pull request was merged. (The list of names on the website is synced every once in a while so it actually typically shows slightly fewer people than we have logged in git.)

The team behind curl. 3,000 persons over almost 27 years.

We reached 2,000 contributors less than four years ago, in October 2019, so we have added about 250 new names per year to this list the last four years.

We reached 1,000 contributors in March 2013, meaning from then it took about 6.5 years to get the next 1,000. Roughly 150 new names per year.

The first 1,000 took over 16 years to reach.

You too can see it quite clearly in the graph above as well: the rate of which we get new contributors to help out in the project is increasing.

curl has existed for 9338 days. That equals one new contributor every 3.1 days for over 25 years, on average.

Non-code or code alike

It is oftentimes said that Open Source projects in general have a hard time to properly recognize and appreciate non-code contributions. In the curl project, we try hard to not differentiate between help and help.

If a person helps out to take the curl project further, even if just by a little, we say thank you very much and add their name to the list. We need and are grateful for code and non-code contributions alike. One of the most important parts of the curl project is the documentation, and that is clearly not code.

About the contributors

We cannot say much about the contributors because in an effort to lower the bars and reduce friction, we also do not ask them about details. We only know the name they provide the help under, which could be a pseudonym and in some cases clearly are nicknames. We do not know where in the world they originate from or which company they work for, we can’t tell their gender, skin color, religion or other human “properties”. And we don’t care much about those specifics. Our job is to bring curl forward.

This said: there is a risk that we have added the same contributors twice or more, if they have helped out using several different names. That is just not something we can detect or avoid. Unless the contributor themself informs us.

There is also a risk that some of the persons that contributed to curl are not nice people or that they work for reprehensible organizations. We focus on the quality of their submissions and if they hide who they are, we will never know if they actually are animal-hurting nazis hiding behind pseudonyms.

(Lack of) Diversity

As far as I can tell, we have a lousy contributor diversity. I am pretty sure the majority of all help come from old white middle-class western men. Like myself.

I cannot fully know this for sure because I only actually know a small fraction of all the contributors, but out of the ones I have met this is true and I believe I have met or at least communicated with the ones who have done the vast majority of all the changes.

I would much rather see us have many more contributors from other parts of the world, female, and with non-christian backgrounds, but I cannot control who comes to us. I can only do my best to take care of all and appreciate every contribution without discrimination.

Commit authors

This day when we reach 3,000 contributors, we also count 1,201 commit authors. Persons with their names as authors of at least one commit in the curl source repository. 40% of the contributors are committers. Almost 65% of the committers only ever committed once.

3,000 visualized

The top image of this blog post is a photo from FOSDEM a few years back when I did a presentation in front of a packed room with some 1,400 attendees. The largest room at FOSDEM is said to fit 1415 persons. So not even two such giant rooms would be enough to hold all the curl contributors if it would have been possible to get them all in one place…

You too?

You too can be a curl contributor. We are friendly. It is not hard. There is lots to do. Your contributions can end up getting used by literally billions of humans.

Google Open Source Peer Bonus award 2023

I am honored to yet again receive a peer bonus award from Google. This is a Google program for which persons like me can be nominated by Googlers and as a result receive grants.

I previously received such an award in 2020.

Update

A few people noticed and have commented on the fact that this letter is signed by Chris DiBona and dated April 19th 2023, while sources say he was let go from Google back in January. Which means one or two of those things are wrong.

trurl manipulates URLs

trurl is a tool in a similar spirit of tr but for URLs. Here, tr stands for translate or transpose.

trurl is a small command line tool that parses and manipulates URLs, designed to help shell script authors everywhere.

URLs are tricky to parse and there are numerous security problems in software because of this. trurl wants to help soften this problem by taking away the need for script and command line authors everywhere to re-invent the wheel over and over.

trurl uses libcurl’s URL parser and will thus parse and understand URLs exactly the same as curl the command line tool does – making it the perfect companion tool.

I created trurl on March 31, 2023.

Some command line examples

Given just a URL (even without scheme), it will parse it and output a normalized version:

$ trurl ex%61mple.com/
http://example.com/

The above command will guess on a http:// scheme when none was provided. The guess has basic heuristics, like for example FTP server host names often starts with ftp:

$ trurl ftp.ex%61mple.com/
ftp://ftp.example.com/

A user can output selected components of a provided URL. Like if you only want to extract the path or the query components from it.:

$ trurl https://curl.se/?search=foobar --get '{path}'
/

Or both (with extra text intermixed):

$ trurl https://curl.se/?search=foobar --get 'p: {path} q: {query}'
p: / q: search=foobar

A user can create a URL by providing the different components one by one and trurl outputs the URL:

$ trurl --set scheme=https --set host=fool.wrong
https://fool.wrong/

Reset a specific previously populated component by setting it to nothing. Like if you want to clear the user component:

$ trurl https://daniel@curl.se/--set user=
https://curl.se/

trurl tells you the full new URL when the first URL is redirected to a second relative URL:

$ trurl https://curl.se/we/are/here.html --redirect "../next.html"
https://curl.se/we/next.html

trurl provides easy-to-use options for adding new segments to a URL’s path and query components. Not always easily done in shell scripts:

$ trurl https://curl.se/we/are --append path=index.html
https://curl.se/we/are/index.html
$ trurl https://curl.se?info=yes --append query=user=loggedin
https://curl.se/?info=yes&user=loggedin

trurl can work on a single URL or any amount of URLs passed on to it. The modifications and extractions are then performed on them all, one by one.

$ trurl https://curl.se localhost example.com 
https://curl.se/
http://localhost/
http://example.com/

trurl can read URLs to work on off a file or from stdin, and works on them in a streaming fashion suitable for filters etc.

$ cat many-urls.yxy | trurl --url-file -
...

More or different

trurl was born just a few days ago, this is what we have made it do so far. There is a high probability that it will change further going forward before it settles on exactly how things ideally should work.

It also means that we are extra open for and welcoming to feedback, ideas and pull-requests. With some luck, this could become a new everyday tool for all of us.

Tell us on GitHub!

Fossified pilot episode

Henrik, Johan, Magnus and I are all Swedish “FOSS people” and friends since many years back.

We like open source. We work with Open Source. We have contributed to Open Source since a long time back. We also have slightly different backgrounds and areas of expertise so we don’t all just totally overlap.

We decided we wanted to try putting together a podcast and talk about all things FOSS: from lightweight news down to more deep dives and interviews and discussions with peeps who know more. With our takes and personal views applied of course.

We named it Fossified. We have recorded a first pilot where we test the concept a little, but more importantly this is just the beginning and we have created a GitHub repository where we collect program ideas and proposals.

We certainly need and appreciate your help. With ideas for topics and guests. Perhaps even with a logo or why not an intro song?

The Pilot

We have recorded our first episode. You can find it on our very fancy website fossified.com.

Uncurled – the presentation

Uncurled – everything I know and learned about running and maintaining Open Source projects for three decades.

This is me, doing a live English-speaking presentation/webinar on these topics that I cover in my book: Uncurled.

Recording

Date: Tuesday August 23, 2022

Time: 10: 00 UTC (12:00 CEST)

Where: over zoom [Sign up]

The plan is to record this session and make it available after the fact on YouTube. This post will be updated with a link to that once it exists.

Agenda

Here’s the outlook on what I hope to be able to cover in a 40 minutes talk.

This will be followed by a Q&A-session with me answering any questions you might have. Feel most welcome and encouraged to submit your questions ahead of time if you already have some! (comment here, email me, comment or DM on Twitter, send a carrier pigeon, anything!)

I have not done this presentation before. I know the subject very intimately so I have no worries about that. The timing of the thing is what is going to be my bigger challenge I think. I aim for no more than 40 minutes of me blabbing.

I don’t know who uses my code

When I (in spite of knowing better) talk to ordinary people about what I do for a living and the project I work on, one of the details about it that people have the hardest time to comprehend, is the fact that I really and truly don’t know a lot about who uses my code. (Or where. Or what particular features they use.)

I work on curl full-time and we ship releases frequently. Users download the curl source code from us, build curl and put it to use. Most of “my” users never tell me or anyone else in the curl project that they use curl or libcurl. This is of course perfectly fine and I probably could not even handle the flood if every user would tell me.

This not-knowing is a most common situation for Open Source authors and projects. It is not unique for me.

The not knowing your users is otherwise unusual in a world of products and software, and quite frankly, sometimes it is an obstacle for us as well since we lack a good way to communicate with users about plans, changes or ideas. It also makes it really hard to estimate our own success and the always-recurring question: how many users do you have?

To be fair: I do know quite a lot of users as well. But I don’t know how representative they are, nor how big fraction of the totals they consist of etc.

How to find out that someone uses your code

I wrote a section in Uncurled about How to find out that someone uses your code, and these are three main ways I learn about users of my code:

  1. A user of my project runs into an issue and asks for help in a project forum – without hiding where they come from or what product they make.
  2. While vanity-searching I find my project’s Open Source license mentioned for or bundled with a product or tool or device.
  3. A user emails me asking funny questions about a product I never heard of, and then I realize they found my email because that product uses my code and the license has my email in it.

curl is REUSE compliant

The REUSE project is an effort to make Open Source projects provide copyright and license information (for all files) in a machine readable way.

When a project is fully REUSE compliant, you can easily figure out the copyright and license situation for every single file it holds.

The easiest way to accomplish this is to make sure that all files have the correct header with the appropriate copyright info and SPDX-License-Identifier specified, but it also has ways to provide that meta data in adjacent files – for files where prepending that info isn’t sensible.

What we needed to do

We were already in a fairly good place before this push. We have a script that verifies the presence of copyright header in files (including checking the end year vs the latest git commit), with a list of files that were deliberately skipped.

The biggest things we needed to do were

  1. Add the SPDX identifier all over
  2. Make sure that the skipped files also have copyright and licensing info provided
  3. Add a CI job that verifies that we remain compliant

I also ended up adjusting our own copyright scan script to use the REUSE metadata files instead of its own ignore filters which also made it even easier for us to make sure we are and remain compatible — that every single files in the curl git repository has a known and documented license and copyright situation.

As a bonus, the cleanup work helped us detect an example file that stood out which we got relicensed and we removed two older files that had their own unique licenses (without any good reason).

There are 3518 files in the curl git repository this exact moment.

Compliant!

Starting mid-June 2022, curl is 100% REUSE compliant. curl 7.84.0 will be the first release done in this status.

Motivation

I think it is a good idea to have perfect control over the copyright and license situation for every single file, and to make sure that the situation is documented enough and to a level that allows anyone and everyone to check it out and learn how things lie. No surprises.

Companies have obviously figured out this info before to a degree that they have been satisfied with since curl is widely used even commercial since a long time. But I believe that by providing the information in an even easier and more descriptive way makes things even better. For existing and future users.

I also think that the low threshold for us to reach this compliance was a factor. We were almost there already. We just need to polish up some small details and I think it made it worth it.

This cleanup also makes sure we have perfect control and knowledge of the license situation, now and going forward. I think this can be expected from a project aiming for gold standard.

The curl SPDX license identifier

Keen readers will notice that curl has its own license identifier. It is called the curl license. Not MIT, X or a BSD variation. curl.

The reason for this is good old stupidity. In January 2001 we adopted the MIT license for use in the project because we believed it better matches what we want compared to the previous license situation. We started out with a dual license situation together with the MPL license we used previously, but the MPL part was removed completely in October 2002.

For reasons that have since been forgotten, we thought it was a good idea to edit the license text. To trim it a little. Since August 2002, the license text that started out as an MIT/X license is no longer a perfect copy. It is a derivative . Very similar and almost identical. But it’s not the same.

When the SPDX project created their set of identifiers for well-used licenses out in the FOSS world they decided that the curl license is different enough from the MIT/X license to treat it separately and give it its own identifier. I know of no other project than curl that uses this particular edited version of the MIT license.

In hindsight, I believe the editing of the license text back in 2002 was dumb. I regret it, but I will not change it again. I think we can live with this situation pretty good.

Credits

Most of the heavy lifting necessary to make curl compliant was done by Max Mehl.

Meeting the Cyber Safety Review Board

Three Open Source hackers were invited to this meeting with the CSRB and I was one of them.

The board with this name is part of CISA, a US government effort that received a presidential order to work on “Improving the Nation’s Cybersecurity“. Where “the Nation” here is the US.

I’m not in the US and I’m not a US citizen but I felt I should help out when asked and I was able to.

On April 21 2022, I joined the video meeting together with an OpenSSL and a Tomcat contributor and several members of the board. (I am not naming any names of participants in this post because I have not asked for permission nor do I think the names are important here.)

For about an hour we talked to the board how we develop Open Source, how we take on security problems and how we work on making sure we do things as securely as we can. It was striking how similarly the three of us looked at the issues and how we work in our project, despite our projects all being different and having our own specifics.

As projects, we believe we have pretty well-established and working procedures for getting problems reported and we think we fix the issues fairly swiftly. We ship fixes, advisories and updates not long after the issues get known. The CVE system where we register and publish security vulnerabilities in a global registry is working adequately. (I’m not saying things are perfect.)

The main problem

It was pretty clear to me that we agreed that the biggest problem in the Open Source supply chain today is the slow uptake in patching vulnerable software.

Lots of vendors and products have not been made or have any plans for how to handle upgrades when vulnerabilities are found. Many of those that do act, do that with such glacier like speeds that users of such products remain exposed for attackers for a long period after the flaws are already fixed and have become known.

My own analysis of this is that such vendors of course do this because its the cheapest way. Plain capitalistic reasons.

Addressing this is hard

If we had any easy fixes for this, we would already have them in progress. We were also asked by the board what kind of systems that we would not like to see.

Will Software Bill Of Materials (SBOM) fix this? Maybe it can help, by exposing to the world what software and versions are used in products, but it will certainly depend on how it is used and enforced. If done too heavy-handed, it risks causing overhead and added complications but in the other end it might end up too wishy-washy.

Ended there

This was just an hour of conversation with a few follow-up clarifying emails. I hope that we were able to provide insights into how Open Source is made but I have no illusions of us changing anything in drastic ways.

I felt honored to represent “my kind” and help sharing knowledge of Open Source to areas of the world that might not always get informed about it.