Category Archives: Open Source

Open Source, Free Software, and similar

libcurl is 24 years old

On Monday August 7, 2000 at 14:49 UTC, we announced the release of the first libcurl version ever. Exactly twenty-four years ago today. We called it version 7.1. The simple reason we did a point one release as the first one was that we had shipped a whole range of 7.0 beta versions before that day and I wanted to make it clear to everyone that this was a bump up from those. To avoid confusions.

The last release we did before libcurl was born was called 6.5.2, which we did 139 days before 7.1 was uploaded (in March 2000). The longest release interval in the entire history of curl. Never before or after, have we spent that long time to prepare a single new curl release.

Howdy

I just put the curl 7.1 tarball on the curl site. It’ll take a while more for it to appear on the mirrors and for our friends to send in binary archives or report where there are updated archives for various platforms.

The original announcement. Not many frills there.

Contributors

Most of the work that went into the creation of libcurl was done by me alone. While we already had contributors and lots of people helping out with bug reports and features, this kind of heavy lifting and huge refactor was not really a team effort. I spent the summer splitting the monolithic command line tool into two components: a library and a command line tool that uses the library.

Since the last days of 1999 the curl source code was hosted on Sourceforge using CVS so at least all this was done in a public source code repository with version control.

Why libcurl

I had already years before this time understood the powers of using shared libraries and how they are good ways to offer features and functionality to applications. I got an article published in a technical Amiga magazine in the early 1990s about how to do shared libraries on AmigaOS.

As the command line tool curl started to get somewhere and was already two years old at that point (the curl tool was born on March 20 1998), I imagined that there could be an application or two that could benefit from getting easy access to doing Internet transfers. It was just a gut feeling and a guess. I did not know. I did not research this or anything, I just took the plunge and figured we will eventually realize if I was right or wrong.

With me having written some portable libraries in the past already, curl was already written with the mindset that it could at time point get converted to a library + command line tool. This eased the conversion.

Already on day one libcurl built and ran on numerous operating systems offering the same API.

Name

When I decided to make a library from the core of the curl tool I opened the discussion to see if someone had any ideas for what this library should be called. Naming is hard and I had no particularly bright ideas myself so we quite soon landed on the simplest take possible. Let’s just call it libcurl.

API

How do you design an internet transfer library API?

I had this idea to offer multiple different API “sets” but I figured I needed to start with something basic. It ended up getting called the “easy” interface because it was intended to be straight-forward to use and I decided I could always add something more advanced later. I wanted to provide a fairly low level API and if someone wanted something more high-level, such an API can get built on top of this.

I knew that I wanted the API to be somewhat extensible without having to change the API all the time, and I found inspiration in how the ioctl() and fcntl() and similar functions work. I made curl_easy_setopt() follow that pattern. For good and bad.

I wanted the API to be decently protocol agnostic, since curl could already handle several different ones and I figured it could be appreciated by users who might not always be Internet protocol experts.

Mostly due to luck we managed to ship an API that even though it has its quirks has survived remarkably well.

Using C

There was never any consideration to use another language than C for the library and there was really no decent options to select from either. In theory I suppose C++ could have been used, but I have never been a friend of C++ so I rather not. Not then, not now.

There was never even a discussion about what language to use.

Resilience

The API that gradually was created in that summer of 2000 turned out to be fairly good and has survived and done surprisingly good.

There are 17 API functions still in use today that were introduced in 7.1. In fact, much code written for the API from back then can still be compiled and run using the latest libcurl!

At the time of the first libcurl release there was about 17,000 lines of product code. Today, twenty-four years later, we recently surpassed 171,000 lines.

The API has over the years even proved itself to endure “revolutionary” protocol changes such as HTTP/2 introducing multiplexing multiple transfers over the same connection and HTTP/3, which switches from TCP to UDP. This, mostly because it offers a high enough abstraction level.

Users

The same month we shipped the first libcurl ever, the PHP project built the first libcurl binding. When PHP 4.0.2 was released on August 29 2000, curl was a bundled, blessed and official extension. PHP’s early and unconditional adoption of libcurl helped us get users, get testing, get bugreports and helped us mature.

Soon other users, projects, tools and devices adopted libcurl and built things using it. Gradually, it grew in popularity.

multi interface

In the spring of 2002, we expanded the libcurl API with the “multi” API. Using this API family, libcurl offers an unlimited number of concurrent parallel Internet transfers in the same thread.

In 2006, we later expanded the multi interface to offer the multi_action way, which expands the API to allow use of event-based mechanisms for the event loop, like epoll etc, thus allowing scaling up parallel transfers to the thousands or more without sacrificing performance.

SONAME

Put simply, SONAME is the version number of the binary interface. We bumped it several times in the early years as we adjusted API details that we did not get right from the beginning.

Years later (in 2006), we did the last major SONAME bump and by then libcurl had enough usage to make a lot of people squeal and complain about the agony that bump caused. That event and its reactions was a strong contributing factor to me making the decision: To as far as we possibly can never again break the ABI. So far we have managed to stick to this promise and goal – barring bugs of course.

Every libcurl program since 2006 can be compiled against the latest libcurl. Of course, the underlying protocols (especially TLS) and URLs etc have changed and moved significantly since then which makes most URLs from that time to not work anymore.

Higher level API

I think in many ways my original hopes of others doing higher level APIs to build on top of what libcurl provides has been realized primarily through the means of language bindings.

We have records of at least 65 different language bindings created for libcurl, for almost as may different languages and environments. This expands the reach of libcurl powers and makes it virtually independent of programming language of choice. It also lowers the bar, as many of those languages and bindings might be easier and perhaps less intimidating to use than the C API, and yet they still offer the unparalleled super powers of the libcurl engine under the hood.

Success

I think no matter how you count or view it, libcurl can be considered a success.

This success was possible only because of the people involved. The thousands of contributors who reported bugs, debugged problems, tested patches, ran crazy experiments in production, voiced their opinions, provided ideas and added features. I may be the captain of the ship, but I was never alone and it would never have been even close to possible if I had been.

libcurl has been made to run on at least 103 operating systems and 28 CPU architectures. Everyone can always just opt to use libcurl in whatever thing they need Internet transfers in.

Timing

I attribute a lot of our success on the lucky timing. We offered something that was working and decent at the right time, and we could following along and grow with the general Internet use growth. We had no idea where the Internet evolution was going, its explosion growth and we had no idea that just about every consumer device was going to get Internet connected twenty years layer. Devices that all could run libcurl.

Competition

Internet development and evolution never stop. When libcurl first shipped, people were using libwww and for several years I considered that to be our primary competitor. Today, few people even know it existed.

The time is now and if we do not keep up with development, fix bugs and offer the features that allow the world to do Internet transfers the way every wants, then libcurl could soon also be added to the pile of forgotten software projects.

These days, there are plenty of libcurl alternatives. Perhaps the most significant ones being native (HTTP) libraries included in or for programming languages like Python, Java, Rust or .Net.

I like to say that libcurl’s biggest selling point is that we have at least 18 years of proven API, ABI and general stability. Nothing you may consider to use instead of libcurl comes even close.

Future

Even more devices will get Internet access going forward. Where Internet goes, libcurl can be of use. I do not think we have reached peak libcurl yet.

The curl project needs to keep the product rock solid and never break the API. We need to continue support doing Internet transfers they way “the world” wants them done: the protocols, the versions and the methods deemed the right ones. Keep listening to our users. Keep making it easy for contributors to improve it.

I have never been good at forecasting or even guessing what the future holds. Will there ever be a moment or event in the future that suddenly makes libcurl irrelevant? Maybe. It’s not totally unlikely. More likely however, libcurl will continue to transfer Internet bytes for the world for many years to come. At some point it will be superseded by something else, but I cannot say when or by what.

curl 8.9.1

Some annoying regressions triggered this.

Numbers

the 259th release
0 changes
7 days (total: 9,630)

28 bugfixes (total: 10,559)
43 commits (total: 32,748)
0 new public libcurl function (total: 94)
0 new curl_easy_setopt() option (total: 306)

0 new curl command line option (total: 263)
19 contributors, 5 new (total: 3,211)
10 authors, 1 new (total: 1,288)
1 security fixes (total: 158)

Download the new curl release from curl.se as always.

Release presentation

Security

We decided to do a patch release. Then yesterday we got a security vulnerability reported and so now we have that fixed in here as well.

CVE-2024-7264: ASN.1 date parser overread (severity low) libcurl’s ASN1 parser code has the GTime2str() function, used for parsing an ASN.1 Generalized Time field. If given an syntactically incorrect field, the parser might end up using -1 for the length of the time fraction, leading to a strlen() getting performed on a pointer to a heap buffer area that is not (purposely) null terminated.

Bugfixes

This release is done only because we shipped a few regressions in 8.9.0 we rather let users avoid. Here are some noteworthy fixes from the past week:

  • connection shutdown fix for event based processing – this would cause applications to keep monitoring sockets “too much”, easily leading to busy-loops or worse
  • cmake builds detect libssh and nettle better
  • several libcurl functions now survive NULL pointer inputs better
  • fixed an Apple SDK bug workaround for non-macOS targets
  • the curl tool builds with the manual enabled on OS400
  • works around an IBM (OS400) ASCII run-time library bug
  • speed limiting for 32bit systems had the wrong math
  • allow wolfSSL’s implementation of kyber to be used
  • wolfssl CA store caching fix
  • more defensive and portable socket code for the curl tool’s --ip-tos logic

changelog changes

On the curl website we of course list exactly what changes that go into each and every single release we do. In recent years I have even gone back and made sure we provide this information for every single release ever done. At the moment that means 258 releases, listing over 10,000 bugfixes and almost 1,000 changes. From 1996 until today.

This is literally a wall of changes.

Since we keep doing somewhere between 150 and 250 bugfixes per release and we do a new release very eight weeks, the page with all changes these keeps growing quite fast.

Right now, the HTML of this page is at 1.1 megabytes.

Use case

Most typically I think the use case for users visiting the changelog is to view what changes that were done in one specific curl release. Possibly checking out a few different ones. Very few users actually want tens of thousands of lines of text to scroll through. I believe.

Enter single release changelogs

To make sure that people can read the changes for a single release only, and to reduce the amount of data a user needs to download in order to view those single release changes, I worked on a setup that generates separate individual changelog pages for every release. Easy to bookmark, load fast, contain only information about the specific releases and they make it easy to skip back and forth between past and future releases.

I deployed these changes today and if you go to https://curl.se/ch/ now, you will see the changelog for the most recent release only.

The all changes changelog remains

The changelog showing everything will remain and is still an option to browse. I personally use it at times when I want to control-f and look for a change done in a previous curl version that I cannot remember exactly which. This all-changes page remains only a click away if you rather view that one instead of the single-version thing.

Design

I am not a web developer and I am not web designer. I know just enough HTML and CSS to be able to publish these things, but I do not do fancy and I am fully aware that I am not good at making “nice” or “attractive” designs. I focus on usable and practical.

As per curl website standards these pages are all static content using no JavaScript and only a few small images. Excellent for rendering fast and for caching well in the CDN.

Known vulnerabilities

I did not especially mention this before, but only a few days ago I added direct links from each version header to the page for known vulnerabilities for that specific version and that link of course is now also present in the single version changelog page. Next to the link that goes directly to the release presentation video.

Feedback?

If you find problems or have ideas on how to further improve the curl website, let us know!

curl 8.9.0

Numbers

the 258th release
11 changes
63 days (total: 9,623)

260 bugfixes (total: 10,531)
423 commits (total: 32,704)
0 new public libcurl function (total: 94)
1 new curl_easy_setopt() option (total: 306)

4 new curl command line option (total: 263)
80 contributors, 38 new (total: 3,209)
47 authors, 16 new (total: 1,288)
2 security fixes (total: 157)

Download the new curl release from curl.se as always.

Release presentation

Security

Today we fix two security vulnerabilities and publish all details about them.

  • CVE-2024-6197: freeing stack buffer in utf8asn1str. (severity medium) libcurl’s ASN1 parser has this utf8asn1str() function used for parsing an ASN.1 UTF-8 string. It can detect an invalid field and return error. Unfortunately, when doing so it also invokes free() on a 4 byte local stack buffer.
  • CVE-2024-6874: macidn punycode buffer overread. (severity low) libcurl’s URL API function curl_url_get() offers punycode conversions, to and from IDN. Asking to convert a name that is exactly 256 bytes, libcurl ends up reading outside of a stack based buffer when built to use the macidn IDN backend. The conversion function then fills up the provided buffer exactly – but does not null terminate the string.

Changes

  • –ip-tos (IP Type of Service / Traffic Class). Lets users set this IP header field to a number.
  • –mptcp. Asks curl to enable the Multipath TCP option for this connection, which if the server also allows it may make the TCP connection to go over multiple network paths.
  • –vlan-priority. Makes curl set the VLAN priority field for its IP traffic. This is typically a field used in the network layer below IP (think Ethernet), so it is not likely to survive through IP routers. A local network thing.
  • –keepalive-cnt (and CURLOPT_TCP_KEEPCNT). Specify how many keeplive probes curl should send before it considers the connection to be dead.
  • –write-out ‘%{num_retries}’ is a new variable for the info output that outputs the number of retries that were done for the previous transfer (when –retry was used).
  • gnutls now supports CA caching. For libcurl using applications, this can really speed up doing serial TLS connections.
  • mbedtls supports CURLOPT_CERTINFO. Returns certificate information to the application.
  • noproxy patterns need to be comma separated. Space separation is no longer enough.
  • Support binding a connection to both interface and IP, not just one of them.
  • The URL API added CURLU_NO_GUESS_SCHEME, to allow an application to figure out if the scheme for a previously parsed URL was set or guessed.
  • wolfssl now supports CA caching

Bugfixes

In no other release ever before in curl’s long history have there been this many bugfixes: 260. Some of my favorites are:

  • cmake: 26 separate bugfixes
  • configure: 10 separate bugfixes
  • –help category cleanup and list categories in –help
  • allow etag and content-disposition for 3xx reply
  • docs: countless fixes, polish and corections
  • show name and keywords for failed tests in summary
  • avoid using GetAddrInfoExW with impersonation
  • URL encode the canonical path for aws-sigv4
  • fix DoH cleanup
  • fix memory leak and zero-length HTTPS RR crash in DoH
  • allow DoH transfers to override max connection limit
  • fix ß with AppleIDN
  • fix compilation with OpenSSL 1.x with md4 disabled
  • do a final progress update on connect failure
  • multi: fix pollset during RESOLVING phase
  • enable UDP GRO for QUIC
  • require at least OpenSSL 3.3 for QUIC
  • add shutdown support for HTTP/3 (QUIC)
  • fix CRLF conversion of input
  • fixed starttls for SMTP
  • change TCP keepalive from ms to seconds on DragonFly BSD
  • support TCP keepalive parameters on Solaris <11.4
  • shutdown TLS and TCP better
  • gnutls: pass in SNI name, not hostname when checking cert
  • gnutls: rectify the TLS version checks for QUIC
  • mbedtls v3.6.0 workarounds
  • several x509 asn.1 parser fixes

Next

Because the 8.9.0 release spent an extra week for its release cycle, the next one is going to be one week shorter. We do this by shortening the feature window to just two weeks this time, which might impact how many new features and changes we manage to merge.

We have a large amount of pull requests for changes already pending merge, waiting for the release window to open.

If all goes well, the next release is named 8.10.0 and eventually ships on September 11, 2024.

curl for QNX

Starting now, there are official curl releases for QNX hosted on the curl.se website. See https://curl.se/qnx.

QNX is a commercial real-time operating system and these curl release packages are produced as a result of a business arrangement.

The plan is to from now on ship curl tarballs for three different QNX versions, and each archive contains curl and libcurl built for several different targets. The curl for QNX releases should be possible to release in sync with the regular releases, but they can also be updated out of sync if need be.

Every curl release from here on out will be packaged for QNX and made available.

curl and libcurl have been functional on QNX since decades – the first mention of curl and QNX together that I could find is from October 2000. curl releases for QNX were previously packaged and provided to end users by the QNX team themselves.

This move will allow QNX users to get the latest curl faster and make them able to keep up better with curl development. For features, bugfixes and perhaps most of all security.

We will also make sure that curl keeps building fine for QNX straight from the tarball.

The complete set of build and setup scripts for curl on QNX are maintained in the curl-for-qnx git repository. Of course we will appreciate submitted issues and pull requests in that repository as well.

This commercial agreement is between Blackberry and wolfSSL. I am employed by wolfSSL. If you want your operating system to have equally fancy and always up-to-date releases, you know who to contact.

wcurl is here

Users tell us that remembering what curl options to use when they just want to download the contents of a URL is hard. This is one often repeated reason why some users reach for wget instead of curl on the command line. It downloads the data from the URL without you needing to provide any extra arguments. Without you needing to remember which option(s) to use.

In the curl user survey of 2024, it was again mentioned several times.

Enter wcurl

Samuel Henrique decided to do something about it. Today he announced that he not only created wcurl as a curl wrapper aimed at meeting this exact need, he also created a Debian package out of it and made sure wcurl now ships as part of the curl package. Starting in 8.8.0-2. I already have it on my Debian unstable installations.

wcurl is implemented a shell script that uses curl. It also ships with its own manpage.

Take it for a spin. Tell the team what you think!

Discussion

Hacker news

long term curl versions

In the curl project we ship new releases based on the master branch of our git repository, in a clean and linear commit history. We have never maintained an old branch for long term or stability etc. Instead we promise to not break user behavior nor the ABI or API. All users should be able to always upgrade to the latest.

A never-ending stream of releases you can always upgrade to; a new one every 8th week.

We build infrastructure you can lean on.

But

Sometimes reality does not match our intentions and we ship regressions.

Sometimes users are too scared that there might be a regression so they refrain from upgrading. risk averse is probably how they view themselves.

Sometimes users, organizations and Linux distros have policies that say they do not upgrade versions. Usually based on how software in general works and there needs to be a single fixed policy for managing software versions and then curl gets treated the same way.

For those situations and other related scenarios, repeating the top paragraphs does not help.

long term branches by others

In practice, just about every major Linux distributor maintains one or more stable curl branches. They backport security fixes to those versions to keep them secure for their users. Some vendors also merge selected bugfixes into their branches.

Every Linux distributor picks the particular curl version they stick to by themselves without coordination with other distros. They all do it at different times and they all have their own specific criteria and work processes for doing this. This, in combination with curl’s frequent releases, tend to make them all pick different versions for their different branches. And keep them alive for different lengths.

Some vendors maintain their stable branches for extended periods of time. Upwards and beyond ten years happen.

These long-lived branches may eventually end up having literally hundreds of patches applied to them. The curl builds done from these branches still report as version x.y.z but in reality they are mutated versions that can be significantly different compared to the original x.y.z version that the curl project shipped in a tarball back in the day.

That’s the comfort you get for picking (and paying?) a Linux distribution. (Yes, you also easily get stuck with an ancient version because of this.)

Some users also simply get stuck on older versions for other reasons and do not security-patch them over time (by ignorance or incompetence), making them more and more insecure over time.

Reality

At the time I write this post, the curl release with the largest number of known security vulnerabilities has 85 published CVEs.

By asking users and by looking in logs in various servers, we know that just about every curl versions we have shipped the last dozen of years or so remain in use somewhere. We can only hope that most of them are security patched.

In reality, every release we do becomes a long term release for someone.

Long term support?

Every once in a while a discussion pops out in or close to the curl project whether we should consider starting to maintain one or more LTS branches.

We have never completely dismissed those ideas. We are however acutely aware of the extra effort and energy such an endeavor requires, so we have so far shrugged it off. But should there come users and sponsors willing to help make it happen, we would not be shy of implementing something.

After all, the ones most interested in LTS branches are usually people and companies with an economic gain to be had; with businesses using and relying on a rock solid curl. They should then also be able to help pay for this.

If you or the company you work for would be interested in something like this, please reach out and we can get the conversation going. Maybe we can do something to improve the lives of people out there?

Until then, we stick to a single release branch.

Credits

Image by Julius Silver from Pixabay

Inside 22,734 Steam games

About a year ago I blogged about games that use curl. In that post I listed a bunch of well-known titles I knew use curl and there was a list of 136 additional games giving credit to curl.

Kind of amazing that over one hundred games decided to use curl!

At the time, lots of people told me that number was probably way low and while I kind of had that feeling as well it was just a feeling and nothing else. We cannot be absolutely certain unless there is data or evidence to actually back it up.

The speculation could stop this week when someone provided me with a link to a database of Steam titles (Steam, as in the video game service). SteamDB is a third-party site that among other things extracts data and figures out which “SDKs” are used by Steam games: Their list of game titles on Steam using curl.

Since that list is capped at 10,000 titles, I had to filter it and add up the number of titles based on release year. Out of the 91,559 titles they currently list in their database, 22,734 are identified to be using curl: 24.8%.

Not too shabby for a hobby.

Discussion

Hacker news

curl user survey 2024 analysis

As tradition dictates, I have spent many hours walking through the responses to the curl user survey of the year. I have sorted tables, rendered updated graphs and tried to wrap my head around what all these numbers might mean and what conclusions and lessons we should draw.

I present the results, the collected answers, to the survey mostly raw without a lot of analysis or decisions. This, to allow everyone who takes the time to reads through to form their own opinion and thoughts. It also gives me more time to glance over the numbers many more times before I make up my mind about possible outcomes.

The 2024 user survey analysis document

If you find any mistakes or omissions in this document, let me know and I might fix and update corrected versions.

63 pages and 14,800 words. Enjoy!

Why curl closes PRs on GitHub

Contributors to the curl project on GitHub tend to notice the above sequence quite quickly: pull requests submitted do not generally appear as “merged” with its accompanying purple blob, instead they are said to be “closed”. This has been happening since 2015 and is probably not going to change anytime soon.

Let me explain why this happens.

I blame GitHub

GitHub’s UI does not allow us to review or comment on commit messages for pull requests. Therefore, it is hard to insist on contributors to provide the correct message, using the proper language in the correct format.

If you make a pull request based on a single commit, the initial PR message is based on the commit message but when follow-up fixes are done and perhaps force-pushed, the PR message is not updated accordingly with the commit message’s updates.

Commit messages with style

I believe having good commit messages following a fixed style and syntax helps the project. It makes the git history better and easier to browse. It allows us to write tools and scripts around git and the git history. Like how we for example generate release notes and project stat graphs based on git log basically.

We also like and use a strictly linear history in curl, meaning that all commits are rebased on the master branch. Lots of the scripting mentioned above depends on this fact.

Manual merges

In order to make sure the commit message is correct, and in fact that the entire commit looks correct, we merge pull requests manually. That means that we pull down the pull request into a local git repository, clean up the commit message to adhere to project standards.

And then we push the commit to git. One or more of the commit messages in such a push then typically contains lines like:

Fixes #[number] and Closes #[number]. Those are instructions to GitHub and we use them like this:

Fixes means that this commit fixed an issue that was reported in the GitHub issue with that id. When we push a commit with that instruction, GitHub closes that issue.

Closes means that we merged a pull request with this id. (GitHub has no way for us to tell it that we merged the pull request.) This instruction makes GitHub closes the corresponding pull request: “[committer] closed this in [commit hash]”.

We do not let GitHub dictate how we do git. We use git and expect GitHub to reflect our git activity.

We COULD but we won’t

We could in theory fix and cleanup the commits locally and manually exactly the way we do now and then force-push them to the remote branch and then use the merge button on the GitHub site and then they would appear as “merged”.

That is however a clunky, annoying and time-consuming extra-step that not only requires that we (always) push code to other people’s branches, it also triggers a whole new round of CI jobs. This, only to get a purple blob instead of a red one. Not worth it.

If GitHub would allow it, I would disable the merge button in the GitHub PR UI for curl since it basically cannot be used correctly in the project.

Squashing all the commits in the PR is also not something we want since in many cases the changes should be kept as more than one commit and they need their own dedicated and correct commit message.

What GitHub could do

GitHub could offer a Merged keyword in the exact same style as Fixed and Closes, that just tells the service that we took care of this PR and merged it as this commit. It’s on me. My responsibility. I merged it. It would help users and contributors to better understand that their closed PR was in fact merged as that commit.

It would also have saved me from having to write this blog post.

Discussion

Hacker news

Addendum

In some post-publish discussions I have seen people ask about credits. This method to merge commits does not break or change how the authors are credited for their work. The commit authors remain the commit authors, and the one doing the commits (which is I when I do them) is stored separately. Like git always do. Doing the pushes manually this way does in no way change this model. GitHub will even count the commits correctly for the committer – assuming they use an email address their GitHub account does (I think).