
curl + MQTT = true

This is the 25th transfer protocol added to curl. It is the first new addition since we added SMB and SMBS back in November 2014.

Background

Back in early 2019, my brother Björn Stenberg brought a pull request to the curl project that added support for MQTT. I tweeted about it and it seemed people were interested in seeing this happen.

Time passed and Björn unfortunately didn’t manage to push his work forward. The PR grew stale and was eventually closed later that same year due to inactivity.

Roadmap 2020

While going over what I want to see in curl in the coming year, and what we (wolfSSL) as a company would like to see done, MQTT qualified as a contender for the list. See my curl roadmap 2020 video.

It’s happening again

I grabbed Björn’s old pull-request, rebased it onto git master, fixed a few minor conflicts, did the small cleanups necessary and then brought it further. I documented two of my early sessions on this, live-streamed on Twitch. See MQTT in curl and MQTT part two below:

Polish

Björn’s code was an excellent start but didn’t take us all the way.

I wrote an MQTT test server, created a set of test cases, made sure the code worked for those test cases, made it more solid, and more. It is still early days and the MQTT support is basic and comes with several caveats, but it’s slowly getting there.

MQTT – really?

When I say that MQTT almost fits the curl concepts and paradigms, I mean that you can consider what an MQTT client does to be “sending” and “receiving” and you can specify that with a URL.

Fetching an MQTT URL with curl means doing SUBSCRIBE on a topic, waiting for a message to arrive and sending its payload to the output.

Doing the equivalent of an HTTP POST with curl, like with the command line’s -d option, makes an MQTT PUBLISH and sends a payload to a topic.
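For example – with a made-up broker host and topic names – a subscribe and a publish could look like this.

Subscribe to a topic and write the first arriving payload to stdout:

curl mqtt://example.com/home/bedroom/temp

Publish the payload “75” to a topic:

curl -d 75 mqtt://example.com/home/bedroom/dimmer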

Rough corners and wrong assumptions

I’m an MQTT rookie. I’m sure there will be mistakes and I will have misunderstood things. The MQTT support will be considered experimental for some time forward, so that people get a chance to verify the functionality and we get a chance to change and correct the worst decisions and fatal mistakes. Remember that for experimental features in curl, we reserve the right to change behavior, API and ABI, so nobody should ship such features enabled anywhere without first thinking it through very carefully!

If you’re a person who thinks MQTT in curl would be useful, good or just fun, and you have use cases or ideas where you’d want to use this: please join in, try it and let us know how it works and what you think we should polish or fix to make it truly stellar!

The code landed in the master branch when PR 5173 was merged. It will be present in the coming 7.70.0 release, due to ship on April 29, 2020.

TODO

As I write this, the MQTT support is still very basic. I want a first version out to users as early as possible, as I want feedback and comments to help verify that we’re heading in the right direction, and then work on making the protocol support more complete. TLS, authentication, QoS and more will come as we proceed. Of course, if you let me know what we must support for MQTT to make it interesting for you, I’ll listen! Preferably, you do the discussions on the curl-library mailing list.

We’ve only just started.

Credits

The initial MQTT patch that kicked us off was written by Björn Stenberg. I brought it forward from there, bug-fixed it, extended it, added a test server and test cases and landed the lot in the master branch.

The queuing top image by DaKub from Pixabay

curl ootw: --append

Previously mentioned command line options of the week.

--append is the long form option, -a is the short. The option has existed since at least May 1998 (present in curl 4.8). I think it is safe to say that if we had created this option just a few years later, we would not have “wasted” a short option letter on it. It is not a very frequently used one.

Append remotely

The append in the option name is a request to the receiver to append to – rather than replace – a destination file. This option only has any effect when uploading using either FTP(S) or SFTP. It is a flag option and you use it together with the --upload-file option.

When you upload to a remote site with these protocols, the default behavior is to overwrite any file that happens to exist on the server with the name we’re uploading to. If you append this option to the command line, curl will instead instruct the server to append the newly uploaded data to the end of the remote file.

The reason this option is limited to just a subset of protocols is of course that they are the only ones for which we can give that instruction to the server.

Example

Append the local file “trailer” to the remote file called “begin”:

curl --append --upload-file trailer ftp://example.com/path/begin

A QQGameHall storm

Mar 31 2020, 11:13:38: I get a message from Frank in the #curl IRC channel over on Freenode. I’m always “hanging out” on IRC and Frank is a long-time friend and fellow frequent IRCer in that channel. This time, Frank informs me that the curl web site is acting up:

“I’m getting 403s for some mailing list archive pages. They go away when I reload”

That’s weird and unexpected. An important detail here is that the curl web site is “CDNed” by Fastly. This means that every visitor to the web site actually talks to one of Fastly’s servers and in most cases gets cached content from those servers; only infrequently do these servers come back to my “origin” server and ask for an updated file to send out to a visitor.

A 403 error for a valid page is not a good thing. I started checking out some of my logs – which only cover the origin, as I don’t do any logging at all at the CDN level (more about that later) – and I could verify the 403 errors. So they’re in my log, meaning they weren’t caused by (a misconfiguration of) the CDN. Why would a perfectly legitimate URL suddenly return 403, only to have it go away again after a reload?

Why does he get a 403?

I took a look at Fastly’s management web interface and spotted that the curl web site was sending out data at an unusually high speed at the moment: an average of around 50 Mbps, while we typically average below 20. Hm… something is going on.

While I continued to look for the answers, I noted that my logs were growing really rapidly. POSTs were being sent to the same single URL at a high frequency (10-20 requests/second) and each of those would get some 225 Kbytes of data returned. They all used the same User-Agent: QQGameHall. It seems this started within the last 24 hours or so. They’re POSTs, so Fastly basically always passes them through to my server.

Before I could figure out Frank’s 403s, I decided to slow down this madness by temporarily forbidding this user-agent access, so that the bot or program or whatever would notice it starting to fail – and would then, of course, stop bombarding the site.

Deny

Ok, a quick deny of the user-agent made my server start responding with 403s to all those requests, and instead of a 225K response it now sent back 465 bytes per request. The average bandwidth on the site immediately dropped back below 20 Mbps. Back to looking for Frank’s 403 problem.

First the 403s seen due to the rate limiting, then I removed the rate limiting and finally I added a block of the user-agent. Screenshotted error rates from Fastly’s admin interface. This is errors per minute.

The answer was pretty simple and I didn’t have to search a lot. The clues existed in the error logs, and it turned out we had “mod_evasive” enabled since another heavy bot load “attack” a while back. It is a module for rate limiting incoming requests, and since a lot of requests to our server now come from Fastly’s limited set of IP addresses and we had this crazy QQ thing hitting us, my server would return a 403 every now and then when it considered the rate too high.

I whitelisted Fastly’s requests and Frank’s 403 problems were solved.

Deny a level up

The bot traffic showed no sign of slowing down. Easily 20 requests per second to the same URL, and they all get an error back and obviously they don’t care. I decided to up my game a little, so with help I moved my blocking of this service to Fastly. I now block their user-agent already there, so the traffic never reaches my server. Phew, my server was finally back to its regular calm state. The way it should be.

It doesn’t stop there. Here’s a follow-up graph I just grabbed, a little over a week since I started the blocking. 16.5 million blocked requests (and counting). The graph shows the number of requests/hour on the Y axis, peaking at almost 190k – around 50 requests/second. The load is of course not actually a problem, just a nuisance now. QQGameHall keeps on going.

Errors per hour over the period of several days.

QQGameHall

What we know about this.

Friends on Twitter and googling for this name inform us that this is a “game launcher” made by Tencent. I’ve tried to contact them via Twitter (as I have no other means of contacting them that seems even remotely likely to work).

I have not checked what these clients actually POST, because I didn’t log that. I suspect it was just a zero byte POST.

The URL they post to is the CA cert bundle file we provide on the curl CA extract web page. The one we convert from the Mozilla version into a PEM for users of the world to enjoy. (Someone seems to enjoy this maybe just a little too much.)
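For illustration, this kind of request is easy to mimic with curl itself. Assuming a zero byte POST body and the cacert.pem URL from that page, something like this should now get the 403 treatment:

curl -d '' -A QQGameHall https://curl.haxx.se/ca/cacert.pem -o /dev/null -w '%{response_code}\n'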

The user-agents seemed to come (mostly) from China which seems to add up. Also, the look of the graph when it goes up and down could indicate an eastern time zone.

This program uses libcurl. Harry in the #curl channel found files on VirusTotal and had a look. It is, I think, therefore highly likely that this “storm” is caused by an application using libcurl!

My theory: this is some sort of service that was deployed, or an upgrade that shipped, that wants to get an updated CA store, and they get that from our site with this request. Either they get it far too often, or maybe there is just a very large number of them, or similar. I cannot understand why they issue a POST though. If they had just done a GET I would never have noticed, and they would’ve fetched perfectly fine cached versions from the CDN…

Feel free to speculate further!

Logging, privacy, analytics

I don’t have any logging of the CDN traffic to the curl site. Primarily because I haven’t had to, but also because I appreciate the privacy gain for our users, and finally because handling logs at this volume pretty much requires a separate service and they all seem to be fairly pricey – for something I really don’t want. So therefore I don’t see the source IP addresses of these requests. (But yes, I can ask Fastly to check and tell me if I really really wanted to know.)

Also: I don’t run any analytics (Google or otherwise) on the site, primarily for privacy reasons. So that won’t give me that data or other clues either.

Update: it has been proposed I could see the IP addresses in the X-Forwarded-For: headers, and that seems accurate. Of course I didn’t log that header during this period, but I will consider starting to log it for better control and info in the future.

Update 2: As of May 18 2020, this flood has not diminished. Logs show that we still block about 5 million requests/day from this service, peaking at over 100 requests/minute.

Credits

Top image by Elias Sch. from Pixabay

curl ootw: -v is for verbose

(Previous entries in the curl option of the week series.)

This is one of the original 24 command line options that existed already in the first ever curl release in the spring of 1998. The -v option’s long version is --verbose.

Note that this uses the lowercase ‘v’. The uppercase -V option shows detailed version information.

So use it!

In a blog post series about curl command line options, you’d think this option would basically be unnecessary to include since it seems so basic, so obvious, and of course people know of it and use it immediately to understand why curl invocations don’t behave as expected!

Time and time again, the first response to users with problems is to please add --verbose to the command line. Many of those times, the problem is then figured out, understood and sorted out without any need of further help.

--verbose should be the first thing to try for everyone who runs a curl command that fails inexplicably.

What verbose shows

First: there’s only one verbosity level in curl. There’s normal and there’s verbose. Pure binary; on or off. Adding more -v flags on the same line won’t bring you more details. In fact, adding more won’t change anything at all other than making your command line longer. (And yes, you have my permission to gently taunt anyone you see online who uses more than one -v with curl.)

Verbose mode shows outgoing and incoming headers (or protocol commands/responses) as well as “extra” details that we’ve deemed sensible in the code.

For example, you will get to see which IP addresses curl attempts to connect to (that the host name resolved to), details from the server’s TLS certificate and which TLS cipher was negotiated, etc.

-v shows details from the protocol engine. Of course, you will also see different output depending on which protocol is being used.

What verbose doesn’t show

This option is meant to help you understand the protocol parts but it doesn’t show you everything that’s going on – for example it doesn’t show you the outgoing protocol data (like the HTTP request body). If -v isn’t enough for you, then the --trace and --trace-ascii options are there for you.
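For example, this writes a complete, human readable protocol dump – headers and bodies in both directions – to the file “dump.txt”:

curl --trace-ascii dump.txt https://example.com/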

If you are in the rare situation where the trace options aren’t detailed enough, you can go all-in with full SSLKEYLOGFILE mode and inspect curl’s network traffic with Wireshark.
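A minimal sketch of that workflow, assuming your curl is built with a TLS backend that honors the SSLKEYLOGFILE environment variable (OpenSSL does):

SSLKEYLOGFILE=$HOME/tls-keys.txt curl https://example.com/ -o saved

Capture the traffic with Wireshark, point it at that key file in the TLS protocol preferences, and it can decrypt the session for you.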

Less verbose?

It’s not exactly a “verbose level”, but curl does by default show a progress meter and some other things. You can silence curl completely by using -s (--silent) or use the recently introduced option --no-progress-meter.

Example

curl -v https://example.com/

Google Open Source Peer Bonus award 2020

I’m honored to – once again – be a recipient of this award that Google hands out annually to open source contributors. I was previously awarded this in 2011.

I don’t get a lot of awards. Getting this token of appreciation feels awesome and I’m humbled and grateful I was not only nominated but also actually selected as recipient. Thank you, Google!

Nine years ago I got 350 USD credits in the Google store and I got my family a set of jackets using them – my kids have grown significantly since then, so to them those black beauties are now just a distant memory, but I still actually wear mine from time to time!

curl beers and curl stickers!

This time, the reward comes with a 250 USD “payout” (that’s the gift mentioned in the award mail), as a real money transfer that can be spent on other things than just Google merchandise!

I’ve decided to accept the reward and the money and I intend to spend it on beer and curl stickers for my friends and fans. As I prefer to view it:

The Google Open Source Beer Bonus.

Thank you Google and thank you Gaspar!

Update: the Google Open Source blog post about it.

curl ootw: --proxy-basic

Previous command line options of the week.

--proxy-basic has no short option. This option is closely related to --proxy-user, which has a separate blog post.

This option has been provided and supported since curl 7.12.0, released in June 2004.

Proxy

In curl terms, a proxy is an explicit middle man that curl goes through when doing a transfer to or from a server:

curl <=> proxy <=> server

curl supports several different kinds of proxies. This option is for HTTP(S) proxies.

HTTP proxy authentication

Authentication: the process or action of proving or showing something to be true, genuine, or valid.

When it comes to proxies and curl, you typically provide a user name and password to be allowed to use the service. If the client provides the wrong user or password, the proxy simply denies the client access with a 407 HTTP response code.

curl supports several different HTTP proxy authentication methods, and the proxy can itself reply and inform the client which methods it supports. With the option of this week, --proxy-basic, you ask curl to do the authentication using the Basic method. “Basic” is indeed very basic, but it is the actual name of the method, defined in RFC 7617.

Security

The Basic method sends the user and password in the clear in the HTTP headers – they’re just base64 encoded. This is notoriously insecure.
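Just to illustrate how thin that protection is: with the credentials used in the example further down, the header that travels over the wire looks like “Proxy-Authorization: Basic ZGFuaWVsOnBhc3N3b3JkMTIz” and anyone who captures it can decode it in a second:

echo -n daniel:password123 | base64
echo ZGFuaWVsOnBhc3N3b3JkMTIz | base64 -d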

If the proxy is an HTTP proxy (as opposed to an HTTPS proxy), users on your network or on the path between you and your HTTP proxy can see your credentials fly by!

If the proxy is an HTTPS proxy however, the connection to it is protected by TLS and everything is encrypted over the wire, so the credentials sent in HTTP are protected from snoopers.

Also note that if you pass in credentials to curl on the command line, they might be readable in the script you invoke it from. Or, if you do it interactively at a shell prompt, they might be viewable in process listings on the machine – curl tries to hide them, but that isn’t supported everywhere.

Example

Use a proxy with your name and password and ask for the Basic method specifically. Basic is also the default unless anything else is asked for.

curl --proxy-user daniel:password123 --proxy-basic --proxy http://myproxy.example https://example.com

Related

With --proxy you specify the proxy to use, and with --proxy-user you provide the credentials.

Also note that you can of course set and use entirely different credentials and HTTP authentication methods with the remote server even while using Basic with the HTTP(S) proxy.

There are also other authentication methods to select from, with --proxy-anyauth being a very practical one to know about.

curl ootw: --retry-max-time

Previous command line options of the week.

--retry-max-time has no short option alternative and it takes a numerical argument stating the time in seconds. See below for a proper explanation for what that time is.

Retrying

curl supports retrying of operations that failed due to “transient errors”, meaning that if the error code curl gets signals that the error is likely to be temporary and not the fault of curl or the user, it can try again. You enable retrying with --retry [tries], where you tell curl how many times it should retry. If it reaches the maximum number of retries without a successful transfer, it will return an error.

A transient error can mean that the server is temporarily overloaded or similar, so when curl retries it will by default wait a short while before doing the next “round”. By default, it waits one second on the first retry and then doubles the time for every new attempt until the waiting time reaches 10 minutes, which is the max waiting time. A user can set a custom delay between retries with the --retry-delay option.
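For example, this retries up to five times and waits a fixed two seconds between attempts instead of using the doubling default:

curl --retry 5 --retry-delay 2 https://example.com/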

Transient errors

Transient errors mean either: a timeout, an FTP 4xx response code or an HTTP 408 or 5xx response code. All other errors are non-transient and will not be retried with this option.

Retry no longer than this

Retrying can thus go on for an extended period of time, and you may want to limit for how long it will retry if the server really doesn’t work. Enter --retry-max-time.

It sets the maximum number of seconds that are allowed to have elapsed for another retry attempt to be started. If you set the maximum time to 20 seconds, curl will only start new retry attempts within the twenty second period that began with the first transfer attempt.

If curl gets a transient error back after 18 seconds, it is allowed to do another retry. If that operation then takes 4 seconds there will be no more attempts, but if it takes 1 second there is time for yet another retry.

Of course the primary --retry option sets the number of times to retry, which may reach the end before the maximum time is met. Or not.
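To illustrate with made-up numbers: allow up to ten retries, but don’t start any new attempt once twenty seconds have elapsed:

curl --retry 10 --retry-max-time 20 https://example.com/ -o saved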

Retry-After:

Since curl 7.66.0 (September 2019), the server’s Retry-After: HTTP response header will be used to figure out when the subsequent retry should be issued – if present. It is a generic means to allow the server to control how fast clients will come back, so that the retries themselves don’t become a problem that causes more transient errors…

--retry-connrefused

In curl 7.52.0, curl got this additional retry switch that adds “connection refused” as a valid reason for doing a retry. If it isn’t used, a refused connection is not considered a transient error and will cause a regular error exit code.
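For example:

curl --retry 5 --retry-connrefused https://example.com/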

Related options

--max-time limits the entire time allowed for an operation, including all retry attempts.

let’s talk curl 2020 roadmap

tldr: join in and watch/discuss the curl 2020 roadmap live on Thursday March 26, 2020. Sign up here.

The roadmap is basically a list of things that we at wolfSSL want to work on for curl this year – and some things that we want to mention as possibilities. (Yes, the word “webinar” is used, don’t let it scare you!)

If you can’t join live, you will be able to enjoy a recorded version after the fact.

I have shown the image below in curl presentations many times, to illustrate the curl roadmap ahead:

The point being that we as a project don’t really have a set future but we know that more things will be added and fixed over time.

Daniel, wolfSSL and curl

This is a balancing act where I have several different “hats”.

I’m the individual who works for wolfSSL. In this case I’m looking at things we at wolfSSL want to work on for curl – it may not be what other members of the team will work on. (But still things we agree are good and fit for the project.)

We in wolfSSL cannot control or decide what the other curl project members will work on as they are volunteers or employees working for other companies with other short and long term goals for their participation in the curl project.

We also want to communicate a few of the bigger picture things for curl that we want to see done, so that others can join in and contribute their ideas and opinions about these features, perhaps even add their preferred subjects to the list – or step up and buy commercial curl support from us and get a direct channel to us and the ability to directly affect what I will work on next.

As a lead developer of curl, I will of course never merge anything into curl that I don’t think benefits or advances the project. Commercial interests don’t change that.

Webinar

Sign up here. The scheduled time has been picked to allow for participants from both North America and Europe. Unfortunately, this makes it hard for all friends not present on these continents. If you really want to join but can’t due to time zone issues, please contact me and let us see what we can do!

Credits

Top image by Free-Photos from Pixabay

curl: 22 years in 22 pictures and 2222 words

curl turns twenty-two years old today. Let’s celebrate this by looking at its development, growth and change over time from a range of different viewpoints with the help of graphs and visualizations.

This is the more-curl-graphs-than-you-need post of the year. Here are 22 pictures showing off curl in more detail than anyone needs.

I founded the project back in the day and I remain the lead developer – but I’m far from alone in this. Let me take you on a journey and give you a glimpse into the curl factory. All the graphs below are provided in hires versions if you just click on them.

Below, you will learn that we’re constantly going further, adding more and aiming higher. There’s no end in sight and curl is never done. That’s why you know that leaning on curl for Internet transfers means going with a reliable solution.

Number of lines of code

Counting only code in the tool and the library (and public headers), it has still grown 80 times in size since the initial release – but then again, it also can do so much more.

At times people ask how a “simple HTTP tool” can be over 160,000 lines of code. That’s basically three wrong assumptions put next to each other:

  1. curl is not simple. It features many protocols, fairly advanced APIs and super powers, and it offers numerous build combinations and runs on just about all imaginable operating systems
  2. curl supports 24 transfer protocols and counting, not just HTTP(S)
  3. curl is much more than “just” the tool. The underlying libcurl is an Internet transfer jet engine.

How much more is curl going to grow and can it really continue growing like this even for the next 22 years? I don’t know. I wouldn’t have expected it ten years ago and guessing the future is terribly hard. I think it will at least continue growing, but maybe the growth will slow down at some point?

Number of contributors

Lots of people help out in the project. Everyone who reports bugs, brings code patches, improves the web site or corrects typos is a contributor. We want to thank everyone and give all helpers the credit they deserve. They’re all contributors. Here’s how fast our list of contributors is growing. We’re at over 2,130 names now.

When I wrote a blog post five years ago, we had 1,200 names in the list and the graph shows a small increase in growth over time…

Daniel’s share of total commits

I started the project. I’m still very much involved and I spend a ridiculous amount of time and effort driving this. We now have over 770 commit authors, and this graph shows how my share of the commits to the project has developed over time. I’ve done about 57% of all commits in the source code repository right now.

The graph shows the accumulated amount. In some individual years I actually did far less than 50% of the commits, as the following graph shows.

Daniel’s share of commits per year

In the early days I was the only one who committed code. Over time a few others were “promoted” to the maintainer role, and in 2010 we switched to git, so the tracking of authors since then is much more accurate.

In 2014 I joined Mozilla and we can see an uptick in my personal participation level again, after it had been below 50% for several years straight.

There’s always this argument to be had over whether it is a good or a bad sign for the project that my individual share is this big. Is this just because I don’t let other people in, or because curl is so hard to work on that only I know my way around the secret passages? I think the ever-growing number of commit authors at least shows that it isn’t the latter.

What happens the day I grow bored or get run over by a bus? I don’t think there’s anything to worry about. Everything is free, open, provided and well documented.

Number of command line options

The command line tool is really like a very elaborate Swiss army knife for Internet transfers and it provides many individual knobs and levers to control the powers. curl has a lot of command line options and they’ve grown in number like this.

Is curl growing too hard to use? Should we redo the “UI”? Having the huge set of features that curl does, providing them all with a coherent and understandable interface is indeed a challenge…

Number of lines in docs/

Documentation is crucial. It’s the foundation on which users can learn about the tool, the library and the entire project. Having plenty and good documentation is a project ambition. Unfortunately, we can’t easily measure the quality.

All the documentation in curl sits in the docs/ directory or sub directories in there. This shows how the amount of docs for curl and libcurl has grown through the years, in number of lines of text. The majority of the docs is in the form of man pages.

Number of supported protocols

This refers to primary transfer protocols, as in what you basically specify as a scheme in URLs (i.e. it doesn’t count “helper protocols” like TCP, IP, DNS, TLS etc). Did I tell you curl is much more than an HTTP client?

More protocols coming? Maybe. There are always discussions and ideas… But we want protocols to have a URL syntax and to be transfer oriented, so that they map correctly onto the curl mindset.

Number of HTTP versions

The support for different HTTP versions has also grown over the years. In the curl project we’re determined to support every HTTP version that is used, even if HTTP/0.9 support was recently disabled by default and you need to use an option to ask for it.

Number of TLS backends

The initial curl release didn’t even support HTTPS, but since 2005 we’ve supported customizable TLS backends and we’ve been adding support for many more since then. As we recently removed support for two libraries, we’re now counting thirteen different supported TLS libraries.

Number of HTTP/3 backends

Okay, this graph is mostly in jest but we recently added support for HTTP/3 and we instantly made that into a multi backend offering as well.

An added challenge that this graph doesn’t really show is how the choice of HTTP/3 backend is going to affect the choice of TLS backend and vice versa.

Number of SSH backends

For a long time we only supported a single SSH solution, but that was then and now we have three…

Number of disclosed vulnerabilities

We take security seriously and over time people have given us more attention and have spent more time digging deeper. These days we offer good monetary compensation for anyone who can find security flaws.

Number of known vulnerabilities

An attempt to visualize how many known vulnerabilities previous curl versions contain. Note that most of these problems are fairly minor and some only apply to very specific use cases or environments. As a reference, this graph also includes the number of lines of code in the corresponding versions.

More recent releases have fewer problems, partly because we have better testing in general, but of course also because they’ve been around for a shorter time and thus people have had less time to find problems in them.

Number of function calls in the API

libcurl is an Internet transfer library and the number of provided function calls in the API has grown over time as we’ve learned what users want and need.

Anything built with libcurl 7.16.0 or later can always be upgraded to a later libcurl: the API and ABI are compatible and there should be no functionality change. We put great effort into making sure this remains true.

The largest API additions over the last few years are marked in the graph: when we added the curl_mime_* and the curl_url_* families. We now offer 82 function calls. We’ve added 27 calls over the last 14 years while maintaining the same soname (ABI version).

Number of CI jobs per commit and PR

We’ve had automatic testing in the curl project since the year 2000. But for many years that testing was done by volunteers who ran tests in cronjobs on their local machines a few times per day and sent the logs back to the curl web site, which displayed their status.

The automatic tests are still running and they still provide value, but I think we all agree that getting the feedback up front in pull-requests is a more direct way that also better prevents bad code from ever landing.

The first CI builds were added in 2013, but it took a few more years until we really adopted the CI lifestyle. Today we have 72 CI jobs, spread over 5 different CI services (Travis CI, AppVeyor, Cirrus CI, Azure Pipelines and GitHub Actions). These builds run for every commit and all submitted pull requests on GitHub. (We actually have a few more that aren’t easily counted, since they aren’t mentioned in files in the git repo but are controlled directly from GitHub settings.)

Number of test cases

A single test case can test a simple little thing or it can be a really big elaborate setup that tests a large number of functions and combinations. Counting test cases doesn’t in itself say much, but taken together and looking at the change over time, we can at least see that we continue to put effort into expanding and increasing our tests. This should also be considered in combination with the previous graph showing the CI builds, as most CI jobs also run all the tests (that they can).

Number of commits per month

A commit can be tiny and it can be big. Counting commits might not say a lot more than that it is a sign of some sort of activity and change in the project. I find it almost strange how little the number of commits per month has changed over time!

Number of authors per month

This shows number of unique authors per month (in red) together with the number of first-time authors (in blue) and how the amounts have changed over time. In the last few years we see that we are rarely below fifteen authors per month and we almost always have more than five first-time commit authors per month.

I think I’m especially happy with the retained high rate of newcomers, as it is at least some indication that entering the project isn’t overly hard or complicated and that we manage to absorb these contributions. Of course, what we can’t see here is the number of users, or the efforts people have put in, that never resulted in a merged commit. How often do we miss out on changes because of the project’s inability to receive or accept them?

72 operating systems

Operating systems on which you can build and run curl right now, or on which we know people have run curl before. Most mortals cannot even list this many OSes off the top of their heads. If you know of any additional OS that curl has run on, please let me know!

20 CPU architectures

CPU architectures on which we know people have run curl. It basically runs on any CPU that is 32 bit or larger. If you know of any additional CPU architecture that curl has run on, please let me know!

32 third party dependencies

Did I mention you can build curl in millions of combinations? That’s partly because of the multitude of different third party dependencies you can tell it to use. curl supports no less than 32 different third party dependencies right now. The picture below is an attempt at some sort of block diagram, where all the green boxes are third party libraries curl can potentially be built to use. Many of them can be used simultaneously, but a bunch are also mutually exclusive, so no single build can actually use all 32.

60 libcurl bindings

If you’re looking for more explanations for how libcurl ends up being used in so many places, here are 60 more: languages and environments that sport a “binding” that lets users of those languages use libcurl for Internet transfers.

Missing pictures

“number of downloads” could’ve been fun, but we don’t collect the data and most users don’t download curl from our site anyway so it wouldn’t really say a lot.

“number of users” is impossible to tell and while I’ve come up with estimates every now and then, making that as a graph would be doing too much out of my blind guesses.

“number of graphs in anniversary blog posts” was a contender, but in the end I decided against it, partly since I have too little data.

(Scripts for most graphs)

Future

Every anniversary is an opportunity to reflect on what’s next.

In the curl project we don’t have any grand scheme or roadmap for the coming years. We work much more short-term. We stick to the scope: Internet transfers specified as URLs. The products should be rock solid and secure. They should be highly performant. We should offer the features, knobs and levers our users need to keep doing Internet transfers now and in the future.

curl is never done. The development pace doesn’t slow down and the list of things to work on doesn’t shrink.

curl write-out JSON

This is not a command line option of the week post, but I feel a need to tell you a little about our brand new addition!

--write-out [format]

This option takes a format string in which there are a number of different “variables” available that let a user output information from the previous transfer. For example, you can get the HTTP response code from a transfer like this:

curl -w 'code: %{response_code}' https://example.org -o saved

There are currently 34 different such variables listed and described in the man page. The most recently added one is for JSON output and it works like this:

%{json}

It is a single variable that outputs a full JSON object. You would for example invoke it like this when you get data from example.com:

curl --write-out '%{json}' https://example.com -o saved

That command line will spew some 800 bytes to the terminal and it won’t be very human readable. You would rather take care of that output with some kind of script/program, or if you want an eye-pleasing version you can pipe it into jq, and then it can look like this:

{
  "url_effective": "https://example.com/",
  "http_code": 200,
  "response_code": 200,
  "http_connect": 0,
  "time_total": 0.44054,
  "time_namelookup": 0.001067,
  "time_connect": 0.11162,
  "time_appconnect": 0.336415,
  "time_pretransfer": 0.336568,
  "time_starttransfer": 0.440361,
  "size_header": 347,
  "size_request": 77,
  "size_download": 1256,
  "size_upload": 0,
  "speed_download": 0.002854,
  "speed_upload": 0,
  "content_type": "text/html; charset=UTF-8",
  "num_connects": 1,
  "time_redirect": 0,
  "num_redirects": 0,
  "ssl_verify_result": 0,
  "proxy_ssl_verify_result": 0,
  "filename_effective": "saved",
  "remote_ip": "93.184.216.34",
  "remote_port": 443,
  "local_ip": "192.168.0.1",
  "local_port": 44832,
  "http_version": "2",
  "scheme": "HTTPS",
  "curl_version": "libcurl/7.69.2 GnuTLS/3.6.12 zlib/1.2.11 brotli/1.0.7 c-ares/1.15.0 libidn2/2.3.0 libpsl/0.21.0 (+libidn2/2.3.0) nghttp2/1.40.0 librtmp/2.3"
}

The JSON object

It always outputs the entire object, and the object may of course differ over time as I expect we might add more fields to it in the future.

The names are the same as the write-out variables, so you can read the --write-out section in the man page to learn more.
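Being JSON, the output is also easy to post-process. For example, extracting a single field with jq (assuming you have it installed):

curl -s --write-out '%{json}' https://example.com -o saved | jq .response_code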

Ships?

The feature landed in this commit. This new functionality will debut in the next pending release, likely to be called 7.70.0, scheduled to happen on April 29, 2020.

Credits

This is the result of fine coding work by Mathias Gumz.

Top image by StartupStockPhotos from Pixabay