What if GitHub is the devil?

Some critics think the curl project shouldn’t use GitHub. The reasons for being against GitHub hosting tend to be one or more of:

  1. it is an evil proprietary platform
  2. it is run by Microsoft and they are evil
  3. GitHub is American thus evil

Some have insisted on craziness like “we let GitHub hold our source code hostage”.

Why GitHub?

The curl project switched to GitHub (from Sourceforge) almost eleven years ago and it’s been a smooth ride ever since.

We’re on GitHub not only because it provides a myriad of practical features and is a stable and snappy service for hosting and managing source code. GitHub is also a developer hub for millions of developers who already have accounts and are familiar with the GitHub style of developing, the terms and the tools. By being on GitHub, we reduce friction from the contribution process and we maximize the ability for others to join in and help. We lower the bar. This is good for us.

I like GitHub.

Self-hosting is not better

Providing even close to the same uptime and snappy response times with a self-hosted service is a challenge, and it would take someone time and energy to volunteer that work – time and energy we can now instead spend on developing the project. As a small independent open source project, we don’t have any “infrastructure department” that would do it for us. And trust me: we already have enough infrastructure management to deal with without adding to that pile.

… and by running our own hosted version, we would lose the “network effect” and the convenience for people who are already on the platform and know it. We would also lose the easy integration with cool services like the many different CI and code analyzer jobs we run.

Proprietary is not the issue

While git is open source, GitHub is a proprietary system. But the thing is that even if we went with a competitor and got our code hosting done elsewhere, our code would still be stored on a machine somewhere in a remote server park we cannot physically access – ever. It doesn’t matter if that hosting company uses open source or proprietary code. If they decide to switch off the servers one day, or even just selectively block our project, there’s nothing we can do to get our stuff back out of there.

We have to work to minimize that risk, and to limit the effects if it still happens.

A proprietary software platform holds our code just as much hostage as any free or open source software platform would, simply by the fact that we let someone else host it. They run the servers our code is stored on.

If GitHub takes the ball and goes home

No matter which service we use, there’s always a risk that they will turn off the lights one day and not come back – or just change the rules or licensing terms in ways that would prevent us from staying. We cannot avoid that risk. But we can make sure that we’re smart about it and have a contingency plan, or at least an idea of what to do when that day comes.

If GitHub shuts down immediately and we get zero warning to rescue anything at all from them, what would be the result for the curl project?

Code. We would still have the entire git repository with all code, all source history and all existing branches up until that point. We’re hundreds of developers who pull that repository frequently, and many automatically, so there’s a very distributed backup all over the world.

CI. Most of our CI setup is done with yaml config files in the source repo. If we transition to another hosting platform, we could reuse them.

Issues. Bug reports and pull requests are stored on GitHub and a sudden exit would definitely make us lose some of them. We do daily “extractions” of all issues and pull requests, so a lot of the metadata could still be saved and preserved. I don’t think this would be a terribly hard blow either: we move long-standing bugs and ideas over to documents in the repository, so the currently open ones would likely get resubmitted again within the near future.

There’s no doubt that it would be a significant speed bump for the project, but it would not be worse than that. We could bounce back on a new platform and development would go on within days.

Low risk

It is rare for a service to suddenly go black with no warning and no heads-up, leaving projects completely stranded. In most cases, we get alerts and notifications and get a chance to transition cleanly and orderly.

There are alternatives

Sure there are alternatives. Both pure GitHub alternatives that look similar and provide similar services, and projects that would allow us to run similar things ourselves and host locally. There are many options.

I’m not looking for alternatives. I’m not planning to switch hosting anytime soon! As mentioned above, I think GitHub is a net positive for the curl project.

Nothing lasts forever

We’ve switched services several times before and I expect that we will change again in the future, for all sorts of hosting and related offerings that we provide to the work and to the developers and participants within the project. Nothing lasts forever.

When a service we use goes down or just turns sour, we will figure out the best possible replacement and take the jump. Then we patch up all the cracks the jump may have caused and continue the race into the future. Onward and upward. The way we know how, and the way we’ve done it for over twenty years already.

Credits

Image by Elias Sch. from Pixabay

Updates

After this blog post went live, some users remarked that I’m “disingenuous” in the list of reasons at the top that people have presented to me. This, because I don’t mention the moral issues that staying on GitHub presents – like for example previously reported workplace conflicts and their association with hideous American immigration authorities.

This is rather the opposite of disingenuous. This is the truth. Not a single person has ever asked me to leave GitHub for those reasons. Not me personally, and nobody has brought it up with the wider project either.

These are good reasons to discuss and consider when deciding if a service should be used. Have there been violations of “decency” significant enough to make us leave? Have we crossed that line in the sand? I’m leaning towards “no” for now, but I’m always listening to what curl users and developers say. Where do you think the line is drawn?

curl your own error message

The --write-out (or -w for short) curl command line option is a gem for shell script authors looking for more information from a curl transfer. Experienced users know that this option lets you extract things such as detailed timings, the response code, transfer speeds and sizes of various kinds. A while ago we even made it possible to output JSON.
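For a quick illustration (the URL here is just a placeholder), the JSON variant dumps all the available write-out variables as a single JSON object:

# print all write-out variables as JSON, discard the downloaded data
curl -w '%{json}' -o /dev/null -s https://example.com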

Maybe the best resource to learn more about it is the dedicated section in Everything curl. You’ll like it!

Now even more versatile

In curl 7.75.0 (released on February 3, 2021) we introduce five new variables for this option, and I’ll elaborate on some of the fun new things you can do with these!

These new variables were invented when we received a bug report that pointed out that when a user transfers many URLs in parallel and one or some of them fail – the error message doesn’t identify exactly which of the URLs failed. We should improve the error messages to fix this!

Or wait a minute. What if we provide enough details for --write-out to let the user customize the error message completely by themselves and thus get exactly the info they want?

onerror

Using this, you can specify a message that only gets written if the transfer ends in error, that is, with a non-zero exit code. An example could look like this:

curl -w '%{onerror}failed\n' $URL -o saved -s

If the transfer is OK, this says nothing. If it fails, the text on the right side of the “onerror” variable gets output. And that text can of course contain other variables!

This command line uses -s for “silent” to make it inhibit the regular built-in error message.

url

To help craft a good error message, maybe you want the URL included that was used in the transfer?

curl -w '%{onerror}%{url} failed\n' $URL

urlnum

If you pass more than one URL on the command line, it might be helpful to get the index number of the URL that was used. This is of course especially useful if you for example use the same URL multiple times in the same command line and just one of those transfers fails!

curl -w '%{onerror}URL %{urlnum} failed\n' $URL $URL

exitcode

The regular built-in curl error message shows the exit code, as it helps diagnose exactly what the problem was. Include that in the error message like:

curl -w '%{onerror}%{url} got %{exitcode}\n' $URL

errormsg

This is the human-readable explanation for the problem. The error message. Mimic the default curl error message like this:

curl -w '%{onerror}curl: %{exitcode} %{errormsg}\n' $URL

stderr

This “variable” existed already before, and it lets you make sure the output message is sent to stderr instead of stdout, which makes it even more like a real error message:

curl -w '%{onerror}%{stderr}curl: %{exitcode} %{errormsg}\n' $URL

More

These new variables work fine after %{onerror}, but of course they also work just as well when there was no error, and they work whether you use -Z for parallel transfers or do the transfers serially, one after the other.
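Putting these together, here is a sketch (the example URLs are just placeholders) of a parallel run where each failed transfer reports its own tailored message on stderr while the successful ones stay silent:

# two parallel downloads with a custom per-URL error message
curl -Z -s \
  -w '%{onerror}%{stderr}URL %{urlnum} (%{url}) failed: curl: %{exitcode} %{errormsg}\n' \
  -o /dev/null -o /dev/null \
  https://example.com/one https://example.com/two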

More on less curl memory

tldr: curl uses 30K of dynamic memory for downloading a large HTTP file, plus the size of the download buffer.

Back in September 2020 I wrote about my work to trim curl allocations done for FTP transfers. Now I’m back again on the memory use in curl topic, from a different angle.

This time, I learned about the awesome tool pahole, which can (among other things) show structs and their sizes from a built library – and while embracing this fun toy, I ran some scripts on a range of historic curl releases to get a sense of how we’re doing over time, memory size and allocation wise.

The task I set for myself was: figure out how the sizes of key structs in curl have changed over time, and correlate that with the number and size of allocations done at run-time. This is to make sure that trimming down the size of a specific struct doesn’t just make another allocation grow somewhere else instead, nullifying the gain. I want to make sure we’re not slowly degrading – and if we are, we should at least know about it!

Also: we keep developing curl at a fairly good pace and we add features in almost every release. Some growth is to be expected and should be tolerated, I think. We also keep the build process very configurable so users with particular needs and requirements can switch off features and thus save memory.

Memory sizes in modern computing

Of course systems are growing every year and machines ship with more and more RAM, which also goes for the smallest machines. But there is still a vast number of systems out there with limited memory that want good Internet transfers as well. Also, keeping sizes down allows applications and systems to scale better: a 10% decrease in size can imply a 10% increase in the number of possible parallel transfers. curl, and especially libcurl, is still today in 2021 frequently used on machines with limited amounts of available memory. Sometimes with just a few megabytes of RAM.

Fixed configuration

For the tests I did for this, I used the exact same build configuration for all versions tested. The sizes and behavior will vary greatly depending on config, but I tried to use a fairly complete and typical build to see how code and memory use looks for “most” users. I ran everything on my x86_64 Debian Linux dev machine. My focus is on curl versions from the last 3-4 years. I figured going back to ancient times wouldn’t help here.

Key structs

struct Curl_easy – this is the “easy handle”, what is allocated by curl_easy_init() and is the anchor for every transfer done with libcurl, no matter which API you’re using. An application creates one of these for each concurrent transfer it wants to do or keep around. Some applications allocate hundreds or even thousands of these.

struct Curl_multi – this is the “multi handle”, allocated with curl_multi_init(). This handle is created by applications as a holder of many concurrent transfers so applications typically do not have a very large amount of these.

struct connectdata – this is an internal struct that isn’t visible externally to applications. It is the holder of connection related data for a connection to a specific server. The connection pool curl uses to handle persistent connections will hold a number of these structs in memory after the transfer has completed, to allow subsequent reuse. The size of the connection pool is customizable. A busy application doing lots of transfers might end up with a sizeable number of connections in the pool, so the size of this struct adds up.

Dynamic allocations

In early curl history, the download and upload buffers for transfers were part of the Curl_easy struct, which made it fairly large.

In curl 7.53.0 (February 2017) the download buffer was made dynamically sized and has since then been allocated separately. Before that transition, curl 7.52.0 had a Curl_easy struct that was 36584 bytes, which included both the download and the upload buffers. In 7.58.0 the size was down to 21264 bytes, since the download buffer was by then allocated separately and could also be made much larger than the previous fixed 16KB size.

The 16KB upload buffer was moved out of the Curl_easy handle in the 7.62.0 release (October 2018) to be done on demand – which of course especially benefits everyone who doesn’t do uploads at all… The size of this struct was then down to 6208 bytes.

In curl 7.71.0 we also made the download buffer allocated on demand, and immediately freed after the transfer completes. This makes applications that keep handles around for reuse use significantly less memory. Applications are generally encouraged to keep the handles around to better facilitate connection reuse.

Struct size development

Curl_easy

The size in bytes of struct Curl_easy the last few years:

 7.52.0 36584
 7.58.0 21264
 7.61.0 21344
 7.62.0 6208
 7.63.0 6216
 7.64.0 6312
 7.65.0 5976
 7.66.0 6024
 7.67.0 6040
 7.68.0 6040
 7.69.0 6040
 7.70.0 6080
 7.71.0 6448
 7.72.0 6472
 7.73.0 6464
 7.74.0 6512

Current git: 5272 bytes (-19% from last release). With this, the struct is smaller than it has ever been before.

How did we make this extra reduction? Primarily, I noticed that we had a DoH related struct in the handle by default, which was turned into an on-demand allocation. DoH is still rare and that data only needs to exist during the name resolving phase.

Curl_multi

The size in bytes of struct Curl_multi over the last few years has remained very stable and it keeps being very small. Notable is that when we removed pipelining support in 7.65.0, it took away 96 bytes from this struct.

 7.50.0 384
 7.52.0 384
 7.58.0 480
 7.59.0 488
 7.60.0 488
 7.61.0 512
 7.62.0 512
 7.63.0 512
 7.64.0 512
 7.65.0 416
 7.66.0 416
 7.67.0 424
 7.68.0 432
 7.69.0 416
 7.70.0 416
 7.71.0 416
 7.72.0 416
 7.73.0 416
 7.74.0 416

Current git: 416 bytes.

With this, we’re smaller than we were in the beginning of 2018.

connectdata

The size in bytes of struct connectdata. It’s been both up and down.

 7.50.0 1904 
 7.52.0 2104 
 7.58.0 2112 
 7.59.0 2112 
 7.60.0 2112 
 7.61.0 2128 
 7.62.0 2152 
 7.63.0 2160 
 7.64.0 2160 
 7.65.0 1944 
 7.66.0 1960 
 7.67.0 1976 
 7.68.0 1976 
 7.69.0 2600 
 7.70.0 2608 
 7.71.0 2608 
 7.72.0 2624 
 7.73.0 2640 
 7.74.0 2656

Current git: 1472 bytes (-44% from last release)

The size bump in 7.69.0 was the insertion of a new struct for state data when doing SOCKS connections non-blocking, and the corresponding decrease again for the pending release is the removal of the buffer from that struct. With this, we’re down to a size we had a very long time ago.

Run-time memory use

To make sure that we don’t just move memory to other on-demand buffers that we need to allocate anyway, I ran a script over a lot of curl versions and counted the number of allocations needed and the peak amount of memory allocated, for a plain 512MB download over HTTP from localhost. The counted allocations were only the ones done by curl code (malloc, calloc, realloc, strdup etc).

Number of allocations

There are many reasons to allocate memory, and while we want to keep the number of allocations down, lots of factors of course need to be taken into account.

In the list below you’ll see that we clearly had some mistake in 7.52.0, and perhaps some more versions, as it did over 32,000 allocations. The situation was fixed in or before 7.58.0 and I haven’t bothered to go back to check exactly what it was.

 7.52.0 32883 
 7.58.0 82 
 7.59.0 82 
 7.60.0 82 
 7.61.0 82 
 7.62.0 86 
 7.63.0 87 
 7.64.0 87 
 7.65.0 82
 7.66.0 101 
 7.67.0 107 
 7.68.0 111 
 7.69.0 113 
 7.70.0 113 
 7.71.0 99 
 7.72.0 99 
 7.73.0 96 
 7.74.0 96

Current git: 96 allocations.

We do more allocations than some years back, but I think it is still reasonable growth.

Peak memory allocations

Here are some developments to look closer at!

If we start out looking at the oldest versions in my test, we can see that they allocate less than 100KB – but we need to take into account the fact that back then we used a fixed 16KB download buffer. In curl 7.54.1 we bumped the default buffer size the curl tool uses to 100K, which in the table below is visible in the 7.58.0 numbers.

 7.50.0 84473 
 7.52.0 85329 
 7.58.0 174243 
 7.59.0 174315 
 7.60.0 174339 
 7.61.0 174531 
 7.62.0 143886 
 7.63.0 143928 
 7.64.0 144128 
 7.65.0 143152 
 7.66.0 168188 
 7.67.0 173365 
 7.68.0 168575
 7.69.0 169167
 7.70.0 169303 
 7.71.0 136573
 7.72.0 136765 
 7.73.0 136875 
 7.74.0 137043

Current git: 131680 bytes.

The gain in 7.62.0 was mostly the removal of the default allocation of the upload buffer, which isn’t used in this test…

The current size tells me several things. We’re at a memory consumption level that is probably at its lowest point in the last decade – while at the same time having more features and being better than ever before. If we deduct the download buffer we have 29280 additional bytes allocated. Compare this to 7.50.0 which allocated 68089 bytes on top of the download buffer!

If I change my curl to use the smallest download buffer size allowed by libcurl (1KB) instead of the default 100KB, it ends up peaking at: 30304 bytes. That’s 44% of the memory needed by 7.50.0.

In my opinion, this is very good.

It might also be worth reiterating that this is with a full featured libcurl build. We can shrink even further if we switch off undesired features or just go tiny-curl.
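As a rough sketch of what switching off features can look like (the exact set of configure flags depends on your needs and curl version, so treat this as an illustration rather than a recipe):

# example only: configure a build with a number of protocols and optional libraries switched off
./configure --disable-ftp --disable-ldap --disable-telnet --disable-dict \
  --disable-gopher --disable-imap --disable-pop3 --disable-smtp \
  --disable-rtsp --disable-tftp --without-libidn2 --without-brotli

Every protocol or dependency that is compiled out removes both code and its associated data from the build.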

I hope this goes without saying, but of course all of this work has been done with the API and ABI still intact.

Graphs?

You know I like graphs, but for now I decided this blog post and analysis was enough. I’m going to think about how we can perhaps get this info generated in a more regular and automated way in the future. Not sure it is worth spending a lot of effort on though.

Reproduce

  1. Build curl with the --enable-debug option to configure. Don’t use the threaded resolver – I use the c-ares one – because it otherwise breaks the memdebug system. (A configure sketch follows after the script below.)
  2. Run your command line with tracing enabled and then run memanalyze on the log:
#!/bin/sh
# point the memdebug system to a log file for all allocations
export CURL_MEMDEBUG=/tmp/curlmem.log
# do the 512MB download from localhost, discarding the data
./src/curl -v localhost/512M -o /dev/null
# summarize allocation counts and peak memory use from the log
./tests/memanalyze.pl -v /tmp/curlmem.log
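For step 1, a configure invocation along these lines should do – a sketch assuming c-ares is installed; adjust the rest to your own environment:

# debug build with the memdebug system, using the c-ares resolver instead of the threaded one
./configure --enable-debug --enable-ares
make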

To get the struct sizes, just run pahole on the static libcurl lib after the build:

pahole -s lib/.libs/libcurl.a > sizes.txt
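If you want to dig into one particular struct, its members and padding holes, pahole can also be pointed at a single type (assuming your pahole version supports the -C option):

pahole -C Curl_easy lib/.libs/libcurl.a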

Credits

The photo was taken by me, in Siem Reap, Cambodia. “A smaller transport”

Updates

After the initial posting of this article I optimized the structs even further so the numbers have been updated since then to reflect the state of what’s in git a week later.

everything.curl.dev

The online version of the curl book “everything curl” has been moved to the address shown in the title:

everything.curl.dev

This, after I did a very unscientific and highly self-selective poll on Twitter on January 18 2020.

The old name (which you can see was the least selected in the poll) will now redirect to the new host name, and so will everything.curl.se.

curl.dev

I have been the owner of this domain for a while now, but we haven’t yet figured out what to do with it – this is the first use of curl.dev for real content.

If you have ideas of how we can improve curl’s web presence with this domain, please let me know! I do not want to move the official curl web site again from its new home at curl.se, that’s not what I would call a productive idea.

bye bye svn.haxx.se

It isn’t actually going away. It’s just been thrown over the fence to the Apache project and Subversion itself to host and maintain going forward.

Mail archive

When the Subversion project started in early 2000, I was there. I joined the project and participated in the early days of its development, as I really believed in creating an “improved CVS” and I thought I could contribute to it.

While I was involved with the project, I noticed the lack of a decent mailing list archive for the discussions and set one up under the name svn.haxx.se as a service for myself and for the entire community. I had the server and the means to do it, so why not?

After some years I drifted away from the project. It was doing excellently and I was never any significant contributor. Then git and some of the other distributed version control systems came along and in my mind they truly showed the world how version control should be done…

The mailing list archive, however, I left running, and I even added more Subversion related lists to it over time. It kept chugging along without me having to do much. Mails flew in, got archived and were made available for the world to search and link to. Today it has over 390,000 emails archived from over twenty years of rather active open source development on multiple mailing lists. It is fascinating that no fewer than 46 people have written more than a thousand emails each on those lists during these two decades.

Transition complete

The physical machine that runs the website is going to be shut down and taken out of service soon, and instead of just shutting down this service I’ve worked with the good people in the Subversion project, and the hosting of that site and archive has now been taken over by the Apache project. It is no longer running on my machine. If you discover any issues with it, you need to talk to them.

Today, January 20 2021, I updated the DNS to instead have the host name svn.haxx.se point to Apache’s web server. I believe the plan is to keep the site as an archive of past emails and not add any new emails to it as of now.

I’m out

I hereby sign off my twenty years of service as an svn email archive janitor. It was a pleasure to serve you.

Food on the table while giving away code

I founded the curl project in early 1998, but had by then already been working on the code since November 1996. The source code was always open, free and available to the world. The term “open source” actually wasn’t even coined until early 1998, just weeks before curl was born.

In the beginning of course, the first few years or so, this project wasn’t seen or discovered by many and just grew slowly and silently in a dusty corner of the Internet.

Already when I shipped the first versions I wanted the code to be open and freely available. For years I had seen the cool free software put out in the world by others and I wanted my work to help build this communal treasure trove.

License

When I started this journey I didn’t really know what I wanted with curl’s license and exactly what rights and freedoms I wanted to give away and it took a few years and attempts before it landed.

The early versions were GPL licensed, but as I learned about resistance from proprietary companies and thought about it further, I changed the license to be more commercially friendly and to match my conviction better. I ended up with MIT after a brief experimental time using MPL. (It was easy to change the license back then because I owned all the copyrights at that point.)

To be exact: we actually have a slightly modified MIT license with some very subtle differences. The reasons for the changes have been forgotten and we didn’t get those commits logged in the “big transition” to Sourceforge that we did in late 1999… The end result is that this is now often recognized as “the curl license”, even though it is in effect the MIT license.

The license says everyone can use the code for whatever purpose and nobody is required to ship any source code to anyone, but they cannot claim they wrote it themselves and the license/use of the code should be mentioned in documentation or another relevant location.

As licenses go, this has to be one of the most frictionless ones there is.

Copyright

Open source relies on a solid copyright law and the copyright owners of the code are the only ones who can license it away. For a long time I was the sole copyright owner in the project. But as I had decided to stick to the license, I saw no particular downside in letting contributors (of significant contributions) retain the copyright on the parts they brought. I didn’t want to use copyright as a fence that makes contributing harder.

Today, in early 2021, I count 1441 copyright strings in the curl source code git repository. 94.9% of them have my name.

I never liked how some projects require copyright assignments or license agreements to be able to submit code or patches. Partly because of the huge administrative burden it adds to the project, but also because of the significant friction and barrier to entry they create for new contributors, and the imbalance: some get more rights than others. I’ve always worked on making it easy and smooth for newcomers to start contributing to curl. It doesn’t happen by accident.

Spare time

In many ways, running a spare time open source project is easy. You just need a steady income from a “real” job and sufficient spare time, and maybe a server to host stuff on for the online presence.

The challenge is of course to keep developing it, adding things people want, helping users with problems and addressing issues in a timely manner. Especially if you happen to be lucky and the number of users increases and the project grows in popularity.

I ran curl as a spare time project for decades. Over the years it became more and more common that users who submitted bug reports or asked for help were actually doing that during their paid work hours, because they used curl in a commercial setting – which sometimes made the situation almost absurd. The ones who actually got paid to work with curl were asking the unpaid developers to help them out.

I changed employers several times. I started my own company and worked as my own boss for a while. I worked for Mozilla on network stuff in Firefox for five years. But curl remained a spare time project because I couldn’t figure out how to turn it into a job without risking the project or my personal finances.

Earning a living

For many years it was a pipe dream for me to be able to work on curl as a real job. But how do I actually take the step from a spare time project to doing it full time? I give away all the code for free, and it is a solid and reliable product.

The initial seeds were planted when I met and got to know Larry (wolfSSL CEO) and some of the other good people at wolfSSL back in the early 2010s. This, because wolfSSL is a company that writes open source libraries and offers commercial support for them – proving that it can work as a business model. Larry always told me he thought there was a possibility waiting there for me with curl.

Apart from the business angle, if I were able to work more on curl it could really benefit the curl project, and then of course indirectly everyone who uses it.

It was still a step to take. When I gave up on Mozilla in 2018, it just took a little thinking before I decided to try it. I joined wolfSSL to work on curl full time. A dream came true and finally curl was not just something I did “on the side”. It only took 21 years from the first curl release to reach that point…

I’m living the open source dream, working on the project I created myself.

Food for free code

We sell commercial support for curl and libcurl. Companies and users that need a helping hand or swift assistance with their problems can get it from us – and with me here I dare to claim that there’s no company anywhere else with the same ability. We can offload engineering teams with their curl issues. Up to 24/7 level!

We also offer custom curl development, debugging help, porting to new platforms and basically any other curl related activity you need. See more on the curl product page on the wolfSSL site.

curl (mostly in the shape of libcurl) runs in ten billion installations: some five, six billion mobile phones and tablets – used by several of the most downloaded apps in existence, in virtually every website and Internet server. In a billion computer games, a billion Windows machines, half a billion TVs, half a billion game consoles and in a few hundred million cars… curl has been made to run on 82 operating systems on 22 CPU architectures. Very few software components can claim a wider use.

“Isn’t it easier to list companies that are not using curl?”

Wide use and being recognized does not put food on the table. curl is also totally free to download, build and use. It is very solid and stable. It performs well, is well documented, well tested and “battle hardened”. It “just works” for most users.

Pay for support!

How to convince companies that they should get a curl support contract with me?

Paying customers get to influence what I work on next. Not only distant road-mapping but also how to prioritize short term bug-fixes etc. We offer a guaranteed response time.

Your issues get first in line to be fixed. Customers also won’t risk getting their issues added to the known bugs document and put in the attic to be forgotten. We can help customers make sure their applications use libcurl correctly and in the best possible way.

I try to emphasize that by getting support from us, customers can take those tasks away from their own engineers, and because we are faster and better at curl related issues, that is a pure net gain economically. For all of us.

This is not an easy sell.

Sure, curl is used by thousands of companies everywhere, but most of them do it because it’s free (in all meanings of the word), functional and available. There’s a real challenge in identifying those that actually use it enough and value the functionality enough that they realize they want to improve their curl foo.

Most of our curl customers purchased support first when they faced a complicated issue or problem they couldn’t fix themselves – a fact that gives me a weird (to the wider curl community) incentive to not fix some problems too fast, because doing so works against my ability to gain new customers!

We need paying customers for this to be sustainable. When wolfSSL has a sustainable curl business, I get paid and the work I do in curl benefits all the curl users; paying as well as non-paying.

Dual license

There’s clearly business in releasing open source under a strong copyleft license such as GPL and, as long as you keep the copyrights, offering customers the option to purchase that same code under another, more proprietary-friendly license. The code is still open source and anyone doing totally open things can still use it freely and at no cost.

We’ve shipped tiny-curl to the world licensed under GPLv3. Tiny-curl is a curl branch with a strong focus on the tiny part: the idea is to provide a libcurl more suitable for smaller systems, the ones that can’t even run a full Linux but rather use an RTOS.

Consider it a sort of experiment: are users interested in getting a smaller curl onto their products, and are they interested in paying for licensing? So far, tiny-curl supports two separate RTOSes to which we haven’t ported the “normal” curl.

Keeping things separate

Maybe you don’t realize this, but I work hard to keep separate things compartmentalized. I am not curl, curl is not wolfSSL and wolfSSL is not me. But we all overlap greatly!

The Daniel + curl + wolfSSL trinity

I work for wolfSSL. I work on curl. wolfSSL offers commercial curl support.

Reserved features

One idea that we haven’t explored much yet is the ability to make and offer “reserved features” to paying customers only. This would of course be another motivation for companies to become curl support customers.

Such reserved features would still have to be sensible for the curl project, and most likely we would provide them as specials for paying customers for a period of time and then merge them into the “real” open source curl project. It is very important to note that this would not in any way make the “regular curl” worse or a lesser citizen. It would rather be like a separate product, a curl+ with extra stuff on top of vanilla curl.

Since we haven’t ventured into this area yet, we haven’t worked out all the details. Chances are we will wander into this territory soon.

Other food-generators

I do occasional speaking gigs on curl and HTTP related topics but even if I charge for them this activity never brings much more than some extra pocket money. I do it because it’s fun and educational.

It has been suggested that I should create a web shop to sell curl branded merchandise, like t-shirts, mugs, etc, but I think that grossly over-estimates the user interest and how much margin I could put on mundane things just because they have a curl logo glued on them. Also, I would have a difficult time mentally selling curl things and claiming the profit personally. I’d rather keep giving away curl stash (mostly stickers) for free as a means to market the project and, long term, encourage users into buying support.

Donations

We receive money to the curl project through donations, most of them via our opencollective account. It is important to note that even if I’m a key figure in the project, this is not my money and it’s not my project. Donated money is spent on project related expenses, which so far primarily means our bug bounty program. We’ve avoided spending donated money on direct curl development, and especially such that I could provide or benefit from myself, as that would totally blur the boundaries. I’m not ruling out taking that route in the future though. As long as, and only if, it is to the project’s benefit.

Donations via GitHub to me personally sponsor me personally and end up in my pockets. That’s not curl money, but I spend it mostly on curl development, equipment etc, and it means I don’t have to think twice when sending curl stickers to fans and friends all over the world. It contributes to food on my table and I like to think that an occasional beer I drink is sponsored by friends out there!

The future I dream of

We get a steady number of companies paying for support at a level that allows us to also pay for a few more curl engineers than myself.

Credits

Image by Khusen Rustamov from Pixabay

Age is just a number or two

Kjell, a friend of mine, mailed me a zip file this morning saying he’d found an earlier version of “urlget” lying around. Meaning: an older version than what we provide on the curl download page. urlget was the name we used for the command line tool before we changed the name to curl in March 1998.

I’ve been reckless with some of the source code and with keeping track of early history, so this made me curious and I glanced through the source code for urlget 2.4, shipped in October 1997. Kjell had found a project of his own where he’d imported the urlget sources, as this was from before the days when curl was also a library.

In this source code I also found the original URL to the home page for urlget and its predecessor, httpget: http://www.inf.ufrgs.br/~sagula/urlget.html

I don’t know if I have this info stored somewhere else too, but the important thing here is that it then struck me that I hadn’t checked the Internet Archive for what it has archived for this URL!

The earliest archived version of the urlget page is from February 16 1998. I checked, but there’s no archived version from slightly earlier when the tool was named just httpget. I did however find source code from httpget that was older than I had saved from before: httpget 1.3! 320 lines of hopelessly naive code. From April 14, 1997.

That date is also the only one with content, as the next archived version is just a redirect to the first curl web site over at http://www.fts.frontec.se/~dast/curl/

Two birthdays?

I used this newly found gem to update the curl history page with exact dates for some of the earliest releases and events, since it was previously not very specific there, as I hadn’t kept notes.

I could now also once and for all note that the first release of HttpGet (version 0.1) was done on November 11, 1996. My personal participation in the project began some days or weeks after that, as it is recorded that I provided improvements in the HttpGet 0.2 release that was done on December 17 the same year.

I’ve always counted the age of curl from March 20, 1998 which is when I first released something under the name “curl”, but since we released it as curl 4.0 that is certainly a sign that the time up to that point could possibly also be counted into its age.

It’s not terribly important when to start the count.

What’s more fun with the particular HttpGet 0.1 release date is that it is the exact same date Wget was released for the first time under that name! It had previously been developed and released under a different name (“geturl”) and on exactly November 11, 1996 Hrvoje Nikšić released Wget 1.4.0 to the world.

Why not go with Wget?

People sometimes ask me why I didn’t use wget to get currencies that winter day back in 1996 when I found HttpGet and started to work on that HTTP client. But the fact is that not only were the search engines and software hosting alternatives clearly inferior back in those days, making software hard to find, wget was also very new to the world. I didn’t learn about the existence of wget until many months later – although I can’t recall exactly when or how.

I also think, looking back at myself at that time, that if I had found wget then, I would probably have thought it to be overkill for my use case and opted to use something else anyway. I mean, I was “just getting a HTTP page” and the wget package was 171KB compressed, while HttpGet 1.3 was still less than 8K in a single source code file… I’m not saying that way of thinking was right!

The curl year 2020

As we’re approaching the end of the year, I just want to sum up the curl year with a few words.

2020 has been another glorious year in the curl project. We’ve seen a series of accomplishments and introductions of new things during this, the year of the plague.

Accomplishments

I personally have done more commits in the git repository this year than in any other year since 2004 (890 so far).

The total number of commits done in git is the largest since 2014 (1445 plus some).

The number of published curl related CVEs is the lowest since 2013 (6). For the ones we announced, we could reward record amounts in our bug bounty program!

139 authors wrote commits that were merged (so far).

We did nine curl releases, out of which two unfortunately were quicker “panic releases” that patched up problems in the previous release.

Seven changes to remember

We’ve logged no less than 905 bug-fixes and 30 changes in the releases of this year, but the seven perhaps most memorable things we’ve introduced in 2020 are…

Videos

This year I’ve introduced the concept of doing a “release presentation” for every release. Those are videos where I go through and discuss the changes, the security releases and some interesting bug-fixes. Each release links to those from the changelog page on the website.

New home

This is the year when we finally got ourselves a curl domain. curl.se is our new home.

What didn’t happen

We cancelled curl up 2020 due to Covid-19. It was planned to happen in Berlin. We did it purely online instead. We’re not planning any new physical curl up for 2021 either. Let’s just wait and see what happens with the pandemic next year and hope that we might be able to go back and have a physical meetup in 2022…

It is a curl world

curl supports NASA

Not everyone understands how open source is made. I received the following email from NASA a while ago.

Subject: Curl Country of Origin and NDAA Compliance

Hello, my name is [deleted] and I am a Supply Chain Risk Management Analyst at NASA. As such, I ensure that all NASA acquisitions of Covered Articles comply with Section 208 of the Further Consolidated Appropriations Act, 2020, Public Law 116-94, enacted December 20, 2019. To do so, the Country of Origin (CoO) information must be obtained from the company that develops, produces, manufactures, or assembles the product(s). To do so, please provide an email response or a formal document (a PDF on company letterhead is preferred, but a simple statement is sufficient) specifically identifying the country, or countries, in which Curl is developed and maintained

If the country of origin is outside the United States, please provide any information you may have stating that testing is performed in the United States prior to supplying products to customers. Additionally, if available, please identify all authorized resellers of the product in question.

Lastly, please confirm that Curl is not developed by, contain components developed by, or receive substantial influence from entities prohibited by Section 889 of the 2019 NDAA. These entities include the following companies and any of their subsidiaries or affiliates:

Hytera Communications Corporation
Huawei Technologies Company
ZTE Corporation
Dahua Technology Company
Hangzhou Hikvision Digital Technology Company

Finally, we have a time frame of 5 days for a response.
Thank you,

My answer

Okay, I first considered going with strong sarcasm in my reply, due to the complete lack of understanding and the implied threat in that last line. What would happen if I didn’t respond in time?

Then it struck me that this could be my chance to once and for all get a confirmation if curl is already actually used in space or not. So I went with informative and a friendly tone.

Hi [name],

I will answer to these questions below to the best of my ability, and maybe you can answer something for me?

curl (https://curl.se) is an open source project that creates two products, curl the command line tool and libcurl the library. I am the founder, lead developer and core maintainer of the project. To this date, I have done about 57% of the 26,000 changes in the source code repository. The remaining 43% have been done by 841 different volunteers and contributors from all over the world. Their names can be extracted from our git repository: https://github.com/curl/curl

You can also see that I own most, but not all, copyrights in the project.

I am a citizen of Sweden and I’ve been a citizen of Sweden during the entire time I’ve done all and any work on curl. The remaining 841 co-authors are from all over the world, but primarily from western European countries and the US. You could probably say that we live primarily “on the Internet” and not in any particular country.

We don’t have resellers. I work for an American company (wolfSSL) where we do curl support for customers world-wide.

Our testing is done universally and is not bound to any specific country or region. We test our code substantially before release.

Me knowingly, we do not have any components or code authored by people at any of the mentioned companies.

So finally my question: can you tell me anything about where or for what you use curl? Is it used in anything in space?

Regards,
Daniel

Used in space?

Of course my attempt was completely in vain and the answer back was very brief and it just said…

“We are using curl to support NASA’s mission and vision.”

Credits

Space ship image by Elias Sch. from Pixabay

How my Twitter hijacks happened

You might recall that my Twitter account was hijacked and then again just two weeks later.

The first: brute-force

The first take-over was most likely a case of brute-forcing my weak password while not having 2FA enabled. I have no excuse for either of those lapses. I had convinced myself I had 2FA enabled, which made me take a (too) lax attitude to my short 8-character password that was possible to remember. Clearly, 2FA was not enabled and the only remaining wall against the evil world was that weak password.

The second time

After that first hijack, I immediately changed the password to a strong many-character one and I made really sure I enabled 2FA with an authenticator app, and I felt safe again. Yet it would only take seventeen days until I was again locked out of my account. This second time, I could see that someone had managed to change the email address associated with my account (displayed when I wanted to reset my password). With the password not working and the account no longer having the correct email address, I could not reset the password, and my 2FA status had no effect. I was locked out. Again.

It felt related to the first case because I’ve had my Twitter account since May 2008. I had never lost it before and then suddenly after 12+ years, within a period of three weeks, it happens twice?

Why and how

How this happened was a complete mystery to me. The account was restored fairly swiftly but I learned nothing from that.

Then someone at Twitter contacted me. After they investigated what had happened and how, I had a chat with a responsible person there and he explained to me exactly how this went down.

Had Twitter been hacked? Is there a way to circumvent 2FA? Were my local computer or phone compromised? No, no and no.

Apparently, an agent at Twitter who was going through the backlog of issues, where my previous hijack issue was still present, accidentally changed the email address on my account, probably confusing it with another account in another browser tab.

There was no outside intruder, it was just a user error.

Okay, the cynics will say, this is what he told me and there is no evidence to back it up. That’s right, I’m taking his words as truth here but I also think the description matches my observations. There’s just no way for me or any outsider to verify or fact-check this.

A brighter future

They seem to already have identified things to improve to reduce the risk of this happening again and Michael also mentioned a few other items on their agenda that should make hijacks harder to do and help them detect suspicious behavior earlier and faster going forward. I was also happy to provide my feedback on how I think they could’ve made my lost-account experience a little better.

I’m relieved that the second time at least wasn’t my fault, and that none of my systems were breached or hacked (as far as I know).

I’ve also now properly and thoroughly gone over all my accounts on practically all online services I use and made really sure that I have 2FA enabled on them. On some of them I’ve also changed my registered email address to one with 30 random letters to make it truly impossible for any outsider to guess what I use.

(I’m also positively surprised by this extra level of customer care Twitter showed for me and my case.)

Am I a target?

I don’t think I am. I think maybe my Twitter account could be interesting to scammers since I have almost 25K followers and a verified account. Personally, I work primarily with open source and most of my work is already public. I don’t deal in business secrets. I don’t think my personal stuff attracts attackers more than anyone else’s does.

What about the risk or the temptation for bad guys in trying to backdoor curl? It is after all installed in some 10 billion systems world-wide. I’ve elaborated on that before. Summary: I think it is terribly hard for someone to actually manage to do it. Not because of the security of my personal systems perhaps, but because of the entire setup and all processes, signings, reviews, testing and scanning that are involved.

So no. I don’t think my personal systems are a valued, singled-out target for attackers.

Now, back to work!

Credits

Image by Gerd Altmann from Pixabay
