Tag Archives: cURL and libcurl

curl dash-dash-json

The curl “cockpit” is yet again extended with a new command line option: --json. The 245th command line option.

curl is a generic transfer tool for sending and receiving data across networks and it is completely agnostic as to what it transfers or even why.

To allow users to craft all sorts of transfers, or requests if you will, it offers a wide range of command line options. This flexibility has made it possible for a large number of users to keep using curl even as network ecosystems and its (HTTP) use have changed over time.

Craft content to send

curl offers a few convenience options for creating contents to send in HTTP “uploads” . Options such as -F for building multi-part formposts, and --data-urlencode for URL-encoding POST data.

When curl was born, JSON didn’t exist. Over the decades since, JSON has grown to become a very commonly used format for structured data, often used in “REST API” calls and more. curl command lines sending and receiving JSON are very common now.

When asking users today what features they would like to see added in curl, and when asking them what good features they see in curl alternatives, a huge chunk of them mention JSON support in one way or another.

Partly because dealing with JSON with curl sometimes make the involved command lines rather long and quirky.

JSON in curl

The discussion has been ignited in the curl community about what, if anything, we should do in curl to make it a smoother and better tool when working with JSON. The offered opinions range from nothing (“curl is content agnostic”) to full-fledged JSON generator, parser and pretty-printer (or a combination in between). I don’t think we are ready to put our collective foot down on where we should go with this.

Introducing --json

While the discussion is ongoing, we still bring a first JSON oriented feature to the tool as a first step. The --json option. This is a new option that basically works as an alias, or shortcut, to sending JSON to an endpoint. You use it like this:

curl --json '{"tool": "curl"}' https://example.com/

Send JSON from a file:

curl --json @json.txt https://example.com/

Or send JSON passed to curl on stdin:

echo '{"a":"b"}' | curl --json @- https://example.com/

This change does not affect the libcurl library at all, it is only done the tool side of things.

Details

The --json option is the equivalent of setting

--data [arg]
--header "Content-Type: application/json"
--header "Accept: application/json"

If you want another Content-Type or Accept header, then you can override them as usual with -H. If you use multiple --json options on the same command line, the contents from them will be concatenated before sent off.

Ships!

This new json option was merged in this commit and will be available and present in curl 7.82.0, to be released in early March 2022. Of course you can build it from git or a daily snapshot already now to test it out!

Future JSON stuff

I want to at least experiment a little with some level of JSON creation for curl. To help users who want to create basic JSON on the command line to send off. Lots of users run into problems with this since JSON uses double quotes a lot and then the combination of quoting and unquoting and the shell often make users confused or even frustrated.

Exactly how or to what level is still to be determined. Your opinions and feedback are very valuable and helpful.

jo+curl+jq

This trinity of commands helps you send correctly formatted data and show the returned JSON easily. Like in a command line similar to:

jo name=daniel tool=curl | curl --json @- https://httpbin.org/post | jq

jo, jq

curl with rust

I did an online presentation with this name for the Rust Linz meetup, on January 27 2022. This is the recording:

The individual slides are also available.

Content quickly

In this presentation I talk about how libcurl’s most important aspect is the stable ABI and API. If we just maintain those, we can change the internals however we like.

libcurl has a system with build-time selected components, called backends. They are usually powered by third party libraries. With the recently added HTTP backend there are now seven “flavors” of backends.

A backend can be provided by a library written in rust, as long as that rust component provides a C interface that libcurl can use. When it does, it being rust or not is completely transparent and a non-issue for libcurl.

curl currently supports components written in rust for three different backends:

None of these backends are yet “feature complete”, but we are moving slowly towards that. Your help is appreciated!

1,000 commit authors

In the curl project we switched to using git for source code control in March 2010. Since then we have exact tracking of every commit author to the project. In the times before git, we used CVS which doesn’t properly separate committers from authors.

The number of commit authors in curl has gradually been increasing over time.

1998-03-20: 1 author recorded in the premiere curl release. It actually took until May 30th 2001 for the second recorded committer (partly of course because we used CVS). 1167 days to bump up the number one notch.

2014-05-06: 250 authors took 5891 days to reach.

2017-08-18: 500 authors required additional 1200 days

2019-12-20: 750 authors was reached after only 854 more days

2022-01:30: reaching 1000 authors took an additional 772 days.

Jan-Piet Mens became the thousandth author by providing pull-request #8354.

These 1,000 authors have done in total 28,163 commits. This equals 28.2 commits on average while the median commit author only did a single one – more than 60% of the authors are one-time committers after all! Only fourteen persons in the project have authored more than one hundred commits.

It took us 8,717 days to reach 1,000 authors, making it one new committer in the project per 8.7 days on average, over the project’s life time so far. The latest 250 authors have “arrived” at a rate of one new author per 3.1 days.

In 2021 we had more commit authors than ever before and also more first-timers than ever. It will certainly be a challenge to reach that same level or maybe even break the record again in 2022.

It took almost 24 years to reach 1,000 authors. How long until we double this?

Visualized on video

To celebrate this occasion, I did a fresh gource video of curl development so far; for the duration of the first 1,000 committers. I spiced it up by adding 249 annotations of source related facts laid out in time.

Credits: the music is from bensound.com.

GitHub can’t count

If you look at the curl/curl repository on GitHub, it appears to be short of a few hundred committers. It says 792 right now.

This is because GitHub can’t count commit authors. The theory is that they somehow ignore committers that do not have GitHub accounts. My author count is instead based on plain git access and standard git commands run against a plain git clone:

$ git shortlog -s | wc -l
1000

Don’t mix URL parsers

I have had my share of adventures with URL parsers and their differences in the past. The current state of my research on the topic of (failed) URL interoperability remains available in this GitHub document.

Use one and only one

There is still no common or standard URL syntax format in sight. A string that you think looks like a URL passed to one URL parser might be considered fine, but passed to a second parser it might be rejected or get interpreted differently. I believe the state of URLs in the wild has never before been this poor.

The problem

If you parse a URL with parser A and make conclusions about the URL based on that, and then pass the exact same URL to parser B and it draws different conclusions and properties from that, it opens up not only for strange behaviors but in some cases for downright security vulnerabilities.

This is easily done when you for example use two different libraries, frameworks or libraries that need to work on that URL, but the repercussions are not always easy to see at once.

A well-known presentation on this topic from 2017 is Orange Tsai’s A New Era Of SSRF – Exploiting Url Parsers.

URL Parsing Confusion

The report EXPLOITING URL PARSERS: THE GOOD, BAD, AND INCONSISTENT (by Noam Moshe, Sharon Brizinov, Raul Onitza-Klugman and Kirill Efimov) was published today and I have had the privilege to have read and worked with the authors a little on this prior to its release.

As you see in the report, it shows that problems very similar to those mr Tsai reported and exploited back in 2017 are still present today, although perhaps in slightly different ways.

As the report shows, the problem is not only that there are different URL standards and that every implementation provides a parser that is somewhere in between both specs, but on top of that, several implementations often do not even follow the existing conflicting specifications!

The report authors also found and reported a bug in curl’s URL parser (involving percent encoded octets in host names) which I’ve subsequently fixed so if you use the latest curl that one isn’t present anymore.

curl’s URL API

In the curl project we attempt to help applications and authors to reduce the number of needed URL parsers in any given situation – to a large part as a reaction to the Tsai presentation from 2017 – with the URL API we introduced for libcurl in 2018.

Thanks to this URL parser API, if you are already using libcurl for transfers, it is easy to also parse and treat URLs elsewhere exactly the same way libcurl does. By sticking to the same parser, there is a significantly smaller risk that repeated parsing bring surprises.

Other work-arounds

If your application uses different languages or frameworks, another work-around to lower the risk that URL parsing differences will hurt you, is to use a single parser to extract the URL components you need in one place and then work on the individual components from that point on. Instead of passing around the full URL to get parsed multiple times, you can pass around the already separated URL parts.

Future

I am not aware of any present ongoing work on consolidating the URL specifications. I am not even aware of anyone particularly interested in working on it. It is an infected area, and I will get my share of blow-back again now by writing my own view of the state.

The WHATWG probably say they would like to be the steward of this and they are generally keen on working with URLs from a browser standpoint. It limits them to a small number of protocol schemes and from my experience, getting them to interested in changing something for the the sake of aligning with RFC 3986 parsers is hard. This is however the team that more than any other have moved furthest away from the standard we once had established. There are also strong anti-IETF sentiments oozing there. The WHATWG spec is a “living specification” which means it continues to change and drift away over time.

The IETF published RFC 3986 back in 2005, they saw the RFC 3987 pretty much fail and then more or less gave up on URLs. I know there are people and working groups there who would like to see URLs get brought back to the agenda (as I’ve talked to a few of them over the years) and many IETFers think that the IETF is the only group that can do it proper, but due to the unavoidable politics and the almost certain collision course against (and cooperation problems with) WHATWG, it is considered a very hot potato that barely anyone wants to hold. There are also strong anti-WHATWG feelings in some areas of the IETF. There is just a too small of a chance of a successful outcome from something that mostly likely will take a lot of effort, will, thick skin and backing from several very big companies.

We are stuck here. I foresee yet another report to be written a few years down the line that shows more and new URL problems.

My URL isn’t your URL.

curl 7.81.0 – more percent

There has been eight weeks since 7.80.0.

Release presentation

Numbers

the 205th release
1 change
56 days (total: 8,636)

121 bug-fixes (total: 7,518)
189 commits (total: 28,055)
0 new public libcurl function (total: 86)
1 new curl_easy_setopt() option (total: 295)

1 new curl command line option (total: 244)
53 contributors, 25 new (total: 2,558)
32 authors, 14 new (total: 990)
0 security fixes (total: 111)
0 USD paid in Bug Bounties (total: 16,900 USD)

Security

Today we celebrate our fourth consecutive release without any new vulnerability to fix and reveal.

Change

This release comes with just one change to note, but one that brings both a new libcurl setopt (CURLOPT_MIME_OPTIONS) and a new command line option (--form-escape). Starting now, libcurl defaults to percent encoding certain fields when doing multi-part HTTP formposts.

Bug-fixes

As usual, here’s a set of selected favorite bug-fixes of mine from this cycle:

require “see also” for every documented option in curl.1

When the curl command man page is generated at build time, the script now makes sure that there is a “see also” for each option. This will help users find related info. More mandatory information for each option makes us do better documentation that ultimately helps users.

lazy-alloc the table in Curl_hash_add()

The internal hash functions moved the allocation of the actual hash table from the init() function to when the first add() is called to add something to the table. This delay simplified code (when the init function became infallible ) and does even avoid a few allocs in many cases.

enable haproxy support for hyper backend

Plus a range of code and test cases adjusted to make curl built with hyper run better. There are now less than 30 test cases still disabled for hyper. We are closing in!

mbedTLS: add support for CURLOPT_CAINFO_BLOB

Users of this backend can now also use this feature that allows applications to provide a CA cert store in-memory instead of using an external file.

multi: handle errors returned from socket/timer callbacks

It was found out that the two multi interface callbacks didn’t at all treat errors being returned the way they were documented to do. They are now, and the documentation was also expanded to clarify.

nss:set_cipher don’t clobber the cipher list

Applications that uses libcurl built to use NSS found out that if they would select cipher, they would also effectively prevent connections from being reused due to this bug.

openldap: implement STARTTLS

curl can now switch LDAP transfers into LDAPS using the STARTTLS command much like how it already works for the email protocols. This ability is so far limited to LDAP powered by OpenLDAP.

openssl: define HAVE_OPENSSL_VERSION for OpenSSL 1.1.0+

This little mistake made libcurl use the wrong method to extract and show the OpenSSL version at run-time, which most notably would make libcurl say the wrong version for OpenSSL 3.0.1, which would rather show up as the non-existing version 3.0.0a.

sha256/md5: return errors when init fails

A few internal functions would simply ignore errors from these hashing functions instead of properly passing them back to the caller, making them to rather generate the wrong hash instead of properly and correctly returning an error etc.

curl: updated search for a file in the homedir

The curl tool now searches for personal config files in a slightly improved manner, to among other things make it find the same .known_hosts file on Windows as the Microsoft provided ssh client does.

url: check ssl_config when re-use proxy connection

A bug in the logic for checking connections in the connection pool suitable for reuse caused flaws when doing subsequent HTTPS transfers to servers over the same HTTPS proxy.

ngtcp2: verify server certificate

When doing HTTP/3 transfers, libcurl is now doing proper server certificate verification for the QUIC connection – when the ngtcp2 backend is used. The quiche backend is still not doing this, but really should.

urlapi: accept port number zero

Years ago I wrote a blog post about using port zero in URLs to do transfers. Then it turned out port zero did not work like that with curl anymore so work was done and now order is restored again and port number zero is once again fine to use for curl.

urlapi: provide more detailed return codes

There are a whole range of new error codes introduced that help better identify and pinpoint what the problem is when a URL or a part of a URL cannot be parsed or will not be accepted. Instead of the generic “failed to parse URL”, this can now better tell the user what part of the URL that was found out to be bad.

socks5h: use appropriate ATYP for numerical IP address

curl supports using SOCKS5 proxies and asking the proxy to resolve the host name, what we call socks5h. When using this protocol and using a numerical IP address in the URL, curl would use the SOCKS protocol slightly wrong and pass on the wrong “ATYP” parameter which a strict proxy might reject. Fixed now.

Coming up?

The curl factory never stops. There are many pull-requests already filed and in the pipeline of possibly getting merged. There will also, without any doubts, be more ones coming up that none of us have yet thought about or considered. Existing pending topics might include:

  • the ManageSieve protocol
  • --no-clobber
  • CURLMOPT_STREAM_WINDOW_SIZE
  • Remove Mesalink support
  • HAproxy protocol v2
  • WebSockets
  • Export/import SSL session-IDs
  • HTTP/3 fixes
  • more hyper improvements
  • CURLFOLLOW_NO_CUSTOMMETHOD

Next release

March 2, 2022 is the scheduled date for what will most probably become curl 7.82.0.

The curl year 2021

Every year is a curl year.

I’m saving my bigger summary for curl’s 24th birthday in March, but when reaching the end of a calendar year it feels natural and even fun to look back and highlight some of the things we accomplished and what happened during this particular lap around the sun. I decided to pick five areas to highlight.

This has been another great curl year and it has been my pleasure to serve this project working full time with it, and I intend to keep doing it next year as well.

Activities, contribution and usage have all grown. I don’t think there has ever before been a more curl year than 2021.

Contributions

In 2021, the curl project beats all previous project records in terms of contribution. More than 180 individuals authored commits to the source code repository, out of more than 130 persons were first-time committers. Both numbers larger than ever before.

The number of authors per month was also higher than ever before and we end the year with a monthly average of 25 authors.

The number of committers who authored ten or more commits within a single year lands on 15 this year. A new record, up from the previous 13 in 2014 and 2017.

We end this year with the amazing number of more than 2,550 persons listed as contributors. We are also very close to reaching 1,000 committers. We are just a dozen authors away. Learn how to help us!

I personally have done about 60% of all commits to curl in 2021 and I was awarded a GitHub star earlier this year. I was a guest on eight podcast episodes this year, talking curl at least partly in all of them.

New backends

This year we introduced support for two new backends in curl: hyper and rustls. I suppose it is a sign of the times that both of them are written in Rust and could be a stepping stone into a future with more curl components written in memory safe languages.

We actually got an increase in number of CVEs reported in 2021, 13 separate ones, after previously having had a decreasing trend the last few years. To remind us that security is still crucial!

Technically we merged the first hyper code already in late 2020 but we’ve worked on it through 2021 and this has made it work almost on par with the native code now.

None of these two new backends are yet used or exercised widely yet in curl, but we are moving in that direction. Slowly but surely.

Also backend related, during 2021 we removed the default TLS library choice when building curl and instead push that decision to get made by the person building curl. It refuses to build unless a choice is made.

The backends in curl provides build-time “pluggable” functionality.

Everything curl

In September 2015 I started to write Everything curl. The book to cover all there is to know and learn about curl. The project, the command line tool and the library.

When I started out, I wrote a lot of titles and sub-titles that I figured should be covered and detailed. For those that I didn’t yet have any text written I just wrote TBD. Over time I thought of more titles so I added more TBDs all over – and I created myself a script that would list which files that had the most number of TBDs outstanding. I added more and more text and explanations over time, but the more content I added I often thought of even more things that were still missing.

It took until December 15, 2021 to erase the final TBD occurrence! Six years and three months.

Presently, everything curl consists of more than 81,000 words in 12,000 lines of text. Done using more than 1,000 commits.

There are and probably always will be details missing and text that can be improved and clarified, but all the sections I once thought out should be there now at least are present and covered! I trust that users will tell us what we miss, and as we continue to grow and develop curl there will of course pop up new things to add to the book.

Death threat

In February 2021 I received a death threat by email. It is curl related because I was targeted entirely because my name is in the curl copyright statement and license and that is (likely) how the person found and contacted me. Months later, the person who sent me the threat apologized for his behavior.

It was something of a brutal awakening for me that reminded me with far too much clarity than I needed, that everything isn’t always just fun and games when people find my email address in their systems.

I filed a police report. I had a long talk with my wife. It shook my world there for a moment and it hinted of the abyss of darkness that lurk out there. I cannot say that it particularly changed my life or how I go about with curl development since then, but I think maybe it took away some of the rosy innocence out of the weird emails I get.

Mars

Not only did we finally get confirmation this year that curl is used in space – we learned that curl was used in the Mars 2020 Helicopter Mission! Quite possibly one of the coolest feats an open source project can pride itself with.

GitHub worked with NASA and have given all contributors to participating projects with a GitHub account a little badge on their profile. Shown here on the right. I think this fact alone might have helped attract more contributors this year. Getting your code into curl gets your contributions to places few other projects go.

There’s no info anywhere as to what function and purpose curl had exactly in the project and we probably will never know, but I think we can live with that. Now we are aiming for more planets.

Keeping curl safe

I’ve talked on this topic before but I realized I never did a proper blog post on the topic. So here it is: how we develop curl to keep it safe. The topic of supply chain security is one that is discussed frequently these days and every so often there’s a very well used (open source) component that gets a terrible weakness revealed.

Don’t get me wrong. Proprietary packages have their share of issues as well, and probably even more so, but for obvious reasons we never get the same transparency, details and insight into those problems and solutions.

curl

curl, in the shape of libcurl primarily, is one of the world’s most commonly used software components. It is installed in somewhere around ten billion installations world wide. It might even be forty billion. Nobody knows.

If we would find a critical vulnerability in curl, it could potentially exist in every internet-connected device on the globe. We don’t want that.

A critical security flaw in our products would be bad, but we also similarly need to make sure that we provide APIs and help users of our products to be safe and to use curl safely. To make sure users of libcurl don’t accidentally end up getting security problems, to the best of our ability.

In the curl project, we work hard to never have our own version of a “heartbleed moment“. How do we do this?

Always improving

Our method is not strange, weird or innovative. We simply apply all best practices, tools and methods that are available to us. In all areas. As we go along, we tighten the screws and improve our procedures, learning from past mistakes.

There are no short cuts or silver bullets. Just hard work and running tools.

Not a coincidence

Getting safe and secure code into your product is not something that happens by chance. We need to work on it and we need to make a concerned effort. We must care about it.

We all know this and we all know how to do it, we just need to make sure that we also actually do it.

The steps

  1. Write code following the rules
  2. Review written code and make sure it is clear and easy to read.
  3. Test the code. Before and after merge
  4. Verify the products and APIs to find cracks
  5. Bug-bounty to reward outside helpers
  6. Act on mistakes – because they will happen

Writing

For users of libcurl we provide an API with safe and secure defaults as we understand the power of the default. We also document everything with details and take great pride in having world-class documentation. To reduce the risk of applications becoming unsafe just because our API was unclear.

We also document internal APIs and functions to help contributors write better code when improving and changing curl.

We don’t allow compiler warnings to remain – on any platform. This is sometimes quite onerous since we build on such a ridiculous amount of systems.

We encourage use of source code comments and assert()s to make assumptions obvious. (curl is primarily written in C.)

Number of lines of (product) code in the curl project over time.

Review

All code should be reviewed. Maintainers are however allowed to review and merge their own pull-requests for practical reasons.

Code should be easy to read and understand. Our code style must be followed and encourages that: for example, no assignments in conditions, one statement per line, no lines longer than 80 columns and more.

Strict compliance with the code style also means that the code gets a flow and a consistent look, which makes it easier to read and manage. We have a tool that verifies most aspects of the code style, which takes away most of that duty away from humans. I find that PR authors generally take code style remarks better when pointed out by a tool than when humans do it.

A source code change is accompanied with a git commit message that need to follow the template. A consistent commit message style makes it easier to later come back and understand it proper when viewing source code history.

Test

We want everything tested.

  • Unit tests. We strive at writing more and more unit tests of internal functions to make sure they truly do what expected.
  • System tests. Do actual network transfers against test servers, and make sure different situations are handled.
  • Integration tests. Test libcurl and its APIs and verify that they handle what they are expected to.
  • Documentation tests. Check formats, check references and cross-reference with source code, check lists that they include all items, verify that all man pages have all sections, in the same order and that they all have examples.
  • “Fix a bug? Add a test!” is a mantra that we don’t always live up to, but we try.

curl runs on 80+ operating systems and 20+ CPU architectures, but we only run tests on a few platforms. This usually works out fine because most of the code is written to run on multiple platforms so if tested on one, it will also run fine on all the other.

curl has a flexible build system that offers many million different build combinations with over 30 different possible third-party libraries in countless version combinations. We cannot test all build combos, but we try to test all the popular ones and at least one for each config option enabled and disabled.

We have many tests, but there are unfortunately still gaps and details not tested by the test suite. For those things we simply have to rely on the code review and then that users report problems in the shipped products.

Number of test cases, test files really, over time.

Verify

We run all the tests using valgrind to make sure nothing leaks memory or do bad memory accesses.

We build and run with address, undefined behavior and integer overflow sanitizers.

We are part of the OSS-Fuzz project which fuzzes curl code non-stop, and we run CIFuzz in CI builds, which runs “a little” fuzzing on the curl code in the normal pull-request process.

We do “torture testing“: run a test case once and count the number of “fallible” function calls it makes. Those are calls to memory allocation, file operations, socket read/write etc. Then re-run the test that many times, and for each new iteration we make another one of the fallible functions fail and return error. Verify that no memory leaks or crashes occur. Do this on all tests.

We use several different static code analyzers to scan the code checking for flaws and we always fix or otherwise handle every reported defect. Many of them for each pull-request and commit, some are run regularly outside of that process:

  • scan-build
  • clang tidy
  • lgtm
  • CodeQL
  • Lift
  • Coverity

The exact set has varied and will continue to vary over time as services come and go.

Bug-bounty

No matter how hard we try, we still ship bugs and mistakes. Most of them of course benign and harmless but some are not. We run a bug-bounty program to reward security searchers real money for reported security vulnerabilities found in curl. Until today, we have paid almost 17,000 USD in total and we keep upping the amounts for new findings.

Accumulated bug-bounty payouts over time

When we report security problems, we produce detailed and elaborate advisories to help users understand every subtle detail about the problem and we provide overview information that shows exactly what versions are vulnerable to which problems. The curl project aims to also be a world-leader in security advisories and related info.

Act on mistakes

We are not immune, no matter how hard we try. Bad things will happen. When they do, we:

  • Act immediately.
  • Own the problem, responsibly
  • Fix it and announce it – as soon as possible
  • Learn from it
  • Make it harder to do the same or similar mistakes again

Does it work? Do we actually learn from our history of mistakes? Maybe. Having our product in ten billion installations is not a proof of this. There are some signs that might show we are doing things right:

  • We were reporting fewer CVEs/year the last few years but in 2021 we went back up. It could also be the result of more people looking, thanks to the higher monetary rewards offered. At the same time the number of lines of code have kept growing at a rate of around 6,000 lines per year.
  • We get almost no issues reported by OSS-Fuzz anymore. The first few years it ran it found many problems.
  • We are able to increase our bug-bounty payouts significantly and now pay more than one thousand USD almost every time. We know people are looking hard for security bugs.
Security vulnerabilities. Fixed vs Introduced over the years.

Continuous Integration

For every pull-request and commit done in the project, we run about 100 different builds + test rounds.

Total number of CI builds per pull-request and commit, over time
  • Test code style
  • Run thousands of tests per build
  • Build and test on tens of platforms
  • Over twenty hours of CPU time per commit
  • Done using several different CI services for maximum performance, widest possible coverage and shortest time to completion.

We currently use the following CI services: Cirrus CI, AppVeyor, Azure Pipelines, GitHub Actions, Circle CI and Zuul CI.

We also have a separate autobuild system with systems run by volunteers that checkout the latest code, build, run all the tests and report back in a continuous manner a few times or maybe once per day.

New habits past mistakes have taught us

We have done several changes to curl internals as direct reactions to past security vulnerabilities and their root causes. Lessons learned.

Unified dynamic buffer functions

These days we have a family of functions for working with dynamically sized buffers. Be using the same set for this functionality we have it well tested and we reduce the risk that new code messes up. Again, nothing revolutionary or strange, but as curl had grown organically over the decades, we found ourselves in need of cleaning this up one day. So we did.

Maximum string sizes

Several past mistakes came from possible integer overflows due to libcurl accepting input string sizes of unrestricted lengths and after doing operations on such string sizes, they would sometimes lead to overflows.

Since a few years back now, no string passed to curl is allowed to be larger than eight megabytes. This limit is somewhat arbitrarily set but is meant to be way larger than the largest user names and passwords ever used etc. We could also update the limit in a future, should we want. It’s not a limit that is exposed in the API or even mentioned. It is there to trap mistakes and malicious use.

Avoid reallocs

Thanks to the previous points we now avoid realloc as far as possible outside of those functions. History shows that realloc in combination with integer overflows have been troublesome for us. Now, both reallocs and integer overflows should be much harder to mess up.

Code coverage

A few years ago we ran code coverage reports for one build combo on one platform. This generated a number that really didn’t mean a lot to anyone but instead rather mislead users to drawing funny conclusions based on the report. We stopped that. Getting a “complete” and representative number for code coverage for curl is difficult and nobody has yet gone back to attempt this.

The impact of security problems

Every once in a while someone discovers a security problem in curl. To date, those security vulnerabilities have been limited to certain protocols and features that are not used by everyone and in many cases even disabled at build-time among many users. The issues also often rely on either a malicious user to be involved, either locally or remotely and for a lot of curl users, the environments it runs in limit that risk.

To date, I’m not aware of any curl user, ever, having been seriously impacted by a curl security problem.

This is not a guarantee that it will not ever happen. I’m only stating facts about the history so far. Security is super hard and I can only promise that we will keep working hard on shipping secure products.

Is it scary?

Changes done to curl code today will end up in billions of devices within a few years. That’s an intimidating fact that could truly make you paralyzed by fear of the risk that the world will “burn” due to a mistake of mine.

Rather than instilling fear by this outlook, I think the proper way to think it about it, is respecting the challenge and “shouldering the responsibility”. Make the changes we deem necessary, but make them according to the guidelines, follow the rules and trust that the system we have setup is likely to detect almost every imaginable mistake before it ever reaches a release tarball. Of course we plug holes in the test suite that we spot or suspect along the way.

The back-door threat

I blogged about that recently. I think a mistake is much more likely to slip-in and get shipped to the world than a deliberate back-door is.

Memory safe components might help

By rewriting parts of curl to use memory safe components, such as hyper for HTTP, we might be able to further reduce the risk of future vulnerabilities. That’s a long game to make reality. It will also be hard in the future to actually measure and tell for sure if it truly made an impact.

How can you help out?

  1. Pay for a curl support contract. This is what enables me to work full time on curl.
  2. Help out with reviews and adding new tests to curl
  3. Help out with fixing issues and improving the code
  4. Sponsor curl
  5. Report all bugs you find
  6. Upgrade your systems to run modern curl versions

Credits

Image by Dorian Krauss from Pixabay

No easter eggs in curl

Easter egg; noun:

An unexpected or undocumented feature in a piece of computer software, included as a joke or a bonus.

There are no Easter eggs in curl. For the good.

I’ve been asked about this many times. Among the enthusiast community, people seem to generally like the concept of Easter eggs and hidden treasures, features and jokes in software and devices. Having such an embedded surprise is considered fun and curl being a cool and interesting project should be fun too!

With the risk of completely ruining my chances of ever being considered a fun person, I’ll take you through my thought process on why curl does not feature any such Easter eggs and why it will not have any in the future either.

Trust

The primary and main reason is the question of trust.

We deliver products with known and documented functionality. Everything is known and documented. There’s nothing secret or hidden. The users see it all, can learn it all and it all is documented. We are always 100% transparent.

curl is installed in some ten billion installations to date and we are doing everything we can to be responsible and professional to make sure curl can and will be installed in many more places going forward.

Having an Easter egg in curl would violate several of the “commandments” we live by. If we could hide an Easter egg, what else is there that we haven’t shown or talked about?

Security

Everything in curl needs to be scrutinized, poked at, “tortured” and reviewed for security. An Easter egg would as well, as otherwise it would be an insecure component and therefor a security risk. This makes it impossible to maintain an Easter egg even almost secret.

Adding code to perform an Easter egg would mean adding code that potentially could cause problems to users by the plain unexpected nature of an Easter egg. Unexpected behavior is not a good foundation for security and secure procedures.

Boring is good

curl is not meant to be “fun” (on that fun scale). curl is here to perform its job, exactly as documented and expected and it is not meant to be fun. Boring is good and completely predictable. Boring is to deliver nothing else than the expected.

Even more security

If we would add an Easter egg, which by definition would be a secret or surprise to many, it would need to be hidden or sneaked in somehow and remain undocumented. We cannot allow code or features to get “snuck in” or remain undocumented.

If we would allow some features to get added like that, where would we draw the line? What other functionality and code do we merge into curl without properly disclosing and documenting it?

Useless work

If we would allow an Easter egg to get merged, we would soon start getting improvements to the egg code and people would like to add more eggs and to change the existing one. We would spend time and effort on the silly parts and we would need to spend testing and energy on these jokes instead of the real thing. We already have enough work without adding irrelevant work to the pile.

“Unintended Easter eggs”

We frequently ship bugs and features that go wrong. Due to fluke or random accidents, some of those mistakes can perhaps at times almost appear as Easter eggs, if you try hard. Still, when they are not done on purpose they are just bugs – not Easter eggs – and we will fix them as soon as we get them reported and have the chance.

A cover-up?

Yes, some readers will take this denial as a sign that there actually exists an Easter egg in curl and I am just doing my best to hide it. My advice to you, if you are one of those thinking this, is to read the code. We all benefit if more people read and carefully investigate the code so we will just be happy if you do and then ask us about whatever you think is unclear or “suspicious”.

I am not judging

This is not a judgement on projects that ship Easter eggs. I respect and acknowledge that different projects and people resonate differently on these topics.

Credits

Image by anncapictures from Pixabay

Why curl is used everywhere, even on Mars

When I speak of curl in this post, I lump curl the command line tool and libcurl the library into one, and I just call them curl. Related: my webinar titled “Why everyone is using curl and you should too“.

I believe just about every curl user has their own story and explanation about why they use curl in their product or device. I think there are several good reasons why users, including many of the world’s largest and most successful tech giants, decide they can lean on curl for Internet transfers.

curl is used in mobile phone and tablet apps, it is used in TVs, cars, motorcycles, fridges, settop boxes, printers, smart watches, medical devices and computer games, both on desktop and in game consoles and of course in just about every web or Internet server out there. It was also used to land on mars. Put simpler: in almost every internet-connected device.

Buy why use curl?

I want to highlight these four main properties:

In each of these areas, curl is one of the top alternatives if you compare against competitors.

It maintains this position because of its standing on a very firm foundation; a backbone built on Open Source, Leadership, Testing and Security.

Reliable

Solid code and product – most people and users never experience any bugs or problems with it. They keep using it over decades without any glitch.

Secure, means we deliver products that are safe and secure by default and we keep a strong focus and care about security related flaws in the project.

Stable API and ABI. Our users know they can always upgrade to a later version without worry. Things that worked before will continue to work tomorrow and forever.

Constantly refined. curl keeps up with the Internet, our users, new protocols and evolving standards. We never stop.

Available

It runs everywhere. Whatever platform of choice you go with, you can trust that curl can run there too. With the same API.

curl is here and has been around since decades. The same trustworthy product and API you knew in the past is still here and we do not plan to go anywhere. We have a long and proven track record that holds for all and any scrutinizing you do.

Often pre-installed on and with operating systems, making it easy to access and use.

Thoroughly documented. A library and its API is only as good as its documentation. Users need to be able to figure out available options and how the API is to be used to make effective and secure applications. curl has world-class documentation, including all the tiny details you might need. Online and offline.

Capable

Powerful. The provided API is versatile enough to power the Internet transfers for virtually all kinds of applications and use cases.

Fast. Speed is important.

curl supports “all the protocols” a modern application needs and the implementation is interoperable, proven and battle-tested in the wild and over the Internet for a long time.

The Internet is a crazy place and there are countless ways to do transfers. curl offers a myriad of features to please the most demanding users.

The flexible build allows users to streamline and control exactly what their curl build supports and provides.

Affordable

curl is free and open source under a liberal license. You don’t have to pay anyone to use or run it. We also provide free help and support on the mailing lists.

We offer commercial support to help users use curl and solve any related issues you might run into. Using curl does not waste your engineers’ time.

curl is also easy to contribute to, for the cases where you want to fix a bug or add a feature. We are a no-friction, no-bureaucracy project with a positive attitude and low bar for newcomers.

The Foundation

All of this is possible because of a solid and firm project foundation. We are Open Source, with full transparency and ability for everyone to inspect and follow along. To verify every claim.

I think we have a good Leadership, in which I of course bang my own drum a little bit, but we have managed to steer this boat for a long time in a direction that has made curl able power a world of Internet transfers. Rules, enforcement, knowledge, communication, guidelines, concepts and atmosphere are important factors. We lead by example.

There’s a strong emphasis on testing the products in a non-stop way as much as possible with numerous tools, from the first pull-requests, to the merging of commits into the main branch and onward, in order to find and fix as many nits as possible before the code reach users.

Almost no project can match the level of detail and information we provide with each and every security vulnerability we have published. We run a generous Bug Bounty program and pay a growing amount of monetary rewards to those who can identify new security problems.

The cycle of curl

When discussing the different qualities and properties of curl and the curl project, none of them were of course built-up or created separately from the others. They all tie in together and we have iteratively built and created curl gradually and little by little over a very long time.

One thing lead to another that leads to the next in a positive spiral that never ends. They are all interconnected and improving curl in one of these areas can definitely have a direct positive effect in one the other areas.

The cycle of curl is about improving curl all over for the benefit of everyone and the entire project, which leads to it just getting better and more used. Which leads to more developers, more features and more users etc. On and on and on without an end.

If you’re up for it: tell me why you use curl!

Free Apple support

Imagine running a trillion dollar company that bundles various open source components into your products, making billions of dollars of profit annually. When one of your users reach out and ask for help, with the product you ship to your customers, you instead refer the user to the open source project. The project which is run by volunteers which you never sponsored with a cent.

Who would do something like that?