Yes C is unsafe, but…

I posted curl is C a few days ago and it raced up Hacker News, Reddit and elsewhere, drawing well over a thousand comments in those forums alone. The blog post has been read more than 130,000 times so far.

Addendum a few days later

Many commenters on my curl is C post pounced on my claim that most of our security flaws aren’t due to curl being written in C. It turned into some sort of CVE counting game in some of the threads.

I think that’s missing the point I was trying to make. Even if 75% of them happened due to us using C, that fact alone would still not be a strong enough reason for me to reconsider our language of choice (at this point in time). We use C for a whole range of reasons, as I tried to lay out there, in spite of the security challenges the language brings. We know C has tricky corners and we know we are likely to make more mistakes going forward.

curl is currently one of the most distributed and most widely used software components in the universe, be it open or proprietary, and there are easily well over three billion instances of it running in appliances, servers, computers and devices across the globe. Right now. In your phone. In your car. In your TV. In your computer. Etc.

If we then have had 40, 50 or even 60 security problems because of us using C, throughout our 19 years of history, it really isn’t a whole lot given the scale and time we’re talking about here.

Using another language would’ve caused at least some problems due to that language, plus I feel a need to underscore the fact that none of the memory safe languages anyone would suggest we should switch to have been around for 19 years. A portion of our security bugs were even created in our project before those alternatives you would suggest were available! Let alone as stable and functional alternatives.

This is of course no guarantee that there aren’t still more ugly things to discover or that we won’t mess up royally in the future, but who will throw the first stone when it comes to that? We will continue to work hard on minimizing risks, detecting problems early by ourselves and working closely together with everyone who reports suspected problems to us.

Number of problems as a measurement

The fact that we have 62 CVEs to date (and more will surely follow) is rather proof that we work hard on fixing bugs, that we have an open process that deals with the problems in the most transparent way we can think of and that people are on their toes looking for these problems. You should not rate a project in any way purely based on the number of CVEs – you really need to investigate what lies behind the numbers if you want to understand and judge the situation.

Future

Let me clarify this too: I can very well imagine a future where we transition to another language or attempt various other things to enhance the project further – security-wise and more. I’m not really ruling anything out as I usually only have very vague ideas of what the future might look like. I just don’t expect it to be happening within the next few years.

These “you should switch language” remarks are strangely enough from the backseat drivers of the Internet. Those who can tell us with confidence how to run our project but who don’t actually show us any code.

Languages

What perhaps made me most sad in the aftermath of said previous post, is everyone who failed to hold more than one thought at a time in their heads. In my post I wrote 800 words on some of the reasoning behind us sticking to the language C in the curl project. I specifically did not say that I dislike certain other languages or that any of those alternative languages are bad or should be avoided. Please friends, I wrote about why curl uses C. There are many fine languages out there and you should all use them as much as you possibly can, and I will too – but not in the curl project (at the moment). So no, I don’t hate language XXXX. I didn’t say so, and I didn’t imply it either. Don’t put that label on me, thanks.

25 thoughts on “Yes C is unsafe, but…”

  1. Much great software is written in C, not despite C but thanks to C.

    I have a feeling this anti-C movement has something intellectually in common with post modernism. We’ll see.

    1. In my experience it has more to do with marketing. There is someone out there who is looking to make money off the latest and “greatest” language yet. Whether it be open source or not, every new language/paradigm has attached to it a new market for tools, training and consulting…

      1. I disagree – it’s not marketing, as such. It’s the fans… the people who treat every new technology as religious epiphany, and think it should be used for everything. And some of them do have some actual knowledge, but regrettably most are just hyping something new that’s caught their attention…

    2. No, it has nothing to do with post modernism. C was designed for neither safety nor security. People make errors, and C makes it easy. And when handling safety-critical applications, it’s reasonable to criticize the choice of C. That’s it. One of the biggest problems we have, in my opinion, is that there is no widespread alternative with a stable tool chain. The two next-best alternatives in my view would be Ada or Rust. Ada should be stable, but isn’t that widely used. Rust should still be considered unstable, and there has not yet been time for a reasonable tool chain to develop.

      One problem of the C-fan(atics)-movement is the idea that it’s cool to write C and that you belong to a group of people who don’t make mistakes.

  2. I read your previous post, and thought that your reasoning was sound: curl is a very long-lived, very widely used project, and it is super invested in C. No language can provide enough incentives to make such an established project switch gears. A rewrite would produce endless bugs and instability for users, until the authors master the new language as well as they master C now.

    I am not a fan of C, either. Probably curl could have reached the stability it has now in much less time if there had been credible alternatives when the project started.

    But the point of your post was not about “which language is best”, it was about how mature projects have to stick to their implementation because they are a basis on which many other things are built. It is a bit sad that you have to write this follow-up to explain again your point.

  3. I would like to object to: “Using another language would’ve caused at least some problems due to that language, plus I feel a need to underscore the fact that none of the memory safe languages anyone would suggest we should switch to have been around for 19 years.”

    Have you considered Common Lisp? It was available 19 years ago (granted, fewer implementations existed, but the big commercial implementations already existed, as well as some of the main free software ones).

    But the main point here is that if people had used non-C languages 19 years ago, then compiler providers for those other programming languages would have invested time and effort to bring those other language implementations up to par with C implementations (and honestly, I don’t remember other languages having more problems then than the GCC of that time). It was the case for Ada and Fortran, for example.

    I should also mention that the C standards don’t prevent C compilers from performing run-time checks (or from having fat pointers). People should realize that speed is not everything, and above all it is nothing for 99% of the code: when you’re waiting 1/10 second for the next packet, you can check your bounds and overflows, no excuse, even for a C compiler!
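
    As a rough illustration of the kind of checks this comment has in mind, here is a minimal sketch in C of an explicit bounds check and an explicit signed-overflow check. The helper names `checked_get`, `checked_add` and `demo` are hypothetical and are not part of curl or of any standard library; a compiler that inserted run-time checks would presumably do something comparable on the programmer’s behalf.

    ```c
    #include <assert.h>
    #include <limits.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical helper: read buf[idx] only when idx is inside the buffer.
       Returns false (and leaves *out untouched) on an out-of-bounds index. */
    static bool checked_get(const unsigned char *buf, size_t len, size_t idx,
                            unsigned char *out)
    {
        if (idx >= len)
            return false;
        *out = buf[idx];
        return true;
    }

    /* Hypothetical helper: add two ints, refusing the operation when the
       result would not fit, instead of triggering undefined behavior. */
    static bool checked_add(int a, int b, int *out)
    {
        if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
            return false;
        *out = a + b;
        return true;
    }

    /* Small usage demo: valid operations succeed, invalid ones are rejected. */
    static void demo(void)
    {
        unsigned char packet[4] = {1, 2, 3, 4};
        unsigned char byte = 0;
        int sum = 0;
        bool ok;

        ok = checked_get(packet, sizeof packet, 2, &byte);
        assert(ok && byte == 3);

        ok = checked_get(packet, sizeof packet, 9, &byte);
        assert(!ok);                 /* index 9 is past the end */

        ok = checked_add(1, 2, &sum);
        assert(ok && sum == 3);

        ok = checked_add(INT_MAX, 1, &sum);
        assert(!ok);                 /* would overflow */
    }
    ```

    Each check costs a comparison or two, which is negligible next to the time spent waiting on the network.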

  4. Let’s be clear here. There are a lot of users of curl, and it needs to be maintained for quite a while to come. Rewriting curl in a different language doesn’t solve that maintenance problem. And that’s about the only good reason to keep using C.

    I’m an IT security professional, and have been for almost 20 years now. And a lot of the problems we currently see are due to C being used as parts of critical infrastructure. “What’s 60 security issues over the lifetime of the project” is not the kind of attitude that will make our life on the Internet more secure, if those vulnerabilities could have been prevented in the first place. It’s dangerous. It’s reckless. We need to stop it.

    1. We need to stop it, as a business, yes. But I’m not running the software or internet business. I’m the maintainer of the curl project and we take decisions on what we do now. That’s what I’m writing about. There are many great minds working on “the bigger picture”, creating “safe languages” and more. That’s a separate discussion we should continue having.

      I worry about how to make sure our 60 issues are not 120 in a few years. That’s my back yard. That’s my worry. You can worry over the business.

      1. Actually much appreciated that you do. Unmaintained code is an even worse security problem. And rebuilding critical infrastructure does not happen overnight.

    2. True, but we also need “safe” languages that are actually *engineered* to implement security-sensitive code. This pretty much requires being able to write time-invariant, power-invariant code. This means being able to write code without exception handlers, garbage collection, or any other complex behind-the-scenes functionality that is surprisingly full of side channels. This means being able to write code that is not going to be utterly broken by surprising optimizations. This means being able to *compile* that code to a very wide range of platforms, or that language is USELESS (hint: make it a frontend for either LLVM or gcc or you have already lost). This means the whole *library ecosystem is secured and designed for configuration management* and not an absurdly insane (and pathetic) mess like one sees in the Web ones (js, Go outside of Google).

      So far, not even Rust makes the cut. Which pretty much means there is no reason for a security-sensitive project to switch away from C. Using instrumented, type-safe C (with the help of static checkers that enforce a strict type safety on top of C and source code marking to drive these) makes a lot more sense *at this time*.

      1. > This means the whole *library ecosystem is secured and designed for configuration management*

        C doesn’t have this, not even close.

        > Using instrumented, type-safe C (with the help of static checkers that enforce a strict type safety on top of C and source code marking to drive these)

        There are no “static checkers that enforce strict type safety on top of C”. For example no C tools or analyses are capable of eliminating use-after-free bugs. What tools do you have in mind?

  5. I totally agree with you Daniel, I am not going to start a flame war, I am just sorry for the people who don’t understand things.

  6. Heh. HN. Sometimes I forget that most redditors are kids who don’t even have jobs and I’ll get caught up in one of their storms. Then I go to blogs of real programmers who write real code for serious projects to get my head straight.

    No one should ever, EVER use reddit or HN as a reference for how things should work.

    1. Any alternative suggestions? I recently started to use lobste.rs . . . but that’s a poor replacement for how I used to use reddit ( back in the day in 2008-10 )

  7. First of all, a great thanks and debt is owed for the arduous work on Curl.

    Second, I work with an OS2200 system [far older than all of us] that has been well served by the FireWall ToolKit: http://fwtk.org

    Since the author admits that the fwtk has been compromised, this is what I now do in an effort to keep that code safe, and this is what I would do with curl should my trust in it ever falter:

    # cat /etc/systemd/system/ftp-gw.socket
    [Unit]
    Description=ftp proxy
    [Socket]
    ListenStream=21
    Accept=yes
    [Install]
    WantedBy=sockets.target

    hx0014 # cat /etc/systemd/system/ftp-gw@.service
    [Unit]
    Description=ftp proxy
    [Service]
    RootDirectory=/home/fwjail
    ExecStart=-/usr/local/etc/ftp-gw
    StandardInput=socket
    User=nobody
    Group=nobody

    The RootDirectory and User/Group directives above establish privilege separation and isolation. They are great security tools for legacy software.

    May the day never come where I need this with Curl.

  8. I saw some comments on HN (where all the wannabes hang out) and it just shows that they know little except muh css, muh Javascript, muh apple, muh google.

    I remember when MS Band came out, someone asked if anybody knew what the underlying “programming language” was, and the top comment was: js + erlang.
    Talk about the blind men and the elephant, so detached from reality.

  9. I have been using curl for 19 years and have always appreciated the adherence to standards and simplicity. I have used curl in many of my scripts and for debugging a myriad of internet issues.

    I for one am happy that it is written in C. There are plenty of C developers that can contribute; or in a worst case, even take over should you decide to take a sabbatical. 🙂

  10. “C is a problem but rewriting curl from C into another language is not worthwhile” is a very reasonable position.

    “C is just fine if you’re a good careful programmer and run these tools” is … less reasonable.

    This post sounds like the first statement. Your previous post sounded more like the second statement.

    A lot of us want to move the consensus from the second statement to the first statement because the second does serious harm.

    1. The language itself is neither “a problem” nor “just fine”. It is a perfectly-defined language (like other languages). People who use the language matter the most – I think this should be the consensus.

    1. RustJW’s is more accurate. And yes, they are the internet safety strike force. Their safe space must be free of leaks.

  11. Could you point me at one of these “you should switch language” comments? What I see are comments like the one at the top of the Hacker News discussion (https://news.ycombinator.com/item?id=13966967) which are challenging that particular point, while not necessarily saying your overall conclusion is wrong. In fact, the comment I linked to starts out with “I have no problem with Curl being written in C”.

    I haven’t chimed in on any of these threads so far, but I think that comment is correct. I absolutely agree with your other points:

    * Curl couldn’t have been written in (for example) Rust 19 years ago,
    * Curl C library is still needed for many projects today,
    * a rewrite would introduce bugs (to some extent—this could be mitigated in large part by good automated tests),
    * etc.

    But I think that looking at bugs one by one to see how many of them (or how many of the most serious ones, or however you want to slice it) are due to C is the best way of evaluating the specific statement “most of our security flaws aren’t due to curl being written in C”, and evaluating that statement might be an interesting thing to do, even if it doesn’t change your overall conclusion that Curl should continue being written in C.

    For example, some newly-written project might be evaluating programming languages. They might care less about portability and of course have no existing code to port. For them, “most of our security flaws aren’t due to curl being written in C” is the only relevant part of your post, and it might be reasonable for them to use curl’s bugs as a proxy for bugs in their to-be-written project.

  12. Thank you for clarifying on some of this. I agree with all your points, emphatically on the stability of the C language.
