Category Archives: Open Source

Open Source, Free Software, and similar

cURL and libcurl, Development, Open Source, Security

Yes C is unsafe, but…

March 30, 2017 Daniel Stenberg 25 Comments

I posted curl is C a few days ago and it raced on hacker news, reddit and elsewhere and got well over a thousand comments in those forums alone. The blog post has been read more than 130,000 times so far.

Addendum a few days later

Many commenters of my curl is C post struck down on my claim that most of our security flaws aren’t due to curl being written in C. It turned out into some sort of CVE counting game in some of the threads.

I think that’s missing the point I was trying to make. Even if 75% of them happened due to us using C, that fact alone would still not be a strong enough reason for me to reconsider our language of choice (at this point in time). We use C for a whole range of reasons as I tried to lay out there in spite of the security challenges the language brings. We know C has tricky corners and we know we are likely to do more mistakes going forward.

curl is currently one of the most distributed and most widely used software components in the universe, be it open or proprietary and there are easily way over three billion instances of it running in appliances, servers, computers and devices across the globe. Right now. In your phone. In your car. In your TV. In your computer. Etc.

If we then have had 40, 50 or even 60 security problems because of us using C, through-out our 19 years of history, it really isn’t a whole lot given the scale and time we’re talking about here.

Using another language would’ve caused at least some problems due to that language, plus I feel a need to underscore the fact that none of the memory safe languages anyone would suggest we should switch to have been around for 19 years. A portion of our security bugs were even created in our project before those alternatives you would suggest were available! Let alone as stable and functional alternatives.

This is of course no guarantee that there isn’t still more ugly things to discover or that we won’t mess up royally in the future, but who will throw the first stone when it comes to that? We will continue to work hard on minimizing risks, detecting problems early by ourselves and work closely together with everyone who reports suspected problems to us.

Number of problems as a measurement

The fact that we have 62 CVEs to date (and more will follow surely) is rather a proof that we work hard on fixing bugs, that we have an open process that deals with the problems in the most transparent way we can think of and that people are on their toes looking for these problems. You should not rate a project in any way purely based on the number of CVEs – you really need to investigate what lies behind the numbers if you want to understand and judge the situation.

Future

Let me clarify this too: I can very well imagine a future where we transition to another language or attempt various others things to enhance the project further – security wise and more. I’m not really ruling anything out as I usually only have very vague ideas of what the future might look like. I just don’t expect it to be happening within the next few years.

These “you should switch language” remarks are strangely enough from the backseat drivers of the Internet. Those who can tell us with confidence how to run our project but who don’t actually show us any code.

Languages

What perhaps made me most sad in the aftermath of said previous post, is everyone who failed to hold more than one thought at a time in their heads. In my post I wrote 800 words on some of the reasoning behind us sticking to the language C in the curl project. I specifically did not say that I dislike certain other languages or that any of those alternative languages are bad or should be avoided. Please friends, I wrote about why curl uses C. There are many fine languages out there and you should all use them as much as you possibly can, and I will too – but not in the curl project (at the moment). So no, I don’t hate language XXXX. I didn’t say so, and I didn’t imply it either. Don’t put that label on me, thanks.

cURL and libcurl, Development, Open Source

curl is C

March 27, 2017 Daniel Stenberg 29 Comments

For some reason, this post got picked up again and is debated today in 2021, almost 4 years since I wrote it. Some things have changed in the mean time and I might’ve phrased a few things differently if I had written this today. But still, what’s here below is what I wrote back then. Enjoy!

Every once in a while someone suggests to me that curl and libcurl would do better if rewritten in a “safe language”. Rust is one such alternative language commonly suggested. This happens especially often when we publish new security vulnerabilities. (Update: I think Rust is a fine language! This post and my stance here has nothing to do with what I think about Rust or other languages, safe or not.)

curl is written in C

The curl code guidelines mandate that we stick to using C89 for any code to be accepted into the repository. C89 (sometimes also called C90) – the oldest possible ANSI C standard. Ancient and conservative.

C is everywhere

This fact has made it possible for projects, companies and people to adopt curl into things using basically any known operating system and whatever CPU architecture you can think of (at least if it was 32bit or larger). No other programming language is as widespread and easily available for everything. This has made curl one of the most portable projects out there and is part of the explanation for curl’s success.

The curl project was also started in the 90s, even long before most of these alternative languages you’d suggest, existed. Heck, for a truly stable project it wouldn’t be responsible to go with a language that isn’t even old enough to start school yet.

Everyone knows C

Perhaps not necessarily true anymore, but at least the knowledge of C is very widespread, where as the current existing alternative languages for sure have more narrow audiences or amount of people that master them.

C is not a safe language

Does writing safe code in C require more carefulness and more “tricks” than writing the same code in a more modern language better designed to be “safe” ? Yes it does. But we’ve done most of that job already and maintaining that level isn’t as hard or troublesome.

We keep scanning the curl code regularly with static code analyzers (we maintain a zero Coverity problems policy) and we run the test suite with valgrind and address sanitizers.

C is not the primary reason for our past vulnerabilities

There. The simple fact is that most of our past vulnerabilities happened because of logical mistakes in the code. Logical mistakes that aren’t really language bound and they would not be fixed simply by changing language.

Of course that leaves a share of problems that could’ve been avoided if we used another language. Buffer overflows, double frees and out of boundary reads etc, but the bulk of our security problems has not happened due to curl being written in C.

C is not a new dependency

It is easy for projects to add a dependency on a library that is written in C since that’s what operating systems and system libraries are written in, still today in 2017. That’s the default. Everyone can build and install such libraries and they’re used and people know how they work.

A library in another language will add that language (and compiler, and debugger and whatever dependencies a libcurl written in that language would need) as a new dependency to a large amount of projects that are themselves written in C or C++ today. Those projects would in many cases downright ignore and reject projects written in “an alternative language”.

curl sits in the boat

In the curl project we’re deliberately conservative and we stick to old standards, to remain a viable and reliable library for everyone. Right now and for the foreseeable future. Things that worked in curl 15 years ago still work like that today. The same way. Users can rely on curl. We stick around. We don’t knee-jerk react to modern trends. We sit still in the boat. We don’t rock it.

Rewriting means adding heaps of bugs

The plain fact, that also isn’t really about languages but is about plain old software engineering: translating or rewriting curl into a new language will introduce a lot of bugs. Bugs that we don’t have today.

Not to mention how rewriting would take a huge effort and a lot of time. That energy can instead today be spent on improving curl further.

What if

If I would start the project today, would I’ve picked another language? Maybe. Maybe not. If memory safety and related issues was the primary concern I had, then sure. But as I’ve mentioned above there are several others concerns too so it would really depend on my priorities.

Finally

At the end of the day the question that remains is: would we gain more than we would pay, and over which time frame? Who would gain and who would lose?

I’m sure that there will be or it may even already exist, curl and libcurl competitors and potent alternatives written in most of these new alternative languages. Some of them are absolutely really good and will get used and reach fame and glory. Some of them will be crap. Just like software always work. Let a thousand curl competitors bloom!

Will curl be rewritten at some point in the future? I won’t rule it out, but I find it unlikely. I find it even more unlikely that it will happen in the short term or within the next few years.

Discuss this post on Hacker news or Reddit!

Followup-post: Yes, C is unsafe, but…

cURL and libcurl

curlup 2017: curl now

March 20, 2017 Daniel Stenberg

At curlup 2017 in Nuremberg, I did a keynote and talked a little about the road to what we are and where we are right now in the curl project. There will hopefully be a recording of this presentation made available soon, but I wanted to entertain you all by also presenting some of the graphs from that presentation in a blog format for easy access and to share the information.

Some stats and numbers from the curl project early 2017. Unless otherwise mentioned, this is based on the availability of data that we have. The git repository has data from December 1999 and we have detailed release information since version 6.0 (September 13, 1999).

Web traffic

First out, web site traffic to curl.haxx.se over the seven last full years that I have stats for. The switch to a HTTPS-only site happened in February 2016. The main explanation to the decrease in spent bandwidth in 2016 is us removing the HTML and PDF versions of all documentation from the release tarballs (October 2016).

My log analyze software also tries to identify “human” traffic so this graph should not include the very large amount of bots and automation that hits our site. In total we serve almost twice the amount of data to “bots” than to human. A large share of those download the cacert.pem file we host.

Since our switch to HTTPS we have a 301 redirect from the HTTP site, and we still suffer from a large number of user-agents hitting us over and over without seemingly following said redirect…

Number of lines in git

Since we also have documentation and related things this isn’t only lines of code. Plain and simply: lines added to files that we have in git, and how the number have increased over time.

There’s one notable dip and one climb and I think they both are related to how we have rearranged documentation and documentation formatting.

Top-4 author’s share

This could also talk about how seriously we suffer from “the bus factor” in this project. Look at how large share of all commits that the top-4 commiters have authored. Not committed; authored. Of course we didn’t have proper separation between authors and committers before git (March 2010).

Interesting to note here is also that the author listed second here is Yang Tse, who hasn’t authored anything since August 2013. Me personally seem to have plateaued at around 57% of all commits during the recent year or two and the top-4 share is slowly decreasing but is still over 80% of the commits.

I hope we can get the top-4 share well below 80% if I rerun this script next year!

Number of authors over time

In comparison to the above graph, I did one that simply counted the total number of unique authors that have contributed a change to git and look at how that number changes over time.

The time before git is, again, somewhat of a lie since we didn’t keep track of authors vs committers properly then so we shouldn’t put too much value into that significant knee we can see on the graph.

To me, the main take away is that in spite of the top-4 graph above, this authors-over-time line is interestingly linear and shows that the vast majority of people who contribute patches only send in one or maybe a couple of changes and then never appear again in the project.

My hope is that this line will continue to climb over the coming years.

Commits per release

We started doing proper git tags for release for curl 6.5. So how many commits have we done between releases ever since? It seems to have gone up and down over time and I added an average number line in this graph which is at about 150 commits per release (and remember that we attempt to do them every 8 weeks since a few years back).

Towards the right we can see the last 20 releases or so showing a pattern of high bar, low bar, and I’ll get to that more in a coming graph.

Of course, counting commits is a rough measurement as they can be big or small, easy or hard, good or bad and this only counts them.

Commits per day

As the release frequency has varied a bit over time I figured I should just check and see how many commits we do in the project per day and see how that has changed (or not) over time. Remember, we are increasing the number of unique authors fairly fast but the top-4 share of “authorship” is fairly stable.

Turns our the number of commits per day has gone up and down a little bit through the git history but I can’t spot any obvious trend here. In recent years we seem to keep up more than 2 commits per day and during intense periods up to 8.

Days per release

Our general plan is since bunch of years back to do releases every 8 weeks like a clock work. 8 weeks is 56 days.

When we run into serious problems, like bugs that are really annoying or tedious to users or if we get a really serious security problem reported, we sometimes decide to go outside of the regular release schedule and ship something before the end of the 8-week cycle.

This graph clearly shows that over the last, say 20, releases we clearly have felt ourselves “forced” to do follow-up releases outside of the regular schedule. The right end of the graph shows a very clear saw-tooth look that proves this.

We’ve also discussed this slightly on the mailing list recently, and I’m certainly willing to go back and listen to people as to what we can do to improve this situation.

Bugfixes per release

We keep close track of all bugfixes done in git and mark them up and mention them in the RELEASE-NOTES document that we ship in every new release.

This makes it possible for us to go back and see how many bug fixes we’ve recorded for each release since curl 6.5. This shows a clear growth over time. It’s interesting since we don’t see this when we count commits, so it may just be attributed to having gotten better at recording the bugs in the files. Or that we now spend fewer commits per bug fix. Hard to tell exactly, but I do enjoy that we fix a lot of bugs…

Days spent per bugfix

Another way to see the data above is to count the number of bug fixes we do over time and just see how many days we need on average to fix bugs.

The last few years we do more bug fixes than there are days so if we keep up the trend this shows for 2017 we might be able to reach down to 0.5 days per bug fix on average. That’d be cool!

Coverity scans

We run coverity scans on the curl cover regularly and this service keeps a little graph for us showing the number of found defects over time. These days we have a policy of never allowing a defect detected by Coverity to linger around. We fix them all and we should have zero detected defects at all times.

The second graph here shows a comparison line with “other projects of comparable size”, indicating that we’re at least not doing badly here.

Vulnerability reports

So in spite of our grand intentions and track record shown above, people keep finding security problems in curl in a higher frequency than every before.

Out of the 24 vulnerabilities reported to the curl project in 2016, 7 was the result of the special security audit that we explicitly asked for, but even if we hadn’t asked for that and they would’ve remained unknown, 17 would still have stood out in this graph.

I do however think that finding – and reporting – security problem is generally more good than bad. The problems these reports have found have generally been around for many years already so this is not a sign of us getting more sloppy in recent years, I take it as a sign that people look for these problems better and report them more often, than before. The industry as a whole looks on security problems and the importance of them differently now than it did years ago.

cURL and libcurl, Open Source

curl up 2017, the venue

March 20, 2017 Daniel Stenberg

The fist ever physical curl meeting took place this last weekend before curl’s 19th birthday. Today curl turns nineteen years old.

After much work behind the scenes to set this up and arrange everything (and thanks to our awesome sponsors to contributed to this), over twenty eager curl hackers and friends from a handful of countries gathered in a somewhat rough-looking building at curl://up 2017 in Nuremberg, March 18-19 2017.

The venue was in this old factory-like facility but we put up some fancy signs so that people would find it:

Yes, continue around the corner and you’ll find the entrance door for us:

I know, who’d guessed that we would’ve splashed out on this fancy conference center, right? This is the entrance door. Enter and look for the next sign.

Yes, move in here through this door to the right.

And now, up these stairs…

When you’ve come that far, this is basically the view you could experience (before anyone entered the room):

And when Igor Chubin presents about wttr,in and using curl to do console based applications, it looked like this:

It may sound a bit lame to you, but I doubt this would’ve happened at all and it certainly would’ve been less good without our great sponsors who helped us by chipping in what we didn’t want to charge our visitors.

Thank you very much Kippdata, Ergon, Sevenval and Haxx for backing us!

cURL and libcurl

19 years ago

March 20, 2017 Daniel Stenberg 20 Comments

19 years ago on this day I released the first ever version of a software project I decided to name curl. Just a little hobby you know. Nothing fancy.

19 years ago that was a few hundred lines of code. Today we’re at around 150.000 lines.

19 years ago that was mostly my thing and I sent it out hoping that *someone* would like it and find good use. Today virtually every modern internet-connected device in the world run my code. Every car, every TV, every mobile phone.

19 years ago was a different age not only to me as I had no kids nor house back then, but the entire Internet and world has changed significantly since.

19 years ago we’d had a handful of persons sending back bug reports and a few patches. Today we have over 1500 persons having helped out and we’re adding people to that list at a rapid pace.

19 years ago I would not have imagined that someone can actually stick around in a project like this for this long time and still find it so amazingly fun and interesting still.

19 years ago I hadn’t exactly established my “daily routine” of spare time development already but I was close and for the larger part of this period I have spent a few hours every day. All days really. Working on curl and related stuff. 19 years of a few hours every day equals a whole lot of time

I took us 19 years minus two days to have our first ever physical curl meeting, or conference if you will.

cURL and libcurl

Some curl numbers

February 22, 2017 Daniel Stenberg

We released the 163rd curl release ever today. curl 7.53.0 – approaching 19 years since the first curl release (6914 days to be exact).

It took 61 days since the previous release, during which 47 individuals helped us fix 95 separate bugs. 25 of these contributors were newcomers. In total, we now count more than 1500 individuals credited for their help in the project.

One of those bug-fixes, one was a security vulnerability, upping our total number of vulnerabilities through the years to 62.

Since the previous release, 7.52.1, 155 commits were made to the source repository.

The next curl release, our 164th, is planned to ship in exactly 8 weeks.

Mozilla, Network, Open Source, Web

Post FOSDEM 2017

February 6, 2017 Daniel Stenberg

I attended FOSDEM again in 2017 and it was as intense, chaotic and wonderful as ever. I met old friends, got new friends and I got to test a whole range of Belgian beers. Oh, and there was also a set of great open source related talks to enjoy!

On Saturday at 2pm I delivered my talk on curl in the main track in the almost frighteningly large room Janson. I estimate that it was almost half full, which would mean upwards 700 people in the audience. The talk itself went well. I got audible responses from the audience several times and I kept well within my given time with time over for questions. The trickiest problem was the audio from the people who asked questions because it wasn’t at all very easy to hear, while the audio is great for the audience and in the video recording. Slightly annoying because as everyone else heard, it made me appear half deaf. Oh well. I got great questions both then and from people approaching me after the talk. The questions and the feedback I get from a talk is really one of the things that makes me appreciate talking the most.

The video of the talk is available, and the slides can also be viewed.

So after I had spent some time discussing curl things and handing out many stickers after my talk, I managed to land in the cafeteria for a while until it was time for me to once again go and perform.

We’re usually a team of friends that hang out during FOSDEM and we all went over to the Mozilla room to be there perhaps 20 minutes before my talk was scheduled and wow, there was a huge crowd outside of that room already waiting by the time we arrived. When the doors then finally opened (about 10 minutes before my talk started), I had to zigzag my way through to get in, and there was a large amount of people who didn’t get in. None of my friends from the cafeteria made it in!

The Mozilla devroom had 363 seats, not a single one was unoccupied and there was people standing along the sides and the back wall. So, an estimated nearly 400 persons in that room saw me speak about HTTP/2 deployments numbers right now, how HTTP/2 doesn’t really work well under 2% packet loss situations and then a bit about how QUIC can solve some of that and what QUIC is and when we might see the first experiments coming with IETF-QUIC – which really isn’t the same as Google-QUIC was.

To be honest, it is hard to deliver a talk in twenty minutes and I was only 30 seconds over my time. I got questions and after the talk I spent a long time talking with people about HTTP, HTTP/2, QUIC, curl and the future of Internet protocols and transports. Very interesting.

The video of my talk can be seen, and the slides are online too.

I’m not sure if I was just unusually unlucky in my choices, or if there really was more people this year, but I experienced that “FULL” sign more than usual this year.

I fully intend to return again next year. Who knows, maybe I’ll figure out something to talk about then too. See you there?

cURL and libcurl, Firefox, Network, Web

One URL standard please

January 30, 2017 Daniel Stenberg 19 Comments

Following up on the problem with our current lack of a universal URL standard that I blogged about in May 2016: My URL isn’t your URL. I want a single, unified URL standard that we would all stand behind, support and adhere to.

What triggers me this time, is yet another issue. A friendly curl user sent me this URL:

http://user@example.com:80@daniel.haxx.se

… and pasting this URL into different tools and browsers show that there’s not a wide agreement on how this should work. Is the URL legal in the first place and if so, which host should a client contact?

curl treats the ‘@’-character as a separator between userinfo and host name so ‘example.com’ becomes the host name, the port number is 80 followed by rubbish that curl ignores. (wget2, the next-gen wget that’s in development works identically)
wget extracts the example.com host name but rejects the port number due to the rubbish after the zero.
Edge and Safari say the URL is invalid and don’t go anywhere
Firefox and Chrome allow ‘@’ as part of the userinfo, take the ’80’ as a password and the host name then becomes ‘daniel.haxx.se’

The only somewhat modern “spec” for URLs is the WHATWG URL specification. The other major, but now somewhat aged, URL spec is RFC 3986, made by the IETF and published in 2005.

In 2015, URL problem statement and directions was published as an Internet-draft by Masinter and Ruby and it brings up most of the current URL spec problems. Some of them are also discussed in Ruby’s WHATWG URL vs IETF URI post from 2014.

What I would like to see happen…

Which group? A group!

Friends I know in the WHATWG suggest that I should dig in there and help them improve their spec. That would be a good idea if fixing the WHATWG spec would be the ultimate goal. I don’t think it is enough.

The WHATWG is highly browser focused and my interactions with members of that group that I have had in the past, have shown that there is little sympathy there for non-browsers who want to deal with URLs and there is even less sympathy or interest for URL schemes that the popular browsers don’t even support or care about. URLs cover much more than HTTP(S).

I have the feeling that WHATWG people would not like this work to be done within the IETF and vice versa. Since I’d like buy-in from both camps, and any other camps that might have an interest in URLs, this would need to be handled somehow.

It would also be great to get other major URL “consumers” on board, like authors of popular URL parsing libraries, tools and components.

Such a URL group would of course have to agree on the goal and how to get there, but I’ll still provide some additional things I want to see.

Update: I want to emphasize that I do not consider the WHATWG’s job bad, wrong or lost. I think they’ve done a great job at unifying browsers’ treatment of URLs. I don’t mean to belittle that. I just know that this group is only a small subset of the people who probably should be involved in a unified URL standard.

A single fixed spec

I can’t see any compelling reasons why a URL specification couldn’t reach a stable state and get published as *the* URL standard. The “living standard” approach may be fine for certain things (and in particular browsers that update every six weeks), but URLs are supposed to be long-lived and inter-operate far into the future so they really really should not change. Therefore, I think the IETF documentation model could work well for this.

The WHATWG spec documents what browsers do, and browsers do what is documented. At least that’s the theory I’ve been told, and it causes a spinning and never-ending loop that goes against my wish.

Document the format

The WHATWG specification is written in a pseudo code style, describing how a parser would “walk” over the string with a state machine and all. I know some people like that, I find it utterly annoying and really hard to figure out what’s allowed or not. I much more prefer the regular RFC style of describing protocol syntax.

IDNA

Can we please just say that host names in URLs should be handled according to IDNA2008 (RFC 5895)? WHATWG URL doesn’t state any IDNA spec number at all.

Move out irrelevant sections

“Irrelevant” when it comes to documenting the URL format that is. The WHATWG details several things that are related to URL for browsers but are mostly irrelevant to other URL consumers or producers. Like section “5. application/x-www-form-urlencoded” and “6. API”.

They would be better placed in a “URL considerations for browsers” companion document.

Working doesn’t imply sensible

So browsers accept URLs written with thousands of forward slashes instead of two. That is not a good reason for the spec to say that a URL may legitimately contain a thousand slashes. I’m totally convinced there’s no critical content anywhere using such formatted URLs and no soul will be sad if we’d restricted the number to a single-digit. So we should. And yeah, then browsers should reject URLs using more.

The slashes are only an example. The browsers have used a “liberal in what you accept” policy for a lot of things since forever, but we must resist to use that as a basis when nailing down a standard.

The odds of this happening soon?

I know there are individuals interested in seeing the URL situation getting worked on. We’ve seen articles and internet-drafts posted on the issue several times the last few years. Any year now I think we will see some movement for real trying to fix this. I hope I will manage to participate and contribute a little from my end.

cURL and libcurl, Mozilla, Open Source

curl <3 Mozilla

January 18, 2017 Daniel Stenberg 3 Comments

Today Mozilla revealed the new logo and “refreshed identity”. The new Mozilla logo uses the colon-slash-slash sequence already featured in curl’s logo as I’ve discussed previously here from back before the final decision (or layout) was complete.

The sentiment remains though. We don’t feel that we can in any way claim exclusivity to this symbol of internet protocols – on the contrary we believe this character sequence tells something and we’re nothing but happy that this belief is shared by such a great organization such as Mozilla.

Will the “://” in either logo make some users think of the other logo? Not totally unlikely, but I don’t consider that bad. curl and Mozilla share a lot of values and practices; open source, pro users, primarily client-side. I am leading the curl project and I am employed by Mozilla. Also, when talking and comparing brands and their recognition and importance in a global sense, curl is of course nothing next to Mozilla.

I’ve talked to the core team involved in the Mozilla logo revamping before this was all made public and I assured them that we’re nothing but thrilled.

Network, Open Source, Security

Lesser HTTPS for non-browsers

January 10, 2017 Daniel Stenberg 5 Comments

An HTTPS client needs to do a whole lot of checks to make sure that the remote host is fine to communicate with to maintain the proper high security levels.

In this blog post, I will explain why and how the entire HTTPS ecosystem relies on the browsers to be good and strict and thanks to that, the rest of the HTTPS clients can get away with being much more lenient. And in fact that is good, because the browsers don’t help the rest of the ecosystem very much to do good verification at that same level.

Let me me illustrate with some examples.

CA certs

The server’s certificate must have been signed by a trusted CA (Certificate Authority). A client then needs the certificates from all the CAs that are trusted. Who’s a trusted CA and how would a client get their certs to use for verification?

You can say that you trust the same set of CAs that your operating system vendor trusts (which I’ve always thought is a bit of a stretch but hey, I can very well understand the convenience in this). If you want to do this as an HTTPS client you need to use native APIs in Windows or macOS, or you need to figure out where the cert bundle is stored if you’re using Linux.

If you’re not using the native libraries on windows and macOS or if you can’t find the bundle in your Linux distribution, or you’re in one of a large amount of other setups where you can’t use someone else’s bundle, then you need to gather this list by yourself.

How on earth would you gather a list of hundreds of CA certs that are used for the popular web sites on the net of today? Stand on someone else’s shoulders and use what they’ve done? Yeps, and conveniently enough Mozilla has such a bundle that is licensed to allow others to use it…

Mozilla doesn’t offer the set of CA certs in a format that anyone else can use really, which is the primary reason why we offer Mozilla’s cert bundle converted to PEM format on the curl web site. The other parties that collect CA certs at scale (Microsoft for Windows, Apple for macOS, etc) do even less.

Before you ask, Google doesn’t maintain their own list for Chrome. They piggyback the CA store provided on the operating system it runs on. (Update: Google maintains its own list for Android/Chrome OS.)

Further constraints

But the browsers, including Firefox, Chrome, Edge and Safari all add additional constraints beyond that CA cert store, on what server certificates they consider to be fine and okay. They blacklist specific fingerprints, they set a last allowed date for certain CA providers to offer certificates for servers and more.

These additional constraints, or additional rules if you want, are never exported nor exposed to the world in ways that are easy for anyone to mimic (in other ways than that everyone of course can implement the same code logic in their ends). They’re done in code and they’re really hard for anyone not a browser to implement and keep up with.

This makes every non-browser HTTPS client susceptible to okaying certificates that have already been deemed not OK by security experts at the browser vendors. And in comparison, not many HTTPS clients can compare or stack up the amount of client-side TLS and security expertise that the browser developers can.

HSTS preload

HTTP Strict Transfer Security is a way for sites to tell clients that they are to be accessed over HTTPS only for a specified time into the future, and plain HTTP should then not be used for the duration of this rule. This setup removes the Man-In-The-Middle (MITM) risk on subsequent accesses for sites that may still get linked to via HTTP:// URLs or by users entering the web site names directly into the address bars and so on.

The browsers have a “HSTS preload list” which is a list of sites that people have submitted and they are HSTS sites that basically never time out and always will be accessed over HTTPS only. Forever. No risk for MITM even in the first access to these sites.

There are no such HSTS preload lists being provided for non-browser HTTPS clients so there’s no easy way for non-browsers to avoid the first access MITM even for these class of forever-on-HTTPS sites.

Update: The Chromium HSTS preload list is available in a JSON format.

SHA-1

I’m sure you’ve heard about the deprecation of SHA-1 as a certificate hashing algorithm and how the browsers won’t accept server certificates using this starting at some cut off date.

I’m not aware of any non-browser HTTPS client that makes this check. For services, API providers and others don’t serve “normal browsers” they can all continue to play SHA-1 certificates well into 2017 without tears or pain. Another ecosystem detail we rely on the browsers to fix for us since most of these providers want to work with browsers as well…

This isn’t really something that is magic or would be terribly hard for non-browsers to do, its just that it will make some users suddenly get errors for their otherwise working setups and that takes a firm attitude from the software provider that is hard to maintain. And you’d have to introduce your own cut-off date that you’d have to fight with your users about! 😉

TLS is hard to get right

TLS and HTTPS are full of tricky areas and dusty corners that are hard to get right. The more we can share tricks and rules the better it is for everyone.

I think the browser vendors could do much better to help the rest of the ecosystem. By making their meta data available to us in sensible formats mostly. For the good of the Internet.

Disclaimer

Yes I work for Mozilla which makes Firefox. A vendor and a browser that I write about above. I’ve been communicating internally about some of these issues already, but I’m otherwise not involved in those parts of Firefox.

Addendum a few days later

Number of problems as a measurement

Future

Languages

curl is written in C

C is everywhere

Everyone knows C

C is not a safe language

C is not the primary reason for our past vulnerabilities

C is not a new dependency

curl sits in the boat

Rewriting means adding heaps of bugs

What if

Finally

Web traffic

Number of lines in git

Top-4 author’s share

Number of authors over time

Commits per release

Commits per day

Days per release

Bugfixes per release

Days spent per bugfix

Coverity scans

Vulnerability reports

Which group? A group!

A single fixed spec

Document the format

IDNA

Move out irrelevant sections

Working doesn’t imply sensible

The odds of this happening soon?

CA certs

Further constraints

HSTS preload

SHA-1

TLS is hard to get right

Disclaimer

curl, open source and networking