curl’s new CA store cache

When setting up a TLS or QUIC connection, a client like curl needs a CA store in order to verify the certificate(s) the server provides in the TLS handshake.

CA store

A CA store is a fancy name for a number of certificates. Certificates for the Certificate Authorities (CAs) that a TLS client trusts. On the curl website, we offer a PEM version of the CA store that Mozilla maintains, for download. This set currently contains 142 certificates and while the exact amount vary a little over time, it has been more than a hundred for many years. A fair amount. And there is nothing in the pipe that will bring down the number significantly anytime soon, to my knowledge. These 142 certificates make up a file that is exactly 225,403 bytes. 1587 bytes per certificate on average.

Load and parse

When setting up a TLS connection, the 142 certificates need to be loaded from the external file into memory and parsed so that the server’s certificate can be verified. So that curl knows that the server it has connected to is indeed the correct server and not a man in the middle, an impostor.

This procedure is a rather costly one, in terms of CPU cycles needed.

Another cache

A classic approach to avoid heavy work is to cache the results from a previous use to be able to reuse them again. Starting in curl 7.87.0 curl introduces a CA store cache.

Now, curl can keep the loaded and parsed CA store in memory associated with the handle and then subsequent requests can avoid re-loading and re-parsing the CA data when new connections are created – if they use the same CA store of course. The performance gain in doing this shortcut can be enormous. After all, most transfers are done using the same single CA store.

To quote the numbers Michael Drake presented in the pull request for this new feature. He measured number of instructions to load and render a particular web page from BBC with the NetSurf browser (which obviously is using libcurl for its HTTPS transfers). With and without this cache.

CA store cacheTotal instruction fetch cost
None5,168,090,712
Enabled1,020,984,411

I think a reduction to one fifth of the original cost is significant.

Converted into a little graph they compare like this (smaller is better):

But even in simpler applications and curl command lines this caching should have a measurable impact as soon as multiple TLS connections are done using the same handle. An extremely common usage pattern.

Life-time

Keeping the data around after use potentially changes the behavior a little, but the huge performance gain made us decide to still do this by default. We compensate this a little by setting the default life-time to 24 hours, so applications that keep handles alive for a very long time will still get the cache flushed and read from file again every day.

The CA store is typically not updated more frequently than once every few months or weeks.

CURLOPT_CA_CACHE_TIMEOUT

This is a new option for libcurl that allows applications to tweak the life-time and CA cache behavior for when the default as described above is not enough.

Details

This CA cache system is so far only supported when curl is built to use OpenSSL or one of its forks. I hope others will get inspired and bring this support for other TLS backends as well as we go forward.

CA cache support for curl was authored by Michael Drake. Thanks!