In an office building close to the Waterloo station in London, around 40 people again sat down at a giant table forming a big square, which made it possible for us all to see each other. One by one, brief presentations were given, each followed by discussion. The discussions often reiterated old truths, brought up related topics and sometimes went deep into the weeds about teeny weeny details of the involved protocol specs. The way we love it.
The people around the table represent Ericsson, Google, Microsoft, Apple, Meta, Akamai, Cloudflare, Fastly, Mozilla, Varnish, Caddy, Nginx, Haproxy, Tomcat, Adobe and curl, and probably a few more I forget now. One could say with some level of certainty that a large portion of everyday HTTP traffic in the world is managed by things managed by people present here.
This morning we all understood that the south entrance is actually the east one (yeah, that’s a so-called inside joke) and most of us were sitting down, eager and prepared, when the day started at 9:30 am.
Capsule. The capsule protocol (RFC 9297) is a way to, put simply, send UDP packets/datagrams over old style HTTP/1 or HTTP/2 proxies.
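To make the framing a little more concrete: RFC 9297 describes a capsule as a type, a length and a value, where type and length are encoded as QUIC variable-length integers. The Python below is a minimal sketch of that encoding, not code from any implementation discussed in the room. The function names are made up for illustration, and what goes inside the payload is defined by the protocol that uses capsules and is not shown.

```python
def encode_varint(value: int) -> bytes:
    """Encode an integer as a QUIC variable-length integer (RFC 9000, section 16)."""
    if value < 0:
        raise ValueError("varints are unsigned")
    if value < 2**6:
        return value.to_bytes(1, "big")                    # prefix bits 00
    if value < 2**14:
        return (value | (0b01 << 14)).to_bytes(2, "big")   # prefix bits 01
    if value < 2**30:
        return (value | (0b10 << 30)).to_bytes(4, "big")   # prefix bits 10
    if value < 2**62:
        return (value | (0b11 << 62)).to_bytes(8, "big")   # prefix bits 11
    raise ValueError("value does not fit in 62 bits")


DATAGRAM_CAPSULE_TYPE = 0x00  # capsule type registered for DATAGRAM in RFC 9297


def encode_capsule(capsule_type: int, payload: bytes) -> bytes:
    """Serialize one capsule: type varint, length varint, then the value."""
    return encode_varint(capsule_type) + encode_varint(len(payload)) + payload
```

Because a capsule is just bytes on a reliable stream, it can travel over an HTTP/1 or HTTP/2 connection that has no native datagram support.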
Cookies. With the 6265bis effort well on its way to ship as an updated RFC, there is an effort and intent to take yet another stab at improving and refreshing the cookie spec situation, in particular to better split off browser management and API related stuff from the more network-oriented over-the-wire details. You know yours truly never misses an opportunity to voice his opinion on cookies… I approve of this attempt as well, as I think increasing clarity and improving the specification situation can only help improve things.
Declarative web push. There’s an ongoing effort to improve web push (not to be confused with server push) so that it can be done more easily and without needing JavaScript to manage it on the client side.
Reverse HTTP. There are origins who want to contact their CDNs without having to listen on any ports/sockets and still be able to provide content. That’s one of the use cases for Reverse HTTP, and we got to learn details from internet drafts written on the topic over the last fifteen years, why it might still be a worthwhile effort and why the use cases still exist. But is there enough demand to put it into HTTP?
Server Stack Detection. A discussion around how someone can detect the origin of the server stack of any given HTTP server implementation. Should there be a better way? What is the downside of introducing what would basically be the server version of the user-agent header field? Lots of productive discussions on how to avoid recreating problems of the past but in a reversed way.
MoQ: What is it and why is it not just HTTP/3? This was an educational session about the ongoing work done in this working group, which is wrongly named and would appreciate more input from the general protocol community.
New HTTP stack. A description of the journey of a full HTTP stack rewrite: how components can be chained together and in which order and a follow-up discussion about if this should be included in documentation and if so in which way etc. Lessons included that the spec is one thing, the Internet is another. Maybe not an entirely new revelation.
Multiplexing in the year 2024. There are details in HTTP/2 multiplexing that do not really work, and there are assumptions that are now hard to change. To introduce new protocols and features in the modern HTTP stack, things need to be done for both HTTP/2 and HTTP/3 in ways that are similar but still different, which forces additional work and pain.
What if we create a way to do multiplexing over TCP, so-called “over streams”, so that the HTTP/3 fallback over TCP could still be done using HTTP/3 framing? This would allow future new protocols to remain HTTP/3-only and just make the transport be either QUIC+UDP or Streams-over TCP+TLS. This triggered a lot of discussion, mostly positive and forward-looking, but also a lot of concerns about the additional work and yet another protocol component to write and implement that then needs to be supported until the end of time, because things never truly go away completely.
I think this sounds like a fun challenge! Count me in.
End of day two. I need a beer or two to digest this.
For the sixth time, this informal group of HTTP implementers and related “interested parties” unites in a room over a couple of days for an HTTP Workshop. It has been nine years since that first event in Münster, Germany.
If you are someone like me, obsessed with networking and HTTP in particular, this is certainly the place to be. Talking about and discussing past lessons, coming changes and protocol dreams over several days with like-minded people is a blast. The people at these events are friends that I don’t get to hang out with too often. Many of the attendees here have been involved in this community for a long time and have attended all or most of the past workshops as well. I have fortunately been able to attend all of them so far.
This time we are in London.
Let me tell you a little about the topics of day one – without spilling the beans about exactly who said what or which company they came from. (I will add links to the presentations later once I know where to link to.)
Make Cleartext HTTP harder. The first discussion point of the day. While we have made HTTPS easier over recent years, it can only take us so far. What if we considered means to make HTTP harder to use as the next level of effort to further reduce its use over the internet? The HSTS preload list is only growing. Should we instead convert it into an HSTS exclude list that can shrink over time? This triggered a looong discussion in the group, which brought back a lot of old arguments and reasoning from days I thought we had left behind long ago.
HTTP 2/3 abuses. A prominent implementer of an HTTP proxy/load balancer walked us through a whole series of HTTP/2 and HTTP/3 protocol details that attackers have been exploiting in recent years, with details about what can be done to mitigate such attacks. It made several other implementers mention how they take similar precautions, and led to more general discussion around what can be considered normal use of the protocol and what cannot.
Idle connections & mobile: beneficial or harmful. 0-RTT instead of idle connections. There is a non-negligible cost associated with keeping connections alive for clients running on mobile phones. Would it be possible to instead move forward into a world where they are not kept alive but instead closed and reopened with 0-RTT the next time they are needed? Again the room woke up to a long discussion about the benefits and problems of doing this, which, if it were possible, would probably save a lot of battery time on the average mobile phone.
QUIC pacing. We learned that it is very important for servers to implement decent QUIC pacing as it can increase performance up to 20 times compared to no pacing at all. What about using the flow control properly? What about changing the default buffer sizes for UDP sockets in the Linux kernel to something similar to TCP sockets in order to help the default case to perform better?
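As a small aid to the buffer size question, the following Python sketch reads the default send and receive buffer sizes the operating system hands to a fresh UDP socket and a fresh TCP socket. It only reports the initial values on the machine it runs on; kernels may adjust TCP buffers dynamically afterwards, and the helper name is made up for illustration.

```python
import socket


def default_buffer_sizes(sock_type: int) -> dict:
    """Return the default receive/send buffer sizes of a new IPv4 socket."""
    with socket.socket(socket.AF_INET, sock_type) as sock:
        return {
            "rcvbuf": sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
            "sndbuf": sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        }


if __name__ == "__main__":
    # Compare what a QUIC (UDP) socket starts with against a TCP socket.
    print("UDP:", default_buffer_sizes(socket.SOCK_DGRAM))
    print("TCP:", default_buffer_sizes(socket.SOCK_STREAM))
```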
HTTP prioritization for product performance. The HTTP/2 way of doing prioritization was deemed a failure and too complicated a long time ago but in this presentation we were taught that there are definitely use cases and scenarios where the regular HTTP/3 priority setup is helpful and improves performance. Examples and descriptions for a popular and well-used client were shown.
Allowing HTTP clients to use stale DNS data. What if HTTP clients used stale data instead of having to wait for the DNS resolver response, as a means to avoid waiting a whole RTT to get data that in a fair number of cases is the same as the stale data? Again a long discussion around TTLs for DNS queries and the fact that some clients are already doing this, in more or less explicit ways.
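The idea can be sketched as a cache that hands out an expired answer immediately and only marks the name for a later refresh. This is a hypothetical illustration in Python, not how any particular client does it: the class name, the `resolve` callback (assumed to return addresses plus a TTL in seconds) and the `needs_refresh` queue are all invented for this example.

```python
import time
from typing import Callable, Dict, List, Tuple


class StaleDnsCache:
    """Hypothetical cache that serves expired answers while a refresh is pending."""

    def __init__(self, resolve: Callable[[str], Tuple[List[str], float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolve = resolve              # returns (addresses, ttl_seconds)
        self._clock = clock
        self._entries: Dict[str, Tuple[List[str], float]] = {}
        self.needs_refresh: List[str] = []   # names a background task should re-resolve

    def lookup(self, name: str) -> List[str]:
        entry = self._entries.get(name)
        if entry is None:
            # Nothing cached at all: no way around waiting for the resolver.
            return self.refresh(name)
        addresses, expires_at = entry
        if self._clock() >= expires_at and name not in self.needs_refresh:
            # Expired: hand out the stale answer now, queue a refresh for later.
            self.needs_refresh.append(name)
        return addresses

    def refresh(self, name: str) -> List[str]:
        """Resolve for real and store the fresh answer with its new expiry."""
        addresses, ttl = self._resolve(name)
        self._entries[name] = (addresses, self._clock() + ttl)
        if name in self.needs_refresh:
            self.needs_refresh.remove(name)
        return addresses
```

The trade-off debated in the room is visible here: `lookup` never blocks once a name has been seen, at the price of sometimes connecting to an address the TTL says should no longer be trusted.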
QUIC cache DSR. As the last talk of the afternoon, we got into the details of Direct Server Response for QUIC, how it can improve performance and the problems and challenges involved. It then indirectly took us into a long sub-thread talking about HTTP caching, Vary headers and what could and should be done to improve things going forward. There seems to be an understanding that it would be good to improve the current situation but it is not entirely clear to this author exactly what that would entail.
We then took a walk through the streets of London and had an awesome dinner, during which we could all conclude that HTTP still is not ready. There is still work to be done. There are challenges left to overcome.
tldr: the curl bug-bounty has been an astounding success so far.
We started the current curl bug-bounty setup in April 2019. We have thus run it for five and a half years give or take.
In the beginning we awarded researchers just a few hundred USD per issue because we did not know where it would go and as we used money from the curl fund (donated money) we wanted to make sure we could afford it.
Since a few years back, the money part of the bug-bounty is sponsored by the Internet Bug Bounty, meaning that the curl project actually earns money for every flaw as we get 20% of the IBB money for each bounty paid.
While the exact award amounts per report vary over time, they are roughly 500 USD for a low severity issue, 2,500 USD for a Medium and almost 5,000 USD for a High severity one.
To this day, we have paid out 84,260 USD to security researchers as rewards for their findings, distributed over 69 separate CVEs. 1,220 USD on average.
Counters
In this period we have received 477 reports, which is about 6 per month on average.
73 of the reports (15.4%) were confirmed and treated as valid security vulnerabilities that ended up as CVEs. This also means that we get roughly one valid security report per month on average. Only 3 of these security problems were rated severity High, the rest were Low or Medium. None of them reached the worst level: Critical.
92 of the reports (19.4%) were confirmed legitimate bugs but not security problems.
311 of the reports (65.3%) were Not Applicable. They were not bugs and not security problems. See below for more on this category.
1 of the reports is still being assessed as I write this.
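For readers who want to redo the arithmetic, here is a small Python sketch using only the figures quoted above. The four categories add up to the 477 received reports, and the computed shares land within roughly a tenth of a percentage point of the quoted ones; the exact decimals depend on rounding and on whether the pending report is counted in the denominator. The variable names are just for illustration.

```python
# Figures quoted in the text above.
total_paid_usd = 84_260
rewarded_cves = 69
reports = {
    "valid CVE": 73,
    "bug, not security": 92,
    "not applicable": 311,
    "pending": 1,
}

average_payout = total_paid_usd / rewarded_cves
total_reports = sum(reports.values())
shares = {kind: 100 * count / total_reports for kind, count in reports.items()}

print(f"average payout: {average_payout:.0f} USD")
print(f"total reports: {total_reports}")
for kind, share in shares.items():
    print(f"{kind}: {share:.1f}%")
```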
Tightening the screws
Security is top priority for us but we also continue to develop curl at a high pace. We merge code into the repository at a frequency of more than four bugfixes per day on average over the last couple of years. When we tighten the screws in this project in order to avoid future problems and to mitigate the risks that we add new ones, we need to do it using policies and concepts that still allow us to move fast and be agile.
First response
We have an ambition to always have a first response posted within 24 hours. Over these first 477 reports, we have had a median response time of under one hour and we have never missed our 24 hour goal. I am personally a little amazed by this feat.
Time to triage
The median time from filed report until the curl security team has determined and concluded with some confidence that the problem is a security problem is 36 hours.
Assessing
Assessing a (good) report is hard and usually involves a lot of work: reading up on protocol details, reading code, trying different reproducer builds/scripts and bouncing back and forth with the reporters and the security team.
Acknowledging that it is a security problem is only one step. The adjacent one, which is at least equally difficult, is then figuring out the severity. How serious is this flaw? A normal pattern is of course that the researcher considers the problem to be several degrees worse than the curl security team does, so it can take a great deal of reasoning to reach an agreement. Sometimes we even decree a certain severity against the will of the researcher.
The team
There is a curl security team that works on and with security reports. The awesome people in this group are:
Max Dymond, Dan Fandrich, Daniel Gustafsson, James Fuller, Viktor Szakats, Stefan Eissing and myself.
They are all long-time curl maintainers. Knowledgeable, skilled, trusted.
Report quality
65.3% of the incoming reports are deemed not even a bug.
These reports can be all sorts of different things of course. When promising people money for their reports, there is no surprise that we get a fair share of luck-seekers trying to earn a few bucks the easy route.
Some reporters run scanners against the code, the mail server or the curl website and insist some findings are bounty worthy. The curl bug-bounty does not cover infrastructure, only the products, so they are not covered no matter what.
A surprisingly large amount of the bad reports are on various kinds of “information exposure” on the website – which is often ironic since the entire website already is available in a public git repository and the information exposed is hardly secret.
Reporting scanner results on code without applying your own thinking and confirming that the findings are indeed correct – and actual security problems – is rarely a good idea. That also goes for asking AIs to find problems.
Dismissing
Typically, the worse the report is, the quicker it is to dismiss. That is also why having this large share of rubbish is usually not a problem: we normally get rid of them with just a few minutes of work.
The better crap we get, the worse the problem gets. An AI or a person that writes a long and good-looking report arguing their case can take a long time to analyze, assess and eventually debunk.
Since security problems are top priority in the project, getting too much good crap can to some degree cause a denial of service in the project as we need to halt other activities while we take care of the incoming reports.
We run our bug-bounty program on HackerOne, which has a reputation system for reporters. When we close reports as N/A, they get a reputation cut. This works as a mild deterrent against submitting low quality reports. Of course it also sometimes gives the reporter a reason to argue with us and insist we should rather close it as informative, which does not come with a reputation penalty.
The good findings
I would claim that it is pretty hard to find a security problem in curl these days, but since we still average maybe twelve per year recently, they certainly still exist.
The valid reports today tend to happen because either a user accidentally did something that made them look, research and unveil something troublesome, or in the more common case: they have put real effort into research.
In the latter cases, we see researchers run their own custom fuzzers on parts of the code that our own fuzzers have not exercised as well, we see them check for code patterns that have led to problems before or in other projects, and we also see researchers get inspiration from previous reports and fixes to see if perhaps there were gaps left.
The best curl security problem finders today understand the underlying involved protocols, the curl architecture, the source code and they look for inconsistencies between them all, as such might cause security problems.
Bounty hunters
The 69 bug bounty payouts so far have been done to 27 separate individuals. Five reporters have been rewarded for more than two issues each. The true curl security researcher heroes:
| Reports | Name | Rewarded |
|---|---|---|
| 25 | Harry Sintonen | 29,620 USD |
| 8 | Hiroki Kurosawa | 9,800 USD |
| 4 | Axel Chong | 7,680 USD |
| 4 | Patrick Monnerat | 7,300 USD |
| 3 | z2_ | 4,080 USD |

Top-5 curl bounty hunters
We are extremely fortunate to have this skilled set of people tracking down and highlighting our worst mistakes.
Harry of course sticks out at the top with his 25 rewarded curl security reports. More than three times as many as the number two has.
(Before you think the math is wrong: a few reports have been filed that ended up as valid CVEs but for which the reporters have declined getting a monetary reward.)
My advice
I think the curl bug-bounty is an absolute and undisputed success. I believe it is a key part in our mission to keep our users safe and secure.
If you consider kicking off a bug-bounty for your project, here’s my little checklist:
Do your software engineering properly. Run all the tools, tests, checks, analyzers, scanners and fuzzers you can and make sure they are at zero reported defects, to avoid a raging herd of reports when you open the gates.
Start out with conservative bounty amounts to get a lay of the land, then raise them as you go.
Own all security problems for your project. Whoever reports them and however they appear, you assess, evaluate, research and fix them. You write and publish the complete and original security advisory.
Make sure you have a team. Even the best maintainers need sleep and occasional vacation days. Security is hard and having good people around to bounce problems with is priceless.
Close/reject crap reports as quickly as possible to prevent them from wasting team time and energy.
Always fix security problems with haste. Never let them linger around.
Transparency. Make as much as possible open and public once the CVEs are out, so that your processes, communications, methods are visible. This builds trust and allows for feedback and iterative improvements of the process.
Future
I think we will continue to receive valid security reports going forward, simply because we keep developing at a high pace and we change and add a lot of source code every year.
The trend in recent years has been more security reports, but the ratio of low/medium vs high/critical has sky-rocketed. The issues reported these days tend to be less severe than they were in the past.
My explanation for this is primarily that we have more people looking harder for problems now than in the past. Due to mitigations and past reports we introduce really bad security problems at a lower frequency than before.
On Monday this week, I did a talk at the Nordic Software Security Summit conference in Stockholm, Sweden. I titled it CVEMITRECVSSNVDCNAOSS WTF with the subtitle “Keeping the world from Burning”.
The talk was well received and I think it added something to the conversation. Almost every other talk during the rest of the conference that I saw referred back to it.
Since the talk was not recorded (no talks were at this event), I intend to do the presentation again – from home. This time live-streamed and recorded.
This happens on:
Monday September 30, 2024 14:00 UTC (16:00 CEST)
The stream happens on Twitch where I as always am curlhacker. Join the chatroom, ask questions, have a good time. There will of course be room for a Q&A.
No registration. No fee. Just show up.
At the conference, I did the presentation in under thirty minutes. This version might go on a few more minutes.
Abstract
The abstract I provided for this talk to the conference says:
Bogus CVEs, know-better organizations, conflicting databases, AI hallucinations, inflated severity scoring, security scanners, Jia Tan. As the lead developer in the curl project, Daniel describes some of the challenges involved and what you need to do to stay on top of security when working in a high profile Open Source project running in some twenty billion instances. The talk will be involving many examples from real life.
Differences
Since this is a second run of a talk I already did and I have no script, it will not be identical. I will also try to polish some minor details that I felt could need some brush-ups.
I have this tradition of mentioning occasional network related quirks on Windows on my blog so here we go again.
This round started with a bug report that said
curl is slow to connect to localhost on Windows
It is also demonstrably true. The person runs a web service on a local IPv4 port (and nothing on the IPv6 port), and it takes over 200 milliseconds to connect to it. You would expect it to take no more than a number of microseconds, as it does on just about all other systems out there.
The command
curl http://localhost:4567
Connecting
Buckle up, this is getting worse. But first, let’s take a look at this exact use case and what happens.
The hostname localhost first resolves to ::1 and 127.0.0.1 by curl. curl actually resolves this name “hardcoded”, so it does this extremely fast. Hardcoded meaning that it does not actually use DNS or any resolver functionality. It provides this result “fixed” for this hostname.
When curl has both IPv6 and IPv4 addresses to connect to and the host supports both IP families, it will first start the IPv6 attempt(s) and only if it has not succeeded in connecting over IPv6 after two hundred milliseconds does it start the IPv4 attempts. This way of connecting is called Happy Eyeballs and has been the de-facto and recommended way to connect to servers in a dual-stack world for years.
On all systems except Windows, when the IPv6 TCP connect attempt sends a SYN to a TCP port where nothing is listening, it gets a RST back immediately and returns failure. ECONNREFUSED is the most likely outcome of that.
On all systems except Windows, curl then immediately switches over to the IPv4 connect attempt instead, which on modern systems succeeds within a small fraction of a millisecond. The failed IPv6 attempt is not noticeable to a user.
A TCP reminder
This is how a working TCP connect can be visualized:
A successful TCP 3-way handshake
But when the TCP port in the server has no listener, it actually performs like this:
A refused TCP connect
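One way to observe how quickly a refused connect comes back on a given machine is to time it. The Python sketch below picks a localhost port that most likely has no listener and measures how long the connect call takes before it reports an error. It is a rough illustration with made-up helper names; another process could grab the port in between, and the numbers will differ between systems.

```python
import socket
import time


def closed_local_port() -> int:
    """Pick a localhost TCP port that (most likely) has no listener right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))    # let the OS choose a free port
        return probe.getsockname()[1]   # the port is released when probe closes


def time_connect(host: str, port: int, timeout: float = 10.0):
    """Return (seconds spent in connect, the error raised or None)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        try:
            sock.connect((host, port))
            error = None
        except OSError as exc:          # ConnectionRefusedError, timeouts, ...
            error = exc
        return time.perf_counter() - start, error


if __name__ == "__main__":
    elapsed, error = time_connect("127.0.0.1", closed_local_port())
    print(f"connect returned after {elapsed * 1000:.2f} ms: {error!r}")
```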
Connect failures on Windows
On Windows however, the story is different.
When the TCP SYN is sent to the port where nothing is listening and an RST is sent back to tell the client so, the client TCP stack does not return an error immediately.
Instead it decides that maybe the problem is transient and it will magically fix itself in the near future. It then waits a little and then keeps sending new SYN packets to see if the problem perhaps has fixed itself – until a maximum retry value is reached (set in the registry, this value defaults to 3 extra times).
Done on localhost, this retry-SYN process can easily take a few seconds and when done over a network, it can take even more seconds.
Since this behavior makes it impossible for the leading IPv6 attempt to fail within 200 milliseconds even when nothing is listening on that port, connecting to any service that is IPv4-only but has an IPv6 address by default delays curl’s connect success by two hundred milliseconds. On Windows.
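The effect on Happy Eyeballs can be captured in a tiny timing model. The Python below is a simplified sketch, not curl’s actual logic: it assumes a single IPv6 and a single IPv4 attempt, and the timings passed in are illustrative, made-up numbers rather than measurements.

```python
from typing import Optional

HAPPY_EYEBALLS_DELAY_MS = 200.0  # the delay described in the text above


def time_to_connect_ms(ipv6_fails_after_ms: Optional[float],
                       ipv4_connect_ms: float,
                       delay_ms: float = HAPPY_EYEBALLS_DELAY_MS) -> float:
    """Model when an IPv4-only service gets connected.

    ipv6_fails_after_ms: when the IPv6 attempt reports failure, or None if it
    never reports anything within the time frame we care about.
    """
    if ipv6_fails_after_ms is None:
        ipv4_start = delay_ms
    else:
        # IPv4 starts when IPv6 fails or when the timer fires, whichever is first.
        ipv4_start = min(ipv6_fails_after_ms, delay_ms)
    return ipv4_start + ipv4_connect_ms


# Illustrative, made-up timings (not measurements):
fast_refusal = time_to_connect_ms(ipv6_fails_after_ms=0.1, ipv4_connect_ms=0.1)
slow_refusal = time_to_connect_ms(ipv6_fails_after_ms=2000.0, ipv4_connect_ms=0.1)
```

With an immediate refusal the IPv4 attempt starts right away; when the refusal only surfaces after seconds, the full Happy Eyeballs delay is always paid first.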
Of course this does not only hurt curl. This is likely to delay connect attempts for countless applications and services for Windows users.
A refused TCP connect on Windows
Non-responding is different
I want to emphasize that there is a big difference when trying to connect to a host where the SYN packet is simply not answered. When the server is not responding. Because then it could be a case of the packet having gotten lost on the way, so the TCP stack has to resend the SYN a couple of times (after a delay) to see if it maybe works this time.
IPv4 and IPv6 alike
I want to stress that this is not an issue tied to IPv6 or IPv4. The TCP stack seems to treat connect attempts done over either exactly the same. The reason I even mention the IP versions is because of how this behavior works counter to Happy Eyeballs in the case where nothing listens to the IPv6 port.
Is resending SYN after RST “right”?
According to this exhaustive resource I found explaining this Windows TCP behavior, this is not in violation of RFC 793, one of the early TCP specifications, from 1981.
It is surprising to users because no one else does it like this. I have not found any other systems or TCP stacks that behave this way.
Fixing?
There is no way for curl to figure out that this happens under the hood so we cannot just adjust the code to error out early on Windows as it does everywhere else.
Workarounds
There is a registry key in Windows called TcpMaxConnectRetransmissions that apparently can be tweaked to change this behavior. Of course it changes this for all applications so it is probably not wise to do this without a lot of extra verification that nothing breaks.
The two hundred millisecond Happy Eyeballs delay that curl exhibits can be mitigated by forcibly setting --happy-eyeballs-timeout-ms to zero.
If you know the service is not using IPv6, you can tell curl to connect using IPv4 only, which then avoids trying and wasting time on the IPv6 sinkhole: --ipv4.
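For scripts that invoke curl, the two curl-side workarounds can be wrapped in a small helper. This Python sketch only builds the argument list; `curl_command` is a made-up name, and the URL is the one from the bug report above. The resulting list can be handed to `subprocess.run` as is.

```python
from typing import List, Optional


def curl_command(url: str, ipv4_only: bool = False,
                 happy_eyeballs_timeout_ms: Optional[int] = None) -> List[str]:
    """Build a curl command line using the workarounds mentioned above."""
    argv = ["curl"]
    if ipv4_only:
        argv.append("--ipv4")   # skip the IPv6 attempt entirely
    if happy_eyeballs_timeout_ms is not None:
        # How long curl waits on IPv6 before it also starts IPv4.
        argv += ["--happy-eyeballs-timeout-ms", str(happy_eyeballs_timeout_ms)]
    argv.append(url)
    return argv


if __name__ == "__main__":
    print(" ".join(curl_command("http://localhost:4567", ipv4_only=True)))
    print(" ".join(curl_command("http://localhost:4567",
                                happy_eyeballs_timeout_ms=0)))
```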
Without changing the registry and trying to connect to any random server out there where nothing is listening to the requested port, there is no decent workaround. You just have to accept that where other systems can return failure within a few milliseconds, Windows can waste multiple seconds.
Windows version
This behavior has been verified quite recently on modern Windows versions.
Time for another checkup. Where are we right now with HTTP/3 support in curl for users?
I think curl’s situation is symptomatic for a lot of other HTTP tools and libraries. HTTP/3 has been and continues to be a much tougher deployment journey than HTTP/2 was.
curl supports four alternative HTTP/3 solutions
You can enable HTTP/3 for curl using one of these four different approaches. We provide multiple different ones to let “the market” decide and to allow different solutions to “compete” with each other so that users eventually can get the best one. The one they prefer. That saves us from the hard problem of trying to pick a winner early in the race.
More details about the four different approaches follow below.
Why is curl not using HTTP/3 already?
It already does if you build it yourself with the right set of third party libraries. Also, the curl for Windows binaries provided by the curl project support HTTP/3.
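To check whether the curl installed on a particular machine was built with HTTP/3, one can look for HTTP3 in the Features line that `curl --version` prints. A minimal Python sketch of that check follows; the function names are invented for this example and it simply reports False when no curl is found in PATH.

```python
import shutil
import subprocess


def features_from_version_output(text: str) -> set:
    """Extract the feature names from the 'Features:' line of `curl --version`."""
    for line in text.splitlines():
        if line.startswith("Features:"):
            return set(line[len("Features:"):].split())
    return set()


def local_curl_has_http3() -> bool:
    """True if the curl found in PATH lists HTTP3 among its features."""
    if shutil.which("curl") is None:
        return False
    result = subprocess.run(["curl", "--version"], capture_output=True,
                            text=True, check=False)
    return "HTTP3" in features_from_version_output(result.stdout)


if __name__ == "__main__":
    print("HTTP/3 supported by local curl:", local_curl_has_http3())
```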
For Linux distributions and other operating system packagers, a big challenge remains: the most widely used TLS library (OpenSSL) does not offer the widely accepted QUIC API that most other TLS libraries provide. (Remember that HTTP/3 uses QUIC which uses TLS 1.3 internally.) This lack of API prevents existing QUIC libraries from working with OpenSSL as their TLS solution, forcing everyone who wants to use a QUIC library to use another TLS library, because curl does not easily allow itself to get built using multiple TLS libraries. Using a different TLS library for QUIC than for other TLS-based protocols is not supported.
Debian is trying an experiment to enable HTTP/3 in their shipped version of curl by switching to GnuTLS (and building with ngtcp2 + nghttp3).
HTTP/3 backends
To get curl to speak HTTP/3 there are three different components that need to be provided, apart from the adjustments in the curl code itself:
TLS 1.3 support for QUIC
A QUIC protocol library
An HTTP/3 protocol library
Illustrated
Below, you can see the four different HTTP/3 solutions supported by curl in different columns. All except the right-most solution are considered experimental.
HTTP/3 backend support in curl, June 2024
From left to right:
the quiche library does both QUIC and HTTP/3 and it works with BoringSSL for TLS
msh3 is an HTTP/3 library that uses msquic for QUIC and either a TLS library from the OpenSSL fork family or Schannel for TLS
nghttp3 is an HTTP/3 library that in this setup uses OpenSSL’s QUIC stack, which does both QUIC and TLS
nghttp3 for HTTP/3 using ngtcp2 for QUIC can use a range of different TLS libraries: fork family, GnuTLS and wolfSSL. (picotls is supported too, but curl itself does not support picotls for other TLS use)
ngtcp2 is ahead
ngtcp2 + nghttp3 was the first QUIC and HTTP/3 combination that shipped non-beta versions that work solidly with curl, and that is the primary reason it is the solution we recommend.
The flexibility in TLS solutions in that vertical is also attractive as this allows users a wide range of different libraries to select from. Unfortunately, OpenSSL has decided to not participate in that game so this setup needs another TLS library.
OpenSSL QUIC
OpenSSL 3.2 introduced a QUIC stack implementation that is not “beta”, making it the second such solution curl can use. In OpenSSL 3.3 they improved it further. Since early 2024 curl can get built and use this library for HTTP/3 as explained above.
However, the API OpenSSL provides for doing transfers is lacking. It lacks vital functionality, which makes it inefficient and basically forces curl to sometimes busy-loop to figure out what to do next. This fact, and perhaps additional problems, make the OpenSSL QUIC implementation significantly slower than the competition. Another reason to advise users to maybe use another solution.
We keep communicating with the OpenSSL team about what we think needs to happen and what they need to provide in their API so that we can do QUIC efficiently. We hope they will improve their API going forward.
Stefan Eissing produced nice comparison graphs that I have borrowed from his Performance presentation from curl up 2024. (Stefan also blogged about h3 performance in curl earlier.) They compare three HTTP/3 curl backends against each other. (msh3 is not included because it does not work well enough in curl.)
As you can see below, in several test setups OpenSSL is only achieving roughly half the performance of the other backends in both requests per second and raw transfer speed. This is on a localhost, so basically CPU bound transfers.
HTTP/3 backend performance in curl compared, requests/second
HTTP/3 backend performance in curl compared, megabytes/second
I believe OpenSSL needs to work on their QUIC performance in addition to providing an improved API.
quiche and msh3
quiche is still labeled beta and is only using BoringSSL which makes it harder to use in a lot of situations.
msh3 does not work at all right now in curl after a refactor a while ago.
HTTP/3 is a CPU hog
This is not news to anyone following protocol development. I have been repeating this over and over in every HTTP/3 presentation I have done – and I have done a few by now, but I think it is worth repeating and I also think Stefan’s graphs for this show the situation in a crystal clear way.
HTTP/3 is slow in terms of transfer performance when you are CPU bound. In most cases of course, users are not CPU bound because typically networks are the bottlenecks and instead the limited bandwidth to the remote site is what limits the speed on a particular transfer.
HTTP/3 is typically faster at completing a handshake, thanks to QUIC, so an HTTP/3 transfer can often get the first byte transmitted sooner than any other HTTP version (over TLS) can.
To show how this looks with more of Stefan’s pictures, let’s first show the faster handshakes from his machine somewhere in Germany. These tests were using a curl 8.8.0-DEV build, from a while before curl 8.8.0 was released.
HTTP/3 vs HTTP/1.1 handshake performance in curl compared
Nope, we cannot explain why google.com actually turned out worse with HTTP/3. It can be added that curl.se is hosted by Fastly’s CDN, so this is really comparing curl against three different CDN vendors’ implementations.
HTTP/3 vs HTTP/2 vs HTTP/1.1 throughput performance in curl compared
Again: these are CPU bound transfers so what this image really shows is the enormous amounts of extra CPU work that is required to push these transfers through. As long as you are not CPU bound, your transfers should of course run at the same speeds as they do with the older HTTP versions.
These comparisons show curl’s treatment of these protocols; they are not generic protocol comparisons (if such are even possible). We cannot rule out that curl might have some issues or weird solutions in the code that could explain part of this. While we certainly always have areas for improvement remaining, I personally don’t think we have any significant performance blockers lurking. We cannot be sure though.
OpenSSL-QUIC stands out here as well, in the not so attractive end.
HTTP/3 deployments
w3techs, Mozilla and Cloudflare data all agree that somewhere around 28-30% of the web traffic is HTTP/3 right now. This is a higher rate than HTTP/1.1 for browser traffic.
An interesting detail about this 30% traffic share is that all the big players and CDNs (Google, Facebook, Cloudflare, Akamai, Fastly, Amazon etc) run HTTP/3, and I would guess that they combined normally have a much higher share of all the web traffic than 30%. Meaning that there is a significant amount of browser web traffic that could use HTTP/3 but still does not. Unfortunately I don’t have the means to figure out explanations for this.
HTTPS stack overview
In case you need a reminder, here is how an HTTPS stack works.
You know Tor, but do you know SOCKS5? It is an old and simple protocol for setting up a connection and when using it, the client can decide to either pass on the full hostname it wants to connect to, or it can pass on the exact IP address.
(SOCKS5 is by the way a minor improvement of the SOCKS4 protocol, which did not support IPv6.)
When you use curl, you decide if you want curl or the proxy to resolve the target hostname. If you connect to a site on the public Internet it might not even matter who is resolving it as either party would in theory get the same set of IP addresses.
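To make the difference concrete, here is a minimal Python sketch of the CONNECT request a SOCKS5 client sends after the initial method negotiation, following the layout in RFC 1928. The function name is made up for illustration and this is not curl’s code. With curl, a socks5h:// proxy URL (or --socks5-hostname) asks the proxy to resolve the name, while socks5:// resolves it locally first.

```python
import ipaddress
import struct

SOCKS_VERSION = 0x05
CMD_CONNECT = 0x01
ATYP_IPV4 = 0x01
ATYP_DOMAIN = 0x03
ATYP_IPV6 = 0x04


def socks5_connect_request(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request for an IP literal or a hostname."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: hand the name to the proxy, which resolves it.
        name = host.encode("ascii")
        if not 1 <= len(name) <= 255:
            raise ValueError("hostname must be 1-255 bytes")
        address = bytes([ATYP_DOMAIN, len(name)]) + name
    else:
        # The client already resolved the name; the proxy only sees the address.
        atyp = ATYP_IPV4 if ip.version == 4 else ATYP_IPV6
        address = bytes([atyp]) + ip.packed
    header = bytes([SOCKS_VERSION, CMD_CONNECT, 0x00])  # 0x00 is the reserved byte
    return header + address + struct.pack("!H", port)
```

Passing a hostname produces the domain-name form that the proxy resolves; passing an IP literal produces the form where the client has already done the resolving.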
The .onion TLD
There is a concept of “hidden” sites within the Tor network. They are not accessible on the public Internet. They have names in the .onion top-level domain. For example, the search engine DuckDuckGo is available at https://duckduckgogg42xjoc72x3sjasowoarfbgcmvfimaftt6twagswzczad.onion/.
.onion names are used to provide access to end to end encrypted, secure, anonymized services; that is, the identity and location of the server is obscured from the client. The location of the client is obscured from the server.
To access a .onion host, you must let Tor resolve it because a normal DNS server aware of the public Internet knows nothing about it.
This is why we recommend you ask the SOCKS5 proxy to resolve the hostname when accessing Tor with curl.
The proxy connection
The SOCKS5 protocol is clear text so you must make sure you do not access the proxy over a network as then it will leak the hostname to eavesdroppers. That is why you see the examples above use localhost for the proxy.
You can also step it up and connect to the SOCKS5 proxy over Unix domain sockets with recent curl versions.
Sites using the .onion TLD are not on the public Internet and it is pointless to ask your regular DNS server to resolve them. Even worse: if you in fact ask your normal resolver you practically advertise your intention to connect to a .onion site and you give the full name of that site to the outsider. A potentially significant privacy leak.
To combat the leakage problem, RFC 7686, The “.onion” Special-Use Domain Name, was published in October 2015, with the involvement and consent of people involved in the Tor project.
It only took a few months after 7686 was published until there was an accurate issue filed against curl for leaking .onion names. Back then, in the spring of 2016, no one took it upon themselves to fix this and it was instead simply added to the queue of known bugs.
This RFC details (among other things) how libraries should refuse to resolve .onion host names using the regular means in order to avoid the privacy leak.
After having stewed in the known bugs list for almost seven years, it was again picked up in 2023, a pull request was authored, and when curl 8.1.0 shipped on May 17 2023 curl refused to resolve .onion hostnames.
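The kind of check the RFC asks for can be sketched in a few lines of Python. This is only an illustration of the idea, not curl’s actual implementation; both function names and the resolver callback are hypothetical.

```python
def is_onion_hostname(hostname: str) -> bool:
    """Return True if hostname is in the .onion special-use domain."""
    # Tolerate a single trailing root dot, as in "example.onion.".
    name = hostname[:-1] if hostname.endswith(".") else hostname
    name = name.lower()
    return name == "onion" or name.endswith(".onion")


def resolve_with_filter(hostname: str, resolver) -> list:
    """Refuse local resolution of .onion names; otherwise call the resolver."""
    if is_onion_hostname(hostname):
        raise ValueError(f"not resolving .onion name locally: {hostname}")
    return resolver(hostname)
```

The filter only guards the local name resolve. A name that is handed to a SOCKS5 proxy for resolving never passes through it.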
Tor still works, remember?
Since users are expected to connect using SOCKS5 and hand over the hostname to the proxy, the above-mentioned refusal to resolve a .onion address did not break the normal Tor use cases with curl.
Turns out there is a group of people who run transparent proxies that automatically “catch” all local traffic and redirect it over Tor. They have a local DNS server that can resolve .onion host names and they intercept outgoing traffic to instead tunnel it through Tor.
With this setup, curl no longer works: it will not send .onion addresses to the local resolver because RFC 7686 tells us we should not.
curl of course does not know when it runs in a presumed safe and deliberate transparent proxy network or when it does not. When a leak is not a leak or when it actually is a leak.
torsocks
A separate way to access Tor is to use the torsocks tool. Torsocks allows you to use most applications in a safe way with Tor. It ensures that DNS requests are handled safely and explicitly rejects any traffic other than TCP from the application you’re using.
You run it like
torsocks curl https://example.com
Because of curl’s new .onion filtering, the above command line works fine for “normal” hostnames but no longer for .onion hostnames.
Arguably, this is less of a problem because when you use curl you typically don’t need to use torsocks since curl has full SOCKS support natively.
Option to disable the filter?
In the heated discussion thread we are told repeatedly how silly we are who block .onion name resolves – exactly in the way the RFC says, the RFC that had the backing and support from the Tor project itself. There are repeated cries for us to add ways to disable the filter.
I am of course sympathetic with the users whose use cases now broke.
A few different ways to address this have been proposed, but the problem is difficult: how would curl or a user know whether it is fine to leak a name? Adding a command line option to say it is okay to leak would just mean that some scripts would use that option and users would run them in the wrong conditions. Your evil malicious neighbors who “help out” would just add that option when they convince their victims to run an innocent looking curl command line.
The fact that several of the louder voices show abusive tendencies in the discussion of course makes these waters even more challenging to maneuver.
Future
I do not yet know how or where this lands. The filter has now been in effect in curl for a year. Nothing is forever, we keep improving. We listen to feedback and we are of course eager to make sure curl remains an awesome tool and library also for content over Tor.
I have held back on writing anything about AI or how we (not) use AI for development in the curl factory. Now I can’t hold back anymore. Let me show you the most significant effect of AI on curl as of today – with examples.
Bug Bounty
Having a bug bounty means that we offer real money in rewards to hackers who report security problems. The chance of money attracts a certain amount of “luck seekers”. People who basically just grep for patterns in the source code or maybe at best run some basic security scanners, and then report their findings without any further analysis in the hope that they can get a few bucks in reward money.
We have run the bounty for a few years by now, and the rate of rubbish reports has never been a big problem. The rubbish reports have typically also been very easy and quick to detect and discard. They have rarely caused any real problems or wasted our time much. A little like the most stupid spam emails.
Our bug bounty has resulted in over 70,000 USD paid in rewards so far. We have received 415 vulnerability reports. Out of those, 64 were ultimately confirmed security problems. 77 of the reports were informative, meaning they typically were bugs or similar. Making 66% of the reports neither a security issue nor a normal bug.
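The 66% figure follows directly from the numbers above. A small Python sketch of the arithmetic, with a made-up function name:

```python
def neither_share(total: int, confirmed: int, informative: int) -> tuple:
    """Return (count, fraction) of reports that were neither kind."""
    neither = total - confirmed - informative
    return neither, neither / total


if __name__ == "__main__":
    count, fraction = neither_share(total=415, confirmed=64, informative=77)
    print(f"{count} of 415 reports ({fraction:.0%}) were neither")
```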
Better crap is worse
When reports are made to look better and to appear to have a point, it takes a longer time for us to research and eventually discard them. Every security report has to have a human spend time to look at it and assess what it means.
The better the crap, the longer time and the more energy we have to spend on the report until we close it. A crap report does not help the project at all. It instead takes away developer time and energy from something productive. Partly because security work is considered one of the most important areas so it tends to trump almost everything else.
A security report can take away a developer from fixing a really annoying bug, because a security issue is always more important than other bugs. If the report turned out to be crap, we did not improve security and we lost time we could have spent fixing bugs or developing a new feature. Not to mention how it drains you of energy having to deal with rubbish.
AI generated security reports
I realize AI can do a lot of good things. As any general purpose tool it can also be used for the wrong things. I am also sure AIs can be trained and ultimately get used even for finding and reporting security problems in productive ways, but so far we have yet to find good examples of this.
Right now, users seem keen on using the current set of LLMs, throwing some curl code at them and then passing on the output as a security vulnerability report. What makes it a little harder to detect is of course that users copy and paste and include their own language as well. The entire thing is not exactly what the AI said, but the report is nonetheless crap.
Detecting AI crap
Reporters are often not totally fluent in English and sometimes their exact intentions are hard to understand at once and it might take a few back and forths until things reveal themselves correctly – and that is of course totally fine and acceptable. Language and cultural barriers are real things.
Sometimes reporters use AIs or other tools to help them phrase themselves or translate what they want to say. As an aid to communicate better in a foreign language. I can’t find anything wrong with that. Even reporters who don’t master English can find and report security problems.
So: just the mere existence of a few give-away signs that parts of the text were generated by an AI or a similar tool is not an immediate red flag. It can still contain truths and be a valid issue. This is part of the reason why a well-formed crap report is harder and takes longer to discard.
Exhibit A: code changes are disclosed
In the fall of 2023, I alerted the community about a pending disclosure of CVE-2023-38545. A vulnerability we graded severity high.
The day before that issue was about to be published, a user submitted this report on Hackerone: Curl CVE-2023-38545 vulnerability code changes are disclosed on the internet
That sounds pretty bad and would have been a problem if it actually was true.
The report however reeks of typical AI style hallucinations: it mixes and matches facts and details from old security issues, creating and making up something new that has no connection with reality. The changes had not been disclosed on the Internet. The changes that actually had been disclosed were for previous, older, issues. As intended.
In this particular report, the user helpfully told us that they used Bard to find this issue. Bard being a Google generative AI thing. It made it easier for us to realize the craziness, close the report and move on. As can be seen in the report log, we did not have to spend much time on researching this.
Exhibit B: Buffer Overflow Vulnerability
A more complicated issue, less obvious, done better but still suffering from hallucinations. Showing how the problem grows worse when the tool is better used and better integrated into the communication.
On the morning of December 28 2023, a user filed this report on Hackerone: Buffer Overflow Vulnerability in WebSocket Handling. It was morning in my time zone anyway.
Again this sounds pretty bad just based on the title. Since our WebSocket code is still experimental, and thus not covered by our bug bounty, it helped me keep a relaxed attitude when I started looking at this report. It was filed by a user I never saw before, but their “reputation” on Hackerone was decent – this was not their first security report.
The report was pretty neatly filed. It included details and was written in proper English. It also contained a proposed fix. It did not stand out as wrong or bad to me. It appeared as if this user had detected something bad and as if the user understood the issue enough to also come up with a solution. As far as security reports go, this looked better than the average first post.
In the report you can see my first template response informing the user their report had been received and that we will investigate the case. When that was posted, I did not yet know how complicated or easy the issue would be.
Nineteen minutes later I had looked at the code, not found any issue, read the code again and then again a third time. Where on earth is the buffer overflow the reporter says exists here? Then I posted the first question asking for clarification on where and how exactly this overflow would happen.
After repeated questions and numerous hallucinations I realized this was not a genuine problem and on the afternoon that same day I closed the issue as not applicable. There was no buffer overflow.
I don’t know for sure that this set of replies from the user was generated by an LLM but it has several signs of it.
Ban these reporters
On Hackerone there is no explicit “ban the reporter from further communication with our project” functionality. I would have used it if it existed. Researchers get their “reputation” lowered when we close an issue as not applicable, but that is a very small nudge when only done once in a single project.
I have requested better support for this from Hackerone. Update: this function exists, I just did not look at the right place for it…
Future
As these kinds of reports will become more common over time, I suspect we might learn how to trigger on generated-by-AI signals better and dismiss reports based on those. That will of course be unfortunate when the AI is used for appropriate tasks, such as translation or just language formulation help.
I am convinced there will pop up tools using AI for this purpose that actually work (better) in the future, at least part of the time, so I cannot and will not say that AI for finding security problems is necessarily always a bad idea.
I do however suspect that if you just add an ever so tiny (intelligent) human check to the mix, the use and outcome of any such tools will become so much better. I suspect that will be true for a long time into the future as well.
I have no doubts that people will keep trying to find shortcuts even in the future. I am sure they will keep trying to earn that quick reward money. As with email spam, the cost of this ends up at the receiving end. The ease of use and wide access to powerful LLMs is just too tempting. I strongly suspect we will get more LLM generated rubbish in our Hackerone inboxes going forward.
You know I spend all my days working on curl and related matters. I also spend a lot of time thinking about the project; like how we do things and how we should do things.
The security angle of this project is one of the most crucial ones and an area where I spend a lot of time and effort. Dealing with and assessing security reports, handling the verified actual security vulnerabilities and waving off the imaginary ones.
150 vulnerabilities
The curl project recently announced its 150th published security vulnerability and its associated CVE. 150 security problems through a period of over 25 years in a library that runs in some twenty billion installations? Is that a lot? I don’t know. Of course, the rate of incoming security reports is much higher in modern days than it was decades ago.
150 curl vulnerabilities, when they were introduced and when they were fixed.
Out of the 150 published vulnerabilities, 60 were reported and awarded money through our bug-bounty program. In total, the curl bug-bounty has as of today paid 71,400 USD to good hackers and security researchers. The monetary promise is an obvious attraction to researchers. I suppose the fact that curl over time has grown to run in even more places, on more architectures and in even more systems also increases people’s interest in looking into and scrutinizing our code. curl is without doubt one of the world’s most widely installed software components. It requires scrutiny and control. Do we hold up our promises?
The amount of money paid by curl in bug bounties since 2018
curl is a C program running in virtually every internet-connected device you can think of.
Trends
Another noticeable trend among the reports the last decade is that we are getting way more vulnerabilities reported with severity level low or medium these days, while historically we got more ones rated high or even critical. I think this is partly because of the promise of money but also because of a generally increased and sharpened mindset about security. Things that in the past would get overlooked and considered “just a bug” are nowadays more likely to get classified as security problems. Because we think about the problems wearing our security hats much more now.
curl vulnerabilities, distribution of reports low/medium vs high/critical
Memory-safety
Every time we publish a new CVE people will ask about when we will rewrite curl in a memory-safe language. Maybe that is good, it means people are aware and educated on these topics.
I will not rewrite curl. That covers all languages. I will however continue to develop it, also in terms of memory-safety. This is what happens:
1. We add support for more third party libraries written in memory-safe languages. Like the quiche library for QUIC and HTTP/3 and rustls for TLS.
2. We are open to optionally supporting a separate library instead of native code, where that separate library could be written in a memory-safe language. Like how we work with hyper.
3. We keep improving the code base with helper functions and style guides to reduce risks in the C code going forward. The C code is likely to remain with us for a long time forward, no matter how much the above-mentioned areas advance. Because it is the mature choice and for many platforms still the only choice. Rust is cool, but the language, its ecosystem and its users are rookies and newbies for system library level use.
Steps 1 and 2 above mean that over time, the total amount of executable code in curl gradually can become more and more memory-safe. This development is happening already, just not very fast. Which is also why number 3 is important and is going to play a role for many years to come. We move forward in all of these areas at the same time, but with different speeds.
Why no rewrite
Because I’m not an expert on Rust. Someone else would be a much more suitable person to lead such a rewrite. In fact, we could suspect that the entire curl maintainer team would need to be replaced, since we are all old C developers and maybe not the most suitable to lead and take care of a twin project written in Rust. Dedicated long-term maintainer internet transfer library teams do not grow on trees.
Because rewriting is an enormous project that will introduce numerous new problems. It would take years until the new thing would be back at a similar level of rock solid functionality as curl is now.
During the initial years of the port’s “beta period”, the existing C project would continue on and we would have two separate branches to maintain and develop, more than doubling the necessary work. Users would stay on the first version until the second is considered stable, which will take a long time since it cannot become stable until it gets a huge amount of users to use it.
There is quite frankly very little (if any) actual demand for such a rewrite among curl users. The rewrite-it-in-rust mantra is mostly repeated by rust fans and people who think this is an easy answer to fixing the share of security problems that are due to C mistakes. Typically, the kind who has no desire or plans to participate in said venture.
C is unsafe and always will be
The C programming language is not memory-safe. Among the 150 reported curl CVEs, we have determined that 61 of them are “C mistakes”. Problems that most likely would not have happened had we used a memory-safe language. About 40.7% of the vulnerabilities in curl reported so far could have been avoided by using another language.
Rust is virtually the only memory-safe language that is starting to become viable. C++ is not memory-safe and most other safe languages are not suitable for system/library level use. Often because of how they fail to interface well with existing C/C++ code.
By June 2017 we had already made 51 C mistakes that ended up as vulnerabilities and at that time Rust was not a viable alternative yet. Meaning that for a huge portion of our problems, Rust was too late anyway.
The curl vulnerabilities that are C mistakes and those that are not, per date of introduction
40 is not 70
In lots of online sources people repeat that when writing code with C or C++, the share of security problems due to lack of memory-safety is in the range 60-70% of the flaws. In curl, looking per the date of when we introduced the flaws (not reported date), we were never above 50% C mistakes. Looking at the flaw introduction dates, it shows that this was true already back when the project was young so it’s not because of any recent changes.
If we instead count the share per report-date, the share has fluctuated significantly over time, as then it has depended on when people have found which problems. In 2010, the reported problems caused by C mistakes were at over 60%.
The curl vulnerabilities that are C mistakes and those that are not, per date of reporting
Of course, curl is a single project and not a statistical proof of any sort. It’s just a 25 year-old project written in C with more knowledge of and introspection into these details than most other projects.
Additionally, the share of C mistakes is slightly higher among the issues rated with higher severity levels: 51% (22 of 43) of the issues rated high or critical were due to C mistakes.
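For those who want to check the shares, here is a small Python sketch using only the counts stated in this post. The low/medium figure is derived by subtraction and assumes that every one of the 150 CVEs has a severity rating, which is my assumption for the example rather than a number given above.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


TOTAL_CVES = 150
C_MISTAKES = 61
HIGH_OR_CRITICAL = 43
C_MISTAKES_HIGH_OR_CRITICAL = 22

# Derived by subtraction; assumes every CVE carries a severity rating.
LOW_OR_MEDIUM = TOTAL_CVES - HIGH_OR_CRITICAL
C_MISTAKES_LOW_OR_MEDIUM = C_MISTAKES - C_MISTAKES_HIGH_OR_CRITICAL

if __name__ == "__main__":
    print(f"all issues:    {percent(C_MISTAKES, TOTAL_CVES):.1f}%")
    print(f"high/critical: {percent(C_MISTAKES_HIGH_OR_CRITICAL, HIGH_OR_CRITICAL):.1f}%")
    print(f"low/medium:    {percent(C_MISTAKES_LOW_OR_MEDIUM, LOW_OR_MEDIUM):.1f}%")
```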
Help curl authors do better
We need to make it harder to write bad C code and easier to write correct C code. I do not only speak of helping others, I certainly speak of myself to a high degree. Almost every security problem we ever got reported in curl, I wrote. Including most of the issues caused by C mistakes. This means that I too need help to do right.
I have tried to learn from past mistakes and look for patterns. I believe I may have identified a few areas that are more likely than others to cause problems:
1. strings without length restrictions, because the length might end up very long in edge cases which risks causing integer overflows which leads to issues
2. reallocs, in particular without length restrictions and 32 bit integer overflows
3. memory and string copies, following a previous memory allocation, maybe most troublesome when the boundary checks are not immediately next to the actual copy in the source code
4. perhaps this is just a subset of (3), but strncpy() is by itself complicated because of the padding and its not-always-null-terminating functionality
We try to avoid the above mentioned “problem areas” like this:
1. We have general maximum length restrictions for strings passed to libcurl’s API, and we have set limits on all internally created dynamic buffers and strings.
2. We avoid reallocs as far as possible and instead provide helper functions for doing dynamic buffers. In fact, avoiding all sorts of direct memory allocations helps.
3. Many memory copies cannot be avoided, but if we can use a pointer and length instead that is much better. If we can snprintf() a target buffer that is better. If not, try to do the copy close to the boundary check.
4. Avoid strncpy(). In most cases, it is better to just return error on too long input anyway, and then instead do plain strcpy or memcpy with the exact amount. Ideally of course, just using a pointer and the length is sufficient.
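The policy behind these points can be illustrated with a tiny growable buffer that has a hard size limit and keeps the boundary check right next to the copy. This is a Python sketch of the idea only: the class and its methods are hypothetical, Python has none of C’s raw memory hazards, and this is not curl’s actual helper API.

```python
class BoundedBuffer:
    """Growable byte buffer with a hard upper limit set at creation."""

    def __init__(self, max_len: int) -> None:
        if max_len <= 0:
            raise ValueError("max_len must be positive")
        self._max_len = max_len
        self._data = bytearray()

    def append(self, chunk: bytes) -> None:
        """Append chunk, or raise without modifying the buffer if too long."""
        # The boundary check sits immediately next to the copy it guards.
        if len(self._data) + len(chunk) > self._max_len:
            raise OverflowError("buffer would exceed its maximum length")
        self._data += chunk

    def __len__(self) -> int:
        return len(self._data)

    def getvalue(self) -> bytes:
        """Return a copy of the buffer contents."""
        return bytes(self._data)
```

Too long input becomes an explicit error instead of a silent truncation, which is the same choice as returning an error rather than reaching for strncpy().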
Over time, we use fewer “difficult functions” per line of libcurl code even when the code size grows.
These helper functions and reduction of “difficult functions” in the code are not silver bullets. They will not magically make us avoid future vulnerabilities, they should just ideally make it harder to make security mistakes. We still need a lot of reviews, tools and testing to verify the code.
Clean code
Already before we created these helpers, we had gradually made the code style, and the requirements to follow it, stricter. When the source code looks and feels coherent, consistent, as if written by a single human, it becomes easier to read. Easier to read becomes easier to debug and easier to extend. Harder to make mistakes in.
To help us maintain a consistent code style, we have a tool and a CI job that runs it, so that obvious style mistakes or conformance problems end up as distinct red lines in the pull request.
Source verification
Together with the strict style requirement, we also of course run many compilers with as many picky compiler flags enabled as possible in CI jobs, we run fuzzers, valgrind, address/memory/undefined behavior sanitizers and we throw static code analyzers on the code – in a never-ending fashion. As soon as one of the tools gives a warning or indicates that something could perhaps be wrong, we fix it.
Of course also to verify the correct functionality of the code.
Data for this post
All data and numbers I speak of in this post are publicly available in the curl git repositories: curl and curl-www. The graphs come from the curl web site dashboard. All graph code is available.
I’ve shown you email examples many times before. Today I received this. I don’t know this person. Clearly a troubled individual. I suspect she found my name and address somewhere and then managed to put me somewhere in the middle of the conspiracy against her.
The entire mail is written in a single paragraph and the typos are saved as they were written. It is a little hard to penetrate, but here it is:
From: Lindsay
Thank you for making it so easy for me to see that you have hacked into 3 of my very own devices throughout the year. I’m going to be holding onto all of my finds that have your name all over it and not by me because I have absolutely no reason to hack my own belongings. I will be adding this to stuff I have already for my attorney. You won’t find anything on my brand new tablet that you all have so kindly broken into and have violated my rights but have put much stress on myself as well. Maybe if you would have came and talked to me instead of hacking everything I own and fallow me to the point of a panic attack because I suffer from PTSD I might have helped you. I cannot help what my boyfriend does and doesn’t do but one ting I was told by the bank is that they would not let me talk for him so I can’t get involved. He has had his car up for pickup for months but I’m guessing that the reason they won’t pick the car up in the street right where it has sat for months waiting is because I’ve probably see every single driver that has or had fallowed me. My stress is so terrible that when I tell him to call the bank over and over again he does and doesn’t get anywhere and because of my stress over this he gets mad and beats me or choaks me. I have no where to go at the moment and I’m not going to sleep on the streets either. So if you can kindly tell the repo truck to pick up the black suv at his dad’s house in the street the bank can give them the number it would be great so I no longer have to deal with people thinking that they know the whole story. But really I am suffering horrificly. 
I’m not a mean person but imagine not knowing anything about what’s going on with your spouse and then finding out they didn’t pay the car payment and so being embarrassed about it try to pay for it yourself and they say no I have it only to find out that he did it for a second time and his dad actually was supposed to pay the entire thing off but instead he went down hill really fast and seeing the same exact people every day everywhere you go and you tell your spouse and they don’t believe you and start calling you wicked names like mine has and then from there every time my ptsd got worse from it happening over and over again and he says you’re a liar and he’s indenial about it and because I don’t agree with him so I get punched I get choked and now an broken with absolutely no one but God on my side.how would you feel if it was being done to you and people following you and your so angry that alls you do is yell at people anymore and come off as a mean person when I am not? I don’t own his car that he surrendered I don’t pay his bills he told me to drive it and that’s it.i trusted a liar and an abuser. I need someone to help other than my mom my attorney and eventually the news if everyone wants to be cruel to me I’m going to the news for people taking my pictures stalking me naibors across the street watching and on each side of the house and the school behind. It isn’t at all what you all think it is I want someone to help get the suv picked up not stalked. How would you feel if 5 cities were watching every single move? I am the victim all the way around and not one nabor has ever really taken the time to get to know me. I’m not at all a mean person but this is not my weight to carry. I have everyone on camera and I will have street footage pulled and from each store or gestation I go to. I don’t go anywhere anymore from this and I’m the one asking for help. 
Their was one guy who was trying to help me get in touch with the tow truck guy and I haven’t seen him since and his name is Antonio. He was going to help me. I have been trying to to the right thing from the start and yet you all took pleasure in doing rotten mean things to me and laughing about it. I want one person to come help me since I can’t talk with the bank to get his suv picked up and I won’t press charges on the person that helps nor onthe tow truck guy either.