CVSS is short for Common Vulnerability Scoring System and is, according to Wikipedia, a technical standard for assessing the severity of vulnerabilities in computing systems.
Typically you use an online CVSS calculator, click a few checkboxes and radio buttons and then you magically get a number from 0 to 10. There are also different versions of CVSS.
Every CVE filed to MITRE is supposed to have a CVSS score set. CVEs that are registered without this information will get “amended” by an ADP (Authorized Data Publisher) that thinks of it as its job. In the past NVD did this. Nowadays CISA does it. More on this below.
Let’s say you write a tool and library that make internet transfers. They are used literally everywhere, in countless environments and with an almost impossible number of different build combinations, target operating systems and CPU architectures. Let’s call it curl.
When you find a theoretical security problem in this product (theoretical because most problems are never actually seen exploited), how severe is it? The CVSS calculation has a limited set of input factors that tend to result in a fairly high number for a network product. What if we can guess that the affected feature is only used by a few, or that the problem only affects an unusual platform? Not included.
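To make the "limited set of input factors" concrete, here is a minimal sketch of the CVSS 3.1 base score arithmetic, restricted to the Scope: Unchanged case and using the metric weights from the published specification. The function and table names are made up for this illustration, and the vectors scored at the bottom are generic examples, not the vectors anyone assigned to a particular CVE.

```python
import math

# Metric weights from the CVSS 3.1 specification (Scope: Unchanged only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}   # Attack Vector
AC = {"L": 0.77, "H": 0.44}                        # Attack Complexity
PR = {"N": 0.85, "L": 0.62, "H": 0.27}             # Privileges Required
UI = {"N": 0.85, "R": 0.62}                        # User Interaction
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}             # Conf./Integrity/Avail.


def roundup(value):
    """Round up to one decimal, as defined in the CVSS 3.1 specification."""
    as_int = round(value * 100000)
    if as_int % 10000 == 0:
        return as_int / 100000.0
    return (math.floor(as_int / 10000) + 1) / 10.0


def base_score(av, ac, pr, ui, c, i, a):
    """Base score for a Scope: Unchanged vector (illustrative helper)."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    return roundup(min(impact + exploitability, 10))


# A network-reachable flaw with no privileges or user interaction needed:
print(base_score("N", "L", "N", "N", "H", "H", "H"))  # 9.8
# The same reachability with only a low confidentiality impact:
print(base_score("N", "L", "N", "N", "L", "N", "N"))  # 5.3
```

Nothing in those seven inputs says how many installations use the affected feature or which platforms are hit, which is why "network, low complexity, no privileges" alone pushes a score so high.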
The CVSS scoring is really designed for when you know exactly when and how the product is used and how an exploit of the flaw affects it. Then it might at least work. For a generic code base shipped in a tarball that runs in more than twenty billion installations it does less so.
If you look around you can easily find numerous other (and longer) writings about the problems and challenges with CVSS. We are not alone in thinking this.
CVSS is used
At the same time, it seems the popularity of security scanners has increased significantly over the last few years. The kind of products that scan your systems checking for vulnerable products and show you big alerts and warnings when they find them.
The kind of program that looks for a product, figures out a version number and then shouts if it finds a registered CVE for that product and version with a CVSS score above a certain threshold.
The kind of product that indirectly tricks users into deleting operating system components to silence these alerts. We even hear of people who have contractual agreements that say they must address these alerts within N business days or face consequences.
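As a rough sketch of that scanner logic (not the code of any real product), the whole decision can fit in a few lines. The `CveRecord` type, the `scan` function, the placeholder CVE id, the version string and the 7.0 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CveRecord:
    cve_id: str               # placeholder id, not a real CVE
    product: str
    affected_versions: tuple  # exact version strings, for simplicity
    cvss: float               # whatever score ended up in the database


def scan(installed, records, threshold=7.0):
    """Return the CVE ids a naive scanner would alert on.

    `installed` maps product name to the detected version string.
    """
    alerts = []
    for rec in records:
        version = installed.get(rec.product)
        if version is None:
            continue  # product not found on this system
        if version in rec.affected_versions and rec.cvss >= threshold:
            alerts.append(rec.cve_id)
    return alerts


installed = {"curl": "8.0.0"}  # hypothetical detected version

scored_critical = [CveRecord("CVE-0000-0001", "curl", ("8.0.0",), 9.1)]
scored_low = [CveRecord("CVE-0000-0001", "curl", ("8.0.0",), 3.4)]

print(scan(installed, scored_critical))  # ['CVE-0000-0001']
print(scan(installed, scored_low))       # []
```

The bug and the installed version are identical in both runs. The only thing that changes the outcome is the number someone typed into the record.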
Just days ago I was contacted by users on macOS who were concerned about a curl CVE that their scanner found in the libcurl version shipped by Apple. Was their tool right or wrong? Do you think anyone involved in that process actually can tell? Do you think Apple cares?
curl skips CVSS
In the curl project we have given up trying to use CVSS to get a severity score and associated severity level.
In the curl security team we instead work hard to put all our knowledge together and give a rough indication of the severity by assigning each issue one of four levels: low, medium, high, critical.
We believe that because we are not tied to any (flawed and limited) calculator and because we are intimately familiar with the code base and how it is used, we can assess and set a better security severity this way. It serves our users better.
Part of our reason to still use these four levels is that our bug-bounty’s reward levels are based on the level.
As a comparison, the Linux kernel does not even provide that coarse-grained indication, based on similar reasoning to why we don’t provide the numeric scores.
This is not treated well
The curl project is a CNA, which means that we reserve and publish our own CVE Ids to the CVE database. There is no middle man interfering and in fact no one else can file curl CVE entries anymore without our knowledge and us having a say in it. That’s good.
However, the CVE system itself is built on the idea that every flaw has a CVSS score. When someone like us creates CVE entries without scores, that leaves something that apparently is considered a gaping sore in the system that someone needs to “fix”.
Who would “fix” this?
Authorized Data Publishers
A while ago a new role called ADP was added to the CVE ecosystem. This job was previously done a little on the side but roughly the same way by NVD, who would get all the CVEs, edit them and then publish them all themselves to the world with their additions. And the world really liked that and used the NVD database.
However NVD kind of drowned themselves by this overwhelming work and it has instead been replaced by CISA who is an “ADP” and is thus allowed to enrich CVE entries in the database that they think need “improvement”.
The main thing they seem to detect and help “fix” is the lack of CVSS in published CVE entries. Like every single curl CVE because we don’t participate in the CVSS dance.
No clues but it must get a score
Exactly in the same way this system was broken before when NVD did it, this new system is broken when CISA does it.
I don’t have the numbers for exactly how many CVE entries they do this “enrichment” for (there were over 40,000 CVEs last year but a certain amount of them had CVSS filled in by their CNAs). I think it is safe to assume that the volume is high, and since the CVEs are filed for products in all sorts of categories it is certainly impossible for CISA to have experts in the many products and technologies each CVE describes and affects.
So: given limited time and having no real clue what the issues are about, the individuals in this team click some buttons in a CVSS calculator, get a score, a severity and then (presumably) quickly move on to the next issue. And the next. And the next. In a never-ending stream of incoming security issues.
How on earth does anyone expect them to get this right? I mean sure, in some or perhaps even many cases they might get close because of luck, skill or something but the system is certainly built in a way that just screams: this will end up crazy wrong every so often.
A recent example
At the end of 2024 I was informed by friends that several infosec-related websites posted about a new curl-related critical security problem. Since we have not announced any critical security problems since 2013, that of course piqued my interest so I had a look.
It turned out that CISA had decided that CVE-2024-11053 should be awarded a CVSS 9.1 score: CRITICAL, and now scanners and news outlets had figured that out. Or would very soon.
The curl security team had set the severity to LOW because of the low risk and special set of circumstances that are a precondition for the problem. Go read it yourself – the fine thing with CVEs for Open Source products is that the source, the fix and everything is there to read and inspect as much as we like.
The team of actual experts who know this code and perfectly understand the security problem says LOW. The team at CISA overrides that and insists that we are all wrong and that this problem risks breaking the Internet. Because we apparently need a CVSS at all costs.
A git repository
One positive change that the switch to CISA from NVD brought is that they now host their additional data in a GitHub repository. Once I was made aware of this insane 9.1 score, I took time out of my Sunday afternoon with my family and made a pull-request there urging them to at least lower the score to 5.3. That was a score I could get the calculator to tell me.
I wanted to have this issue sorted and stomped down as quickly as possible, to reduce the risk that security scanners everywhere would soon start alerting on this and we would get overloaded with queries from concerned and worried users.
It’s not like CISA gets overloaded by worried users when they do this. Their incompetence here puts a load on no one else but the curl project. But sure, they got their CVSS added.
After my pull request it took less than ninety minutes for them to update the curl records. Without explanation, with no reference to my PR, they now apparently consider the issue to be CVSS 3.4.
I’m of course glad it is no longer marked critical. I think you all understand exactly how arbitrary and random this scoring approach is.
A problem with the initial bad score getting published is of course that a certain number of websites and systems are really slow or otherwise bad at updating that information after they initially learned about the critical score. There will linger websites out there speaking about this “critical” curl bug for a long time now. Thanks CISA!
Can we avoid this?
In the curl security team we have discussed setting “fixed” (fake) scores on our CVE entries just in order to prevent CISA or anyone else from ruining them, but we have decided not to since that would be close to lying about them and we actually work fiercely to make sure we have everything correct and meticulously described.
So no, since we do not do the CVSS dance, we unfortunately will continue having CISA do this to us.
Stop mandatory CVSS?
I am of course advocating strongly within the CNA ecosystem that we should be able to stop CISA from doing this, but I am just a small cog in a very large machine. A large machine that seems to love CVSS. I do not expect to have much success in this area anytime soon.
And no, I don’t think switching to CVSS 4.0 or updates to this system is ultimately going to help us. The problem is grounded in the fact that a single one-dimensional score is just too limited. Every user or distributor of the project should set scores for their different use cases. Maybe even different ones for different cases. Then it could perhaps work.
But I’m not in this game for any quick wins. I’m on the barricades for better (Open Source) security information, and to stop security misinformation. Ideally for the wider ecosystem, because I think we are far from alone in this situation.
The love of CVSS is strong and there is a lot of money involved that is based on it and relies on it.
Minor update
After posting this, I got confirmation that the Go Security team does what we do and has the same problems. Filippo Valsorda told me on Bluesky. Just to show that this is a common pattern.
Update two
Some fourteen hours after I posted this blog post and it spread around the world, my enrichment PR to CISA I mentioned above got this added comment:
While it is good to be recognized, it does not feel like it will actually address the underlying problem here.
Update three
What feels like two hundred persons have pointed out that the CVSS field is not mandatory in the CVE records. It is a clarification that does not add much. The reality is that users seem to want the scores so bad that CISA will add CVSS nonetheless, mandatory or not.
Nobody cares about security anymore (news about different incidents is a daily staple)—what people care about is “I’d like to show that I had my process in place and issues addressed.”
More than a want – orgs NEED to be able to show they did something to make their insurer happy, and to show in court when they inevitably get hacked.
Is there some way to get to the management at CISA and convince them to knock it off for you guys? I respect you not wanting to put fake values in the CVSS scores, but at some point the amount of extra, unplanned work you do fighting the wrong ones may not be worth it.
The stupidest thing of all is that it’s all just security theatre.
Orgs may need to be able to show they did something, but they don’t have to prove that it’s effective or has any kind of impact.
Source: almost every security “remediation” tool out there that just scans and reports a score without actually making any changes.
I absolutely love the commit messages in
“updated data”
ah thank you, so edifying
Given you have no control over how your project is used, what gives you confidence that your guesses are even vaguely correct?
I think the problem for curl is that people likely have widely differing use cases. For some a bug may well be critical, for some the issue will be zero. CVSS is supposed to give a single score.
One approach could be to add a vector of “deviance from out of the box deployment”. The more argument flags you need, the lower the score.
But is this just bodging the impossible? Maybe.
Because there are standard use cases and very edge cases.
If you drive on an Interstate Highway System, your vehicle doesn’t have to worry about large boulders (unless they’re the size of small boulders), operating under 30″ of water, or many other hazards.
If you choose to take your vehicle off of the main roads and overland, then you suddenly have a lot more risks, like rocks flipping up and puncturing your gas tank.
Should all vehicles be considered dangerous because they inadequately protect against rolling down a cliff? Or should every single operator of a motor vehicle be terrified about the safety of their vehicle because it’s very vulnerable to cliff-rolling activities?
As folks with expertise in things like network communications, I trust the curl team to have a pretty good idea about how most of curl and the Internet works, most of the time.
And I don’t have any evidence that they’ve been wrong on the side of too low about any of the impacts of their CVEs. So… yes, people are going to do stupid things, and they are going to (ab)use your tools in horrifying ways.
But `-k` doesn’t automatically make curl itself insecure, even if thousands or millions of requests every day are vulnerable to MITM attacks because a silly or stupid choice was made to use the flag.
Thanks for all you do, Daniel. This sort of theater is unbelievably annoying, but I suddenly had the fear that CISA would/will use puffed-up scores to keep software they don’t particularly like out of the United States. These are silly, terrifying times.
The same lack of competence problem applies elsewhere too, e.g. backports on “stable” distros or ELTS versions or similar projects designed to let very large corporations continue to pretend that the world does not change around them as quickly as it does.
Do you honestly think a stable distro maintainer who maintains dozens or hundreds of packages has decent grasp of every single one of them on par with a core developer in the relevant upstream project? Do you think each of those backports is tested to the same degree (or better) as a new upstream release that goes through multiple publicly available release candidates?
None of this is about actually fixing problems, it is all about the “nobody ever got fired for buying IBM” effect, i.e. people in large corporations playing internal office politics and avoiding blame by using “best practices” even if those practices are horribly broken, as long as everyone else is using the same practice they won’t stick out when things go pear-shaped.
In these environments you don’t get blamed for doing something wrong or rewarded for doing something right, you get rewarded for sticking to the process regardless of the result.
Absolutely everything is a mess in this circus. The fundamental problem is that some companies make a huge amount of money selling fear to incompetent CISOs, so bugs are the raw material of a real domain of the world’s economy and this creates incentives for lots of organizations/companies/would-be hackers etc to participate in that parade.
For those who are having difficulties figuring out how bad such CVSS scores can be, the exercise is simple: pick a random CVE published on any site and affecting software you don’t use. Then read the description and try to imagine where to click in the calculator. If for each click you think “oh, probably not”, you’ll get a low score. If you think “well, possibly”, you’ll get the highest score. In both cases you have no idea about your choices, and that’s exactly what’s happening behind closed doors to feed bogus but expensive security scanners. This must really stop but I don’t see how it will.
While I 100% agree with Daniel, let’s take a different approach here:
“While many use only the CVSS Base score for determining severity, temporal and environmental scores also exist, to factor in availability of mitigations and how widespread vulnerable systems are within an organization, respectively. ” – Wikipedia
“The Base group represents the intrinsic qualities of a vulnerability that are constant over time and across user environments, the Temporal group reflects the characteristics of a vulnerability that change over time, and the Environmental group represents the characteristics of a vulnerability that are unique to a user’s environment.” – FIRST
CVSS has had Environmental and Temporal values since v1.
Companies with trained security peeps know that a vulnerability *IS SUPPOSED TO BE RE-SCORED* based on the environment.
The problem is, as usual, a training and education issue. There is a lack of qualified infosec staff, often just because of the size of the organization.
I’m usually against changing a system because people use it the wrong way.
I’m working on vulnerability scanning software, and I’m also very unhappy with the current way vulnerabilities are handled. The CVSS score is a terrible measure of severity, but it’s very often the only data point that can be extracted from CVEs in a reliable way, to try and make sense of the noise.
I was looking into whether I could apply a curl-specific fix by grabbing the CURL severity rating (low/medium/high/critical) in a structured way, and use that in place of CVSS where possible. I see the severity rating shows up in the official CURL CVE JSON data, in a database_specific field (
However, by the time it gets to the MITRE API, that data seems to be completely missing: . It looks like data gets “sanitized” (read: dropped) when it gets transmitted from the CNAs to the MITRE aggregated database?
This means we need to grab the CURL CVE data from the CNA directly, and somehow “merge” it with the MITRE API. I wonder if the CURL CNA could provide that severity in a way that would reach the MITRE database so it can be more easily picked up by vulnerability scanning software?
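A rough sketch of such a merge could look like the following. It assumes the curl-published records are OSV-style JSON objects with `id`, `aliases` and a `severity` key inside `database_specific`; that exact layout is an assumption here and should be checked against the real data. The function names are hypothetical, and the sample record is written inline rather than fetched from anywhere.

```python
def curl_severity_by_cve(curl_records):
    """Map CVE id -> curl's own severity level (lowercased).

    Assumes each record carries its level in
    record["database_specific"]["severity"] (assumed layout).
    """
    result = {}
    for rec in curl_records:
        level = rec.get("database_specific", {}).get("severity")
        if not level:
            continue
        # The CVE id may be the record id itself or listed as an alias.
        for ident in [rec.get("id", "")] + rec.get("aliases", []):
            if ident.startswith("CVE-"):
                result[ident] = level.lower()
    return result


def effective_severity(cve_id, aggregated_severity, curl_map):
    """Prefer the CNA's own level; fall back to the aggregated one."""
    return curl_map.get(cve_id, aggregated_severity)


# Inline sample in the assumed layout (not fetched from anywhere).
sample = [
    {
        "id": "CURL-CVE-2024-11053",
        "aliases": ["CVE-2024-11053"],
        "database_specific": {"severity": "Low"},
    }
]

curl_map = curl_severity_by_cve(sample)
print(effective_severity("CVE-2024-11053", "critical", curl_map))  # low
```

The fallback matters: for any CVE that is not a curl CVE, the scanner still only has the aggregated value to go on.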
I think the challenge here is that the SEVERITY that CVSS spits out in no way reflects the RISK in the real world. Severity and risk are not the same, and too many organizations make this false equivalency. In reality, I can have a critical severity vulnerability that represents little to no risk on a product. Neither the CVE program nor the NVD has a way to represent risk today.
Ideally, both severity and risk would be provided. Risk being a little more difficult and may vary depending on use cases, environment, configuration, etc. But multiple risk scores could be represented for a single vulnerability.
As a CNA, perhaps another angle is not to fight against CVSS (although I completely agree on its questionable usefulness) and instead champion “risk transparency”, or a way to communicate risk as an optional data element of a CVE record so that consumers can take that into account. There are many different ways to calculate risk, and even a few calculators that people can use. The OWASP Risk Rating methodology is flawed like most other risk methodologies, but there are many organizations that use it. Worth taking a look.
CVSS Score is not mandatory. You can consider alternate metrics called SSVC – As a “supplier” of software and you can provide your own evaluation either from a OR using your own “Supplier Decision Tree” – publish your tree and the way you arrive at your metrics. You can reuse the Decision Points defined here or make your own in your own “namespace” to bring about your “Decision”
Criticism is good. I clearly see that we can do better.
Criticism must yet not destroy the weak fundaments of a common movement to improve security. The article moves into this presumably popular direction.
We decided to fully embrace CVSS in our tooling and pair it with other metrics. Therefore we also publish one of these CVSS calculators as online tool:
@Karsten: if this blog post “destroys fundaments” merely by pointing out weaknesses and flaws in the systems and how they impact projects such as curl, then for sure the system is in an even worse state than I thought.
If you already have four levels, why not just define an equivalent score to each and send that? For example low is a 2, medium a 5, high an 8 and critical a 10.
@Ahmed: I already address that in the text: because we must provide the full CVSS calculator string as well then, meaning we need to fill in answers to all the questions, which implies “lying”.
Do you think EPSS can be used to make exploitation prediction better?
That’s why we developed new scoring system: XVRS;
Let me offer a perspective based on’s CVSS standards and practical implementation.
First, let’s clarify a fundamental point: CVSS Base scores (CVSS-B) are designed to be inputs, not final verdicts. They represent a standardized starting point for organizational risk assessment. Abandoning them is much like throwing out the thermometer because a fever reading isn’t the answer you wanted; you wanted a cure.
The goal isn’t perfect prediction – it’s better communication. Just as we don’t abandon speed limits because road conditions vary
Ever tried following a recipe that listed ingredients as ‘enough flour’? That’s the challenge security teams face when vulnerability reports abandon standardised metrics.
A constructive path forward might include:
Working within’s CVSS SIG – like contributing to industry standards rather than creating proprietary measurements that communicate “We’re smarter than everyone in that other camp that invited us”
Better communicate that you are using CVSS Base (CVSS-B) as a foundation – like using standard cook times in recipes while acknowledging that cooking times may vary.
CVSS vectors capture detailed technical insights that help organizations make informed decisions.
It is not that difficult to be both rational and helpful to end-users. Doing something speculative and bespoke rather than fact-based and standard is just hubris, and the effort spent on bespoke things could instead be channeled into fixing the perceived failures of the standard.
Security theatre takes second place in terms of risk to underestimating the problem.
So it seems like the problem that you have with CVSS isn’t actually a problem with CVSS at all.
You have a problem with CISA (Not affiliated with CVSS in any way) improperly scoring vulnerabilities as an ADP (once again, not determined by CVSS but by MITRE).
The reason you were not able to get a “lower score” is because you didn’t properly fill out the environmental metrics which help tell the greater story of the severity of the issue.
I don’t think your problem is with CVSS at all.
Hi Daniel. We’ve talked about this in the CVSS SIG. CVSS certainly isn’t perfect. We are actively discussing guidance about how library maintainers and vendors can provide scoring on a per-platform basis to capture impacts that are more tailored to different environments. I agree that it’s difficult to express one CVSS assessment that works for all platforms, but that’s how the ecosystem has developed. And, sometimes analysts get things wrong.
I encourage anyone to provide feedback to the CVSS SIG about ways we can improve the standard to work better for everyone. Please reach out.
@Nick: I don’t know how to fix this. I suspect it is not actually possible to “fix”. However we do it, a single score for a vulnerability is never going to work for every environment. THAT is the real problem, and that’s not really because of CVSS, but rather how it is used.
I would probably be happier if we could set a numerical score (without a CVSS format string) for our CVE Ids so that we could simply map our four levels to fixed numbers. And avoid having anyone “improve” them later. That too is not really a CVSS problem.
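As a sketch of what such a fixed mapping could look like, the numbers below are the ones suggested in an earlier comment (2, 5, 8 and 10), not values the curl project has adopted. The check against the CVSS v3 qualitative rating bands (Low 0.1 to 3.9, Medium 4.0 to 6.9, High 7.0 to 8.9, Critical 9.0 to 10.0) is there to show that each fixed number would land in the band with the same name. `LEVEL_TO_SCORE` and `cvss_rating` are hypothetical names.

```python
# Fixed numbers per curl severity level, as proposed in the comments.
LEVEL_TO_SCORE = {
    "low": 2.0,
    "medium": 5.0,
    "high": 8.0,
    "critical": 10.0,
}


def cvss_rating(score):
    """Qualitative rating band for a CVSS v3 score."""
    if score == 0:
        return "none"
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    if score < 9.0:
        return "high"
    return "critical"


for level, score in LEVEL_TO_SCORE.items():
    # Each fixed score falls inside the band named after its level.
    print(level, score, cvss_rating(score))
```

This only works if a bare number can be published. As noted above, a CVSS entry also requires the full vector string, and no honest vector exists for a number picked this way.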