Tag Archives: JSON

CVE as JSON

It started as just a test to see if I could use the existing advisory data we have for all curl CVEs to date and provide that as JSON. Maybe, I thought, if we provide it well enough it can be used to populate other databases automatically or even be queried more easily by tools.

Information

In the curl project we have published 141 vulnerability advisories so far, each with its own registered CVE ID. We make an effort to describe every flaw as thoroughly as possible, and also to provide easy overviews and tables so that users can quickly see, for example, which curl versions are vulnerable to which flaws. It is our going the extra meticulous mile that makes it extra annoying when others override what we conclude.

More machine friendly

In a recent push I decided that all the info we have and provide could and should be offered in a more machine friendly format for whoever wants it.

After the first few test shots, fellow curl team mates pointed out to me that there is an existing effort called the Open Source Vulnerability format, an openly developed JSON schema designed for pretty much exactly what I had set out to do. I agreed that it made sense to follow something existing rather than make up my own format.

Can be improved

Of course we ran into some minor snags with this schema, and there are still details in it that I think can be improved. We are discussing these with the OSV team to see if there is merit to our ideas or not. Still, even without any changes we can now offer our data using this established format.

The two primary things I want to improve are how we provide project identification (whose issue is this) and how we convey the severity level of the issue.

Different sets

As of now, we offer a set of different ways to get the CVE data as JSON.

1. Everything all at once

On the fixed URL https://curl.se/docs/vuln.json, we provide a JSON array holding a number of JSON objects. One object per existing CVE. Right now, this is 349KB of data. (If you ask for it compressed it will be smaller!)

This URL will always contain the entire set and it will automatically update in the future as we publish new CVEs. It also automatically updates when we correct or change any of the previously published advisories.
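As a rough sketch of fetching the full set from a script: the function name fetch_all_cves is made up for illustration, --compressed is curl's option for requesting a compressed response, and the jq step shown in the comment assumes jq is installed.

```shell
# Hypothetical helper: download the complete curl CVE array.
# --compressed asks the server for a compressed response and decodes it.
fetch_all_cves() {
  curl -sS --compressed https://curl.se/docs/vuln.json
}

# Usage, assuming jq is installed: count the objects in the array.
#   fetch_all_cves | jq length
```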

2. Single object per issue

If you prefer to get the exact metadata for a single specific curl CVE, you can get the JSON for an issue by replacing the .html extension with .json for any CVE documented on the website. You will also find a menu option in the “related box” for each documented CVE that links to it.

For example, the vulnerability CVE-2022-35260, which we published last year and which is documented at https://curl.se/docs/CVE-2022-35260.html, has its corresponding JSON object at https://curl.se/docs/CVE-2022-35260.json.
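The extension swap is easy to do in a script. This is only a sketch; the helper name cve_json_url is an invention for this example and not something the website provides.

```shell
# Hypothetical helper: map an advisory page URL to its JSON counterpart
# by swapping the .html suffix for .json.
cve_json_url() {
  printf '%s\n' "${1%.html}.json"
}

# Usage:
#   curl -sS "$(cve_json_url https://curl.se/docs/CVE-2022-35260.html)"
```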

3. Objects per release

On the curl website we already offer a way for users to get a list of all known vulnerabilities a certain specific release is known to be vulnerable to. Of course always updated with the latest publications.

For example, curl version 7.87.0 has all its vulnerability information detailed on https://curl.se/docs/vuln-7.87.0.html. When I write this, there are eight known vulnerabilities for this version.

Screenshot of the website displaying vulnerability information for curl 7.87.0

Again, either by clicking the JSON link there on the page under the table, or simply by replacing html with json in the URL, the user can get a listing of all the CVEs this version is vulnerable to, as a JSON array with a number of JSON objects inside. In this case, eight objects as of now.

That info is thus available at https://curl.se/docs/vuln-7.87.0.json.
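A script that checks a given release could build that URL from the version number. A minimal sketch, where release_vuln_url is a made-up name and the jq filter in the comment assumes jq is installed and that every object carries the "id" field described further down:

```shell
# Hypothetical helper: build the per-release JSON URL from a version number.
release_vuln_url() {
  printf 'https://curl.se/docs/vuln-%s.json\n' "$1"
}

# Usage, assuming jq is installed: print the "id" of every object.
#   curl -sS "$(release_vuln_url 7.87.0)" | jq -r '.[].id'
```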

JSON

While I expect the format might still change a little bit going forward, and not all issues have all the metadata provided just yet (for example, the git commit ranges are still lacking on a number of issues from before 2017), here is an illustration screenshot of jq displaying the JSON object for CVE-2022-27780.

Object details

The database_specific object near the top is metadata that we have and believe belongs with the issue but that has no established field in the JSON schema. Since I think the data still adds value to users, I decided to put it into this section, which is designed exactly for this kind of extension.

You can see that we set an “id” that is the CVE ID with a CURL- prefix. This is just us catering to the conditions of OSV and the JSON schema. We apparently need our own ID and provide the actual CVE ID as an alias, so we “fake” this by simply prepending curl to the CVE ID. We don’t use any private ID when we work with vulnerabilities: we only deal with public issues and we only deal with issues that are CVE worthy so it seems unnecessary to involve anything else.
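Since the "id" is just the CVE ID with a prefix, a consumer that only has the "id" value can get the CVE ID back by stripping it. This is a sketch with a made-up helper name; reading the alias from the object works too.

```shell
# Hypothetical helper: recover the CVE ID from the CURL- prefixed "id" value.
cve_from_curl_id() {
  printf '%s\n' "${1#CURL-}"
}

# Usage:
#   cve_from_curl_id CURL-CVE-2022-27780
```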

Credits

Image by Reto Scheiwiller from Pixabay

Easier header-picking with curl

Okay you might ask, what’s the news here? We’ve been able to get HTTP response headers with curl since virtually the stone age. Yes we have. Get the page and also show the headers:

curl -i https://example.com/

Make a HEAD request and see what headers we get back:

curl -I https://example.com/

Save the response headers in a separate file:

curl -D headers.txt https://example.com/

Get a specific header

This gets a little more complicated but you can always do

curl -I https://example.com/ | grep Date:

That will of course fail if the casing is different, so you need to match case insensitively. There might also be another header ending with “date:” that matches, so you need to make sure that this is an exact match:

curl -I https://example.com/ | grep -i ^Date:

Now this shows the entire header, but for most cases you only want the value. So get it with cut:

curl -I https://example.com/ | grep -i ^Date: | cut -d: -f2-

You have the header value extracted now, but the leading and trailing white spaces in the content are probably not what you want in there so let’s strip them as well:

curl -I https://example.com/ | grep -i ^Date: | cut -d: -f2- | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'

There are of course many different ways you can do this operation and some of them are more clever than the methods I’ve used here. They are still often more or less convoluted and error-prone.
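One such alternative, shown only as a sketch: a small awk-based function that does the case-insensitive exact match and the trimming in one step. The name header_value is made up for this example. It reads raw response headers on stdin and also drops the carriage return that ends each HTTP header line.

```shell
# Hypothetical helper: print the trimmed value of one header.
# $1 is the header name without the colon, matched case insensitively.
# Raw response headers are read from stdin.
header_value() {
  awk -v name="$1" '
    BEGIN { want = tolower(name) ":" }
    {
      sub(/\r$/, "")    # drop the CR that ends each HTTP header line
      # exact match on the start of the line, so "X-Date:" is not picked up
      if (tolower(substr($0, 1, length(want))) == want) {
        value = substr($0, length(want) + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", value)    # trim both ends
        print value
      }
    }'
}

# Usage:
#   curl -sI https://example.com/ | header_value date
```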

If we imagine that this is a fairly common use case for curl users, then this kind of operation is duplicated in quite a few scripts, applications and devices around the world.

Maybe we could make this easier for curl users?

A headers API

The other day we introduced a new experimental headers API to libcurl. With it, an application using libcurl gets an easy way to extract one or several headers and their content.

As curl is such a libcurl-using application, we have expanded it to make use of this new API and this brings some new fun features to the curl tool.

Let me emphasize that since this API is labeled experimental it is not enabled in a default build. You need to explicitly enable it!

Get a single header, the new way

I decided to extend the -w output feature for this.

To extract a single header and get its value with leading and trailing spaces trimmed, use %header{name}. To repeat the operation from above and get the Date: header:

curl -I -w '%header{date}' https://example.com/

‘date’ in this example is a case insensitive header name without the trailing colon and you can of course use any header name you please there. If the given header did not actually arrive in the response, it outputs nothing.
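In a script you typically want the value in a variable. A minimal sketch, assuming a curl build with the experimental headers API enabled; the function name get_header is invented here.

```shell
# Hypothetical helper: print one response header value for a URL.
# $1 is the URL, $2 the header name without the trailing colon.
# -s silences the progress meter and -o /dev/null discards the body,
# so only the -w output reaches stdout.
get_header() {
  curl -s -o /dev/null -w "%header{$2}\n" "$1"
}

# Usage:
#   when=$(get_header https://example.com/ date)
```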

If you want more headers output, just repeat the %header{name} construct as many times as you like. If the -w output string gets unwieldy and hard to manage on the command line, then make it into a text file instead and tell -w about it with -w @filename.

curl -I -w @filename https://example.com/
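Such a format file could be created like this. It is a sketch only: the function name and the choice of headers are arbitrary, and each line ends with an explicit \n escape, which -w turns into a newline in the output.

```shell
# Hypothetical helper: write a -w format file to the path given in $1.
# The quoted 'EOF' keeps the shell from touching the % and \ characters.
write_header_format() {
  cat > "$1" <<'EOF'
date: %header{date}\n
server: %header{server}\n
EOF
}

# Usage (the file name is arbitrary):
#   write_header_format headers.fmt
#   curl -s -o /dev/null -w @headers.fmt https://example.com/
```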

Which headers?

There are several different kinds of headers and there can be multiple requests used for a transfer, but this option outputs the “normal” server response headers from the most recent request done. The option only works for HTTP(S) responses.

All headers – as JSON

As dealing with formatted data in the form of JSON has become very popular, I want to help fertilize this by making curl able to output all response headers as a JSON object.

This way, you can move the header handling, parsing and perhaps filtering to your JSON aware tool.

Tell curl to output the received HTTP headers as a JSON object:

curl -o save -w "%{header_json}" https://example.com/

curl itself does not pretty-print this, but if you pass the JSON from curl to a beautifier such as jq, the output ends up looking like this:

{
  "age": [
    "269578"
  ],
  "cache-control": [
    "max-age=604800"
  ],
  "content-type": [
    "text/html; charset=UTF-8"
  ],
  "date": [
    "Tue, 22 Mar 2022 08:35:21 GMT"
  ],
  "etag": [
    "\"3147526947+ident\""
  ],
  "expires": [
    "Tue, 29 Mar 2022 08:35:21 GMT"
  ],
  "last-modified": [
    "Thu, 17 Oct 2019 07:18:26 GMT"
  ],
  "server": [
    "ECS (nyb/1D2E)"
  ],
  "vary": [
    "Accept-Encoding"
  ],
  "x-cache": [
    "HIT"
  ],
  "content-length": [
    "1256"
  ]
}

JSON details

The headers are presented in the same order as received over the wire, except when there are duplicated header names: those are grouped on the first occurrence and all values are provided there as a JSON array.

All headers are arrays just because there can be multiple headers using the same name.

The casing for the header names is kept unmodified from what was received, but for duplicated headers the casing used for the first occurrence is used in the output.

Update: we lowercase all header names in the JSON output.

The “status line” of an HTTP 1.x response, that first line that says “HTTP/1.1 200 OK” when everything is fine, is not counted as a header by this function and will therefore not be included in this output.
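Given that every header is an array and the names are lowercased, picking values out with jq could look like the sketch below. It assumes jq is installed, and json_header_values is a made-up name.

```shell
# Hypothetical helper: read curl's %{header_json} output on stdin and print
# every value of one header, one per line. Assumes jq is installed.
# The name is lowercased first because the JSON keys are lowercase.
json_header_values() {
  key=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  # a header that did not arrive yields no output at all
  jq -r --arg key "$key" '.[$key] // [] | .[]'
}

# Usage:
#   curl -s -o /dev/null -w '%{header_json}' https://example.com/ | json_header_values date
```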

Ships in 7.83.0

This feature is present in source code that will ship in curl 7.83.0, scheduled for release in late April 2022. Run your own build with it enabled, or ask your packager to provide an experimental build for you.

With enough positive feedback we should be able to move this out of experimental state fairly quickly.

curl dash-dash-json

The curl “cockpit” is yet again extended with a new command line option: --json. The 245th command line option.

curl is a generic transfer tool for sending and receiving data across networks and it is completely agnostic as to what it transfers or even why.

To allow users to craft all sorts of transfers, or requests if you will, it offers a wide range of command line options. This flexibility has made it possible for a large number of users to keep using curl even as network ecosystems and their (HTTP) use have changed over time.

Craft content to send

curl offers a few convenience options for creating content to send in HTTP “uploads”, such as -F for building multi-part formposts and --data-urlencode for URL-encoding POST data.

When curl was born, JSON didn’t exist. Over the decades since, JSON has grown to become a very commonly used format for structured data, often used in “REST API” calls and more. curl command lines sending and receiving JSON are very common now.

When asking users today what features they would like to see added in curl, and when asking them what good features they see in curl alternatives, a huge chunk of them mention JSON support in one way or another.

This is partly because dealing with JSON with curl sometimes makes the involved command lines rather long and quirky.

JSON in curl

The discussion has been ignited in the curl community about what, if anything, we should do in curl to make it a smoother and better tool when working with JSON. The offered opinions range from nothing (“curl is content agnostic”) to full-fledged JSON generator, parser and pretty-printer (or a combination in between). I don’t think we are ready to put our collective foot down on where we should go with this.

Introducing --json

While the discussion is ongoing, we still bring a first JSON oriented feature to the tool as a first step. The --json option. This is a new option that basically works as an alias, or shortcut, to sending JSON to an endpoint. You use it like this:

curl --json '{"tool": "curl"}' https://example.com/

Send JSON from a file:

curl --json @json.txt https://example.com/

Or send JSON passed to curl on stdin:

echo '{"a":"b"}' | curl --json @- https://example.com/

This change does not affect the libcurl library at all; it is only done on the tool side of things.

Details

The --json option is the equivalent of setting

--data [arg]
--header "Content-Type: application/json"
--header "Accept: application/json"

If you want another Content-Type or Accept header, then you can override them as usual with -H. If you use multiple --json options on the same command line, the contents from them will be concatenated before being sent off.
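For a curl version that predates --json, the three options listed above can be wrapped in a shell function. This is a rough stand-in rather than an exact replacement: the name curl_json is made up, and the wrapper does not try to reproduce how --json handles header overrides or repeated use.

```shell
# Hypothetical wrapper for curl versions without --json: spell out the
# three equivalent options. $1 is the JSON (or @file), the rest is passed
# on to curl unchanged.
curl_json() {
  json_body=$1
  shift
  curl --data "$json_body" \
       --header 'Content-Type: application/json' \
       --header 'Accept: application/json' \
       "$@"
}

# Usage:
#   curl_json '{"tool": "curl"}' https://example.com/
```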

Ships!

This new json option was merged in this commit and will be available and present in curl 7.82.0, to be released in early March 2022. Of course you can build it from git or a daily snapshot already now to test it out!

Future JSON stuff

I want to at least experiment a little with some level of JSON creation for curl, to help users who want to create basic JSON on the command line to send off. Lots of users run into problems with this since JSON uses double quotes a lot, and the combination of quoting, unquoting and the shell often leaves users confused or even frustrated.

Exactly how or to what level is still to be determined. Your opinions and feedback are very valuable and helpful.
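Until then, one way to sidestep the shell quoting today is to feed the JSON to --json on stdin from a quoted here-document. A sketch, with a made-up function name and example body:

```shell
# Hypothetical helper: POST a fixed JSON document to the URL in $1.
# The quoted 'EOF' delimiter stops the shell from expanding or unquoting
# anything in the body, so the double quotes reach curl untouched.
post_example_json() {
  curl --json @- "$1" <<'EOF'
{"name": "daniel", "tool": "curl"}
EOF
}

# Usage:
#   post_example_json https://example.com/
```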

jo+curl+jq

This trinity of commands helps you send correctly formatted data and show the returned JSON easily. Like in a command line similar to:

jo name=daniel tool=curl | curl --json @- https://httpbin.org/post | jq

jo, jq