Tag Archives: command-line

store the curl output over there

tldr: --output-dir [directory] comes in curl 7.73.0

The curl options to store the contents of a URL in a local file, -o (--output) and -O (--remote-name), were part of curl 4.0, the first release ever, back in March 1998.

Even though we often get to hear from users that they can’t remember which of the letter O’s to use, the two options have worked exactly the same for over twenty years. I believe the biggest reason they’re hard to keep apart is that other tools use similar options for not always identical functionality, so a command line cowboy really needs to remember the exact combination of tool and -o type.

Later on, we also brought -J to further complicate things. See below.

Let’s take a look at what these options do before we get into the new stuff:

--output [file]

This tells curl to store the downloaded contents in that given file. You can specify the file as a local file name for the current directory or you can specify the full path. Example, store the HTML from example.org in “/tmp/foo”:

curl -o /tmp/foo https://example.org

--remote-name

This option is probably much better known as its short form: -O (upper case letter o).

This tells curl to store the downloaded contents in a file with a name extracted from the given URL’s path part. For example, if you download the URL “https://example.com/pancakes.jpg”, users often think that saving it using the local file name “pancakes.jpg” is a good idea. -O does that for you. Example:

curl -O https://example.com/pancakes.jpg

The name is extracted from the given URL. Even if you tell curl to follow redirects, which then may go to URLs using different file names, the selected local file name is the one in the original URL. This way you know before you invoke the command which file name it will get.
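For example (hypothetical URLs): even with --location enabled, the saved file name still comes from the URL you typed:

curl -L -O https://example.com/original-name.jpg

Even if that URL redirects to, say, https://cdn.example.com/new-name.jpg, the contents are saved as “original-name.jpg”.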

--remote-header-name

This option is commonly used as -J (upper case letter j) and needs to be set in combination with --remote-name.

This makes curl parse incoming HTTP response headers to check for a Content-Disposition: header, and if one is present attempt to parse a file name out of it and then use that file name when saving the content.

This then naturally makes it impossible for a user to be really sure what file name it will end up with. You leave the decision entirely to the server. curl will make an effort to not overwrite any existing local file when doing this, and to reduce risks curl will always cut off any provided directory path from that file name.

Example download of the pancake image again, but allow the server to set the local file name:

curl -OJ https://example.com/pancakes.jpg

(it has been said that “-OJ is a killer feature” but I can’t take any credit for having come up with that.)

Which directory

So in particular with -O, with or without -J, the file is downloaded into the current working directory. If you want the download to be put somewhere special, you have to first ‘cd’ there.

When saving multiple URLs within a single curl invocation using -O, storing them in different directories would thus be impossible, as you can only cd between curl invocations.
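Before this option existed, the workaround was to run curl once per directory, something like this sketch (hypothetical paths and URLs):

cd /tmp/images && curl -O https://example.com/pancakes.jpg
cd /tmp/backup && curl -O https://example.com/waffles.jpg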

Introducing --output-dir

In curl 7.73.0, we introduce the new command line option --output-dir that goes well together with all these output options. It tells curl in which directory to create the file. If you want to download the pancake image and put it in $HOME/tmp, no matter what your current directory is:

curl -O --output-dir $HOME/tmp https://example.com/pancakes.jpg

And if you allow the server to select the file name but still want it in $HOME/tmp:

curl -OJ --output-dir $HOME/tmp https://example.com/pancakes.jpg

Create the directory!

This new option also goes well in combination with --create-dirs, so you can specify a non-existing directory with --output-dir and have curl create it for the download and then store the file in there:

curl --create-dirs -O --output-dir /tmp/recipes https://example.com/pancakes.jpg

Ships in 7.73.0

This new option comes in curl 7.73.0. It is curl’s 233rd command line option.

You can always find the man page description of the option on the curl website.

Credits

I (Daniel) wrote the code, docs and tests for this feature.


curl help remodeled

curl 4.8 was released in 1998 and contained 46 command line options. curl --help would list them all. A decent set of options.

When we released curl 7.72.0 a few weeks ago, it contained 232 options… and curl --help still listed all available options.

What was once a long list of options grew over the decades into a crazy long wall of text that would shock users who entered this command, hoping to figure out what command line options to try next.

--help me if you can

We’ve known about this usability flaw for a while, but it took us some time to figure out how to approach it and decide what the best next step would be. Until this year, when long time curl veteran Dan Fandrich did his presentation at curl up 2020 titled --help me if you can.

Emil Engler subsequently picked up the challenge and converted ideas surfaced by Dan into reality and proper code. Today we merged the refreshed and improved --help behavior in curl.

Perhaps the most notable change in curl for many users in a long time. Targeted for inclusion in the pending 7.73.0 release.

help categories

First out, curl --help will now by default only list a small subset of the most “important” and frequently used options. No massive wall, no shock. Not even necessary to pipe to more or less to see it properly.

Then: each curl command line option now has one or more categories, and the help system can be asked to just show command line options belonging to the particular category that you’re interested in.

For example, let’s imagine you’re interested in seeing what curl options provide for your HTTP operations:

$ curl --help http
Usage: curl [options...] <url>
http: HTTP and HTTPS protocol options
--alt-svc Enable alt-svc with this cache file
--anyauth Pick any authentication method
--compressed Request compressed response
-b, --cookie Send cookies from string/file
-c, --cookie-jar Write cookies to <filename> after operation
-d, --data HTTP POST data
--data-ascii HTTP POST ASCII data
--data-binary HTTP POST binary data
--data-raw HTTP POST data, '@' allowed
--data-urlencode HTTP POST data url encoded
--digest Use HTTP Digest Authentication
[...]

list categories

To figure out what help categories exist, just ask with curl --help category, which will show you a list of the current twenty-two categories: auth, connection, curl, dns, file, ftp, http, imap, misc, output, pop3, post, proxy, scp, sftp, smtp, ssh, telnet, tftp, tls, upload and verbose. It will also display a brief description of each category.

Each command line option can be put into multiple categories, so the same one may be displayed both in the “http” category and in “upload” or “auth” etc.

--help all

You can of course still get the old list of every single command line option by issuing curl --help all. Handy for grepping the list and more.
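For example, a quick way to find every option that mentions proxies (a hypothetical search term):

curl --help all | grep -i proxy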

“important”

The meta category “important” is what we use for the options that we show when just curl --help is issued. Presumably those options should be the most important, in some ways.

Credits

Code by Emil Engler. Ideas and research by Dan Fandrich.

curl write-out JSON

This is not a command line option of the week post, but I feel a need to tell you a little about our brand new addition!

--write-out [format]

This option takes a format string in which there are a number of different “variables” available that let a user output information from the previous transfer. For example, you can get the HTTP response code from a transfer like this:

curl -w 'code: %{response_code}' https://example.org -o saved

There are currently 34 different such variables listed and described in the man page. The most recently added one is for JSON output and it works like this:

%{json}

It is a single variable that outputs a full json object. You would for example invoke it like this when you get data from example.com:

curl --write-out '%{json}' https://example.com -o saved

That command line will spew some 800 bytes to the terminal and it won’t be very human readable. You will rather take care of that output with some kind of script/program, or if you want an eye-pleasing version you can pipe it into jq.
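For example, a minimal sketch (assuming jq is installed; “saved” is just an arbitrary file name for the downloaded body):

curl --write-out '%{json}' https://example.com -o saved | jq .

The pretty-printed result can then look like this: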

{
  "url_effective": "https://example.com/",
  "http_code": 200,
  "response_code": 200,
  "http_connect": 0,
  "time_total": 0.44054,
  "time_namelookup": 0.001067,
  "time_connect": 0.11162,
  "time_appconnect": 0.336415,
  "time_pretransfer": 0.336568,
  "time_starttransfer": 0.440361,
  "size_header": 347,
  "size_request": 77,
  "size_download": 1256,
  "size_upload": 0,
  "speed_download": 0.002854,
  "speed_upload": 0,
  "content_type": "text/html; charset=UTF-8",
  "num_connects": 1,
  "time_redirect": 0,
  "num_redirects": 0,
  "ssl_verify_result": 0,
  "proxy_ssl_verify_result": 0,
  "filename_effective": "saved",
  "remote_ip": "93.184.216.34",
  "remote_port": 443,
  "local_ip": "192.168.0.1",
  "local_port": 44832,
  "http_version": "2",
  "scheme": "HTTPS",
  "curl_version": "libcurl/7.69.2 GnuTLS/3.6.12 zlib/1.2.11 brotli/1.0.7 c-ares/1.15.0 libidn2/2.3.0 libpsl/0.21.0 (+libidn2/2.3.0) nghttp2/1.40.0 librtmp/2.3"
}

The JSON object

It always outputs the entire object and the object may of course differ over time, as I expect that we might add more fields into it in the future.

The names are the same as the write-out variables, so you can read the --write-out section in the man page to learn more.
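As a sketch, you could also pick out a single field from the object with jq; the filter name here matches the write-out variable:

curl --write-out '%{json}' https://example.com -o saved | jq .response_code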

Ships?

The feature landed in this commit. This new functionality will debut in the next pending release, likely to be called 7.70.0, scheduled to happen on April 29, 2020.

Credits

This is the result of fine coding work by Mathias Gumz.


The command line options we deserve

A short while ago curl‘s 230th command line option was added (it was --mail-rcpt-allowfails). Two hundred and thirty command line options!

A look at curl history shows that on average we’ve added more than ten new command line options per year for a very long time. As we don’t see any particular slowdown, I think we can expect the number of options to eventually hit and surpass 300.

Is this manageable? Can we do something about it? Let’s take a look.

Why so many options?

There are four primary explanations why there are so many:

  1. curl supports 24 different transfer protocols, and many of them have protocol-specific options for tweaking their operations.
  2. curl’s general design is to let users toggle and set every capability and feature individually. Every little knob can be on or off, independently of the other knobs.
  3. curl is feature-packed. Users can customize and adapt curl’s behavior to a very large extent to truly get it to do exactly the transfer they want, for an extremely wide variety of use cases and setups.
  4. As we work hard at never breaking old scripts and use cases, we don’t remove old options; we only add new ones.

An entire world already knows our options

curl is used and known by a large number of humans. These humans have learned the way of the curl command line options, for better and for worse. Some think they are strange and hard to use, others see the logic behind them, and the rest simply accept that this is the way they work. Changing options would be perceived as hostile towards all these users.

curl is used by a very large number of existing scripts and programs that have been written, deployed and sit there doing their work day in and day out. Changing options, even ever so slightly, risks breaking many of these scripts and would give a lot of developers out there anything from a nuisance to a hard time.

curl is the documented way to use APIs, REST calls and web services in numerous places. Lots of examples and tutorials spell out how to use curl to work with services all over the world for all sorts of interesting things. Examples and tutorials written ages ago that showcase curl still work because curl doesn’t break behavior.

curl command lines are even to some extent now used as a translation language between applications:

All four major web browsers let you export HTTP requests to curl command lines that you can then execute from your shell prompts or scripts. Other web tools and proxies can also do this.

There are now also tools that can import said curl command lines and figure out what kind of transfer was exported from those other tools. The applications that import these command lines don’t want to actually run curl; they want to figure out details about the request that the curl command line would have executed (and instead possibly run it themselves). The curl command line has become a web request interchange language!

There are also a lot of web services provided that can convert a curl command line into a source code snippet in a variety of languages for doing the same request (using the language’s native preferred method). A few examples of this are: curl as DSL, curl to Python Requests, curl to Go, curl to PHP and curl to perl.

Can the options be enhanced?

I hope I’ve made it clear why we need to maintain the options we already support. However, that doesn’t limit what we can add, nor does it prevent us from adding new ways of doing things. It’s just code, of course it can be improved.

We could add alternative options for existing ones that make sense, if there are particular ones that are considered complicated or messy.

We could add a new “mode” that would have a totally new set of options or new way of approaching what we think of options today.

Heck, we could even consider doing a separate tool next to curl that would similarly use libcurl for the transfers but offer a totally new command line option approach.

None of these ideas are ruled out as too crazy or far out. But they all of course require that someone thinks of them as good ideas and is prepared to work on making them happen.

Are the options hard to use?

Oh what a subjective call.

Packing this many features into a single tool and having every single option and superpower intuitive and easy to use is perhaps not impossible, but at least a very very hard task. Also, the curl command line options have been added organically over a period of more than twenty years, so some of them could of course have been made a little smarter if we could’ve foreseen what would come or how the protocols would later change or improve.

I don’t think curl is hard to use (what, you think I’m biased?).

Typical curl use cases often only need a very small set of options. Most users never learn or ever need to learn most curl options – but they are there to help out when the day comes and the user wants that particular extra quirk in their transfer.

Take any other tool, even those that pound their chests and call themselves user-friendly: if they grew features close to the number of abilities curl has, their command lines would also grow substantially and no longer always be intuitive and self-explanatory. I don’t think a very advanced tool can remain easy to use in all circumstances. I think the aim should be to keep the common use cases easy. I think we’ve managed this in curl, in spite of having 230 different command line options.

Should we have another (G)UI?

Would a text-based or even graphical UI help or improve curl? Would you use one if it existed? What would it do and how would it work? Feel most welcome to tell me!

Related

See the cheat sheet refreshed and why not the command line option of the week series.

curl ootw: --keepalive-time

(previously blogged about options are listed here.)

This command line option, --keepalive-time, was introduced in curl 7.18.0 back in early 2008. There’s no short version of it.

The option takes a numerical argument; number of seconds.

What’s implied in the option name and not spelled out is that the particular thing you ask to keep alive is a TCP connection. When the keepalive feature is not used, TCP connections typically don’t send anything at all if no data is transmitted.

Idle TCP connections

Silent TCP connections typically cause two primary issues:

  1. Middle-boxes that track connections, such as your typical NAT boxes (home WiFi routers most notoriously), will consider silent connections “dead” after a certain period of time and drop all knowledge about them, leading to the connection not functioning when the client (or server) later wants to resume use of it.
  2. Neither side of the connection will notice when the network between them breaks, as it takes actual traffic to do so. This is of course also a feature, because there’s no need to be alarmed by a breakage if there’s no traffic, as the network might be fine again by the time the connection eventually gets used.

TCP stacks then typically implement a low-level feature where they can send a “ping” frame over the connection if it has been idle for a certain amount of time. This is the keepalive packet.

--keepalive-time <seconds> therefore sets the interval. After this many seconds of “silence” on the connection, a keepalive packet is sent. The packet is totally invisible to the applications on both sides but will maintain the connection through NATs better, and if the connection is broken, this packet will make curl detect it.

Keepalive is not always enough

To complicate issues even further, there are also devices out there that will still close down connections if they only see TCP keepalive packets and no data for a certain period. Several protocols on top of TCP have their own keepalive alternatives (sometimes called ping) for this and other reasons.

This aggressive style of closing connections without actual TCP traffic typically hurts long-running FTP transfers. This is because FTP sets up two connections for a transfer, but the first one is the “control connection”, and while a transfer is being delivered on the “data connection”, nothing happens over the first one. This can then result in the control connection being “dead” by the time the data transfer completes!

Default

The default keepalive time is 60 seconds. You can also disable keepalive completely with the --no-keepalive option.

The default time has been selected to be fairly low because many NAT routers out there in the wild are fairly aggressive and close idle connections already after two minutes (120 seconds).
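For example, to switch the keepalive feature off completely for a transfer:

curl --no-keepalive https://example.com/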

For what protocols

This works for all TCP-based protocols, which is what most protocols curl speaks use. The only exception right now is TFTP. (See also QUIC below.)

Example

Change the interval to 3 minutes:

curl --keepalive-time 180 https://example.com/

Related options

Related functionality is offered by the --speed-limit and --speed-time options that will cancel a transfer if the transfer speed drops below a given speed for a certain time, or just --max-time, which sets a global timeout for an entire operation.
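A sketch combining them, with hypothetical limits: give up if the speed stays below 1000 bytes per second for 30 seconds, and never let the whole operation run for more than 300 seconds:

curl --speed-limit 1000 --speed-time 30 --max-time 300 https://example.com/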

QUIC?

Soon we will see QUIC getting used instead of TCP for some protocols: HTTP/3 being the first in line for that. We will have to see what exactly we do with this option when QUIC starts to get used and what the proper mapping and behavior shall be.

curl goez parallel

The first curl release ever saw the light of day on March 20, 1998, and already back then, curl could transfer any number of URLs given on the command line. It would iterate over the entire list and transfer them one by one.

Not even 22 years later, we introduce the ability for the curl command line tool to do parallel transfers! Instead of doing all the provided URLs one by one and only start the next one once the previous has been completed, curl can now be told to do all of them, or at least many of them, at the same time!

This has the potential to drastically decrease the amount of time it takes to complete an operation that involves multiple URLs.

--parallel / -Z

Doing transfers concurrently instead of serially of course changes behavior, so this is not something that will be done by default. You as the user need to explicitly ask for it, and you do this with the new --parallel option, which also has a short-hand in a single-letter version: -Z (that’s the upper case letter Z).
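A minimal sketch with two hypothetical URLs, downloaded at the same time and each saved under its remote file name:

curl -Z -O https://example.com/file1.jpg -O https://example.com/file2.jpg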

Limited parallelism

To avoid totally overloading the servers when many URLs are provided, or having curl run out of sockets it can keep open at the same time, it limits the parallelism. By default curl will only try up to 50 transfers concurrently, so if more transfers are given to curl, those will wait to get started until one of the first transfers has completed. The new --parallel-max command line option can be used to change the concurrency limit.
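For example, a sketch using URL globbing (hypothetical URLs) that fetches a hundred files but runs at most ten transfers at any one time:

curl --parallel --parallel-max 10 -O "https://example.com/[1-100].jpg"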

Progress meter

The progress meter is different in this mode. The new progress meter that shows up for parallel transfers is a single output for all transfers combined.

Transfer results

When doing many simultaneous transfers, how do you figure out how they all did individually, like from your script? That’s still to be figured out and implemented.

No same file splitting

This functionality makes curl do URLs in parallel. It will still not download a single URL using multiple parallel transfers the way some other tools do. That might be something to implement and offer in a future fine-tuning of this feature.

libcurl already does this fine

This is a new command line feature that uses the fact that libcurl can already do this just fine. Thanks to libcurl being a powerful transfer library that curl uses, enabling this feature was “only” a matter of using libcurl in a different way than before. This parallel change is entirely in the command line tool code.

Ship

This change has landed in curl’s git repository already (since b8894085000) and is scheduled to ship in curl 7.66.0 on September 11, 2019.

I hope and expect us to keep improving parallel transfers further and we welcome all the help we can get!

curl another host

Sometimes you want to issue a curl command against a server, but you don’t really want curl to resolve the host name in the given URL and use that, you want to tell it to go elsewhere. To the “wrong” host, which in this case of course happens to be the right host. Because you know better.

Don’t worry. curl covers this as well, in several different ways…

Fake the host header

The classic and easy to understand way to send a request to the wrong HTTP host is to simply send a different Host: header so that the server will provide a response for that given server.

If you run your “example.com” HTTP test site on localhost and want to verify that it works:

curl --header "Host: example.com" http://127.0.0.1/

curl will also make cookies work for example.com in this case, but it will fail miserably if the page redirects to another host and you enable redirect-following (--location) since curl will send the fake Host: header in all further requests too.

The --header option cleverly cancels the built-in provided Host: header when a custom one is provided so only the one passed in from the user gets sent in the request.

Fake the host header better

We’re using HTTPS everywhere these days and just faking the Host: header is not enough then. An HTTPS server also needs to get the server name provided already in the TLS handshake so that it knows which cert etc to use. The name is provided in the SNI field. curl also needs to know the correct host name to verify the server certificate against (server certificates are rarely registered for an IP address). curl extracts the name to use in both those cases from the provided URL.

As we can’t just put the IP address in the URL for this to work, we reverse the approach and instead give curl the proper URL but with a custom IP address to use for the host name we set. The --resolve command line option is our friend:

curl --resolve example.com:443:127.0.0.1 https://example.com/

Under the hood this option populates curl’s DNS cache with a custom entry for “example.com” port 443 with the address 127.0.0.1, so when curl wants to connect to this host name, it finds your crafted address and connects to that instead of the IP address a “real” name resolve would otherwise return.

This method also works perfectly when following redirects since any further use of the same host name will still resolve to the same IP address and redirecting to another host name will then resolve properly. You can even use this option multiple times on the command line to add custom addresses for several names. You can also add multiple IP addresses for each name if you want to.
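For example, a sketch adding custom addresses for two hypothetical names at once:

curl --resolve example.com:443:127.0.0.1 --resolve www.example.com:443:127.0.0.1 https://example.com/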

Connect to another host by name

As shown above, --resolve is awesome if you want to point curl to a specific known IP address. But sometimes that’s not exactly what you want either.

Imagine you have a host name that resolves to a number of different addresses, possibly a number of front end servers for the same site/service. Not completely unheard of. Now imagine you want to issue your curl command to one specific server out of the front end servers. It’s a server that serves “example.com” but the individual server is called “host-47.example.com”.

You could resolve the host name in a first step before curl is used and use --resolve as shown above.

Or you can use --connect-to, which instead works on a host name basis. Using this, you can make curl replace a specific host name + port number pair with another host name + port number pair before the name is resolved!

curl --connect-to example.com:443:host-47.example.com:443 https://example.com/

Crazy combos

Most options in curl are individually controlled which means that there’s rarely logic that prevents you from using them in the awesome combinations that you can think of.

--resolve, --connect-to and --header can all be used in the same command line!

Connect to an HTTPS host running on localhost, use the correct name for SNI and certificate verification, but then still ask for a separate host in the Host: header? Sure, no problem:

curl --resolve example.com:443:127.0.0.1 https://example.com/ --header "Host: diff.example.com"

All the above with libcurl?

When you’re done playing with the curl options as described above and want to convert your command lines to libcurl code instead, your best friend is called --libcurl.

Just append --libcurl example.c to your command line, and curl will generate the C code template for you in that given file name. Based on that template, making use of that code is usually straight-forward, and you get all the options set in a way that is convenient to read up on.
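For example (hypothetical file names):

curl --libcurl code.c https://example.com/ -o saved

This performs the transfer as usual, saves the response body in “saved” and additionally writes a C source template using libcurl to “code.c”.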

Good luck!

Update: thanks to @Manawyrm, I fixed the ndash issues this post originally had.

A flying curl progress bar

curl features an alternative progress bar. When you invoke it with -# or the longer version --progress-bar, curl will show the transfer progress using a single “bar” on the screen instead of the default meter that shows a lot of data like the amount of transferred data, transfer speeds and times.

$ curl -# -O https://example.com/coolfile.tar.gz
############################################ 100.0%

The alternative progress bar works great when the amount of data to transfer is known, since then it can actually know how large a part of the transfer is done, etc. If the amount of data is unknown (which is not a super rare situation), the progress bar used to instead output one ‘#’ per kilobyte of data so that it would still show something. That could then end up filling up the screen and more if you did a large transfer.

$ curl -# -O https://example.com/nosize.html
###########################################################################################################

The space ship bar

Starting in curl 7.58.0 (to be released on January 24, 2018), this latter progress bar layout is modified. If the total size is unknown, it will now instead display a small space ship flying across the line, back and forth, and it will only move as long as there is data being transferred. If the transfer stalls, the little ship stops.

“Over” the space ship there are four nonsensical flying hashes (‘#’) that simply move across the line on a sine wave, following each other. They move independently of whether data is being transferred or not.

It can then end up looking similar to this (shown as an animation in the original post).

Pointless

There’s no real “meaning” behind this new progress bar output mode. I wanted it to

  1. only use a single line, even in the case where the total size is not known
  2. somehow indicate when there’s no data flying (i.e. the space ship stops)
  3. make it slightly more interesting to watch than just one # per kilobyte

Since this new bar has just landed and this is the first time we ship a release with it, I wouldn’t be surprised if we end up polishing it further later on.

Can you tell I started out my programming life as a demo programmer on the Commodore 64? 🙂

Easier HTTP requests with h2c

I spend a large portion of my days answering questions and helping people use curl and libcurl. With more than 200 command line options, it certainly isn’t always easy to find the correct ones, in combination with the Internet and protocols being pretty complicated things at times… not to mention the constant problem of bad advice, like code samples on stackoverflow that repeat non-recommended patterns.

The notorious -X abuse is a classic example, or why not the widespread disease of overusing the --insecure option (at a recent count, there were more than 118,000 instances of “curl --insecure” in code hosted on github alone).

Sending HTTP requests with curl

HTTP (and HTTPS) is by far the most used protocol out of the ones curl supports. curl can be used to issue just about any HTTP request you can think of, even if it isn’t always immediately obvious exactly how to do it.

h2c to the rescue!

h2c is a new command line tool and associated web service that, when passed a complete HTTP request dump, converts it into a corresponding curl command line. When that curl command line is then run, it will generate exactly(*) the HTTP request you gave h2c.

h2c stands for “headers to curl”.

Many times you’ll read documentation somewhere online or find a protocol/API description showing off a full HTTP request. “This is what the request should look like. Now send it.” That is one use case h2c can help out with.

Example use

Here we have an HTTP request that does Basic authentication with the POST method and a small request body. Do you know how to tell curl to send it?

The request:

POST /receiver.cgi HTTP/1.1
Host: example.com
Authorization: Basic aGVsbG86eW91Zm9vbA==
Accept: */*
Content-Length: 5
Content-Type: application/x-www-form-urlencoded

hello

I save the request above in a text file called ‘request.txt’ and ask h2c to give the corresponding curl command line:

$ ./h2c < request.txt
curl --http1.1 --header User-Agent: --user "hello:youfool" --data-binary "hello" https://example.com/receiver.cgi

If we add “--trace-ascii dump” to that command line, run it, and then inspect the dump file after curl has completed, we can see that it did indeed issue the HTTP request we asked for!
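That full command line would then be:

curl --trace-ascii dump --http1.1 --header User-Agent: --user "hello:youfool" --data-binary "hello" https://example.com/receiver.cgi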

Web Site

Maybe you don’t want to install another command line tool written by me on your system. The solution is the online version of h2c, which is hosted on a separate portion of the official curl web site:

https://curl.se/h2c/

The web site lets you paste a full HTTP request into a text form and the page then shows the corresponding curl command line for that request.

h2c “as a service”

Inception alert: you can also use the web version of h2c by sending an HTTP request to it using curl. You’ll then get nothing but the correct curl command line written to stdout.

To send off the same file we used above:

curl --data-urlencode http@request.txt https://curl.se/h2c/

or of course, if you would rather pass your HTTP request to curl on stdin, that’s equally easy:

cat request.txt | curl --data-urlencode http@- https://curl.se/h2c/

Early days, you can help!

h2c was created just a few days ago. I’m sure there are bugs, issues and quirks to iron out. You can help! File issues or submit pull-requests!

(*) = barring bugs, there are still some edge cases where the exact HTTP request won’t be possible to repeat, but where we instead will attempt to do “the right thing”.

curl man page disentangled

The nroff formatted source file for the man page of the curl command line tool was some 110K and consisted of more than 2500 lines by the time this overhaul, or disentanglement if you will, started. At the moment of me writing this, the curl version in git supports 204 command line options.

Working with such a behemoth of a document has gotten a bit daunting to people and the nroff formatting itself is quirky and esoteric. For some time I’ve also been interested in creating some sort of system that would allow us to generate a single web page for each individual command line option. And then possibly allow for expanded descriptions in those single page versions.

To avoid having duplicated info, I decided to create a new system in which we document each individual command line option in a separate file, and from that collection of hundreds of files we can generate the big man page, generate the “curl --help” output and create all those separate pages suitable for rendering as web pages. And we can automate some of the nroff syntax to make it less error-prone and cause less sore eyes for the document editors!

With this system we also get unified handling of things added in certain curl versions, things affecting only specific protocols, and references like “see also” mentions. It gives us a whole lot of meta-data for the command line options, if you will, and this will allow us to do more fun things going forward, I’m sure.

You’ll find the new format documented, and you can check out the existing files to get a quick glimpse of how it works. As an example, look at the --resolve documentation source.

Today I generated the first full curl.1 replacement and pushed to git, but eventually that file will be removed from git and instead generated at build time by the regular build system. No need to commit a generated file in the long term.