Tag Archives: command-line

A flying curl progress bar

curl features an alternative progress bar. When you invoke it with -# or the longer version –progress-bar, curl will show the transfer progress using a single “bar” on the screen instead of the default meter that shows a lot of data like amount of data, transfer speeds and times.

$ curl -# -O https://example.com/coolfile.tar.gz
############################################ 100.0%

The alternative progress bar works great when the amount of data to transfer is known since then it can actually know how large part of the transfer that is done etc. If the amount of data is unknown – which is not a super rare situation – the progress bar output instead used to output one ‘#’ per kilobyte of data so that it would still show something. That could then end up filling up the screen and more if you did a large transfer.

$ curl -# -O https://example.com/nosize.html
###########################################################################################################

The space ship bar

Starting in curl 7.58.0 (to be released on January 24, 2018), this latter progress bar layout is modified. If the total size is unknown, it will now instead display a small space ship flying across the line, back and forth – and it will only move as long as there is data being transferred. If it stalls, the little ship stops.

“Over” the space ship there are four nonsensical flying hashes (‘#’) that are simply moving across the line on a sine wave, following each other. They move independently of there being data transferred or not.

It can then end up looking similar to this:

Pointless

There’s no real “meaning” behind this new progress bar output mode. I wanted it to

  1. only use a single line, even in the no-total size known case
  2. somehow indicate when there’s no data flying (ie space ship stops)
  3. make it slightly more interesting to watch than just one # per kilobyte

Since this new bar has just landed and this is the first time we ship a release with it, I wouldn’t be surprised if we end up polishing it further later on.

Can you tell I started out my programming life as a demo programmer on the Commodore 64? 🙂

Easier HTTP requests with h2c

I spend a large portion of my days answering questions and helping people use curl and libcurl. With more than 200 command line options it certainly isn’t always easy to find the correct ones, in combination with the Internet and protocols being pretty complicated things at times… not to mention the constant problem of bad advice. Like code samples on stackoverflow that repeats non-recommended patterns.

The notorious -X abuse is a classic example, or why not the widespread disease called too much use of the –insecure option (at a recent count, there were more than 118,000 instances of “curl –insecure” uses in code hosted by github alone).

Sending HTTP requests with curl

HTTP (and HTTPS) is by far the most used protocol out of the ones curl supports. curl can be used to issue just about any HTTP request you can think of, even if it isn’t always immediately obvious exactly how to do it.

h2c to the rescue!

h2c is a new command line tool and associated web service, that when passed a complete HTTP request dump, converts that into a corresponding curl command line. When that curl command line is then run, it will generate exactly(*) the HTTP request you gave h2c.

h2c stands for “headers to curl”.

Many times you’ll read documentation somewhere online or find a protocol/API description showing off a full HTTP request. “This is what the request should look like. Now send it.” That is one use case h2c can help out with.

Example use

Here we have an HTTP request that does Basic authentication with the POST method and a small request body. Do you know how to tell curl to send it?

The request:

POST /receiver.cgi HTTP/1.1
Host: example.com
Authorization: Basic aGVsbG86eW91Zm9vbA==
Accept: */*
Content-Length: 5
Content-Type: application/x-www-form-urlencoded

hello

I save the request above in a text file called ‘request.txt’ and ask h2c to give the corresponding curl command line:

$ ./h2c < request.txt
curl --http1.1 --header User-Agent: --user "hello:youfool" --data-binary "hello" https://example.com/receiver.cgi

If we add “–trace-ascii dump” to that command line, run it, and then inspect the dump file after curl has completed, we can see that it did indeed issue the HTTP request we asked for!

Web Site

Maybe you don’t want to install another command line tool written by me in your system. The solution is the online version of h2c, which is hosted on a separate portion of the official curl web site:

https://curl.haxx.se/h2c/

The web site lets you paste a full HTTP request into a text form and the page then shows the corresponding curl command line for that request.

h2c “as a service”

Inception alert: you can also use the web version of h2c by sending over a HTTP request to it using curl. You’ll then get nothing but the correct curl command line output on stdout.

To send off the same file we used above:

curl --data-urlencode http@request.txt https://curl.haxx.se/h2c/

or of course if you rather want to pass your HTTP request to curl on stdin, that’s equally easy:

cat request.txt | curl --data-urlencode http@- https://curl.haxx.se/h2c/

Early days, you can help!

h2c was created just a few days ago. I’m sure there are bugs, issues and quirks to iron out. You can help! Files issues or submit pull-requests!

(*) = barring bugs, there are still some edge cases where the exact HTTP request won’t be possible to repeat, but where we instead will attempt to do “the right thing”.

curl man page disentangled

The nroff formatted source file to the man page for the curl command line tool was some 110K and consisted of more than 2500 lines by the time this overhaul, or disentanglement if you will, started. At the moment of me writing this, the curl version in git right now, supports 204 command line options.

Working with such a behemoth of a document has gotten a bit daunting to people and the nroff formatting itself is quirky and esoteric. For some time I’ve also been interested in creating some sort of system that would allow us to generate a single web page for each individual command line option. And then possibly allow for expanded descriptions in those single page versions.

To avoid having duplicated info, I decided to create a new system in which we can document each individual command line option in a separate file and from that collection of hundreds of files we can generate the big man page, we can generate the “curl –help” output and we can create all those separate pages suitable for use to render web pages. And we can automate some of the nroff syntax to make it less error-prone and cause less sore eyes for the document editors!

With this system we also get a unified handling of things added in certain curl versions, affecting only specific protocols or dealing with references like “see also” mentions. It gives us a whole lot of meta-data for the command line options if you will and this will allow us to do more fun things going forward I’m sure.

You’ll find the the new format documented, and you can check out the existing files to get a quick glimpse on how it works. As an example, look at the –resolve documentation source.

Today I generated the first full curl.1 replacement and pushed to git, but eventually that file will be removed from git and instead generated at build time by the regular build system. No need to commit a generated file in the long term.

Turn many pictures into a movie

Challenge: you have 90 pictures of various sizes, taken in different formats and shapes. Using all sorts strange file names. Make a movie out of all of them, with the images using the correct aspect ratio. And add music. Use only command line tools on Linux.

Solution: this is a solution, you can most likely solve this in 22 other ways as well. And by posting it here, I can find it myself if I ever want to do the same stunt again…

#!/bin/sh

j=0
# convert options
pic="-resize 1920x1080 -background black -gravity center -extent 1920x1080"

# loop over the images
for i in `ls *jpg | sort -R`; do
 echo "Convert $i"
 convert $pic $i "pic-$j.jpg"
 j=`expr $j + 1`
done

# now generate the movie
mp3="file.mp3"
echo "make movie"
ffmpeg -framerate 3 -i pic-%d.jpg -i $mp3 -acodec copy -c:v libx264 -r 30 -pix_fmt yuv420p -s 1920x1080 -shortest out.mp4

Explained

This is a shell script.

The ‘pic’ variable holds command line options for the ImageMagick ‘convert‘ tool. It resizes each picture to 1920×1080 while maintaining aspect ratio and if the pic gets smaller, it is centered and gets a black border.

The loop goes through all files matching *,jpg, randomizes the order with ‘sort’ and then runs ‘convert’ on them one by one and calls the output files pic-[number].jpg where number is increased by one for each image.

Once all images have the correct and same size, ‘ffmpeg‘ is invoked. It is told to produce a movie with 3 photos per second, how to find all the images, to include an mp3 file into the output and to stop encoding when one of the streams ends – this assumes the playing time of the mp3 file is longer than the total time the images are shown so the movie stops when we run out of images to show.

Result

The ‘out.mp4’ file, uploaded to youtube could then look like this:

(music by Bensound.com)

Unnecessary use of curl -X

I’ve grown a bit tired of the web filling up with curl command line examples showing use of superfluous -X’s. I’m putting code where my mouth is.

Starting with curl 7.45.0 (due to ship October 7th 2015), the tool will help users to understand that their use of the -X (or –request) is very often unnecessary or even downright wrong. If you specify the same method with -X that will be used anyway, and you have verbose mode enabled, curl will inform you about it and gently push you to stop doing it.

Example:

$ curl -I -XHEAD http://example.com –verbose

The option dash capital i means asking curl to issue a HEAD request. Adding -X HEAD to that command line asks for it again. This option sequence will now make curl say:

Note: Unnecessary use of -X or –request, HEAD is already inferred.

It’ll also inform the user similarly if you do -XGET on a normal fetch or -XPOST when using one of the -d options. Like this:

$ curl -v -d hello -XPOST http://example.com
Note: Unnecessary use of -X or –request, POST is already inferred.

curl will still continue to work exactly like before though, these are only informational texts that won’t alter any behaviors. Again, it only says this if verbose mode is enabled.

What -X does

When doing HTTP with curl, the -X option changes the actual method string in the HTTP request. That’s all it does. It does not change behavior accordingly. It’s the perfect option when you want to send a DELETE method or TRACE or similar that curl has no native support for and you want to send easily. You can use it to make curl send a GET with a request-body or you can use it to have the -d option work even when you want to send a PUT. All good uses.

Why superfluous -X usage is bad

I know several users out there will disagree with this. That’s also why this is only shown in verbose mode and it only says “Note:” about it. For now.

There are a few problems with the superfluous uses of -X in curl:

One of most obvious problems is that if you also tell curl to follow HTTP redirects (using -L or –location), the -X option will also be used  on the redirected-to requests which may not at all be what the server asks for and the user expected. Dropping the -X will make curl adhere to what the server asks for. And if you want to alter what method to use in a redirect, curl already have dedicated options for that named –post301, –post302 and –post303!

But even without following redirects, just throwing in an extra -X “to clarify” leads users into believing that -X has a function to serve there when it doesn’t. It leads the user to use that -X in his or her’s next command line too, which then may use redirects or something else that makes it unsuitable.

The perhaps biggest mistake you can do with -X, and one that now actually leads to curl showing a “warning”, is if you’d use -XHEAD on an ordinary command line (that isn’t using -I). Like this (I’ll display it crossed over to make it abundantly clear that this is a bad command line):

$ curl -XHEAD http://example.com/

… which will have curl act as if it sends a GET but it sends a HEAD. A response to a HEAD never has a body, although it sends the size of the body exactly like a GET response which thus mostly will lead to curl to sit there waiting for the response body to arrive when it simply won’t and it’ll hang.

Starting with this change, this is the warning it’ll show for the above command line:

Warning: Setting custom HTTP method to HEAD may not work the way you want.

curl, smiley-URLs and libc

Some interesting Unicode URLs have recently been seen used in the wild – like in this billboard ad campaign from Coca Cola, and a friend of mine asked me about curl in reference to these and how it deals with such URLs.

emojicoke-by-stevecoleuk-450

(Picture by stevencoleuk)

I ran some tests and decided to blog my observations since they are a bit curious. The exact URL I tried was ‘www.😃.ws’ (not the same smiley as shown on this billboard: 😂) – it is really hard to enter by hand so now is the time to appreciate your ability to cut and paste! It appears they registered several domains for a set of different smileys.

These smileys are not really allowed IDN (where IDN means International Domain Names) symbols which make these domains a bit different. They should not (see below for details) be converted to punycode before getting resolved but instead I assume that the pure UTF-8 sequence should or at least will be fed into the name resolver function. Well, either way it should either pass in punycode or the UTF-8 string.

If curl was built to use libidn, it still won’t convert this to punycode and the verbose output says “Failed to convert www.😃.ws to ACE; String preparation failed

curl (exact version doesn’t matter) using the stock threaded resolver

  • Debian Linux (glibc 2.19) – FAIL
  • Windows 7 – FAIL
  • Mac OS X 10.9 – SUCCESS

But then also perhaps to no surprise, the exact same results are shown if I try to ping those host names on these systems. It works on the mac, it fails on Linux and Windows. Wget 1.16 also fails on my Debian systems (just as a reference and I didn’t try it on any of the other platforms).

My curl build on Linux that uses c-ares for name resolving instead of glibc succeeds perfectly. host, nslookup and dig all work fine with it on Linux too (as well as nslookup on Windows):

$ host www.😃.ws
www.\240\159\152\131.ws has address 64.70.19.202
$ ping www.😃.ws
ping: unknown host www.😃.ws

While the same command sequence on the mac shows:

$ host www.😃.ws
www.\240\159\152\131.ws has address 64.70.19.202
$ ping www.😃.ws
PING www.😃.ws (64.70.19.202): 56 data bytes
64 bytes from 64.70.19.202: icmp_seq=0 ttl=44 time=191.689 ms
64 bytes from 64.70.19.202: icmp_seq=1 ttl=44 time=191.124 ms

Slightly interesting additional tidbit: if I rebuild curl to use gethostbyname_r() instead of getaddrinfo() it works just like on the mac, so clearly this is glibc having an opinion on how this should work when given this UTF-8 hostname.

Pasting in the URL into Firefox and Chrome works just fine. They both convert the name to punycode and use “www.xn--h28h.ws” which then resolves to the same IPv4 address.

Update: as was pointed out in a comment below, the “64.70.19.202” IP address is not the correct IP for the site. It is just the registrar’s landing page so it sends back that response to any host or domain name in the .ws domain that doesn’t exist!

What do the IDN specs say?

The U-263A smileyThis is not my area of expertise. I had to consult Patrik Fältström here to get this straightened out (but please if I got something wrong here the mistake is still all mine). Apparently this smiley is allowed in RFC 3940 (IDNA2003), but that has been replaced by RFC 5890-5892 (IDNA2008) where this is DISALLOWED. If you read the spec, this is 263A.

So, depending on which spec you follow it was a valid IDN character or it isn’t anymore.

What does the libc docs say?

The POSIX docs for getaddrinfo doesn’t contain enough info to tell who’s right but it doesn’t forbid UTF-8 encoded strings. The regular glibc docs for getaddrinfo also doesn’t say anything and interestingly, the Apple Mac OS X version of the docs says just as little.

With this complete lack of guidance, it is hardly any additional surprise that the glibc gethostbyname docs also doesn’t mention what it does in this case but clearly it doesn’t do the same as getaddrinfo in the glibc case at least.

What’s on the actual site?

A redirect to www.emoticoke.com which shows a rather boring page.

emoticoke

Who’s right?

I don’t know. What do you think?

Why curl defaults to stdout

(Recap: I founded the curl project, I am still the lead developer and maintainer)

When asking curl to get a URL it’ll send the output to stdout by default. You can of course easily change this behavior with options or just using your shell’s redirect feature, but without any option it’ll spew it out to stdout. If you’re invoking the command line on a shell prompt you’ll immediately get to see the response as soon as it arrives.

I decided curl should work like this, and it was a natural decision I made already when I worked on the predecessors during 1997 or so that later would turn into curl.

On Unix systems there’s a common mantra that “everything is a file” but also in fact that “everything is a pipe”. You accomplish things on Unix by piping the output of one program into the input of another program. Of course I wanted curl to work as good as the other components and I wanted it to blend in with the rest. I wanted curl to feel like cat but for a network resource. And cat is certainly not the only pre-curl command that writes to stdout by default; they are plentiful.

And then: once I had made that decision and I released curl for the first time on March 20, 1998: the call was made. The default was set. I will not change a default and hurt millions of users. I rather continue to be questioned by newcomers, but now at least I can point to this blog post! 🙂

About the wget rivalry

cURLAs I mention in my curl vs wget document, a very common comment to me about curl as compared to wget is that wget is “easier to use” because it needs no extra argument in order to download a single URL to a file on disk. I get that, if you type the full commands by hand you’ll use about three keys less to write “wget” instead of “curl -O”, but on the other hand if this is an operation you do often and you care so much about saving key presses I would suggest you make an alias anyway that is even shorter and then the amount of options for the command really doesn’t matter at all anymore.

I put that argument in the same category as the people who argue that wget is easier to use because you can type it with your left hand only on a qwerty keyboard. Sure, that is indeed true but I read it more like someone trying to come up with a reason when in reality there’s actually another one underneath. Sometimes that other reason is a philosophical one about preferring GNU software (which curl isn’t) or one that is licensed under the GPL (which wget is) or simply that wget is what they’re used to and they know its options and recognize or like its progress meter better.

I enjoy our friendly competition with wget and I seriously and honestly think it has made both our projects better and I like that users can throw arguments in our face like “but X can do Y”and X can alter between curl and wget depending on which camp you talk to. I also really like wget as a tool and I am the occasional user of it, just like most Unix users. I contribute to the wget project well, both with code and with general feedback. I consider myself a friend of the current wget maintainer as well as former ones.

curl needs a fresh take on command line options

I just posted about this on the curl-users mailing list and I’ll just echo it here to reach a slightly larger audience:

One of the not so good behaviors of curl is how many of the command line options work when being repeated: toggling on/off.

We’ve got bug reports about this in the past and I know for a fact that this behavior has burnt more than one guy who’s tried to set default options for curl in their .curlrc etc. When they then re-use the same option on the command line or in a script, it effectively disables the option again…

I’d like this corrected. I want people to be able to explicitly enable and disable features with the command line options. I think the toggling is very rarely useful and something we can just abandon – unless we can figure out a way to keep it for backwards compatibility when we introduce the new behavior.

I’m willing to sacrifice some backwards compatibility to get this done, but I would of course like to hurt as few users as possible.

I’m very interested to get ideas and feedback from you guys on how we can accomplish this!

My first thoughts on how to do this, is simply to convert all the current options to enable options and then introduce a new concept that negates the option. Like -v or –verbose to enable verbose, and –no-verbose to disable verbose.

Any bright ideas?

Update: my suggestion above is what has now been committed targeted for the upcoming 7.19.0 release…