I talked with Ed Hoover on the Between Screens podcast a while ago and that episode has now been published. It is a dense 12 minutes, as the good Ed edited it massively.
curl up in Nuremberg!
I’m very happy to announce that the curl project is about to run our first ever curl meeting and developers conference.
March 18-19, Nuremberg Germany
Everyone interested in curl, libcurl and related matters is invited to participate. We only ask that you register and pay the small fee, which will be used for food and more at the event.
You’ll find the full and detailed description of the event and the specific location in the curl wiki.
The agenda for the weekend is purposely kept loose to allow for flexibility and unconference-style adding things and topics while there. You will thus have the chance to present what you like and affect what others present. Do tell us what you’d like to talk about or hear others talk about! The sign-up for the event isn’t open yet, as we first need to work out some more details.
We have a dedicated mailing list for discussing the meeting, called curl-meet, so please consider yourself invited to join in there as well!
Thanks a lot to SUSE for hosting!
Feel free to help us make a cool logo for the event!
(The 19th birthday of curl is suitably enough the day after, on March 20.)
poll on mac 10.12 is broken
When Mac OS X first launched, it did so without a poll function. Apple added poll() in Mac OS X 10.3, but we quickly discovered that it was broken (it returned a non-zero value when asked to wait for nothing), so in the curl project we added a check in configure for that and subsequently avoided using poll() in all OS X versions up to and including Mac OS 10.8 (Darwin 12). The code would instead switch to the alternative solution based on select() for these platforms.
With the release of Mac OS X 10.9 “Mavericks” in October 2013, Apple had fixed their poll() implementation and we’ve built libcurl to use it since with no issues at all. The configure script picks the correct underlying function to use.
Enter macOS 10.12 (yeah, it’s not called OS X anymore) “Sierra”, released in September 2016. We quickly discovered that poll() once again did not act like it should, and we are back to disabling the use of it in preference to the backup solution using select().
The new error looks similar to the old problem: when there’s nothing to wait for and we ask poll() to wait N milliseconds, the 10.12 version of poll() returns immediately without waiting, causing busy-loops. The problem has been reported to Apple and its Radar number is 28372390. (There has been no news from them on how they plan to act on this.)
poll() is defined by POSIX and the Single Unix Specification, which specifically says:
If none of the defined events have occurred on any selected file descriptor, poll() waits at least timeout milliseconds for an event to occur on any of the selected file descriptors.
We pushed a configure check for this in curl, to be part of the upcoming 7.51.0 release. I’ll also show you a small snippet you can use stand-alone below.
Apple is hardly alone in the broken-poll department. Remember how Windows’ WSApoll is broken?
Here’s a little code snippet that can detect the 10.12 breakage:
#include <poll.h>
#include <stdio.h>
#include <sys/time.h>

int main(void)
{
  struct timeval before, after;
  int rc;
  long us;

  gettimeofday(&before, NULL);
  /* wait for nothing, for 500 milliseconds */
  rc = poll(NULL, 0, 500);
  gettimeofday(&after, NULL);
  (void)rc;

  /* elapsed time in microseconds */
  us = (after.tv_sec - before.tv_sec) * 1000000 +
       (after.tv_usec - before.tv_usec);

  if(us < 400000) {
    puts("poll() is broken");
    return 1;
  }
  else {
    puts("poll() works");
  }
  return 0;
}
Follow-up, January 2017
This poll bug has been confirmed fixed in the macOS 10.12.2 update (released on December 13, 2016), but I’ve found no official mention or statement about this fact.
screenshotted curl credits
If you have more or better screenshots, please share!
This shot is taken from the ending sequence of the PC version of the game Grand Theft Auto V. 44 minutes in! See the YouTube version.
Sky HD is a satellite TV box.
This is a Philips TV. The added use of c-ares I consider a bonus!
The infotainment display of a BMW car.
Playstation 4 lists open source products it uses.
This is a screenshot from an iPhone open source license view. The iOS 10 screen, however, looks like this:
curl in iOS 10 with an older year span than in the much older screenshot?
Instagram on an iPhone.
Spotify on an iPhone.
Virtualbox (thanks to Anders Nilsson)
Battle.net (thanks Anders Nilsson)
Freebox (thanks Alexis La Goutte)
The YouTube app on Android. (Thanks Ray Satiro)
The YouTube app on iOS (Thanks Anthony Bryan)
UBReader is an ebook reader app on Android.
MindMaple is using curl (Thanks to Peter Buyze)
license screen from a VW Sharan car (Thanks to Jonas Lejon)
Skype on Android
Skype on an iPad
Nissan Qashqai 2016 (thanks to Peteski)
The Mercedes Benz license agreement from 2015, listing which car models include curl.
Nintendo Switch uses curl (Thanks to Anders Nilsson)
The Thermomix TM 5 kitchen/cooking appliance (Thanks to Sergio Conde)
Cisco Anyconnect (Thanks to Dane Knecht) – notice the age of the curl copyright string in comparison to the main one!
Sony Android TV (Thanks to Sajal Kayan)
The reMarkable E-paper tablet uses curl. (Thanks to Zakx)
BMW i3, snapshot from this video (Thanks to Terence Eden)
BMW i8. (Thanks to eeeebbbbrrrr)
Amazon Kindle Paperwhite 3 (thanks to M Hasbini)
Xiaomi Android uses both curl and libcurl. (Thanks to Björn Stenberg)
Verisure V box microenhet smart lock runs curl (Thanks to Jonas Lejon)
curl in a Subaru (Thanks to Jani Tarvainen)
Another VW (Thanks to Michael Topal)
Oppo Android uses curl (Thanks to Dio Oktarianos Putra)
Chevrolet Traverse 2018 uses curl according to its owner’s manual, on page 403. It is mentioned almost identically in other Chevrolet model manuals, such as those for the Corvette, the 2018 Camaro, the 2018 TRAX, the 2013 VOLT, the 2014 Express and the 2017 BOLT.
The curl license is also in owner manuals for other brands and models such as in the GMC Savana, Cadillac CT6 2016, Opel Zafira, Opel Insignia, Opel Astra, Opel Karl, Opel Cascada, Opel Mokka, Opel Ampera, Vauxhall Astra … (See 100 million cars run curl).
The Onkyo TX-NR609 AV-Receiver uses libcurl as shown by the license in its manual. (Thanks to Marc Hörsken)
Fortnite uses libcurl. (Thanks to Neil McKeown)
Red Dead Redemption 2 uses libcurl. The ending sequence video. (Thanks to @nadbmal)
Philips Hue Lights uses libcurl (Thanks to Lorenzo Fontana)
Pioneer makes Blu-Ray players that use libcurl. (Thanks to Maarten Tijhof)
curl is credited in the game Marvel’s Spider-Man for PS4.
Garmin Fenix 5X Plus runs curl (thanks to Jonas Björk)
Crusader Kings II uses curl (thanks to Frank Gevaerts)
DiRT Rally 2.0 (PlayStation 4 version) uses curl (thanks to Roman)
Microsoft Flight Simulator uses libcurl. Thanks to Till von Ahnen.
Google Photos on Android uses curl.
Crusader Kings III uses curl (thanks to Frank Gevaerts)
The SBahn train in Berlin uses curl! (Thanks to @_Bine__)
LG uses curl in TVs.
Garmin Forerunner 245 also runs curl (Thanks to Martin)
The bicycle computer Hammerhead Karoo v2 (thanks to Adrián Moreno Peña)
Playstation 5 uses curl (thanks to djs)
The Netflix app on Android uses libcurl (screenshot from January 29, 2021). Set to Swedish, hence the title of the screen.
(Google) Android 11 uses libcurl. Screenshot from a Pixel 4a 5g.
Samsung Android uses libcurl in presumably every model…
The ending sequence as seen on YouTube.
A Samsung TV speaking Swedish showing off a curl error code. Thanks to Thomas Svensson.
Polestar 2 (thanks to Robert Friberg for the picture)
Harman Kardon uses libcurl in their Enchant soundbars (thanks to Fabien Benetou). The name and the link in that list are hilarious though.
VW Polo running curl (Thanks to Vivek Selvaraj)
a BMW 2021 R1250GS motorcycle (Thanks to @growse)
Baldur’s Gate 3 uses libcurl (Thanks to Akhlis)
An Andersson TV using curl (Thanks to Björn Stenberg)
Ghost of Tsushima – a game. (Thanks to Patrik Svensson)
Sonic Frontier (Thanks to Esoteric Emperor)
The KAON NS1410 (set top box), possibly also called Mirada Inspire3 or Broadcom Nexus. (Thanks to Aksel)
The Panasonic DC-GH5 camera. (Thanks fasterthanlime)
Plexamp, the Android app. (Thanks Fabio Loli)
The Dacia Sandero Stepway car (Thanks Adnane Belmadiaf)
The Garmin Venu Sq watch (Thanks gergo)
The Eventide H9000 runs curl. A high-end audio processing and effects device. (Thanks to John Baylies)
Diablo IV (Thanks to John Angelmo)
The Siemens EQ900 espresso machine runs curl. Screenshots below from a German version.
Thermomix TM6 by Vorwerk (Thanks to Uli H)
The Grandstream GXP2160 uses curl (thanks to Cameron Katri)
Assassin’s Creed Mirage. (Thanks to Patrik Svensson)
Factorio (Thanks to Der Große Böse Wolff)
Leica Q2 and Leica M11 use curl (Thanks to PattaFeuFeu)
Renault Logan (thanks to Aksel)
The original model of the PlayStation Vita (PCH-1000, 3G) (thanks to ml)
The 2023 Infiniti QX80, Premium Select trim level (an SUV)
Renault Scenic (thanks to Taxo Rubio)
Volvo XC40 Recharge 2024 edition obviously features a libcurl from 2019…
The Roland Fantom 6 Synthesizer Keyboard, runs curl (Thanks Anopka)
Used in the Mini Countryman (the car) (Thanks Alejandro Pablo Revilla)
Nissan Qashqai MY21 (Thanks Michele Adduci)
The Nintendo 3DS internet browser probably runs curl. It does not say so, but the license looks like the curl one. (Thanks to Marlon Pohl)
This is a Seat Leon (Thanks to Mormegil)
25,000 curl questions on stackoverflow
Over time, I’ve reluctantly come to terms with the fact that a lot of questions and answers about curl are not posted on the mailing lists we have set up in the project itself.
A primary such external site with curl related questions is of course stackoverflow – hardly news to programmers of today. The questions tagged with curl are of course only a very tiny fraction of the vast amount of questions and answers that accumulate on that busy site.
The pile of questions tagged with curl on stackoverflow has just surpassed the staggering number of 25,000. Many of these questions come from people asking about particular curl behaviors (and a large portion is about PHP/CURL), but there is also a significant number of questions where curl is only used to do something, and that other something is actually what the question is about. And ‘libcurl’ is a separate tag, often used independently of the ‘curl’ one; libcurl is tagged on almost 2,000 questions.
But still. 25,000 questions. Wow.
I visit that site every so often and answer some questions, but I often end up feeling a great “distance” between me and the questions there, and I have a hard time bridging that gap. Also, stackoverflow the site and its format isn’t really suitable for debugging or solving problems within curl, so I often end up trying to get the user to move over and file an issue on curl’s github page or discuss the curl problem on a mailing list instead. Those forums are more suitable for the plenty of back-and-forth often needed before the solution or fix is figured out.
Now, any bets for how long it takes until we reach 100K questions?
A sea of curl stickers
To spread the word, to show off the logo, to share the love, to boost the brand and to allow us to fill up our own and our friends’ laptop covers, I ordered a set of curl stickers to hand out to friends and fans whenever I meet any going forward. They arrived today, and I thought I’d give you a look. (You can still purchase your own set of curl stickers from unixstickers.com)
The sticker is 74 x 26 mm at its max.
My first 20 years of HTTP
During the autumn of 1996 I took my first swim in the ocean known as HTTP. Twenty years ago now.
I had previously worked with writing an IRC bot in C, and IRC is a pretty simple text based protocol over TCP so I could use some experiences from that when I started to look into HTTP. That IRC bot was my first real application distributed to the world that was using TCP/IP. It was portable to most unixes and Amiga and it was open source.
1996 was the year the movie Independence Day premiered and the single hit song that plagued the world more than others that year was called Macarena. AOL, Webcrawler and Netscape were the most popular websites on the Internet. There were less than 300,000 web sites on the Internet (compared to some 900 million today).
I decided I should spice up the bot and make it offer a currency exchange rate service so that people who were chatting could ask the bot what 200 SEK is when converted to USD or what 50 AUD might be in DEM. – Right, there was no Euro currency yet back then!
I simply had to fetch the currency rates at a regular interval and keep them on the same server that ran the bot. I just needed a little tool to download the rates over HTTP. How hard can that be? I googled around (this was before Google existed, so that was not the search engine I could use!) and found a tool named ‘httpget’ that did pretty much what I wanted. It truly was tiny – a few hundred lines of code.
I don’t have an exact date saved or recorded for when this happened, only the general time frame. You know, we had no smart phones, no Google calendar and no digital cameras. I sported my first mobile phone back then, the sexy Nokia 1610 – viewed in the picture on the right here.
The HTTP/1.0 RFC had just recently come out – the first ever real spec published for HTTP. RFC 1945 was published in May 1996, but I was blissfully unaware of the youth of the standard and plunged into my little project. This first published HTTP spec says:
HTTP has been in use by the World-Wide Web global information initiative since 1990. This specification reflects common usage of the protocol referred to as "HTTP/1.0". This specification describes the features that seem to be consistently implemented in most HTTP/1.0 clients and servers.
Many years later, I learned that wget already existed at the time I first searched for an HTTP tool to use. I can’t recall finding it in my searches, and if I had, maybe history would’ve taken a different turn for me. Or maybe I found it and discarded it for a reason I can’t remember now.
I wasn’t the original author of httpget; Rafael Sagula was. But I started contributing fixes and changes and soon I was the maintainer of it. Unfortunately I’ve lost my emails and source code history from those earliest years so I cannot easily show my first steps. Even the oldest changelogs show that we very soon got help and contributions from users.
The earliest saved code archive I have from those days is from after we had added support for Gopher and FTP and renamed the tool ‘urlget’. urlget-3.5.zip was released on January 20 1998, which thus was more than a year after my involvement in httpget started.
The original httpget/urlget/curl code was stored in CVS and it was licensed under the GPL. I did most of the early development on SunOS and Solaris machines as my first experiments with Linux didn’t start until 97/98 something.
The first web page I know we have saved on archive.org is from December 1998 and by then the project had been renamed to curl already. Roughly two years after the start of the journey.
RFC 2068 was the first HTTP/1.1 spec. It was released already in January 1997, so not that long after the 1.0 spec shipped. In our project however we stuck with doing HTTP 1.0 for a few years longer and it wasn’t until February 2001 we first started doing HTTP/1.1 requests. First shipped in curl 7.7. By then the follow-up spec to HTTP/1.1, RFC 2616, had already been published as well.
The IETF working group called HTTPbis was started in 2007 to once again refresh the HTTP/1.1 spec, but it took me a while until someone pointed out this to me and I realized that I too could join in there and do my part. Up until this point, I had not really considered that little me could actually participate in the protocol doings and bring my views and ideas to the table. At this point, I learned about IETF and how it works.
I posted my first emails on that list in the spring of 2008. The 75th IETF meeting in the summer of 2009 was held in Stockholm, so for me, still working on HTTP only as a spare time project, it was very fortunate and good timing. I could meet a lot of my HTTP heroes and HTTPbis participants in real life for the first time.
I have participated in the HTTPbis group ever since then, trying to uphold the views and standpoints of a command line tool and HTTP library – which often is not the same as the web browsers representatives’ way of looking at things. Since I was employed by Mozilla in 2014, I am of course now also in the “web browser camp” to some extent, but I remain a protocol puritan as curl remains my first “child”.
Mozilla’s search for a new logo
I’m employed by Mozilla. The same Mozilla that recently announced it is looking for feedback on how to revamp its logo and graphical image.
It was with amusement I saw one of the existing suggestions for a new logo, using “://” (colon slash slash) as the name:
… compared with the recently announced new curl logo:
Me being in both teams and being a general Internet protocol enthusiast I couldn’t be more happy if Mozilla would end up using a design so clearly based on the same underlying thoughts. After all,
Imitation is the sincerest of flattery
as Charles Caleb Colton once so eloquently expressed it.
Removing the PowerShell curl alias?
PowerShell is a spiced up command line shell made by Microsoft. According to some people, it is a really useful and good shell alternative.
Already a long time ago, we got bug reports from confused users who couldn’t use curl from their PowerShell prompts and it didn’t take long until we figured out that Microsoft had added aliases for both curl and wget. The alias had the shell instead invoke its own command called “Invoke-WebRequest” whenever curl or wget was entered. Invoke-WebRequest being PowerShell’s own version of a command line tool for fiddling with URLs.
Invoke-WebRequest is of course not anywhere near similar to either curl or wget, and it doesn’t support any of their command line options or anything. The aliases really don’t help users: no user who wants the actual curl or wget is helped by these aliases, and users who don’t know about the real curl and wget won’t use them. They were and remain pointless. But they’ve remained a thorn in my side ever since, me knowing that they are there, confusing users every now and then – though not me personally, since I’m not really a Windows guy.
Fast forward to modern days: Microsoft released PowerShell as open source on github yesterday. Without much further ado, I filed a Pull-Request, asking the aliases to be removed. It is a minuscule, 4 line patch. It took way longer to git clone the repo than to make the actual patch and submit the pull request!
It took 34 minutes for them to close the pull request:
“Those aliases have existed for multiple releases, so removing them would be a breaking change.”
To be honest, I didn’t expect them to merge it easily. I figure they added those aliases for a reason back in the day and it seems unlikely that I as an outsider would just make them change that decision just like this out of the blue.
But the story didn’t end there. Obviously more Microsoft people gave the PR some attention and more comments were added. Like this:
“You bring up a great point. We added a number of aliases for Unix commands but if someone has installed those commands on Windows, those aliases screw them up.
We need to fix this.”
So, maybe it will trigger a change anyway? The story is ongoing…
HTTP/2 connection coalescing
Section 9.1.1 in RFC7540 explains how HTTP/2 clients can reuse connections. This is my lengthy way of explaining how this works in reality.
Many connections in HTTP/1
With HTTP/1.1, browsers typically use 6 connections per origin (host name + port). They do this to overcome the problems in HTTP/1 and how it uses TCP – each connection will do a fair amount of waiting. Also, each connection is slow at start and therefore limited in how much data it can get and send quickly; with each additional connection you multiply that data amount. This makes the browser get more data faster than it would over just one connection.
Add sharding
Web sites with many objects also regularly invent new host names to trigger browsers to use even more connections. A practice known as “sharding”. 6 connections for each name. So if you instead make your site use 4 host names you suddenly get 4 x 6 = 24 connections instead. Mostly all those host names resolve to the same IP address in the end anyway, or the same set of IP addresses. In reality, some sites use many more than just 4 host names.
The sad reality is that a very large percentage of connections used for HTTP/1.1 are only ever used for a single HTTP request, and a very large share of the connections made for HTTP/1 are so short-lived they actually never leave the slow start period before they’re killed off again. Not really ideal.
One connection in HTTP/2
With the introduction of HTTP/2, the HTTP clients of the world are moving toward using a single TCP connection for each origin. The idea being that one connection is better in packet loss scenarios, it makes priorities/dependencies work, and reusing that single connection for many more requests will be a net gain. And as you remember, HTTP/2 allows many logical streams in parallel over that single connection, so the single connection doesn’t limit what the browsers can ask for.
Unsharding
The sites that created all those additional host names to make the HTTP/1 browsers use many connections now work against the HTTP/2 browsers’ desire to decrease the number of connections to a single one. Sites don’t want to switch back to using a single host name because that would be a significant architectural change and there are still a fair number of HTTP/1-only browsers still in use.
Enter “connection coalescing”, or “unsharding” as we sometimes like to call it. You won’t find either term used in RFC7540, as it merely describes this concept in terms of connection reuse.
Connection coalescing means that the browser tries to determine which remote hosts it can reach over the same TCP connection. The different browsers have slightly different heuristics here, and some don’t do it at all, but let me try to explain how they work – as far as I know and at this point in time.
Coalescing by example
Let’s say that this cool imaginary site “example.com” has two name entries in DNS: A.example.com and B.example.com. When resolving those names over DNS, the client gets a list of IP addresses back for each name. A list that may very well contain a mix of IPv4 and IPv6 addresses. One list for each name.
You must also remember that browsers only ever use HTTP/2 over HTTPS, so for each origin speaking HTTP/2 there’s also a corresponding server certificate with a list of names, or a wildcard pattern, for which that server is authorized to respond.
In our example we start out by connecting the browser to A. Let’s say resolving A returns the IPs 192.168.0.1 and 192.168.0.2 from DNS, so the browser goes on and connects to the first of those addresses, the one ending with “1”. The browser gets the server cert back in the TLS handshake and as a result of that, it also gets a list of host names the server can deal with: A.example.com and B.example.com. (it could also be a wildcard like “*.example.com”)
If the browser then wants to connect to B, it’ll resolve that host name too to a list of IPs. Let’s say 192.168.0.2 and 192.168.0.3 here.
Host A: 192.168.0.1 and 192.168.0.2
Host B: 192.168.0.2 and 192.168.0.3
Now hold it. Here it comes.
The Firefox way
Host A has two addresses, host B has two addresses. The lists of addresses are not the same, but there is an overlap – both lists contain 192.168.0.2. And host A has already stated that it is authoritative for B as well. In this situation, Firefox will not make a second connection to host B. It will reuse the connection to host A and ask for host B’s content over that single shared connection. This is the most aggressive coalescing method in use.
The Chrome way
Chrome features a slightly less aggressive coalescing. In the example above, when the browser has connected to 192.168.0.1 for the first host name, Chrome will require that the IPs for host B contains that specific IP for it to reuse that connection. If the returned IPs for host B really are 192.168.0.2 and 192.168.0.3, it clearly doesn’t contain 192.168.0.1 and so Chrome will create a new connection to host B.
Chrome will reuse the connection to host A if resolving host B returns a list that contains the specific IP of the connection host A is already using.
The Edge and Safari ways
They don’t do coalescing at all, so each host name will get its own single connection. Better than the 6 connections from HTTP/1 but for very sharded sites that means a lot of connections even in the HTTP/2 case.
curl also doesn’t coalesce anything (yet).
Surprises and a way to mitigate them
Given some comments in the Firefox bugzilla, the aggressive coalescing sometimes causes some surprises. Especially when you have for example one IPv6-only host A and a second host B with both IPv4 and IPv6 addresses. Asking for data on host A can then still use IPv4 when it reuses a connection to B (assuming that host A covers host B in its cert).
In the rare case where a server gets a resource request for an authority (or scheme) it can’t serve, there’s a dedicated error code 421 in HTTP/2 that it can respond with and the browser can then go back and retry that request on another connection.
Starts out with 6 anyway
Before the browser knows that the server speaks HTTP/2, it may fire up 6 connection attempts so that it is prepared to get the remote site at full speed. Once it figures out that it doesn’t need all those connections, it will kill off the unnecessary unused ones and over time trickle down to one. Of course, on subsequent connections to the same origin the client may have the version information cached so that it doesn’t have to start off presuming HTTP/1.