live-streamed curl development

As some of you already found out, I’ve tried live-streaming curl development recently. If you want to catch previous and upcoming episodes subscribe on my twitch page.

Why stream

For the fun of it. I work alone from home most of the time and this is a way for me to interact with others.

To show what’s going on in curl right now. By streaming some of my development I also show what kind of work that’s being done, showing that a lot of development and work are being put into curl and I can share my thoughts and plans with a wider community. Perhaps this will help getting more people to help out or to tickle their imagination.

A screenshot from live stream #11 when parallel transfers with curl was shown off for the first time ever!

For the feedback and interaction. It is immediately notable that one of the biggest reasons I enjoy live-streaming is the chat with the audience and the instant feedback on mistakes I do or thoughts and plans I express. It becomes a back-and-forth and it is not at all just a one-way broadcast. The more my audience interact with me, the more fun I have! That’s also the reason I show the chat within the stream most of the time since parts of what I say and do are reactions and follow-ups to what happens there.

I can only hope I get even more feedback and comments as I get better at this and that people find out about what I’m doing here.

And really, by now I also think of it as a really concentrated and devoted hacking time. I can get a lot of things done during these streaming sessions! I’ll try to keep them going a while.

Twitch

I decided to go with twitch simply because it is an established and known live-streaming platform. I didn’t do any deeper analyses or comparisons, but it seems to work fine for my purposes. I get a stream out with video and sound and people seem to be able to enjoy it.

As of this writing, there are 1645 people following me on twitch. Typical recent live-streams of mine have been watched by over a hundred simultaneous viewers. I also archive all past streams on Youtube, so you can get almost the same experience my watching back issues there.

I announce my upcoming streaming sessions as “events” on Twitch, and I announce them on twitter (@bagder you know). I try to stick to streaming on European day time hours basically because then I’m all alone at home and risk fewer interruptions or distractions from family members or similar.

Challenges

It’s not as easy as it may look trying to write code or debug an issue while at the same time explaining what I do. I learnt that the sessions get better if I have real and meaty issues to deal with or features to add, rather than to just have a few light-weight things to polish.

I also quickly learned that it is better to now not show an actual screen of mine in the stream, but instead I show a crafted set of windows placed on the output to look like it is a screen. This way there’s a much smaller risk that I actually show off private stuff or other content that wasn’t meant for the audience to see. It also makes it easier to show a tidy, consistent and clear “desktop”.

Streaming makes me have to stay focused on the development and prevents me from drifting off and watching cats or reading amusing tweets for a while

Trolls

So far we’ve been spared from the worst kind of behavior and people. We’ve only had some mild weirdos showing up in the chat and nothing that we couldn’t handle.

Equipment and software

I do all development on Linux so things have to work fine on Linux. Luckily, OBS Studio is a fine streaming app. With this, I can setup different “scenes” and I can change between them easily. Some of the scenes I have created are “emacs + term”, “browser” and “coffee break”.

When I want to show off me fiddling with the issues on github, I switch to the “browser” scene that primarily shows a big browser window (and the chat and the webcam in smaller windows).

When I want to show code, I switch to “emacs + term” that instead shows a terminal and an emacs window (and again the chat and the webcam in smaller windows), and so on.

OBS has built-in support for some of the major streaming services, including twitch, so it’s just a matter of pasting in a key in an input field, press ‘start streaming’ and go!

The rest of the software is the stuff I normally use anyway for developing. I don’t fake anything and I don’t make anything up. I use emacs, make, terminals, gdb etc. Everything this runs on my primary desktop Debian Linux machine that has 32GB of ram, an older i7-3770K CPU at 3.50GHz with a dual screen setup. The video of me is captured with a basic Logitech C270 webcam and the sound of my voice and the keyboard is picked up with my Sennheiser PC8 headset.

Some viewers have asked me about my keyboard which you can hear. It is a FUNC-460 that is now approaching 5 years, and I know for a fact that I press nearly 7 million keys per year.

Coffee

In a reddit post about my live-streaming, user ‘digitalsin’ suggested “Maybe don’t slurp RIGHT INTO THE FUCKING MIC”.

How else am I supposed to have my coffee while developing?

This is my home office standard setup. On the left is my video conference laptop and on the right is my regular work laptop. The two screens in the middle are connected to the desktop computer.

What is the incentive for curl to release the library for free?

(This is a repost of the answer I posted on stackoverflow for this question. This answer immediately became my most ever upvoted answer on stackoverflow with 516 upvotes during the 48 hours it was up before a moderator deleted it for unspecified reasons. It had then already been marked “on hold” for being “primarily opinion- based” and then locked but kept: “exists because it has historical significance”. But apparently that wasn’t good enough. I’ve saved a screenshot of the deletion. Debated on meta.stackoverflow.com. Status now: it was brought back but remains locked.)

I’m Daniel Stenberg.

I made curl

I founded the curl project back in 1998, I wrote the initial curl version and I created libcurl. I’ve written more than half of all the 24,000 commits done in the source code repository up to this point in time. I’m still the lead developer of the project. To a large extent, curl is my baby.

I shipped the first version of curl as open source since I wanted to “give back” to the open source world that had given me so much code already. I had used so much open source and I wanted to be as cool as the other open source authors.

Thanks to it being open source, literally thousands of people have been able to help us out over the years and have improved the products, the documentation. the web site and just about every other detail around the project. curl and libcurl would never have become the products that they are today were they not open source. The list of contributors now surpass 1900 names and currently the list grows with a few hundred names per year.

Thanks to curl and libcurl being open source and liberally licensed, they were immediately adopted in numerous products and soon shipped by operating systems and Linux distributions everywhere thus getting a reach beyond imagination.

Thanks to them being “everywhere”, available and liberally licensed they got adopted and used everywhere and by everyone. It created a defacto transfer library standard.

At an estimated six billion installations world wide, we can safely say that curl is the most widely used internet transfer library in the world. It simply would not have gone there had it not been open source. curl runs in billions of mobile phones, a billion Windows 10 installations, in a half a billion games and several hundred million TVs – and more.

Should I have released it with proprietary license instead and charged users for it? It never occured to me, and it wouldn’t have worked because I would never had managed to create this kind of stellar project on my own. And projects and companies wouldn’t have used it.

Why do I still work on curl?

Now, why do I and my fellow curl developers still continue to develop curl and give it away for free to the world?

  1. I can’t speak for my fellow project team members. We all participate in this for our own reasons.
  2. I think it’s still the right thing to do. I’m proud of what we’ve accomplished and I truly want to make the world a better place and I think curl does its little part in this.
  3. There are still bugs to fix and features to add!
  4. curl is free but my time is not. I still have a job and someone still has to pay someone for me to get paid every month so that I can put food on the table for my family. I charge customers and companies to help them with curl. You too can get my help for a fee, which then indirectly helps making sure that curl continues to evolve, remain free and the kick-ass product it is.
  5. curl was my spare time project for twenty years before I started working with it full time. I’ve had great jobs and worked on awesome projects. I’ve been in a position of luxury where I could continue to work on curl on my spare time and keep shipping a quality product for free. My work on curl has given me friends, boosted my career and taken me to places I would not have been at otherwise.
  6. I would not do it differently if I could back and do it again.

Am I proud of what we’ve done?

Yes. So insanely much.

But I’m not satisfied with this and I’m not just leaning back, happy with what we’ve done. I keep working on curl every single day, to improve, to fix bugs, to add features and to make sure curl keeps being the number one file transfer solution for the world even going forward.

We do mistakes along the way. We make the wrong decisions and sometimes we implement things in crazy ways. But to win in the end and to conquer the world is about patience and endurance and constantly going back and reconsidering previous decisions and correcting previous mistakes. To continuously iterate, polish off rough edges and gradually improve over time.

Never give in. Never stop. Fix bugs. Add features. Iterate. To the end of time.

For real?

Yeah. For real.

Do I ever get tired? Is it ever done?

Sure I get tired at times. Working on something every day for over twenty years isn’t a paved downhill road. Sometimes there are obstacles. During times things are rough. Occasionally people are just as ugly and annoying as people can be.

But curl is my life’s project and I have patience. I have thick skin and I don’t give up easily. The tough times pass and most days are awesome. I get to hang out with awesome people and the reward is knowing that my code helps driving the Internet revolution everywhere is an ego boost above normal.

curl will never be “done” and so far I think work on curl is pretty much the most fun I can imagine. Yes, I still think so even after twenty years in the driver’s seat. And as long as I think it’s fun I intend to keep at it.

Why they use curl

As a reader of my blog you know curl. You also most probably already know why you would use curl and if I’m right, you’re also a fan of using the right tool for the job. But do you know why others use curl and why they switch from other solutions to relying on curl for their current and future data transfers? Let me tell you the top reasons I’m told by users.

Logging and exact error handling

What exactly happened in the transfer and why are terribly important questions to some users, and with curl you have the tools to figure that out and also be sure that curl either returns failure or the command worked. This clear and binary distinction is important to users for whom that single file every transfer is important. For example, some of the largest and most well-known banks in the world use curl in their back-ends where each file transfer can mean a transfer of extremely large sums of money.

A few years ago I helped a money transaction service switch to curl to get that exact line in the sand figured out. To know exactly and with certainty if money had been transferred – or not – for a given operation. Vital for their business.

curl does not have the browsers’ lenient approach of “anything goes as long as we get something to show” when it comes to the Internet protocols.

Verbose goodness

curl’s verbose output options allow users to see exactly what curl sends and receives in a quick and non-complicated way. This is invaluable for developers to figure out what’s happening and what’s wrong, in either end involved in the data transfer.

curl’s verbose options allows developers to see all sent and received data even when encryption is used. And if that is not enough, its SSLKEYLOGFILE support allows you to take it to the next level when you need to!

Same behavior over time

Users sometimes upgrade their curl installations after several years of not having done so. Bumping a software’s version after several years and many releases, any software really, can be a bit of a journey and adventure many times as things have changed, behavior is different and things that previously worked no longer do etc.

With curl however, you can upgrade to a version that is a decade newer, with lots of new fancy features and old crummy bugs fixed, only to see that everything that used to work back in the day still works – the same way. With curl, you can be sure that there’s an enormous focus on maintaining old functionality when going forward.

Present on all platforms

The fact that curl is highly portable, our users can have curl and use curl on just about any platform you can think of and use it with the same options and behaviors across them all. Learn curl on one platform, then continue to use it the same way on the next system. Platforms and their individual popularity vary over time and we enjoy to allow users to pick the ones they like – and you can be sure that curl will run on them all.

Performance

When doing the occasional file transfer every once in a while, raw transfer performance doesn’t matter much. Most of the time will then just be waiting on network anyway. You can easily get away with your Python and java frameworks’ multiple levels of overhead and excessive memory consumption.

Users who scan the Internet or otherwise perform many thousands of transfers per second from a large number of threads and machines realize that they need fewer machines that spend less CPU time if they build their file transfer solutions on top of curl. In curl we have a focus on only doing what’s required and it’s a lean and trimmed solution with a well-documented API built purely for Internet data transfers.

The features you want

The author of a banking application recently explained for us that one of the top reasons why they switched to using curl for doing their Internet data transfers, is curl’s ability to keep the file name from the URL.

curl is a feature-packed tool and library that most likely already support the protocols you need and provide the power features you want. With a healthy amount of “extension points” where you can extend it or hook in your custom extra solution.

Support and documentation

No other tool or library for internet transfers have even close to the same amount of documentation, examples available on the net, existing user base that can help out and friendly users to support you when you run into issues. Ask questions on the mailing lists, post a bug on the bug tracker or even show your non-working code on stackoverflow to further your project.

curl is really the only Internet transfer option available to get something that’s old and battle-proven proven by the giants of the industry, that is trustworthy, high-performing and yet for which you can also buy commercial support for, today.

This blog post was also co-posted on wolfssl.com.

curl + hackerone = TRUE

There seems to be no end to updated posts about bug bounties in the curl project these days. Not long ago I mentioned the then new program that sadly enough was cancelled only a few months after its birth.

Now we are back with a new and refreshed bug bounty program! The curl bug bounty program reborn.

This new program, which hopefully will manage to survive a while, is setup in cooperation with the major bug bounty player out there: hackerone.

Basic rules

If you find or suspect a security related issue in curl or libcurl, report it! (and don’t speak about it in public at all until an agreed future date.)

You’re entitled to ask for a bounty for all and every valid and confirmed security problem that wasn’t already reported and that exists in the latest public release.

The curl security team will then assess the report and the problem and will then reward money depending on bug severity and other details.

Where does the money come from?

We intend to use funds and money from wherever we can. The Hackerone Internet Bug Bounty program helps us, donations collected over at opencollective will be used as well as dedicated company sponsorships.

We will of course also greatly appreciate any direct sponsorships from companies for this program. You can help curl getting even better by adding funds to the bounty program and help us reward hard-working researchers.

Why bounties at all?

We compete for the security researchers’ time and attention with other projects, both open and proprietary. The projects that can help put food on these researchers’ tables might have a better chance of getting them to use their tools, time, skills and fingers to find our problems instead of someone else’s.

Finding and disclosing security problems can be very time and resource consuming. We want to make it less likely that people give up their attempts before they find anything. We can help full and part time security engineers sustain their livelihood by paying for the fruits of their labor. At least a little bit.

Only released code?

The state of the code repository in git is not subject for bounties. We need to allow developers to do mistakes and to experiment a little in the git repository, while we expect and want every actual public release to be free from security vulnerabilities.

So yes, the obvious downside with this is that someone could spot an issue in git and decide not to report it since it doesn’t give any money and hope that the flaw will linger around and ship in the release – and then reported it and claim reward money. I think we just have to trust that this will not be a standard practice and if we in fact notice that someone tries to exploit the bounty in this manner, we can consider counter-measures then.

How about money for the patches?

There’s of course always a discussion as to why we should pay anyone for bugs and then why just pay for reported security problems and not for heroes who authored the code in the first place and neither for the good people who write the patches to fix the reported issues. Those are valid questions and we would of course rather pay every contributor a lot of money, but we don’t have the funds for that. And getting funding for this kind of dedicated bug bounties seem to be doable, where as a generic pay contributors fund is trickier both to attract money but it is also really hard to distribute in an open project of curl’s nature.

How much money?

At the start of this program the award amounts are as following. We reward up to this amount of money for vulnerabilities of the following security levels:

Critical: 2,000 USD
High: 1,500 USD
Medium: 1,000 USD
Low: 500 USD

Depending on how things go, how fast we drain the fund and how much companies help us refill, the amounts may change over time.

Found a security flaw?

Report it!

One year in still no visa

One year ago today. On the sunny Tuesday of April 17th 2018 I visited the US embassy in Stockholm Sweden and applied for a visa. I’m still waiting for them to respond.

My days-since-my-visa-application counter page is still counting. Technically speaking, I had already applied but that was the day of the actual physical in-person interview that served as the last formal step in the application process. Most people are then getting their visa application confirmed within weeks.

Initially I emailed them a few times after that interview since the process took so long (little did I know back then), but now I haven’t done it for many months. Their last response assured me that they are “working on it”.

Lots of things have happened in my life since last April. I quit my job at Mozilla and started a new job at wolfSSL, again working for a US based company. One that I cannot go visit.

During this year I missed out on a Mozilla all-hands, I’ve been invited to the US several times to talk at conferences that I had to decline and a friend is getting married there this summer and I can’t go. And more.

Going forward I will miss more interesting meetings and speaking opportunities and I have many friends whom I cannot visit. This is a mild blocker to things I would want to do and it is an obstacle to my profession and career.

I guess I might get my rejection notice before my counter reaches two full years, based on stories I’ve heard from other people in similar situations such as mine. I don’t know yet what I’ll do when it eventually happens. I don’t think there are any rules that prevent me from reapplying, other than the fact that I need to pay more money and I can’t think of any particular reason why they would change their minds just by me trying again. I will probably give it a rest a while first.

I’m lucky and fortunate that people and organizations have adopted to my situation – a situation I of course I share with many others so it’s not uniquely mine – so lots of meetings and events have been held outside of the US at least partially to accommodate me. I’m most grateful for this and I understand that at times it won’t work and I then can’t attend. These days most things are at least partly accessible via video streams etc, repairing the harm a little. (And yes, this is a first-world problem and I’m fortunate that I can still travel to most other parts of the world without much trouble.)

Finally: no, I still have no clue as to why they act like this and I don’t have any hope of ever finding out.


Test servers for curl

curl supports some twenty-three protocols (depending on exactly how you count).

In order to properly test and verify curl’s implementations of each of these protocols, we have a test suite. In the test suite we have a set of handcrafted servers that speak the server-side of these protocols. The more used a protocol is, the more important it is to have it thoroughly tested.

We believe in having test servers that are “stupid” and that offer buttons, levers and thresholds for us to control and manipulate how they act and how they respond for testing purposes. The control of what to send should be dictated as much as possible by the test case description file. If we want a server to send back a slightly broken protocol sequence to check how curl supports that, the server must be open for this.

In order to do this with a large degree of freedom and without restrictions, we’ve found that using “real” server software for this purpose is usually not good enough. Testing the broken and bad cases are typically not easily done then. Actual server software tries hard to do the right thing and obey standards and protocols, while we rather don’t want the server to make any decisions by itself at all but just send exactly the bytes we ask it to. Simply put.

Of course we don’t always get what we want and some of these protocols are fairly complicated which offer challenges in sticking to this policy all the way. Then we need to be pragmatic and go with what’s available and what we can make work. Having test cases run against a real server is still better than no test cases at all.

Now SOCKS

“SOCKS is an Internet protocol that exchanges network packets between a client and server through a proxy server. Practically, a SOCKS server proxies TCP connections to an arbitrary IP address, and provides a means for UDP packets to be forwarded.

(according to Wikipedia)

Recently we fixed a bug in how curl sends credentials to a SOCKS5 proxy as it turned out the protocol itself only supports user name and password length of 255 bytes each, while curl normally has no such limits and could pass on credentials with virtually infinite lengths. OK, that was silly and we fixed the bug. Now curl will properly return an error if you try such long credentials with your SOCKS5 proxy.

As a general rule, fixing a bug should mean adding at least one new test case, right? Up to this time we had been testing the curl SOCKS support by firing up an ssh client and having that setup a SOCKS proxy that connects to the other test servers.

curl -> ssh with SOCKS proxy -> test server

Since this setup doesn’t support SOCKS5 authentication, it turned out complicated to add a test case to verify that this bug was actually fixed.

This test problem was fixed by the introduction of a newly written SOCKS proxy server dedicated for the curl test suite (which I simply named socksd). It does the basic SOCKS4 and SOCKS5 protocol logic and also supports a range of commands to control how it behaves and what it allows so that we can now write test cases against this server and ask the server to misbehave or otherwise require fun things so that we can make really sure curl supports those cases as well.

It also has the additional bonus that it works without ssh being present so it will be able to run on more systems and thus the SOCKS code in curl will now be tested more widely than before.

curl -> socksd -> test server

Going forward, we should also be able to create even more SOCKS tests with this and make sure to get even better SOCKS test coverage.

no more global dns cache in curl

In January 2002, we added support for a global DNS cache in libcurl. All transfers set to use it would share and use the same global cache.

We rather quickly realized that having a global cache without locking was error-prone and not really advisable, so already in March 2004 we added comments in the header file suggesting that users should not use this option.

It remained in the code and time passed.

In the autumn of 2018, another fourteen years later, we finally addressed the issue when we announced a plan for this options deprecation. We announced a date for when it would become deprecated and disabled in code (7.62.0), and then six months later if no major incidents or outcries would occur, we said we would delete the code completely.

That time has now arrived. All code supporting a global DNS cache in curl has been removed. Any libcurl-using program that sets this option from now on will simply not get a global cache and instead proceed with the default handle-oriented cache, and the documentation is updated to clearly indicate that this is the case. This change will ship in curl 7.65.0 due to be released in May 2019 (merged in this commit).

If a program still uses this option, the only really noticeable effect should be a slightly worse name resolving performance, assuming the global cache had any point previously.

Programs that want to continue to have a DNS cache shared between multiple handles should use the share interface, which allows shared DNS cache and more – with locking. This API has been offered by libcurl since 2003.


curl says bye bye to pipelining

HTTP/1.1 Pipelining is the protocol feature where the client sends off a second HTTP/1.1 request already before the answer to the previous request has arrived (completely) from the server. It is defined in the original HTTP/1.1 spec and is a way to avoid waiting times. To reduce latency.

HTTP/1.1 Pipelining was badly supported by curl for a long time in the sense that we had a series of known bugs and it was a fragile feature without enough tests. Also, pipelining is fairly tricky to debug due to the timing sensitivity so very often enabling debug outputs or similar completely changes the nature of the behavior and things are not reproducing anymore!

HTTP pipelining was never enabled by default by the large desktop browsers due to all the issues with it, like broken server implementations and the likes. Both Firefox and Chrome dropped pipelining support entirely since a long time back now. curl did in fact over time become more and more lonely in supporting pipelining.

The bad state of HTTP pipelining was a primary driving factor behind HTTP/2 and its multiplexing feature. HTTP/2 multiplexing is truly and really “pipelining done right”. It is way more solid, practical and solves the use case in a better way with better performance and fewer downsides and problems. (curl enables multiplexing by default since 7.62.0.)

In 2019, pipelining should be abandoned and HTTP/2 should be used instead.

Starting with this commit, to be shipped in release 7.65.0, curl no longer has any code that supports HTTP/1.1 pipelining. It has been disabled in the code since 7.62.0 already so applications and users that use a recent version already should not notice any difference.

Pipelining was always offered on a best-effort basis and there was never any guarantee that requests would actually be pipelined, so we can remove this feature entirely without breaking API or ABI promises. Applications that ask libcurl to use pipelining can still do that, it just won’t have any effect.

Workshop Season 4 Finale

The 2019 HTTP Workshop ended today. In total over the years, we have now done 12 workshop days up to now. This day was not a full day and we spent it on only two major topics that both triggered long discussions involving large parts of the room.

Cookies

Mike West kicked off the morning with his cookies are bad presentation.

One out of every thousand cookie header values is 10K or larger in size and even at the 50% percentile, the size is 480 bytes. They’re a disaster on so many levels. The additional features that have been added during the last decade are still mostly unused. Mike suggests that maybe the only way forward is to introduce a replacement that avoids the issues, and over longer remove cookies from the web: HTTP state tokens.

A lot of people in the room had opinions and thoughts on this. I don’t think people in general have a strong love for cookies and the way they currently work, but the how-to-replace-them question still triggered lots of concerns about issues from routing performance on the server side to the changed nature of the mechanisms that won’t encourage web developers to move over. Just adding a new mechanism without seeing the old one actually getting removed might not be a win.

We should possibly “worsen” the cookie experience over time to encourage switch over. To cap allowed sizes, limit use to only over HTTPS, reduce lifetimes etc, but even just that will take effort and require that the primary cookie consumers (browsers) have a strong will to hurt some amount of existing users/sites.

(Related: Mike is also one of the authors of the RFC6265bis draft in progress – a future refreshed cookie spec.)

HTTP/3

Mike Bishop did an excellent presentation of HTTP/3 for HTTP people that possibly haven’t kept up fully with the developments in the QUIC working group. From a plain HTTP view, HTTP/3 is very similar feature-wise to HTTP/2 but of course sent over a completely different transport layer. (The HTTP/3 draft.)

Most of the questions and discussions that followed were rather related to the transport, to QUIC. Its encryption, it being UDP, DOS prevention, it being “CPU hungry” etc. Deploying HTTP/3 might be a challenge for successful client side implementation, but that’s just nothing compared the totally new thing that will be necessary server-side. Web developers should largely not even have to care…

One tidbit that was mentioned is that in current Firefox telemetry, it shows about 0.84% of all requests negotiates TLS 1.3 early data (with about 12.9% using TLS 1.3)

Thought-worthy quote of the day comes from Willy: “everything is a buffer”

Future Workshops

There’s no next workshop planned but there might still very well be another one arranged in the future. The most suitable interval for this series isn’t really determined and there might be reasons to try tweaking the format to maybe change who will attend etc.

The fact that almost half the attendees this time were newcomers was certainly good for the community but that not a single attendee traveled here from Asia was less good.

Thanks

Thanks to the organizers, the program committee who set this up so nicely and the awesome sponsors!

curl, open source and networking