Tag Archives: Google

Google Open Source Peer Bonus award 2023

I am honored to yet again receive a peer bonus award from Google. This is a Google program for which persons like me can be nominated by Googlers and as a result receive grants.

I previously received such an award in 2020.

Update

A few people noticed and have commented on the fact that this letter is signed by Chris DiBona and dated April 19th 2023, while sources say he was let go from Google back in January. Which means one or two of those things are wrong.

a Google grant for libcurl work

Earlier this year I was the recipient of a monetary Google patch grant with the expressed purpose of improving security in libcurl.

This was an upfront payout under this Google program describing itself as “an experimental program that rewards proactive security improvements to select open-source projects”.

I accepted this grant for the curl project and I intend to keep working fiercely on securing curl. I recognize the importance of curl security as curl remains one of the most widely used software components in the world, and even one that is doing network data transfers which typically is a risky business. curl is responsible for a measurable share of all Internet transfers done over the Internet an average day. My job is to make sure those transfers are done as safe and secure as possible. It isn’t my only responsibility of course, as I have other tasks to attend to as well, but still.

Do more

Security is already and always a top priority in the curl project and for myself personally. This grant will of course further my efforts to strengthen curl and by association, all the many users of it.

What I will not do

When security comes up in relation to curl, some people like to mention and propagate for other programming languages, But curl will not be rewritten in another language. Instead we will increase our efforts in writing good C and detecting problems in our code earlier and better.

Proactive counter-measures

Things we have done lately and working on to enforce everywhere:

String and buffer size limits – all string inputs and all buffers in libcurl that are allowed to grow now have a maximum allowed size, that makes sense. This stops malicious uses that could make things grow out of control and it helps detecting programming mistakes that would lead to the same problems. Also, by making sure strings and buffers are never ridiculously large, we avoid a whole class of integer overflow risks better.

Unified dynamic buffer functions – by reducing the number of different implementations that handle “growing buffers” we reduce the risk of a bug in one of them, even if it is used rarely or the spot is hard to reach with and “exercise” by the fuzzers. The “dynbuf” internal API first shipped in curl 7.71.0 (June 2020).

Realloc buffer growth unification – pretty much the same point as the previous, but we have earlier in our history had several issues when we had silly realloc() treatment that could lead to bad things. By limiting string sizes and unifying the buffer functions, we have reduced the number of places we use realloc and thus we reduce the number of places risking new realloc mistakes. The realloc mistakes were usually in combination with integer overflows.

Code style – we’ve gradually improved our code style checker (checksrc.pl) over time and we’ve also gradually made our code style more strict, leading to less variations in code, in white spacing and in naming. I’m a firm believer this makes the code look more coherent and therefore become more readable which leads to fewer bugs and easier to debug code. It also makes it easier to grep and search for code as you have fewer variations to scan for.

More code analyzers – we run every commit and PR through a large number of code analyzers to help us catch mistakes early, and we always remove detected problems. Analyzers used at the time of this writing: lgtm.com, Codacy, Deepcode AI, Monocle AI, clang tidy, scan-build, CodeQL, Muse and Coverity. That’s of course in addition to the regular run-time tools such as valgrind and sanitizer builds that run the entire test suite.

Memory-safe components – curl already supports getting built with a plethora of different libraries and “backends” to cater for users’ needs and desires. By properly supporting and offering users to build with components that are written in for example rust – or other languages that help developers avoid pitfalls – future curl and libcurl builds could potentially avoid a whole section of risks. (Stay tuned for more on this topic in a near future.)

Reactive measures

Recognizing that whatever we do and however tight ship we run, we will continue to slip every once in a while, is important and we should make sure we find and fix such slip-ups as good and early as possible.

Raising bounty rewards. While not directly fixing things, offering more money in our bug-bounty program helps us get more attention from security researchers. Our ambition is to gently drive up the reward amounts progressively to perhaps multi-thousand dollars per flaw, as long as we have funds to pay for them and we mange keep the security vulnerabilities at a reasonably low frequency.

More fuzzing. I’ve said it before but let me say it again: fuzzing is really the top method to find problems in curl once we’ve fixed all flaws that the static analyzers we use have pointed out. The primary fuzzing for curl is done by OSS-Fuzz, that tirelessly keeps hammering on the most recent curl code.

Good fuzzing needs a certain degree of “hand-holding” to allow it to really test all the APIs and dig into the dustiest corners, and we should work on adding more “probes” and entry-points into libcurl for the fuzzer to make it exercise more code paths to potentially detect more mistakes.

See also my presentation testing curl for security.

Google Open Source Peer Bonus award 2020

I’m honored to – once again – be a recipient of this award Google hands out to open source contributors, annually. I was previously awarded this in 2011.

I don’t get a lot of awards. Getting this token of appreciation feels awesome and I’m humbled and grateful I was not only nominated but also actually selected as recipient. Thank you, Google!

Nine years ago I got 350 USD credits in the Google store and I got my family a set of jackets using them – my kids have grown significantly since then, so to them those black beauties are now just a distant memory, but I still actually wear mine from time to time!

curl beers and curl stickers!

This time, the reward comes with a 250 USD “payout” (that’s the gift mentioned in the mail above), as a real money transfer that can be spent on other things than just Google merchandise!

I’ve decided to accept the reward and the money and I intend to spend it on beer and curl stickers for my friends and fans. As I prefer to view it:

The Google Open Source Beer Bonus.

Thank you Google and thank you Gaspar!

Update: the Google Open Source blog post about it.

Google to reimplement curl in libcrurl

Not the entire thing, just “a subset”. It’s not stated very clearly exactly what that subset is but the easy interface is mentioned in the Chrome bug about this project.

What?

The Chromium bug states that they will create a library of their own (named libcrurl) that will offer (parts of) the libcurl API and be implemented using Cronet.

Cronet is the networking stack of Chromium put into a library for use on mobile. The same networking stack that is used in the Chrome browser.

There’s also a mentioned possibility that “if this works”, they might also create “crurl” tool which is then their own version of the curl tool but using their own library. In itself is a pretty strong indication that their API will not be fully compatible, as if it was they could just use the existing curl tool…

Why?

“Implementing libcurl using Cronet would allow developers to take advantage of the utility of the Chrome Network Stack, without having to learn a new interface and its corresponding workflow. This would ideally increase ease of accessibility of Cronet, and overall improve adoption of Cronet by first-party or third-party applications.”

Logically, I suppose they then also hope that 3rd party applications can switch to this library (without having to change to another API or adapt much) and gain something and that new applications can use this library without having to learn a new API. Stick to the old established libcurl API.

How?

By throwing a lot of man power on it. As the primary author and developer of the libcurl API and the libcurl code, I assume that Cronet works quite differently than libcurl so there’s going to be quite a lot of wrestling of data and code flow to make this API work on that code.

The libcurl API is also very versatile and is an API that has developed over a period of almost 20 years so there’s a lot of functionality, a lot of options and a lot of subtle behavior that may or may not be easy or straight forward to mimic.

The initial commit imported the headers and examples from the curl 7.65.1 release.

Will it work?

Getting basic functionality for a small set of use cases should be simple and straight forward. But even if they limit the subset to number of functions and libcurl options, making them work exactly as we have them documented will be hard and time consuming.

I don’t think applications will be able to arbitrarily use either library for a very long time, if ever. libcurl has 80 public functions and curl_easy_setopt alone takes 268 different options!

Given enough time and effort they can certainly make this work to some degree.

Releases?

There’s no word on API/ABI stability or how they intend to ship or version their library. It is all very early still. I suppose we will learn more details as and if this progresses.

Flattered?

I think this move underscores that libcurl has succeeded in becoming an almost defacto standard for network transfers.

A Google office building in New York.

There’s this saying about imitation and flattery but getting competition from a giant like Google is a little intimidating. If they just put two paid engineers on their project they already have more dedicated man power than the original libcurl project does…

How will it affect curl?

First off: this doesn’t seem to actually exist for real yet so it is still very early.

Ideally the team working on this from Google’s end finds and fixes issues in our code and API so curl improves. Ideally this move makes more users aware of libcurl and its API and we make it even easier for users and applications in the world to do safe and solid Internet transfers. If the engineers are magically good, they offer a library that can do things better than libcurl can, using the same API so application authors can just pick the library they find work the best. Let the best library win!

Unfortunately I think introducing half-baked implementations of the API will cause users grief since it will be hard for users to understand what API it is and how they differ.

Since I don’t think “libcrurl” will be able to offer a compatible API without a considerable effort, I think applications will need to be aware of which of the APIs they work with and then we have a “split world” to deal with for the foreseeable future and that will cause problems, documentation problems and users misunderstanding or just getting things wrong.

Their naming will possibly also be reason for confusion since “libcrurl” and “crurl” look so much like typos of the original names.

We are determined to keep libcurl the transfer library for the internet. We support the full API and we offer full backwards compatibility while working the same way on a vast amount of different platforms and architectures. Why use a copy when the original is free, proven and battle-tested since years?

Rights?

Just to put things in perspective: yes they’re perfectly allowed and permitted to do this. Both morally and legally. curl is free and open source and licensed under the MIT license.

Good luck!

I wish the team working on this the best of luck!

Updates after initial post

Discussions: the hacker news discussion, the reddit thread, the lobsters talk.

Rename? it seems the google library might change name to libcurl_on_cronet.

Update in April 2020:

According to an update to the bug entry dated February 28th 2020:

Remove libcurl_on_cronet and dependencies.

This project was never finished, and we have no current plans to
continue development.

Blabbed about curl at Google

I was invited by Robert Nyman to speak at Google in their Stockholm offices on August 26th, in his Google Tech Talk series.

curl – a hobby project with a billion users” was the humble title of the talk.

curl is like a Swiss army-knife for HTTP and internet transfers. For over 17 years the project has been run by volunteers and now counts perhaps more than one billion users. Daniel takes us through how it started, how it works and why it never gets done.

Already back in June all the 70 seats were taken and there were more than twice as many persons in the waiting list by the time the talk was happening! A totally mind-blowing interest I mostly credit Robert’s reach and ability to gather people.

Here’s the video of the talk:

To my great surprise and joy, I got this awesome gift from the host:

my LG watch urbane(It is an LG Watch Urbane)

A third day of HTTP Workshopping

I’ve met a bunch of new faces and friends here at the HTTP Workshop in Münster. Several who I’ve only seen or chatted with online before and some that I never interacted with until now. Pretty awesome really.

Out of the almost forty HTTP fanatics present at this workshop, five persons are from Google, four from Mozilla (including myself) and Akamai has three employees here. Those are the top-3 companies. There are a few others with 2 representatives but most people here are the only guys from their company. Yes they are all guys. We are all guys. The male dominance at this event is really extreme and we’ve discussed this sad circumstance during breaks and it hasn’t gone unnoticed.

This particular day started out grand with Eric Rescorla (of Mozilla) talking about HTTP Security in his marvelous high-speed style. Lots of talk about how how the HTTPS usage is right now on  the web, HTTPS trends, TLS 1.3 details and when it is coming and we got into a lot of talk about how HTTP deprecation and what can and cannot be done etc.

Next up was a presentation about HTTP Privacy and Anonymity by Mike Perry (from the Tor project) about lots of aspects of what the Tor guys consider regarding fingerprinting, correlation, network side-channels and similar things that can be used to attempt to track user or usage over the Tor network. We got into details about what recent protocols like HTTP/2 and QUIC “leak” or open up for fingerprinting and what (if anything) can or could be done to mitigate the effects.

Evolving HTTP Header Fields by Julian Reschke (of Green Bytes) then followed, discussing all the variations of header syntax that we have in HTTP and how it really is not possible to write a generic parser that can handle them, with a suggestion on how to unify this and introduce a common format for future new headers. Julian’s suggestion to use JSON for this ignited a discussion about header formats in general and what should or could be done for HTTP/3 and if keeping support for the old formats is necessary or not going forward. No real consensus was reached.

Willy Tarreau (from HAProxy) then took us into the world of HTTP Infrastructure scaling and Load balancing, and showed us on the microsecond level how fast a load balancer can be, how much extra work adding HTTPS can mean and then ending with a couple suggestions of what he thinks could’ve helped his scenario. That then turned into a general discussion and network architecture brainstorm on what can be done, how it could be improved and what TLS and other protocols could possibly be do to aid. Cramming out every possible gigabit out of load balancers certainly is a challange.

Talking about cramming bits, Kazuho Oku got to show the final slides when he showed how he’s managed to get his picohttpparser to parse HTTP/1 headers at a speed that is only slightly slower than strlen() – including a raw dump of the x86 assembler the code is turned into by a compiler. What could possibly be a better way to end a day full of protocol geekery?

Google graciously sponsored the team dinner in the evening at a Peruvian place in the town! Yet another fully packed day has ended.

I’ll top off today’s summary with a picture of the gift Mark Nottingham (who’s herding us through these days) was handing out today to make us stay keen and alert (Mark pointed out to me that this was a gift from one of our Japanese friends here):

kitkat

What a removed search from Google looks like

Back in the days when I participated in the starting of the Subversion project, I found the mailing list archive we had really dysfunctional and hard to use, so I set up a separate archive for the benefit of everyone who wanted an alternative way to find Subversion related posts.

This archive is still alive and it recently surpassed 370,000 archived emails, all related to Subversion, for seven different mailing lists.

Today I received a notice from Google (shown in its entirety below) that one of the mails received in 2009 is now apparently removed from a search using a name – if done within the European Union at least. It is hard to take this seriously when you look at the page in question, and as there aren’t that very many names involved in that page the possibilities of which name it is aren’t that many. As there are several different mail archives for Subversion mails I can only assume that the alternative search results also have been removed.

This is the first removal I’ve got for any of the sites and contents I host.


Notice of removal from Google Search

Hello,

Due to a request under data protection law in Europe, we are no longer able to show one or more pages from your site in our search results in response to some search queries for names or other personal identifiers. Only results on European versions of Google are affected. No action is required from you.

These pages have not been blocked entirely from our search results, and will continue to appear for queries other than those specified by individuals in the European data protection law requests we have honored. Unfortunately, due to individual privacy concerns, we are not able to disclose which queries have been affected.

Please note that in many cases, the affected queries do not relate to the name of any person mentioned prominently on the page. For example, in some cases, the name may appear only in a comment section.

If you believe Google should be aware of additional information regarding this content that might result in a reversal or other change to this removal action, you can use our form at https://www.google.com/webmasters/tools/eu-privacy-webmaster. Please note that we can’t guarantee responses to submissions to that form.

The following URLs have been affected by this action:

http://svn.haxx.se/users/archive-2009-08/0808.shtml

Regards,

The Google Team

HTTP/2 interop pains

At around 06:49 CEST on the morning of August 27 2014, Google deployed an HTTP/2 draft-14 implementation on their front-end servers that handle logins to Google accounts (and possibly others). Those at least take care of all the various login stuff you do with Google, G+, gmail, etc.

The little problem with that was just that their implementation of HTTP2 is in disagreement with all existing client implementations of that same protocol at that draft level. Someone immediately noticed this problem and filed a bug against Firefox.

The Firefox Nightly and beta versions have HTTP2 enabled by default and so users quickly started to notice this and a range of duplicate bug reports have been filed. And keeps being filed as more users run into this problem. As far as I know, Chrome does not have this enabled by default so much fewer Chrome users get this ugly surprise.

The Google implementation has a broken cookie handling (remnants from the draft-13 it looks like by how they do it). As I write this, we’re on the 7th day with this brokenness. We advice bleeding-edge users of Firefox to switch off HTTP/2 support in the mean time until Google wakes up and acts.

You can actually switch http2 support back on once you’ve logged in and it then continues to work fine. Below you can see what a lovely (wildly misleading) error message you get if you try http2 against Google right now with Firefox:

google-http2-draft14-cookies

This post is being debated on hacker news.

Updated: 20:14 CEST: There’s a fix coming, that supposedly will fix this problem on Thursday September 4th.

Update 2: In the morning of September 4th (my time), Google has reverted their servers to instead negotiate SPDY 3.1 and Firefox is fine with this.

groups.google.com hates greylisting

Dear Google,

Here’s a Wikipedia article for you: Greylisting.

After you’ve read that, then consider the error message I always get for my groups.google.com account when you disable mail sending to me due to “bouncing”:

Bounce status Your email address is currently flagged as bouncing. For additional information or to correct this, view your email status here [link].

Following that link I get to read the reason:

“Google tried to deliver your message, but it was rejected by the server for the recipient domain haxx.se by [mailserver]. The error that the other server returned was: 451 4.7.1 Greylisting in action, please come back later”

See, even the error message spells out what it is all about!

Thanks to this feature of Google groups, I cannot participate in any such lists/groups for as long as I keep my greylisting activated since it’ll keep disabling mail delivery to me.

Enabling greylisting decreased my spam flood to roughly a third of the previous volume (and now I’m at 500-1000 spam emails/day) so I’m not ready to disable it yet. I just have to not use google groups.

Update: I threw in the towel and I now whitelist google.com servers to get around this problem…