Tag Archives: github

20,000 github stars

In September 2018 I celebrated 10,000 stars, up from 5,000 back in May 2017. We made 1,000 stars on August 12, 2014.

Today I’m cheering for the 20,000 stars curl has received on GitHub.

It is worth repeating that this is just a number without any particular meaning or importance. It just means 20,000 GitHub users clicked the star symbol for the curl project over at curl/curl.

At exactly 08:15:23 UTC today we reached this milestone. Checked with a curl command line like this:

$ curl -s https://api.github.com/repos/curl/curl | jq '.stargazers_count'
20000

(By the time I get around to finalize this post, the count has already gone up to 20087…)

To celebrate this occasion, I decided I was worth a beer and this time I went with a hand-written note. The beer was a Swedish hazy IPA called Amazing Haze from the brewery Stigbergets. One of my current favorites.

Photos from previous GitHub-star celebrations :

Github steel

I honestly don’t know what particular thing I did to get this, but GitHub gave me a 3D-printed steel version of my 2020 GitHub contribution “matrix”. You know that thing on your GitHub profile that normally looks something like this:

The gift package included this friendly note:

Hi @bagder,

As we welcome 2021, we want to thank and congratulate you on what you brought to 2020. Amidst the year’s challenges, you found time to continue giving back and contributing to the community.

Your hard work, care, and attention haven’t gone unnoticed.

Enclosed is your 2020 GitHub contribution graph, 3D printed in steel. You can also view it by pointing your browser to https://github.co/skyline. It tells a personal story only you can truly interpret.

Please accept this small gift as a token of appreciation on behalf of all of us here at GitHub, and everyone who benefits from your work.

Thank you and all the best for the year ahead!

With <3, from GitHub

I think I’ll put it under one of my screens here on my desk for now. The size is 145 mm x 30 mm x 30 mm. 438 grams.

Thanks GitHub!

Update: the print is done by shapeways.com

What if GitHub is the devil?

Some critics think the curl project shouldn’t use GitHub. The reasons for being against GitHub hosting tend to be one or more of:

  1. it is an evil proprietary platform
  2. it is run by Microsoft and they are evil
  3. GitHub is American thus evil

Some have insisted on craziness like “we let GitHub hold our source code hostage”.

Why GitHub?

The curl project switched to GitHub (from Sourceforge) almost eleven years ago and it’s been a smooth ride ever since.

We’re on GitHub not only because it provides a myriad of practical features and is a stable and snappy service for hosting and managing source code. GitHub is also a developer hub for millions of developers who already have accounts and are familiar with the GitHub style of developing, the terms and the tools. By being on GitHub, we reduce friction from the contribution process and we maximize the ability for others to join in and help. We lower the bar. This is good for us.

I like GitHub.

Self-hosting is not better

Providing even close to the same uptime and snappy response times with a self-hosted service is a challenge, and it would take someone time and energy to volunteer that work – time and energy we now instead can spend of developing the project instead. As a small independent open source project, we don’t have any “infrastructure department” that would do it for us. And trust me: we already have enough infrastructure management to deal with without having to add to that pile.

… and by running our own hosted version, we would lose the “network effect” and convenience for people that already are on and know the platform. We would also lose the easy integration with cool services like the many different CI and code analyzer jobs we run.

Proprietary is not the issue

While git is open source, GitHub is a proprietary system. But the thing is that even if we would go with a competitor and get our code hosting done elsewhere, our code would still be stored on a machine somewhere in a remote server park we cannot physically access – ever. It doesn’t matter if that hosting company uses open source or proprietary code. If they decide to switch off the servers one day, or even just selectively block our project, there’s nothing we can do to get our stuff back out from there.

We have to work so that we minimize the risk for it and the effects from it if it still happens.

A proprietary software platform holds our code just as much hostage as any free or open source software platform would, simply by the fact that we let someone else host it. They run the servers our code is stored on.

If GitHub takes the ball and goes home

No matter which service we use, there’s always a risk that they will turn off the light one day and not come back – or just change the rules or licensing terms that would prevent us from staying there. We cannot avoid that risk. But we can make sure that we’re smart about it, have a contingency plan or at least an idea of what to do when that day comes.

If GitHub shuts down immediately and we get zero warning to rescue anything at all from them, what would be the result for the curl project?

Code. We would still have the entire git repository with all code, all source history and all existing branches up until that point. We’re hundreds of developers who pull that repository frequently, and many automatically, so there’s a very distributed backup all over the world.

CI. Most of our CI setup is done with yaml config files in the source repo. If we transition to another hosting platform, we could reuse them.

Issues. Bug reports and pull requests are stored on GitHub and a sudden exit would definitely make us lose some of them. We do daily “extractions” of all issues and pull-requests so a lot of meta-data could still be saved and preserved. I don’t think this would be a terribly hard blow either: we move long-standing bugs and ideas over to documents in the repository, so the currently open ones are likely possible to get resubmitted again within the nearest future.

There’s no doubt that it would be a significant speed bump for the project, but it would not be worse than that. We could bounce back on a new platform and development would go on within days.

Low risk

It’s a rare thing, that a service just suddenly with no warning and no heads up would just go black and leave projects completely stranded. In most cases, we get alerts, notifications and get a chance to transition cleanly and orderly.

There are alternatives

Sure there are alternatives. Both pure GitHub alternatives that look similar and provide similar services, and projects that would allow us to run similar things ourselves and host locally. There are many options.

I’m not looking for alternatives. I’m not planning to switch hosting anytime soon! As mentioned above, I think GitHub is a net positive for the curl project.

Nothing lasts forever

We’ve switched services several times before and I’m expecting that we will change again in the future, for all sorts of hosting and related project offerings that we provide to the work and to the developers and participators within the project. Nothing lasts forever.

When a service we use goes down or just turns sour, we will figure out the best possible replacement and take the jump. Then we patch up all the cracks the jump may have caused and continue the race into the future. Onward and upward. The way we know and the way we’ve done for over twenty years already.

Credits

Image by Elias Sch. from Pixabay

Updates

After this blog post went live, some users remarked than I’m “disingenuous” in the list of reasons at the top, that people have presented to me. This, because I don’t mention the moral issues staying on GitHub present – like for example previously reported workplace conflicts and their association with hideous American immigration authorities.

This is rather the opposite of disingenuous. This is the truth. Not a single person have ever asked me to leave GitHub for those reasons. Not me personally, and nobody has asked it out to the wider project either.

These are good reasons to discuss and consider if a service should be used. Have there been violations of “decency” significant enough that should make us leave? Have we crossed that line in the sand? I’m leaning to “no” now, but I’m always listening to what curl users and developers say. Where do you think the line is drawn?

AI-powered code submissions

Who knows, maybe May 18 2020 will mark some sort of historic change when we look back on this day in the future.

On this day, the curl project received the first “AI-powered” submitted issues and pull-requests. They were submitted by MonocleAI, which is described as:

MonocleAI, an AI bug detection and fixing platform where we use AI & ML techniques to learn from previous vulnerabilities to discover and fix future software defects before they cause software failures.

I’m sure these are still early days and we can’t expect this to be perfected yet, but I would still claim that from the submissions we’ve seen so far that this is useful stuff! After I tweeted about this “event”, several people expressed interest in how well the service performs, so let me elaborate on what we’ve learned already in this early phase. I hope I can back in the future with updates.

Disclaimers: I’ve been invited to try this service out as an early (beta?) user. No one is saying that this is complete or that it replaces humans. I have no affiliation with the makers of this service other than as a receiver of their submissions to the project I manage. Also: since this service is run by others, I can’t actually tell how much machine vs humans this actually is or how much human “assistance” the AI required to perform these actions.

I’m looking forward to see if we get more contributions from this AI other than this first batch that we already dealt with, and if so, will the AI get better over time? Will it look at how we adjusted its suggested changes? We know humans adapt like that.

Pull-request quality

Monocle still needs to work on adapting its produced code to follow the existing code style when it submits a PR, as a human would. For example, in curl we always write the assignment that initializes a variable to something at declaration time immediately on the same line as the declaration. Like this:

int name = 0;

… while Monocle, when fixing cases where it thinks there was an assignment missing, adds it in a line below, like this:

int name;
name = 0;

I can only presume that in some projects that will be the preferred style. In curl it is not.

White space

Other things that maybe shouldn’t be that hard for an AI to adapt to, as you’d imagine an AI should be able to figure out, is other code style issues such as where to use white space and where not no. For example, in the curl project we write pointers like char * or void *. That is with the type, a space and then an asterisk. Our code style script will yell if you do this wrong. Monocle did it wrong and used it without space: void*.

C89

We use and stick to the most conservative ANSI C version in curl. C89/C90 (and we have CI jobs failing if we deviate from this). In this version of C you cannot mix variable declarations and code. Yet Monocle did this in one of its PRs. It figured out an assignment was missing and added the assignment in a new line immediately below, which of course is wrong if there are more variables declared below!

int missing;
missing = 0; /* this is not C89 friendly */
int fine = 0;

NULL

We use the symbol NULL in curl when we zero a pointer . Monocle for some reason decided it should use (void*)0 instead. Also seems like something virtually no human would do, and especially not after having taken a look at our code…

The first issues

MonocleAI found a few issues in curl without filing PRs for them, and they were basically all of the same kind of inconsistency.

It found function calls for which the return code wasn’t checked, while it was checked in some other places. With the obvious and rightful thinking that if it was worth checking at one place it should be worth checking at other places too.

Those kind of “suspicious” code are also likely much harder fix automatically as it will include decisions on what the correct action should actually be when checks are added, or perhaps the checks aren’t necessary…

Credits

Image by Couleur from Pixabay

curl: 3K forks

It’s just another meaningless number, but today there are 3,000 forks done of the curl GitHub repository.

This pops up just a little over three years since we reached our first 1,000 forks. Also, 10,000 stars no too long ago.

Why fork?

A typical reason why people fork a project on GitHub, is so that they can make a change in their own copy of the source code and then suggest that change to the project in the form of a pull-request.

The curl project has almost 700 individual commit authors, which makes at least 2,300 forks done who still haven’t had their pull-requests accepted! Of course those are 700 contributors who actually managed to work all the way through to inclusion. We can imagine that there is a huge number of people who only ever thought about doing a change, some who only ever just started to do it, many who ditched the idea before it was completed, some who didn’t actually manage to implement it properly, some who got their idea and suggestion shut down by the project and of course, lots of people still have their half-finished change sitting there waiting for inspiration.

Then there are people who just never had the intention of sending any change back. Maybe they just wanted to tinker with the code and have fun. Some want to do private changes they don’t want to offer or perhaps they already know the upstream project won’t accept.

We just can’t tell.

Many?

Is 3,000 forks a lot or a little? Both. It is certainly more forks than we’ve ever had before in this project. But compared to some of the most popular projects on GitHub, even comparing to some other C projects (on GitHub the most popular projects are never written in C) our numbers are dwarfed by the really popular ones. You can probably guess which ones they are.

In the end, this number is next to totally meaningless as it doesn’t say anything about the project nor about what contributions we get or will get in the future. It tells us we have (or had) the attention of a lot of users and that’s about it.

I will continue to try to make sure we’re worth the attention, both now and going forward!

(Picture from pixabay.)

10,000 stars

On github, you can ‘star’ a project. It’s a fairly meaningless way to mark your appreciation of a project hosted on that site and of course, the number doesn’t really mean anything and it certainly doesn’t reflect how popular or widely used or unused that particular software project is. But here I am, highlighting the fact that today I snapped the screenshot shown above when the curl project just reached this milestone: 10,000 stars.

In the great scheme of things, the most popular and starred projects on github of course have magnitudes more stars. Right now, curl ranks as roughly the 885th most starred project on github. According to github themselves, they host an amazing 25 million public repositories which thus puts curl in the top 0.004% star-wise.

There was appropriate celebration going on in the Stenberg casa tonight and here’s a photo to prove it:

I took a photo when we celebrated 1,000 stars. It doesn’t feel so long ago but was a little over 1500 days ago.

August 12 2014

Onwards and upwards!

curl: 5000 stars

The curl project has been hosted on github since March 2010, and we just now surpassed 5000 stars. A star on github of course being a completely meaningless measurement for anything. But it does show that at least 5000 individuals have visited the page and shown appreciation this low impact way.

5000 is not a lot compared to the really popular projects.

On August 12, 2014 I celebrated us passing 1000 stars with a beer. Hopefully it won’t be another seven years until I can get my 10000 stars beer!

A thousand curl forks

a fork

The curl repository on github has now been forked 1,000 times. Or actually, there are 1,000 forks kept alive as the counter is actually decreased when people remove their forks again. curl has had its primary git repository on github since March 22, 2010. A little more than two days between every newly created fork.

1000-forks

If you’re not used to the github model: a fork is typically made to get yourself your own copy of someone’s source tree so that you can make changes to that and publish your own version of the source tree without having to get the changes you’ve done merged into the original repository that you forked off from. But it is also the most common way to offer changes back to github based projects:  send a request that a particular change in your version of the source tree should get merged into the mother project. A so called Pull Request.

Trivia: The term “fork” when meaning “to divide in branches, go separate ways” has been used in the English language since the 14th century.

I’m only aware of one actual separate line of development that is a true fork of libcurl that I believe is still being maintained: libgnurl.

curl: embracing github more

Pull requests and issues filed on github are most welcome!

The curl project has been around for a long time by now and we’ve been through several different version control systems. The most recent switch was when we switched to git from CVS back in 2010. We were late switchers but then we’re conservative in several regards.

When we switched to git we also switched to github for the hosting, after having been self-hosted for many years before that. By using github we got a lot of services, goodies and reliable hosting at no cost. We’ve been enjoying that ever since.

cURLHowever, as we have been a traditional mailing list driving project for a long time, I have previously not properly embraced and appreciated pull requests and issues filed at github since they don’t really follow the old model very good.

Just very recently I decided to stop fighting those methods and instead go with them. A quick poll among my fellow team mates showed no strong opposition and we are now instead going full force ahead in a more github embracing style. I hope that this will lower the barrier and remove friction for newcomers and allow more people to contribute easier.

As an effect of this, I would also like to encourage each and everyone who is interested in this project as a user of libcurl or as a contributor to and hacker of libcurl, to skip over to the curl github home and press the ‘watch’ button to get notified when future requests and issues appear.

We also offer this helpful guide on how to contribute to the curl project!

Don’t email me

Why I insist on people to keep issues on the mailing list(s)

A recent twitter discussion I had with Andrei Neculau contributed to his blog post on this subject, basically arguing that I’m wrong but with many words and explanations.

It triggered me to write up my primary reasons for why I strongly object to handle open source issues, questions and patches privately (for free) in open source projects that I have a leading role in.

1. I spend a considerable amount of my spare time on open source projects. I devote some 15-20 unpaid hours a week for those communities. By emailing me and insisting on a PRIVATE conversation you’re suddenly yanking the mutex flag and you’re now requesting that I spend parts of this time on YOU ALONE and not the rest of the community. That’s selfish.

2. By insisting on a private conversation you FORCE me to repeat myself since ideas and questions are rarely unique or done for the first time. So you have a problem or a question that’s very similar to one I just responded to. And the next person will ask the same one tomorrow. By insisting on doing them in public already in the first email, already the second person can read it without me having to write it twice. And the third person who didn’t even realize he was interested in that topic will find out and read it as well (either now when the mail gets sent out or even years later when that user find the archived mailing list on the web). Private emails deny that ability. That’s selfish.

3. By emailing me privately and asking questions and help, you assume that I am the single best person to ask this question at this given time. What if I happen to be on vacation, be under a rough period at work or just not know the particular area of the project very good. I may be the leader or a public person of a project, but I may still not know much about feature X for operating system Z about which you ask. Ask on the list at once and you’ll reach the correct person. That’s more efficient.

4. By emailing me privately, you indirectly put a load on me to reply – or to get off as a rude person. Yes you’re friendly and you ask me nicely and yet even after you remind me after a few days I STILL DON’T RESPOND. Even if I just worked five 16-hour work days and you asked questions I don’t know the answer to… That’s inefficient and rude.

5. Yes, you can say that subscribing to an email list can be daunting and flood you with hundreds or thousands of emails per month – that’s completely true. But if you only wanted to send that single question or submit the single issue, then you can unsubscribe again quite soon and escape most of that load. Then YOU do the work instead of demanding someone else to do it for you. When you want to handle a SINGLE issue, it is much better load balancing if you do the extra work and the people who do tens or HUNDREDS of issues per month in the project do less work per issue.

6. You’re suggesting that I could forward the private question to the mailing list? Yes I can, but then I need to first ask for permission to do so (or be a jerk) and if the person who sent me the mail is going to send me another mail anyway, (s)he can just as well spend that time to send the first mail to the list instead of say YES to me and then make me do his or hers work. It’s just more efficient. Also, forwarded questions tend to end up so that replies and follow-up questions don’t find their way back to the original poster and that’s bad.

7. I propose and use different lists for different purposes to ease the problem with too many (uninteresting) emails.