The open source decade, fueled by cloud and GitHub
Commentary: The last decade has been open source’s most productive by far. Find out why Matt Asay considers it a Cambrian explosion of choice and innovation.
If the 2000s were the years when open source battled for survival with old world hegemonies, the 2010s were the decade when open source “won” and began to drive nearly every modern technological innovation. From cloud to mobile to big data to data science, open source has been at the heart of these and other mega trends since 2010 and, as such, has encouraged contributions from even its most stalwart foes.
On that note, let’s look at the most important open source stories of the last decade, starting with the place where much (though not all) open source lives: GitHub.
In the beginning was the pull request
“GitHub changed everything…Nothing else [comes] close [in importance],” declared Red Hat’s Andrew Shafer. Git, of course, has been with us since 2005, but GitHub, founded in 2008, made Git usable by the masses. Git wasn’t the first version control system, and GitHub was not the first place open source code was kept (remember SourceForge, Google Code, etc.?), but GitHub steamrolled them all.
The secret of Git(Hub)? People.
As Cloud CMS founder Michael Uzquiano has stressed, “[T]he facility of pull requests via systems like GitHub…really delivered on the promise of code being open.” Buried in Uzquiano’s comment is the importance of the person on the other end of that pull request. Hazelcast’s David Brimley takes this further, arguing that “fully integrated tooling like wikis, actions, CI/GitLab” enabled distributed open source teams to grow. In other words, version control, as important as it was, lacked the social aspect that GitHub offered. Open source became open collaboration, and that made all the difference.
It’s therefore not surprising that the developer world held its breath when Microsoft announced in mid-2018 that it would acquire GitHub for $7.5 billion. In 2008, such a deal would have been unthinkable. Microsoft, for example, still hadn’t donned its hair shirt for years of calling Linux a “cancer” and open source “un-American.” In late 2009 I wrote on sister site CNET, “[Steve] Ballmer needs to learn to speak to developers or risks ruining the house that [Bill] Gates built.” Microsoft looked likely to spend the next 10 years much like the last: Fighting the open source risk.
Instead, it changed. Almost completely.
From open source zero to open source hero, Microsoft has become the world’s largest open source contributor (measured in terms of employees actively contributing to open source projects on GitHub). Partly this came down to a change in CEO, with Satya Nadella more developer-friendly than his predecessor, but much of it was simple self-interest: Microsoft was a developer-oriented platform company. If it wanted to remain a “going concern,” it needed to be concerned with what developers wanted.
And they wanted open source. Oh, and cloud.
Raging against the machine
Cloud undergirds pretty much every open source trend of the past 10 years. (Disclosure: I have worked for AWS since August 2019.) Without cloud, there would be no GitHub, no modern CI/CD toolchains that have done so much to foster open source development, no dramatic rise in containers, etc. Just as open source gave developers an easy path to exceptional software without detouring through Purchasing or Legal, so too did cloud enable developers to spin up the hardware necessary to run open source software for relatively little, without waiting for IT to provision servers.
Cloud, in short, completes open source in ways that Tim O’Reilly anticipated back in 2008. It has enabled the Cambrian explosion of innovation in open source over the decade.
Indeed, it was the cloud that really fueled the accelerated rise of open source, even as open source gave rise to cloud. Yet one of the biggest stories of the decade was the sometimes uneasy alliance between cloud and open source. As I wrote in 2018, commercial open source vendors sought to block cloud vendors from distributing their open source code, experimenting with a number of license changes, even as they told their investors, “We haven’t seen [cloud competition] really affect any of our metrics, when it comes to downloads, community adoption, or…our sales numbers.” As we leave the decade, there are faint signs of a thaw.
Against this backdrop of cloud as the infrastructure enabler and GitHub as the locus for development, so many cool things have happened with open source since 2010.
A Cambrian explosion of open source joy
As important as back-end infrastructure development was (Docker, for example, revolutionized application development through containers, yet the company ultimately failed to profit from it), front-end development for mobile and web exploded as well. Within the enterprise set, we may like to fixate on Kubernetes and containers, but open source front-end development technologies like Angular and React touch far more developers, as AWS’ Ian Massingham has pointed out:
Kubernetes: 60.2K stars (43.6K repos on search term)
Vue: 152K stars (324K repos)
React: 140K stars (1M+ repos)
Node.js: 65.8K stars (746K repos)
Angular: 54.3K stars (672K repos)
The same is true of the exploding data infrastructure world. Apache Hadoop was all the rage and then gave way to Apache Spark, which gave way to…the list goes on. Indeed, the pace of innovation within data science has been so pronounced that it has become almost pointless to learn how to pronounce the names of new open source data infrastructure projects as they have their 15 minutes of fame. RedMonk analyst James Governor argued that we were entering the polyglot era of software development, and the decade confirmed that view at every turn.
Rounding out the polyglot era
That was especially true of databases. While the world spent decades storing data in (mostly) relational databases (RDBMS), developed by a few enterprise IT vendors, in late 2009 the launch of MongoDB sparked significant changes in how developers viewed their database options. Instead of relying on the RDBMS to manage increasingly “big data,” with its unprecedented variety, volume, and velocity, developers embraced an array of so-called (and almost entirely open source) NoSQL databases, including document databases, key-value stores, graph databases, time series databases, and more.
Even as developers exulted in this smorgasbord of choice, the relational database PostgreSQL started its own resurgence. PostgreSQL never attained quite the status of its open source sibling, MySQL, yet over the decade PostgreSQL grew to become the fourth-most popular database, according to DB-Engines. PostgreSQL became hot in the past decade, yet remains the unsung hero of data.
Which is a good place to end. Most of the decade’s hottest open source technologies, and the stories that accompanied them, were all about change. PostgreSQL, by contrast, demonstrates one of the other wonderful things about open source: How projects can evolve to meet new use cases. Linux has demonstrated this with operating systems, and PostgreSQL is doing the same in databases. From 2010 until 2020 the explosion of new open source choices has been mind-boggling, yet the persistence of PostgreSQL is comforting, reminding us that open source can be whatever we need it to be.