How Redis scratched an itch — and changed databases forever

25 May 2020, 10:00 by Matt Asay

Why would you ever write a new database? Particularly an in-memory database, which, back in 2009, made zero sense to the ruling database class of the time. Salvatore Sanfilippo didn’t really care. He wasn’t trying to change anyone’s mind about what a database should be. He just needed to scale a real-time analytics engine, and MySQL couldn’t do so cost-effectively.

So he built Redis, as one does, and changed the database market forever.

In a series of conversations with open source project founders like Sanfilippo over the past few weeks, I’ve been struck by how often they started with trying to answer a specific need — an “itch,” in the open source parlance — but ended up changing the way whole product categories work. They weren’t trying to be clever. They were just trying to be useful.

Which makes me wonder. If your organization is trying to figure out new ways to innovate, have you tried encouraging your employees to dive more deeply into open source?

Explore new things

Sometimes we get stuck in the comfortable routines of legacy technology. Sanfilippo said as much in a Twitter conversation, pointing out that it’s “possible to explore new things” in certain software fields even if everything looks “solved.” The relational database (RDBMS) market was somewhat stagnant for decades. Yes, we saw new functionality emerge, and even cool new open source databases (MySQL, PostgreSQL), but we kept shoving things into rows and columns because that was the “right way” to work with data.

Even though it wasn’t. Not always, at least. And maybe not even most of the time.

Yes, some data absolutely fits the relational model, particularly systems of record like an ERP system. But as the world has moved to systems of engagement, very little of that data is neatly structured; most of it is unstructured or semi-structured. The relational model requires that data be flattened into two-dimensional, tabular structures. A document database like MongoDB, by contrast, offers a rich, dynamic data model that better fits the structure of objects in modern programming languages.
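To make that contrast concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the order record, its field names, and the two helper functions are invented, and no real database or driver is involved. It shows the kind of nested object an application typically holds, and the splitting and rejoining needed to fit it into tabular rows.

```python
# Hypothetical order record, invented for illustration only.
# This nested shape is how an application (or a document store) would hold it.
order = {
    "order_id": 1001,
    "customer": {"name": "Ada", "city": "Turin"},
    "items": [
        {"sku": "A-1", "qty": 2},
        {"sku": "B-7", "qty": 1},
    ],
}


def flatten_order(doc):
    """Split one nested document into rows for two hypothetical tables."""
    # One row for an "orders" table: the nested customer is spread into columns.
    order_row = (doc["order_id"], doc["customer"]["name"], doc["customer"]["city"])
    # One row per item for an "order_items" table, keyed back to the order.
    item_rows = [(doc["order_id"], item["sku"], item["qty"]) for item in doc["items"]]
    return order_row, item_rows


def rebuild_order(order_row, item_rows):
    """Join the rows back into the nested shape the application works with."""
    order_id, name, city = order_row
    return {
        "order_id": order_id,
        "customer": {"name": name, "city": city},
        "items": [
            {"sku": sku, "qty": qty}
            for oid, sku, qty in item_rows
            if oid == order_id
        ],
    }


order_row, item_rows = flatten_order(order)
```

In the tabular version, every read and write passes through a translation step like flatten_order and rebuild_order. In the document version, the object is stored more or less as it appears in the program. Neither is wrong; the sketch only shows where the extra work sits.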

This isn’t to say that one database is better than another; each does its job well. Rather, it’s to suggest that we wasted a lot of years trying to squeeze unstructured data into a relational format. We figured the database was a “solved” problem, as Sanfilippo put it. That mistake cost us significant amounts of productivity.

Nor is this just a database thing.

From imitative to innovative

In the early days of open source, some of the better-known projects like Linux and MySQL tried to copy the functionality of their proprietary, expensive peers (like Unix and Oracle). Over time, these (and other) projects have trended toward innovation rather than imitation. At the same time, there were always projects, like Redis, that broke new ground or trod old ground in new ways that dramatically expanded the universe of users.

And often they started with one person’s “itch.”

For example, Daniel Stenberg just needed to be able to download and transfer currency rates for fellow IRC users, but there wasn’t a good way to do that. So he built Curl, which now boasts billions of users. In fact, you probably use Curl every day without knowing it.

When Simon Willison worked at The Guardian, he longed for a better way to publish static data sets to the web, so that others could query and engage with data in novel ways. He also wanted to “blow off steam,” as he said, so open sourcing Datasette made sense. In the process he embraced the public domain-licensed SQLite and started publishing data that will run virtually anywhere (on an Apple Watch, for example). Willison is now using Datasette to experiment with querying his own data (via his Dogsheep project) or, as he calls it, personal analytics.

Then there’s Dries Buytaert, who built a “little website” to help friends share an ADSL connection. That website morphed into an open source CMS (content management system) called Drupal that now attracts close to 10,000 contributors each year. Drupal wasn’t the first CMS, and it’s not the last. But its flexibility to become whatever its thousands of contributors need it to become allows it to innovate to fit a broad array of needs.

Or how about Jens Axboe, a longtime contributor to the Linux kernel and maintainer of the Linux I/O stack? Axboe kept getting bug reports from people running weird workloads that weren’t easy for him to reproduce. So he came up with a tool, Fio, the Flexible I/O tester, that roughly mimics what their workloads are doing. Fifteen years later, Fio is the industry standard for modeling storage workloads.

Watch the open sourcerors

Want to see the future of technology? It’s playing out in plain sight on GitHub, GitLab, and other open source code repositories. Tim O’Reilly used to suggest we “watch the alpha geeks” today to see where technology is going tomorrow. Those alpha geeks are almost always kicking around their new ideas — their Redis, Fio, Curl, Drupal, or Datasette — somewhere in the open source ether.

Maybe they work for your company. Maybe they’re tired of the bureaucracy and so create an open source project to “blow off steam,” in Willison’s words, and get the pleasure of pure productivity, absent anyone from Legal telling them “no.” If you want to help keep that future firmly planted within your company, let your developers engage deeply in open source. Encourage them to do so. That way the future of technology won’t be happening somewhere else. It will be happening within your firewall. Or on GitHub, more likely.
