Jul 14 2016

Is Datomic the Sequel to SQL?

Well, no, but it will change the way you think about data...

After a career spent working almost entirely with SQL-based relational databases, in the past year or so I've gotten the chance to use Datomic. It's very different, and I have come to realize that Datomic possesses unusual powers. Greatest of all of these powers is that Datomic remembers everything.

WHAT DATOMIC DOES

Rich Hickey, who went on to create Datomic, famously asserted that when we modify a value in place (as is commonly done with a SQL UPDATE statement), we "have gotten time wrong". If you haven't seen it already, take a small detour and watch Hickey's conference presentation, "Are We There Yet?".

If you apply this affinity for immutable data to a bona fide persistence mechanism, Datomic is what you end up with. Datomic never updates data in place; rather, it deals only in "Facts". A Fact describes a characteristic of an entity at a particular time, and as such, it never changes. The entity may adopt a new characteristic later on, in which case a new Fact supersedes the old Fact for subsequent queries interested in the database's current values. But the original, now-obviated Fact remains recorded forever and, importantly, remains available for interested queries to discover.

Datomic provides a variety of ways to access historical Facts. The simplest way is mind-blowingly intuitive: "as-of". Given a transaction or a moment in time, as-of will return a snapshot of the database consisting of all the Facts known at that time. You don't need to try to identify a suitable backup, or worry about whether some values could have changed between available backups. Because Datomic knows the order of all transactions, it can guarantee that queries against the snapshot returned by as-of will retrieve exactly the same results as if they were run at that specific moment.
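To make the idea concrete, here is a minimal sketch of an append-only Fact log with an as-of lookup. It is a toy model written in plain Python, not Datomic's API: the names Fact, FactLog, transact, as_of and history are all hypothetical, and a real Datomic database does far more (indexing, retractions, schema, and so on).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Fact:
    """One immutable assertion: an entity's attribute had this value as of transaction tx."""
    entity: str
    attribute: str
    value: Any
    tx: int


class FactLog:
    """Toy append-only store (hypothetical, for illustration only; not Datomic's API)."""

    def __init__(self) -> None:
        self._facts: List[Fact] = []
        self._next_tx = 1

    def transact(self, entity: str, attribute: str, value: Any) -> int:
        """Append a new Fact and return its transaction number. Nothing is overwritten."""
        tx = self._next_tx
        self._next_tx += 1
        self._facts.append(Fact(entity, attribute, value, tx))
        return tx

    def as_of(self, tx: int) -> Dict[Tuple[str, str], Any]:
        """Return the values as they stood once transaction tx had been applied."""
        snapshot: Dict[Tuple[str, str], Any] = {}
        for fact in self._facts:  # facts are kept in transaction order
            if fact.tx <= tx:
                # A later Fact supersedes an earlier one in the snapshot only.
                snapshot[(fact.entity, fact.attribute)] = fact.value
        return snapshot

    def history(self, entity: str, attribute: str) -> List[Fact]:
        """Return every Fact ever recorded for this entity and attribute."""
        return [f for f in self._facts
                if f.entity == entity and f.attribute == attribute]


log = FactLog()
tx1 = log.transact("user-1", "email", "old@example.com")
tx2 = log.transact("user-1", "email", "new@example.com")

print(log.as_of(tx2)[("user-1", "email")])   # new@example.com
print(log.as_of(tx1)[("user-1", "email")])   # old@example.com
print(len(log.history("user-1", "email")))   # 2
```

The second transaction changes what a "current" query sees, but the first Fact is still in the log, so a snapshot taken as of tx1 gives the same answer it would have given at the time.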

HOW THIS CAN HELP YOU

One of the best reasons to keep all historical Facts around forever is: "You never know what you're going to need, so if you can, you might as well keep everything". If you've already gone to the extent of running your application on Datomic, you no longer need to decide which pieces of information are worth remembering. If it's recorded at all, it's recorded forever.

This eidetic memory comes in handy in various ways. Sometimes it can even allow you to improve and/or simplify your application's data model, since many forms of historical tracking that you would normally need to design are suddenly just baked into your persistence engine.

However, the most common use case occurs when you rely on Datomic's perfect memory to help you troubleshoot mind-boggling problems. If you are trying to understand how something mystifying could have happened, as long as you know when it happened, you can essentially hit the "Rewind" button and the "Pause" button on your entire database to look at the state of everything at that time. Often, doing this can light the way to a fix much more quickly than you would be able to accomplish simply by looking at code and imagining the effect it might have, given the right combination of data and circumstances.

Looking at a trustworthy snapshot of database state gives your mind another angle from which to approach a difficult problem. I liken it to the way that an interactive debugger with breakpoints can help you diagnose small-scale issues in functions. You can see the whole situation instead of imagining it, and this allows you to simply notice a small detail that's out of place, rather than having to postulate that such a detail could exist.

HOW THIS SHOULD CHANGE YOUR APPROACH

During the course of designing an application, questions often arise about where to store various pieces of information. For example, configuration settings are certainly a type of data, yet in many cases, even in database-centric applications, the configuration is stored in a filesystem rather than in the database. Back when I tended to use Rails and Postgres, I would have considered it strange to save the application settings to Postgres.

If you're using Datomic, you should be aggressive about saving things in your database, because the reward for doing so is so much higher. To stay with the example of application configuration: imagine you found yourself wondering whether a certain configuration setting could have caused a problem in a previous deploy. Rather than recreating the setting you think might have caused the problem, and then trying to reproduce the troublesome event, you could simply use Datomic to go back in time and look at the actual configuration that was in effect when the problem happened.

If you start to buy into this approach, you should consider Datomic the "master copy" of any Facts that were relevant at a particular point. In other words, write code that depends on the values in Datomic, and then tackle the separate problem of how to get those values into and out of Datomic. Other forms of data representation should be employed mainly as a way to enhance interfaces between Datomic and other components of the system. Define (and document) processes that move data between Datomic and these external data stores.

To continue the "application configuration" example, you can still put your configuration in a file so that it is easy to read and easy to change. It's just that, at some point during the bootstrap process, the application should read the file and convert it to Datomic Facts. To aid your future troubleshooting detective work, you should ensure that when any other part of the application needs to know about the configuration, it gets that information from the Datomic Facts rather than from another source.
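As a rough sketch of that bootstrap step, the toy code below reads the contents of a configuration file (assumed here to be JSON), records each changed setting as a new fact, and lets the rest of the application read settings as of any load. Everything in it is hypothetical and stands in for Datomic rather than using it: ConfigStore, load and as_of are made-up names, and settings removed from the file are not retracted in this simplified version.

```python
import json
from typing import Any, Dict, List, Tuple

# Each recorded fact is a (transaction number, setting name, value) triple.
ConfigFact = Tuple[int, str, Any]


class ConfigStore:
    """Toy stand-in for a fact database holding configuration (not Datomic's API)."""

    def __init__(self) -> None:
        self._facts: List[ConfigFact] = []
        self._tx = 0

    def as_of(self, tx: int) -> Dict[str, Any]:
        """Return the settings that were in effect once load number tx had run."""
        settings: Dict[str, Any] = {}
        for fact_tx, key, value in self._facts:  # stored in transaction order
            if fact_tx <= tx:
                settings[key] = value
        return settings

    def load(self, config_text: str) -> int:
        """Record the settings from a config file's contents; return the new tx number."""
        new_settings = json.loads(config_text)
        current = self.as_of(self._tx)
        self._tx += 1
        for key, value in new_settings.items():
            # Only append a fact when the setting is new or its value changed.
            if key not in current or current[key] != value:
                self._facts.append((self._tx, key, value))
        return self._tx


store = ConfigStore()
deploy_1 = store.load('{"timeout_seconds": 30, "cache": true}')
deploy_2 = store.load('{"timeout_seconds": 5, "cache": true}')

print(store.as_of(deploy_2)["timeout_seconds"])  # 5
print(store.as_of(deploy_1)["timeout_seconds"])  # 30
```

The file stays the convenient place to edit settings, but the application only ever reads them back through as_of, so the configuration that was live during an earlier deploy can be inspected later without guessing.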

WHY YOU MIGHT NOT BE ABLE TO USE DATOMIC

While it's very exciting to explore, I should warn you that Datomic isn't going to work for everyone. If, as an agency or individual who is accustomed to working with relational databases, you're looking into Datomic, here are some things to keep in mind:

  • Training time: It will take quite a while to become fluent with Datomic. For example, there is very little resemblance between Datalog (the query language used by Datomic) and SQL. If you or your team are like me, you've been using SQL for so long that you "think in SQL"... it's hard to exaggerate how long it takes to start "thinking in Datalog".

  • Hosting: Since SQL databases are so widely used, hosting of SQL databases has evolved to the point where many applications can be reasonably deployed without the involvement of anyone who is expert in the chosen database. For example, you can get pretty far by running a Rails application on Heroku, using Heroku's hosted Postgres service, and never really learning a lot about Postgres. The same is not true of Datomic; you are just going to have to do a lot more configuration and maintenance yourself.

  • Price: If you use Datomic for a large scale application, you will have to pay a significant amount of money for the privilege.

  • Java: Unless your application code is running on a Java Virtual Machine, you're going to be using something significantly less powerful than the "full Datomic experience".

Nevertheless, Datomic is fascinating and has such potential that I heartily recommend looking for ways to overcome these hurdles! At the very least, learning to use Datomic will expand your perspective about what a database can do.

Matthew Forsyth
