Treating Shared Databases Like APIs in a DevOps World

Source – infoq.com

Simon Sabin, Principle Consultant at Sabin.io, spoke at WinOps 2017 conference on how to include database changes in a continuous deployment model. A key aspect when sharing databases across multiple services or applications is to treat them as APIs, from the perspective of the database owners.

Sabin suggested that mechanisms such as views, triggers and stored procedures can be used to change the database internal structure while keeping backwards compatibility with the applications data operations. He gave the example of migrating a credit card numbers table from plain text to encrypted, where the applications might still retrieve the numbers in plain text by querying a view while the actual table is now encrypted and decrypted via a trigger, as illustrated below:

The database respects an implicit contract/interface with the applications whereby internal changes (often required for performance, security and other improvements) should not impact the applications. Following the API analogy, breaking changes should be minimized and communicated with plenty of notice for applications to adapt and deploy forward compatible updates before the actual database changes. Such an approach is malleable to applying consumer-driven contracts for databases.

According to Sabin, this approach is needed to intermediate two distinct views when dealing with shared databases: application developers tend to see them as “dumb storage” while database administrators (DBAs) see them as repositories of critical business data. In an ideal world each application team would own their database and the team would include a DBA role, but this is often hard to apply.

When applications themselves require schema or other changes, Sabin advises reducing batch size (because achieving reliability and consistency in databases is a complex task) and applying one small change at a time, together with the corresponding application code change. The migration mechanism of choice (gold schema vs migration scripts) will depend on the type of change. For Sabin, the mechanisms are complementary more than opposed. For schema changes a gold schema approach where a target schema (desired state) is declared and a tool such as SQLCompare applies the “diff” between current and desired state is adequate. More complex modifications might be better suited for migration scripts such as DBDeploy, Flyway or ReadyRoll.

Other recommendations from Sabin included designing the database to fit your tooling – not the other way around, and adding deployment information in the database itself so it’s easy to trace and diagnose issues. Treating database changes as part of the delivery pipeline also promotes auditability and compliance. Finally, Sabin suggested data security (at a time when major data breaches continue to take place) could be largely improved with a “private” vs “public” approach, for example having small accessible (“public”) databases that simply hold views to larger, secure databases (“private”) not directly reachable by the applications themselves.

Similarly to how the DevSecOps movement took some time to become established, the idea of including database changes in an integrated, continuous deployment process has been around for some time with different designations (Sabin calls it “Data DevOps”, others call it “DataOps”, others “Database Lifecycle Management (DLM)”) . For Eduardo Piairo, another speaker at WinOps 2017:

It’s not about defining a new process for handling database changes, it’s about integrating them in the service lifecycle along with application and infrastructure code.