Microservices and big data start to get closer
Microservices are riding a wave of user interest, leading to changes in IT organization. ThoughtWorks expert Zhamak Dehghani discusses what that means for big data.
Database development and management are changing dramatically, as microservices gain wider use in the enterprise.
That’s according to Zhamak Dehghani, principal consultant at Chicago-based ThoughtWorks, a global technology consulting firm specializing in Agile methods, distributed systems and open source software adoption. The firm has been active in bringing together microservices and big data.
Microservices bear some resemblance to a once-pervasive industry trend called SOA, or service-oriented architecture. Both capitalize on a general move away from monolithic computer architecture.
While it is still early, microservices are expected to make continued inroads. By 2022, according to analyst firm IDC, 90% of all apps will include microservices architectures.
The move to microservices — and closely associated Kubernetes and software containers — has been cited as a factor in IBM’s plans to acquire Red Hat and the merger of Hadoop vendors Cloudera and Hortonworks.
Accompanying the microservices wave are DevOps, cloud computing, domain-driven design and NoSQL databases. All these technologies and practices find special favor, as organizations try to move new applications into production more quickly and update them often once they are there.
Dehghani, who worked for 20 years as a software engineer and architect in distributed computing, embedded systems and communications, has focused at ThoughtWorks on applying domain-driven design concepts in operational systems.
Using domain-driven design and microservices, Dehghani said, developers address a business problem first and then employ suitable technology to build or improve apps, rather than letting technology lead the way.
Such techniques drove greater use of NoSQL databases that were especially fit for specific purposes, compared to more general relational databases. ThoughtWorks was among the organizations helping to drive NoSQL’s ascent and, similarly, the growth of interest in low-latency, globally deployed SQL databases.
At the consultancy, Dehghani contributes to the ThoughtWorks Technology Radar, a technology scorecard that has been a model for users assessing open source software microservices options.
From monoliths to microservices
Dehghani maintained that microservices are moving responsibility for databases from central IT to developers who are working as part of lines of business.
“Businesses used to put the developers in one tier and the users in another. Now, as you see a move from monoliths to microservices, you see a change in how we organize IT,” she said. “What used to happen was that operational data was owned by DBAs [database administrators] or database experts that integrated all the applications into one layer of a database.”
That led to an all-too-familiar scenario in which “every time you wanted to make a change, you had to go to a separate team,” according to Dehghani. When developers or business users managed to hook up with data stewards, they often took a place behind a long backlog of such requests.
“With microservices, we have changed the model so that the people that own the application or service are also responsible for the database, the data itself and the schema in which it resides,” she said.
Just-in-time domain design
But, to date, IT organizations haven’t broadly applied such domain-driven lessons in big data, Dehghani contended.
In fact, big data development has taken place outside of the purview of microservices architecture, Dehghani said. While she gives open source software systems like Hadoop and Spark credit as steps away from proprietary software, she said work on such big data frameworks has usually followed a monolithic model.
It has been hard to move these big data projects into general production. On top of that, “they haven’t answered the needs of organizations,” she said.
Microservices and big data take shape
Meanwhile, a closer relationship between microservices and big data has been building.
Dehghani outlined the need for what she called the decentralization of big data architecture in a posting on ThoughtWorks colleague Martin Fowler’s widely read technology blog. There, she wrote that big data for analytical and machine learning purposes “has remained centralized and disconnected from the business domains.”
Dehghani said data lakes, for a while the hallmark of big data efforts, tend to become data silos — and a distributed data mesh approach is emerging that is more in tune with microservices thinking.
“The way we broke down operational capabilities around domains has to come to big data, too,” she said. Still, microservices won’t find greater use in data management without further learning.
Infrastructure teams and cloud providers need to be part of the effort, too, Dehghani said, because it is not good for every application development team to have to maintain the infrastructure for its applications.
Also, she cautioned that the paths organizations take will need to vary depending on available skill sets.
“Respecting organization maturity is important. I see a lot of people that want to run microservices on the cloud overnight. But that change is a journey in itself. A lot of failures I see are due to not recognizing that — not having an incremental approach,” Dehghani said.