The Next Generation of DevOps: ML Ops
Source – insidebigdata.com
In this special guest feature, Debashis Saha, Vice President of Platform Engineering at Intuit, discusses how DevOps methodologies can be applied to machine learning, in what he calls “ML Ops.” He believes ML Ops can provide end-to-end automation of the process, creating transparency and delivering efficiency and productivity so that everybody involved can deliver value rapidly. As VP of Platform Engineering for Intuit, Debashis leads the engineering teams responsible for the platform, developer and application services that enable developers to be productive and innovate for Intuit's customers. Prior to this, he was the Executive VP and CTO for Jiff, Inc. Prior to Jiff, he was VP of Commerce Platform Infrastructure at eBay, where he led the engineering teams that built, managed and operated the platforms and infrastructure services for eBay. His portfolio included eBay’s entire infrastructure, ranging from data centers to cloud, platforms, frameworks, data, engineering services and operations. Debashis earned an M.S. in Electrical Engineering and Computer Science from Massachusetts Institute of Technology and a Bachelor of Technology in Computer Science and Engineering from the Indian Institute of Technology, Kharagpur.
The age of AI is upon us. As AI becomes more ubiquitous, many are finding new and innovative ways to operationalize data science in order to increase efficiency, speed and scale.
As I look at traditional DevOps methodologies, there are synergies and parallels that can also be applied to the data science world. The new chasm involves multiple disciplines: Data Engineering, Data Science and Software Engineering. Traditional DevOps is the battleground between developers and operations, and that battle continues in the world of data science in a more pronounced manner, now among data engineers, data scientists, software developers and operations. These four personas come with different requirements, constraints and velocities. It is extremely hard to balance all four in a way that satisfies the business requirements while complying with corporate and organizational policies.
The Rise of ML Ops
Appropriate data engineering is necessary to transform raw data into processed data suitable for use in machine learning algorithms. This leads to a fusion of data engineering and data science that, if not done effectively, can reduce productivity, efficiency and the speed of development, deployment and, ultimately, broad adoption of data science. I call this fusion “ML Ops”, an essential element of machine learning development that complements and completes the life cycle of an ML developer.
ML Ops encapsulates aspects of data engineering, software engineering and data science to provide an end-to-end view of applying intelligence from data to a business use case. A majority of data science projects stay in the labs because integration with production environments is extremely complicated, manual and prone to error. The lack of sophisticated ML Ops therefore hinders a company from extracting intelligence from the data it already has and applying that intelligence to its business processes, and it triggers disillusionment with data science and machine learning in general.
We are starting to see automation frameworks and services emerge in the public and private domain that bridge the skill set and process gaps of data science, software development and data engineering.
Applying ML Ops to Your Organization
An ML Ops platform will provide end-to-end automation of the processes involved in solving a business problem. A typical automation process includes iterative life cycles in data engineering (preparation, cleaning, refining and transformation), data science (model development, training, testing, validation and optimization) and deployment (further testing, deployment, experimentation, monitoring, performance engineering and operating). Each of these is a very complex process with its own tools and systems, which typically don’t integrate well, include lots of manual touch points and handoffs, and sometimes don’t even interoperate. The first-order problem is the lack of visibility and transparency in the end-to-end process. A modern ML Ops engineering platform will stitch together these disparate steps into a seamless workflow that enables collaboration between everybody involved in solving the business problem.
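As a rough illustration of what stitching these steps into one workflow can look like, here is a minimal Python sketch. The `Pipeline` class, the stage functions and the toy "model" (a simple mean) are all hypothetical names invented for this example and do not correspond to any particular ML Ops product. The point is only that each handoff is automated and recorded, which is where end-to-end visibility comes from.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class Pipeline:
    """Hypothetical workflow that chains named stages and records each handoff."""
    stages: List[Tuple[str, Callable[[Any], Any]]] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def add(self, name: str, fn: Callable[[Any], Any]) -> "Pipeline":
        self.stages.append((name, fn))
        return self

    def run(self, payload: Any) -> Any:
        # Pass the output of each stage to the next and note which stage ran.
        for name, fn in self.stages:
            payload = fn(payload)
            self.log.append(name)
        return payload


def prepare(rows):
    # Data engineering (assumed rule): drop records with missing values.
    return [r for r in rows if r is not None]


def train(rows):
    # Data science (toy stand-in): the "model" is just the mean of the data.
    return {"mean": sum(rows) / len(rows)}


def deploy(model):
    # Deployment (toy stand-in): wrap the model in a callable a service could invoke.
    return lambda x: x - model["mean"]


pipeline = Pipeline().add("prepare", prepare).add("train", train).add("deploy", deploy)
predict = pipeline.run([1.0, None, 2.0, 3.0])
```

After the run, `pipeline.log` lists the stages in the order they executed, and `predict` is the deployed callable. A real platform would replace each toy function with the actual tooling for that stage.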
We are in the very early stages of “Data Science Productivity”, similar to the days when the first tools like compilers or editors started to appear for developing software for computers. We see two primary reasons for the lack of integrated tools and platforms in this space: (a) the rapidly changing landscape in each of the three contributing areas (data platforms, data science algorithms and platforms, and cloud infrastructure), and (b) talent and skill set gaps: few people understand all the disciplines involved well enough to provide meaningful abstraction, automation and productivity in a generic fashion that is broadly applicable to real, practical use cases.
The fundamental need of the hour is to deliver trustworthy, quality intelligence and apply it to a business problem rapidly. There is a tremendous need to refresh the models in near real-time, if not real-time. In addition, the business wants to experiment with a variety of intelligence for improving the customer’s experience, which means applying different cross sections of data, model and software. A lack of quality, and hence of testing, in this whole cycle diminishes trust in the results and their repeatability. In many industries, like banking and insurance, there is a regulatory need to prove the reproducibility and veracity of the model using the same data. The ability to test during development, during deployment and after the model is in use is therefore critical for both efficiency and compliance.
As the encapsulation of Data Science with Data becomes more sophisticated, we can expect to deliver AI and machine learning in an extremely scalable manner through many cloud services. A true ML Ops-driven end-to-end data science platform can have a transformative impact on the world by unlocking the latent intelligence in all of the data present in each business and in the public domain.
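One way to picture the reproducibility requirement is the small Python sketch below. It is an illustration under stated assumptions, not a compliance recipe: the helper names (`fingerprint`, `train_with_record`, `reproduces`) and the toy "training" step are hypothetical. The idea is to record a hash of the training data and the random seed alongside the model, so that a later check can confirm the same data produces the same model.

```python
import hashlib
import json
import random


def fingerprint(rows):
    """Return a SHA-256 hex digest of the training data serialized as JSON."""
    blob = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def train(rows, seed):
    """Toy 'training' (assumption): average a seeded random sample of the data."""
    rng = random.Random(seed)
    sample = rng.sample(rows, k=max(1, len(rows) // 2))
    return {"mean": sum(sample) / len(sample)}


def train_with_record(rows, seed):
    """Train and keep an audit record of the data hash, the seed and the model."""
    model = train(rows, seed)
    record = {"data_sha256": fingerprint(rows), "seed": seed, "model": model}
    return model, record


def reproduces(rows, record):
    """Check that the data is unchanged and retraining yields the recorded model."""
    if fingerprint(rows) != record["data_sha256"]:
        return False
    return train(rows, record["seed"]) == record["model"]


data = [1.0, 2.0, 3.0, 4.0]
model, record = train_with_record(data, seed=7)
```

Calling `reproduces(data, record)` retrains on the same data with the recorded seed and compares the result. If the data has changed in any way, the hash no longer matches and the check fails before retraining.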