Harnessing AI to make DevOps more effective

Source – zdnet.com

Loom Systems’ AI-powered log analysis aims to warn companies of potential problems in their systems by reading logs and detecting when something is likely to go wrong. It then sends out alerts so DevOps and IT managers can respond before systems go down.

ZDNet talked to CEO Gabby Menachem about the company’s future plans.

ZDNet: Tell me about the origins of Loom Systems.

Menachem: We were founded two years ago, using our own money. We did this, rather than use venture capital funding, because we wanted to be closer to our customers. That way we could better understand the changing needs of companies in this data-driven industry.

Having worked with big data platforms in the past, we had seen how all companies were occupied with solving problems with data.

From infrastructure to data analysis platforms to automated data science, at Loom we have been building all of these. And with all that experience, what we have come to realise is that, while companies should be able to continuously analyse big data, they are not succeeding in doing that.

That’s because many of them find it just too hard. It’s hard to have the right methodology, it’s hard to analyse it, and it’s hard to develop actual use cases.

So, understanding that, we broke it down into different stages of analysis and decided to build what we think is the world’s first cognitive platform that can analyse the data for you.

For the first two years we worked without any VC funding. We took in only revenue from customers. We built up very quickly and in a very short time got up to 28 employees in offices in Tel Aviv and San Francisco.

Now we are expanding our sales to the US and we have taken in our first VC money. We are growing quickly and now we have over 30 people, and intend to have 50 by the year-end. We now have customers of various sizes, up to some who are in the Fortune 500.

And a big part of that is big data?
Maybe I should start with a specific use case.

Where, at one time, you would have been talking to someone about what the problem was, now you are talking to a digital system instead.

Now instead of talking to people, staff have to do their work on a system and, whatever the problem, they need to solve it in minutes, not the hours or days that they have been used to.

This new strain on IT, together with the fact that the digital medium itself has grown so fast (and not forgetting that it is also a very big part of the revenue for all of these companies), creates pressure. So IT operations has become another place where businesses need to invest in order to reduce the problems that they are seeing and to mitigate the churn that IT causes.

If IT is such a big part in this, how do you get that across to the people who work in IT? 
I think we are already having some success with that. Gartner recently praised us as a Cool Vendor of 2017 for expertise in performance analysis. That was because we were able to show that we could take the insights of a problem and create a crowdsourced solution that IT people and DevOps people could put on their usual schedule while continuing to contribute to IT and DevOps.

How does this work? Every digital system today creates log messages, and that fact is the basis for understanding how DevOps people solve issues within businesses. Whenever you have a problem, you find out about it either by yourself or through a customer. You then go into the system and investigate by looking at the logs and trying to guess what went wrong. That usually entails using whatever methodology you have: typically a person trying to find a word, a structural anomaly in the logs, or specific messages that can tip you off to where the problem originated.
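The manual investigation described here can be sketched in a few lines of Python. This is only an illustration, not Loom Systems’ method: the keyword list, the `template_of` helper and the `rare_threshold` parameter are all assumptions made for the example. It searches a batch of log lines for suspicious words and flags lines whose structure (the line with its numbers masked out) is rare.

```python
import re
from collections import Counter

# Hypothetical keywords an engineer might search for; not from the interview.
KEYWORDS = ("error", "timeout", "failed")


def template_of(line: str) -> str:
    """Reduce a log line to its structure by masking numbers and hex ids."""
    return re.sub(r"\b(0x[0-9a-fA-F]+|\d+)\b", "<*>", line)


def manual_investigation(lines, rare_threshold=1):
    """Return (keyword_hits, rare_lines) for a batch of log lines.

    keyword_hits: lines containing any of KEYWORDS (case-insensitive).
    rare_lines:   lines whose structure occurs at most rare_threshold times.
    """
    keyword_hits = [l for l in lines if any(k in l.lower() for k in KEYWORDS)]
    counts = Counter(template_of(l) for l in lines)
    rare_lines = [l for l in lines if counts[template_of(l)] <= rare_threshold]
    return keyword_hits, rare_lines
```

Both checks depend on a person choosing the keywords and the threshold in advance, which is the limitation the rest of the answer addresses.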

Now that causality problem (how do you find the root cause of issues?) is a big problem because you need to solve it in minutes. When we looked at these problems we thought: ‘What if we could build a system that could go through that process just as a human would, but do it in an automated manner?’

Now you can just stream logs from all your systems into Loom Systems, and what it does, using AI, is look at all your logs and do this investigation on a continuous basis. So instead of investigating when something goes wrong, it investigates on an ongoing basis and if it finds something it alerts you.

It replaces the way you do monitoring so that, instead of investigating with metrics that you have pre-defined, now you have a system that surfaces interesting events that could be problems in your IT systems. And it does it without you needing to know how to define these problems.
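As a rough illustration of continuous investigation without pre-defined metrics, the sketch below learns the structures of the log lines it sees and raises an alert when a structure appears that it has never seen before. It is a toy under stated assumptions, not a description of how Loom Systems’ platform works: the class name `StreamingLogMonitor`, the `warmup` parameter and the number-masking rule are all hypothetical.

```python
import re
from collections import Counter


class StreamingLogMonitor:
    """Toy continuous log monitor: learns line structures, flags new ones."""

    def __init__(self, warmup=100):
        self.warmup = warmup   # lines to observe before alerting (assumed value)
        self.seen = Counter()  # structure -> number of times observed
        self.total = 0         # total lines ingested so far

    @staticmethod
    def _template(line):
        # Mask numbers so lines differing only in values share one structure.
        return re.sub(r"\b\d+\b", "<*>", line)

    def ingest(self, line):
        """Process one log line; return an alert string, or None."""
        tpl = self._template(line)
        is_new = tpl not in self.seen
        self.seen[tpl] += 1
        self.total += 1
        if is_new and self.total > self.warmup:
            return f"ALERT: unseen log pattern: {line}"
        return None


# Usage: feed lines as they arrive instead of searching after an outage.
monitor = StreamingLogMonitor(warmup=3)
for entry in ["req 1 ok", "req 2 ok", "req 3 ok", "req 4 ok", "disk full on node 7"]:
    alert = monitor.ingest(entry)
    if alert:
        print(alert)
```

Nothing here defines in advance what a problem looks like; the monitor only notices that a line does not match anything it has learned, which is the shift from pre-defined metrics to surfaced events.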

That’s the way to be on top of everything and it mitigates the risk that we talked about earlier.

The second thing is, after you understand this then you can reverse the process so that you are now proactive. Now whenever something happens in the system, you get alerted on it even though the customer doesn’t know about it.

Are your systems applicable to any type of system or application?
Our system is agnostic but we are seeing more and more customers in the financial industry along with telcos. I think that’s probably because in the financial industry you are seeing a lot more innovation and early adopters of technology.

And the telco industry specifically, I think, because margins there are now low, so they are looking for more efficiencies and ways to mitigate IT risk. I think that a technology like ours is not only managing to put a cap on the risk that they have, but is also giving them a better estimate for the customer. For example, a problem with your cell phone is not something that customers take lightly these days, so if you have issues, customers want them solved very quickly.

Where are you seeing the most interest?
I think ecommerce is a growing space and in a lot of cases, where they used to be doing things that could be handled by a sole IT guy, they are now expanding and are becoming very large systems that are a big source of revenue.

We are also seeing a lot of manufacturing companies that want us to look at the logs from robots. Now, it is all based on secure assembly lines that are working faster than humans could, but it is also creating a big risk because if the line is not working, that hits your revenue. And all that is also working with our platforms.

Our software is there to predict issues and fix them fast. But the pressure is on from companies: they are buying these robots and need software to manage them, but the software is not being built at the same pace as the ability to manufacture.

So where is DevOps in all of this?
With these modern systems, efficiency of software is key and companies have been working on making software work more efficiently. DevOps has been important in the software world but now we are not only seeing it there but everywhere.

Having a good continuous development strategy and integration skills in manufacturing plants is something that companies (like GE, for example, which has been pushing it for a few years now) are taking into account in their vision for the future.

But can DevOps really be part of manufacturing?
Where I see DevOps in the future is as part of a new methodology. Because software is moving so fast, you need a new way of doing operations.

Now it is becoming so tied to the software that you need to incorporate the developer, and we think that [line of business] is the perfect solution for that.

Where it used to be aimed at a specific solution, now, because the systems are so complex, just getting an understanding of how a business is performing can be hugely complex. So more and more people have to be put on looking at business data and DevOps data.

Now in that sense, DevOps is still young. A lot of the early issues around DevOps have been solved, but now you have new issues, like how to handle all this data and build actions on top of it.

And this just returns us to the basic issue. It is all about the business, so you are not just doing DevOps because it’s new and cool, you are building it into your systems because it can solve business problems.

While it is doing that it is also building you a better service for your customers.
