The Four Stages of DevOps Maturity
Source – forbes.com
I attended my first DevOps Enterprise Summit (DOES17) in San Francisco this week and watched speakers from large enterprises like Capital One, Disney, Nike, CSG, and more share the lessons learned from their multiyear DevOps transformations. Most of these companies have been working on their transformations for three to five years or more, and have reached a level of maturity where the business value of DevOps is easy to see and measure.
Like any new technology, methodology, process or paradigm shift, DevOps transformations go through various stages of maturity. Two years ago I wrote a post called “The Four Stages of Cloud Competence” and referenced Noel Burch’s four stages of learning to describe how enterprises were adopting (or not adopting) cloud computing.
1. Unconscious Incompetence
Individuals do not understand or know how to do something and do not necessarily recognize the deficit. They may even deny the usefulness of the skill. Before moving on to the next stage, individuals must recognize their own incompetence, and the value of the new skill. The length of time individuals spend in this stage depends on the strength of their stimulus to learn.
2. Conscious Incompetence
Though individuals do not understand or know how to do something, they do recognize the deficit, as well as the value of a new skill in addressing the deficit. At this stage, making mistakes can be integral to the learning process.
3. Conscious Competence
Individuals understand or know how to do something. However, demonstrating the skill or knowledge requires concentration. It may be broken down into steps, and there is heavy conscious involvement in executing the new skill.
4. Unconscious Competence
Individuals have had so much practice with a skill that it becomes second nature and can be easily done. As a result, the skill can be performed while executing another task. Individuals may be able to teach it to others, depending upon how and when it was learned.
The Four Stages of DevOps Maturity
This model has many parallels to how I have seen large organizations embrace DevOps over the last several years. The following stages are not based on scientific analysis or modeling, but rather on how I have seen organizations mature as they progress through the learning curve.
Stage 1 – DevOps Denial and Misinterpretation
At the Unconscious Incompetence stage, the lack of understanding of what DevOps is and what its business benefits are causes organizations to deny the usefulness of DevOps and write it off as a kind of fad or bad marketing term. Some call this resistance to change, but it’s really a lack of understanding of the core value proposition. The same people who complain about painful deployments, excessive on-call support issues, constant firefighting and poor work-life balance defend this way of life over the DevOps way, because they simply don’t understand what DevOps is and how it will improve their current state.
They think of DevOps as operators having to learn how to code, or developers eliminating operations jobs, so they fight it with all their might. In reality, DevOps came about from operators partnering with developers to improve deployments, stability, reliability and their own quality of life.
Kishore Jalleda, Sr. Director of Production Engineering at Yahoo, describes DevOps best in this slide.
At stage one, this message is lost, and organizations proceed with a “business as usual” attitude, waiting for this shiny object to pass in the night like many buzzwords before. The age-old silo operating model lives on.
Stage 2 – Automation for the Sake of Automation
The Conscious Incompetence stage is usually present in the first 12 to 18 months of an organization’s DevOps journey. Many organizations grab onto the automation aspects of DevOps and start madly writing Chef scripts to automate everything and anything. Although some value is derived from these efforts (automated builds, automated infrastructure), much of this work is still being performed in silos, with little to no improvement in collaboration between groups. Three common anti-patterns emerge in stage 2.
Anti-pattern 1 – The DevOps Silo
What is the best way to fix your silo problem? Create a new silo called DevOps, and hire a bunch of DevOps Engineers (see my “No you are not a DevOps Engineer” rant). Welcome to your new bottleneck. In this model, a new element of separation is inserted between Dev and Ops, and a “build it and they will come” mindset leads to more shadow IT as Dev’s needs continue to go unmet.
Anti-pattern 2 – Dev Don’t Need No Stinkin’ Ops
In this model, development teams script away the ops team and provision all of their own infrastructure as code. After all, infrastructure provisioning is their biggest bottleneck. The downside to this approach is that Devs are not experts at networking, security, compliance, support and many other capabilities that have been provided for them by various shared service groups. So although the Dev groups can now move much faster, they drastically increase risks for their company that can have catastrophic consequences.
Anti-pattern 3 – Rebranding Sysadmins as DevOps Engineers
In this model, nothing really changes other than the sysadmins’ titles and resumes. All of the bottlenecks that keep work from flowing left to right remain, but some really cool scripts get built. Often, too many scripts are built, and now Ops is buried in script hell trying to manage thousands of lines of scripts, some of them not checked into a repository. This “build it and they will come” approach often either backfires or just creates new bottlenecks. The Ops team often gains efficiency in its day-to-day tasks, but at the expense of newly wasted time in Dev’s day-to-day tasks.
This may all sound like doom and gloom, but you have to start somewhere. A lot of learning occurs in this stage. Both Dev and Ops get familiar with the tooling and start to have a better understanding of each other’s needs. Some processes are improved, but usually those improvements are departmental and not system-wide.
Stage 3 – Collaboration and Reorganization
The Conscious Competence stage usually exists from years two to four. At this stage, the organization understands that DevOps is more than engineers and scripts, and is all about improving the entire software development lifecycle (SDLC), from business idea inception to business idea running in production.
By the time a company has entered stage 3, they have had pockets of success from grassroots efforts within the company. (Note: sometimes success is driven top-down, but more often than not a grassroots effort starts the journey.) Certain products or services within the portfolio have seen massive benefits from early DevOps efforts, and management has taken notice. Now the organization is open to doing the really hard stuff–people and process change.
At stage 3, collaboration expands beyond Dev and Ops to security, legal, compliance, audit and all those other bottleneck areas for Dev. In fact, the pattern we see from stage 3 to 4 is the shifting left of bottlenecks. First it was infrastructure. Then QA (automated testing in the pipeline). Then security (security scans in the pipeline, security and controls baked into the infrastructure, etc.).
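To make the shift-left idea concrete, here is a minimal sketch of a pipeline in which the checks that used to happen late (QA, security) run as automated stages before anything is deployed. The `Change`, `Stage` and `run_pipeline` names and the stage list are hypothetical, invented for illustration; they do not describe any particular company's tooling.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Change:
    """A hypothetical unit of work moving through the pipeline."""
    name: str
    tests_pass: bool
    has_known_vulnerabilities: bool


@dataclass
class Stage:
    """A named pipeline step with a pass/fail check."""
    name: str
    check: Callable[[Change], bool]


# Hypothetical stage order: testing and security scanning sit in the
# pipeline itself, ahead of deployment, rather than after a handoff.
PIPELINE: List[Stage] = [
    Stage("build", lambda c: True),
    Stage("automated tests", lambda c: c.tests_pass),
    Stage("security scan", lambda c: not c.has_known_vulnerabilities),
    Stage("deploy", lambda c: True),
]


def run_pipeline(change: Change) -> List[str]:
    """Run the stages in order and stop at the first failing check."""
    log: List[str] = []
    for stage in PIPELINE:
        if not stage.check(change):
            log.append(f"{stage.name}: FAILED")
            break
        log.append(f"{stage.name}: ok")
    return log
```

In this sketch a change with a known vulnerability never reaches the deploy stage, which is the point of moving the check left: the problem surfaces minutes after the commit instead of weeks later in a review queue.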
In stage 3 we see the birth of platforms. Whether companies are using public or private cloud, or no clouds at all, platform teams build a layer of enterprise guard rails on top of the infrastructure layer, and provide self-service capabilities for the Dev teams. These platforms take the requirements from GRC (governance, risk, compliance), security, etc., and provide APIs and abstractions so that the Dev teams can inherit these controls and policies as they consume the platform services. This operating model often looks like this:
In this model the Ops team is responsible for providing and operating the platform, and the developers are responsible for operating their applications, which are built on top of the enterprise guard rails. In stage 3 we finally get to “you build it, you run it.”
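As a rough illustration of how Dev teams can inherit controls from a platform, the sketch below shows a self-service call that applies enterprise guard rails automatically. Everything in it is an assumption made for the example: the `provision_database` function, the `GUARD_RAILS` settings and the allowed sizes are hypothetical and not taken from any real platform.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical enterprise controls. In a real platform these would come
# from GRC and security requirements, not from a hard-coded dictionary.
GUARD_RAILS: Dict[str, bool] = {
    "encryption_at_rest": True,
    "public_access": False,
    "audit_logging": True,
}

# Hypothetical options the platform team lets Dev teams choose from.
ALLOWED_SIZES = {"small", "medium", "large"}


@dataclass
class Database:
    """A provisioned resource, as returned to the requesting team."""
    owner_team: str
    size: str
    settings: Dict[str, bool] = field(default_factory=dict)


def provision_database(owner_team: str, size: str = "small") -> Database:
    """Self-service entry point offered by the platform team.

    The caller picks only the size; the guard rails are copied onto
    every database and are not exposed as parameters.
    """
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {sorted(ALLOWED_SIZES)}")
    return Database(owner_team=owner_team, size=size, settings=dict(GUARD_RAILS))
```

A Dev team would call something like `provision_database("payments", "medium")` and receive a database that already has encryption and audit logging switched on. The design choice is that controls live in the platform, so teams get them by default instead of having to request them.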
As organizations mature through Stage 3, we see improved collaboration across departments, and value stream mapping exercises being performed to aid in process improvement activities. A learning organization begins to emerge, and activities like blameless post mortems, gameday exercises, adoption of lean concepts, and others take root.
Stage 4 – A High Performing Organization
At the Unconscious Competence stage, organizations have the ability to deploy multiple times a day with certainty and minimal risk. Removing bottlenecks becomes a way of life. What I saw over and over at DOES17 were large organizations shifting everything left. In stage 3 we saw QA and security shifting left. In stage 4 we see Ops, GRC, and even tier 1 through tier 3 support shifting to the business units (BUs).
Each BU becomes a self-sufficient organization with full stack teams (squads) made up of experts across all the necessary technology and process domains. The classic security, Ops and GRC teams still exist, but their role is to establish policy and standards, not implement them. The BUs implement them in the way that’s optimal for their products or services.
This was best explained by CSG International’s Scott Prugh and Erica Morrison as they compared their old way of thinking to their new way of thinking.
After CSG removed infrastructure as their biggest bottleneck, they went after the next biggest bottleneck, which was organizational structure. Their slides showed how they adopted the T-shaped management model, where one person owns both the development and operations of the product within the BU.
This transformation took a while to perfect. Once things were operating smoothly, they moved on to the next biggest bottleneck: process. They then showed how they moved the responsibility for implementing and monitoring security and GRC to the BUs within the T-shaped teams.
The next bottleneck on their list is tier 1 through tier 3 support. I am looking forward to hearing their presentation next year, and learning about their experiences shifting that process left to the BUs.
Capital One now has over 300 products that are being deployed up to 50 times a day, up from 20 products last year. Yes, you heard that right. A financial institution running on the public cloud (AWS) is changing compliant software multiple times a day across 300 product lines. If they were my competition, I would be shaking in my boots. Many financial institutions require 50 meetings to deploy one piece of code. By those numbers, these folks can deploy on the order of 15,000 times a day across their portfolio (300 products at up to 50 deployments each). And this is not Facebook, Twitter or Amazon. This is a bank!
If you haven’t figured it out yet, companies like Capital One, which are in Stage 4 of DevOps maturity, are also advanced in their digital transformation because of the velocity with which they can deploy software. For example, Capital One introduced banking via Alexa in 2016, shortly after Alexa became available as a viable production-quality device. Most companies had not even considered what a device like Alexa could do for their customers when Capital One had already deployed it.
Companies like Nike and others gave us examples of how they responded to urgent business requests within a week, which they previously would not have been able to deliver on because of the limitations of their legacy operating model.
DevOps is real, and is not a fad or another fancy buzzword. As companies move through the stages of learning, each stage provides more business value to the company. The speed at which business moves is increasing exponentially, and the companies that can deploy at the prevailing speed of business will win. As we look at the technologies that are driving the future–such as AI, machine learning, blockchain and IoT–the companies that have entered Stage 4 maturity will be able to bring solutions using these technologies to market faster than their competition, creating even more separation in the marketplace. My advice is to either embrace DevOps and start transforming your organization, or go out of business.