Managed Kubernetes doesn’t mean set it and forget it
Don’t let the name fool you: Cloud-based managed Kubernetes services still require hands-on management from users. Here’s what IT teams need to know when deploying on AKS, EKS and GKE.
Containers and container orchestration are hard, which is why many people think cloud providers’ managed Kubernetes services will automatically solve these problems for them. Yes, managed Kubernetes is a real solution to these issues, but it’s also a potential trap that exposes users to even bigger problems.
There are technical issues with the portability of managed Kubernetes elements, but the biggest challenge sits at a higher level: application architecture and development planning. Addressing these architectural issues can itself create portability problems, but remember: Your application is an application first and containers and Kubernetes second. Your first priority is to ensure that the application runs properly and meets its business objectives.
Managed Kubernetes compatibility
Consider software and version compatibility when you begin your architecture design or app development. Pure container and Kubernetes applications have become common, but new application development increasingly includes reliance on additional tools, especially service meshes.
If the tools you want aren’t available on all the cloud provider services you plan to use, you’ll have to limit your choices or add the tools externally. This will somewhat limit the operational benefits of managed Kubernetes.
Service mesh technology is an important consideration because the load balancing and discovery challenges can be considerable with native managed Kubernetes services, such as Amazon Elastic Kubernetes Service, Microsoft Azure Kubernetes Service and Google Kubernetes Engine. It’s also an area where you’ll likely find major differences among providers, which translates to heightened risks of lock-in. Think service mesh from the start, and try to focus on managed Kubernetes providers that integrate it. Istio is currently the most broadly supported service mesh for managed Kubernetes.
Versioning is more complicated. Open source software tends to undergo more version changes per year than commercial software, and version requirements are rarely synchronized when multiple products are involved. Managed Kubernetes services will also have software versioning issues for the included elements. The more tools you use, the more likely it is that your versioning needs won’t line up without some effort.
In some cases, the version of one element in your deployment isn’t compatible with another. Start your versioning assessment by reviewing what’s required by the managed Kubernetes services you’re planning to use, and then validate whether those versions will work with any added elements you’re considering.
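As an illustration, that versioning assessment can be scripted before anything is deployed. The Python sketch below checks a list of add-on tools against the Kubernetes minor version a managed service pins you to; the tool names and supported-version sets are hypothetical placeholders, and real values must come from each project's release notes and your provider's version list.

```python
# Hypothetical compatibility matrix: which Kubernetes minor versions each
# add-on tool supports. These entries are illustrative placeholders; fill
# them in from each project's documentation.
TOOL_SUPPORT = {
    "istio": {"1.27", "1.28", "1.29"},
    "prometheus-operator": {"1.26", "1.27", "1.28", "1.29"},
}

def unsupported_tools(cluster_minor: str, tools: list[str]) -> list[str]:
    """Return the tools whose support matrix does not list the
    cluster's Kubernetes minor version."""
    return [t for t in tools
            if cluster_minor not in TOOL_SUPPORT.get(t, set())]

# Example: a managed cluster pinned to Kubernetes 1.26 would flag istio.
print(unsupported_tools("1.26", ["istio", "prometheus-operator"]))
```

Running a check like this against every managed service you plan to use, before development starts, surfaces the version misalignments the article warns about while they are still cheap to fix.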
Kubernetes and app configurations and parameters
Next, examine the configuration and parameterization of your applications and Kubernetes. Managed Kubernetes services tend to make assumptions about application setup, and if these assumptions are not followed, the result can range from poor performance to total failure. Most users say they can’t assess the requirements from managed Kubernetes documentation, despite their best efforts.
Users most frequently cite problems with nodes in bad or unexpected states after lifecycle changes. This can result in random or consistent node failures or long delays in transitioning from one application state to another. Usually, the problem can be corrected by changing the configuration and parameterization of the applications or Kubernetes.
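One way to catch nodes left in bad states after a lifecycle change is to script a check over the JSON that `kubectl get nodes -o json` emits. The Python sketch below parses a sample document shaped like that output; the node names and the abbreviated JSON are illustrative, not real cluster data.

```python
import json

# Abbreviated sample shaped like `kubectl get nodes -o json`; in practice
# you would capture this from the real command after each lifecycle change.
NODES_JSON = """
{"items": [
  {"metadata": {"name": "node-1"},
   "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
  {"metadata": {"name": "node-2"},
   "status": {"conditions": [{"type": "Ready", "status": "False"}]}}
]}
"""

def not_ready_nodes(nodes_doc: dict) -> list[str]:
    """Names of nodes whose Ready condition is anything other than 'True'."""
    bad = []
    for item in nodes_doc["items"]:
        ready = next((c["status"] for c in item["status"]["conditions"]
                      if c["type"] == "Ready"), "Unknown")
        if ready != "True":
            bad.append(item["metadata"]["name"])
    return bad

print(not_ready_nodes(json.loads(NODES_JSON)))  # flags node-2
```

Wiring a check like this into a post-deployment step makes node-state problems visible immediately, rather than leaving them for the managed service to paper over.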
Unfortunately, this doesn’t always happen, because managed services can make users complacent. Users often ignore or forget details they’d otherwise be forced to address if the management framework weren’t in place, even though some of those details are still theirs to manage.
The only suitable strategy here is to set up a test application and a prototype of your managed Kubernetes service and do small-scale deployments to validate compatibility. Run through all the stages of application lifecycle evolution you expect in production, including deployment, redeployment, scaling, and full or partial application teardowns. Make sure that you do this at a scale sufficient to measure any performance issues; think of a test cluster with at least 10 — and preferably 20 — nodes as the goal.
For each lifecycle stage, run full analysis of node and application state before and after the change, and audit the difference to ensure it matches expectations. Scaling, descaling and teardowns — independent of or as part of a redeployment — are the places where you’re most likely to find configuration issues and unexpected state.
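That before-and-after audit can be reduced to a simple diff against a list of expected changes. The Python sketch below is a minimal illustration; the snapshot keys (`nodes`, `pending_pods`) are hypothetical placeholders for whatever node and application state you actually capture at each lifecycle stage.

```python
def audit_diff(before: dict, after: dict, expected_changes: dict) -> dict:
    """Compare state snapshots taken before and after a lifecycle change.

    Returns every key whose value changed in a way the expected-change
    list does not account for, mapped to its (old, new) pair.
    """
    unexpected = {}
    for key in set(before) | set(after):
        old, new = before.get(key), after.get(key)
        if old != new and expected_changes.get(key) != new:
            unexpected[key] = (old, new)
    return unexpected

# After scaling from 10 to 20 nodes, only the node count should change;
# pods stuck in Pending show up as unexpected state to investigate.
before = {"nodes": 10, "pending_pods": 0}
after = {"nodes": 20, "pending_pods": 3}
print(audit_diff(before, after, expected_changes={"nodes": 20}))
```

Running the same diff after scaling, descaling and teardown stages gives you a consistent way to confirm that each transition left the cluster in the state you expected.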
Managed Kubernetes monitoring
Most managed Kubernetes services come with a preferred monitoring approach, but you must ensure that the provided tool covers all the issues you may face. Managed Kubernetes services will try to recover from problems such as node failures or lost connectivity. They may even do well enough to hide the issues, but not well enough to keep the problems from affecting users’ quality of experience. If the provider’s native service isn’t sufficient for your needs, pick a strong monitoring package, such as cAdvisor (Container Advisor) or Prometheus.
Lack of effective monitoring doesn’t cause problems so much as hide them, and users say that’s the biggest problem with managed Kubernetes services. There’s an implicit dividing line that separates the provider’s contractual responsibility from what the user must handle. Monitoring is the only way to see what’s happening at that dividing line, and yet most users don’t check their managed service contracts against the monitoring capabilities of the included tools.
All services with a contract and a service-level agreement demand a clear understanding of roles and responsibilities. They also require the user of the service to take responsibility for some aspects of operations. With managed Kubernetes, too many users are letting these issues slide, and the results aren’t making them happy. Don’t be one of them.