Kubernetes, Containers and HPC
Software containers and Kubernetes are important tools for building, deploying, running, and managing modern enterprise applications at scale. They help deliver enterprise software to the end user faster and more reliably, while using resources more efficiently and reducing costs.
Recently, high performance computing (HPC) has been moving closer to the enterprise and can therefore benefit from an HPC container and Kubernetes ecosystem, especially given new requirements to quickly allocate and deallocate computational resources for HPC workloads. Compute capacity, be it for enterprise or HPC workloads, can no longer be planned years in advance.
Getting on-demand resources from a shared compute resource pool has never been easier, as cloud service providers and software vendors are continuously investing in and improving their services. In enterprise computing, packaging software in container images and running containers has become standard practice. Kubernetes has become the most widely used container and resource orchestrator among Fortune 500 companies. The HPC community, led by efforts in the AI community, is picking up the concept and applying it to batch jobs and interactive applications.
Containers and HPC: Why Should We Care?
HPC leaders have a hard time. There are lots of changes and new ways of thinking in software technology and IT operations. Containerization has become ubiquitous; container orchestration with Kubernetes is the new standard. Deep learning workloads are continuously increasing their footprint, and Site Reliability Engineering (SRE) has been adopted at many sites. It is very hard to judge the usefulness of each of these new technologies for HPC-type workloads.
Introduction to Kubernetes
If your engineers or operators are running a single container on their laptop, they probably use Docker to do so. But with many containers, potentially spread across dozens or hundreds of machines, it becomes difficult to maintain them. Kubernetes simplifies container orchestration by providing scheduling, container life-cycle management, networking functionality, and more in a scalable and extensible platform.
A major component of Kubernetes is the Kubernetes master, which contains an API server, a scheduler, and a controller manager. Controllers are a central concept: they watch the current state of resources and compare it with the expected state. If the two differ, the controller takes action to move toward the expected state. On the execution side we have the kubelet, which is in contact with the master, as well as a network proxy. The kubelet manages containers through the container runtime interface (CRI), which lets it interact with runtimes like Docker, containerd, or CRI-O.
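The controller idea described above can be illustrated with a small, self-contained Python sketch. This is not Kubernetes code: the `Cluster` class, the `reconcile` function, and the pod names are hypothetical stand-ins, and a real controller would talk to the API server rather than to an in-memory list. The sketch only shows the compare-and-correct pattern.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Cluster:
    """Toy stand-in for the observed cluster state (hypothetical, not a Kubernetes API)."""
    running: List[str] = field(default_factory=list)


def reconcile(desired_replicas: int, cluster: Cluster, name: str = "worker") -> List[Tuple[str, str]]:
    """Compare the expected state with the current state and correct any difference.

    Returns the list of actions taken, so that a second call on an
    already-converged cluster returns an empty list.
    """
    actions = []
    next_id = 0
    # Too few pods: start new ones, reusing the lowest free index.
    while len(cluster.running) < desired_replicas:
        pod = f"{name}-{next_id}"
        next_id += 1
        if pod in cluster.running:
            continue
        cluster.running.append(pod)
        actions.append(("start", pod))
    # Too many pods: stop the most recently started ones.
    while len(cluster.running) > desired_replicas:
        pod = cluster.running.pop()
        actions.append(("stop", pod))
    return actions


cluster = Cluster(running=["worker-0"])
print(reconcile(3, cluster))   # scale up toward three replicas
cluster.running.remove("worker-1")  # simulate a failed pod
print(reconcile(3, cluster))   # the loop notices the gap and repairs it
```

Calling `reconcile` repeatedly is what gives the self-healing behavior: the caller never says "start a pod", only "there should be three", and the loop works out the difference.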
Running Kubernetes or HPC Schedulers?
Kubernetes does workload and resource management. Sounds familiar? Yes, in many ways it shares a lot of functionality with traditional HPC workload managers. The main difference is the type of workload each focuses on. While HPC workload managers are focused on running distributed memory jobs and supporting high-throughput scenarios, Kubernetes is primarily built for orchestrating containerized microservice applications.
HPC workload managers like Univa Grid Engine have added a large number of features over the last decades. Some notable functionalities are:
– Support for shared and distributed memory (like MPI based) jobs
– Advance reservations for allocating and blocking resources in advance
– Fair-share to customize resource usage patterns across users, projects, and departments
– Resource reservation for collecting resources for large jobs
– Preemption for stopping low-priority jobs in favor of running high-priority jobs
– NUMA aware scheduling for automatically allocating cores and sockets
– Adherence to standards for job submission and management (like DRMAA and DRMAAv2)
HPC workload managers are tuned for speed, throughput, and scalability. They are capable of running millions of batch jobs a day and support the infrastructure of the largest supercomputers in the world. What traditional HPC workload managers lack are means for supporting microservice architectures, deeply integrated container management capabilities, network management, and application life-cycle management. They are primarily built for running batch jobs in different scenarios: high-throughput workloads, MPI jobs spanning potentially hundreds or thousands of nodes, jobs running for weeks, or jobs using special resource types (GPUs, FPGAs, licenses, etc.).
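To make the preemption feature listed above more concrete, here is a minimal Python sketch of priority-based preemption on a fixed number of slots. It is an illustration under simplifying assumptions, not how Univa Grid Engine or any other workload manager implements it: the `schedule` function, the job names, and the rule "a higher number means higher priority" are all hypothetical, and real schedulers also consider resource requests, run times, and policies.

```python
import heapq
from typing import List, Tuple


def schedule(slots: int, arrivals: List[Tuple[str, int]]):
    """Place jobs on a fixed number of slots, preempting lower-priority jobs.

    arrivals is a list of (job name, priority) in arrival order; a higher
    number means a more important job. Returns three lists of job names:
    (still running, preempted, waiting).
    """
    running: list = []          # min-heap of (priority, arrival order, job name)
    preempted: List[str] = []
    waiting: List[str] = []
    for order, (name, priority) in enumerate(arrivals):
        if len(running) < slots:
            # A free slot exists: start the job right away.
            heapq.heappush(running, (priority, order, name))
        elif running and running[0][0] < priority:
            # No free slot, but the least important running job has a lower
            # priority than the newcomer: stop it and start the newcomer.
            _, _, victim = heapq.heapreplace(running, (priority, order, name))
            preempted.append(victim)
        else:
            # Nothing can be preempted: the job has to wait.
            waiting.append(name)
    still_running = [name for _, _, name in sorted(running, reverse=True)]
    return still_running, preempted, waiting


jobs = [("sim-a", 1), ("sim-b", 5), ("urgent", 9), ("low", 0)]
print(schedule(2, jobs))
```

The min-heap keeps the least important running job at the front, so the preemption decision is a single comparison per arriving job.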
Kubernetes, on the other hand, is built from the ground up for containerized microservice applications. Some notable features are:
– Management of sets of pods. Pods consist of one or more co-located containers.
– Networking functionalities through a pluggable overlay network
– Self-healing through the controller concept, which compares expected with current state
– Declarative style configuration
– Load balancing functionalities
– Rolling updates of different versions of workloads
– Integrations with many monitoring and logging solutions
– Hooks to integrate external persistent storage in pods
– Service discovery and routing
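Several of the features listed above (sets of pods, declarative configuration, rolling updates) come together in a Kubernetes Deployment object. As a hedged illustration, the Python sketch below builds such a description as a plain dictionary and prints it as JSON. The field names follow the `apps/v1` Deployment schema; the helper `deployment_manifest`, the application name `solver`, and the image `registry.example.com/solver:1.0` are hypothetical placeholders.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    """Describe the desired state: how many pods, which image, how to update."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector tells the Deployment which pods belong to it.
            "selector": {"matchLabels": labels},
            # Replace pods gradually when the pod template changes.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxUnavailable": 1, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical application name and image, for illustration only.
manifest = deployment_manifest("solver", "registry.example.com/solver:1.0", 3)
print(json.dumps(manifest, indent=2))
```

The point of the declarative style is that the document states only the desired end state. Saved to a file, such a description could be handed to the cluster with `kubectl apply -f`, and changing the image tag and applying it again is what would trigger a rolling update.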
What Kubernetes lacks at this time is a proper high-throughput batch job queueing system with a sophisticated rule system for managing resource allocations. One of the main drawbacks we see, however, is that traditional HPC engineering applications are not yet built to interact with Kubernetes. This will change in the future. New kinds of AI workloads, on the other hand, already support Kubernetes very well; in fact, many of these packages are targeted at Kubernetes.
Can HPC Workload Be Managed by Kubernetes?
To fully meet HPC requirements, we should combine Kubernetes with HPC workload-management systems. Kubernetes is used for managing HPC containers along with all the required services. The containers need to do more than run the engineering application: they must also be able to either plug into an existing HPC cluster or run an entire HPC resource manager installation (like SLURM or Univa Grid Engine). In that way, we retain compatibility with the engineering applications and can exploit the extended batch scheduling capabilities. At the same time, the whole deployment can be operated in any Kubernetes-enabled environment, with the advantages of standardized container orchestration.
Ease of Administration
HPC environments consist of a potentially large set of containers. The higher level of abstraction that container orchestration offers, compared to self-managing single container runtime engines, provides the flexibility we need to fulfill different customer requirements. Management operations like scaling the deployment are much simpler to implement and execute.
The Run-Time for Hybrid and Multi-Cloud Offers True Portability
Portability is a key value of containers. Kubernetes gives us this portability for fleets of containers. We can have the same experience on-premises as well as on different cloud infrastructures. The infrastructure can be switched seamlessly, without any changes for the engineers. In that way we can choose the infrastructure by criteria like price, performance, and capabilities. When running on premises, we can start offering a true hybrid-cloud experience by providing a consistent infrastructure with the same operational and HPC application experience, and seamlessly use on-demand cloud resources when required.
Embracing Kubernetes for the specific requirements of HPC and engineering workloads is not straightforward. But due to the success of Kubernetes and its open and extensible architecture, the ecosystem is opening up for HPC applications, primarily driven by the demand of new AI workloads and HPC containers.
About the Authors
Daniel Gruber, Burak Yenier, and Wolfgang Gentzsch are with UberCloud, a company that started in 2013 by developing HPC container technology and containerized engineering applications to facilitate access to and use of engineering HPC workloads in shared on-premises or on-demand cloud environments. This article is based on a white paper they wrote detailing their experience using UberCloud HPC containers and Kubernetes.