How to leverage Kubernetes for distributed workloads

Machine Learning (ML) is rapidly evolving and becoming essential to businesses and organizations around the world. Kubernetes, the popular container orchestrator, plays an integral part in addressing the challenges that data scientists and ML engineers face when scaling their ML workloads.

In this webinar, Kubernetes expert Itay Ariel will give an overview of the challenges of running distributed workloads for machine learning. He will walk through how to run distributed training on Kubernetes with popular frameworks such as PyTorch, TensorFlow and Spark. We’ll discuss the key advantages Kubernetes offers as a platform for training and deploying ML models, and how it eliminates the infrastructure complexity associated with scaling and managing containerized applications. You will learn how to use Kubernetes for distributed workloads to scale your ML models and automate the management of workload performance.

What you’ll learn:

  • What to consider before adopting Kubernetes for your ML workloads.
  • The main challenges of running distributed workloads on Kubernetes.
  • Real-world examples and use cases with frameworks such as PyTorch.
  • Best practices for optimizing ML workloads on Kubernetes.
  • How to easily use Kubernetes to scale your ML models.
  • How to use cnvrg.io, Kubeflow and Ray to easily run distributed workloads.