In this webinar, we’re joined by Eri Rubin, VP of Research and Development at DeepCube (a cnvrg.io customer), and Adam Tetelman, Deep Learning Solutions Architect at NVIDIA, to discuss how to optimize multi-node, multi-GPU distributed training for maximum performance.
Distributed deep learning can be complex, with many factors contributing to the overall success of a deployment. Training at scale requires a well-designed data center with proper storage, networking, compute, and software design. In this webinar, we will hear from industry experts in distributed deep learning training and go over best practices for building dynamic distributed training clusters using containers, PyTorch software tips for distributed training, and strategies for data center design and workload management to maximize NVIDIA GPU utilization. Combined with the cnvrg.io software platform, these best practices for deep learning software and hardware will help individual training jobs run faster while delivering a higher data center ROI and boosting cluster utilization.
We’ll follow with a live Megatron-LM example of using PyTorch in cnvrg.io. Along with DeepCube, NVIDIA, and cnvrg.io CEO Yochay Ettun, we will share performance optimization tips for: