Armada presentation by Jamie Poole at K8s Batch + HPC Day, on Oct 24, 2022 #1564
dans77777 asked this question in News & Events
Title: Running Batch Jobs At Massive Scale On Kubernetes
Description:
Thousands of GPUs. Hundreds of thousands of CPUs. Learn how (and why!) G-Research designed and built Armada, a system to enable massive throughput of batch jobs running on Kubernetes. In this session you’ll hear how we use large-scale batch compute on Kubernetes to spot patterns in financial markets and predict the future. Armada enables us to schedule millions of batch jobs across many clusters and tens of thousands of nodes, getting optimum utilisation of our hardware so that our researchers can run the latest machine-learning and advanced data science techniques across vast datasets. We’ll cover the architecture and approach of Armada, challenges and techniques for running Kubernetes at scale, and some war stories and lessons learned along the way.
https://sched.co/1AsSX