Talk

GPU Inference in K8s: Acceleration, Sharing and Scaling Without Pain

In Russian
Complexity: For practicing engineers

How do you speed up GPU inference in Kubernetes without going crazy? This talk covers scaling, GPU sharing, faster startup, and choosing shaders, with examples, hacks, and conclusions from real production.
