Everything Everywhere all at Once: vLLM...- Brittany Rockwell & Shireen Kheradpey
Sponsored Session: Everything Everywhere all at Once: vLLM Hardware Optionality with Spotify and Google - Brittany Rockwell, Google & Shireen Kheradpey, Spotify

Tired of being constrained to a single hardware type for your PyTorch pipelines? Join Google and Spotify for a practical tour of how to achieve true hardware optionality across GPUs and TPUs for your inference workloads. We’ll explore real-world, global architectures and demonstrate how PyTorch facilitates advanced cross-accelerator workflows.

This session features Brittany Rockwell, Google’s Product Lead for vLLM, sharing insights from her direct work with its creators and key industry partners, alongside Shireen Kheradpey, Senior Machine Learning Infrastructure Engineer at Spotify, offering an exclusive look into Spotify’s globally scaled, PyTorch-powered ML platform. They will share practical strategies for low-latency serving, kernel optimizations, performance tuning, and robust deployment patterns using GPUs and TPUs, empowering you to achieve true hardware optionality for your workloads.