AI Oct 22, 2025

Verl: A Flexible and Efficient RL Framework for LLMs

Hongpeng Guo & Ziheng Jiang, ByteDance Seed

Large language models (LLMs) have unlocked new capabilities in language understanding and multimodal tasks, but integrating reinforcement learning (RL) at scale remains a major challenge: existing frameworks lack either the abstractions needed for complex dataflow or the scalability needed for billion-parameter models. verl (https://github.com/volcengine/verl) is an open-source framework for building end-to-end RL pipelines with LLMs. It provides high-level abstractions and optimizations for dataflow orchestration and resource management via a hybrid-controller model: WorkerGroup modules and ResourcePool components distribute computation and resources across GPU clusters, delivering high throughput and strong extensibility.

Since its release, verl has been adopted in both academic research and industry production. It integrates with major training backends (FSDP, FSDP2, Megatron-LM) and inference engines (vLLM, SGLang), supports RL algorithms such as PPO, GRPO, and DAPO, and offers agentic features such as multi-turn dialogue and tool calling. A recent DeepSeek-671B post-training integration further demonstrates verl's scalability to ultra-large models, making it a robust foundation for RL-enhanced LLM systems.
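To make the hybrid-controller idea concrete, here is a minimal, self-contained toy sketch (not verl's actual API) of the pattern the paragraph describes: a ResourcePool tracks GPU slots across nodes, and a WorkerGroup binds one RL role (actor, critic, rollout, etc.) to that pool and shards work across it. The class and method names below only mirror verl's concepts for illustration.

```python
# Illustrative sketch of a hybrid-controller-style resource abstraction.
# NOT verl's real API: ResourcePool/WorkerGroup here are simplified stand-ins.
from dataclasses import dataclass


@dataclass
class ResourcePool:
    """Tracks a fixed set of GPU slots spread over several nodes."""
    gpus_per_node: int
    num_nodes: int

    @property
    def world_size(self) -> int:
        # Total number of GPU slots available to workers in this pool.
        return self.gpus_per_node * self.num_nodes


@dataclass
class WorkerGroup:
    """Binds one RL role (e.g. 'actor' or 'rollout') to a ResourcePool."""
    role: str
    pool: ResourcePool

    def dispatch(self, batch: list) -> list:
        # Shard a batch of prompts across the pool's slots, data-parallel style.
        n = self.pool.world_size
        return [batch[i::n] for i in range(n)]


pool = ResourcePool(gpus_per_node=8, num_nodes=2)  # 16 GPU slots in total
actor = WorkerGroup(role="actor", pool=pool)
shards = actor.dispatch(list(range(32)))           # 32 prompts across 16 slots
print(len(shards), len(shards[0]))                 # 16 2
```

In verl itself, the controller plays this coordinating role centrally while the heavy per-GPU computation runs inside the worker processes; the sketch only shows the sharding bookkeeping, not the actual distributed execution.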