CONCUR: High-Throughput Agentic Batch Inference of LLM via Congestion-Based Concurrency Control

Published in International Conference on Machine Learning (ICML), 2026

CONCUR improves the throughput of agentic LLM batch inference by using congestion-based concurrency control to manage KV-cache pressure.

Recommended citation: Qiaoling Chen, Zhisheng Ye, Tian Tang, Peng Sun, Boyu Tian, Guoteng Wang, Shenggui Li, Yonggang Wen, Zhenhua Han, and Tianwei Zhang. "CONCUR: High-Throughput Agentic Batch Inference of LLM via Congestion-Based Concurrency Control." International Conference on Machine Learning (ICML), 2026.