vLLM adds Cohere-Transcribe for efficient ASR
REDDIT · 17d ago · OPEN-SOURCE RELEASE

vLLM has integrated Cohere's new cohere-transcribe-03-2026 model, providing native support for high-throughput speech-to-text. Because the integration passes variable-length inputs to the encoder, it avoids the padding overhead of fixed-length inputs and spends compute only on real audio.
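
To make the padding point concrete, here is a small arithmetic sketch. It is illustrative only: it assumes a fixed 30-second window (the window size Whisper pads to), and the clip durations are made-up values, not measurements from vLLM or from Cohere's model.

```python
import math

# Assumed fixed window length in seconds (the size Whisper pads to).
WINDOW_SECONDS = 30.0


def padded_seconds(durations, window=WINDOW_SECONDS):
    """Seconds of encoder input when each clip is padded up to whole windows."""
    total = 0.0
    for d in durations:
        # Every clip occupies at least one window, rounded up to whole windows.
        windows = max(1, math.ceil(d / window))
        total += windows * window
    return total


def padding_overhead(durations, window=WINDOW_SECONDS):
    """Fraction of the padded encoder input that is padding rather than audio."""
    real = sum(durations)
    padded = padded_seconds(durations, window)
    return (padded - real) / padded


# Hypothetical clip lengths in seconds.
clips = [4.0, 12.5, 7.0, 28.0]

print(f"real audio:   {sum(clips):.1f} s")
print(f"padded input: {padded_seconds(clips):.1f} s")
print(f"overhead:     {padding_overhead(clips):.1%}")
```

With variable-length inputs the encoder would, in this simplified picture, process only the real audio, so the overhead term drops out. Actual savings depend on batching and the clip-length distribution.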

// ANALYSIS

Cohere's move into ASR via vLLM challenges Whisper's dominance: the integration is built around variable-length encoder inputs rather than the fixed-length padding that Whisper-style models rely on. Exposing the model through the /v1/audio/transcriptions API lets developers serve both LLMs and ASR from a single engine, and native CohereAsrForConditionalGeneration support makes it a credible open-weights alternative to proprietary transcription APIs. vLLM's test suite also includes standardized English text normalizers, which suggests the integration is being tested with production and enterprise deployments in mind.
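
As a rough sketch of what a call to the transcriptions endpoint might look like, the snippet below builds (but does not send) a multipart POST using only the Python standard library. The `file` and `model` form fields follow the OpenAI-style transcription API that this endpoint mirrors. The base URL `http://localhost:8000`, the file name `sample.wav`, and the use of `cohere-transcribe-03-2026` as the served model name are assumptions for illustration; the helper `build_transcription_request` is hypothetical and not part of vLLM.

```python
import io
import uuid
import urllib.request


def build_transcription_request(base_url, model, filename, audio_bytes):
    """Build, without sending, a multipart POST to /v1/audio/transcriptions."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    # Plain form field carrying the model name.
    body.write(f"--{boundary}\r\n".encode())
    body.write(b'Content-Disposition: form-data; name="model"\r\n\r\n')
    body.write(f"{model}\r\n".encode())

    # File field carrying the raw audio bytes.
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        (
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
    )
    body.write(audio_bytes)
    body.write(b"\r\n")

    # Closing boundary.
    body.write(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body.getvalue(),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Assumed local server address and served model name; placeholder audio bytes.
req = build_transcription_request(
    "http://localhost:8000",
    "cohere-transcribe-03-2026",
    "sample.wav",
    b"\x00\x00",
)
print(req.get_method(), req.full_url)

# To actually send it against a running server (not executed here):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Building the request separately from sending it keeps the example runnable without a server; in practice an OpenAI-compatible client library would usually replace the hand-built multipart body.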

// TAGS
vllm · cohere · speech · open-source · inference · api · audio-gen

DISCOVERED: 2026-03-26 (17d ago)
PUBLISHED: 2026-03-26 (17d ago)
RELEVANCE: 8/10
AUTHOR: LinkSea8324