Cohere Labs drops 2B ASR model
REDDIT // 16d ago // MODEL RELEASE

Cohere Labs released Cohere Transcribe, an open-source 2B-parameter speech-to-text model on Hugging Face that supports 14 languages. It uses a Conformer encoder and Transformer decoder, and it is aimed at offline transcription and production serving via transformers or vLLM.

// ANALYSIS

My read: Cohere is treating ASR like serious infrastructure, not a demo feature. The release is strong on accuracy and deployment ergonomics, but it still leaves common transcript workflows to external tooling.

  • Cohere claims best-in-class transcription accuracy and up to 3x faster inference, measured by real-time factor, than other dedicated ASR models in the same size range.
  • The model supports 14 languages, but it does not auto-detect language and performs poorly on code-switched audio.
  • It also ships without timestamps or speaker diarization, so meeting and call analytics stacks will still need extra layers.
  • The repo is Apache 2.0 licensed, but Hugging Face adds a small friction point: users must agree to share contact information before downloading.
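
For readers unfamiliar with the real-time factor mentioned in the speed claim, here is a minimal sketch of how the metric is usually computed: processing time divided by audio duration, so lower is faster. The function names and the timing numbers are hypothetical and chosen only for illustration; they are not Cohere's benchmark figures or methodology.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return RTF = processing time / audio duration.

    Values below 1.0 mean the model transcribes faster than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if processing_seconds < 0:
        raise ValueError("processing_seconds must be non-negative")
    return processing_seconds / audio_seconds


def speedup(rtf_baseline: float, rtf_candidate: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if rtf_candidate <= 0:
        raise ValueError("rtf_candidate must be positive")
    return rtf_baseline / rtf_candidate


# Hypothetical timings for one 60-second clip (illustrative only).
baseline = real_time_factor(processing_seconds=6.0, audio_seconds=60.0)
candidate = real_time_factor(processing_seconds=2.0, audio_seconds=60.0)

print(f"baseline RTF:  {baseline:.3f}")
print(f"candidate RTF: {candidate:.3f}")
print(f"speedup:       {speedup(baseline, candidate):.1f}x")
```

With these made-up timings the candidate comes out roughly 3x faster. A claim like "up to 3x" typically reflects the best case across hardware and batch settings, so it is worth measuring on your own audio and serving setup.
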
// TAGS
cohere-transcribe · speech · open-source · inference · benchmark

DISCOVERED

16d ago

2026-03-26

PUBLISHED

16d ago

2026-03-26

RELEVANCE

9/10

AUTHOR

LinkSea8324