Together AI drops Aurora for adaptive speculative training
REDDIT // 11d ago // OPEN-SOURCE RELEASE

Together AI's Aurora is an open-source reinforcement learning framework that keeps updating the draft models used for speculative decoding while they serve live traffic. Its "serve-to-train" flywheel addresses the "stale model" problem, in which a draft model trained offline falls out of step with the traffic it actually sees, and delivers speedups of up to 1.5x on frontier models without expensive offline pretraining or distillation pipelines.
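
The serve-to-train idea can be pictured with a toy loop. The sketch below is not Aurora's code or API: `BigramDraft`, `DraftSlot`, `serve`, and `train` are hypothetical names, the "draft model" is a simple lookup table rather than a neural network, the update is plain counting rather than reinforcement learning, and the token stream stands in for what the target model would emit. It only shows the shape of the loop: serve with the current draft, log what the target actually produced, refit the draft on that log, and swap it in.

```python
from collections import Counter, defaultdict


class BigramDraft:
    """Toy draft model: guesses the next token from the previous one."""

    def __init__(self, table=None):
        self.table = dict(table or {})  # prev_token -> predicted next token

    def propose(self, prev_token):
        return self.table.get(prev_token)  # None means "no guess"


class DraftSlot:
    """Holds whichever draft model is currently used for serving."""

    def __init__(self, model):
        self._model = model

    def current(self):
        return self._model

    def publish(self, model):
        # Rebinding one reference stands in for a weight hot-swap.
        self._model = model


def serve(slot, stream, log):
    """Serve one token stream and return the fraction of accepted guesses."""
    draft = slot.current()
    accepted = 0
    for prev, actual in zip(stream, stream[1:]):
        if draft.propose(prev) == actual:
            accepted += 1
        log.append((prev, actual))  # every outcome becomes training data
    return accepted / (len(stream) - 1)


def train(log):
    """Fit a new draft on logged traffic: most frequent successor per token."""
    counts = defaultdict(Counter)
    for prev, actual in log:
        counts[prev][actual] += 1
    return BigramDraft({p: c.most_common(1)[0][0] for p, c in counts.items()})


traffic = "the cat sat on the mat and the cat sat on the rug".split()
slot = DraftSlot(BigramDraft())    # untrained starting point
log = []
before = serve(slot, traffic, log)
slot.publish(train(log))           # swap in the refreshed draft
after = serve(slot, traffic, log)
print(f"acceptance before: {before:.2f}, after: {after:.2f}")
```

In this toy the acceptance rate rises once the draft has seen the traffic it is serving, which is the effect the flywheel is meant to produce at scale.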

// ANALYSIS

Aurora is a major step toward self-improving inference stacks that adapt to real-world traffic patterns in real time. It points to a shift away from the static "train-then-serve" approach to speculative decoding.
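
Why adapting to traffic matters can be seen from the standard back-of-the-envelope estimate for speculative decoding. The sketch below assumes each drafted token is accepted independently with probability `alpha`, that `k` tokens are drafted per step, and that one draft step costs `draft_cost` times a target step. The function names and the sample values are illustrative assumptions, not Aurora measurements.

```python
def expected_tokens_per_step(alpha: float, k: int) -> float:
    """Expected tokens emitted per target-model pass when k tokens are drafted."""
    if alpha >= 1.0:
        return float(k + 1)  # every draft accepted, plus the target's own token
    return (1.0 - alpha ** (k + 1)) / (1.0 - alpha)


def estimated_speedup(alpha: float, k: int, draft_cost: float) -> float:
    """Estimated speedup over plain decoding.

    draft_cost is the time of one draft step divided by one target step.
    """
    return expected_tokens_per_step(alpha, k) / (k * draft_cost + 1.0)


# Illustrative values only: a lower acceptance rate yields a smaller speedup.
for alpha in (0.3, 0.5, 0.7):
    print(alpha, round(estimated_speedup(alpha, k=4, draft_cost=0.05), 2))
```

Under these assumptions the speedup depends directly on the acceptance rate, so a draft model that falls out of step with live traffic gives up part of its gain.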

  • Decouples training and inference servers to allow "hot-swapping" model weights without service downtime
  • Achieves speedups of 1.25x to 1.5x even when starting from an untrained "Day-0" speculator
  • Uses a novel Tree Attention mechanism to process both accepted and rejected token branches in a single efficient pass
  • Built on SGLang and open-sourced, providing a blueprint for the next generation of adaptive inference
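
The summary does not describe how Aurora's Tree Attention works internally. As a rough illustration of the general idea used in tree-based speculative decoding, the sketch below builds an attention mask in which each drafted token sees only itself and its ancestors, so competing branches can share one forward pass without seeing each other. The function name and the example tree are hypothetical.

```python
def tree_attention_mask(parents):
    """Build a boolean attention mask for a token tree.

    parents[i] is the index of node i's parent, or -1 for a root.
    mask[i][j] is True when node i may attend to node j, that is,
    when j is node i itself or one of its ancestors.
    """
    n = len(parents)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        j = i
        while j != -1:  # walk from the node up to its root
            mask[i][j] = True
            j = parents[j]
    return mask


# Hypothetical draft tree: node 0 is the root, nodes 1 and 2 are competing
# continuations of node 0, and node 3 extends node 1.
parents = [-1, 0, 0, 1]
mask = tree_attention_mask(parents)
for row in mask:
    print("".join("1" if allowed else "." for allowed in row))
```

Nodes 1 and 2 sit on different branches, so neither attends to the other, while node 3 attends to nodes 0 and 1 along its own path.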

// TAGS

aurora, together-ai, llm, inference, open-source, research, sglang

DISCOVERED: 11d ago (2026-04-01)
PUBLISHED: 11d ago (2026-03-31)
RELEVANCE: 9/10
AUTHOR: incarnadine72