Subquadratic releases SubQ 1.1 Small
Subquadratic has released SubQ 1.1 Small, a subquadratic sparse attention model for which the company claims near-perfect retrieval at context lengths up to 12 million tokens. At a 1-million-token context, the model reportedly requires 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2 while maintaining strong reasoning capabilities.
Subquadratic's SSA architecture suggests that moving from quadratic to subquadratic attention need not compromise reasoning capabilities: the model combines near-perfect retrieval at 12M tokens with strong GPQA and coding performance.
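To give a feel for where a figure like 64.5x could come from, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that each query attends to a fixed budget of `k` keys instead of all `n`. The release does not describe SSA at this level of detail, so the function names, the head dimension `d`, and the fixed-budget model are all hypothetical, and routing overhead and non-attention layers are ignored.

```python
# Illustrative cost model only. Nothing here is taken from Subquadratic's
# implementation; the fixed per-query key budget is an assumption.

def dense_attention_flops(n: int, d: int) -> int:
    """Approximate FLOPs for dense attention over n tokens with head dim d.

    QK^T costs about 2*n*n*d and the weighted sum over V costs another
    2*n*n*d, so the total grows quadratically in n.
    """
    return 4 * n * n * d


def sparse_attention_flops(n: int, k: int, d: int) -> int:
    """Approximate FLOPs when each query attends to at most k keys.

    With k fixed, the total grows linearly in n.
    """
    return 4 * n * min(k, n) * d


def implied_key_budget(n: int, reduction: float) -> float:
    """Per-query key budget that would yield the given compute reduction."""
    return n / reduction


n = 1_000_000  # context length quoted in the article
k = round(implied_key_budget(n, 64.5))  # hypothetical budget implied by 64.5x
d = 128  # hypothetical head dimension; it cancels out of the ratio

ratio = dense_attention_flops(n, d) / sparse_attention_flops(n, k, d)
print(f"implied keys per query: {k}, compute reduction: {ratio:.1f}x")
```

Under these assumptions the reduction factor is simply n / k, so a 64.5x saving at 1M tokens would correspond to roughly 15,500 attended keys per query. The model's actual accounting may differ.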
- Extreme compute efficiency: Requires 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2 at 1M tokens.
- Content-routing generalization: Generalizes successfully to 12M tokens despite being predominantly trained at 1M tokens, thanks to position-independent routing.
- High benchmark baseline: Scores 89.7% pass@4 on LiveCodeBench v6 and 85.4% on GPQA Diamond, putting it on par with mid-tier frontier models.
- Credibility validation: Benchmarks are third-party verified by Appen, reducing skepticism around the startup's performance claims.
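For readers unfamiliar with the pass@4 metric, the sketch below shows the unbiased pass@k estimator commonly used for code-generation benchmarks: given `n` samples per problem of which `c` pass the tests, it estimates the probability that at least one of `k` randomly drawn samples passes. Whether the LiveCodeBench v6 score above was computed exactly this way is an assumption, since the release does not say, and the sample counts in the example are made up.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed, k: samples drawn.
    Computes 1 - C(n - c, k) / C(n, k), the probability that a random
    size-k subset of the n samples contains at least one passing sample.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples exist, so any draw includes a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical results for three problems: (samples, passing samples).
example = [(10, 5), (10, 0), (10, 10)]
print(f"pass@4 = {mean_pass_at_k(example, 4):.3f}")
```

A problem with no passing samples contributes 0 and a problem where every sample passes contributes 1, so the benchmark score is the mean of these per-problem probabilities.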
PUBLISHED
2026-06-16
AUTHOR
subquadratic