Subquadratic releases SubQ 1.1 Small

// 46d ago · MODEL RELEASE

Subquadratic has released SubQ 1.1 Small, a model built on subquadratic sparse attention that the company says achieves near-perfect retrieval at contexts of up to 12 million tokens. At a 1-million-token context, the model reportedly requires 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2 while maintaining strong reasoning performance.
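To put the 64.5x compute figure in perspective, here is a rough back-of-the-envelope sketch. It assumes a toy cost model in which dense attention costs about 4·n²·d FLOPs per head, and a sparse variant in which each query attends to a fixed budget of k keys costs about 4·n·k·d. The function names and the fixed-budget model are illustrative assumptions, not a description of how SubQ 1.1 Small actually works, and the sketch ignores any routing or selection overhead.

```python
def dense_attention_flops(n_tokens: int, head_dim: int) -> int:
    """Approximate FLOPs for one dense attention head.

    QK^T and the weighted sum over V each cost about 2 * n * n * d FLOPs.
    """
    return 4 * n_tokens * n_tokens * head_dim


def sparse_attention_flops(n_tokens: int, keys_per_query: int, head_dim: int) -> int:
    """Approximate FLOPs when every query attends to a fixed number of keys.

    ASSUMPTION: a fixed per-query key budget; selection cost is ignored.
    """
    return 4 * n_tokens * keys_per_query * head_dim


def implied_keys_per_query(n_tokens: int, reduction: float) -> float:
    """Key budget per query that would yield the given compute reduction.

    In this toy model, dense / sparse = n / k, so k = n / reduction.
    """
    return n_tokens / reduction


if __name__ == "__main__":
    n = 1_000_000          # context length from the announcement
    reported = 64.5        # reported compute reduction at that length
    k = implied_keys_per_query(n, reported)
    print(f"implied key budget per query: {k:,.0f}")
```

Under these assumptions, a 64.5x reduction at 1M tokens would correspond to each query attending to roughly 15,500 keys. The real figure depends on architectural details the announcement does not give.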

// ANALYSIS

Subquadratic's SSA architecture suggests that moving from quadratic to subquadratic attention need not compromise reasoning: the model pairs near-perfect 12M-token retrieval with strong GPQA and coding scores.

– Extreme compute efficiency: At 1M tokens, reportedly requires 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2.
– Content-routing generalization: Generalizes to 12M tokens despite being trained predominantly at 1M tokens, thanks to position-independent routing.
– High benchmark baseline: Scores 89.7% pass@4 on LiveCodeBench v6 and 85.4% on GPQA Diamond, putting it on par with mid-tier frontier models.
– Credibility validation: Benchmarks are third-party verified by Appen, which should reduce skepticism around the startup's performance claims.
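For readers unfamiliar with the pass@4 metric cited above, the sketch below shows the commonly used unbiased pass@k estimator: given n sampled solutions per problem, of which c pass the tests, it estimates the probability that at least one of k randomly drawn samples passes. The announcement does not say how the 89.7% figure was computed, so treating it as this estimator is an assumption, and the function name is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k for a single problem.

    n: total samples generated, c: samples that passed, k: draw size.
    Computes 1 - C(n - c, k) / C(n, k), i.e. one minus the probability
    that all k drawn samples are failures.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any draw of k includes a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical example: 10 samples, 3 correct, scored at k = 4.
    print(f"{pass_at_k(10, 3, 4):.4f}")
```

A benchmark-level score is then typically the mean of this per-problem value across all problems.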

// TAGS

subq, subquadratic, ssa, llm, long-context, model-release

DISCOVERED: 46d ago (2026-06-16)

PUBLISHED: 46d ago (2026-06-16)

RELEVANCE: 8/10

AUTHOR: subquadratic
