Nemotron REAP cut hits AIME 90%+

// 45d ago MODEL RELEASE

Max-and-Omnis released a REAP-pruned math variant of NVIDIA Nemotron-3-Super, shrinking the 120B latent-MoE model to 64B total parameters while keeping 12B active parameters. The AWQ and FP8 builds reportedly exceed 90% avg@4 on AIME 2026 and fit on a single high-end GPU such as an H100 or RTX PRO 6000 Blackwell.
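The single-GPU claim can be sanity-checked with back-of-envelope arithmetic. The sketch below is an estimate only, not the authors' measurement. It assumes FP8 stores 8 bits per weight and the AWQ build stores 4 bits per weight (the article does not state the AWQ bit width), it assumes 80 GB and 96 GB cards, and it ignores KV cache, activations, and quantization metadata. All function and variable names are illustrative.

```python
# Rough weight-only footprint for the pruned model.
# Ignores KV cache, activations, and quantization scales/zero-points.

TOTAL_PARAMS = 64e9  # from the article: 64B total parameters after pruning

# ASSUMPTION: storage width per weight; the article does not give these.
BITS_PER_PARAM = {"fp8": 8, "awq_int4": 4}

# ASSUMPTION: memory capacity in GB; check the spec sheet for your exact card.
GPU_MEMORY_GB = {"H100 80GB": 80, "RTX PRO 6000 Blackwell": 96}


def weight_footprint_gb(total_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return total_params * bits_per_param / 8 / 1e9


def headroom_gb() -> dict:
    """Memory left after loading weights, per (format, GPU) pair."""
    table = {}
    for fmt, bits in BITS_PER_PARAM.items():
        weights = weight_footprint_gb(TOTAL_PARAMS, bits)
        for gpu, capacity in GPU_MEMORY_GB.items():
            table[(fmt, gpu)] = capacity - weights
    return table


if __name__ == "__main__":
    for (fmt, gpu), left in headroom_gb().items():
        print(f"{fmt:>9} on {gpu}: about {left:.0f} GB left for KV cache and activations")
```

Under these assumptions the weights alone leave some room on either card, but the remaining headroom also has to cover the KV cache for long reasoning traces, so the usable context length and batch size will depend on the build chosen.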

// ANALYSIS

This is a serious local-inference experiment, but the headline number should be read as a community benchmark until broader evals and reproduction land.

– REAP pruning from 512 to 256 experts is the real story: it cuts deployment weight without giving up the sparse-MoE active-parameter profile.
– FP8 beats AWQ on quality but takes a roughly 40% throughput hit, making this a practical quality-vs-latency choice for math workloads.
– The included vLLM patch matters because expert routing edge cases still break real-world serving paths for unusual MoE shapes.
– Fine-tuning on about 270 AIMO3 and AstralMath problems means the AIME result is impressive, but narrow and potentially sensitive to prompt placement.
– Single-GPU 90%+ AIME-class math performance is exactly the kind of open-weights pressure that makes smaller, specialized reasoning models worth watching.
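For readers reproducing the headline number, it helps to be explicit about what avg@4 measures. The sketch below assumes the common reading of avg@k: sample k answers per problem, score each one, and average the per-problem accuracy. It is not taken from the release's evaluation harness, the function names are illustrative, and the sample data is made up purely to show how avg@4 differs from an "any of 4 correct" score.

```python
from statistics import mean


def avg_at_k(results: list[list[bool]]) -> float:
    """Mean per-problem accuracy; results[i][j] is True if sample j solved problem i."""
    if not results or any(len(samples) == 0 for samples in results):
        raise ValueError("need at least one problem and one sample per problem")
    return mean(sum(samples) / len(samples) for samples in results)


def any_correct_rate(results: list[list[bool]]) -> float:
    """Fraction of problems solved by at least one sample (a more forgiving metric)."""
    if not results:
        raise ValueError("need at least one problem")
    return mean(1.0 if any(samples) else 0.0 for samples in results)


if __name__ == "__main__":
    # HYPOTHETICAL data: 4 problems, 4 samples each. Not from the release.
    sample_results = [
        [True, True, True, True],
        [True, True, True, False],
        [True, False, False, False],
        [False, False, False, False],
    ]
    print("avg@4:", avg_at_k(sample_results))
    print("any-of-4:", any_correct_rate(sample_results))
```

Because avg@4 counts every sample, it penalizes inconsistent answers that an any-of-4 score would hide. When comparing a reproduction against the reported figure, it is worth confirming that the metric, the sampling settings, and the prompt format all match.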

// TAGS

nemotron-3-super-64b-math-reap, llm, reasoning, fine-tuning, inference, gpu, open-weights, benchmark

DISCOVERED

45d ago

2026-04-22

PUBLISHED

45d ago

2026-04-22

RELEVANCE

9/10

AUTHOR

max6296
