Numina-Lean-Agent solves Putnam 2025, drops open-source

// 109d ago OPENSOURCE RELEASE

Project Numina's open-source mathematical agent achieved a perfect score on the Putnam 2025 competition using machine-verified Lean 4 proofs. The system bridges the gap between LLM reasoning and formal verification by leveraging Claude Code and the Model Context Protocol to autonomously interact with the Lean proof assistant.

// ANALYSIS

Numina-Lean-Agent marks a shift from fine-tuning specialized math models to optimizing agentic reasoning on general-purpose foundation models. The system scored a perfect 12/12 on Putnam 2025, matching state-of-the-art closed-source systems such as AxiomProver. Built on Claude Code with Claude Opus 4.5, it indicates that general-purpose coding agents can outperform specialized math models when given the right tools. It relies on specialized MCP servers for real-time compiler feedback and for semantic search across the Lean Mathlib library. The agent also formalized the Brascamp-Lieb theorem in over 8,000 lines of code, and its open-source release makes high-end theorem-proving tooling, previously restricted to proprietary systems, broadly available.
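To make the compiler-feedback idea concrete, here is a minimal Python sketch of a propose, check, repair loop. It is an illustration under stated assumptions, not Numina-Lean-Agent's actual code: `prove_with_feedback`, `CheckResult`, `toy_propose`, and `toy_check` are hypothetical names, and the toy functions stand in for the LLM and the Lean checker so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CheckResult:
    ok: bool            # True when the checker accepted the candidate proof
    errors: List[str]   # checker diagnostics; empty on success


def prove_with_feedback(
    statement: str,
    propose: Callable[[str, List[str]], str],
    check: Callable[[str], CheckResult],
    max_attempts: int = 5,
) -> Optional[str]:
    """Return the first candidate the checker accepts, or None if none is found."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        # The proposer sees the statement plus the latest checker errors.
        candidate = propose(statement, feedback)
        result = check(candidate)
        if result.ok:
            return candidate
        feedback = result.errors
    return None


# Hypothetical stand-ins: no real LLM or Lean process is involved.
def toy_propose(statement: str, feedback: List[str]) -> str:
    # Try one tactic first, then switch once an error has been reported.
    return "by simp" if feedback else "by rfl"


def toy_check(candidate: str) -> CheckResult:
    # Pretend only "by simp" closes the goal.
    if candidate == "by simp":
        return CheckResult(ok=True, errors=[])
    return CheckResult(ok=False, errors=["tactic failed"])


if __name__ == "__main__":
    print(prove_with_feedback("a + 0 = a", toy_propose, toy_check))
```

In a real setup, `check` would presumably forward the candidate to the Lean toolchain through an MCP server and return its diagnostics, and `propose` would call the model; both of those wirings are assumptions here.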

// TAGS

numina-lean-agent, lean-4, reasoning, agent, mcp, open-source, llm

DISCOVERED

109d ago

2026-04-12

PUBLISHED

109d ago

2026-04-12

RELEVANCE

9/10

AUTHOR

AI Search
