Numina-Lean-Agent solves Putnam 2025, goes open source
Project Numina's open-source mathematical agent achieved a perfect score on the Putnam 2025 competition using machine-verified Lean 4 proofs. The system bridges the gap between LLM reasoning and formal verification by using Claude Code and the Model Context Protocol (MCP) to interact autonomously with the Lean proof assistant.
Numina-Lean-Agent marks a shift from fine-tuning specialized math models toward optimizing agentic reasoning on general-purpose foundation models. It scored a perfect 12/12 on Putnam 2025, matching state-of-the-art closed-source systems such as AxiomProver. Built on Claude Code with Claude Opus 4.5, it is evidence that general-purpose coding agents can outperform specialized math models when given the right tools. Specialized MCP servers give the agent real-time compiler feedback and semantic search across the Lean Mathlib library. The agent also formalized the Brascamp-Lieb theorem in more than 8,000 lines of code, and its open-source release makes high-end theorem-proving tools, previously restricted to proprietary systems, broadly available.
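To make the compiler-feedback idea concrete, here is a minimal sketch of a propose-check-repair loop. It is not Numina's implementation: the names `prove_with_feedback`, `CheckResult`, `toy_propose`, and `toy_check` are hypothetical, and the toy functions stand in for the language model and the Lean compiler so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CheckResult:
    """Verdict from a proof checker (hypothetical shape, for illustration)."""
    ok: bool
    errors: List[str]


def prove_with_feedback(
    propose: Callable[[str, List[str]], str],
    check: Callable[[str], CheckResult],
    statement: str,
    max_rounds: int = 5,
) -> Optional[str]:
    """Return the first candidate proof the checker accepts, or None."""
    errors: List[str] = []
    for _ in range(max_rounds):
        candidate = propose(statement, errors)  # draft or repair a proof
        result = check(candidate)               # checker verdict on the draft
        if result.ok:
            return candidate
        errors = result.errors                  # diagnostics drive the next round
    return None


# Toy stand-ins (assumptions): a real system would call an LLM and Lean here.
def toy_propose(statement: str, errors: List[str]) -> str:
    # First attempt is a placeholder; after seeing errors, give a real proof term.
    return "by sorry" if not errors else "by exact Nat.add_comm a b"


def toy_check(candidate: str) -> CheckResult:
    # Reject any draft that still contains a `sorry` placeholder.
    if "sorry" in candidate:
        return CheckResult(False, ["declaration uses 'sorry'"])
    return CheckResult(True, [])


proof = prove_with_feedback(toy_propose, toy_check, "a + b = b + a")
print(proof)
```

In this toy run the first draft is rejected and the second is accepted. The point of the design is that only a checker-accepted candidate is ever returned, so the model's output is never trusted on its own.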
Discovered: 2026-04-12
Published: 2026-04-12
Author: AI Search