NVIDIA NIM drops free DeepSeek V4 endpoints
NVIDIA Inference Microservices (NIM) has integrated DeepSeek’s latest V4 models into its API Catalog, providing developers with free, high-performance prototyping access. The offering includes DeepSeek-V4 Pro (1.6T MoE) and DeepSeek-V4 Flash, both optimized for CUDA acceleration to deliver industry-leading throughput and latency through OpenAI-compatible endpoints.
By subsidizing access to frontier-class open weights, NVIDIA is positioning NIM as the default inference layer for AI developers. NIM-side optimizations push the 1.6T-parameter Pro model, with its 1M-token context window, to nearly 4,000 tokens per second on reasoning tasks. Because the endpoints are free, high-performance, and drop-in compatible with existing agentic frameworks, the offering draws developers into the NVIDIA AI Enterprise ecosystem early.
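Since the endpoints are OpenAI-compatible, calling them is a standard chat-completions POST against NVIDIA's API gateway. A minimal sketch follows, using only the Python standard library; the base URL matches NVIDIA's existing `integrate.api.nvidia.com` gateway, but the model id `deepseek-ai/deepseek-v4` is an assumption inferred from the catalog's naming convention, not a confirmed identifier.

```python
# Sketch of an OpenAI-compatible chat request to a NIM endpoint.
# Assumptions: the model id below is hypothetical; an NVIDIA_API_KEY
# environment variable holds a valid API Catalog key.
import json
import os
import urllib.request

NIM_BASE = "https://integrate.api.nvidia.com/v1"
MODEL = "deepseek-ai/deepseek-v4"  # hypothetical catalog id


def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style chat.completions payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def send(payload: dict) -> dict:
    """POST the payload to the NIM gateway and return the parsed JSON reply."""
    req = urllib.request.Request(
        f"{NIM_BASE}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_chat_request("Summarize MoE routing in two sentences.")
print(payload["model"])
```

Because the request shape is the standard OpenAI schema, existing agentic frameworks only need the base URL and key swapped to target NIM; no client-side code changes are required.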
DISCOVERED: 2026-04-26
PUBLISHED: 2026-04-26
AUTHOR: AICodeKing