Tenstorrent TT-QuietBox 2 specs, 128GB VRAM

// INFRASTRUCTURE · 90d ago

Tenstorrent’s QuietBox 2 spec sheet describes a liquid-cooled desktop AI workstation built around a Ryzen 7 9700X, 256GB of DDR5, and two Blackhole cards for 128GB of accelerator memory and 480 Tensix cores total. The company is pitching it as a local-inference box for large models, and the documentation is still marked as a draft.

// ANALYSIS

This is a serious local-AI hardware play, but the raw specs are only part of the story. Tenstorrent’s upside is an open software stack in desktop-friendly packaging; the downside is that it still has to prove its software breadth can match its hardware ambition.

– Two Blackhole cards and 128GB of accelerator memory put it squarely in self-hosted LLM territory, especially for the larger models Tenstorrent already lists, such as GPT-OSS-120B, Llama 3.3 70B, Qwen3-32B, Qwen3-VL-32B-Instruct, and QwQ-32B.
– The 1.5kW power target is the key product decision: this is meant to live on a desk or in a home office, not to demand datacenter infrastructure.
– Tenstorrent’s supported-models page already covers the Qwen3 and QwQ families, but there’s no obvious support yet for Qwen 3.6 or MiniMax, so the platform still has model-coverage gaps.
– Against Nvidia, the pitch is openness and local control; more broadly, the company still needs to show that specialized hardware plus open tooling can beat CUDA’s ecosystem gravity.
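
To make the 128GB figure concrete, here is a rough back-of-the-envelope sketch of how much memory the weights of some of the listed models would need. It is not a Tenstorrent tool or benchmark. The parameter counts are simply read off the model names, and the bytes-per-parameter figures are generic assumptions, not documented Tenstorrent formats. KV cache, activations, and runtime overhead are ignored, so real requirements would be higher.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumptions (not from the spec sheet): parameter counts are read off the
# model names, and the bytes-per-parameter values are generic precisions.
# KV cache, activations, and runtime overhead are ignored.

ACCELERATOR_MEMORY_GB = 128  # two Blackhole cards, per the spec sheet

# Nominal parameter counts in billions, inferred from the model names.
MODELS_B_PARAMS = {
    "GPT-OSS-120B": 120,
    "Llama 3.3 70B": 70,
    "Qwen3-32B": 32,
    "QwQ-32B": 32,
}

# Hypothetical storage formats and their size per parameter, in bytes.
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_gb(params_billions: float, bytes_per_param: float) -> float:
    """Return approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billions * bytes_per_param


def fits(params_billions: float, bytes_per_param: float,
         budget_gb: float = ACCELERATOR_MEMORY_GB) -> bool:
    """True if the weights alone fit within the memory budget."""
    return weight_gb(params_billions, bytes_per_param) <= budget_gb


if __name__ == "__main__":
    for name, params in MODELS_B_PARAMS.items():
        for fmt, nbytes in BYTES_PER_PARAM.items():
            size = weight_gb(params, nbytes)
            verdict = "fits" if fits(params, nbytes) else "does not fit"
            print(f"{name:>14} @ {fmt:>6}: {size:6.1f} GB -> {verdict}")
```

Under these assumptions, the 120B and 70B models exceed 128GB at 16-bit precision and would only fit in a lower-precision format, while the 32B models leave headroom even at 16-bit.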

// TAGS

tt-quietbox-2, inference, llm, self-hosted, open-source

DISCOVERED

90d ago

2026-04-30

PUBLISHED

90d ago

2026-04-30

RELEVANCE

8/10

AUTHOR

pulse77
