OPEN_SOURCE
REDDIT // 3h ago · INFRASTRUCTURE
VRAM.cpp runs llama.cpp's fit logic in the browser
VRAM.cpp is a browser-based VRAM estimator that runs llama.cpp’s fit logic directly, so users can check whether a specific GGUF will run on their hardware instead of relying on rough calculators. It’s aimed at the exact local-LLM question people keep asking: which quant, on which GPU, with how much host RAM.
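For context, the generic calculators this replaces typically reduce the question to a couple of terms: weight bytes plus KV cache. A minimal sketch of that back-of-envelope math, with all names and constants as illustrative assumptions rather than anything from VRAM.cpp's or llama.cpp's actual code:

```typescript
// Rough back-of-envelope fit check of the kind VRAM.cpp aims to replace.
// All names and constants here are illustrative assumptions, not the
// project's API or llama.cpp's fit algorithm.

interface ModelSpec {
  params: number;        // total parameter count
  bitsPerWeight: number; // e.g. ~4.5 bpw for a Q4_K_M-style quant
  layers: number;
  kvHeads: number;
  headDim: number;
}

function estimateVramGiB(m: ModelSpec, ctx: number): number {
  const GiB = 1024 ** 3;
  // Weight memory: parameters times bits per weight, in bytes.
  const weights = (m.params * m.bitsPerWeight) / 8;
  // KV cache: K and V per layer, fp16 (2 bytes per element).
  const kvCache = 2 * m.layers * m.kvHeads * m.headDim * ctx * 2;
  // Generic calculators stop roughly here; real fits also depend on
  // compute buffers, backend overhead, and per-quant block layouts.
  return (weights + kvCache) / GiB;
}

// Example: an 8B model at ~4.5 bpw with an 8k context.
const est = estimateVramGiB(
  { params: 8e9, bitsPerWeight: 4.5, layers: 32, kvHeads: 8, headDim: 128 },
  8192,
);
console.log(`~${est.toFixed(1)} GiB before runtime overhead`);
```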
// ANALYSIS
This is a smart answer to a real pain point: instead of approximating memory from model size, it reuses the same fitting logic the runtime depends on, which should make estimates much more credible.
- The core advantage is fidelity: as llama.cpp’s fit algorithm evolves, the estimator inherits those improvements without a separate rules engine to keep in sync.
- That makes it more useful than generic VRAM calculators for edge cases like Q3 variants, hybrid GPU+RAM fits, and newer model families.
- The project still admits weak spots in multi-GPU plus host-memory splits and MoE fitting, so the hardest configurations are exactly where users should be most cautious.
- As an open-source browser app, it lowers the friction for quick pre-flight checks before downloading huge GGUFs or running trial fits locally (a rough sketch of such a check follows).
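To illustrate what a hybrid GPU+host-RAM pre-flight check involves, and why it is hard to get right without the runtime's own logic, here is a deliberately naive layer-offload sketch. The function and fields are hypothetical, and greedy placement is only a stand-in for the fit algorithm llama.cpp actually uses:

```typescript
// Hypothetical pre-flight sketch: how many layers fit in VRAM before
// spilling the rest to host RAM. Purely illustrative; the real
// hybrid-fit logic lives in llama.cpp and is what VRAM.cpp reuses.

interface FitPlan {
  gpuLayers: number;
  cpuLayers: number;
  fits: boolean;
}

function planOffload(
  layerBytes: number[],  // per-layer weight size in bytes
  vramBudget: number,    // free VRAM in bytes, minus reserved overhead
  hostBudget: number,    // free host RAM in bytes
): FitPlan {
  let used = 0;
  let gpuLayers = 0;
  // Greedily place layers on the GPU until the budget is exhausted.
  for (const bytes of layerBytes) {
    if (used + bytes > vramBudget) break;
    used += bytes;
    gpuLayers++;
  }
  // Whatever does not fit on the GPU must fit in host RAM instead.
  const spill = layerBytes.slice(gpuLayers).reduce((a, b) => a + b, 0);
  return {
    gpuLayers,
    cpuLayers: layerBytes.length - gpuLayers,
    fits: spill <= hostBudget,
  };
}
```

Even this toy version shows where edge cases creep in: reserved VRAM overhead, uneven layer sizes, multi-GPU splits, and MoE expert placement all break the simple greedy assumption, which is exactly why reusing the runtime's own fit logic is the more credible approach.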
// TAGS
vram-cpp · llama-cpp · llm · gpu · open-source · devtool · inference
DISCOVERED
3h ago
2026-04-27
PUBLISHED
7h ago
2026-04-27
RELEVANCE
8 / 10
AUTHOR
TheAconn96