OPEN_SOURCE
REDDIT · 26d ago · INFRASTRUCTURE
Dual RTX 3090s top 5080 for local coding
LocalLLaMA community consensus favors returning a new RTX 5080 and buying two used RTX 3090s instead, maximizing VRAM for private coding models. On a $1,300 budget, the 48GB VRAM pool from the two older cards enables significantly more capable models than a single 16GB modern GPU can load.
// ANALYSIS
VRAM capacity is the primary bottleneck for local coding; inference speed is secondary to whether the "smartest" models can actually load.
- Returning a 16GB RTX 5080 for two used 24GB RTX 3090s triples available memory (16GB to 48GB), allowing 70B-parameter models to run quantized without CPU offloading (see the VRAM sketch after this list).
- Current "gold standard" coding models like Qwen2.5-Coder-32B need 20GB+ even at 4-bit quantization, making 16GB cards effectively obsolete for high-end local development.
- While the RTX 5080's GDDR7 and Blackwell architecture deliver superior tokens-per-second, the card fails the utility test for privacy-focused devs who need to fit larger model weights; a dual-GPU loading sketch follows below.
- Dual-GPU setups introduce significant power (750W+) and cooling overhead, requiring a 1200W+ PSU that may eat into the $1,300 hardware budget.
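
The fit-or-not argument reduces to simple arithmetic on weight size. A minimal back-of-envelope sketch in Python, assuming roughly 4.8 bits per weight for Q4_K_M-style GGUF quantization and a flat 2GB allowance for KV cache and runtime buffers (both figures are assumptions, not from the thread):

```python
# Weights-only VRAM estimate plus a flat overhead allowance for KV cache
# and runtime buffers. Bits-per-weight and overhead are rough assumptions.

def vram_gb(params_billions: float, bits_per_weight: float,
            overhead_gb: float = 2.0) -> float:
    """Rough VRAM needed to load a model fully on GPU, in GB."""
    weights_gb = params_billions * bits_per_weight / 8  # 1e9 weights * bits -> GB
    return weights_gb + overhead_gb

for name, params, bits in [
    ("Qwen2.5-Coder-32B @ ~4-bit", 32.8, 4.8),
    ("70B-class model @ ~4-bit", 70.6, 4.8),
]:
    need = vram_gb(params, bits)
    print(f"{name}: ~{need:.0f} GB | "
          f"16 GB RTX 5080: {'fits' if need <= 16 else 'does not fit'} | "
          f"48 GB dual 3090: {'fits' if need <= 48 else 'does not fit'}")
```

On these assumptions the 32B coder already overshoots 16GB (roughly 22GB needed), while even a 70B-class model (roughly 44GB) fits in the pooled 48GB.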
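
For the dual-GPU point, a hypothetical loading sketch using llama-cpp-python, which can split GGUF weights across multiple cards; the model filename, split ratio, and context size here are illustrative assumptions, not settings from the thread:

```python
# Hypothetical: load a ~4-bit 32B coder across two RTX 3090s.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen2.5-coder-32b-instruct-q4_k_m.gguf",  # assumed local file
    n_gpu_layers=-1,          # offload all layers to GPU; no CPU fallback
    tensor_split=[0.5, 0.5],  # spread weights evenly across both cards
    n_ctx=16384,              # larger contexts grow the KV cache in VRAM
)

out = llm("Write a binary search in C:\n", max_tokens=256)
print(out["choices"][0]["text"])
```

Splitting weights across cards adds some cross-GPU traffic, one reason tokens-per-second lags a single newer GPU even when capacity wins.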
// TAGS
gpu · ai-coding · llm · self-hosted · nvidia · rtx-3090 · rtx-5080 · nvidia-rtx-3090-5080
DISCOVERED
2026-03-16 (26d ago)
PUBLISHED
2026-03-16 (26d ago)
RELEVANCE
8/10
AUTHOR
FirmAttempt6344