Dual RTX 3090s top 5080 for local coding
OPEN_SOURCE
REDDIT · 26d ago · INFRASTRUCTURE


LocalLLaMA community consensus favors returning the RTX 5080 and buying two used RTX 3090s instead, to maximize VRAM for private coding models. For a $1,300 budget, the 48GB VRAM pool from the two older cards supports significantly more capable models than a single 16GB modern GPU.
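A back-of-envelope check makes the 16GB-vs-48GB claim concrete. The helper below is a hypothetical sketch, not from the post; the ~20% overhead for KV cache and activations on top of quantized weights is an assumed rule of thumb.

```python
# Rough VRAM estimate for a quantized LLM: weight bytes plus runtime overhead.
# The 20% overhead figure is an assumption, not a measured value.

def vram_needed_gb(params_billions: float, bits_per_weight: float,
                   overhead: float = 0.2) -> float:
    """Approximate VRAM (GB) needed to load and run a quantized model."""
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb * (1 + overhead)

for name, params, bits in [("Qwen2.5-Coder-32B @ 5-bit", 32, 5),
                           ("70B model @ 4-bit", 70, 4)]:
    need = vram_needed_gb(params, bits)
    print(f"{name}: ~{need:.0f} GB | fits 16 GB: {need <= 16} | "
          f"fits 48 GB: {need <= 48}")
```

Under these assumptions, a 5-bit 32B model needs roughly 24GB and a 4-bit 70B model roughly 42GB: both out of reach for a single 16GB card, both within a 48GB dual-3090 pool.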

// ANALYSIS

VRAM capacity is the primary bottleneck for local coding; inference speed is secondary to whether the "smartest" models can actually load.

  • Returning a 16GB RTX 5080 for two used 24GB RTX 3090s triples memory, allowing 70B-parameter models to run without CPU offloading.
  • Modern "gold standard" coding models like Qwen2.5-Coder-32B require 20GB+ at high quantization, making 16GB cards effectively obsolete for high-end local development.
  • While the RTX 5080's GDDR7 and Blackwell architecture provide superior tokens-per-second, they fail the utility test for privacy-focused devs who need larger model weights.
  • Dual-GPU setups introduce significant power (750W+) and cooling overhead, requiring a 1200W+ PSU that may eat into the $1,300 hardware budget.
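The PSU figure in the last bullet can be sanity-checked with simple arithmetic. This is an illustrative sketch with assumed numbers (~350W stock board power per RTX 3090, ~150W for the rest of the system, 30% headroom for transient spikes), not a recommendation from the post.

```python
# Rough PSU sizing for a dual-GPU build. All wattages below are assumptions
# for illustration: ~350 W per RTX 3090 at stock, ~150 W for CPU/board/drives.

def psu_watts(gpu_watts: float, gpu_count: int,
              system_watts: float = 150, headroom: float = 0.3) -> float:
    """Recommended PSU rating: peak draw plus a headroom margin."""
    peak = gpu_watts * gpu_count + system_watts
    return peak * (1 + headroom)

print(f"Recommended PSU: ~{psu_watts(350, 2):.0f} W")
```

That lands around 1100W, which rounds up to the next common PSU size, 1200W, consistent with the bullet above.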
// TAGS
gpu · ai-coding · llm · self-hosted · nvidia · rtx-3090 · rtx-5080 · nvidia-rtx-3090-5080

DISCOVERED

26d ago (2026-03-16)

PUBLISHED

26d ago (2026-03-16)

RELEVANCE

8/10

AUTHOR

FirmAttempt6344