Intel Arc Pro B70 hits 282 t/s prompt eval
REDDIT // 7h ago // BENCHMARK RESULT

A Reddit user reports running a 35B-parameter model locally on a 32GB Intel Arc Pro B70 (Battlemage) installed in a legacy HP Z640 workstation. The SYCL-powered setup reached 282 tokens per second on prompt evaluation, which suggests that modern Intel silicon is viable for high-VRAM AI workloads on aging hardware.

// ANALYSIS

The report suggests that llama.cpp's SYCL backend has matured enough to deliver practical speeds, and that it significantly outperforms Vulkan on Battlemage hardware in this setup. The card ran in a PCIe 3.0 system, which indicates that the workload tolerates older bus bandwidth when the model sits entirely in VRAM, extending the useful life of legacy workstations. The high prompt-evaluation throughput may also reflect Intel's driver-level optimizations for Flash Attention. At $949, the card can run large models such as Qwen 3.6 35B with a 130k-token context window entirely in VRAM, undercutting the "Nvidia tax" for local inference.
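
To put the 282 t/s figure in perspective, here is a minimal sketch of how long it would take to ingest a prompt at that rate. It assumes the rate stays constant over the whole prompt, which the report does not claim; prompt-evaluation throughput typically falls as context grows. The function name `prompt_eval_seconds` and the 130,000-token prompt size are illustrative choices, not figures taken from the Reddit post beyond "130k".

```python
def prompt_eval_seconds(prompt_tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to process a prompt at a constant rate.

    Assumption: throughput does not change with context depth.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return prompt_tokens / tokens_per_second


if __name__ == "__main__":
    reported_rate = 282.0      # tokens/s, prompt evaluation, from the report
    full_context = 130_000     # hypothetical prompt filling a "130k" window
    seconds = prompt_eval_seconds(full_context, reported_rate)
    print(f"{seconds:.0f} s (about {seconds / 60:.1f} min)")
```

Under these assumptions, filling the full context would take several minutes of prompt processing, so the headline rate matters most for long-document and agentic workloads.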

// TAGS
llm · gpu · edge-ai · open-source · intel-arc-pro-b70 · llama-cpp · inference · benchmark

DISCOVERED

7h ago

2026-04-19

PUBLISHED

9h ago

2026-04-19

RELEVANCE

8/10

AUTHOR

Serious_Rub_3674