APEX benchmark shows prompt position drives compliance

// 143d ago NEWS

A LocalLLaMA post shares APEX benchmark results across Gemma 3 (4B, 12B) and Qwen3 32B variants, testing how token position in an 8,192-token window affects behavior. The data shows factual recall stays strong across positions, while instruction following drops in the middle and salience integration appears mainly in larger models.
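A position sweep of this kind can be pictured with a small sketch. This is not the APEX harness, whose code the post summary does not describe; `place_at_depth`, the filler sentences, and the probe instruction are all hypothetical names chosen for illustration. It only shows the idea of moving one probe through otherwise fixed context.

```python
def place_at_depth(filler: list[str], probe: str, depth: float) -> list[str]:
    """Insert `probe` at a relative depth of `filler` (0.0 = start, 1.0 = end).

    Hypothetical helper: works on sentences rather than tokens to stay
    self-contained; a real harness would measure position in tokens.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    idx = round(depth * len(filler))
    return filler[:idx] + [probe] + filler[idx:]


# Fixed background context; only the probe position changes between runs.
filler = [f"Filler sentence {i}." for i in range(10)]
probe = "INSTRUCTION: reply in uppercase."

# One prompt per depth; each would be sent to the model and scored separately.
prompts = {
    depth: " ".join(place_at_depth(filler, probe, depth))
    for depth in (0.0, 0.25, 0.5, 0.75, 1.0)
}
```

Scoring compliance per depth and plotting the result is what would reveal a dip in the middle, if one exists for a given model.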

// ANALYSIS

Prompt engineering is still architecture-aware systems design, not just wording tweaks.

– The U-shaped compliance curve reinforces “lost in the middle” as a practical production issue, not a niche benchmark artifact.
– Flat factual recall means teams should optimize prompt layout for control and behavior, not basic memory.
– Near-zero salience integration on smaller models suggests some capabilities are missing, not merely weaker.
– If replicated at 72B, this could influence RAG chunk ordering, system prompt placement, and agent planning templates.
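One way to act on the points above is to keep the most important material away from the middle of the context. The sketch below is an assumption about how a team might do that, not something the benchmark post prescribes: `order_for_edges` and `build_prompt` are hypothetical helpers that push the top-ranked chunks to the start and end of the prompt and restate the instruction at the end.

```python
def order_for_edges(chunks_ranked: list[str]) -> list[str]:
    """Reorder chunks (given most relevant first) so relevance is highest
    at both edges and lowest in the middle."""
    front: list[str] = []
    back: list[str] = []
    for i, chunk in enumerate(chunks_ranked):
        # Alternate: rank 1 to the front, rank 2 to the back, and so on.
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]


def build_prompt(instruction: str, chunks_ranked: list[str], question: str) -> str:
    """Assemble a prompt with the instruction at the start and repeated at the end."""
    body = "\n\n".join(order_for_edges(chunks_ranked))
    return f"{instruction}\n\n{body}\n\n{question}\n\nReminder: {instruction}"


ordered = order_for_edges(["a", "b", "c", "d", "e"])
# ordered == ["a", "c", "e", "d", "b"]
```

Whether this layout helps a particular model is an empirical question; the ordering is cheap to toggle, so it can be compared against a plain ranked order on the same evaluation set.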

// TAGS

apex, llm, research, prompt-engineering, benchmark

DISCOVERED

143d ago

2026-03-05

PUBLISHED

143d ago

2026-03-05

RELEVANCE

8/10

AUTHOR

Double-Risk-1945
