PRISM targets O(1) KV block selection with photonics
PRISM is a research repo and paper proposing O(1) photonic block selection for long-context LLM inference: it replaces the usual O(N) KV-cache signature scan with an optical broadcast-and-weight core on TFLN (thin-film lithium niobate). The project also ships a GPU-only selector, a simulator, and benchmark code. The README claims 944x faster selection and 18,000x lower energy than an H100 scan at 1M context, plus a modeled 5.3x total-decode win at 100M context.
Hot take: this is the rare hardware idea that actually matches the workload shape, because broadcast-to-many is what photonics does best, but the eye-catching gains still depend on simulation assumptions rather than a fabricated chip.
- The repo is real and usable today: MIT-licensed code, a paper PDF, a demo, a simulator, and a GPU-only `BlockSelector` for current LLM stacks.
- It attacks a real bottleneck: block-sparse methods like Quest or RocketKV still scan all candidate block signatures from HBM every decode step, so latency rises with context even when fetches are sparse.
- The scaling story is compelling: the README models 5.3x faster total decode at 100M context in batch serving, which is where HBM bandwidth pain gets brutal.
- The GPU-only selector already claims 100% needle retrieval and 0% LongBench-v2 drop, so there is a practical software fallback even before any chip exists.

Sources: https://github.com/hyoseokp/PRISM ; https://www.reddit.com/r/LocalLLaMA/comments/1s1f8sq/designed-a-photonic-chip-for-o1-kv-cache-block/
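To make the bottleneck concrete, here is a minimal sketch of the O(N) baseline PRISM targets. This is not code from the repo: the function name, shapes, and scoring rule are illustrative assumptions about how Quest-style block-sparse selection works (one signature vector per KV block, scanned against the query every decode step).

```python
import numpy as np

def select_blocks(query, signatures, top_k=8):
    """Hypothetical Quest-style block selection.

    query:      (d,) current decode-step query vector
    signatures: (num_blocks, d) one summary vector per KV block
    Returns indices of the top_k highest-scoring blocks.

    Cost is O(num_blocks * d) per decode step -- this is the scan
    that grows with context length and that PRISM proposes to
    replace with a constant-latency optical broadcast.
    """
    scores = signatures @ query              # O(N) dot products over HBM
    return np.argsort(scores)[-top_k:][::-1]  # keep the best top_k blocks

# Toy usage: 1024 blocks of 64-dim signatures.
rng = np.random.default_rng(0)
sigs = rng.standard_normal((1024, 64))
q = rng.standard_normal(64)
picked = select_blocks(q, sigs, top_k=8)
print(picked.shape)  # (8,)
```

Even when only `top_k` blocks are fetched afterward, the scan itself touches every signature, which is why selection latency rises with context size on a GPU.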
DISCOVERED: 2026-03-23
PUBLISHED: 2026-03-23
AUTHOR: Exact-Schedule-3442