LLM.Genesis packs LLM inference into 64KB SRAM

// 109d ago · OPENSOURCE RELEASE

LLM.Genesis is a C++ inference engine that encodes model topology and forward-pass logic as GCS DNA, a custom binary instruction stream. It is built to stream weights on demand and generate deterministically inside a 64KB SRAM budget, targeting hardware where normal LLM runtimes would not fit.
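To make the idea of a binary instruction stream that pulls weights in on demand concrete, here is a minimal sketch of a tiny interpreter. It is not the GCS DNA format: the opcode values, the one-byte page operand, the `Vm` layout, and the `run` function are all assumptions made for illustration, and only the name `STREAM` comes from the project. The `storage` vector stands in for flash or an SD card.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical opcodes; only the name STREAM appears in the write-up.
enum Op : std::uint8_t { OP_STREAM = 0x01, OP_DOT = 0x02, OP_HALT = 0xFF };

constexpr std::size_t kSramBytes = 64 * 1024;  // total working-memory budget
constexpr std::size_t kPageFloats = 256;       // one weight page = 1 KB of floats

// All mutable state lives in one fixed-size struct; nothing is heap-allocated
// while the program runs.
struct Vm {
    float page[kPageFloats];     // scratch buffer, overwritten by each STREAM
    float act[kPageFloats];      // activations kept resident
    float acc;                   // running dot-product accumulator
    std::size_t bytes_streamed;  // how much weight data was pulled from storage
};
static_assert(sizeof(Vm) <= kSramBytes, "VM state must fit the SRAM budget");

// Executes `program` against `storage`.
// Assumed encoding: STREAM <page index, 1 byte> | DOT | HALT.
// Returns true only if the program reaches HALT without an error.
inline bool run(Vm& vm, const std::vector<std::uint8_t>& program,
                const std::vector<float>& storage) {
    std::size_t pc = 0;
    while (pc < program.size()) {
        switch (program[pc++]) {
        case OP_STREAM: {
            if (pc >= program.size()) return false;  // truncated operand
            const std::size_t offset = program[pc++] * kPageFloats;
            if (offset + kPageFloats > storage.size()) return false;
            // Copy one page of weights into the scratch buffer.
            std::memcpy(vm.page, storage.data() + offset, sizeof(vm.page));
            vm.bytes_streamed += sizeof(vm.page);
            break;
        }
        case OP_DOT:
            // Accumulate the dot product of the current page and activations.
            for (std::size_t i = 0; i < kPageFloats; ++i)
                vm.acc += vm.page[i] * vm.act[i];
            break;
        case OP_HALT:
            return true;
        default:
            return false;  // unknown opcode
        }
    }
    return false;  // ran off the end without HALT
}
```

In this sketch, a program such as `{OP_STREAM, 0, OP_DOT, OP_STREAM, 1, OP_DOT, OP_HALT}` reuses the same 1 KB scratch page for both weight pages, so resident memory stays constant while `bytes_streamed` grows with model size.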

// ANALYSIS

This feels less like a faster llama.cpp and more like a manifesto for turning LLM execution into a tiny virtual machine. The 64KB SRAM target is the real differentiator: it makes the project genuinely interesting for embedded setups, but it also means I/O, tooling, and format friction will matter more than raw throughput.

– The `STREAM` opcode and paged weight loading push the bottleneck toward storage, so latency will be dominated by flash or SD-card performance.
– GCS DNA is a clever separation of model logic from the runner, but it creates a new format the ecosystem has to adopt and debug.
– The runtime claims zero external dependencies, yet the repo still leans on Python for compilation tooling, so the build is lightweight but not purely C++.
– There are no releases yet, which makes this read more like an architectural prototype than production-ready inference infrastructure.
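The storage-bottleneck point can be put as a rough bound. The sketch below assumes a dense model in which every weight is re-read from storage once per generated token, with nothing cached between tokens. Both function names are hypothetical, and any figures plugged in are illustrative, not measurements of LLM.Genesis.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of weights read per token, assuming every parameter is streamed
// exactly once per token (an assumption, not a measured property).
constexpr double weight_bytes_per_token(std::uint64_t params,
                                        unsigned bits_per_weight) {
    return static_cast<double>(params) * bits_per_weight / 8.0;
}

// Upper bound on tokens per second when generation is limited purely by
// sequential read bandwidth; compute time and seek overhead are ignored.
constexpr double max_tokens_per_second(double bytes_per_second,
                                       double bytes_per_token) {
    return bytes_per_token > 0.0 ? bytes_per_second / bytes_per_token : 0.0;
}
```

For example, a hypothetical 10-million-parameter model at 8 bits per weight streams 10 MB per token, so a storage device sustaining 20 MB/s caps generation at 2 tokens per second regardless of how fast the core computes.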

// TAGS

llm, inference, open-source, edge-ai, self-hosted, llm-genesis

DISCOVERED

109d ago

2026-03-26

PUBLISHED

109d ago

2026-03-26

RELEVANCE

8/10

AUTHOR

Routine_Lettuce1592
