iMac G3 runs LLM on 32MB RAM
Developer Maddie Dreese successfully ported Andrej Karpathy's `llama2.c` to run on a 233 MHz PowerPC G3 iMac. Using a 260K-parameter TinyStories model and custom memory management, the project achieves local inference on 28-year-old hardware with roughly 500x less RAM than a modern machine.
This project serves as a deep dive into classic Mac OS memory architecture: the PowerPC's big-endian byte order requires manual endian-swapping of the little-endian checkpoint weights, and the port also fixes a Grouped-Query Attention (GQA) bug in the original code. To fit the model into 32MB of RAM, the developer calls MaxApplZone() to expand the application heap to its full partition, allocates with NewPtr(), and replaces heap allocation for the KV cache with static, compile-time-sized arrays. The result highlights the portability of the Transformer architecture and doubles as a masterclass in extreme optimization for legacy systems.
DISCOVERED
2026-04-06
PUBLISHED
2026-04-06
AUTHOR
maddiedreese