SmolLM2 hits 7 tok/s on Roblox Native
Developer u/antwon_dev has implemented Hugging Face's SmolLM2-135M (Q8-quantized) running natively inside Roblox's Luau engine. The implementation reaches 7 tokens per second, giving game developers on-platform AI inference with no network round-trips and none of the cost or privacy concerns of external API calls.
Running a transformer natively in Luau is a significant technical milestone that unlocks persistent, zero-API-cost AI for the platform's creator ecosystem. By removing external API calls entirely, the implementation shows that small, high-quality models like SmolLM2 can drive real-time NPC interactions. Further optimizations such as better parallelization could push performance past the current 7 tok/s, especially if the weights are serialized directly into the game to remove the GitHub dependency. This paves the way for autonomous agents and richer procedural generation inside the Roblox environment.
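The "Q8" in SmolLM2-135M-Q8 refers to 8-bit weight quantization, which is what makes a 135M-parameter model small enough to ship and run inside a game engine. A minimal sketch of symmetric Q8 quantization in Python, illustrating the general idea only; the function names and per-tensor scaling scheme here are illustrative assumptions, not the actual Luau port's format:

```python
def quantize_q8(weights):
    # Symmetric 8-bit quantization: store one float scale per tensor
    # plus the weights as signed bytes in [-127, 127].
    # The `or 1.0` guards against a zero scale for an all-zero tensor.
    scale = (max(abs(w) for w in weights) / 127.0) or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_q8(q, scale):
    # Recover approximate float weights at inference time.
    return [v * scale for v in q]

q, s = quantize_q8([0.5, -1.27, 0.0, 1.27])
approx = dequantize_q8(q, s)
```

Quantizing to 8 bits cuts weight storage roughly 4x versus float32, at the cost of a small rounding error per weight, which is the usual trade-off that makes on-device inference of small models practical.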
DISCOVERED: 2026-04-18 (4h ago)
PUBLISHED: 2026-04-17 (7h ago)
AUTHOR: antwon_dev