Parameter Golf fits 24M LLM in 15MB
OPEN_SOURCE ↗
REDDIT · 18d ago · BENCHMARK RESULT

The post reports a top-3 finish in OpenAI's Parameter Golf challenge, squeezing a 24M-parameter LLM into about 15MB. The main gain comes from per-row INT8 calibration that tries five clip percentiles and keeps the lowest-MSE reconstruction; under the same size cap, a wider, shallower model also outperformed a deeper one.
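On the width-versus-depth point, the standard decoder-block estimate (roughly 12·L·d² plus an embedding term) shows how the two trade off under a fixed parameter cap. The configurations and vocabulary size below are hypothetical illustrations, not the submission's actual architecture:

```python
def transformer_params(n_layers: int, d_model: int,
                       vocab: int = 8000, ffn_mult: int = 4) -> int:
    """Rough count for a standard decoder-only transformer:
    4*d^2 per layer for attention projections, 2*ffn_mult*d^2 for
    the MLP, plus one tied embedding matrix (biases/norms ignored)."""
    per_layer = (4 + 2 * ffn_mult) * d_model ** 2
    return n_layers * per_layer + vocab * d_model

# Two hypothetical configs with identical L*d^2 budgets near a 24M cap:
wide_shallow = transformer_params(n_layers=6, d_model=512)
narrow_deep = transformer_params(n_layers=24, d_model=256)
print(wide_shallow, narrow_deep)
```

With equal L·d² the transformer body is identical in size, so the wide config spends more of the budget on the embedding; any quality gap between the two is then purely architectural, which is what the leaderboard result probes.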

// ANALYSIS

At this scale, quantization is the product, not a finishing touch. The submission reads like GPTQ-lite plus architecture tuning, which is exactly the kind of compression-first work that moves a tiny-model leaderboard.

  • Five clip candidates per row is a very cheap search space, but it directly attacks reconstruction error instead of assuming one clipping heuristic will be good enough.
  • The width-over-depth result suggests that once the byte ceiling is tight, representational breadth matters more than stacking layers.
  • Landing around 15.06MB means the margin is microscopic, so schedule tweaks, optimizer choices, and packing format can all move the ranking.
  • Because OpenAI's rules fix the dataset and cap training time, leaderboard separation is likely coming from many small compounding improvements, not one dramatic breakthrough.
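A minimal sketch of what the per-row search might look like, assuming symmetric INT8 with magnitude-percentile clipping. The specific percentile set, the symmetric scheme, and the packing format are all assumptions; the post only says five clip candidates are tried per row and the lowest-MSE reconstruction wins:

```python
import numpy as np

# Hypothetical candidate set; the post does not list the five percentiles.
CLIP_PERCENTILES = (99.0, 99.5, 99.9, 99.99, 100.0)

def quantize_row_int8(row: np.ndarray):
    """Symmetric per-row INT8 quantization: try each clip percentile,
    keep the (quantized row, scale) pair with the lowest MSE."""
    best = None
    for p in CLIP_PERCENTILES:
        clip = max(np.percentile(np.abs(row), p), 1e-8)
        scale = clip / 127.0
        q = np.clip(np.round(row / scale), -127, 127).astype(np.int8)
        recon = q.astype(np.float32) * scale
        mse = float(np.mean((row - recon) ** 2))
        if best is None or mse < best[0]:
            best = (mse, q, scale)
    return best[1], best[2]

def quantize_matrix(W: np.ndarray):
    """Apply the per-row search to a full weight matrix;
    returns INT8 weights plus one float32 scale per row."""
    rows = [quantize_row_int8(r) for r in W]
    return np.stack([q for q, _ in rows]), np.array([s for _, s in rows],
                                                    dtype=np.float32)
```

Five candidates per row keeps calibration nearly free (one pass over the weights), while still adapting the clip point to each row's outlier profile instead of fixing one heuristic globally.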
// TAGS
parameter-golf · llm · benchmark · research

DISCOVERED

2026-03-25 (18d ago)

PUBLISHED

2026-03-25 (18d ago)

RELEVANCE

9/10

AUTHOR

TrashFun5286