LLM Werewolf runner adds skill memory

// 45d ago · OPENSOURCE RELEASE

A standalone runner lets four LLMs play One Night Ultimate Werewolf over any OpenAI-compatible API, with a live UI, crash resume, and persistent `gameskill` notes. The author says the main improvement so far is getting models to stop clinging to their original role after card swaps and to start playing more strategically.
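To make the card-swap problem concrete, here is a minimal sketch of the state a player has to track. It is not taken from the repo: the `Table` class and the player names are hypothetical, and center cards are left out for brevity. The only game facts it relies on are that the Robber acts before the Troublemaker and that a player's team is decided by the card they hold at the end of the night, not the one they were dealt.

```python
class Table:
    """Hypothetical helper: tracks which role card sits in front of each player."""

    def __init__(self, dealt: dict[str, str]) -> None:
        self.original = dict(dealt)  # roles as dealt at the start of the night
        self.current = dict(dealt)   # roles after night actions are applied

    def swap(self, a: str, b: str) -> None:
        """Exchange the cards in front of players a and b."""
        self.current[a], self.current[b] = self.current[b], self.current[a]

    def team(self, player: str) -> str:
        """Team follows the card held at the end of the night."""
        return "werewolf" if self.current[player] == "Werewolf" else "village"


# Simplified four-player deal (assumption: no center cards).
table = Table({"A": "Werewolf", "B": "Robber", "C": "Troublemaker", "D": "Villager"})
table.swap("B", "A")  # Robber (B) takes A's card
table.swap("A", "D")  # Troublemaker (C) swaps the cards of A and D

for name in table.original:
    print(name, table.original[name], "->", table.current[name], table.team(name))
```

In this deal, player A started as the Werewolf but ends the night on the village team, while B is now the Werewolf. A model that keeps defending its dealt role is arguing for the wrong side.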

// ANALYSIS

This is more interesting as an agent-behavior sandbox than as a game demo. The repo surfaces a real weakness in LLMs: they can reason correctly through a swap, then still anchor on the wrong identity.

– The persistent `gameskill` file is a lightweight memory loop, so each match can influence the next without finetuning.
– Making the runner API-agnostic lowers the barrier to testing older local models that do not support tool calls well.
– The live transcript plus phase machine gives you a clean trace of how identity confusion, bluffing, and strategy evolve over time.
– The project is useful for studying prompt design and state tracking, especially since the author reports that “goal-oriented” prompting has already changed model behavior.
– A 5-player version would probably add uncertainty and make the social deduction dynamics less predictable.
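The memory loop in the first bullet can be sketched in a few lines. This is an illustration under stated assumptions, not the repo's implementation: the file name `gameskill.md`, the helper functions, and the prompt wording are all hypothetical. The idea is simply to read the notes before a match, prepend them to the system prompt, and append a new lesson afterwards.

```python
import tempfile
from pathlib import Path


def load_notes(path: Path) -> str:
    """Return saved notes, or an empty string on the first run."""
    return path.read_text(encoding="utf-8") if path.exists() else ""


def build_system_prompt(base: str, notes: str) -> str:
    """Prepend earlier lessons to the base prompt when any exist."""
    if not notes.strip():
        return base
    return f"{base}\n\nLessons from earlier games:\n{notes.strip()}"


def append_lesson(path: Path, lesson: str) -> None:
    """Add one lesson as a bullet so the next match can read it."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {lesson.strip()}\n")


# Hypothetical file location; a temp directory keeps the sketch side-effect free.
notes_path = Path(tempfile.mkdtemp()) / "gameskill.md"
append_lesson(notes_path, "If my card was swapped, play for the team of my new card.")
prompt = build_system_prompt(
    "You are playing One Night Ultimate Werewolf.", load_notes(notes_path)
)
print(prompt)
```

Because the notes live in plain text outside the model, the same file works with any OpenAI-compatible backend and can be inspected or pruned by hand between matches.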

// TAGS

llm · agent · tool-use · automation · open-source · self-hosted · llm-plays-one-night-werewolf

DISCOVERED

45d ago

2026-05-21

PUBLISHED

45d ago

2026-05-21

RELEVANCE

8/10

AUTHOR

Some-Cauliflower4902
