LLM Werewolf runner adds skill memory
A standalone runner lets four LLMs play One Night Ultimate Werewolf over any OpenAI-compatible API, with a live UI, crash resume, and persistent gameskill notes. The author says the main improvement so far is getting models to stop clinging to their original role after card swaps and start playing more strategically.
This is more interesting as an agent-behavior sandbox than as a game demo. The repo surfaces a real weakness in LLMs: they can reason correctly through a card swap, then still play as if they held their original role.
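To make the swap problem concrete, here is a minimal sketch of the state tracking a player has to do. It is not taken from the repo: the function name `apply_swaps`, the seat labels, and the role assignment are hypothetical, and the center cards are left out for brevity.

```python
def apply_swaps(initial_roles, swaps):
    """Return each seat's final card after the night swaps are applied in order."""
    roles = dict(initial_roles)  # copy so the starting deal stays intact
    for a, b in swaps:
        roles[a], roles[b] = roles[b], roles[a]
    return roles


# Hypothetical four-player deal.
initial = {"A": "Robber", "B": "Werewolf", "C": "Troublemaker", "D": "Villager"}

# The Robber (A) takes B's card, then the Troublemaker (C) swaps B and D.
swaps = [("A", "B"), ("B", "D")]

final = apply_swaps(initial, swaps)
for seat in sorted(final):
    print(seat, "started as", initial[seat], "and ends as", final[seat])
```

In this deal, seat A started as the Robber but ends the night holding the Werewolf card, so it should now play for the werewolf team. A model that anchors on its original role would keep arguing as a villager.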
- The persistent `gameskill` file is a lightweight memory loop, so each match can influence the next without finetuning.
- Making the runner API-agnostic lowers the barrier to testing older local models that do not support tool calls well.
- The live transcript plus phase machine gives you a clean trace of how identity confusion, bluffing, and strategy evolve over time.
- The project is useful for studying prompt design and state tracking, especially because “goal-oriented” prompting already changed model behavior.
- A 5-player version would probably add more uncertainty about who holds which card and make the social deduction dynamics less predictable.
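The memory loop in the first point can be sketched roughly as follows. This is an illustration under stated assumptions, not the repo's actual code: the file name `gameskill.md`, the helper names, and the prompt wording are all hypothetical, and the call to the model is omitted.

```python
from pathlib import Path


def load_notes(path: Path) -> str:
    """Read the notes saved by earlier games; empty string if none exist yet."""
    return path.read_text(encoding="utf-8") if path.exists() else ""


def append_lesson(path: Path, lesson: str) -> None:
    """Append one lesson as a bullet so the next game can see it."""
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {lesson.strip()}\n")


def build_system_prompt(base: str, notes: str) -> str:
    """Prepend accumulated lessons to the base prompt; no finetuning involved."""
    if not notes.strip():
        return base
    return f"{base}\n\nLessons from earlier games:\n{notes}"


if __name__ == "__main__":
    notes_path = Path("gameskill.md")  # hypothetical file name
    prompt = build_system_prompt("You are playing One Night Ultimate Werewolf.",
                                 load_notes(notes_path))
    # ... play one game with `prompt`, then record what was learned:
    append_lesson(notes_path, "After a swap, play for the team of the card you hold now.")
```

Because the notes live in a plain text file, they are easy to inspect, edit by hand, or reset between experiments.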
DISCOVERED
2026-05-21
PUBLISHED
2026-05-21
AUTHOR
Some-Cauliflower4902