Chess-GPT impossible-move test probes board model
This Reddit discussion proposes stress-testing Karvonen’s chess transformer with illegal, trajectory-impossible, and ambiguous moves to see whether its latent board-state probes stay coherent or break in distinct ways. The experiment is aimed at separating rule tracking, current-position tracking, attack geometry, piece identity, and strategic expectation into different failure modes.
Good idea, and more interesting than a generic robustness test: it turns “the model has a board representation” into a causal question about how that representation behaves under structured contradiction. The real signal would be qualitative dissociations, not just worse accuracy.
- Karvonen’s prior work already shows linear probes and interventions can recover and edit latent board state, so impossible inputs are a direct test of whether that state is actually used
- Rule violations should pressure the model’s update mechanism; trajectory violations test whether it tracks history or only final configuration
- “Impossible threat” cases are the sharpest probe for relational structure, because the square occupancy can be fine while attack geometry is nonsense
- Referential ambiguity is a separate axis: if probes commit to one knight, that suggests piece identity is encoded; if they preserve ambiguity, occupancy may dominate over object tracking
- Strategic absurdity should mainly hit skill or move-prior estimates, which gives a useful control for separating tactical confusion from world-model collapse
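As a rough illustration of how two of these stimulus categories could be generated, the sketch below labels an underspecified knight move (written without a disambiguating file or rank, e.g. `Nd2`) as a rule violation, a referential ambiguity, or a well-formed move. It is not taken from the discussion or from Karvonen’s code: the function names and labels are hypothetical, and the check is pure knight geometry that ignores pins, checks, and whether the target square is occupied.

```python
# Hypothetical helpers for building test stimuli; not part of any existing library.
# Pure geometry only: pins, checks, and occupied target squares are ignored.

KNIGHT_DELTAS = [(1, 2), (2, 1), (2, -1), (1, -2),
                 (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def to_xy(square):
    """Convert algebraic notation such as 'g1' to zero-based (file, rank)."""
    return ord(square[0]) - ord("a"), int(square[1]) - 1


def knight_can_reach(src, dst):
    """True if a knight on src could jump to dst in one move."""
    sx, sy = to_xy(src)
    dx, dy = to_xy(dst)
    return (dx - sx, dy - sy) in KNIGHT_DELTAS


def classify_knight_move(knight_squares, target):
    """Label a move like 'N<target>' given the squares of one side's knights.

    Returns (label, candidates), where candidates are the knights that
    could geometrically make the move.
    """
    candidates = [sq for sq in knight_squares if knight_can_reach(sq, target)]
    if not candidates:
        return "rule_violation", candidates          # no knight can get there
    if len(candidates) > 1:
        return "referential_ambiguity", candidates   # more than one knight can
    return "well_formed", candidates


if __name__ == "__main__":
    print(classify_knight_move(["b1", "g1"], "f3"))
    print(classify_knight_move(["b1", "f3"], "d2"))
    print(classify_knight_move(["b1", "g1"], "e4"))
```

In an actual experiment, each labeled move would be appended to an otherwise legal game prefix and the board-state probes read out at that token; the trajectory, impossible-threat, and strategic-absurdity categories need full position tracking and are outside what this sketch covers.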
Discovered: 2026-04-16
Published: 2026-04-16
Author: Infamous-Payment-164