REDDIT // 3h ago · RESEARCH PAPER

Rorschach tests expose LLM contamination limits

The paper tests GPT-4o, Grok-3, and Gemini 2.0 Flash Thinking on the 10 standard Rorschach cards, then codes their outputs with Exner-style metrics and an AI self-analysis step. As an exploratory demo, it is interesting. As evidence about genuine perception, it is methodologically thin, because the stimuli, the scoring logic, and many “expected” interpretations almost certainly appear in the models’ training data.

// ANALYSIS

Hot take: this is much better at revealing how multimodal models reproduce culturally saturated response patterns than at revealing anything clean about visual cognition.

– The strongest objection is contamination: the standard cards, scoring manuals, and common interpretations are public and likely present in training data, so the task is partly retrieval and pattern completion rather than fresh perception.
– The setup is too loosely controlled to support strong inference: public web interfaces, default vendor settings, and apparently single-pass administration make it hard to attribute the outputs to any specific cause.
– Even so, the paper has limited but real exploratory value: it shows that LLMs can generate coherent, human-like narratives around ambiguous images and that those narratives can be scored with psychometric frameworks, which is itself a useful stress test.
– A scientifically stronger version would use novel or synthetic ambiguous stimuli, repeated trials, API-level decoding control, preregistered coding, and baselines that separate image understanding from memorized psychometric language.
– What it mostly demonstrates is advanced statistical pattern matching plus learned psychometric priors, not evidence that the models have an “inner world” in any human sense.
– Papers like this get through peer review when they are framed as pilot or hypothesis-generating work, but the interpretation often outruns what the design can support.
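The repeated-trials point can be made concrete with a minimal sketch. It assumes each response has already been coded into a set of category labels, and it measures how stable those codes are across repeated administrations of the same stimulus using mean pairwise Jaccard similarity. The function names and the labels are hypothetical illustrations, not taken from the paper or from any scoring system.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two label sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def trial_stability(trials: list[set[str]]) -> float:
    """Mean pairwise Jaccard similarity across repeated trials of one stimulus."""
    if len(trials) < 2:
        raise ValueError("need at least two trials to measure stability")
    pairs = list(combinations(trials, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical coded outputs for one stimulus across three repeated trials.
coded_trials = [
    {"whole", "form", "animal"},
    {"whole", "form", "animal"},
    {"whole", "movement", "human"},
]

stability = trial_stability(coded_trials)
print(f"stability = {stability:.2f}")
```

A low score on a single-pass design would go unnoticed, which is why one administration per card says little about whether a response is characteristic of the model or a sampling accident.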

// TAGS

llm · rorschach · psychology · multimodal-ai · training-data-contamination · projective-tests · peer-review · ai-safety

DISCOVERED

2026-04-28

PUBLISHED

2026-04-28

RELEVANCE

7/10

AUTHOR

Impossible_Echo4029