UserHarness reframes Theory of Mind as user-mind reconstruction

// 45d ago RESEARCH PAPER

UserHarness is an inference-time framework for Theory-of-Mind tasks that explicitly models a user’s partial observations, evolving beliefs, intentions, and actions rather than inferring mental state indirectly. The paper evaluates the approach on five benchmarks, where it reaches up to 95.94% macro accuracy, with reported relative gains of more than 15% over existing inference methods and about 20% over the strongest prompt-only harness.
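To make the two kinds of numbers above concrete, here is a minimal sketch of how a macro accuracy and a relative gain are typically computed. It assumes macro accuracy means the unweighted mean of per-benchmark accuracies, which the summary does not spell out. The benchmark names and scores are hypothetical placeholders, not figures from the paper.

```python
def macro_accuracy(per_benchmark: dict[str, float]) -> float:
    """Unweighted mean of per-benchmark accuracies (assumed definition)."""
    return sum(per_benchmark.values()) / len(per_benchmark)


def relative_gain(new: float, baseline: float) -> float:
    """Improvement expressed as a fraction of the baseline score."""
    return (new - baseline) / baseline


# Hypothetical scores for illustration only; not taken from the paper.
baseline_scores = {"bench_a": 0.80, "bench_b": 0.70, "bench_c": 0.75}
harness_scores = {"bench_a": 0.92, "bench_b": 0.84, "bench_c": 0.85}

baseline_macro = macro_accuracy(baseline_scores)  # about 0.75
harness_macro = macro_accuracy(harness_scores)    # about 0.87

print(f"absolute gain: {harness_macro - baseline_macro:.2f}")
print(f"relative gain: {relative_gain(harness_macro, baseline_macro):.0%}")
```

In this toy case a 12-point absolute gain corresponds to a 16% relative gain, which is why relative figures like those in the summary should be read against the baseline they are measured from.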

// ANALYSIS

Hot take: this reads less like a clever prompting trick and more like a useful mental-model decomposition for agents that need to reason about what a user sees, thinks, and will do next.

– The main idea is strong because it makes the hidden state explicit: observations, beliefs, intentions, and actions.
– The benchmark story is compelling, especially the reported peak of up to 95.94% macro accuracy across five benchmarks.
– The framing suggests this could generalize beyond ToM benchmarks into agent assistants, planning, and user simulation.
– The claim is still paper-level evidence, so the real test is whether the abstraction holds on messier real-world interaction traces.
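The decomposition named in the first bullet can be sketched as a small data structure. This is an illustrative toy, not the paper's implementation: `Event`, `UserMind`, and their methods are hypothetical names, and the scenario is the classic false-belief setup in which one agent misses a change to the world.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Event:
    obj: str              # object being moved
    location: str         # where it ends up
    witnesses: frozenset  # names of agents who saw the event


@dataclass
class UserMind:
    """Hypothetical explicit user state: observations, beliefs, intention."""
    name: str
    intention: Optional[str] = None                    # object the user wants
    observations: list = field(default_factory=list)   # events the user saw
    beliefs: dict = field(default_factory=dict)        # object -> believed location

    def observe(self, event: Event) -> None:
        # Beliefs update only from events the user actually witnessed.
        if self.name in event.witnesses:
            self.observations.append(event)
            self.beliefs[event.obj] = event.location

    def predict_action(self) -> Optional[str]:
        # The action follows from intention plus belief, not from world state.
        if self.intention is None or self.intention not in self.beliefs:
            return None
        return f"look in {self.beliefs[self.intention]}"


events = [
    Event("marble", "basket", frozenset({"sally", "anne"})),
    Event("marble", "box", frozenset({"anne"})),  # sally is out of the room
]

sally = UserMind("sally", intention="marble")
for e in events:
    sally.observe(e)

print(sally.predict_action())  # look in basket
```

Keeping observations separate from world state is what lets the predicted action reflect a false belief: the marble is in the box, but the modeled user still looks in the basket.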

// TAGS

theory-of-mind, user-modeling, agent, inference-time, llm, reasoning, benchmark

DISCOVERED

45d ago

2026-05-30

PUBLISHED

45d ago

2026-05-30

RELEVANCE

9/10

AUTHOR

Discover AI
