OPEN_SOURCE
REDDIT · 4h ago · OPEN-SOURCE RELEASE
Selective contrastive training trims hallucinations with 10% data
This side project releases code for a selective contrastive post-training method that frames hallucination as a preference problem: a frozen base model generates a bad continuation, the training model compares it against the gold answer, and learning happens only when the preference margin is weak. The repo describes first-divergence loss masking, a gated objective, and benchmark gains presented as improved factuality using roughly 10% of the data.
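The mechanism described above can be sketched in a few lines. This is an illustrative toy, not the repo's implementation: the function names, the fixed margin threshold, and the DPO-style logistic loss on the margin are all assumptions; the repo's actual gate and loss window may differ.

```python
import math

def first_divergence(gold_toks, neg_toks):
    # Index of the first token where gold and negative continuations differ.
    for i, (g, n) in enumerate(zip(gold_toks, neg_toks)):
        if g != n:
            return i
    return min(len(gold_toks), len(neg_toks))

def selective_contrastive_loss(gold_logps, neg_logps, gold_toks, neg_toks,
                               margin_threshold=1.0):
    """Gated contrastive objective (hypothetical sketch).

    Only tokens after the first divergence contribute (loss masking),
    and the loss is zeroed when the preference margin is already wide
    (the selective gate), so updates target weakly separated cases.
    """
    d = first_divergence(gold_toks, neg_toks)
    # Sequence log-probs restricted to the post-divergence window.
    lp_gold = sum(gold_logps[d:])
    lp_neg = sum(neg_logps[d:])
    margin = lp_gold - lp_neg
    if margin >= margin_threshold:
        return 0.0  # gate: pair already separated, skip the update
    # Logistic preference loss on the margin (DPO-style, beta = 1).
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Wide margin -> gated out; weak margin -> nonzero loss.
toks_g, toks_n = ["The", "capital", "is", "Paris"], ["The", "capital", "is", "Lyon"]
print(selective_contrastive_loss([-0.1, -0.2, -0.1, -0.3],
                                 [-0.1, -0.2, -0.1, -2.0], toks_g, toks_n))  # 0.0
print(selective_contrastive_loss([-0.1, -0.2, -0.1, -0.3],
                                 [-0.1, -0.2, -0.1, -0.5], toks_g, toks_n) > 0)
```

In a real trainer the log-probs would come from the training model's forward pass over both continuations, and the gate would simply skip the backward pass for separated pairs.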
// ANALYSIS
Strong idea, because it turns hallucination reduction into a targeted margin problem instead of brute-force full-data alignment.
- The self-generated negative sample is the right kind of hard negative for this problem: it is model-produced, task-matched, and cheap to obtain.
- The selective gate is the main efficiency win; if the reported results hold, it avoids wasting updates on already-separated cases.
- The approach is conceptually close to preference optimization, but with a more surgical loss window after first divergence.
- Main caveat: the evidence here is still self-reported project-level validation, so reproducibility and benchmark breadth matter more than the headline gain.
// TAGS
hallucination-mitigation · llm · contrastive-learning · preference-optimization · post-training · factuality · open-source
DISCOVERED
4h ago
2026-04-24
PUBLISHED
5h ago
2026-04-24
RELEVANCE
8/10
AUTHOR
Round_Apple2573