ACM survey maps why AI-text detection keeps breaking
LOBSTERS · 40d ago · NEWS

This Communications of the ACM piece surveys the state of LLM-text detection, covering black-box classifiers, white-box watermarking, benchmark datasets, and adversarial attacks. Its core message is that headline accuracy numbers hide fragile real-world performance, especially under paraphrasing attacks, under dataset bias, and when false positives must be kept very low.

// ANALYSIS

Detection is maturing as a research field, but this survey makes clear that deployment-grade reliability still lags model progress.

  • Black-box detectors can perform well in-domain, yet often overfit artifacts in curated datasets and generalize poorly.
  • White-box watermarking improves provenance tracing but introduces tradeoffs in text quality and can be attacked with adaptive querying.
  • Paraphrasing remains a practical evasion path, showing that many current detectors are brittle against low-cost adversarial edits.
  • The authors argue evaluation should emphasize true-positive rates at very low false-positive rates, not just aggregate AUC/accuracy.
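The last point can be made concrete with a small sketch. It is not taken from the survey: the function names `tpr_at_fpr` and `auc`, the convention that a higher score means "more likely AI", and the synthetic Gaussian scores are all assumptions chosen for illustration. It computes aggregate AUC next to the true-positive rate at a fixed false-positive budget, using only the Python standard library.

```python
import random
from bisect import bisect_left, bisect_right


def tpr_at_fpr(human_scores, ai_scores, max_fpr):
    """Share of AI texts flagged when at most max_fpr of human texts may be flagged.

    Assumption: a text is flagged when its score is strictly above the threshold.
    """
    negatives = sorted(human_scores, reverse=True)
    allowed = int(max_fpr * len(negatives))  # human texts we may wrongly flag
    # Use the (allowed + 1)-th highest human score as the threshold, so that
    # no more than `allowed` human texts score strictly above it.
    threshold = negatives[allowed] if allowed < len(negatives) else float("-inf")
    flagged = sum(1 for s in ai_scores if s > threshold)
    return flagged / len(ai_scores)


def auc(human_scores, ai_scores):
    """Probability that a random AI text outscores a random human text (ties count half)."""
    ordered = sorted(human_scores)
    total = 0.0
    for s in ai_scores:
        below = bisect_left(ordered, s)
        ties = bisect_right(ordered, s) - below
        total += below + 0.5 * ties
    return total / (len(ai_scores) * len(ordered))


if __name__ == "__main__":
    # Synthetic detector scores (hypothetical data, not from any benchmark).
    rng = random.Random(0)
    human = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    ai = [rng.gauss(2.0, 1.0) for _ in range(1000)]
    print("AUC:", round(auc(human, ai), 3))
    for budget in (0.05, 0.01, 0.001):
        print("TPR at FPR <=", budget, ":", round(tpr_at_fpr(human, ai, budget), 3))
```

With score distributions that overlap like these, the AUC can look strong while the detector catches far fewer AI texts once the false-positive budget is tightened, which is the gap the bullet describes.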
// TAGS
llm · research · safety · ai-coding · the-science-of-detecting-llm-generated-text

DISCOVERED

40d ago

2026-03-03

PUBLISHED

42d ago

2026-02-28

RELEVANCE

8/10