ScreenLeak nears frontier on PII redaction
ScreenLeak benchmarks a local redaction stack for computer-use data, pairing a 278 MB text model with a small image detector and a trace-level leakage eval. The key claim is speed plus privacy: its text model runs offline at 9 ms p50 on CPU while landing near frontier APIs on synthetic PII-removal tests.
This is a solid niche benchmark, not just a flashy model claim. The important takeaway is that privacy redaction for screen telemetry looks like a solvable systems problem, but the hardest part is still behavior, not detection.
- –The text redaction result matters because it compares against desktop-telemetry PII, where generic DLP tools and regex baselines look weak
- –The image side reinforces a familiar pattern: frontier multimodal models can spot sensitive content, but specialized small detectors are better at tight localization
- –The trace benchmark is the caution flag: recognizing PII does not mean an agent will withhold it when summarizing what it saw
- –The strongest caveat is methodology: synthetic, in-distribution validation is useful, but it is still an upper bound rather than proof on messy real-world desktops
- –The Reddit response already shows the credibility test the project will face: people want reproducible weights or a Hugging Face link, not just benchmark charts
DISCOVERED
4h ago
2026-05-26
PUBLISHED
12h ago
2026-05-26
RELEVANCE
AUTHOR
louis3195