OpenAI Privacy Filter Tops GLiNER on PII Eval
OPEN_SOURCE · REDDIT · 1d ago · BENCHMARK RESULT


A 600-sample PII benchmark suggests OpenAI Privacy Filter outperforms GLiNER under boundary-overlap scoring, even though its strict exact-match score looks much worse because tokenizer offsets shift predicted spans by one character. The model is also faster on CPU, but its accuracy advantage only shows up if you score spans more forgivingly.

// ANALYSIS

The big takeaway is that this is less a model-vs-model knockout than a reminder that eval methodology can swamp the headline numbers. If you score character spans the wrong way, you can make a strong redaction model look broken.
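The scoring difference is easy to see in miniature. Below is a minimal sketch (the spans and function names are hypothetical, not from the benchmark): a gold PII span versus a prediction shifted by one character, the kind of drift that tokenizer offset reconstruction can introduce. Strict exact match counts it as a miss; boundary-overlap scoring credits it.

```python
def exact_match(gold, pred):
    """Strict scoring: character offsets must be identical."""
    return gold == pred

def boundary_overlap(gold, pred, min_iou=0.5):
    """Forgiving scoring: credit the prediction if the character-span
    intersection-over-union meets a threshold."""
    (gs, ge), (ps, pe) = gold, pred
    inter = max(0, min(ge, pe) - max(gs, ps))       # overlapping characters
    union = (ge - gs) + (pe - ps) - inter           # total covered characters
    return union > 0 and inter / union >= min_iou

gold = (10, 25)   # gold span, e.g. an email address
pred = (11, 26)   # same entity, shifted one character by offset reconstruction

print(exact_match(gold, pred))       # False: strict scoring records a miss
print(boundary_overlap(gold, pred))  # True: IoU is 0.875, clearly the same entity
```

Under strict scoring, every such off-by-one span is a simultaneous false negative and false positive, which is how a model that finds all the entities can still post a poor headline number.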

  • On strict exact match, GLiNER looks better; on boundary overlap, OpenAI Privacy Filter wins overall, which is the metric that better matches redaction workflows.
  • The reported gap is driven by tokenization and span reconstruction, not by the model failing to find the entities in the first place.
  • GLiNER still has a real advantage for custom schemas because you can pass arbitrary entity types at inference.
  • OpenAI Privacy Filter’s CPU throughput is materially better here, so it is the more practical option when you need fast local redaction at scale.
  • The threshold sweep matters: GLiNER at 0.7 beats the default 0.5, which means out-of-the-box comparisons can be misleading.
// TAGS
openai-privacy-filter · gliner · evaluation · benchmark · open-weights · moe · local-first · inference · llm

DISCOVERED

2026-05-01 (1d ago)

PUBLISHED

2026-05-01 (1d ago)

RELEVANCE

8/10

AUTHOR

gvij