OPEN_SOURCE
REDDIT // 20d ago // BENCHMARK RESULT
SEFR enables fast SQL-only classification in BigQuery
Hamidreza Keshavarz's walkthrough shows how the SEFR classifier can be expressed as a single BigQuery SQL query for training and scoring. On fraud benchmarks, the SQL version trails logistic regression on AUC but is 18x faster because it avoids iterative optimization.
// ANALYSIS
Hot take: this is less about inventing a better classifier and more about matching the math to the warehouse engine.
- The real win is operational: one query, no persisted model object, and no separate ML pipeline to manage.
- SEFR's closed-form training maps cleanly to aggregations and joins, so BigQuery can spread the work across many slots.
- Logistic regression still has the better ranking quality, so SEFR reads best as a fast baseline or ELT-native option, not a universal replacement.
- It is strongest where auditability and simplicity matter more than squeezing out the last few AUC points, and weakest on nonlinear or multiclass problems.
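To make the "closed-form training maps to aggregations" point concrete, here is a minimal Python sketch of SEFR as described in the published algorithm. It is not the walkthrough's SQL; it is a stand-in showing that training needs only per-class averages. The function names `sefr_fit` and `sefr_predict`, the `eps` smoothing constant, and the assumption of non-negative features with 0/1 labels are illustrative choices, not taken from the post.

```python
from statistics import fmean


def sefr_fit(rows, labels, eps=1e-7):
    """Fit a binary SEFR model; assumes non-negative features and 0/1 labels."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    n_features = len(rows[0])

    # Pass 1: per-class feature means (in SQL terms, AVG(...) grouped by label).
    avg_pos = [fmean([r[j] for r in pos]) for j in range(n_features)]
    avg_neg = [fmean([r[j] for r in neg]) for j in range(n_features)]

    # Closed-form weights: normalized difference of class means; eps avoids 0/0.
    weights = [(p - n) / (p + n + eps) for p, n in zip(avg_pos, avg_neg)]

    def score(row):
        return sum(w * x for w, x in zip(weights, row))

    # Pass 2: per-class mean score (again a grouped average).
    pos_score = fmean([score(r) for r in pos])
    neg_score = fmean([score(r) for r in neg])

    # Bias sits between the class mean scores, weighted by the opposite class size.
    bias = -(len(neg) * pos_score + len(pos) * neg_score) / (len(pos) + len(neg))
    return weights, bias


def sefr_predict(row, weights, bias):
    """Return 1 when the linear score plus bias is positive, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, row)) + bias > 0 else 0
```

Both passes are plain grouped averages with no gradient loop, which is why the same computation can plausibly be written as aggregations and a join in a warehouse query.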
// TAGS
bigquery, sql, sefr, classifier, benchmark, fraud-detection, logistic-regression, data-engineering
DISCOVERED
20d ago
2026-03-23
PUBLISHED
20d ago
2026-03-22
RELEVANCE
8/10
AUTHOR
CriticalofReviewer2