XGBoost retraining, fine-tuning debate tackles clickstream drift
A Reddit ML thread asks whether a daily e-commerce clickstream stack should retrain XGBoost models from scratch or keep extending them with fresh data. The practical answer hinges less on terminology than on drift, validation windows, and which parts of the system are truly online.
This is really a production-ML question disguised as a training-method question. For XGBoost, rolling retrains are usually the safer default; for the bandit layer, incremental updates still make sense.
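As a rough illustration of why the bandit layer can stay incremental, here is a minimal Beta-Bernoulli Thompson sampling sketch. The arm names, click-through rates, and the `simulate` helper are hypothetical, not taken from the thread. Each click or non-click updates one arm's posterior in place, so nothing is ever refit from scratch.

```python
import random


class BetaBernoulliArm:
    """One arm with a Beta(alpha, beta) posterior over its click rate."""

    def __init__(self):
        self.alpha = 1.0  # uniform prior: one pseudo-click
        self.beta = 1.0   # and one pseudo-skip

    def sample(self, rng):
        # Draw a plausible click rate from the current posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, clicked):
        # Incremental update: a single observation adjusts one counter.
        if clicked:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def choose_arm(arms, rng):
    """Thompson sampling: sample every posterior, play the largest draw."""
    draws = {name: arm.sample(rng) for name, arm in arms.items()}
    return max(draws, key=draws.get)


def simulate(true_ctr, rounds, seed=0):
    """Hypothetical simulation loop; true_ctr maps arm name to click rate."""
    rng = random.Random(seed)
    arms = {name: BetaBernoulliArm() for name in true_ctr}
    pulls = {name: 0 for name in true_ctr}
    for _ in range(rounds):
        name = choose_arm(arms, rng)
        clicked = rng.random() < true_ctr[name]
        arms[name].update(clicked)
        pulls[name] += 1
    return arms, pulls


if __name__ == "__main__":
    arms, pulls = simulate({"banner_a": 0.05, "banner_b": 0.50}, rounds=2000)
    print(pulls)
```

The contrast with a tree ensemble is that the whole model state here is two counters per arm, which is why per-event updates are cheap and easy to audit.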
- XGBoost does support training continuation, but daily tree-stacking can make versioning and drift diagnosis messy.
- The 30/90/180-day weighting already bakes in recency; the real work is backtesting window sizes against recent holdouts.
- Retrain on schedule or when drift and performance metrics slip, not just because new data landed.
- Keep Thompson sampling or LinUCB incremental, since those methods are built to absorb feedback online.
- Transfer learning is mostly the wrong mental model here; this is about retrain cadence and monitoring, not model reuse in the neural-net sense.
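The "retrain on schedule or when metrics slip" rule can be sketched as a small gate that runs before each training job. Everything specific here is an assumption for illustration: the Population Stability Index (PSI) as the drift measure, AUC as the performance metric, and the thresholds (7 days, PSI 0.2, AUC drop 0.02). The thread does not specify any of them.

```python
import math
from datetime import date


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature.

    Bin edges come from the reference (expected) sample; values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty bins at eps so the log term stays finite.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def should_retrain(today, last_trained, psi_value, baseline_auc, recent_auc,
                   max_age_days=7, psi_limit=0.2, max_auc_drop=0.02):
    """Return the list of reasons to retrain; an empty list means skip."""
    reasons = []
    if (today - last_trained).days >= max_age_days:
        reasons.append("schedule")
    if psi_value > psi_limit:
        reasons.append("drift")
    if baseline_auc - recent_auc > max_auc_drop:
        reasons.append("performance")
    return reasons


if __name__ == "__main__":
    reference = [i / 100 for i in range(100)]        # training-window feature
    recent = [0.5 + i / 200 for i in range(100)]     # shifted recent traffic
    print(should_retrain(date(2026, 3, 28), date(2026, 3, 27),
                         psi(reference, recent), 0.80, 0.79))
```

Returning the reasons rather than a bare boolean keeps a record of why each model version exists, which helps with the versioning and drift-diagnosis concerns raised above.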
DISCOVERED
60d ago
2026-03-28
PUBLISHED
62d ago
2026-03-27
AUTHOR
Bluem00n1o1