OPEN_SOURCE ↗
REDDIT · 2d ago · BENCHMARK RESULT
TurboQuant Pro compresses BGE-M3 with PCA
TurboQuant Pro shows that a one-time PCA rotation can make non-Matryoshka embeddings far more truncation-friendly, with BGE-M3 cosine staying near-perfect at 512d and still strong at 128d. The project also benchmarks PCA-plus-quantization against scalar, binary, and PQ compression on a multilingual retrieval corpus.
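A minimal sketch of the core idea, using numpy only (this is not TurboQuant Pro code; the synthetic data, dimensions, and PCA-via-SVD fit are all assumptions standing in for real BGE-M3 embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 256, 4000, 32

# Synthetic stand-in for non-Matryoshka embeddings: energy decays across
# latent axes, but a random rotation spreads it over all output coordinates,
# so the tail coordinates still matter and naive truncation loses signal.
scales = 1.0 / np.arange(1, d + 1) ** 0.5
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
X = (rng.normal(size=(n, d)) * scales) @ Q.T
X /= np.linalg.norm(X, axis=1, keepdims=True)

# One-time PCA fit: center, take the right singular vectors as the new basis.
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)

def cos_after_truncation(X, k, basis=None):
    """Mean cosine between each vector and its k-dim approximation."""
    if basis is None:
        # Naive tail truncation: keep the first k output coordinates.
        Xhat = np.concatenate(
            [X[:, :k], np.zeros((len(X), X.shape[1] - k))], axis=1
        )
    else:
        # PCA truncation: project onto the top-k principal components.
        Xhat = mu + (X - mu) @ basis[:k].T @ basis[:k]
    num = (X * Xhat).sum(axis=1)
    den = np.linalg.norm(X, axis=1) * np.linalg.norm(Xhat, axis=1)
    return float((num / den).mean())

naive = cos_after_truncation(X, k)
pca = cos_after_truncation(X, k, basis=Vt)
print(f"naive truncation cosine @ {k}d: {naive:.3f}")
print(f"PCA truncation cosine   @ {k}d: {pca:.3f}")
```

On data like this, with a decaying spectrum hidden behind a rotation, the PCA basis recovers most of the cosine that naive truncation throws away, which mirrors the 512d/128d behavior the project reports for BGE-M3.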
// ANALYSIS
This is a strong reminder that embedding compression is often a basis-selection problem, not just a bit-budget problem. PCA is not a retrieval silver bullet, but it looks like a very solid linear baseline before moving to heavier codecs.
- Naive tail truncation is a weak baseline for non-Matryoshka models; a PCA rotation concentrates signal into the retained dimensions.
- Cosine reconstruction alone is not enough for retrieval decisions: in the reported runs, Recall@10 degrades faster than cosine similarity.
- PCA plus low-bit quantization looks like a practical middle ground between plain scalar quantization and more aggressive binary or PQ compression.
- The open questions are stability and generality: how much the PCA fit varies with corpus and language mix, and how it interacts with downstream ANN index behavior.
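The PCA-plus-quantization middle ground can be sketched end to end, including the Recall@10 check that cosine alone hides. This is a hypothetical pipeline on synthetic data, not the project's benchmark: PCA to k dims, symmetric int8 scalar quantization of the corpus, float queries (asymmetric distance), recall measured against exact full-precision search:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, n_db, n_q = 256, 64, 5000, 100

# Synthetic corpus and query embeddings (stand-ins, not TurboQuant data).
scales = 1.0 / np.arange(1, d + 1) ** 0.5
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
db = (rng.normal(size=(n_db, d)) * scales) @ Q.T
qs = (rng.normal(size=(n_q, d)) * scales) @ Q.T
db /= np.linalg.norm(db, axis=1, keepdims=True)
qs /= np.linalg.norm(qs, axis=1, keepdims=True)

# One-time PCA fit on the corpus; keep the top-k components.
mu = db.mean(axis=0)
_, _, Vt = np.linalg.svd(db - mu, full_matrices=False)
P = Vt[:k]

def encode(X):
    """Rotate into the PCA basis, truncate, then int8-quantize per dimension."""
    Z = (X - mu) @ P.T
    s = np.abs(Z).max(axis=0) / 127.0 + 1e-12  # per-dim scale from the data
    return np.round(Z / s).astype(np.int8), s

def recall_at_10(approx_scores, exact_scores):
    """Fraction of the exact top-10 ids recovered by the approximate top-10."""
    hits = 0
    for a, e in zip(approx_scores, exact_scores):
        hits += len(set(np.argsort(-a)[:10]) & set(np.argsort(-e)[:10]))
    return hits / (10 * len(exact_scores))

exact = qs @ db.T                      # full-precision cosine scores
db_q, s = encode(db)
zq = (qs - mu) @ P.T                   # queries stay float (asymmetric search)
approx = zq @ (db_q.astype(np.float32) * s).T
r10 = recall_at_10(approx, exact)
print(f"Recall@10 (PCA-{k}d + int8 vs full float): {r10:.3f}")
```

Running a harness like this per corpus and language mix is exactly how the stability question in the last bullet would be answered: the PCA rotation is fit once, so any distribution shift shows up directly as a Recall@10 drop.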
// TAGS
turboquant-pro · embedding · quantization · benchmark · vector-db · open-source
DISCOVERED
2026-04-09
PUBLISHED
2026-04-09
RELEVANCE
8/10
AUTHOR
ahbond