LaBSE Tops Armenian Retrieval, OpenAI Wins EN-RU
OPEN_SOURCE ↗
REDDIT // 4h ago · BENCHMARK RESULT

A benchmark of 19 embedding runs across 18 checkpoints, on 245 trilingual EPG/title triplets and 783 abbreviation pairs, found that LaBSE beat paid APIs on Armenian cross-lingual retrieval. OpenAI text-embedding-3-large led EN↔RU but dropped sharply on Armenian, suggesting that retrieval metrics, not cosine alignment, are the numbers to trust.

// ANALYSIS

The blunt takeaway: for low-resource, non-Latin scripts, the “best” embedding model is the one trained for retrieval on multilingual parallel data, not the newest or most expensive API.

  • LaBSE ranked #1 on retrieval with R@1 0.834 and MRR 0.864, ahead of all paid APIs in the benchmark.
  • OpenAI `text-embedding-3-large` did best on EN↔RU but fell to R@1 0.210 on EN↔HY and RU↔HY, showing poor transfer to Armenian.
  • The post’s strongest point is the alignment-vs-retrieval split: some models look good on mean cosine yet fail at actual nearest-neighbor selection.
  • `e5-large` and `e5-large-v2` are presented as “monolingual traps” for this use case, with inflated cosine and weak retrieval.
  • Cohere `embed-v4.0` is reported to regress versus `embed-multilingual-v3.0` on this task, which is a useful warning against blind model upgrades.
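The alignment-vs-retrieval split in the bullets can be made concrete: mean cosine over gold pairs only measures how close each source sits to its own translation, while R@1/MRR measure whether that translation beats every other candidate. A minimal sketch with toy vectors (not the benchmark's data; `retrieval_metrics` and the collapsed-space example are illustrative, not from the post):

```python
import numpy as np

def retrieval_metrics(src: np.ndarray, tgt: np.ndarray) -> dict:
    """Score src[i] -> tgt[i] matching by cosine nearest neighbor."""
    sn = src / np.linalg.norm(src, axis=1, keepdims=True)
    tn = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sims = sn @ tn.T  # full cosine similarity matrix, queries x candidates
    order = np.argsort(-sims, axis=1, kind="stable")  # best candidate first
    # 1-based rank of the gold target for each query
    rank = np.array([int(np.where(order[i] == i)[0][0]) + 1
                     for i in range(len(src))])
    return {
        "mean_cosine": float(np.diag(sims).mean()),  # alignment of gold pairs
        "R@1": float((rank == 1).mean()),
        "MRR": float((1.0 / rank).mean()),
    }

# Healthy space: distinct, well-separated directions -> perfect retrieval.
good = np.eye(4)
print(retrieval_metrics(good, good))  # R@1 = 1.0, MRR = 1.0

# Collapsed space: every "translation" lands near one shared direction.
# Gold-pair cosine stays ~0.997, yet nearest-neighbor search can no longer
# tell the candidates apart.
v = np.full(4, 0.5)
collapsed_src = v + 0.1 * np.eye(4)
collapsed_tgt = np.tile(v, (4, 1))
print(retrieval_metrics(collapsed_src, collapsed_tgt))
```

The collapsed case is the "monolingual trap" pattern in miniature: mean cosine stays above 0.99 while R@1 falls to chance, so a model can look well-aligned on average and still fail at actual nearest-neighbor selection.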
// TAGS
embeddings · multilingual · cross-lingual-retrieval · low-resource-language · armenian · epg · open-source · openai · cohere · sentence-transformers

DISCOVERED

4h ago

2026-04-24

PUBLISHED

7h ago

2026-04-24

RELEVANCE

9 / 10

AUTHOR

FigAltruistic2086