Talkie Trains 13B Model on Pre-1931 Text
Talkie is a 13B "vintage" language model released in April 2026 and trained on 260B tokens of pre-1931 English, with both a base model and an instruction-tuned chat version. The project is meant to study how era-frozen models preserve historical knowledge, style, and bias, and how they behave when free of modern-data contamination.
This is more research probe than product launch, but it is a strong one: by cutting training off at 1930, Talkie gives researchers a clean way to measure what modern web data changes in an LLM.
- The cutoff makes the model useful for studying temporal knowledge, anachronism, and what a model "knows" when the future is removed.
- Historical OCR noise and leakage detection are the hard problems here, and they matter as much as scale for whether the experiment is trustworthy.
- The instruction-tuned checkpoint is interesting because it tries to preserve period-appropriate voice without falling back to modern assistant habits.
- Its real value is comparative research against modern twins, not frontier benchmark performance.
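To make the leakage-detection point concrete, here is a minimal sketch of one crude heuristic: flag any document that mentions a four-digit year after the cutoff. This is an illustration only, not Talkie's actual pipeline, which the summary does not describe. The names `post_cutoff_years` and `flag_leaks`, and the document IDs, are hypothetical.

```python
import re

CUTOFF_YEAR = 1930  # from the summary: training text is pre-1931

# Standalone four-digit numbers in the range 1000-2099 (assumed to be years).
YEAR_PATTERN = re.compile(r"\b(1\d{3}|20\d{2})\b")


def post_cutoff_years(text, cutoff=CUTOFF_YEAR):
    """Return the sorted, de-duplicated years in `text` later than `cutoff`."""
    years = {int(match) for match in YEAR_PATTERN.findall(text)}
    return sorted(year for year in years if year > cutoff)


def flag_leaks(docs, cutoff=CUTOFF_YEAR):
    """Map document id -> post-cutoff years found; clean documents are omitted."""
    flagged = {}
    for doc_id, text in docs.items():
        years = post_cutoff_years(text, cutoff)
        if years:
            flagged[doc_id] = years
    return flagged


# Hypothetical sample corpus for illustration.
sample = {
    "almanac": "The winter of 1899 was severe.",
    "reprint": "First printed in 1925, reissued 1954 and 2003.",
}
suspects = flag_leaks(sample)
```

A filter like this would only be a first pass. OCR can misread digits, and any four-digit number (a page count, a price) can look like a year, so flagged documents would still need review. Modern text with no dates in it passes through untouched.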
DISCOVERED: 2026-04-28
PUBLISHED: 2026-04-28
AUTHOR: VolumeTechnician