OPEN_SOURCE
REDDIT // 18d ago · INFRASTRUCTURE
Together AI, Nebius drop serverless LoRA support
A LocalLLaMA user reports that Together AI and Nebius Token Factory are no longer viable for infrequent serverless hosting of custom LoRA adapters on Llama-3.1-8B-Instruct. Replies point to RunPod as the nearest serverless replacement; OpenAI would mean switching to a different fine-tuning path rather than reusing the same adapters.
// ANALYSIS
Serverless LoRA is starting to look like a moving target, which is the wrong kind of uncertainty for production traffic. If the adapter matters, portability and runtime control matter more than the vendor brand.
- RunPod's serverless vLLM workers are OpenAI-compatible and explicitly support `LORA_MODULES`, so it fits the exact use case (see the sketch after this list).
- Its container-based serverless model is easier to keep under your control than a vendor-managed LoRA SKU.
- OpenAI is a different lane: fine-tuning is useful, but it does not preserve the same bring-your-own-adapter workflow.
- The broader lesson is to treat niche inference features as ephemeral and keep an escape hatch.
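A minimal sketch of what the RunPod path looks like from the client side, assuming a serverless vLLM worker deployed with `LORA_MODULES` pointing at the custom adapter. The endpoint ID, adapter name, and the exact env-var syntax in the comments are hypothetical placeholders, not confirmed by the thread:

```python
import os
from openai import OpenAI

# The worker itself would be deployed with env vars along these lines
# (exact LORA_MODULES syntax is an assumption -- check the worker's docs):
#   MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct
#   LORA_MODULES=[{"name": "my-adapter", "path": "your-org/your-adapter"}]

client = OpenAI(
    # <ENDPOINT_ID> is a placeholder for your RunPod serverless endpoint ID.
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
    api_key=os.environ["RUNPOD_API_KEY"],
)

# vLLM's OpenAI-compatible server routes the request to the LoRA module
# registered under this name, layered on top of the base model.
resp = client.chat.completions.create(
    model="my-adapter",  # hypothetical adapter name from LORA_MODULES
    messages=[{"role": "user", "content": "Summarize this ticket."}],
)
print(resp.choices[0].message.content)
```

Because the surface is plain OpenAI-compatible HTTP, the escape hatch is the `base_url`: the same client code can point at any other vLLM host, self-hosted or managed, without touching the adapter.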
// TAGS
together-ai · nebius-token-factory · runpod · openai · inference · fine-tuning · cloud · llm
DISCOVERED
2026-03-25 (18d ago)
PUBLISHED
2026-03-25 (18d ago)
RELEVANCE
8/10
AUTHOR
New-Spell9053