Together AI, Nebius drop serverless LoRA support
OPEN_SOURCE
REDDIT · 18d ago · INFRASTRUCTURE


A LocalLLaMA user reports that Together AI and Nebius Token Factory are no longer viable for hosting infrequently used custom LoRA adapters on Llama-3.1-8B-Instruct. Replies point to RunPod as the nearest serverless replacement, while moving to OpenAI would mean switching to a different fine-tuning path rather than keeping the same adapters.

// ANALYSIS

Serverless LoRA is starting to look like a moving target, which is the wrong kind of uncertainty for production traffic. If the adapter matters, portability and runtime control matter more than the vendor brand.

  • RunPod's serverless vLLM workers are OpenAI-compatible and explicitly support `LORA_MODULES`, so it fits the exact use case.
  • Its container-based serverless model is easier to keep under your control than a vendor-managed LoRA SKU.
  • OpenAI is a different lane: fine-tuning is useful, but it does not preserve the same bring-your-own-adapter workflow.
  • The broader lesson is to treat niche inference features as ephemeral and keep an escape hatch.
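To make the portability point concrete, here is a minimal sketch of how a LoRA adapter can be kept under your own control on a serverless vLLM worker: the adapter is registered via the `LORA_MODULES` environment variable, and requests then select it by name in the `model` field of an OpenAI-compatible request. The adapter name and weights path below are placeholders, not real artifacts.

```python
import json

# Hypothetical adapter registration for a serverless vLLM worker.
# LORA_MODULES is a JSON object naming the adapter and pointing at its
# weights; both values here are illustrative placeholders.
lora_modules = json.dumps(
    {"name": "my-llama31-lora", "path": "someuser/my-llama31-lora-weights"}
)

# Worker environment (sketch): the base model plus the adapter to load.
env = {
    "MODEL_NAME": "meta-llama/Llama-3.1-8B-Instruct",
    "LORA_MODULES": lora_modules,
}

# An OpenAI-compatible chat request then selects the adapter by name
# in the `model` field instead of the base model.
request_body = {
    "model": "my-llama31-lora",
    "messages": [{"role": "user", "content": "Hello"}],
}

print(env["LORA_MODULES"])
print(request_body["model"])
```

Because the request shape is plain OpenAI-compatible JSON and the adapter lives in a registry you control, the same payload can be pointed at any runtime that serves vLLM with LoRA support, which is the escape hatch the bullets above argue for.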
// TAGS
together-ai · nebius-token-factory · runpod · openai · inference · fine-tuning · cloud · llm

DISCOVERED

2026-03-25

PUBLISHED

2026-03-25

RELEVANCE

8/10

AUTHOR

New-Spell9053