OPEN_SOURCE
REDDIT // 18d ago · INFRASTRUCTURE
Together AI, Nebius drop serverless LoRA support
A LocalLLaMA user reports that Together AI and Nebius Token Factory are no longer viable for infrequent serverless hosting of custom LoRA adapters on Llama-3.1-8B-Instruct. Replies point to RunPod as the nearest serverless replacement; OpenAI would mean switching to a different fine-tuning path rather than reusing the same adapters.
// ANALYSIS
Serverless LoRA is starting to look like a moving target, which is the wrong kind of uncertainty for production traffic. If the adapter matters, portability and runtime control matter more than the vendor brand.
- RunPod's serverless vLLM workers are OpenAI-compatible and explicitly support `LORA_MODULES`, so it fits the exact use case (see the sketch after this list).
- Its container-based serverless model is easier to keep under your control than a vendor-managed LoRA SKU.
- OpenAI is a different lane: fine-tuning is useful, but it does not preserve the same bring-your-own-adapter workflow.
- The broader lesson is to treat niche inference features as ephemeral and keep an escape hatch.
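A minimal sketch of what the RunPod path looks like from the client side, assuming a serverless vLLM worker deployed with `LORA_MODULES` pointing at the custom adapter. The endpoint ID, adapter name, and the exact env-var syntax in the comments are hypothetical placeholders, not confirmed by the thread:

```python
import os
from openai import OpenAI

# The worker itself would be deployed with env vars along these lines
# (exact LORA_MODULES syntax is an assumption -- check the worker's docs):
#   MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct
#   LORA_MODULES=[{"name": "my-adapter", "path": "your-org/your-adapter"}]

client = OpenAI(
    # <ENDPOINT_ID> is a placeholder for your RunPod serverless endpoint ID.
    base_url="https://api.runpod.ai/v2/<ENDPOINT_ID>/openai/v1",
    api_key=os.environ["RUNPOD_API_KEY"],
)

# vLLM's OpenAI-compatible server routes the request to the LoRA module
# registered under this name, layered on top of the base model.
resp = client.chat.completions.create(
    model="my-adapter",  # hypothetical adapter name from LORA_MODULES
    messages=[{"role": "user", "content": "Summarize this ticket."}],
)
print(resp.choices[0].message.content)
```

Because the surface is plain OpenAI-compatible HTTP, the escape hatch is the `base_url`: the same client code can point at any other vLLM host, self-hosted or managed, without touching the adapter.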
// TAGS
together-ai · nebius-token-factory · runpod · openai · inference · fine-tuning · cloud · llm
DISCOVERED
2026-03-25 (18d ago)
PUBLISHED
2026-03-25 (18d ago)
RELEVANCE
8/10
AUTHOR
New-Spell9053