OPEN_SOURCE ↗
REDDIT // 3h ago · INFRASTRUCTURE
I Is Not Singular builds 4 LoRA agents
A solo builder used qwen3:8b, llama.cpp multi-LoRA hot-swap, and per-agent QLoRA to make four local LLM agents diverge on a single 8 GB RTX 3070. The headline result: separate adapters preserved distinct personalities where a single shared LoRA had flattened them.
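The hot-swap piece maps fairly directly onto llama.cpp's server: recent llama-server builds accept multiple --lora adapters at startup and expose a /lora-adapters endpoint for adjusting their scales at runtime. A minimal sketch of per-agent routing that way follows; the adapter filenames, agent names, and port are assumptions, and the exact flags and endpoint behaviour depend on the llama.cpp version rather than on anything the post specifies.

```python
# Sketch: per-agent adapter hot-swap against a llama.cpp server.
# Assumes a CUDA-built llama-server started roughly like:
#   llama-server -m qwen3-8b-q4_k_m.gguf \
#     --lora agent_a.gguf --lora agent_b.gguf \
#     --lora agent_c.gguf --lora agent_d.gguf -ngl 99
# Adapter files and agent names here are hypothetical placeholders.
import requests

SERVER = "http://localhost:8080"
AGENT_ADAPTER_ID = {"alpha": 0, "beta": 1, "gamma": 2, "delta": 3}

def activate_agent(name: str) -> None:
    """Set the chosen agent's adapter scale to 1.0 and zero out the rest."""
    scales = [{"id": i, "scale": 1.0 if i == AGENT_ADAPTER_ID[name] else 0.0}
              for i in AGENT_ADAPTER_ID.values()]
    requests.post(f"{SERVER}/lora-adapters", json=scales).raise_for_status()

def ask(name: str, prompt: str) -> str:
    """Swap in one agent's persona adapter, then query the shared base model."""
    activate_agent(name)
    r = requests.post(f"{SERVER}/v1/chat/completions", json={
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.8,
    })
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

print(ask("alpha", "Introduce yourself in one sentence."))
```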
// ANALYSIS
This is a strong proof-of-concept for weight-level personalization on consumer hardware, not just prompt cosplay.
- Per-agent LoRA solved the "majority persona wins" problem that showed up with a single shared adapter
- The memory math is the real enabler: a Q4_K_M base plus four small adapters stays inside 8 GB, and training still fits (rough budget sketched after this list)
- The split between a persona LLM and shared inner modules is the most interesting part; it gives a cleaner model of identity than prompt stacks do
- The sleep-cycle retraining loop makes the agents' evolution observable and grounded in their own experience instead of external steering (loop sketched below)
- The practical stack notes matter too: CUDA-built llama.cpp, hot-swappable adapters, and multilingual cleanup are the difference between a demo and a working system
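To make the memory claim concrete, here is a back-of-the-envelope budget using Qwen3-8B-ish dimensions (36 layers, hidden size 4096, GQA with 8 KV heads of dim 128) and rank-16 fp16 adapters. All figures are illustrative assumptions, not measurements from the post, and they ignore compute buffers and OS overhead.

```python
# Rough VRAM budget for "Q4_K_M base + 4 adapters + KV cache" on an 8 GB card.
GB = 1024**3

base_weights = 8.2e9 * 4.5 / 8              # ~8B params at ~4.5 bits/weight (Q4_K_M)
n_layers, kv_heads, head_dim, ctx = 36, 8, 128, 8192
kv_cache = 2 * n_layers * kv_heads * head_dim * ctx * 2   # K+V, fp16 cache
lora_one = 2 * 4 * n_layers * 4096 * 16 * 2               # A+B, 4 proj/layer, r=16, fp16
adapters = 4 * lora_one                                    # four persona adapters

total = base_weights + kv_cache + adapters
print(f"base {base_weights/GB:.2f} GB, kv {kv_cache/GB:.2f} GB, "
      f"adapters {adapters/GB:.2f} GB, total {total/GB:.2f} GB")
# -> roughly 4.3 + 1.1 + 0.14 ≈ 5.6 GB, leaving headroom under 8 GB
```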
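The sleep-cycle idea, reduced to its skeleton: reload the base in 4-bit, attach one agent's LoRA, train on that agent's own recent transcripts, save the adapter for the next day. A hedged sketch with transformers/peft/bitsandbytes; the checkpoint name, dataset format, and hyperparameters are placeholders, since the post does not publish its exact pipeline.

```python
# Sketch of one "sleep cycle": QLoRA-retrain a single agent's adapter on its
# own conversation logs, then save it (conversion to GGUF for llama.cpp comes after).
import torch
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          TrainingArguments, Trainer, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

BASE = "Qwen/Qwen3-8B"                      # base checkpoint (assumed)
tok = AutoTokenizer.from_pretrained(BASE)

# 4-bit NF4 base so training fits alongside the adapter on 8 GB
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb,
                                             device_map="auto")
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"]))

# The agent's own transcripts from the last waking period (placeholder data)
logs = ["<transcript of a conversation this agent had today>"]
def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=1024)
ds = Dataset.from_dict({"text": logs}).map(tokenize, batched=True,
                                           remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sleep_ckpt", num_train_epochs=1,
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, learning_rate=1e-4,
                           bf16=True, logging_steps=10),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False))
trainer.train()
model.save_pretrained("adapters/agent_alpha")   # hypothetical output path
```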
// TAGS
i-is-not-singular · llm · agent · lora · fine-tuning · gpu · self-hosted
DISCOVERED
3h ago
2026-05-01
PUBLISHED
5h ago
2026-05-01
RELEVANCE
8/10
AUTHOR
Vivid-Usual237