Qwen3.5-35B-A3B turns photos into rough 3D scenes
A Reddit user demoed Qwen3.5-35B-A3B generating walkable HTML 3D scenes from photos using llama.cpp and a Q4 quant, then shared the results on YouTube. It is clearly an experimental community showcase rather than a product launch, but it highlights how far open multimodal models have pushed into lightweight spatial reasoning and scene reconstruction.
This is messy, impractical, and exactly the kind of hack that hints at where open models are headed next.
- –The interesting part is not visual polish but the model’s ability to infer depth, layout, and object placement from a single image
- –Running it through llama.cpp with a quantized 35B-class model makes the demo more notable for local AI builders than a cloud-only proof of concept would be
- –Outputting HTML 3D scenes instead of a proprietary format suggests a low-friction path for browser demos, agent environments, and synthetic scene prototyping
- –It is still far from production-ready, but it points toward multimodal models becoming useful front ends for lightweight 3D authoring workflows
DISCOVERED
91d ago
2026-03-10
PUBLISHED
95d ago
2026-03-06
RELEVANCE
AUTHOR
c64z86
