Stability AI releases Stable-Layers, a reinforcement learning framework that trains image layer decomposition models without paired supervision data using Flow-GRPO and VLM feedback.

// 45d ago RESEARCH PAPER

Stability AI has introduced Stable-Layers, a reinforcement learning framework for training image layer decomposition models without paired training datasets. Splitting a flat image into editable, multi-layer components has traditionally required intensive human annotation. Stable-Layers avoids this by using Flow-GRPO, an adaptation of Group Relative Policy Optimization (GRPO) to flow-matching models, to optimize the decomposition model directly. A Vision-Language Model (VLM) serves as the judge, and a structured scoring and grid-based calibration pipeline turns its assessments into reward signals. The reported result is less color bleed, fewer blank-layer artifacts, and cleaner semantic separation between layers.
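The group-relative step that gives GRPO its name can be illustrated with a minimal sketch. It assumes the standard GRPO normalization (each sample's reward minus the group mean, divided by the group standard deviation); the function name, the `eps` value, and the example scores are hypothetical and are not taken from Stable-Layers itself.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one sampled group (standard GRPO form).

    `rewards` holds one scalar per candidate generated from the same input.
    `eps` is an assumed small constant that avoids division by zero when
    every candidate receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical judge scores for four decompositions of the same flat image.
scores = [0.2, 0.5, 0.5, 0.8]
advantages = group_relative_advantages(scores)
print(advantages)
```

Candidates scored above the group mean receive a positive advantage and those below receive a negative one, so the policy update needs no learned value function and no ground-truth layers, only a way to score samples against each other.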

// ANALYSIS

Automated layer decomposition is directly useful for graphic design workflows, and Stable-Layers suggests that RL-based self-improvement with a VLM judge can replace costly paired datasets.

– **Data Bottleneck Solution:** Training models to generate editable RGBA layers without paired ground-truth data suggests that VLM-as-a-judge pipelines are viable for complex structural tasks.
– **Flow-GRPO Integration:** Applying GRPO's group-relative advantages to flow-matching models extends reinforcement learning techniques deeper into generative image pipelines.
– **Clever Reward Design:** Combining structured criteria with relative comparison grids counteracts the score compression and bias typical of standalone VLM scoring.
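One way to picture the reward design described above is the sketch below. It is an illustration under stated assumptions, not the Stable-Layers pipeline: the criterion names, the rank-to-score mapping, the blending `weight`, and all example numbers are hypothetical, and the VLM calls are replaced by hard-coded outputs.

```python
def structured_score(criteria):
    """Average per-criterion scores in [0, 1] into one absolute score."""
    return sum(criteria.values()) / len(criteria)


def rank_scores(ranking, n):
    """Map a best-to-worst ranking of candidate indices to scores in [0, 1]."""
    if n == 1:
        return [1.0]
    scores = [0.0] * n
    for position, index in enumerate(ranking):
        scores[index] = 1.0 - position / (n - 1)
    return scores


def calibrated_rewards(per_candidate_criteria, grid_ranking, weight=0.5):
    """Blend absolute criterion scores with a relative grid ranking.

    `weight` (assumed) controls how much the relative comparison counts.
    """
    absolute = [structured_score(c) for c in per_candidate_criteria]
    relative = rank_scores(grid_ranking, len(absolute))
    return [(1 - weight) * a + weight * r for a, r in zip(absolute, relative)]


# Hypothetical judge outputs: absolute scores cluster tightly (compression).
candidates = [
    {"no_color_bleed": 0.70, "no_blank_layers": 0.70, "semantic_separation": 0.70},
    {"no_color_bleed": 0.80, "no_blank_layers": 0.70, "semantic_separation": 0.70},
    {"no_color_bleed": 0.70, "no_blank_layers": 0.70, "semantic_separation": 0.75},
]
ranking = [1, 2, 0]  # hypothetical side-by-side grid verdict, best first
rewards = calibrated_rewards(candidates, ranking)
print(rewards)
```

In this toy case the absolute scores differ by only a few hundredths, while the blended rewards are spread much further apart, which is the kind of separation a group-relative optimizer needs to tell candidates apart.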

// TAGS

image-decomposition reinforcement-learning flow-grpo vision stability-ai computer-vision

DISCOVERED

45d ago

2026-06-07

PUBLISHED

45d ago

2026-06-07

RELEVANCE

8/10

AUTHOR

AI Search
