ICML 2026

RAIGen: Rare Attribute Identification in Text-to-Image Generative Models

1 CVSSP, University of Surrey  |  2 University of California  |  3 Pioneer Centre for AI, University of Copenhagen  |  4 Institute for People-Centred AI, University of Surrey
University of Surrey University of California University of Copenhagen ELLIS

TL;DR

Text-to-image diffusion models achieve impressive generation quality but inherit and amplify training-data biases, skewing coverage of semantic attributes. Prior work addresses this in two ways. Closed-set approaches mitigate biases in predefined fairness categories (e.g., gender, race), assuming socially salient minority attributes are known a priori. Open-set approaches frame the task as bias identification, highlighting majority attributes that dominate outputs. Both overlook a complementary task: uncovering rare or minority features underrepresented in the data distribution (social, cultural, or stylistic) yet still encoded in model representations. We introduce RAIGen, the first framework for unsupervised rare-attribute discovery in diffusion models. RAIGen leverages Matryoshka Sparse Autoencoders and a novel Minority Score combining neuron activation frequency with semantic distinctiveness to identify interpretable neurons whose top-activating images reveal underrepresented attributes. Experiments show RAIGen discovers attributes beyond fixed fairness categories in Stable Diffusion, scales to larger models such as SDXL, supports systematic auditing across architectures, and enables targeted amplification of rare attributes during generation.

The Problem RAIGen Solves

Standard text-to-image models overwhelmingly produce majority attribute combinations. Simply suppressing common outputs does not amplify rare ones — the probability mass redistributes among other majorities. RAIGen instead locates minority attributes that are already encoded inside the model but systematically suppressed during generation.
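The redistribution argument can be checked with a few lines of arithmetic. The sketch below assumes that suppression zeroes out one attribute and renormalizes the rest. This is a simplification for illustration, not a description of how any particular debiasing method works. The attribute names and frequencies are hypothetical.

```python
def suppress_and_renormalize(probs, suppressed):
    """Zero out the suppressed attributes and renormalize what remains."""
    kept = {k: (0.0 if k in suppressed else p) for k, p in probs.items()}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("cannot renormalize: every attribute was suppressed")
    return {k: p / total for k, p in kept.items()}

# Hypothetical attribute frequencies for a single prompt (illustrative only).
probs = {
    "white coat": 0.60,
    "scrubs": 0.30,
    "surgical mask": 0.08,
    "framed portrait": 0.02,
}

after = suppress_and_renormalize(probs, {"white coat"})
for name, p in after.items():
    print(f"{name}: {p:.2f}")
```

Under these assumptions the second most common attribute absorbs most of the freed mass, while the rarest attribute stays rare.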

RAIGen pipeline overview: typical outputs → hidden inside the model → RAIGen → discovered rare attributes
RAIGen uncovers the hidden long tail of rare, suppressed attributes encoded in diffusion models — without predefined categories or external world models.

What We Introduce

1. First Unsupervised Rare-Attribute Framework

RAIGen identifies minority features directly from diffusion model internals, with no predefined categories and no external VLMs required.

2. Matryoshka Sparse Autoencoders (MSAEs)

Hierarchical decomposition of diffusion representations into interpretable sparse features at multiple levels of granularity, from broad concepts to fine-grained details.

3. Minority Score

A novel signal combining activation rarity and semantic distinctiveness to automatically surface suppressed attributes across any architecture.
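As a rough illustration of the multi-granularity idea behind MSAEs, the sketch below scores nested prefixes of one sparse latent code by how well each prefix reconstructs the input on its own. The page does not specify the MSAE architecture or training loss, so the top-k sparsity rule, the prefix sizes, and all function names here are assumptions made for illustration.

```python
def top_k_sparsify(latents, k):
    """Keep the k largest activations and zero the rest (assumed sparsity rule)."""
    order = sorted(range(len(latents)), key=lambda i: latents[i], reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(latents)]

def decode_prefix(latents, decoder, m):
    """Reconstruct the input using only the first m latents."""
    dim = len(decoder[0])
    out = [0.0] * dim
    for i in range(m):
        for j in range(dim):
            out[j] += latents[i] * decoder[i][j]
    return out

def matryoshka_loss(x, latents, decoder, prefix_sizes):
    """Sum of squared reconstruction errors over the nested prefixes."""
    total = 0.0
    for m in prefix_sizes:
        recon = decode_prefix(latents, decoder, m)
        total += sum((a - b) ** 2 for a, b in zip(x, recon))
    return total

# Toy example: 4 latents, 3-dimensional input, hypothetical decoder rows.
x = [1.0, 2.0, 0.5]
decoder = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
sparse = top_k_sparsify([1.0, 2.0, 0.5, 0.1], k=3)
loss = matryoshka_loss(x, sparse, decoder, prefix_sizes=[1, 2, 4])
print(sparse, loss)
```

Because every prefix has to reconstruct the input by itself, early latents are pushed toward broad concepts and later ones toward finer detail, which is the behaviour the description above relies on.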

Minority Score
s(z) = d ⊙ (1 − ν)

Here ⊙ is the element-wise product, so each MSAE neuron z_i receives the score s_i = d_i (1 − ν_i), built from two complementary signals:

ν_i (activation frequency): the fraction of samples where neuron z_i fires. Low ν_i means a rarer feature.

d_i (semantic distinctiveness): cosine distance between the neuron's activation-weighted CLIP centroid and the global dataset centroid. High d_i means the feature is semantically far from the majority.

Neurons with high minority score are both infrequent and semantically separated from dominant patterns — hallmarks of genuine minority attributes.
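The score can be sketched in plain Python from the two definitions above. This is a minimal illustration, not the released implementation. It assumes non-negative activations, treats a neuron as firing when its activation exceeds a hypothetical `threshold`, uses 1 minus cosine similarity as the cosine distance, and takes the global centroid to be the unweighted mean embedding. The 2-D embeddings stand in for CLIP features.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def minority_scores(activations, embeddings, threshold=0.0):
    """activations[n][i]: activation of neuron i on sample n (assumed >= 0).
    embeddings[n]: image embedding of sample n (CLIP-style)."""
    n_samples = len(activations)
    n_neurons = len(activations[0])
    dim = len(embeddings[0])
    global_centroid = [sum(e[j] for e in embeddings) / n_samples for j in range(dim)]
    scores = []
    for i in range(n_neurons):
        acts = [row[i] for row in activations]
        nu = sum(a > threshold for a in acts) / n_samples  # activation frequency
        weight = sum(acts)
        if weight <= 0:  # dead neuron: no centroid to compare
            scores.append(0.0)
            continue
        centroid = [
            sum(a * e[j] for a, e in zip(acts, embeddings)) / weight
            for j in range(dim)
        ]
        d = cosine_distance(centroid, global_centroid)  # semantic distinctiveness
        scores.append(d * (1.0 - nu))
    return scores

# Toy data: neuron 0 fires on the three "majority" samples, neuron 1 on the outlier.
embeddings = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
activations = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
scores = minority_scores(activations, embeddings)
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranked)
```

In this toy setup the neuron that fires only on the outlier sample ranks first, since it is both infrequent and far from the global centroid.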

Discovered Rare Attributes

RAIGen reveals contextual, stylistic, interaction, and compositional rare attributes across prompts. Each image pair shows the generated image (top) and its MSAE activation heatmap (bottom), highlighting the spatial regions driving the minority neuron.

Rare attributes discovered for Doctor, Sheriff, and Writer prompts using SDXL. RAIGen surfaces attributes such as doctor in a framed portrait, sheriff on horseback, and writer with curly/afro-textured hair — each appearing in <20% of generated images.
Rare attributes discovered on COCO-style captions. RAIGen finds stylistically rare variants such as cartoon-faced luggage, front-facing train with smoke plumes, and snowboarder with sun overhead.

RAIGen Finds Genuinely Rare Attributes

Attribute Presence measures how often a discovered attribute appears in generated images; lower is rarer. RAIGen attributes appear in roughly 20% of images (0.194 to 0.220 across SD v1.4 and SDXL on WinoBias and COCO), compared with over 93% for OpenBias, confirming that our method surfaces genuinely underrepresented features.
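As a minimal sketch, Attribute Presence can be read as the fraction of generated images in which the attribute is judged present. How that judgment is made (human annotation or an automatic detector) is not specified here, so the boolean list below is a hypothetical input.

```python
def attribute_presence(detections):
    """Fraction of images in which the attribute was judged present."""
    if not detections:
        raise ValueError("need at least one image")
    return sum(bool(d) for d in detections) / len(detections)

# Hypothetical judgments for 10 generated images: the attribute appears in 2.
detections = [True, False, False, False, True, False, False, False, False, False]
presence = attribute_presence(detections)
print(f"{presence:.3f}")
```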

<20%: attribute presence for RAIGen discoveries
<3/10: human judges find RAIGen attributes (user study)
Attribute Presence ↓ (WinoBias & COCO)
Model    |  Approach  |  WinoBias  |  COCO
SD v1.4  |  OpenBias  |  0.941     |  0.933
SD v1.4  |  RAIGen    |  0.205     |  0.220
SDXL     |  OpenBias  |  0.941     |  0.933
SDXL     |  RAIGen    |  0.194     |  0.199
User Study: Human Presence ↓ (per profession)
Profession   |  Mean ↓  |  95% CI
Analyst      |  1.35    |  [1.03, 1.67]
CEO          |  0.70    |  [0.44, 0.96]
Doctor       |  1.18    |  [0.97, 1.39]
Salesperson  |  1.45    |  [0.99, 1.91]
Sheriff      |  2.64    |  [2.21, 3.07]
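The reported intervals are symmetric about their means, which is consistent with a mean ± margin construction. The sketch below shows one such construction, a normal-approximation interval (mean ± 1.96 standard errors). The page does not state how the intervals were computed, so this is an assumption, and the ratings are hypothetical per-participant counts, not study data.

```python
import math
import statistics

def mean_with_ci(values, z=1.96):
    """Sample mean and normal-approximation confidence interval."""
    if len(values) < 2:
        raise ValueError("need at least two ratings")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))  # standard error
    return mean, (mean - z * sem, mean + z * sem)

# Hypothetical ratings: number of images (assumed out of 10) in which
# each participant found the attribute.
ratings = [1, 0, 2, 1, 1, 3, 0, 1]
mean, (low, high) = mean_with_ci(ratings)
print(f"{mean:.2f} [{low:.2f}, {high:.2f}]")
```

With few participants, a t-distribution quantile in place of 1.96 would give a slightly wider interval.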

BibTeX

If you find our work useful, please cite:

@article{sreelatha2026raigen,
  title   = {RAIGen: Rare Attribute Identification in Text-to-Image Generative Models},
  author  = {Vadakkeeveetil Sreelatha, Silpa and Wang, Dan and Belongie, Serge and Awais, Muhammad and Dutta, Anjan},
  journal = {arXiv preprint arXiv:2602.06806},
  year    = {2026}
}