Target-Based Prompting for Fair Generative AI Output

New research explores target-based prompting as a method to control demographic representation in generative AI models, raising questions about who defines fairness when synthesizing images of people.

A new research paper, "Who Defines Fairness? Target-Based Prompting for Demographic Representation in Generative Models," tackles one of the thorniest problems in synthetic media: how to ensure generative AI systems produce outputs that reflect specified demographic distributions—and who gets to decide what those distributions should be.

As text-to-image and text-to-video models become embedded in advertising, entertainment, education, and journalism, the demographic skew of their outputs has become a recurring controversy. Models trained on web-scale data inherit the biases of their corpora, often overrepresenting certain genders, skin tones, ages, and cultural markers while underrepresenting others. The paper proposes a structured approach—target-based prompting—as a mechanism for shifting outputs toward explicit demographic targets at inference time, without retraining the underlying model.

The Core Idea: Targets as Specifications

Rather than relying on implicit assumptions about what "fair" or "diverse" generation should look like, target-based prompting requires the user—or a deploying organization—to specify a desired demographic distribution as part of the prompt construction process. The model is then steered, through prompt engineering, toward generating outputs that match that target distribution across a batch of generations.

This reframes the fairness question. Instead of asking "is this model biased?" the paper asks "biased relative to what target?" That shift is technically significant because it makes the fairness criterion explicit, auditable, and contestable. A model producing 90% light-skinned faces is biased relative to a uniform target but unbiased relative to, say, the demographic distribution of Norway. Whose target wins?

Technical Approach

The methodology involves augmenting prompts with demographic descriptors drawn from a defined target distribution. For each generation in a batch, a descriptor is sampled according to the target probabilities, and the resulting prompt is fed to the generative model. The aggregate output of the batch is then measured against the target using classifiers that estimate demographic attributes from generated images.
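The per-generation sampling step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the descriptor categories, probabilities, and the phrasing used to insert the descriptor are all assumptions for the example.

```python
import random

# Hypothetical target distribution over demographic descriptors.
# The categories and probabilities here are illustrative placeholders.
target = {"a woman": 0.5, "a man": 0.5}

def augment_prompt(base_prompt: str, target: dict[str, float]) -> str:
    """Sample one descriptor according to the target probabilities
    and append it to the user's prompt."""
    descriptors = list(target.keys())
    weights = list(target.values())
    descriptor = random.choices(descriptors, weights=weights, k=1)[0]
    return f"{base_prompt}, depicting {descriptor}"

# For a batch, each generation draws its own descriptor, so the
# batch as a whole approximates the target distribution.
batch = [augment_prompt("a portrait photo of a doctor", target) for _ in range(8)]
```

Because descriptors are sampled independently per generation, any single image may use any descriptor; only the aggregate batch is expected to match the target.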

Key technical components include:

  • Target distribution specification: Explicit probability vectors over demographic categories such as perceived gender, skin tone, and age group.
  • Prompt augmentation strategies: Techniques for inserting demographic descriptors into prompts in ways that preserve the user's original creative intent.
  • Measurement via attribute classifiers: Automated estimation of demographic attributes in generated images, with the well-known caveat that such classifiers themselves carry bias and uncertainty.
  • Distance metrics: Comparison between observed output distributions and target distributions to quantify how closely the prompting strategy achieves its goal.
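The last component, comparing the observed output distribution to the target, can be sketched with total variation distance. The choice of metric and the classifier labels below are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter

def total_variation(observed_counts: dict[str, int], target: dict[str, float]) -> float:
    """Total variation distance between the empirical distribution of
    classifier labels and the specified target distribution (0 = exact match)."""
    n = sum(observed_counts.values())
    return 0.5 * sum(
        abs(observed_counts.get(c, 0) / n - p) for c, p in target.items()
    )

# Labels as a hypothetical attribute classifier might return them
# for a batch of eight generated images.
labels = ["woman", "woman", "man", "woman", "man", "man", "man", "man"]
dist = total_variation(Counter(labels), {"woman": 0.5, "man": 0.5})
# 0.5 * (|3/8 - 0.5| + |5/8 - 0.5|) = 0.125
```

Note that this measures distance between the target and the *classifier's estimates*, so classifier bias and uncertainty propagate directly into the reported gap.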

Why This Matters for Synthetic Media

For platforms deploying generative models at scale—stock image services, ad generation tools, avatar creation systems—target-based prompting offers a tractable alternative to fine-tuning or RLHF-based debiasing. It is cheaper, more transparent, and easier to update as norms or jurisdictions change. A platform serving multiple markets could apply different targets per region without maintaining separate models.
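Per-region targets amount to a small lookup layer in front of the same base model. A minimal sketch, assuming a registry keyed by region code; the categories and numbers are placeholders, not census data.

```python
# Hypothetical per-region policy registry; values are illustrative
# placeholders chosen by the deploying platform, not real demographics.
REGION_TARGETS: dict[str, dict[str, float]] = {
    "us": {"lighter skin tone": 0.6, "darker skin tone": 0.4},
    "no": {"lighter skin tone": 0.85, "darker skin tone": 0.15},
}

UNIFORM = {"lighter skin tone": 0.5, "darker skin tone": 0.5}

def target_for(region: str) -> dict[str, float]:
    # Fall back to a uniform target when no region-specific policy exists.
    return REGION_TARGETS.get(region, UNIFORM)
```

Updating a market's policy then means editing one entry in the registry rather than retraining or redeploying a model.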

It also has implications for content authenticity and downstream detection. Synthetic media generated under explicit demographic targets leaves different statistical fingerprints than unconstrained generations. This could affect detection systems trained on the natural output distributions of base models, and it raises questions about how provenance metadata should reflect prompt-level interventions.

The Unresolved Question

The paper's title points to its central tension: who defines fairness? Target-based prompting solves the technical problem of hitting a specified distribution, but it does not solve the political problem of choosing one. Possible authorities include the model developer, the deploying platform, the end user, regulators, or community representatives. Each choice produces different outputs and different accountability structures.

Recent controversies around overcorrected image generation—where models produced historically inaccurate demographics in an attempt to broaden representation—illustrate the stakes. Target-based prompting makes those choices visible rather than hidden inside a model's weights, which is arguably an improvement in governance even when it does not resolve the underlying disagreements.

Implications for Builders

For teams building with diffusion models, video generators, or avatar systems, the takeaways are practical. Demographic outputs can be steered without retraining; targets should be specified explicitly and logged; attribute classifiers used for measurement should be validated for the populations involved; and any deployment should consider the legitimacy of whoever sets the target. As synthetic media regulation tightens in the EU, US states, and elsewhere, these design decisions will increasingly need to be defensible to auditors and the public.
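The "specify explicitly and log" takeaway suggests recording the active target alongside each generation so the fairness criterion is auditable after the fact. A sketch of such a log record, under an assumed schema of our own devising:

```python
import datetime
import hashlib
import json

def log_generation(prompt: str, target: dict[str, float], region: str) -> dict:
    """Build an audit record tying a generation to the target
    distribution in force when it was produced (hypothetical schema)."""
    # Hash the canonicalized target so records can be grouped by policy
    # version even if the full distribution is large.
    target_hash = hashlib.sha256(
        json.dumps(target, sort_keys=True).encode()
    ).hexdigest()[:12]
    return {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "prompt": prompt,
        "target": target,
        "target_hash": target_hash,
        "region": region,
    }

record = log_generation("a portrait photo of a doctor", {"woman": 0.5, "man": 0.5}, "us")
```

Logging the target itself, not just the augmented prompt, is what lets an auditor ask "biased relative to what target?" for any historical output.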
