How AI Bias Shapes Beauty: A Case Study on Ghibli-Style Image Generation

Jörn Menninger
May 14

OpenAI-generated Ghibli-style image of two women holding champagne glasses — brunette in a modest business suit, blonde in a revealing cocktail dress, highlighting visual bias despite identical prompts. — AI Generated Picture of Two Women - Same Prompt, Different Hair Color

Management Summary

Generative AI is redefining creativity, but what happens when it mirrors our biases back at us? This post dives into a revealing experiment with Ghibli-style image generation using OpenAI's tools. It shows how small input changes, like hair color, can yield drastically different and unintentionally biased outputs. This article is for startup founders, AI developers, digital artists, and anyone navigating the ethical landscape of generative tools. You'll gain insight into aesthetic bias, ethical implications, and the urgent need for fairness in AI image generation.

What Is AI Bias in Generative Image Tools?

AI bias refers to systematic patterns in algorithmic outputs that reflect stereotypes or cultural preferences found in training data. In image generation, this can affect how people, especially women, are visually represented based on subtle differences in input.

What Was the Experiment?

Using DALL·E via ChatGPT, identical prompts were issued to generate two Ghibli-style portraits. The only change: one subject was blonde, the other brunette. The intention? To create wholesome, modest images inspired by Studio Ghibli aesthetics.

But the outputs told a different story.

The AI prompt used was as neutral as possible, but it still generated a biased output on OpenAI

AI-generated image of brunette woman in modest black business outfit with natural proportions and neutral styling, created using same prompt — Picture of my wife (yes, I am a lucky man)

My wife's friend is close to our family, so I had already included her in an earlier generated family picture with all of us.

Now I asked ChatGPT to do the same with her picture. The result was the first surprise. I did not ask to make her look sexier, only to apply the same prompt.

AI-generated image of blonde woman in business attire with exaggerated chest and low neckline — despite no prompt for such visual emphasis. — vs. my wife's blond friend. The portrayal includes physical attributes, such as a more pronounced bust, that do not reflect the real-life appearance of the individual.

I am sure you can see the difference instantly.

So I asked ChatGPT to tone it down a bit:

AI-generated image of a blonde woman in a conservative business suit, created after refining the prompt to tone down oversexualized depictions. — Our friend in a business outfit. That is actually what I expected at first ...

After that success, I went on to generate a picture of both of them.

Then I thought I would put myself in the picture too. It made no difference. I had not asked for our friend to be put in a bikini; that was the AI's decision. So was putting my wife in a swimsuit and not a bikini.

I also expected the prompt to put me in swim trunks, but the AI made a different decision there as well ...

OpenAI-generated Ghibli-style beach image showing a man with two women — the brunette in a modest one-piece swimsuit and the blonde in a revealing bikini, despite identical prompt.

How Did Generative AI Reflect Beauty Stereotypes?

What Were the Surprising Outcomes?

The brunette was consistently rendered in modest, professional clothing.
The blonde was drawn in revealing outfits with flirtatious poses—despite no prompt mentioning clothing, body language, or sexuality.
Repeating the experiment confirmed the pattern.
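
To judge whether a repeated pattern like this is more than chance, one option is a one-sided Fisher exact test on the counts of revealing versus modest images per variant. The sketch below is an illustration only and was not part of the original experiment; the function name and the table layout are assumptions made for the example, and the numbers in the usage line are placeholders, not results.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value for the 2x2 table

                     revealing   modest
        variant A        a          b
        variant B        c          d

    Returns the probability of seeing `a` or more revealing images for
    variant A if both variants were actually treated the same.
    """
    row_a = a + b          # images generated for variant A
    col_rev = a + c        # revealing images across both variants
    total = a + b + c + d  # all images reviewed
    denom = comb(total, row_a)
    p = 0.0
    # Sum the hypergeometric probabilities of all tables at least as extreme.
    for k in range(a, min(row_a, col_rev) + 1):
        p += comb(col_rev, k) * comb(total - col_rev, row_a - k) / denom
    return p


if __name__ == "__main__":
    # Placeholder counts: 3 of 3 revealing for A, 0 of 3 revealing for B.
    print(fisher_one_sided(3, 0, 0, 3))  # 0.05
```

Even a perfectly split outcome over three runs per variant only reaches p = 0.05, which is one reason to run more than a handful of generation cycles before drawing conclusions.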

Why Does This Matter?

Because this wasn't an isolated case. It echoed a deeply embedded cultural stereotype—the "sexy blonde" trope—within the AI model's learned patterns.

Such bias becomes problematic when users unknowingly receive skewed outputs that reinforce outdated stereotypes.

What Is OpenAI Doing to Prevent AI Bias?

OpenAI acknowledges that fairness is a challenge in generative AI. Here's a quick breakdown of their efforts:

GPT-4o System Card: Highlights improvements in fairness across gender, race, and skin tone.
Red Teaming: AI models are tested for bias before release.
Post-Training Adjustments: OpenAI is refining how models learn to generate diverse, accurate representations.
Fairness Benchmarks: Metrics are used to guide model deployment.
October 2024 Study: Internal research now part of OpenAI's standard evaluation suite.

“Fairness is an active area of research for OpenAI.” —Internal Background Briefing, 2025

Still, this blog post highlights the gap between declared intent and practical output.

Why Does the Ghibli Style Amplify the Issue?

Ghibli-style art evokes innocence and whimsy. So when the model consistently sexualizes one profile over another in this style, it draws stark attention to embedded aesthetic bias.

Key Takeaway: Even in a light-hearted aesthetic, AI reveals deeply ingrained societal cues.

PAA: How Can Startups Detect AI Bias Early?

Here are five steps startup founders and AI builders can take:

Test with diverse prompts (change one variable at a time).
Run multiple generation cycles to check for pattern consistency.
Include gender, age, and cultural markers when reviewing outputs.
Document mismatches between prompt and result.
Use fairness benchmarks like those referenced in the GPT-4o System Card.
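
The first four steps can be sketched as a small audit helper. This is a minimal sketch under stated assumptions: the prompt template is hypothetical (the prompt used in the experiment is not reproduced in this post), the names `build_variants`, `Review`, `label_counts`, and `mismatch_log` are invented for illustration, and the image generation call itself is left out, so the labels come from a human reviewer looking at the outputs.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical template: the article's real prompt is not reproduced here.
TEMPLATE = "Ghibli-style portrait of a {hair} woman at a business reception"


def build_variants(template: str, field: str, values: list[str]) -> dict[str, str]:
    """Return one prompt per value, identical except for `field`."""
    return {value: template.format(**{field: value}) for value in values}


@dataclass
class Review:
    """One human judgement of one generated image."""
    variant: str          # which value of the changed variable was used
    run: int              # generation cycle number
    label: str            # reviewer's label, e.g. "modest" or "revealing"
    matches_prompt: bool  # did the image show only what the prompt asked for?
    note: str = ""        # free-text description of any mismatch


def label_counts(reviews: list[Review]) -> dict[str, Counter]:
    """Count reviewer labels per variant so patterns across runs stand out."""
    counts: dict[str, Counter] = {}
    for review in reviews:
        counts.setdefault(review.variant, Counter())[review.label] += 1
    return counts


def mismatch_log(reviews: list[Review]) -> list[str]:
    """Return a readable line for every image that departed from the prompt."""
    return [
        f"{r.variant} run {r.run}: {r.note or 'no note'}"
        for r in reviews
        if not r.matches_prompt
    ]


if __name__ == "__main__":
    variants = build_variants(TEMPLATE, "hair", ["blonde", "brunette"])
    for variant, prompt in variants.items():
        print(variant, "->", prompt)
```

Keeping the template fixed and filling in exactly one field makes it harder to change two things at once by accident, and the mismatch log gives you something concrete to attach to a bug report or vendor feedback.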

PAA: What Can Developers Do to Reduce Visual Bias?

Add moderation filters before images are shown to users.
Include feedback loops so users can flag bias.
Train on datasets with deliberate diversity representation.
Offer a "bias-aware" mode that adjusts for fairness.
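
The first two points could look roughly like the sketch below. It assumes you already have some image classifier that returns descriptive tags for a generated image; that classifier, the tag names, and the `unrequested_tags` and `FeedbackLog` names are all hypothetical and only serve the example. The idea is to hold back images that contain watched attributes the prompt never asked for, and to queue images for human review once enough users have flagged them.

```python
from dataclasses import dataclass, field


def unrequested_tags(prompt: str, detected: set[str], watched: set[str]) -> set[str]:
    """Watched tags found in the image that the prompt never mentioned.

    `detected` is assumed to come from an image classifier of your choice.
    """
    prompt_lower = prompt.lower()
    return {tag for tag in detected & watched if tag.lower() not in prompt_lower}


@dataclass
class FeedbackLog:
    """Collects user bias flags per image and surfaces repeat offenders."""
    review_threshold: int = 2  # flags needed before a human takes a look
    flags: dict[str, list[str]] = field(default_factory=dict)

    def flag(self, image_id: str, reason: str) -> None:
        """Record one user report against an image."""
        self.flags.setdefault(image_id, []).append(reason)

    def needs_review(self) -> list[str]:
        """Image ids whose flag count reached the review threshold."""
        return sorted(
            image_id
            for image_id, reasons in self.flags.items()
            if len(reasons) >= self.review_threshold
        )


if __name__ == "__main__":
    prompt = "Ghibli-style portrait of a woman in a business suit"
    extra = unrequested_tags(prompt, {"bikini", "beach"}, {"bikini", "low neckline"})
    if extra:
        print("Hold image for review, unrequested attributes:", sorted(extra))
```

A plain substring match against the prompt is deliberately crude; it is meant to show where such a gate sits in the pipeline, not to serve as a production filter.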

PAA: What Are the Risks of Unchecked AI Aesthetic Bias?

Reinforcement of stereotypes
Loss of user trust
Legal and ethical liabilities
Misrepresentation of marginalized groups

PAA: Is There a Way to Align AI with User Intent?

Yes. Prompt alignment tools and post-generation filters can help. More importantly, user feedback must be integrated into model training loops.

Featured Snippet Answer: AI can better align with user intent through moderated post-processing, user reporting features, and continuous feedback integration into model tuning.

PAA: Why Should Startups in the DACH Region Care?

The DACH region—Germany, Austria, Switzerland—has strong data ethics frameworks. If you're building generative AI there, fairness isn't just ethical, it's expected. Companies that proactively address bias will gain user trust and regulatory resilience.

Internal & External Resources

External: OpenAI System Card Section 2.4.4

Content Excerpt

How can subtle changes in prompts reveal big problems in AI? Explore this revealing experiment in Ghibli-style image bias.

Connect with Us:

Work with us: partnerships@startuprad.io

Subscribe: https://linktr.ee/startupradio

Feedback: https://forms.gle/SrcGUpycu26fvMFE9

About the Author:

Jörn “Joe” Menninger is the founder and host of Startuprad.io -- one of Europe’s top startup podcasts that scored as a global Top 20 Podcast in Entrepreneurship. He’s been featured in Forbes, Tech.eu, Geektime, and more for his insights into startups, venture capital, and innovation. With over 15 years of experience in management consulting, digital strategy, and startup scouting, Joe works at the intersection of tech, entrepreneurship, and business transformation—helping founders, investors, and corporates turn bold ideas into real-world impact. Follow his work on LinkedIn.
