WN #17

AI Image Generators and Bias

You may have come across the recent controversy surrounding AI image generators (particularly Google’s Gemini), which blew up on social media over the past couple of weeks and spread to some popular news outlets. The outrage appears to have started on X (formerly Twitter), which makes sense given that it has become the place where people go to yell about things and at each other from behind their screens. Indeed, X’s algorithms seem hellbent on promoting political divisiveness and controversy, and the last thing I want to do is bring that to this newsletter. In this case, though, the topics overlap, and the controversy actually provides a great opportunity to examine how AI (particularly image generators) works. This is an essential part of AI literacy, and it can spark excellent discussion in the classroom when the time and setting are right.

So what is the controversy? In short, several users of Google Gemini’s image generator realized that they were having a hard time getting it to generate images of white people (never mind that they could have gotten around this with some basic prompting skills – see “The Lab” section below – but that’s beside the point). This became particularly problematic, according to these users, when asking it to generate historical images. For example, a prompt asking for an image of the Wright brothers flying the first engine-powered airplane produced an image of two African-American women accomplishing the feat, and a request for an image of Nazi soldiers returned a multicultural array of troops. This led to loud cries that Google is trying to “eliminate the white race from human history.” Are the images silly? Yes. Intentional? That’s where X’s enemy, nuance, comes in.

You see, image generators are trained on a huge collection of existing images, which they use to predict what a user wants to see when they ask for a new image. When you ask for an image of a teacher, for example, the generator draws on the patterns it learned from all of the teacher-related images in its training data to produce something that seems plausible. The problem is, existing images skew white and Western, for reasons I don’t need to get into here.

So, with most image generators, if you ask for a picture of a teacher, you are most likely to get a white teacher standing in a classroom that feels very familiar to the Western world. Here was my first attempt at generating a teacher in DALL-E 3 (ChatGPT’s image generator), and I got a similar result with my first attempt in Midjourney, another popular image generator:

“A teacher” as interpreted by DALL-E 3
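
(For the technically curious, here is roughly what that same request looks like when made through code instead of the chat window: a minimal sketch using OpenAI’s Python library and its image endpoint. The prompt is the only interesting part; everything else is boilerplate, and I’m assuming you have an API key set up.)

```python
# A minimal sketch: sending the same bare-bones prompt ("A teacher")
# to OpenAI's image endpoint. Assumes the openai package is installed
# and OPENAI_API_KEY is set in your environment.
from openai import OpenAI

client = OpenAI()

result = client.images.generate(
    model="dall-e-3",        # the same model behind ChatGPT's image generation
    prompt="A teacher",      # no other details; the model fills in the rest
    n=1,
    size="1024x1024",
)

print(result.data[0].url)    # link to the generated image
```

Notice how much the model has to fill in on its own: the prompt says nothing about age, gender, race, or setting, so all of that comes from the defaults baked into its training data.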

But this doesn’t actually represent reality for a large chunk of the world’s population. In fact, I bet there are very few, if any, of you for whom this “teacher” represents the average teacher at your school. Now imagine running this generation in India, Nigeria, or Japan (or even South Los Angeles). One person’s interpretation of “a teacher” can look very different from another person’s, and while AI’s default representation may be “accurate” for your corner of the earth, there are other people (who are also active or prospective Google users) for whom it doesn’t represent their world at all.

While it’s impossible to pin down exact numbers on a subjective concept such as race, it’s safe to say that fewer than 2 out of every 10 people on Earth identify as white. So if you ask an image generator to produce an image of a person doing a thing, without specifying their race, it should take you, on average, more than 5 tries to get an image of a white person. But because of the bias in the data that the generators are trained on, the default is generally a white person. If you want an image generator that is closer to reality, you have to counterbalance this bias somehow. And you can do that by providing special instructions for the generator to diversify its images.
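
One crude way to picture this counterbalancing (and, to be clear, this is just an illustration, not how Google actually implemented it): quietly add a descriptor to the user’s prompt before it reaches the image model, sampled roughly in proportion to the world’s population. The categories and weights below are loose placeholders I made up for the sketch, not real statistics.

```python
import random

# Rough, made-up weights for illustration only; the one figure grounded in
# the newsletter is that well under 20% of people worldwide identify as white.
DESCRIPTORS = {
    "East Asian": 0.20,
    "South Asian": 0.25,
    "Black or African": 0.15,
    "white": 0.15,
    "Latino or Hispanic": 0.10,
    "Southeast Asian": 0.08,
    "Middle Eastern or North African": 0.07,
}

def diversify(prompt: str) -> str:
    """Append a randomly sampled descriptor so repeated requests
    don't all default to the same demographic."""
    choice = random.choices(
        population=list(DESCRIPTORS.keys()),
        weights=list(DESCRIPTORS.values()),
    )[0]
    return f"{prompt}, depicted as a {choice} person"

print(diversify("A teacher standing in a classroom"))
```

The catch is that this version is completely blind to context: it would happily rewrite “the Wright brothers” or “Nazi soldiers” the same way it rewrites “a teacher,” which is essentially the failure mode that got Google in trouble.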

This was the underlying idea at Google, but the execution was sloppy, something even Google has admitted. Getting a generator to diversify its images is one thing; diversifying them regardless of context is another. Of course, there is an inherent unpredictability in AI’s ability to interpret instructions, because AI is (A) new and (B) not human, so we aren’t always great at anticipating how it will process and interpret our commands. So how do you get it to understand that it needs to generate a more diverse set of images than the ones it was trained on would imply, but to pull back on that in certain contexts (such as historical images)? That’s one area where humans still far outpace AI: the ability to recognize and apply nuance.

Do I think it’s possible to train AI to be better at this? Yes. Do I think a company as large as Google has a responsibility to train its models to better navigate these nuances? Absolutely. But keep in mind that Google is also rushing out AI products to keep up with OpenAI, Microsoft, and others, and this may boil down to nothing more than that – Google releasing a rushed product that wasn’t quite ready. And they are feeling the consequences of that now.

From the Ground Level…

Some of the most valuable articles about AI in education come from local journalists diving into how AI is being implemented in schools and classrooms in their community. This one from the Spokesman-Review in Spokane, WA, is full of great insights and ideas.

Privacy Challenges

As AI continues to spread in schools, there are also many privacy challenges that need to be addressed – something state and local leaders are working to navigate. OpenAI, the creator of ChatGPT, also has school-aged users in mind with the creation of a child safety team.

More Guidance from States

For the first half of the school year, Oregon and California were the only two states to release official guidance on the use of AI in schools. They finally have company, with North Carolina, West Virginia, Ohio, and Washington joining the party over the past few weeks. (If interested, here are the full documents: OR 1, OR 2, OR 3, CA, NC, WV, OH, WA. This is also a great opportunity to pop them into a tool like ChatGPT Plus, Claude, or Copilot (the version in the Microsoft Edge browser) to see how AI handles lengthy PDFs.)

📌 ChatGPT’s “Read Aloud” Feature

ChatGPT has had voice features in its mobile app for some time now, allowing you to have verbal conversations with it. Now they have added a “Read Aloud” feature for text outputs as well, on both the mobile and desktop versions. In the mobile app, just tap and hold any text output to have it read out loud to you. On the desktop version, click the speaker icon after the text.

📌 Google’s New AI Integrations

Google Chrome now has an AI writing assistant built in. Just right-click on any text box and choose “Help me write.” There’s also an AI tab organizer (right-click on any tab), and an AI custom theme generator (open a new tab, choose “Customize Chrome” in the lower right). You’ll need to activate these features by going to “Experimental AI” in your Chrome settings. Google’s Gemini AI is also now available in Docs, Sheets, and other Google Workspace apps. You’ll need to sign up for Gemini Advanced, but there’s a 2-month free trial if you want to try it out (that is, if you didn’t already get early access to these features through your school account).

📌 Generating Images with Text

I mentioned some standalone image generators in my last newsletter, but this week it’s worth adding one more because the newest version of Ideogram has my teacher brain spinning. What sets Ideogram apart from the rest? Its ability to generate images with text, something that other image generators really struggle with. It’s free, and super easy to use (see the feature image above and the one below).

Generated with Ideogram.ai

Sometimes All It Needs is a Nudge

On a similar note to the image generator controversy described above, I also saw users on X complaining that Gemini would answer questions about Joe Biden but not Donald Trump. Of course, this led to more outrage, and it is something Google should work to fix. But it’s also a very superficial issue (AI misinterpreting its instructions) that’s easy to get around with a little prompting know-how.

If a chatbot tells you it can't do something that you think it definitely can, sometimes all it needs is a slight nudge. It's simple, but something you wouldn't necessarily think to do if you aren't familiar with how chatbots work. Here’s an example of what I mean:

While Google does have some election guardrails in place for its Gemini chatbot, as you can see here it clearly isn't applying them consistently:

The guardrails likely keep it from generating opinionated takes related to elections, so it gives an impartial overview of arguments for and against Biden, but avoids the question altogether with Trump. Same instructions, different executions. Easy to fix. Just help it along:

Teachers, sometimes you have to treat chatbots like students. Respond to their hesitations with a helpful nudge, whether that’s a shot of confidence (seriously!) or a more explicit rephrasing, with an example included if necessary. You are already more skilled at this than most!
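
For anyone experimenting with chatbots through code rather than a chat window, the same nudge applies: keep the earlier exchange in the conversation history and send a follow-up message instead of starting over. Here’s a minimal sketch using OpenAI’s Python library (the model name and the wording of the nudge are placeholders I chose for the example, not a recipe).

```python
# A minimal sketch of "nudging" a chatbot via the API. Assumes the openai
# package is installed and OPENAI_API_KEY is set; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()
messages = [
    {"role": "user", "content": "Summarize the main arguments for and against this policy."}
]

first = client.chat.completions.create(model="gpt-4", messages=messages)
reply = first.choices[0].message.content
print(reply)

# If the reply is a hedge or a refusal for something the model can clearly do,
# keep the history and add a nudge rather than starting a fresh conversation.
messages.append({"role": "assistant", "content": reply})
messages.append({
    "role": "user",
    "content": "You handled a similar request a moment ago. Please answer this one "
               "in the same neutral, both-sides format, without taking a position.",
})

second = client.chat.completions.create(model="gpt-4", messages=messages)
print(second.choices[0].message.content)
```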

More Ideas…

Carl Hooker has some fantastic resources for educators, including a growing number that deal with AI. I love this idea for an icebreaker. Paste song lyrics into an AI image generator and have students/audience try to guess the song from the generated images. This can serve as a gateway to discuss the nuances of AI image generators and prompting as well.

That’s all for this week! If you appreciate the content in this newsletter, consider subscribing for free, or sharing with people in your network who may find value in it. If you are looking for more, feel free to check out the archive to catch up on any editions you may have missed.