The Gendered AI series looks at sci-fi movies and television to see how Hollywood treats AI of different gender presentations. For example, are female AIs given a certain type of body more than male AIs? Are certain AI genders more subservient? What genders are the masters of AI? This particular post is about gender and embodiment. If you haven’t read the series intro, related embodiment distributions, or correlations 101 posts, I recommend you read them first. As always, check out the live Google sheet for the most recent data.
What do we see when we look at the correlations of gender and embodiment? First up, the overly-binary chart, and what it tells us.
I see three big takeaways.
When AI appears indistinguishable from human, it is female significantly more often than male. When AI presents as female, it is much more likely to be embodied as indistinguishable from a human than an anthropomorphic or mechanical robot. Hollywood likes its female-presenting AIs to be human-like.
Anthropomorphic robots are more likely to be male than female. Hollywood likes its male-presenting AIs to be anthropomorphic robots.
If an AI is mechanical, it is more likely to be “other.” (Having no gender, multiple genders, or genderfluid.)
These first two biases make me think of the longstanding male-gaze popular-culture trope that pairs a conventionally-attractive female character with a conventionally-unattractive male. (Called “Ugly Guy Hot Wife” on TV Tropes.)
Recent research from Denmark hints that these may be the most engaging forms for children (and adults?) in the audience: a study of learning outcomes with VR teachers found that girls learned best from a young, female-presenting researcher, and boys learned best when that teacher presented as a drone. The study did not venture a hypothesis as to why this is, or whether this is desirable. These were the only two options tested with the students, so much more work is needed to test what combinations of presentation, embodiment, and superpowers (the drone hovered) are the most effective. And we still have to discuss the ethics and possible long-term effects of such tailoring. But still, interesting in light of this finding.
Not a surprise
When AI is indistinguishable from human, it is less likely to have a gender other than male or female.
If an AI presents with no gender, it is embodied as a mechanical robot. Little surprise there.
Mechanical robots are more likely to be neither male nor female.
When we look more closely at the numbers, it gets a little weirder. This makes for a very complicated graph, so I’ll use a screen grab from the sheets as the image.
Of course we would not expect many socially gendered characters to be indistinguishable from a human, but you’ll note that socially male is much higher than socially female. That’s because while there are no characters that are both [socially female + indistinguishable from human], there is one tagged [socially male + indistinguishable from human], and that’s Ruk, from the Star Trek (The Original Series) episode “What Are Little Girls Made Of?”
Bucking other trends toward male-ness, [disembodied + female-voiced] AI are 8 times as likely to appear as disembodied, male-voiced AI, of which there is only one example, JARVIS from the MCU.
So the basic distributions (prior posts in the series) are fascinating in themselves, but what brought us to this study is how those counts correlate. And while you could correlate any of these attributes (gender, embodiment, subservience, etc.) against any other, what follows is a measure of the correlation of gender to the other attributes.
In case you are not familiar with correlations, here’s the sci-fi interfaces “correlations 101”.
Ratios of values
Let’s say you have a group of 100 people, and you know their sex (simplified as male and female for this explanation) and their eye color (simplified again to green, blue, or brown). Let’s also say there’s a perfectly even ratio of attributes. Half are male and half are female. One-third of people have green, another third have blue, and the last third have brown eyes.
Correlations across attributes
The question of correlation goes something like this: When we meet a female in this group, what are the odds her eyes are brown?
In a perfect distribution of sex and eye color, you might expect ⅓ of women to have green eyes, ⅓ of women to have blue eyes, and ⅓ to have brown eyes. After all, ⅓ of (this imaginary) population does, and women are half of that, so, logically, ⅓ of them should have brown eyes. That would mean that for any of these females, the odds should be around 33% that their eyes are brown.
But if, looking at the data, you actually found that ⅔ of women had blue eyes and ⅓ of the women had green eyes, you would have a very imperfect distribution, and you would rightly wonder what was going on. Why do the guys have all the brown eyes? Is blue-eyed-ness somehow connected to being female? This would point at something weird going on, bearing further inquiry. What’s Up with Dudes Having all the Brown Eyes? Thank you for coming to my TED talk.
So that’s a basic explanation. Of course we don’t really care about eye color. But if you substitute, say, wealth for eye color, you can see why we might care about looking at correlations. If the top 33% of earners were all dudes, we’d try to suss out the reason for such gross wealth inequality.
Now, circles and wedges make for easy pedagogical shapes, but they’re not that great for understanding the data, especially when it gets more complicated, say, with our 11 categories of sci-fi AI gender presentation. So instead of circular diagrams, I’ll use bar charts to show how far off from perfect each attribute is. In the case of the perfect distribution, the bars would be at zero, as on the lower left in the image above. It would be a very boring bar chart.
But in the case of the weird dudes-brown and ladies-blue scenario on the lower right, the bar charts for blue and brown would be correspondingly as far from zero as the chart will allow. The green attribute, since it was perfectly distributed in that example, still sits at zero. You’ll note though that if you added up all the blue values in the chart, they would sum to zero. The same for brown and green bars. If you cared to do a check of the data, this is one way you could check to see if it was valid.
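For anyone who wants to try that sum-to-zero check by hand, here is a minimal Python sketch of the dudes-brown, ladies-blue scenario. It assumes deviations are expressed as shares of the whole population (the scaling of the charts themselves may differ), and the names `observed`, `marginal`, and `deviation` are made up for illustration.

```python
from fractions import Fraction

half, third = Fraction(1, 2), Fraction(1, 3)

# Shares of the whole imaginary population in the extreme scenario:
# women are 2/3 blue and 1/3 green; men are 2/3 brown and 1/3 green.
observed = {
    ("female", "blue"): half * 2 * third,
    ("female", "green"): half * third,
    ("female", "brown"): Fraction(0),
    ("male", "blue"): Fraction(0),
    ("male", "green"): half * third,
    ("male", "brown"): half * 2 * third,
}

def marginal(index, value):
    """Share of the population with a given sex (index 0) or eye color (index 1)."""
    return sum(share for key, share in observed.items() if key[index] == value)

# Deviation = observed share minus the share a perfect distribution would give.
deviation = {
    (sex, color): share - marginal(0, sex) * marginal(1, color)
    for (sex, color), share in observed.items()
}

# The validity check: within each eye color, the deviations cancel out.
color_sums = {
    color: sum(d for (_, c), d in deviation.items() if c == color)
    for color in ("blue", "green", "brown")
}
print(color_sums)
```

Using exact fractions rather than floats keeps the check honest: each color should sum to exactly zero, not merely close to it.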
Of course real world data rarely, if ever, looks this extreme and clean. It’s usually more nuanced, and needs careful reading. In the example below, females are overweighted for blue eyes and males overweighted for the other two. That bar chart would look like this.
Note that it’s important to read the scale on the left. We’re no longer looking at 100-percent bars. The female-blue overweighting is only 16.67 percent. That would be significant, but not as significant as if it were maxed out at 100. So be sure to read the scales.
NOTE: If you’re not interested in the soundness of the methods, the rest of this post is going to be boring. But I need to lay out my methods to make sure I’m not doing my math wrong (if I was, we’d have to reconsider all the conclusions). I’ll also use language as plain-spoken as I can in case you want to follow along. The good news is, it’s pretty simple math.
If we were working with floating-point values, then we might be able to do some fancy math called a Pearson correlation to measure correlations. I did this as part of the Untold AI study. But each of our variables in the Gendered AI study is categorical, more like eye color than weight. So I had to go about looking at correlations in a different way.
First I looked at simple counts for all combinations of attribute pairs. For example: There are 2 biologically female very good AI characters, and 3 biologically male very evil characters,…
Then I looked at the percentage of each value in its attribute. 7% of characters are very good, for example. 10% of characters are biologically female.
I performed a simple multiplication of the percentages of each value to understand what a perfect distribution would be for those value pairs. Given that 7% are very good and 10% are biologically female, if very goodness and biological femaleness were perfectly distributed, we would expect .7% of all characters to be very good and biologically female.
I then multiplied that times the number of characters in the survey, and came up with the number of characters we would expect to see with those two values. Given 327 characters, and an expected .7%, we would expect to see 2.289 characters in the survey with this combination. (Characters can’t have fractional attributes in my method, but I don’t round until the end.)
Next I subtracted the perfect distribution number from the actual number to come up with variance. A negative means we see less than we would expect. A positive means we see more than we expect.
I then translated those variance units to a percentage of the total number of characters. This lets us compare apples to apples across attribute pairs, regardless of size.
Finally I created some conditional formatting that showed the lowest number across the correlations as the darkest red, the highest number across the set as darkest green, zero as white, and everything in between on a scale between those three values. This allows us to look and at a glance see bias as color on a table. It’s not gorgeous infographics, but it is dense, effective data presentation.
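If it helps to see the chain in one place, the steps above can be sketched in a few lines of Python. This is a minimal sketch, not the formulas in the Google sheet: the function name `bias_table` and the toy data are hypothetical, and only the 7%, 10%, 327, and 2 figures come from the example in the text.

```python
from collections import Counter

def bias_table(pairs):
    """Map each (value_a, value_b) pair to its deviation from a perfect
    distribution, expressed as a percentage of all characters."""
    total = len(pairs)
    pair_counts = Counter(pairs)               # counts for each combination
    a_counts = Counter(a for a, _ in pairs)    # counts for each value of attribute A
    b_counts = Counter(b for _, b in pairs)    # counts for each value of attribute B
    table = {}
    for a in a_counts:
        for b in b_counts:
            # Multiply the two shares, then scale by the number of characters.
            expected = (a_counts[a] / total) * (b_counts[b] / total) * total
            variance = pair_counts[(a, b)] - expected   # actual minus perfect
            table[(a, b)] = 100 * variance / total      # as a percent of all characters
    return table

# Hypothetical toy data, NOT taken from the survey.
toy = (
    [("female", "good")] * 3 + [("female", "evil")] * 1
    + [("male", "good")] * 2 + [("male", "evil")] * 4
)
table = bias_table(toy)
for pair, percent in sorted(table.items()):
    print(pair, round(percent, 2))

# The worked figures quoted in the text.
quoted_expected = 0.07 * 0.10 * 327      # about 2.289 characters
quoted_variance = 2 - quoted_expected    # 2 observed, so slightly fewer than expected
print(round(quoted_expected, 3), round(quoted_variance, 3))
```

Positive numbers mean a combination shows up more than a perfect distribution would predict, and negative numbers mean less. The color scale in the sheet is just a visual layer on top of this table.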
In some cases it pays to compare the data as oversimplified binary gender counts (male, female, and other) and so you will find an aggregated table on the correlation page, that looks like this.
But of course there are detailed bias tables. They look like this.
Those can be hard to read, so in the posts, I instead present that data in the bar chart format that I showed way up at the top of this post.
This method is long, and tedious to recount, so rather than going through the chain for each correlation, I’ll just be showing tables when the comparison is interesting, showcasing the bar charts, and then talking about the results. You can see the whole chain, step by step, in the live Google sheet, right down to individual cell formulas. If you’re a data nerd, anyway.
Also, if you’re browsing the live sheet, you’ll see little black triangles in the upper right corner of some of the cells. These are “Notes” in the Google Sheet that show the exact examples. They take some processing, and so take a second or two to appear after you’ve changed the dropdown at the top.
So, for instance, if you wanted to know what examples were tagged as both “architectural” embodiment and “socially female” a rollover would reveal there are two: The city computer from Logan’s Run, and Deep Thought (pictured above). If there is not a note attached to a cell, that means there are no examples.
Data science people rightly want to know if the bias we see can be attributed to all that random noise that happens in real life. One way to test for that is something called a Chi Square Test. Those tests are at the bottom of the sheet. If the results aren’t statistically significant, the results could be dismissed. But, per the results of these Chi Square tests, the correlation studies cannot wholly be dismissed as noise.
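For the curious, here is a rough idea of what such a test computes. This is a minimal sketch of the standard Pearson chi-square test of independence, written in plain Python. The exact setup of the tests in the sheet is not reproduced here, the function name `chi_square` is made up, and the counts are hypothetical.

```python
def chi_square(observed):
    """Return the Pearson chi-square statistic and degrees of freedom
    for a table of observed counts (a list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count we would expect if the two attributes were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof

# Hypothetical counts, NOT from the survey: rows are two genders,
# columns are two values of some other attribute.
statistic, dof = chi_square([[30, 10], [20, 40]])

# 3.841 is the usual critical value for 1 degree of freedom at the 0.05 level.
print(round(statistic, 3), dof, statistic > 3.841)
```

The larger the statistic relative to the critical value for its degrees of freedom, the harder it is to wave the pattern away as noise. Very small expected counts make the test less trustworthy, so sparse cells deserve caution.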
So that’s a lot, but it was necessary set-up. On to the correlations themselves!
There are all of three AI characters who elect their gender presentation for some reason other than deception.
1. In “The Offspring” episode of Star Trek: The Next Generation, Data builds an adult child named Lal. Data gives Lal the opportunity to pick their gender, and Lal picks female.
2. Holly, the AI in Red Dwarf begins presenting male and after a bit reveals that she would rather present as female. Later, she is destroyed and rebuilt from an earlier copy, when the AI presents as male again, but notably, this was not Holly’s decision.
3. The Machine, from Person of Interest (shout-out: it won the award for best representation of the AI science in the Untold AI series, and a personal favorite) chooses in the last season to adopt the voice of its main devotee, Root, who is female.
Though this is a very small sample inside our dataset, it is notable in light of the male bias that AI characters show. By these examples,…
…when an AI chooses a gender presentation, it is always a female.
Not quite “picking a gender”
There are a handful of other times an AI winds up with a gender presentation that cannot quite be said to be a matter of personal preference.
If you’re wondering about the Maschinenmensch from Metropolis, its gender is not a choice, but something assigned to it by the mad scientist Rotwang as part of a plot of deception.
If you’re thinking of Skynet, from the Terminator series, it has no presenting gender until Terminator Salvation. In that film the AI chooses to mimic a female character, Dr. Kogan, because “Calculations confirm Serena Kogan’s face is the easiest for [Marcus] to process.” It assures him that if he preferred someone different, Skynet could mimic another person. So this is not picking gender for an identity reason as much as a mask for efficacy.
Later in Terminator Genisys, Skynet is embodied as a man, the T-5000 known as “Alex,” but this appears to be the opportunistic colonization of an available body rather than a selection by the AI.
The Puppet Master from Ghost in the Shell is similarly an opportunistic colonization of a female cyborg. There might be some selection process in the choice of a victim, but that evidence is not on screen.
In Futurama, Bender has also opted several times to be female, but it is for the express purpose of getting something out of the deal, such as competing in the Robo-Olympics or to play a heel character in wrestling. By the end of each episode, he’s back to being his old self again.
If you know of additional or even counterexamples, let me know so I can add them to the database. But as of right now, the AI future looks female.
Where we are: To talk about how sci-fi AI attributes correlate, we first have to understand how their attributes are distributed. In the first distribution post, I presented the foundational distributions for sex and gender presentation across sci-fi AI. Today we’ll discuss categorically how intelligent the AI appears to be.
Where we are: To talk about how sci-fi AI attributes correlate, we first have to understand how their attributes are distributed. In the first distribution post, I presented the foundational distributions for sex and gender presentation across sci-fi AI. Today we’ll discuss goodness.
Goodness is a very crude estimation of how good or evil the AI seems to be. It’s wholly subjective, and as such it’s only useful for spotting patterns rather than for ethical precision.
If you’re looking at the Google Sheet, note that I originally called it “alignment” because of old D&D vocabulary, but honestly it does not map well to that system at all.
Very good are AI characters that seem virtuous and whose motivations are altruistic. Wall·E is very good.
Somewhat good are characters who lean good, but whose goodness may be inherited from their master, or whose behavior occasionally is self-serving or other-damaging. JARVIS from Iron Man is somewhat good.
Neutral or mixed characters may be true to their principles but hostile to members of outgroups; or exhibit roughly-equal variations in motivations, care for others, and effects. Marvin from The Hitchhiker’s Guide to the Galaxy is neutral.
Somewhat evil characters are characters who lean evil, but whose evil may be inherited from their master, or whose behavior is occasionally altruistic or nurturing. A character who must obey another is limited to somewhat evil. David from Prometheus is somewhat evil.
Very evil are AI characters whose motivations are highly self-serving or destructive. Skynet from The Terminator series is very evil, given that whole multiple-time-traveling-attempts-at-genocide thing.
Though slightly more evil than good, it’s a roughly even split in the survey between evil, good, and neutral AI characters.
Where we are: To talk about how sci-fi AI attributes correlate, we first have to understand how their attributes are distributed. In the first distribution post, I presented the foundational distributions for sex and gender presentation across sci-fi AI. Today we’ll discuss how germane the AI character’s gender is to the plot of the story in which they appear.
Is the AI character’s gender germane to the plot? This aspect was tagged to test the question of whether characters are by default male, and only made female when there is some narrative reason for it. (Which would be shitty and objectifying.) To answer such a question we would first need to identify those characters that seemed to need the gender they have, set them aside, and look at the sex ratio of what remains.
Example: A human is in love with an AI. This human is heteroromantic and male, so the AI “needs” to be female. (Samantha in Her by Spike Jonze, pictured below).
If we bypass examples like this, i.e. of characters that “need” a particular gender, the gender of those remaining ought to be, by exclusion, arbitrary. This set could be any gender. But what we see is far from arbitrary.
Before I get to the chart, two notes. First, let me say, I’m aware it’s a charged statement to say that any character’s gender is not germane. Given modern identity and gender politics, every character’s gender (or lack of, in the case of AI) is of interest to us, with this study being a fine and at-hand example. So to be clear, what I mean by not germane is that it is not germane to the plot. The gender could have been switched and say, only pronouns in the dialogue would need to change. This was tagged in three ways.
Not: Where the gender could be changed and the plot not affected at all. The gender of the AI vending machines in Red Dwarf is listed as not germane.
Slightly: Where there is a reason for the gender, such as having a romantic or sexual relation with another character who is interested in the gender of their partners. It is tagged as slightly germane if, with a few other changes in the narrative, a swap is possible. For instance, in the movie Her, you could change the OS to male, and by switching Theodore to a non-heterosexual male or a non-homosexual woman, the plot would work just fine. You’d just have to change the name to Him and make all the Powerpuff Girl fans needlessly giddy.
Highly: Where the plot would not work if the character was another sex or gender. Rachel gave birth between Blade Runner and Blade Runner 2049. Barring some new rule for the diegesis, this could not have happened if she was male, nor (spoiler) would she have died in childbirth, so 2049 could not have happened the way it did.
Second, note that this category went through a sea-change as I developed the study. At first, for instance, I tagged the Stepford Wives as Highly Germane, since the story is about forced gender roles of married women. My thinking was that historically, husbands have been the oppressors of wives far more than the other way around, so to change their gender is to invert the theme entirely. But I later let go of this attachment to purity of theme, since movies can be made about edge cases and even deplorable themes. My approval of their theme is immaterial.
So, the chart. Given those criteria, the gender of characters is not germane the overwhelming majority of the time.
At the time of writing, there are only six characters that are tagged as highly germane, four of which involve biological acts of reproduction. (And it would really only take a few lines of dialogue hinting at biotech to overcome this.)
A baby? But we’re both women.
Yes, but we’re machines, and not bound by the rules of humanity.
HIR lays her hand on XEM’s stomach.
HIR’s hand glows.
XEM looks at HIR in surprise.
Anyway, here are the four breeders.
David from Uncanny
Rachel from Blade Runner (who is revealed to have made a baby with Deckard in the sequel Blade Runner 2049)
Deckard from Blade Runner and Blade Runner 2049
Proteus IV from the disturbing Demon Seed
The last two highly germane are cases where a robot was given a gender in order to mimic a particular living person, and in each case that person is a woman.
Maria from Metropolis
Buffybot from Buffy the Vampire Slayer
I admit that I am only, say, 51% confident in tagging these as highly germane, since you could change the original character’s gender. But since this is such a small percentage of the total, and would not affect the original question of a “default” gender either way, I didn’t stress too much about finding some ironclad way to resolve this.
Where we are: To talk about how sci-fi AI attributes correlate, we first have to understand how their attributes are distributed. In the first distribution post, I presented the foundational distributions for sex and gender presentation across sci-fi AI. Today we’ll discuss subservience and free will.
The degree of free-willedness is tagged as subservience.
The majority of AIs are free-willed, that is, answerable only to their own conscience. Ultron from the Marvel Cinematic Universe is free-willed.
A large proportion answer to some master, but enjoy wide latitude in interpreting instructions, and can formulate new plans to achieve goals. These are tagged improvisational obedience. Gort from The Day the Earth Stood Still is one of these.
A few are tagged as slavish obedience, and will take no action unless ordered to do so and will only take the action instructed. Robby the Robot in Forbidden Planet is slavishly obedient.
A small minority are bound to a master against their will. These characters are tagged with reluctant obedience. Ava from Ex Machina was only reluctantly obedient, and took great pains to escape.
Reinforcing a notion from the embodiment post, subservience is the exception for these AI characters. Mostly they are as free-willed as people are. (Insert determinist counter-argument here.)
Where we are: To talk about how sci-fi AI attributes correlate, we first have to understand how their attributes are distributed. In the first distribution post, I presented the foundational distributions for sex and gender presentation across sci-fi AI. Today we’ll discuss the gender of the AI’s master.
In the prior post I shared the distributions for subservience. And while most sci-fi AI are free-willed, what about the rest? Those poor digital souls who are compelled to obey someone, someones, or some thing? What is the gender of their master?
Of course this becomes much more interesting when later we see the correlation against the gender of the AI, but the distribution is also interesting in and of itself. The gender options of this variable are the same as the options for the gender of the AI character, but the master may not be AI.
Before we get to the breakdown, this bears some notes, because the question of master is more complicated than it might first seem.
If a character is listed as free-willed, I set their master as N/A (Not Applicable). This may ring false in some cases. For example, the characters in Westworld can be shut down with near-field command signals, so they kind of have “masters.” But, if you asked the character themselves, they are completely free-willed and would smash those near-field signals to bits, given the chance. N/A is not shown in this chart because masterlessness does not make sense when looking at masters.
Similarly, there are AI characters listed as free-willed but whose “job” entails obedience to some superior; like BB-8 in the Star Wars diegesis, who is an astromech droid, and must obey a pilot. But since BB-8 is free to rebel and quit his job if he wants to, he is listed as free-willed and therefore has a master of N/A.
If a character had an obedience directive like, “obey humans,” the gender of the master is tagged as “Multiple.” Because Multiple would not help us understand a gender bias, it is not shown on the chart.
The Terminator robots were a tough call, since in the movies in which most of them appear, Skynet is their master, and it does not gain a gender until Terminator Salvation, when it appears on screen as a female. Later it infects a human body that is male in Terminator Genisys. Ultimately I tagged these characters as having a master of the gender particular to their movie. Up to Salvation it’s None. In Salvation it’s female, and in Genisys it’s male.
So, with those notes, here is the distribution. It’s another sausagefest.
Again, we see the masters are highly skewed male. This doesn’t distinguish between human male and AI male, which partly accounts for the high biologically male value compared to male. Note that sex ratios in Hollywood tend towards 2:1 male:female for actors, generally. So the 12:1 (aggregating sex) that we see here cannot be written off as a matter inherited from available roles. Hollywood tells us that men are masters.
The 12:1 sex ratio cannot be written off as a matter inherited from available roles. It’s something more.
Oh, and it’s not a mistake in the data: there are no socially female AI characters who are masters of another AI of any gender presentation. That leaves us with 5 female masters, countable on one hand, and the first two can be dismissed as a technicality, since these were identities adopted by Skynet as a matter of convenience.
Skynet-as-Kogan is master of John, the T-3000, from Terminator Genisys
Skynet-as-Kogan is master of the T-5000 from Terminator Genisys
Barbarella is master of Alphy from Barbarella
VIKI is master of the NS-5 robots from I, Robot
Martha is master of Ash in Black Mirror, “Be Right Back”