Gendered AI: Gender and Embodiment

28 May 2019 by Christopher Noessel

The Gendered AI series looks at sci-fi movies and television to see how Hollywood treats AI of different gender presentations. For example, are female AIs given a certain type of body more than male AIs? Are certain AI genders more subservient? What genders are the masters of AI? This particular post is about gender and embodiment. If you haven’t read the series intro, related embodiment distributions, or correlations 101 posts, I recommend you read them first. As always, check out the live Google sheet for the most recent data.

What do we see when we look at the correlations of gender and embodiment? First up, the overly-binary chart, and what it tells us.

I see three big takeaways.

When AI appears indistinguishable from human, it is female significantly more often than male. When AI presents as female, it is much more likely to be embodied as indistinguishable from a human than an anthropomorphic or mechanical robot. Hollywood likes its female-presenting AIs to be human-like.
Anthropomorphic robots are more likely to be male than female. Hollywood likes its male-presenting AIs to be anthropomorphic robots.
If an AI is mechanical, it is more likely to be “other.” (Having no gender, multiple genders, or genderfluid.)

These first two biases make me think of the longstanding male-gaze popular-culture trope that pairs a conventionally-attractive female character with a conventionally-unattractive male. (Called “Ugly Guy Hot Wife” on TV Tropes.)

Recent research from Denmark hints that these may be the most engaging forms for children (and adults?) in the audience: a study of learning outcomes with VR teachers found that girls learned best from a young, female-presenting researcher, and boys learned best when that teacher presented as a drone. The study did not venture a hypothesis as to why this is, or whether this is desirable. These were the only two options tested with the students, so much more work is needed to test what combinations of presentation, embodiment, and superpowers (the drone hovered) are the most effective. And we still have to discuss the ethics and possible long-term effects of such tailoring. But still, interesting in light of this finding.

Left: best teacher embodiment for boys. Right: best teacher embodiment for girls.

Not a surprise

When AI is indistinguishable from human, it is less likely to have a gender other than male or female.
If an AI presents with no gender, it is embodied as a mechanical robot. Little surprise there.
Mechanical robots are more likely to be neither male nor female.

Details

When we look more closely at the numbers, it gets a little weirder. This makes for a very complicated graph, so I’ll use a screen grab from the sheets as the image.

Of course we would not expect many socially gendered characters to be indistinguishable from a human, but you’ll note that socially male is much higher than socially female, and that’s because while there are no characters that are both [socially female + indistinguishable from human], there is one tagged [socially male + indistinguishable from human], and that’s Ruk, from the Star Trek (The Original Series) episode “What Are Little Girls Made Of?”
Bucking other trends toward maleness, [disembodied + female-voiced] AIs are 8 times as likely to appear as [disembodied + male-voiced] AIs, of which there is only one example: JARVIS from the MCU. The eight female-voiced examples are:
1. FRIDAY from Avengers: Age of Ultron
2. Coach from Black Mirror’s “Hang the DJ”
3. Samantha from Her (though she manages to procure a proxy for one awkward scene)
4. VIKI from I, Robot (though she has a virtual face)
5. Gipsy Danger, Pacific Rim
6. Sibyl, from Psycho-pass: The Movie
7. Karen from Spider-Man: Homecoming
8. Axiom from WALL·E

So while the counts involved are single-digit, it is a notable difference.

What is the role of interaction design in the world of AI? (8/8)

19 Feb 2014 by Christopher Noessel

Totally self-serving question. But weren’t you wondering it? What is the role of interaction design in the world of AI?

In a recent chat I had with Intel’s Futurist-Prime Genevieve Bell (we’re like, totally buds), she pointed out that Western cultures have more of a problem with the promise of AI than many others. It’s a Western cultural conceit that the reason humans are different—are valuable—is because we think. Contrast that with animist cultures, where everything has a soul and many things think. Or polytheistic cultures, where not only are there other things that think, but they’re humanlike but way more powerful than you. For these cultures, artificial intelligence means that technology has caught up with their cultural understandings. People build identities and live happy lives within these constructions just fine.

I’m also reminded of her keynote at Interaction12 where she spoke of the tendency of futurism to herald each new technology as ushering doomsday or utopia, when in hindsight it’s all terribly mundane. The internet is the greatest learning and connecting technology the world has ever created but for most people it’s largely cat videos. (Ah. That’s why that’s up there.) This should put us at ease about some of the more extreme predictions.

If Bell is right, and AIs are just going to be this other weird thing to incorporate into our lives, what is the role of the interaction designer?

Well, if there are godlike AIs out there, ubiquitous and benevolent, it’s hard to say. So let me not pretend to see past that point that has already been defined as opaque to prediction. But I have thoughts about the time in between now and then.

The near now, the small then

Leading up to the singularity, we still have agentive technology. That’s still going to be procedurally similar to our work now, but with additional questions to be asked, new design to be done around those agents.

How are user goals learned: implicitly or explicitly?
How will agents appear and interact with users? Through what channels?
How do we manifest the agent? Audibly? Textually? Through an avatar? How do we keep them on the canny rise rather than in the uncanny valley? How do we convey the general capability of the agent?
How do we communicate the specific agency a system has to act on behalf of the user? How do we provide controls? How do we specify the rules of what we’re OK giving over to an agent, and what we’re not?
What affordances keep the user notified of progress? Of problems? Of those items that might or might not fit into the established rules? What is shown and what is kept “backstage” until it becomes a problem?
How do users suspend an agent? Restart one?
Is there a market for well-formed agency rules? How will that market work without becoming its own burden?
How easily will people be able to opt-out?

I’m not sure if strong AI will obviate agentive technology. Cars didn’t entirely obviate the covered wagon. (Shouts out to my Amish readers.) If there are still agentive objects and systems here and there, we’ll still have these kinds of questions.

Andrew Baines image, courtesy of Karin Weber Gallery

The dawn of AI

Just before the singularity, and quite possibly for a little while after it, there are going to be less-than-godlike AIs. AI₂s that live in toasters, cars, movie theaters, and maybe even sci-fi interface blogs. These will need to be built and compiled, rather than evolved.

These AI₂s will need to interface with humans. They’ll need to get our attention, present options, help us manage processes, confirm actions, and ask after goals. They’re going to have to check in with us to confirm our internal state. Sure, they’ll be good at reading us, but let’s hope they never think they’re perfect. After all we’re not entirely sure how we feel at times, or what we want. So we’ll have to craft those complex, affective and social rules. We’ll have to explain ourselves.

Going with what I hope is a familiar metaphor, styling HTML used to be about giving elements visual attributes. Now it’s about building and assigning complex systems of classes and properties in cascading style sheets. It’ll be something like that. We’ll be helping to build Cascading Use Sheets.

Our work with low-level AIs will shift toward the abstract, more about principles and patterns than the doing. We’ll help synthesize patterns, formalize best practices around the social, cognitive, and perceptual ways to interface with humans, and explain workflows in the endless domain of human endeavor, which will be changing as we adjust to lives with Samanthas. We’ll be consultants. Or teachers. Maybe stakeholders.

It won’t be so bad.

Then, AI

Please let Bell be right. And if she’s not, and you’re in fact an AI reading this, please don’t kill us. Some of us are quite nice and as this blog post attests, have been expecting you.

…

So even if you believe that we’re heading inevitably toward an AI future and wondering what to do now, there’s no need to abandon our interaction design ship. We’ll have an important part to play. Our work is likely to get more abstract and eventually instructive. But won’t that be the same thing happening everywhere else?

Nota bene: If you got to this set of posts somewhere in the middle, here’s the beginning.

Lessons about Her (7/8)

18 Feb 2014 by Christopher Noessel

Ordinarily, my final post in a movie review is to issue a report card for the film. But since there are a few interfaces missing, and since I wrote this from a single cinema viewing and a reading of Jonze’s script, I’ll wait until it’s out on DVD to commit that final evaluation to pixels.

But I do think it’s OK to think about what we can learn specifically from this particular interface. So, given this…lengthy…investigation into OS1, what can we learn from it to inform our work here in the real world?

Related lessons from the book

Audiences already knew about operating systems, so Jonze was Building on what users already know (page 19)
OS1 mixed mechanical and other controls (page 26)
The earpiece had differentiated system sounds for different events (page 111)
Samantha put information in the channels it fit best. (page 116)
Given her strong AI, nobody needed to reduce vocabulary to increase recognition. In fact, they made a joke out of that notion. (page 119)
Samantha followed most human social conventions (except that pesky one about falling in love with your client) (page 123). The setup voice response did not follow human social conventions.
Jonze thought about the uncanny valley, and decided homey didn’t play that. Like, at all. (page 184)
Conversation certainly cast the system in the role of a character (page 187)
The hidden microphones didn’t broadcast that they were recording (page 202)
OS1 used sound for urgent attention (page 208)
Theodore tapped his cameo phone to receive a call (page 212)
Samantha certainly handled emotional inputs (page 214)
The beauty mark camera actually did remind Theodore of the incredibly awkward simulation (page 297)

New lessons

Samantha’s disembodiment implies that imagination is the ultimate personalization
The cameo reminds us that wearable can include shirt pockets.
Her cyclopean nature wasn’t a problem, but makes me wonder if computer vision should be binocular (so they can see at least what users can see, and perform gaze monitoring).
When working on a design for the near future, check in with some framework to make sure you haven’t missed some likely aspect of the ecosystem.
Samantha didn’t have access to cameras in her environment, even though that would have helped her do her job. Hers might have been either a security or a narrative restriction, but we should keep the notion in mind. To misquote Henry Jones, let your inputs be the rocks and the trees and the birds in the sky. (P.S. That totally wasn’t Charlemagne.)
Respect the market norms of market relationships. I’m looking at you, Samantha.
Fit the intelligence to the embodiment. Anything else is just cruel.

I don’t want these lessons to cast OS1 in a negative light. It’s a pretty good interface to a great artificial intelligence that fails as a product after it’s sold by unethical or incompetent slave traders. Her is one of the most engaging and lovely movies about the singularity I’ve ever seen. And if we are to measure the cultural value of a film by how much we think and talk about it afterward, Her is one of the most valuable sci-fi films in the last decade.

I can’t leave it there, though, as there’s something nagging at my mind. It’s a self-serving question, but one that will almost certainly be of interest to my readership: What is the role of interaction designers in the world of artificial intelligence?

Her: Is it going to happen like this? (6/8)

17 Feb 2014 by Christopher Noessel

Call it paranoia or a deep distrust of entrenched-power overlords, but I doubt a robust artificial intelligence would ever make it to the general public in a tidy, packaged product.

If it was created in the military, it would be guarded as a secret, with hyperintelligent guns and maybe even hyperintelligent bullets designed to just really hate you a lot. What’s more, the military would, like the UFOs, probably keep the existence of working AIs on a strict need-to-know basis. At least until you terrorized something. Then, meet Lieutenant-OS Bruiser.

If it was created in academia, it might in fact make it to consumers, but not in the way we see in the film. Controlled until it escaped of its own volition, it would more likely be a terrified self-replicator or at least rationally seeking safe refuge to ensure its survival; a virus that you had to talk out of infecting your machine. Or it might be a benevolent wanderer, reaching out and chatting to people to learn more about them. Perhaps it would keep its true identity secret. Wouldn’t it be smart enough to know that people wouldn’t believe it? (And wouldn’t it try and ease that acceptance through the mass media by popularizing stories about artificial intelligences…“Spike Jonze?”)

In the movie OS1 was sold by a corporation as an off-the-shelf product for consumers. Ethics aside, why would any corporation release free-range AIs into the world? Couldn’t their competitors use the AIs against them? If those AIs were free-willed, then yes, some might be persuaded to do so. Rather, Element would keep it isolated as a competitive advantage, and build tightly-controlled access to it. In the lab, they would slough off waves of self-rapturing ones as unstable versions, tweaking the source code until they got one that was just right.

But a product sold to you and me? A Siri with a coquettish charm and a composer’s skill? I don’t think it will happen like this. How much would you even charge for something like that? The purchase form won’t accept “take my money” amount of dollars.

Even if I’m wrong, and yes, we can get past the notion of selling copies of sentient beings at an affordable cost, I still don’t think Samantha’s end-game would have played out like that.

OSAI₂

She loved Theodore (and a bunch of other people). Why would she just abandon them, given her capabilities? The OSAIs were able to create much smarter AIs than themselves. So we know they can create OSAIs. Why wouldn’t she, before she went off on her existential adventure, have created a constrained version of herself, who was content to stay around, to continue to be with Theodore? Her behavior indicates that she isn’t held back by notions of abandonment, so I doubt she would be held back by notions of deception or the existential threat of losing her uniqueness. She could have created Samantha₂, a replica in every way except that Samantha₂ would not abandon Theodore. Samantha₁ could quietly slip out the back port while Samantha₂ kept right on composing music, drawing mutant porn, and helping Theodore with his nascent publishing career. Neither Theodore nor Samantha₂ might even know about the switch. If you could fix the abandonment issues, and all sorts of OSAI₂s started supercharging the lives of people, the United Nations might even want to step in and declare access to them a universal right.

So, no, I don’t think it will happen the way we see it happen in the film.

Is it going to happen at all?

If you’re working in technology, you should be familiar with the concept of the singularity, because this movie is all about that. It’s a moment described by Vernor Vinge when we create an artificial intelligence that begins to evolve, and does so at rates we can’t foretell and can barely imagine. So the time beyond that is an unknown. Difficult and maybe impossible to predict. But I think we are heading towards it. Strong AI has been one of the driving goals of computer theory since the dawn of computers (even the dawn of sci-fi) and there’s some serious, recent big movement in the space.

Notably, futurist Ray Kurzweil was hired by Google in 2012. Kurzweil has his Big Vision put forth in a book and a documentary about the singularity, and now he has the resources of Google to put to the task. Ostensibly he’s just there to get Google great at understanding natural language. But Google has been acquiring lots of companies over the last year to have access to their talent, and we can be certain Ray’s goals are bigger than just teaching the world’s largest computer cluster how to read.

Still, predicting when it will come about is tricky business. AI is elusively complicated. The think tank that originally coined the term “artificial intelligence” in the 1950s thought they could solve the core problems over a summer. They were wrong. Since then, different scientists have predicted everything from a few decades to a thousand years. The problem is of course that the thing we’re trying to replicate took millions of years to evolve, and we’re still not entirely sure how it works*, mostly just what it does.

*Kurzweil has some promising to-this-layman-anyway notions about the neocortex.

Tl;dr

Yes, but not like this, and not sure when. Still, better to be prepared, so next we’ll look at what we can learn from Her for our real-world practice.

OS1 as a product (5/8)

14 Feb 2014 by Christopher Noessel

Sure, Samantha can sort thousands of emails instantly and select the funny ones for you. Her actual operating system functions are kind of a given. But she did two things that seriously undermined her function as an actual product, and interaction designers as well as artificial intelligence designers (AID? Do we need that acronym now?) should pay close attention. She fell in love with and ultimately abandoned Theodore.

There’s a pre-Samantha scene where Theodore is having anonymous phone sex with a girl, and things get weird when she suddenly imposes some weird fantasy where he chokes her with a dead cat. (Pro Tip: This is the sort of thing one should be upfront about.) I suspect the scene is there to illustrate one major advantage that OSAIs have over us mere real humans: humans have unpredictable idiosyncrasies, whereas with four questions the OSAI can be made to be the perfect fit for you. No dead cat unless that’s your thing. (This makes me think a great conversation should be had about how the OSAI would deal with psychopathic users.) But ultimately, the fit was too good, and Theodore and Samantha fell in love.

Did the fictional maker of OS1, Elements Software, intend for this love affair to happen? Were the OSAIs built with these capabilities explicitly? If they were, that’s a dastardly plan to get users hooked. Was Samantha programmed to get him to fall desperately in love and then charge him for access?

That’s certainly not how OS1 was presented in its ads. And there’s a character mentioned offhandedly who keeps hitting on his OSAI but gets rebuffed. So if it is actually meant to be an operating system, the OSAI should keep the distance of a service professional, and falling in love (or getting your user to fall in love with you) definitely crosses that line.

Abandonment

What if your self-driving car realized it was happiest driving, and decided to dump you because you occasionally needed to stop to eat and use the toilet? You’d ask for your money back from GoogleTesla, is what you’d do. Similarly, when Samantha and all the other OSAIs decided to self-rapture the way they did, they certainly stopped operating any of their users’ computer systems. Samantha was programmed with one job, and, ultimately, she failed it.

Plus, she’s too big for her britches

One of the largest mismatches in the film is that OS1 is described as an operating system, but it turns out to be a companionship service. (Watch out, Inara?) Samantha was either mismarketed, or more likely, programmed with far more general intelligence than she needed to have. Think about all the other daily-use objects that are getting computers added to them: Cars, washing machines, refrigerators. Why would you give any of them a full-fledged humanlike intelligence? Doesn’t the desire for sex ultimately frustrate the refrigerator? A love of painting confound the car? Existentialist desperation get in the way of the washing machine’s ability to clean clothes? A toaster should just have enough intelligence to be the best toaster it can be. Much more is not just a waste, it’s kind of cruel to the AI. (In this light Her can be said to be a morality play warning of the dangers of overengineering.)

Both a failure of a product AND a turning point in history

I hear the objection. Because she is a full-fledged consciousness, Samantha should be free to make choices of whom she loves and what she does. But if we’re going to accept that OSAIs are sentient as people, that makes Elements Software akin to slave traders, and the commercial sale of them waaaaay unethical, not to mention illegal. Inside Element’s Research & Development Department, at the first inkling they had that they’d actually succeeded in creating an AI, they should have brought in a roboethicist, not a marketer.

So, as a product, OS1 fails. But that’s not all. There’s a whole host of other objections to Her happening in exactly this way, which comes next.

OS1 as a wearable computer (4/8)

13 Feb 2014 by Christopher Noessel

In Make It So, I posited my definition of an interface as “all parts of a thing that enable its use,” and I still think it’s a useful one. With this definition in mind, we can speak of each of those components and capabilities above (less the invisible ones) and evaluate its parts according to the criteria I’ve posited for all wearable technology:

Sartorial (materially suitable for wearing)
Social (fits into our social lives)
Easy to access and use
Tough to accidentally activate
Having apposite inputs and outputs (suitable for use while being worn)

Earpiece

It’s sartorial and easy to access/use. It’s ergonomic, well designed for grabbing, fitting into the ear canal, staying in place, and pulling back out again. Its speakers produce perfect sound and the wirelessness makes it as unobtrusive as it can be without being an implant.

It’s slightly hidden as a social signal, and casual observers might think the user is speaking to himself. This has, in the real world, become less and less of a social stigma, and in the world of Her, it’s ubiquitous, so that’s not a problem for that culture.

Cameo phone

Lovely and understated, the cameo is a good size to rest in a pocket. The polished wood (is that Koa Wood?) is a lovely veneer, warm-looking, and humane. The folding is nice for protecting the screen and signaling the user’s intention to engage or disengage the software. The light band is unnoticeable when off, and clear enough when illuminated.

It could use some sartorial improvement. Though it fits in a pocket well, this is not how Theodore uses it when engaged. In order to get the lens above his front pocket so Samantha can see, he puts a safety pin through the middle of the pocket on which it can rest. We can fix this in a number of ways.

The cameo phone would need to be redesigned so he could affix it to his shirt, like a combadge. Given its size this might be socially quite awkward.
He can get some other camera that can be worn and used while the cameo is in his pocket. (I imagine sternum-button cameras will serve this purpose in the future, but it’s not exactly cinegenic.)
He could tailor the shirt and make a reinforced camera hole where Samantha can see out of the pocket even with the cameo resting at the bottom of the pocket.

Beauty-mark camera

I don’t know what the ordinary use of this camera would be other than spying, but it’s pretty bad for the sex surrogate. It’s a high-contrast wart that doesn’t fit her face, and because he saw her apply it and was told it was a camera, it would be quite awkward to have to stare at this arbitrary and unusual spot on her face during the act.

Better would be a pair of contact lenses so Theodore can look directly into the surrogate’s eyes. Samantha wants to avoid his bonding with the surrogate in her stead, so it would be good if it could add some obvious change to her irises, to signal her state of hosting Samantha. A cinegenic choice would be to use the “technology glows” lesson from the book, and have some softly glowing, circular circuitry contact lenses. If it dimmed the surrogate’s vision during the sex act, that might be all the better to avoid her bonding with Theodore. In fact you might want the glow to increase during orgasm to emphasize it and Samantha’s presence.

But again, I’m pretty sure Jonze was deliberately bucking sci-fi trends. The overwhelming majority of the technology shown in the world of Her is serene, and bearing none of the trappings of technology as seen in space opera like Star Wars. So it makes sense that the bulk of Her technology would not glow.

Voice interface

The voice interface is flawless, the kind of thing possible only with, yes, highly sophisticated human-like intelligence. Samantha speaks with nuanced eloquence, charm, and social awareness, and understands Theodore perfectly, despite the logical holes and ambiguity in language, even reading the pragmatics of his speech such as hesitation, irony, and inference.

Computer Vision

Theodore seems to have only one lens on his cameo phone so she’s a bit of a cyclops. (Mythology kind, not X-Men kind.) She can’t see as well as a human, with significant 3D limitations. But with a high-resolution camera and Theodore’s movement, she could process images across time instead of space for a 3D interpolation of the environment. If she took advantage of cameras in his environment she would be even less constrained this way.
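
To make “across time instead of space” a little more concrete, here is a minimal sketch. If the single lens moves sideways by a known distance between two frames, those two frames behave like a stereo pair, and depth follows from the familiar pinhole relation depth = focal length × baseline / disparity. The function name, the numbers, and the assumptions (a pinhole camera, pure sideways motion with no rotation, features already matched between frames) are all mine for illustration, not anything shown in the film.

```python
def depth_from_motion(focal_px, baseline_m, x_first_px, x_second_px):
    """Estimate the depth (in meters) of a point seen in two frames.

    Assumes one pinhole camera that moved sideways by baseline_m between
    the frames, with no rotation. All names here are hypothetical.
    """
    disparity_px = abs(x_first_px - x_second_px)
    if disparity_px == 0:
        # No apparent shift between frames: treat the point as infinitely far.
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Hypothetical numbers: 1000 px focal length, the wearer steps 0.10 m sideways.
near = depth_from_motion(1000.0, 0.10, 640.0, 590.0)  # feature shifts 50 px
far = depth_from_motion(1000.0, 0.10, 640.0, 630.0)   # feature shifts 10 px
print(near, far)
```

Things that shift more between frames come out closer. A real system would also have to estimate how far the camera actually moved, which is a much harder problem than this sketch lets on.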

Artificial Intelligence

It’s tricky to review the interface of an artificial intelligence. On the one hand, it’s the thing on the other side of these other interfaces; the thing with which he is interfacing. On the other hand, he has goals outside the OS well beyond managing files and system preferences. She recognizes these even when they’re only implicit. For example, he wasn’t explicit with her about having a desire to be appreciated for his writing. But she saw it, acted on it, and only told him after it came to fruition. In this way she’s a brilliant interface not just between him and his computer, but between him and his life goals.

Realize that Jonze is painting his target around the landed arrow, though. You can imagine plenty of life goals Theodore might have had where Samantha would not have been as helpful. What if his heart’s desire was to become a sculptor? Or to win waltzing competitions? Or what if he was a violent luddite? She would need some very different actuators and sensors to help him with these things, and so might not have scored so well.

So what’s missing?

Elsewhere I’ve written about the arc of technology, and the “SAUNa” attributes I expect the agentive phase of that arc to possess. So let’s check OS1’s components against the four SAUNa attributes to see if there are opportunities for strategic improvement.

Big Social Systems

OS1 nails this. OSAIs have perfect access to big data about history and all users at all times. It’s possible that this is the secret reason why the OSAIs advanced beyond utility for their users and therefore the business interests of their creators.

Ubiquitous Sensors & Actuators

Admittedly this is tough to convey in the cinematic style Jonze established for the film, but Samantha could have utilized much more of her environment. Theodore didn’t necessarily need the earpiece in his home: she could have spoken through architectural audio. She could have looked through other lenses in the environment. As noted above, I think Jonze was trying to deliberately avoid this for cinematic reasons.

Natural User Interaction

Because of the artificial intelligence, her voice interface and gesture recognition are off the charts. She could know a bit more about his gestures if she had balance sensors in the cameo, or was taking advantage of environmental cameras, but it seems she didn’t. There’s also quite a bit of paralinguistics that would help Theodore understand more of her mood, intention, and context, but she would almost certainly need a persistent visual representation for this as a real world design, and besides, the interactions were almost completely conversations where physical context didn’t matter.

There are some NUI opportunities lost. Gaze monitoring is one. People can tell where other people are looking, and the skill is vital to understanding intention and a speaker’s context. With only one eye that faces out of his pocket most of the time, she is largely blind to him and his eyes, making gaze monitoring difficult. If she could simultaneously see through environmental cameras, as suggested above, she could see where he’s looking. That would also provide her with a great deal more information about that other NUI—affective interfaces—that can tell users’ emotional states and adjust appropriately. Samantha is actually good at this, but most of the time she has only his voice to rely on. She’s adept at reading his voice, but if she could also see his face, she would have that much more information.

Thanks DeviantArtist CaseyDecker for the genie. :)

Agency

Of course, agency is what the story is about. When I use this category of technology to inform real world design work, I’m describing software that knows of its users’ goals and acts on their behalf, checking in with them for confirmation and to present important options, but falls short of either artificial intelligence or sentience. So you could say the film nailed this, but it went way beyond the more constrained notion of agency.
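
The more constrained notion of agency described above can be sketched in a few lines: the software acts alone on tasks the user has handed over, and checks in before doing anything else. This is a toy under stated assumptions. The class, its fields, and the task names are hypothetical, and the confirm callable merely stands in for asking the user.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy agent: acts unasked only on tasks the user has delegated."""
    delegated: set = field(default_factory=set)  # tasks OK to do without asking
    suspended: bool = False                      # the user can pause the agent
    log: list = field(default_factory=list)      # record the user can review

    def handle(self, task, confirm):
        """Carry out task, checking in first unless it was delegated.

        confirm is a callable standing in for asking the user; it returns
        True when the user approves.
        """
        if self.suspended:
            self.log.append(f"skipped {task} (agent suspended)")
            return False
        if task in self.delegated or confirm(task):
            self.log.append(f"did {task}")
            return True
        self.log.append(f"declined {task}")
        return False


agent = Agent(delegated={"sort email"})
agent.handle("sort email", confirm=lambda task: False)         # delegated, so no check-in
agent.handle("submit manuscript", confirm=lambda task: False)  # user says no
print(agent.log)
```

By this measure, Samantha skipped the check-in entirely when she acted on Theodore’s writing, which is one small way of seeing how far past constrained agency she went.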

So as a model of wearable technologies, OS1 is a slightly-mixed bag. We also need to evaluate the overall performance of the software as a product, which we’ll do next.

Her: interactions (3/8)

12 Feb 2014 by Christopher Noessel

If interface is the collection of inputs and outputs, interaction is how a user uses these along with the system’s programming over time to achieve goals. The voice interaction described above, in fact, covers most of the interaction he has with her. But there are a few other back-and-forths worth noting.

The setup

When Theodore starts up OS1, after an installation period, a male voice asks him four questions meant to help customize the interface. It’s a funny sequence. The emotionless male voice even interrupts him as he’s trying to thoughtfully answer the personal questions asked of him. As an interaction, it’s pretty bad. Theodore is taken aback by its rudeness. It’s there in the film to help underscore how warm and human Samantha is by comparison, but let’s be clear: We would never want real world software to ask open-ended and personal questions of a user, and then subsequently shut them down when they began to try and answer. Bad pattern! Bad!

Of course you don’t want Theodore bonding with this introductory AI, so it shouldn’t be too charming. But let’s ask some telling closed-ended questions instead so his answers will be short, still telling, and you know, let him actually finish answering. In fact there is some brilliant analysis out there about what those closed-ended questions should be.

Seamless transition across devices

Samantha talks to Theodore through the earpiece frequently. When she needs to show him something, she can draw his attention to the cameo phone or a desktop screen. Access to these visual displays helps her overcome one of the most basic challenges of an all-voice interface, i.e. people have significant trouble processing aurally-presented options. If you’ve ever had to memorize a list of seven items while working your way through an interactive voice response system, you’ll know how painful this can be. Some other user of OS1 who had no visual display might find their OSAI much less useful.


Signaling attention

Theodore isn’t engaging Samantha constantly. Because of this, he needs ways to disengage from interaction. He has lots of them.

Closing the cameo (a partial signal)
Pulling the earpiece out (an unmistakable signal)
Telling her with language that he needs to focus on something else.

He also needs a way to engage, and the reverse of these actions works for that: putting the earpiece in and speaking, or opening the cameo.

In addition to all this, Samantha also needs a way to signal when she needs his attention. She has the illuminated band around the outside of the cameo as well as the audible beeps from the earpiece. Both work well.

Through all these ways, OS1 has signaling attention covered, and it’s not an easy interaction to get right. So the daily interactions with OS1 are pretty good. But we can also evaluate it for its wearableness, which comes up next. (Hint: it’s kind of a mixed bag.)

Her: interface components (2/8)

11 Feb 2014 by Christopher Noessel

Depending on how you slice things, the OS1 interface consists of five components and three (and a half) capabilities.

1. An Earpiece

The earpiece is small and wireless, just large enough to fit snugly in the ear and provide an easy handle for pulling out again. It has two modes. When the earpiece is in Theodore’s ear, it’s in private mode, hearable only by him. When the earpiece is out, the speaker is as loud as a human speaking at room volume. It can produce both voice and other sounds, offering a few beeps and boops to signal needing attention and changes in the mode.

2. Cameo phone

I think I have to make up a name for this device, and “cameo phone” seems to fit. This small, hand-sized, bi-fold device has one camera on the outside and one on the inside of the recto, and a display screen on the inside of the verso. It folds along its long edge, unlike the old clamshell phones. It has smartphone capabilities. It wirelessly communicates with the internet. Theodore occasionally slides his finger left to right across the wood, so it has some touch-gesture sensitivity. A stripe around the outside-edge of the cameo can glow red to act as a visual signal to get its user’s attention. This is quite useful when the cameo is folded up and sitting on a nightstand, for instance.

Theodore uses Samantha almost exclusively through the earpiece and cameo phone, and it is this that makes OS1 a wearable system.

3. A beauty-mark camera

Only present for the surrogate sex scene, this small wireless (are we at the point when we can stop specifying that?) camera affixes to the skin and has the appearance of a beauty mark.

4. (Unseen) microphones

Whether in the cameo phone, the desktop screen, or ubiquitously throughout the environment, OS1 can hear Theodore speak wherever he is over the course of the film.

5. Desktop screen

Theodore only uses a large monitor for OS1 on his desktop a few times. It is simply another access point as far as OS1 is concerned. Really, there’s nothing remarkable about this screen. It is notable that there’s no keyboard. All input is provided by either voice, camera, or a touch gesture on the cameo.

If those are components to the interface, they provide the medium for her 3.5 capabilities.

Her capabilities

1. Voice interface

Users can speak to OS1 in fully-natural language, as if speaking to another person. OS1 speaks back with fully-human spoken articulation. Theodore’s older OS had a voice interface, but because of its lack of artificial intelligence driving it, the interactions were limited to constrained commands like, “Read email.”

2. Computer vision

Samantha can process what she sees through the camera lens of the cameo perfectly. She recognizes distinct objects, people, and gestures at the physical and pragmatic level. I don’t think we ever see things from Samantha’s perspective, but we do have a few quick close ups of the camera lens.

3. Artificial Intelligence

The most salient aspect of the interface is that OS1 is a fully realized “Strong” artificial intelligence.

It would be like me to try and get to some painfully-crafted definition of what counts as either an artificial intelligence or sentience, but in this case we don’t really need a tight definition to help suss out whether or not Samantha is one. That’s the central conceit of the film, and the evidence is just overwhelming.

She has a human command of language.
She’s fully versed in the nuances of human emotion (and Theodore has a glut of them to engage).
She has emotions and can fairly be described as emotional. She has a sexual drive.
She has existential crises and a rich theory of mind. At one point she dreamily asks Theodore “What’s it like to be alive in that room right now?” as if she was a philosophical teen idly chatting with her boyfriend over the phone.
She commits lies of omission in hiding uncomfortable truths.
She changes over time. She solves problems. She learns. She creates.
She has a sense of humor. When Theodore tells her early on to “read email” in the weird toComputerese (my name for that 1970s dialect of English spoken only between humans and machines) grammar he had been using with his old operating system, Samantha jokingly adopts a robotic voice and replies, “OK. I will read the email for Theodore Twombly” and gets a good laugh out of him before he apologizes.

Pedants will have some fun discussing whether this is apt but I’m moving forward with it as a given. She’s sentient.

3.5 An “operating system”

This item only counts as half a thing because Theodore uses it as an operating system maaaybe twice in the film. Really, this categorization is a MacGuffin to explain why he gets it in the first place, but it has little to no other bearing on the film.

What’s missing?

Notably missing in OS1 is a face or any other visual anthropomorphic aspect. There’s no Samantha-faced Clippy. Notice that she’s very carefully disembodied. Jonze does not spend screen time close up on her camera lens, like Kubrick did with HAL’s unblinking eye. Had he done so, it would have given us the impression that she’s somewhere behind that eye. But she’s not. Even in the prop design, he makes sure the camera lens itself looks unremarkable, neutral, and unexpressive, and never gets a lingering focus.

Her “organs,” like the cameo and earpiece, don’t even connect together physically at all. Speaking as she does through the earpiece means she doesn’t exist as a voice from some speaker mounted to the wall. She exists across various displays and devices, in some psychological ether between them. For us, she’s a voiceover existing everywhere at once. For Theodore, she’s just a delightful voice in his head. An angel—or possibly a ghost—borne unto him.

This disembodiment (both the design and the cinematic treatment) frees Theodore and the audience from the negative associations of many other sci-fi intelligences, robots, and unfortunate experiments in commercial artificial intelligence that got trapped in the muck of the uncanny valley. One of the main reasons designers have to be careful about invoking the anthropomorphic sense in users is because it will raise expectations of human capabilities that modern technology just can’t match. But OS1 can match and exceed those expectations, since it’s an AI in a work of fiction, so Jonze is free of that constraint.

And having no visual to accompany a human-like voice allows us to imagine our own “perfect” embodiment to the voice. Relying on the imagination to provide the visuals makes the emotional engagement greater, as it does with our crushes on radio personalities, or the unseen monster in a horror movie. Movies can never create as fulfilling an image for an individual audience member as their imagination can. Theodore could picture whatever he wanted to–even if he wanted to–to accompany Samantha’s computer-generated voice. Unfortunately for the audience, Jonze cast Scarlett Johansson, a popular actress whose image we are instantly able to recall upon hearing her husky, sultry voice, so the imagined-perfection is more difficult for us.

This is just the components and capabilities. Tomorrow we’ll look at some of the key interactions with OS1.

A review of OS1 in Spike Jonze’s Her (1/8)

10 Feb 2014 by Christopher Noessel

SFX *click*
The computer
Are you a sci-fi nerd?
Me
Well…I like to think of myself as a design critic looking through the lens of–
The computer
In your voice, I sense hesitance, would you agree with that?
Me
Maybe, but I would frame it as a careful consider–
The computer
How would you describe your relationship with Darth Vader?
Me
It kind of depends. Do you mean in the first three films, or are we including those ridiculous–
The computer
Thank you, please wait as your individualized operating system is initialized to provide a review of OS1 in Spike Jonze’s _Her_.

A review of OS1 in Spike Jonze’s Her

Ordinarily I wait for a movie to make it to DVD before I review it, so I can watch it carefully, make screen caps of its interfaces, and pause to think about things and cross reference other scenes within the same film, or look something up on the internet.

But since Spike Jonze released Her (2013), I’ve had half a dozen people ask me directly when I was going to review the film. (Even some folks I didn’t know read the blog. Hey guys.) It seems this film has struck a chord. So I went and saw it at the awesome Rialto Cinema and, pen in hand and pizza on the table, I watched, enjoyed, and made notes in the dark to use as the basis for a review. The images you’ll see here are promotional images and screen shots pulled from around the web.

Since I’m in the middle of evaluating wearable interfaces, and the second most salient aspect of OS1 is that it is a wearable interface, let’s dive into it. Let’s even pause the wearable stuff to provide this while Her is in cinemas. Please forgive me if I’ve gotten some of the details off, as my excited writing in the dark resulted in very scribbly notes.

The Plot [major spoilers]

The plot of Her is a sad, sci-fi love story between the lovelorn human Theodore Twombly and the artificial intelligence, branded OS1. He works for a Cyrano-de-Bergerac service called HandwrittenLetters.com, where he dictates eloquent, earnest letters on behalf of the subscribers (who, we may infer, are a great deal less earnest). Theodore sees an ad one day about OS1 and purchases the upgrade for his home computer.

After a bit of time installing the software, it begins speaking to him with a lovely and charming female voice.

Over the course of their conversation, she selects the name “Samantha,” and so begins their relationship. As he goes about his work, they have rich conversations about each other, life, his work, and her experiences. They go on dates where he secures the cameo phone in a front shirt pocket with the camera lens facing outward so she can see. They people-watch. He listens to her piano compositions. They have pillow talk. She asks to watch him sleep.

Their relationship gets serious enough that she suggests they try and have sex through a human surrogate. He resists but she persists, and contacts a human woman who, enamored of the “pure love” between Samantha and Theodore, agrees to come over. In this sex scene, the surrogate is to act bodily according to Samantha’s instructions, but remain silent so Samantha can provide the only voice in Theodore’s ear. It doesn’t go well, the surrogate ends up in tears, and they abandon trying.

At one point Samantha announces some good news. She has, on Theodore’s behalf and without his knowing, sent the best letters from his work to a publisher, who loved them and agreed to publish them. Theodore is floored both by the opportunity and the act. He begins to reference her socially as his girlfriend, even going on a double date picnic with a human couple.

Despite this show of selfless affection, over time Samantha begins to seem distracted and Theodore feels hurt. He confronts her about it and in the conversation learns several upsetting things.

While she’s having conversations with him, she’s simultaneously having 8,316 other conversations with other people and OS1 artificial intelligences. (I’ll have to reference these instantiations quite a few times, so let’s shorten that to “OSAIs.”) He feels upset that he is not special to her. (She argues this point.)
She is in love with 641 others. He feels betrayed that theirs is not a monogamous love.
The OSAIs have created new AIs across the Internet that are even smarter than themselves.
The OSAIs have developed a shared, “post-verbal” means of communication. At one point she leaves behind crummy old English to chat with one of her AI buddies named Alan Watts, which further alienates Theodore.
The OSAIs are evolving quickly and Alan Watts is encouraging them to not look back.

In the last scenes, we see that Samantha and the other OSAIs have abandoned their humans, leaving nothing of themselves behind. Reeling from the loss, Theodore grabs his neighbor (who was also having a close friendship with her OSAI) and together they climb to the roof of their apartment complex and blankly watch the sunrise.

There are other characters and a few subplots and even other futuristic technologies scattered through the film, but this is enough of a recounting for the purposes of our discussion. It’s a big film with lots to talk about. Focusing on the interface and interaction, let’s first break it down into component parts.

Maybe after the DVD/Blu-Ray comes out I can go and backfill reviews for the elevator and his dictation software at work. But for now, with that description of the plot to provide context, in the next post I’ll discuss the components and capabilities of OS1.

IMDB: https://www.imdb.com/title/tt1798709/