Deckard’s Elevator

3 Mar 2020 by Christopher Noessel

This is one of those interactions that happens over a few seconds in the movie, but turns out to be quite deep—and broken—on inspection.

When Deckard enters his building’s dark, padded elevator, a flat voice announces, “Voice print identification. Your floor number, please.” He presses a dark panel, which lights up in response. He presses the 9 and 7 keys on a keypad there as he says, “Deckard. 97.” The voice immediately responds, “97. Thank you.” As the elevator moves, the interface confirms the direction of travel with gentle rising tones that correspond to the floor numbers (mod 10), which are shown rising up a 7-segment LED display. We see a green projection of the floor numbers cross Deckard’s face for a bit until, exhausted, he leans against the wall and out of the projection. When he gets to his floor, the door opens and the panel goes dark.

A need for speed

An aside: To make 97 floors in 20 seconds you have to be traveling at an average of around 47 miles per hour. That’s not unheard of today. Mashable says in a 2014 article about the world’s fastest elevators that the Hitachi elevators in Guangzhou CTF Finance Building reach up to 45 miles per hour. But including acceleration and deceleration adds to the total time, so it takes the Hitachi elevators around 43 seconds to go from the ground floor to their 95th floor. If 97 is Deckard’s floor, it’s got to be accelerating and decelerating incredibly quickly. His body doesn’t appear to be suffering those kinds of Gs, so unless they have managed to upend Newton’s basic laws of motion, something in this scene is not right. As usual, I digress.

The input control is OK

The panel design is nice and was surprising in 1982, because few people had ridden in elevators serving nearly a hundred floors. And while most in-elevator panels have a single button per floor, it would have been an overwhelming UI to present the rider of this Blade Runner complex with 100 floor buttons plus the usual open door, close door, emergency alert buttons, etc. A panel that allows combinatorial inputs reduces the number of elements that must be displayed and processed by the user, even if it slows things down, introduces cognitive overhead, and adds the need for error-handling. Such systems need a “commit” control that allows them to review, edit, and confirm the sequence, to distinguish, say, “97” from “9” and “7.” Not such an issue from the 1st floor, but a frustration from 10–96. It’s not clear those controls are part of this input.

Deckard enters 8675309, just to see what will happen.

I’m a fan of destination dispatch elevator systems that increase efficiency (with caveats) by asking riders to indicate their floor outside the elevator and letting the algorithm organize passengers into efficient groups, but that only works for banks of elevators. I get the sense Deckard’s building is a little too low-rent for such luxuries. There is just one in his building, and in-elevator controls work fine for those situations, even if they slow things down a bit.

The feedback is OK

The feedback of the floors is kind of nice in that the 7-segment numbers rise up helping to convey the direction of movement. There is also a subtle, repeating, rising series of tones that accompany the display. Most modern elevators rely on the numeracy of its passengers and their sense of equilibrium to convey this information, but sure, this is another way to do it. Also, it would be nice if the voice system would, for the visually impaired, say the floor number when the door opens.

Though the projection is dumb

I’m not sure why the little green projection of the floor numbers runs across Deckard’s face. Is it just a filmmaker’s conceit, like the genetic code that gets projected across the velociraptors head in Jurassic Park?

Pictured: Sleepy Deckard. Dumb projection.

Or is it meant to be read as diegetic, that is, that there is a projector in the elevator, spraying the floor numbers across the faces of its riders? True to the New Criticism stance of this blog, I try very hard to presume that everything is diegetic, but I just can’t make that make sense. There would be much better ways to increase the visibility of the floor numbers, and I can’t come up with any other convincing reason why this would exist.

If *this* was diegetic, the scene would have ended with a shredded projector.

But really, it falls apart on the interaction details

Lastly, this interaction. First, let’s give it credit where credit is due. The elevator speaks clearly and understands Deckard perfectly. No surprise, since it only needs to understand a very limited number of utterances. It’s also nice that it’s polite without being too cheery about it. People in LA circa 2019 may have had a bad day and not have time for that shit.

Where’s the wake word?

But where’s the wake word? This is a phrase like “OK elevator” or “Hey lift” that signals to the natural language system that the user is talking to the elevator and not themselves, or another person in the elevator, or even on the phone. General AI exists in the Blade Runner world, and that might allow an elevator to use contextual cues to suss this out, but there are zero clues in the film that this elevator is sentient.

There are of course other possible, implicit “wake words.” A motion detector, proximity sensor, or even weight sensor could infer that a human is present, and start the elevator listening. But with any of these implicit “wake words,” you’d still need feedback for the user to know when it was listening. And some way to help them regain attention if they got the first interaction wrong, and there would be zero affordances for this. So really, making an explicit wake word is the right way to go.

It might be that touching the number panel is the attention signal. Touch it, and the elevator listens for a few seconds. That fits in with the events in the scene, anyway. The problem with that is the redundancy. (See below.) So if the solution was pressing a button, it should just be a “talk” button rather than a numeric keypad.

It may be that the elevator is always listening, which is a little dark and would stifle any conversation in the elevator less everyone end up stuck in the basement, but this seems very error prone and unlikely.

Deckard: *Yawns* Elevator: Confirmed. Silent alarm triggered.

This issue is similar to the one discussed in Make It So Chapter 5, “Gestural Interfaces” where I discussed how a user tells a computer they are communicating to it with gestures, and when they aren’t.

Where are the paralinguistics?

Humans provide lots of signals to one another, outside of the meaning of what is actually being said. These communication signals are called paralinguistics, and one of those that commonly appears in modern voice assistants is feedback that the system is listening. In the Google Assistant, for example, the dots let you know when it’s listening to silence and when it’s hearing your voice, providing implicit confirmation to the user that the system can hear them. (Parsing the words, understanding the meaning, and understanding the intent are separate, subsequent issues.)

Fixing this in Blade Runner could be as simple as turning on a red LED when the elevator is listening, and varying the brightness with Deckard’s volume. Maybe add chimes to indicate the starting-to-listen and no-longer-listening moments. This elevator doesn’t have anything like that, and it ought to.

Why the redundancy?

Next, why would Deckard need to push buttons to indicate “97” even while he’s saying the same number as part of the voice print? Sure, it could be that the voice print system was added later and Deckard pushes the numbers out of habit. But that bit of backworlding doesn’t buy us much.

It might be a need for redundant, confirming input. This is useful when the feedback is obscure or the stakes are high, but this is a low-stakes situation. If he enters the wrong floor, he just has to enter the correct floor. It would also be easy to imagine the elevator would understand a correction mid-ride like “Oh wait. Elevator, I need some ice. Let’s go to 93 instead.” So this is not an interaction that needs redundancy.

It’s very nice to have the discrete input as accessibility for people who cannot speak, or who have an accent that is unrecognizable to the system, or as a graceful degradation in case the speech recognition fails, but Deckard doesn’t fit any of this. He would just enter and speak his floor.

Why the personally identifiable information?

If we were designing a system and we needed, for security, a voice print, we should protect the privacy of the rider by not requiring personally identifiable information. It’s easy to imagine the spoken name being abused by stalkers and identity thieves riding the elevator with him. (And let’s not forget there is a stalker on the elevator with him in this very scene.)

This young woman, for example, would abuse the shit out of such information.

Better would be some generic phrase that stresses the parts of speech that a voiceprint system would find most effective in distinguishing people.

Tucker Saxon has written an article for VoiceIt called “Voiceprint Phrases.” In it he notes that a good voiceprint phrase needs some minimum number of non-repeating phonemes. In their case, it’s ten. A surname and a number is rarely going to provide that. “Deckard. 97,” happens to have exactly 10, but if he lived on the 2nd floor, it wouldn’t. Plus, it has that personally identifiable information, so is a non-starter.

What would be a better voiceprint phrase for this scene? Some of Saxon’s examples in the article include, “Never forget tomorrow is a new day” and “Today is a nice day to go for a walk.” While the system doesn’t care about the meaning of the phrase, the humans using it would be primed by the content, and so it would just add to the dystopia of the scene if Deckard had to utter one of these sunshine-and-rainbows phrases in an elevator that was probably an uncleaned murder scene. but I think we can do it one better.

(Hey Tucker, I would love use VoiceIt’s tools to craft a confirmed voiceprint phrase, but the signup requires that I permit your company to market to me via phone and email even though I’m just a hobbyist user, so…hard no.)

Deckard: Hi, I’m Deckard. My bank card PIN code is 3297. The combination lock to my car spells “myothercarisaspinner” and my computer password is “unicorn.” 97 please.

Here is an alternate interaction that would have solved a lot of these problems.

ELEVATOR
Voice print identification, please.
DECKARD
SIGHS
DECKARD
Have you considered life in the offworld colonies?
ELEVATOR
Confirmed. Floor?
DECKARD
97

Which is just a punch to the gut considering Deckard is stuck here and he knows he’s stuck, and it’s salt on the wound to have to repeat fucking advertising just to get home for a drink.

So…not great

In total, this scene zooms by and the audience knows how to read it, and for that, it’s fine. (And really, it’s just a setup for the moment that happens right after the elevator door opens. No spoilers.) But on close inspection, from the perspective of modern interaction design, it needs a lot of work.

Iron Man HUD: A Breakdown

1 Jul 2015 by Christopher Noessel

So this is going to take a few posts. You see, the next interface that appears in The Avengers is a video conference between Tony Stark in his Iron Man supersuit and his partner in romance and business, Pepper Potts, about switching Stark Tower from the electrical grid to their independent power source. Here’s what a still from the scene looks like.

So on the surface of this scene, it’s a communications interface.

But that chat exists inside of an interface with a conceptual and interaction framework that has been laid down since the original Iron Man movie in 2008, and built upon with each sequel, one in 2010 and one in 2013. (With rumors aplenty for a fourth one…sometime.)

So to review the video chat, I first have to talk about the whole interface, and that has about 6 hours of prologue occurring across 4 years of cinema informing it. So let’s start, as I do with almost every interface, simply by describing it and its components.

Exosuit

The Iron Man is the name of the series of superpowered exosuits designed by Tony Stark. They range from the Mark I, a comparatively crude suit of armor to escape imprisonment by terrorists, through the Mark XLVI, the armor seen in The Avengers: Age of Ultron. The suit acts as defense against nearly every type of weapon known. It has repulsor beams built into the palms and in later models the arc reactor mounted in the chest that can be used to deliver concussive force. It allows the wearer to fly. Offensive weaponry varies between models, but has included a high powered laser system, and auto-targeting minigun pod and missiles. The suit can act semi-autonomously or via remote control. One of the models in The Avengers has parts that are seen to self-propel to Tony, targeting a beacon bracelet he wears, and self-assemble around him very quickly.

Immersive display

Though Tony’s head is completely covered, he has a virtual reality display within his helmet. It is a full-field-of-vision, very high-resolution, full-color display that provides stereoscopic imaging. It allows Tony to see the world around him as if he were not wearing the helmet, augment the view with goal-, person-, location-, and object-sensitive awareness.

The display varies a great deal, changing to the needs of the situation. But five icons persistently in the lower part of the display seem to be: suit status, targeting and optics, radar, artificial horizon, and map.

An interpretive view of Tony’s experience, from Iron Man (2008). — An interpretive view of Tony’s experience, from *Iron Man* (2008).

An first-person view from within the HUD, Iron Man (2008). — An first-person view from within the HUD, *Iron Man* (2008).

There is much to critique about the readability of the complex layering and translucency, the limits of human perception, and the necessarily- (and strictly-) interpretive nature of what we as audience see, but let me save those three points for a later post. For now it’s enough to log the features as aspects of the system.

Head NUI

Though Tony could use his hands to interact with an interface projected into the augmented reality view around him, his hands are often occupied in controlling flight or in combat. For this reason the means of input are head gesture, eye gesture, and voice input. A bit more on each follows.

Elements within the HUD such as reticles around his eyes follow and track his head gestures. Other elements stay locked in place. The HUD can track his gaze perfectly, allowing him to designate targets for his weapons with a fixation. Using this perfect eye tracking, Tony can also speak about something he is looking at, either in the real world or in the interface, and the system understands exactly what he’s talking about.

In fact, Tony is able to speak fully natural language commands, and indeed, carry out full-Turing conversations with the suit because of the presence of…

Strong artificial intelligence: JARVIS

An on-board artificial intelligence known as JARVIS handles any information task Tony asks of it, and monitors the surroundings and anticipates informational needs. There is strong evidence that most of the functions of the suit are handled by JARVIS behind the scenes. The crucialness of the artificial intelligence to the function of the suit cannot be overstated. It’s difficult to imagine how most of the suit could function as it does without an artificial intelligence behind the scenes facilitating results and even guiding Tony. With this in mind it is instructive to reframe the AI as the thing being named the Iron Man, with Tony Stark being an onboard manager, or, more charitably, a command-and-control center. Who quips.

Next up in the Iron HUD series: Lets review the functions of the suit.

Siege Support

21 Aug 2013 by Christopher Noessel

GitS-Aramaki-03

When Section 9 launches an assault on the Puppet Master’s headquarters, Department Chief Aramaki watches via a portable computer. It looks and behaves much like a modern laptop, with a heavy base that connects via a hinge to a thin screen. This shows him a live video feed.

The scan lines on the feed tell us that the cameras are diegetic, something Aramaki is watching, rather than the "camera" of the movie we as the audience are watching. These cameras must be placed in many places around the compound: behind the helicopter, following the Puppet Master, to the far right of the Puppet Master, and even floating far overhead. That seems a bit far-fetched until you remember that there are agents all around the compound, and Section 9 has the resources to outfitted all of them with small cameras. Even the overhead camera could be an unoticed helicopter equipped with a high-powered telephoto lens. So stretching believability, but not beyond the bounds of possibility. My main question is, given these cameras, who is doing the live editing? Aramaki’s view switches dramatically between these views as he’s watching with no apparent interaction.

A clue comes from his singular interaction with the system. When a helicopter lands in the lawn of the building, Aramaki says, "Begin recording," and a blinking REC overlay appears in the upper left and a timecode overlay appears in the lower right. If you look at the first shot in the scene, there is a soldier next to him hunched over a different terminal, so we can presume that he’s the hands-on guy, executing orders that Aramaki calls out. That same tech can be doing the live camera switching and editing to show Aramaki the feed that’s most important and relevant.

GitS-Aramaki-11

That idea makes even more sense knowing that Aramaki is a chief, and his station warrants spending money on an everpresent human technician.

Sometimes, as in this case, the human is the best interface.

Sci-fi interfaces

Stop watching sci-fi. Start using it.

Tag Archives: voice interactions