Disclosure (1994)

4 Dec 2023 by scifihughf

Our next 3D file browsing system is from the 1994 film Disclosure. Thanks to site reader Patrick H Lauke for the suggestion.

Like Jurassic Park, Disclosure is based on a Michael Crichton novel, although this time without any dinosaurs. (Would-be scriptwriters should compare the relative success of these two films when planning a study program.) The plot of the film is corporate infighting within Digicom, manufacturer of high tech CD-ROM drives—it was the 1990s—and also virtual reality systems. Tom Sanders, executive in charge of the CD-ROM production line, is being set up to take the blame for manufacturing failures that are really the fault of cost-cutting measures by rival executive Meredith Johnson.

The Corridor: Hardware Interface

The virtual reality system is introduced at about 40 minutes, using the narrative device of a product demonstration within the company to explain to the attendees what it does. The scene is nicely done, conveying all the important points we need to know in two minutes. (To be clear, some of the images used here come from a later scene in the film, but it’s the same system in both.)

The process of entangling yourself with the necessary hardware and software is quite distinct from interacting with the VR itself, so let’s discuss these separately, starting with the physical interface.

Tom wearing VR headset and one glove, being scanned. Disclosure (1994)

In Disclosure the virtual reality user wears a headset and one glove, all connected by cables to the computer system. Like most virtual reality systems, the headset is responsible for visual display, audio, and head movement tracking; the glove for hand movement and gesture tracking.

There are two “laser scanners” on the walls. These are the planar blue lights, which scan the user’s body at startup. After that they track body motion, although since the user still has to wear a glove, the scanners presumably just track approximate body movement and orientation without fine detail.

Lastly, the user stands on a concave hexagonal plate covered in embedded white balls, which allows the user to “walk” on the spot.

Closeup of user standing on curved surface of white balls. Disclosure (1994)

Searching for Evidence

The scene we’re most interested in takes place later in the film, the evening before a vital presentation which will determine Tom’s future. He needs to search the company computer files for evidence against Meredith, but discovers that his normal account has been blocked from access. He knows though that the virtual reality demonstrator is on display in a nearby hotel suite, and also knows about the demonstrator having unlimited access. He sneaks into the hotel suite to use The Corridor. Tom is under a certain amount of time pressure because a couple of company VIPs and their guests are downstairs in the hotel and might return at any time.

The first step for Tom is to launch the virtual reality system. This is done from an Indy workstation, using the regular Unix command line.

The command line to start the virtual reality system. Disclosure (1994)

Next he moves over to the VR space itself. He puts on the glove but not the headset, presses a key on the keyboard (of the VR computer, not the workstation), and stands still for a moment while he is scanned from top to bottom.

Real world Tom, wearing one VR glove, waits while the scanners map his body. Disclosure (1994)

On the left is the Indy workstation used to start the VR system. In the middle is the external monitor which will, in a moment, show the third person view of the VR user as seen earlier during the product demonstration.

Now that Tom has been scanned into the system, he puts on the headset and enters the virtual space.

The Corridor: Virtual Interface

“The Corridor,” as you’ve no doubt guessed, is a three dimensional file browsing program. It is so named because the user will walk down a corridor in a virtual building, the walls lined with “file cabinets” containing the actual computer files.

Three important aspects of The Corridor were mentioned during the product demonstration earlier in the film. They’ll help structure our tour of this interface, so let’s review them now, as they all come up in our discussion of the interfaces.

There is a voice-activated help system, which will summon a virtual “Angel” assistant.
Since the computers themselves are part of a multi-user network with shared storage, there can be more than one user “inside” The Corridor at a time.
Users who do not have access to the virtual reality system will appear as wireframe body shapes with a 2D photo where the head should be.
There are no access controls and so the virtual reality user, despite being a guest or demo account, has unlimited access to all the company files. This is spectacularly bad design, but necessary for the plot.

With those bits of system exposition complete, now we can switch to Tom’s own first person view of the virtual reality environment.

Virtual world Tom watches his hands rezzing up, right hand with glove. Disclosure (1994)

There isn’t a real background yet, just abstract streaks. The avatar hands are rezzing up, and note that the right hand wearing the glove has a different appearance to the left. This mimics the real world, so eases the transition for the user.

Overlaid on the virtual reality view is a Digicom label at the bottom and four corner brackets which are never explained, although they do resemble those used in cameras to indicate the preferred viewing area.

To the left is a small axis indicator, the three green lines labeled X, Y, and Z. These show up in many 3D applications because, silly though it sounds, it is easy in a 3D computer environment to lose track of directions or even which way is up. A common fix for the user being unable to see anything is just to turn 180 degrees around.

We then switch to a third person view of Tom’s avatar in the virtual world.

Tom is fully rezzed up, within cloud of visual static. Disclosure (1994)

This is an almost photographic-quality image. To remind the viewers that this is in the virtual world rather than real, the avatar follows the visual convention described in chapter 4 of Make It So for volumetric projections, with scan lines and occasional flickers. An interesting choice is that the avatar also wears a “headset”, but it is translucent so we can see the face.

Now that he’s in the virtual reality, Tom has one more action needed to enter The Corridor. He pushes a big button floating before him in space.

Tom presses one button on a floating control panel. Disclosure (1994)

This seems unnecessary, but we can assume that in the future of this platform, there will be more programs to choose from.

The Corridor rezzes up, the streaks assembling into wireframe components which then slide together as the surfaces are shaded. Tom doesn’t have to wait for the process to complete before he starts walking, which suggests that this is a Level Of Detail (LOD) implementation where parts of the building are not rendered in detail until the user is close enough for it to be worth doing.

Tom enters The Corridor. Nearby floor and walls are fully rendered, the more distant section is not complete. Disclosure (1994)

The architecture is classical, rendered with the slightly artificial-looking computer shading that is common in 3D computer environments because it needs much less computation than trying for full photorealism.

Instead of a corridor this is an entire multistory building. It is large and empty, and as Tom is walking bits of architecture reshape themselves, rather like the interior of Hogwarts in Harry Potter.

Although there are paintings on some of the walls, there aren’t any signs, labels, or even room numbers. Tom has to wander around looking for the files, at one point nearly “falling” off the edge of the floor down an internal air well. Finally he steps into one archway room entrance and file cabinets appear in the walls.

Tom enters a room full of cabinets. Disclosure (1994)

Unlike the classical architecture around him, these cabinets are very modern looking with glowing blue light lines. Tom has found what he is looking for, so now begins to manipulate files rather than browsing.

Virtual Filing Cabinets

The four nearest cabinets according to the titles above are

Communications
Operations
System Control
Research Data.

There are ten file drawers in each. The drawers are unmarked, but labels only appear when the user looks directly at it, so Tom has to move his head to centre each drawer in turn to find the one he wants.

Tom looks at one particular drawer to make the title appear. Disclosure (1994)

The fourth drawer Tom looks at is labeled “Malaysia”. He touches it with the gloved hand and it slides out from the wall.

Tom withdraws his hand as the drawer slides open. Disclosure (1994)

Inside are five “folders” which, again, are opened by touching. The folder slides up, and then three sheets, each looking like a printed document, slide up and fan out.

Axis indicator on left, pointing down. One document sliding up from a folder. Disclosure (1994)

Note the tilted axis indicator at the left. The Y axis, representing a line extending upwards from the top of Tom’s head, is now leaning towards the horizontal because Tom is looking down at the file drawer. In the shot below, both the folder and then the individual documents are moving up so Tom’s gaze is now back to more or less level.

Close up of three “pages” within a virtual document. Disclosure (1994)

At this point the film cuts away from Tom. Rival executive Meredith, having been foiled in her first attempt at discrediting Tom, has decided to cover her tracks by deleting all the incriminating files. Meredith enters her office and logs on to her Indy workstation. She is using a Command Line Interface (CLI) shell, not the standard SGI Unix shell but a custom Digicom program that also has a graphical menu. (Since it isn’t three dimensional it isn’t interesting enough to show here.)

Tom uses the gloved hand to push the sheets one by one to the side after scanning the content.

Tom scrolling through the pages of one folder by swiping with two fingers. Disclosure (1994)

Quick note: This is harder than it looks in virtual reality. In a 2D GUI moving the mouse over an interface element is obvious. In three dimensions the user also has to move their hand forwards or backwards to get their hand (or finger) in the right place, and unless there is some kind of haptic feedback it isn’t obvious to the user that they’ve made contact.

Tom now receives a nasty surprise.

The shot below shows Tom’s photorealistic avatar at the left, standing in front of the open file cabinet. The green shape on the right is the avatar of Meredith who is logged in to a regular workstation. Without the laser scanners and cameras her avatar is a generic wireframe female humanoid with a face photograph stuck on top. This is excellent design, making The Corridor usable across a range of different hardware capabilities.

Tom sees the Meredith avatar appear. Disclosure (1994)

Why does The Corridor system place her avatar here? A multiuser computer system, or even just a networked file server, obviously has to know who is logged on. Unix systems in general and command line shells also track which directory the user is “in”, the current working directory. Meredith is using her CLI interface to delete files in a particular directory so The Corridor can position her avatar in the corresponding virtual reality location. Or rather, the avatar glides into position rather than suddenly popping into existence: Tom is only surprised because the documents blocked his virtual view.

Quick note: While this is plausible, there are technical complications. Command line users often open more than one shell at a time in different directories. In such a case, what would The Corridor do? Duplicate the wireframe avatar in each location? In the real world we can’t be in more than one place at a time, would doing so contradict the virtual reality metaphor?

There is an asymmetry here in that Tom knows Meredith is “in the system” but not vice versa. Meredith could in theory use CLI commands to find out who else is logged on and whether anyone was running The Corridor, but she would need to actively seek out that information and has no reason to do so. It didn’t occur to Tom either, but he doesn’t need to think about it, the virtual reality environment conveys more information about the system by default.

We briefly cut away to Meredith confirming her CLI delete command. Tom sees this as the file drawer lid emitting beams of light which rotate down. These beams first erase the floating sheets, then the folders in the drawer. The drawer itself now has a red “DELETED” label and slides back into the wall.

Tom watches Meredith deleting the files in an open drawer. Disclosure (1994)

Tom steps further into the room. The same red labels appear on the other file drawers even though they are currently closed.

Tom watches Meredith deleting other, unopened, drawers. Disclosure (1994)

Talking to an Angel

Tom now switches to using the system voice interface, saying “Angel I need help” to bring up the virtual reality assistant. Like everything else we’ve seen in this VR system the “angel” rezzes up from a point cloud, although much more quickly than the architecture: people who need help tend to be more impatient and less interested in pausing to admire special effects.

The voice assistant as it appears within VR. Disclosure (1994)

Just in case the user is now looking in the wrong direction the angel also announces “Help is here” in a very natural sounding voice.

The angel is rendered with white robe, halo, harp, and rapidly beating wings. This is horribly clichéd, but a help system needs to be reassuring in appearance as well as function. An angel appearing as a winged flying serpent or wheel of fire would be more original and authentic (yes, really: Biblically Accurate Angels) but users fleeing in terror would seriously impact the customer satisfaction scores.

Now Tom has a short but interesting conversation with the angel, beginning with a question:

Tom
Is there any way to stop these files from being deleted?
Angel
I’m sorry, you are not level five.
Tom
Angel, you’re supposed to protect the files!
Angel
Access control is restricted to level five.

Tom has made the mistake, as described in chapter 9 Anthropomorphism of the book, of ascribing more agency to this software program than it actually has. He thinks he is engaged in a conversational interface (chapter 6 Sonic Interfaces) with a fully autonomous system, which should therefore be interested in and care about the wellbeing of the entire system. Which it doesn’t, because this is just a limited-command voice interface to a guide.

Even though this is obviously scripted, rather than a genuine error I think this raises an interesting question for real world interface designers: do users expect that an interface with higher visual quality/fidelity will be more realistic in other aspects as well? If a voice interface assistant has a simple polyhedron with no attempt at photorealism (say, like Bit in Tron) or with zoomorphism (say, like the search bear in Until the End of the World) will users adjust their expectations for speech recognition downwards? I’m not aware of any research that might answer this question. Readers?

Despite Tom’s frustration, the angel has given an excellent answer – for a guide. A very simple help program would have recited the command(s) that could be used to protect files against deletion. Which would have frustrated Tom even more when he tried to use one and got some kind of permission denied error. This program has checked whether the user can actually use commands before responding.

This does contradict the earlier VR demonstration where we were told that the user had unlimited access. I would explain this as being “unlimited read access, not write”, but the presenter didn’t think it worthwhile to go into such detail for the mostly non-technical audience.

Tom is now aware that he is under even more time pressure as the Meredith avatar is still moving around the room. Realising his mistake, he uses the voice interface as a query language.

“Show me all communications with Malaysia.”
“Telephone or video?”
“Video.”

This brings up a more conventional looking GUI window because not everything in virtual reality needs to be three-dimensional. It’s always tempting for a 3D programmer to re-implement everything, but it’s also possible to embed 2D GUI applications into a virtual world.

Tom looks at a conventional 2D display of file icons inside VR. Disclosure (1994)

The window shows a thumbnail icon for each recorded video conference call. This isn’t very helpful, so Tom again decides that a voice query will be much faster than looking at each one in turn.

“Show me, uh, the last transmission involving Meredith.”

There’s a short 2D transition effect swapping the thumbnail icon display for the video call itself, which starts playing at just the right point for plot purposes.

Tom watches a previously recorded video call made by Meredith (right). Disclosure (1994)

While Tom is watching and listening, Meredith is still typing commands. The camera orbits around behind the video conference call window so we can see the Meredith avatar approach, which also shows us that this window is slightly three dimensional, the content floating a short distance in front of the frame. The film then cuts away briefly to show Meredith confirming her “kill all” command. The video conference recordings are deleted, including the one Tom is watching.

Tom is informed that Meredith (seen here in the background as a wireframe avatar) is deleting the video call. Disclosure (1994)

This is also the moment when the downstairs VIPs return to the hotel suite, so the scene ends with Tom managing to sneak out without being detected.

Virtual reality has saved the day for Tom. The documents and video conference calls have been deleted by Meredith, but he knows that they once existed and has a colleague retrieve the files he needs from the backup tapes. (Which is good writing: the majority of companies shown in film and TV never seem to have backups for files, no matter how vital.) Meredith doesn’t know that he knows, so he has the upper hand to expose her plot.

Analysis

How believable is the interface?

I won’t spend much time on the hardware, since our focus is on file browsing in three dimensions. From top to bottom, the virtual reality system starts as believable and becomes less so.

Hardware

The headset and glove look like real VR equipment, believable in 1994 and still so today. Having only one glove is unusual, and makes impossible some of the common gesture actions described in chapter 5 of Make It So, which require both hands.

The “laser scanners” that create the 3D geometry and texture maps for the 3D avatar and perform real time body tracking would more likely be cameras, but that would not sound as cool.

And lastly the walking platform apparently requires our user to stand on large marbles or ball bearings and stay balanced while wearing a headset. Uh…maybe…no. Apologetics fails me. To me it looks like it would be uncomfortable to walk on, almost like deterrent paving.

Software

The Corridor, unlike the 3D file browser used in Jurassic Park, is a special effect created for the film. It was a mostly-plausible, near future system in 1994, except for the photorealistic avatar. Usually this site doesn’t discuss historical context (the “new criticism” stance), but I think in this case it helps to explain how this interface would have appeared to audiences almost two decades ago.

I’ll start with the 3D graphics of the virtual building. My initial impression was that The Corridor could have been created as an interactive program in 1994, but that was my memory compressing the decade. During the 1990s 3D computer graphics, both interactive and CGI, improved at a phenomenal rate. The virtual building would not have been interactive in 1994, was possible on the most powerful systems six years later in 2000, and looks rather old-fashioned compared to what the game consoles of the 21st C can achieve.

For the voice interface I made the opposite mistake. Voice interfaces on phones and home computing appliances have become common in the second decade of the 21st C, but in reality are much older. Apple Macintosh computers in 1994 had text-to-speech synthesis with natural sounding voices and limited vocabulary voice command recognition. (And without needing an Internet connection!) So the voice interface in the scene is believable.

The multi-user aspects of The Corridor were possible in 1994. The wireframe avatars for users not in virtual reality are unflattering or perhaps creepy, but not technically difficult. As a first iteration of a prototype system it’s a good attempt to span a range of hardware capabilities.

The virtual reality avatar, though, is not believable for the 1990s and would be difficult today. Photographs of the body, made during the startup scan, could be used as a texture map for the VR avatar. But live video of the face would be much more difficult, especially when the face is partly obscured by a headset.

How well does the interface inform the narrative of the story?

The virtual reality system in itself is useful to the overall narrative because it makes the Digicom company seem high tech. Even in 1994 CD-ROM drives weren’t very interesting.

The Corridor is essential to the tension of the scene where Tom uses it to find the files, because otherwise the scene would be much shorter and really boring. If we ignore the virtual reality these are the interface actions:

Tom reads an email.
Meredith deletes the folder containing those emails.
Tom finds a folder full of recorded video calls.
Tom watches one recorded video call.
Meredith deletes the folder containing the video calls.

Imagine how this would have looked if both were using a conventional 2D GUI, such as the Macintosh Finder or MS Windows Explorer. Double click, press and drag, double click…done.

The Corridor slows down Tom’s actions and makes them far more visible and understandable. Thanks to the virtual reality avatar we don’t have to watch an actor push a mouse around. We see him moving and swiping, be surprised and react; and the voice interface adds extra emotion and some useful exposition. It also helps with the plot, giving Tom awareness of what Meredith is doing without having to actively spy on her, or look at some kind of logs or recordings later on.

Meredith, though, can’t use the VR system because then she’d be aware of Tom as well. Using a conventional workstation visually distinguishes and separates Meredith from Tom in the scene.

So overall, though the “action” is pretty mundane, it’s crucial to the plot, and the VR interface helps make this interesting and more engaging.

How well does the interface equip the character to achieve their goals?

As described in the film itself, The Corridor is a prototype for demonstrating virtual reality. As a file browser it’s awful, but since Tom has lost all his normal privileges this is the only system available, and he does manage to eventually find the files he needs.

At the start of the scene, Tom spends quite some time wandering around a vast multi-storey building without a map, room numbers, or even coordinates overlaid on his virtual view. Which seems rather pointless because all the files are in one room anyway. As previously discussed for Johnny Mnemonic, walking or flying everywhere in your file system seems like a good idea at first, but often becomes tedious over time. Many actual and some fictional 3D worlds give users the ability to teleport directly to any desired location.

Then the file drawers in each cabinet have no labels either, so Tom has to look carefully at each one in turn. There is so much more the interface could be doing to help him with his task, and even help the users of the VR demo learn and explore its technology as well.

Contrast this with Meredith, who uses her command line interface and 2D GUI to go through files like a chainsaw.

Tom becomes much more efficient with the voice interface. Which is just as well, because if he hadn’t, Meredith would have deleted the video conference recordings while he was still staring at virtual filing cabinets. However neither the voice interface nor the corresponding file display need three dimensional graphics.

There is hope for version 2.0 of The Corridor, even restricting ourselves to 1994 capabilities. The first and most obvious is to copy 2D GUI file browsers, or the 3D file browser from Jurassic Park, and show the corresponding text name next to each graphical file or folder object. The voice interface is so good that it should be turned on by default without requiring the angel. And finally add some kind of map overlay with a you are here moving dot, like the maps that players in 3D games such as Doom could display with a keystroke.

Film making challenge: VR on screen

Virtual reality (or augmented reality systems such as Hololens) provide a better viewing experience for 3D graphics by creating the illusion of real three dimensional space rather than a 2D monitor. But it is always a first person view and unlike conventional 2D monitors nobody else can usually see what the VR user is seeing without a deliberate mirroring/debugging display. This is an important difference from other advanced or speculative technologies that film makers might choose to include. Showing a character wielding a laser pistol instead of a revolver or driving a hover car instead of a wheeled car hardly changes how to stage a scene, but VR does.

So, how can we show virtual reality in film?

There’s the first-person view corresponding to what the virtual reality user is seeing themselves. (Well, half of what they see since it’s not stereographic, but it’s cinema VR, so close enough.) This is like watching a screencast of someone else playing a first person computer game, the original active experience of the user becoming passive viewing by the audience. Most people can imagine themselves in the driving seat of a car and thus make sense of the turns and changes of speed in a first person car chase, but the film audience probably won’t be familiar with the VR system depicted and will therefore have trouble understanding what is happening. There’s also the problem that viewing someone else’s first-person view, shifting and changing in response to their movements rather than your own, can make people disoriented or nauseated.

A third-person view is better for showing the audience the character and the context in which they act. But not the diegetic real-world third-person view, which would be the character wearing a geeky headset and poking at invisible objects. As seen in Disclosure, the third person view should be within the virtual reality.

But in doing that, now there is a new problem: the avatar in virtual reality representing the real character. If the avatar is too simple the audience may not identify it with the real world character and it will be difficult to show body language and emotion. More realistic CGI avatars are increasingly expensive and risk falling into the Uncanny Valley. Since these films are science fiction rather than factual, the easy solution is to declare that virtual reality has achieved the goal of being entirely photorealistic and just film real actors and sets. Adding the occasional ripple or blur to the real world footage to remind the audience that it’s meant to be virtual reality, again as seen in Disclosure, is relatively cheap and quick.
So, solving all these problems results in the cinematic trope we can call Extradiegetic Avatars, which are third-person, highly-lifelike “renderings” of characters, with a telltale Hologram Projection Imperfection for audience readability, that may or may not be possible within the world of the film itself.

IMDB: https://www.imdb.com/title/tt0109635/

Deckard’s Photo Inspector

29 Apr 2020 by Christopher Noessel

Back to Blade Runner. I mean, the pandemic is still pandemicking, but maybe this will be a nice distraction while you shelter in place. Because you’re smart, sheltering in place as much as you can, and not injecting disinfectants. And, like so many other technologies in this film, this will take a while to deconstruct, critique, and reimagine.

Description

Doing his detective work, Deckard retrieves a set of snapshots from Leon’s hotel room, and he brings them home. Something in the one pictured above catches his eye, and he wants to investigate it in greater detail. He takes the photograph and inserts it in a black device he keeps in his living room.

Note: I’ll try and describe this interaction in text, but it is much easier to conceptualize after viewing it. Owing to copyright restrictions, I cannot upload this length of video with the original audio, so I have added pre-rendered closed captions to it, below. All dialogue in the clip is Deckard.

Deckard does digital forensics, looking for a lead.

He inserts the snapshot into a horizontal slit and turns the machine on. A thin, horizontal orange line glows on the left side of the front panel. A series of seemingly random-length orange lines begin to chase one another in a single-row space that stretches across the remainder of the panel and continue to do so throughout Deckard’s use of it. (Imagine a news ticker, running backwards, where the “headlines” are glowing amber lines.) This seems useless and an absolutely pointless distraction for Deckard, putting high-contrast motion in his peripheral vision, which fights for attention with the actual, interesting content down below.

If this is distracting you from reading, YOU SEE MY POINT.

After a second, the screen reveals a blue grid, behind which the scan of the snapshot appears. He stares at the image in the grid for a moment, and speaks a set of instructions, “Enhance 224 to 176.”

In response, three data points appear overlaying the image at the bottom of the screen. Each has a two-letter label and a four-digit number, e.g. “ZM 0000 NS 0000 EW 0000.” The NS and EW—presumably North-South and East-West coordinates, respectively—immediately update to read, “ZM 0000 NS 0197 EW 0334.” After updating the numbers, the screen displays a crosshairs, which target a single rectangle in the grid.

A new rectangle then zooms in from the edges to match the targeted rectangle, as the ZM number—presumably zoom, or magnification—increases. When the animated rectangle reaches the targeted rectangle, its outline blinks yellow a few times. Then the contents of the rectangle are enlarged to fill the screen, in a series of steps which are punctuated with sounds similar to a mechanical camera aperture. The enlargement is perfectly resolved. The overlay disappears until the next set of spoken commands. The system response between Deckard’s issuing the command and the device’s showing the final enlarged image is about 11 seconds.

Deckard studies the new image for awhile before issuing another command. This time he says, “Enhance.” The image enlarges in similar clacking steps until he tells it, “Stop.”

Other instructions he is heard to give include “move in, pull out, track right, center in, pull back, center, and pan right.” Some include discrete instructions, such as, “Track 45 right” while others are relative commands that the system obeys until told to stop, such as “Go right.”

Using such commands he isolates part of the image that reveals an important clue, and he speaks the instruction, “Give me a hard copy right there.” The machine prints the image, which Deckard uses to help find the replicant pictured.

I’d like to point out one bit of sophistication before the critique. Deckard can issue a command with or without a parameter, and the inspector knows what to do. For example, “Track 45 right” and “Track right.” Without the parameter, it will just do the thing repeatedly until told to stop. That helps Deckard issue the same basic command when he knows exactly where he wants to look and when doesn’t know what exactly what he’s looking for. That’s a nice feature of the language design.

But still, asking him to provide step-by-step instructions in this clunky way feels like some high-tech Big Trak. (I tried to find a reference that was as old as the film.) And that’s not all…

Some critiques, as it is

Can I go back and mention that amber distracto-light? Because it’s distracting. And pointless. I’m not mad. I’m just disappointed.
It sure would be nice if any of the numbers on screen made sense, and had any bearing with the numbers Deckard speaks, at any time during the interaction. For instance, the initial zoom (I checked in Photoshop) is around 304%, which is neither the 224 or 176 that Deckard speaks.
It might be that each square has a number, and he simply has to name the two squares at the extents of the zoom he wants, letting the machine find the extents, but where is the labeling? Did he have to memorize an address for each pixel? How does that work at arbitrary levels of zoom?
And if he’s memorized it, why show the overlay at all?
Why the seizure-inducing flashing in the transition sequences? Sure, I get that lots of technologies have unfortunate effects when constrained by mechanics, but this is digital.
Why is the printed picture so unlike the still image where he asks for a hard copy?
Gaze at the reflection in Ford’s hazel, hazel eyes, and it’s clear he’s playing Missile Command, rather than paying attention to this interface at all. (OK, that’s the filmmaker’s issue, not a part of the interface, but still, come on.)

The photo inspector: My interface is up HERE, Rick.

How might it be improved for 1982?

So if 1982 Ridley Scott was telling me in post that we couldn’t reshoot Harrison Ford, and we had to make it just work with what we had, here’s what I’d do…

Squash the grid so the cells match the 4:3 ratio of the NTSC screen. Overlay the address of each cell, while highlighting column and row identifiers at the edges. Have the first cell’s outline illuminate as he speaks it, and have the outline expand to encompass the second named cell. Then zoom, removing the cell labels during the transition. When at anything other than full view, display a map across four cells that shows the zoom visually in the context of the whole.

Rendered in glorious 4:3 NTSC dimensions.

With this interface, the structure of the existing conversation makes more sense. When Deckard said, “Enhance 203 to 608” the thing would zoom in on the mirror, and the small map would confirm.

The numbers wouldn’t match up, but it’s pretty obvious from the final cut that Scott didn’t care about that (or, more charitably, ran out of time). Anyway I would be doing this under protest, because I would argue this interaction needs to be fixed in the script.

How might it be improved for 2020?

What’s really nifty about this technology is that it’s not just a photograph. Look close in the scene, and Deckard isn’t just doing CSI Enhance! commands (or, to be less mocking, AI upscaling). He’s using the photo inspector to look around corners and at objects that are reconstructed from the smallest reflections. So we can think of the interaction like he’s controlling a drone through a 3D still life, looking for a lead to help him further the case.

With that in mind, let’s talk about the display.

Display

To redesign it, we have to decide at a foundational level how we think this works, because it will color what the display looks like. Is this all data that’s captured from some crazy 3D camera and available in the image? Or is it being inferred from details in the 2 dimensional image? Let’s call the first the 3D capture, and the second the 3D inference.

If we decide this is a 3-D capture, then all the data that he observes through the machine has the same degree of confidence. If, however, we decide this is a 3D inferrer, Deckard needs to treat the inferred data with more skepticism than the data the camera directly captured. The 3-D inferrer is the harder problem, and raises some issues that we must deal with in modern AI, so let’s just say that’s the way this speculative technology works.

The first thing the display should do it make it clear what is observed and what is inferred. How you do this is partly a matter of visual design and style, but partly a matter of diegetic logic. The first pass would be to render everything in the camera frustum photo-realistically, and then render everything outside of that in a way that signals its confidence level. The comp below illustrates one way this might be done.

Modification of a pair of images found on Evermotion

In the comp, Deckard has turned the “drone” from the “actual photo,” seen off to the right, toward the inferred space on the left. The monochrome color treatment provides that first high-confidence signal.
In the scene, the primary inference would come from reading the reflections in the disco ball overhead lamp, maybe augmented with plans for the apartment that could be found online, or maybe purchase receipts for appliances, etc. Everything it can reconstruct from the reflection and high-confidence sources has solid black lines, a second-level signal.
The smaller knickknacks that are out of the reflection of the disco ball, and implied from other, less reflective surfaces, are rendered without the black lines and blurred. This provides a signal that the algorithm has a very low confidence in its inference.

This is just one (not very visually interesting) way to handle it, but should illustrate that, to be believable, the photo inspector shouldn’t have a single rendering style outside the frustum. It would need something akin to these levels to help Deckard instantly recognize how much he should trust what he’s seeing.

Flat screen or volumetric projection?

Modern CGI loves big volumetric projections. (e.g. it was the central novum of last year’s Fritz winner, Spider-Man: Far From Home.) And it would be a wonderful juxtaposition to see Deckard in a holodeck-like recreation of Leon’s apartment, with all the visual treatments described above.

But…

Also seriously who wants a lamp embedded in a headrest?

…that would kind of spoil the mood of the scene. This isn’t just about Deckard’s finding a clue, we also see a little about who he is and what his life is like. We see the smoky apartment. We see the drab couch. We see the stack of old detective machines. We see the neon lights and annoying advertising lights swinging back and forth across his windows. Immersing him in a big volumetric projection would lose all this atmospheric stuff, and so I’d recommend keeping it either a small contained VP, like we saw in Minority Report, or just keep it a small flat screen.

OK, so we have an idea about how the display would (and shouldn’t) look, let’s move on to talk about the inputs.

Inputs

To talk about inputs, then, we have to return to a favorite topic of mine, and that is the level of agency we want for the interaction. In short, we need to decide how much work the machine is doing. Is the machine just a manual tool that Deckard has to manipulate to get it to do anything? Or does it actively assist him? Or, lastly, can it even do the job while his attention is on something else—that is, can it act as an agent on his behalf? Sophisticated tools can be a blend of these modes, but for now, let’s look at them individually.

Manual Tool

This is how the photo inspector works in Blade Runner. It can do things, but Deckard has to tell it exactly what to do. But we can still improve it in this mode.

We could give him well-mapped physical controls, like a remote control for this conceptual drone. Flight controls wind up being a recurring topic on this blog (and even came up already in the Blade Runner reviews with the Spinners) so I could go on about how best to do that, but I think that a handheld controller would ruin the feel of this scene, like Deckard was sitting down to play a video game rather than do off-hours detective work.

Special edition made possible by our sponsor, Tom Nook.
(I hope we can pay this loan back.)

Similarly, we could talk about a gestural interface, using some of the synecdochic techniques we’ve seen before in Ghost in the Shell. But again, this would spoil the feel of the scene, having him look more like John Anderton in front of a tiny-TV version of Minority Report’s famous crime scrubber.

One of the things that gives this scene its emotional texture is that Deckard is drinking a glass of whiskey while doing his detective homework. It shows how low he feels. Throwing one back is clearly part of his evening routine, so much a habit that he does it despite being preoccupied about Leon’s case. How can we keep him on the couch, with his hand on the lead crystal whiskey glass, and still investigating the photo? Can he use it to investigate the photo?

Here I recommend a bit of ad-hoc tangible user interface. I first backworlded this for The Star Wars Holiday Special, but I think it could work here, too. Imagine that the photo inspector has a high-resolution camera on it, and the interface allows Deckard to declare any object that he wants as a control object. After the declaration, the camera tracks the object against a surface, using the changes to that object to control the virtual camera.

In the scene, Deckard can declare the whiskey glass as his control object, and the arm of his couch as the control surface. Of course the virtual space he’s in is bigger than the couch arm, but it could work like a mouse and a mousepad. He can just pick it up and set it back down again to extend motion.

This scheme takes into account all movement except vertical lift and drop. This could be a gesture or a spoken command (see below).

Going with this interaction model means Deckard can use the whiskey glass, allowing the scene to keep its texture and feel. He can still drink and get his detective on.

Assistant Tool

Indirect manipulation is helpful for when Deckard doesn’t know what he’s looking for. He can look around, and get close to things to inspect them. But when he knows what he’s looking for, he shouldn’t have to go find it. He should be able to just ask for it, and have the photo inspector show it to him. This requires that we presume some AI. And even though Blade Runner clearly includes General AI, let’s presume that that kind of AI has to be housed in a human-like replicant, and can’t be squeezed into this device. Instead, let’s just extend the capabilities of Narrow AI.

Some of this will be navigational and specific, “Zoom to that mirror in the background,” for instance, or, “Reset the orientation.” Some will more abstract and content-specific, e.g. “Head to the kitchen” or “Get close to that red thing.” If it had gaze detection, he could even indicate a location by looking at it. “Get close to that red thing there,” for example, while looking at the red thing. Given the 3D inferrer nature of this speculative device, he might also want to trace the provenance of an inference, as in, “How do we know this chair is here?” This implies natural language generation as well as understanding.

There’s nothing from stopping him using the same general commands heard in the movie, but I doubt anyone would want to use those when they have commands like this and the object-on-hand controller available.

Ideally Deckard would have some general search capabilities as well, to ask questions and test ideas. “Where were these things purchased?” or subsequently, “Is there video footage from the stores where he purchased them?” or even, “What does that look like to you?” (The correct answer would be, “Well that looks like the mirror from the Arnolfini portrait, Ridley…I mean…Rick*”) It can do pattern recognition and provide as much extra information as it has access to, just like Google Lens or IBM Watson image recognition does.

*Left: The convex mirror in Leon’s 21st century apartment.
Right: The convex mirror in Arnolfini’s 15th century apartment

Finally, he should be able to ask after simple facts to see if the inspector knows or can find it. For example, “How many people are in the scene?”

All of this still requires that Deckard initiate the action, and we can augment it further with a little agentive thinking.

Agentive Tool

To think in terms of agents is to ask, “What can the system do for the user, but not requiring the user’s attention?” (I wrote a book about it if you want to know more.) Here, the AI should be working alongside Deckard. Not just building the inferences and cataloguing observations, but doing anomaly detection on the whole scene as it goes. Some of it is going to be pointless, like “Be aware the butter knife is from IKEA, while the rest of the flatware is Christofle Lagerfeld. Something’s not right, here.” But some of it Deckard will find useful. It would probably be up to Deckard to review summaries and decide which were worth further investigation.

It should also be able to help him with his goals. For example, the police had Zhora’s picture on file. (And her portrait even rotates in the dossier we see at the beginning, so it knows what she looks like in 3D for very sophisticated pattern matching.) The moment the agent—while it was reverse ray tracing the scene and reconstructing the inferred space—detects any faces, it should run the face through a most wanted list, and specifically Deckard’s case files. It shouldn’t wait for him to find it. That again poses some challenges to the script. How do we keep Deckard the hero when the tech can and should have found Zhora seconds after being shown the image? It’s a new challenge for writers, but it’s becoming increasingly important for believability.

Though I’ve never figured out why she has a snake tattoo here (and it seems really important to the plot) but then when Deckard finally meets her, it has disappeared.

Scene

Interior. Deckard’s apartment. Night.
Deckard grabs a bottle of whiskey, a glass, and the photo from Leon’s apartment. He sits on his couch and places the photo on the coffee table.
Deckard
Photo inspector.
The machine on top of a cluttered end table comes to life.
Deckard
Let’s look at this.
He points to the photo. A thin line of light sweeps across the image. The scanned image appears on the screen, pulled in a bit from the edges. A label reads, “Extending scene,” and we see wireframe representations of the apartment outside the frame begin to take shape. A small list of anomalies begins to appear to the left. Deckard pours a few fingers of whiskey into the glass. He takes a drink before putting the glass on the arm of his couch. Small projected graphics appear on the arm facing the inspector.
Deckard
OK. Anyone hiding? Moving?
Photo inspector
No and no.
Deckard
Zoom to that arm and pin to the face.
He turns the glass on the couch arm counterclockwise, and the “drone” revolves around to show Leon’s face, with the shadowy parts rendered in blue.
Deckard
What’s the confidence?
Photo inspector
95.
On the side of the screen the inspector overlays Leon’s police profile.
Deckard
Unpin.
Deckard lifts his glass to take a drink. He moves from the couch to the floor to stare more intently and places his drink on the coffee table.
Deckard
New surface.
He turns the glass clockwise. The camera turns and he sees into a bedroom.
Deckard
How do we have this much inference?
Photo inspector
The convex mirror in the hall…
Deckard
Wait. Is that a foot? You said no one was hiding.
Photo inspector
The individual is not hiding. They appear to be sleeping.
Deckard rolls his eyes.
Deckard
Zoom to the face and pin.
The view zooms to the face, but the camera is level with her chin, making it hard to make out the face. Deckard tips the glass forward and the camera rises up to focus on a blue, wireframed face.
Deckard
That look like Zhora to you?
The inspector overlays her police file.
Photo inspector
63% of it does.
Deckard
Why didn’t you say so?
Photo inspector
My threshold is set to 66%.
Deckard
Give me a hard copy right there.
He raises his glass and finishes his drink.

This scene keeps the texture and tone of the original, and camps on the limitations of Narrow AI to let Deckard be the hero. And doesn’t have him programming a virtual Big Trak.

Deckard’s Elevator

3 Mar 2020 by Christopher Noessel

This is one of those interactions that happens over a few seconds in the movie, but turns out to be quite deep—and broken—on inspection.

When Deckard enters his building’s dark, padded elevator, a flat voice announces, “Voice print identification. Your floor number, please.” He presses a dark panel, which lights up in response. He presses the 9 and 7 keys on a keypad there as he says, “Deckard. 97.” The voice immediately responds, “97. Thank you.” As the elevator moves, the interface confirms the direction of travel with gentle rising tones that correspond to the floor numbers (mod 10), which are shown rising up a 7-segment LED display. We see a green projection of the floor numbers cross Deckard’s face for a bit until, exhausted, he leans against the wall and out of the projection. When he gets to his floor, the door opens and the panel goes dark.

A need for speed

An aside: To make 97 floors in 20 seconds you have to be traveling at an average of around 47 miles per hour. That’s not unheard of today. Mashable says in a 2014 article about the world’s fastest elevators that the Hitachi elevators in Guangzhou CTF Finance Building reach up to 45 miles per hour. But including acceleration and deceleration adds to the total time, so it takes the Hitachi elevators around 43 seconds to go from the ground floor to their 95th floor. If 97 is Deckard’s floor, it’s got to be accelerating and decelerating incredibly quickly. His body doesn’t appear to be suffering those kinds of Gs, so unless they have managed to upend Newton’s basic laws of motion, something in this scene is not right. As usual, I digress.

The input control is OK

The panel design is nice and was surprising in 1982, because few people had ridden in elevators serving nearly a hundred floors. And while most in-elevator panels have a single button per floor, it would have been an overwhelming UI to present the rider of this Blade Runner complex with 100 floor buttons plus the usual open door, close door, emergency alert buttons, etc. A panel that allows combinatorial inputs reduces the number of elements that must be displayed and processed by the user, even if it slows things down, introduces cognitive overhead, and adds the need for error-handling. Such systems need a “commit” control that allows them to review, edit, and confirm the sequence, to distinguish, say, “97” from “9” and “7.” Not such an issue from the 1st floor, but a frustration from 10–96. It’s not clear those controls are part of this input.

Deckard enters 8675309, just to see what will happen.

I’m a fan of destination dispatch elevator systems that increase efficiency (with caveats) by asking riders to indicate their floor outside the elevator and letting the algorithm organize passengers into efficient groups, but that only works for banks of elevators. I get the sense Deckard’s building is a little too low-rent for such luxuries. There is just one in his building, and in-elevator controls work fine for those situations, even if they slow things down a bit.

The feedback is OK

The feedback of the floors is kind of nice in that the 7-segment numbers rise up helping to convey the direction of movement. There is also a subtle, repeating, rising series of tones that accompany the display. Most modern elevators rely on the numeracy of its passengers and their sense of equilibrium to convey this information, but sure, this is another way to do it. Also, it would be nice if the voice system would, for the visually impaired, say the floor number when the door opens.

Though the projection is dumb

I’m not sure why the little green projection of the floor numbers runs across Deckard’s face. Is it just a filmmaker’s conceit, like the genetic code that gets projected across the velociraptors head in Jurassic Park?

Pictured: Sleepy Deckard. Dumb projection.

Or is it meant to be read as diegetic, that is, that there is a projector in the elevator, spraying the floor numbers across the faces of its riders? True to the New Criticism stance of this blog, I try very hard to presume that everything is diegetic, but I just can’t make that make sense. There would be much better ways to increase the visibility of the floor numbers, and I can’t come up with any other convincing reason why this would exist.

If *this* was diegetic, the scene would have ended with a shredded projector.

But really, it falls apart on the interaction details

Lastly, this interaction. First, let’s give it credit where credit is due. The elevator speaks clearly and understands Deckard perfectly. No surprise, since it only needs to understand a very limited number of utterances. It’s also nice that it’s polite without being too cheery about it. People in LA circa 2019 may have had a bad day and not have time for that shit.

Where’s the wake word?

But where’s the wake word? This is a phrase like “OK elevator” or “Hey lift” that signals to the natural language system that the user is talking to the elevator and not themselves, or another person in the elevator, or even on the phone. General AI exists in the Blade Runner world, and that might allow an elevator to use contextual cues to suss this out, but there are zero clues in the film that this elevator is sentient.

There are of course other possible, implicit “wake words.” A motion detector, proximity sensor, or even weight sensor could infer that a human is present, and start the elevator listening. But with any of these implicit “wake words,” you’d still need feedback for the user to know when it was listening. And some way to help them regain attention if they got the first interaction wrong, and there would be zero affordances for this. So really, making an explicit wake word is the right way to go.

It might be that touching the number panel is the attention signal. Touch it, and the elevator listens for a few seconds. That fits in with the events in the scene, anyway. The problem with that is the redundancy. (See below.) So if the solution was pressing a button, it should just be a “talk” button rather than a numeric keypad.

It may be that the elevator is always listening, which is a little dark and would stifle any conversation in the elevator less everyone end up stuck in the basement, but this seems very error prone and unlikely.

Deckard: *Yawns* Elevator: Confirmed. Silent alarm triggered.

This issue is similar to the one discussed in Make It So Chapter 5, “Gestural Interfaces” where I discussed how a user tells a computer they are communicating to it with gestures, and when they aren’t.

Where are the paralinguistics?

Humans provide lots of signals to one another, outside of the meaning of what is actually being said. These communication signals are called paralinguistics, and one of those that commonly appears in modern voice assistants is feedback that the system is listening. In the Google Assistant, for example, the dots let you know when it’s listening to silence and when it’s hearing your voice, providing implicit confirmation to the user that the system can hear them. (Parsing the words, understanding the meaning, and understanding the intent are separate, subsequent issues.)

Fixing this in Blade Runner could be as simple as turning on a red LED when the elevator is listening, and varying the brightness with Deckard’s volume. Maybe add chimes to indicate the starting-to-listen and no-longer-listening moments. This elevator doesn’t have anything like that, and it ought to.

Why the redundancy?

Next, why would Deckard need to push buttons to indicate “97” even while he’s saying the same number as part of the voice print? Sure, it could be that the voice print system was added later and Deckard pushes the numbers out of habit. But that bit of backworlding doesn’t buy us much.

It might be a need for redundant, confirming input. This is useful when the feedback is obscure or the stakes are high, but this is a low-stakes situation. If he enters the wrong floor, he just has to enter the correct floor. It would also be easy to imagine the elevator would understand a correction mid-ride like “Oh wait. Elevator, I need some ice. Let’s go to 93 instead.” So this is not an interaction that needs redundancy.

It’s very nice to have the discrete input as accessibility for people who cannot speak, or who have an accent that is unrecognizable to the system, or as a graceful degradation in case the speech recognition fails, but Deckard doesn’t fit any of this. He would just enter and speak his floor.

Why the personally identifiable information?

If we were designing a system and we needed, for security, a voice print, we should protect the privacy of the rider by not requiring personally identifiable information. It’s easy to imagine the spoken name being abused by stalkers and identity thieves riding the elevator with him. (And let’s not forget there is a stalker on the elevator with him in this very scene.)

This young woman, for example, would abuse the shit out of such information.

Better would be some generic phrase that stresses the parts of speech that a voiceprint system would find most effective in distinguishing people.

Tucker Saxon has written an article for VoiceIt called “Voiceprint Phrases.” In it he notes that a good voiceprint phrase needs some minimum number of non-repeating phonemes. In their case, it’s ten. A surname and a number is rarely going to provide that. “Deckard. 97,” happens to have exactly 10, but if he lived on the 2nd floor, it wouldn’t. Plus, it has that personally identifiable information, so is a non-starter.

What would be a better voiceprint phrase for this scene? Some of Saxon’s examples in the article include, “Never forget tomorrow is a new day” and “Today is a nice day to go for a walk.” While the system doesn’t care about the meaning of the phrase, the humans using it would be primed by the content, and so it would just add to the dystopia of the scene if Deckard had to utter one of these sunshine-and-rainbows phrases in an elevator that was probably an uncleaned murder scene. but I think we can do it one better.

(Hey Tucker, I would love use VoiceIt’s tools to craft a confirmed voiceprint phrase, but the signup requires that I permit your company to market to me via phone and email even though I’m just a hobbyist user, so…hard no.)

Deckard: Hi, I’m Deckard. My bank card PIN code is 3297. The combination lock to my car spells “myothercarisaspinner” and my computer password is “unicorn.” 97 please.

Here is an alternate interaction that would have solved a lot of these problems.

ELEVATOR
Voice print identification, please.
DECKARD
SIGHS
DECKARD
Have you considered life in the offworld colonies?
ELEVATOR
Confirmed. Floor?
DECKARD
97

Which is just a punch to the gut considering Deckard is stuck here and he knows he’s stuck, and it’s salt on the wound to have to repeat fucking advertising just to get home for a drink.

So…not great

In total, this scene zooms by and the audience knows how to read it, and for that, it’s fine. (And really, it’s just a setup for the moment that happens right after the elevator door opens. No spoilers.) But on close inspection, from the perspective of modern interaction design, it needs a lot of work.

Cyberspace: Navigation

7 Mar 2017 by scifihughf

Cyberspace is usually considered to be a 3D spatial representation of the Internet, an expansion of the successful 2D desktop metaphor. The representation of cyberspace used in books such as Neuromancer and Snow Crash, and by the film Hackers released in the same year, is an abstract cityscape where buildings represent organisations or individual computers, and this what we see in Johnny Mnemonic. How does Johnny navigate through this virtual city?

Gestures and words for flying

Once everything is connected up, Johnny starts his journey with an unfolding gesture. He then points both fingers forward. From his point of view, he is flying through cyberspace. He then holds up both hands to stop.

Both these gestures were commonly used in the prototype VR systems of 1995. They do however conflict with the more common gestures for manipulating objects in volumetric projections that are described in Make It So chapter 5. It will be interesting to see which set of gestures is eventually adopted, or whether they can co-exist.

Later we will see Johnny turn and bank by moving his hands independently.

We also see him using voice commands, saying “hold it” to stop forward motion immediately. Later we see him stretch one arm out and bring it back, apparently reversing a recent move.

In cyberpunk and related fiction users fly everywhere in cyberspace, a literal interpretation of the spatial metaphor. This is also how users in our real world MUD and MOO cyberspaces start. After a while, travelling through all the intermediate locations between your start and destination gets tedious. MUDs and MOOs allow teleporting, a direct jump to the desired location, and the cyberspace in Johnny Mnemonic has a similar capability.

Gestures for teleporting

Mid sequence, Johnny wants to jump to the Beijing hotel where the upload took place. To do this, he uses a blue geometric shape at the lower left of his view, looking like a high tech, floating tetrahedron. Johnny slowly spins this virtual object using repeated flicking gestures with his left hand, with his ring and middle fingers held together.

It looks very similar to the gesture used on a current-day smartphone to flick through a photo album or set of application icon screens. And in this case, it causes a blue globe to float into view (see below).

Johnny grabs this globe and unfolds it into a fullscreen window, using the standard Hollywood two handed “spread” gesture described in Chapter 5 of Make It So.

The final world map fills the entire screen. Johnny uses his left hand to enter a number on a HUD style overlay keypad, then taps on the map to indicate China.

I interpret this as Johnny using the hotel phone number to specify his destination. It would not be unusual for there to be multiple hotels with the same name within a city such as Beijing, but the phone number should be unique. But since Johnny is currently in North America, he must also specify the international dialing code or 2021 equivalent, which he can do just by pointing. And this is a well-designed user interface which accepts not only multimodal input, but in any order, rather than forcing the user to enter the country code first.

Keyboards and similar physical devices often don’t translate well into virtual reality, because tactile feedback is non-existent. Even touch typists need the feeling of the physical keyboard, in particular the slight concavity of the key tops and the orientation bumps on the F and J keys, to keep their fingers aligned. Here though there is just a small grid of virtual numbers which doesn’t require extended typing. Otherwise this is a good design, allowing Johnny to type a precise number and just point to a larger target.

After he taps a location, the zoomrects indicate a transition into a new cyberspace, in this case, Beijing.

Garden Center

21 Oct 2015 by Christopher Noessel

In the center of the kitchen, mounted to the ceiling, is a Garden Center. Out of use, it retracts out of reach, but anyone in the family can say Fruit, please and the Garden Center drops down to allow fresh grapes to be plucked right off the vine. When done, Marty Jr. tells it to retract with a thump on it, and it retracts back up to its resting place near the ceiling.

This is wonderful. Responds to many types of inputs and keeps healthy, fresh fruit available to the family at any time.

6-Screen TV

21 Oct 2015 by Christopher Noessel

When Marty Jr. gets home, he approaches the large video display in the living room, which is displaying a cropped image of “The Gold of Their Bodies (Et l’or de Leur Corps)” by Paul Gauguin. He speaks to the screen, saying Art off. After a bit of static, the screen goes black. He then says, OK, I want channels 18, 24, 63, 109, 87, and the Weather Channel. As he says each, a sixth of the screen displays the live feed. The number for the channel appears in the upper left corner for a short while before fading. Marty Jr. then sits down to watch the six channels simultaneously.

Voice control. Perfect recognition. No modality. Spot on. It might dynamically update the screen in case he only wanted to watch 2 or 3 channels, but perhaps it is a cheaper system apropos to the McFly household.

Iron Man HUD: A Breakdown

1 Jul 2015 by Christopher Noessel

So this is going to take a few posts. You see, the next interface that appears in The Avengers is a video conference between Tony Stark in his Iron Man supersuit and his partner in romance and business, Pepper Potts, about switching Stark Tower from the electrical grid to their independent power source. Here’s what a still from the scene looks like.

So on the surface of this scene, it’s a communications interface.

But that chat exists inside of an interface with a conceptual and interaction framework that has been laid down since the original Iron Man movie in 2008, and built upon with each sequel, one in 2010 and one in 2013. (With rumors aplenty for a fourth one…sometime.)

So to review the video chat, I first have to talk about the whole interface, and that has about 6 hours of prologue occurring across 4 years of cinema informing it. So let’s start, as I do with almost every interface, simply by describing it and its components.

Exosuit

The Iron Man is the name of the series of superpowered exosuits designed by Tony Stark. They range from the Mark I, a comparatively crude suit of armor to escape imprisonment by terrorists, through the Mark XLVI, the armor seen in The Avengers: Age of Ultron. The suit acts as defense against nearly every type of weapon known. It has repulsor beams built into the palms and in later models the arc reactor mounted in the chest that can be used to deliver concussive force. It allows the wearer to fly. Offensive weaponry varies between models, but has included a high powered laser system, and auto-targeting minigun pod and missiles. The suit can act semi-autonomously or via remote control. One of the models in The Avengers has parts that are seen to self-propel to Tony, targeting a beacon bracelet he wears, and self-assemble around him very quickly.

Immersive display

Though Tony’s head is completely covered, he has a virtual reality display within his helmet. It is a full-field-of-vision, very high-resolution, full-color display that provides stereoscopic imaging. It allows Tony to see the world around him as if he were not wearing the helmet, augment the view with goal-, person-, location-, and object-sensitive awareness.

The display varies a great deal, changing to the needs of the situation. But five icons persistently in the lower part of the display seem to be: suit status, targeting and optics, radar, artificial horizon, and map.

An interpretive view of Tony’s experience, from Iron Man (2008). — An interpretive view of Tony’s experience, from *Iron Man* (2008).

An first-person view from within the HUD, Iron Man (2008). — An first-person view from within the HUD, *Iron Man* (2008).

There is much to critique about the readability of the complex layering and translucency, the limits of human perception, and the necessarily- (and strictly-) interpretive nature of what we as audience see, but let me save those three points for a later post. For now it’s enough to log the features as aspects of the system.

Head NUI

Though Tony could use his hands to interact with an interface projected into the augmented reality view around him, his hands are often occupied in controlling flight or in combat. For this reason the means of input are head gesture, eye gesture, and voice input. A bit more on each follows.

Elements within the HUD such as reticles around his eyes follow and track his head gestures. Other elements stay locked in place. The HUD can track his gaze perfectly, allowing him to designate targets for his weapons with a fixation. Using this perfect eye tracking, Tony can also speak about something he is looking at, either in the real world or in the interface, and the system understands exactly what he’s talking about.

In fact, Tony is able to speak fully natural language commands, and indeed, carry out full-Turing conversations with the suit because of the presence of…

Strong artificial intelligence: JARVIS

An on-board artificial intelligence known as JARVIS handles any information task Tony asks of it, and monitors the surroundings and anticipates informational needs. There is strong evidence that most of the functions of the suit are handled by JARVIS behind the scenes. The crucialness of the artificial intelligence to the function of the suit cannot be overstated. It’s difficult to imagine how most of the suit could function as it does without an artificial intelligence behind the scenes facilitating results and even guiding Tony. With this in mind it is instructive to reframe the AI as the thing being named the Iron Man, with Tony Stark being an onboard manager, or, more charitably, a command-and-control center. Who quips.

Next up in the Iron HUD series: Lets review the functions of the suit.

Alphy

22 Feb 2013 by Christopher Noessel

Barbarella-014

Barbarella’s onboard conversational computer is named Alphy. He speaks with a polite male voice with a British accent and a slight lisp. The voice seems to be omnidirectional, but confined to the cockpit of the space rocket.

Goals

Alphy’s primary duties are threefold. First, to obey Barbarella’s commands, such as waking her up before their approach to Tau Ceti. Second, autopilot navigation. Third, to report statuses, such as describing the chances of safe landing or the atmospheric analysis that assures Barbarella she will be able to breathe.

Display

Alphy

Whenever Alphy is speaking, a display panel at the back of the cockpit moves. The panel stretches from the floor to the ceiling and is about a meter wide. The front of the panel consists of a large array of small rectangular sheets of metal, each of which is attached on one side to one of the horizontal bars that stretch across the panel. As Alphy talks, individual rectangles lift and fall in a stochastic pattern, adding a small metallic clacking to the voice output. A flat yellow light fills the space behind the panel, and the randomly rising and falling rectangles reveal it in mesmerizing patterns.

The light behind Alphy’s panel can change. As Barbarella is voicing her grave concerns to Dianthus, Alphy turns red. He also flashes red and green during the magnetic disturbances that crash her ship on Tau Ceti. We also see him turn a number of colors after the crash on Tau Ceti, indicating the damage that has been done to him.

In the case of the conversation with Dianthus, there is no real alert state to speak of, so it is conceivable that these colors act something like a mood ring, reflecting Barbarella’s affective state.

Language

Like many language-capable sci-fi computer systems of the era, Alphy speaks in a stilted fashion. He is given to “computery” turns of phrases, brusque imperatives, and odd, unsocialized responses. For example, when Barbarella wishes Alphy a good night before she goes to sleep, he replies, “Confirmed.”

Barbarella even speaks this way when addressing Alphy sometimes, such as when they risk crashing into Tau Ceti and she must activate the terrascrew and travel underground. As she is piloting manually, she says things like, “Full operational power on all subterranean systems,” “45 degree ascent,” and “Quarter to half for surfacing.”

Barbarella-Alphy-color-04

Nonetheless, Alphy understands Barbarella completely whenever she speaks to him, so the stilted language seems very much like a convention than a limitation.

Anthropomorphism

Despite his lack of linguistic sophistication, he shows a surprising bit of audio anthropomorphism. When suffering through the magnetic disturbances, his voice gets distressed. Alphy’s tone also gets audibly stressed when he reveals that the Catchman has performed repairs “in reverse,” in each case underscoring the seriousness of the situation. When the space rocket crashes on Tau Ceti, Alphy asks groggily, “Where are we?” We know this is only affectation because within a few seconds, he is back up to full functioning, reporting happily that they have landed, “Planet 16 in the system Tau Ceti. Air density oh-point-oh-51. Cool weather with the possibility of stormy precipitations.” Alphy does not otherwise exhibit emotion. He doesn’t speak of his emotions or use emotional language. This convention, too, is to match Barbarella’s mood and make make her more comfortable.

Agency

Alphy’s sensors seem to be for time, communication technology, self-diagnostics, and for analyzing the immediate environment around the ship. He has actuators to speak, change his display, supply nutrition to Barbarella, and focus power to different systems around the ship, including the emergency systems. He can detect problems, such as the “magnetic disturbance”, and can respond, but has no authority to initiate action. He can only obey Barbarella, as we hear in the following exchange.

Barbarella: What’s happening?
Alphy: Magnetic disturbances.
Barbarella: Magnetic disturbances?…Emergency systems!
Alphy: All emergency systems will now operate.

His real function?

All told, Alphy is very limited in what he can do. His primary functions are reading aloud data that could be dials on a dashboard and flipping switches so Barbarella won’t have to take her hands off of…well, switches…in emergency situations. The bits of anthropomorphic cues he provides to her through the display and language confirm that his primary goal is social, to make Barbarella’s adventurous trips through space not feel so lonely.

Gene Sequence Comparison

17 Jan 2013 by Christopher Noessel

Genetic tester

Shaw sits at this device speaking instructions aloud as she peers through a microscope. We do not see if the instructions are being manually handled by Ford, or whether the system is responding to her voice input. Ford issues the command “compare it to the gene sample,” the nearby screen displays DNA gel electrophoresis results for the exploding alien sample and a human sample. When Ford says, “overlay,” the results slide on top of each other. A few beats after some screen text and a computerized voice informs them that the system is PROCESSING, (repeated twice) it confirms a DNA MATCH with other screen text read by the same computerized voice.

Playback box

When Halloway visits Shaw in her quarters, she uses a small, translucent glass cuboid to show him the comparison. To activate it, she drags a finger quickly across the long, wide surface. That surface illuminates with the data from the genetic tester, including the animation. The emerald green colors of the original have been replaced by cyan, the red has been replaced by magenta, and some of the contextualizing GUI has been omitted, but it is otherwise the same graphic. Other than this activation gesture, no other interactivity is seen with this device.

There’s a bit of a mismatch between the gesture she uses for input and the output on the screen. She swipes, but the information fades up. It would be a tighter mapping for Shaw if a swipe on its surface resulted in the information’s sliding in at the same speed, or at least faded up as if she was operating a brightness control. If the fade up was the best transition narratively, another gesture such as a tap might be a better fit for input. Still, the iOS standard for unlocking is to swipe right, so this decision might have been made on the basis of the audience’s familiarity with that interaction.

MedPod

28 Dec 2012 by Christopher Noessel

Early in the film, when Shaw sees the MedPod for the first time, she comments to Vickers that, “They only made a dozen of these.” As she caresses its interface in awe, a panel extends as the pod instructs her to “Please verbally state the nature of your injury.”

The MedPod is a device for automated, generalized surgical procedures, operable by the patient him- (or her-, kinda, see below) self.

When in the film Shaw realizes that she’s carrying an alien organism in her womb, she breaks free from crewmembers who want to contain her, and makes a staggering beeline for the MedPod.

Once there, she reaches for the extended touchscreen and presses the red EMERGENCY button. Audio output from the pod confirms her selection, “Emergency procedure initiated. Please verbally state the nature of your injury.” Shaw shouts, “I need cesarean!” The machine informs her verbally that, “Error. This MedPod is calibrated for male patients only. It does not offer the procedure you have requested. Please seek medical assistance else–”

I’ll pause the action here to address this. What sensors and actuators are this gender-specific? Why can’t it offer gender-neutral alternatives? Sure, some procedures might need anatomical knowledge of particularly gendered organs (say…emergency circumcision?), but given…

the massive amounts of biological similarity between the sexes
the needs for any medical device to deal with a high degree of biological variability in its subjects anyway
most procedures are gender neutral

…this is a ridiculous interface plot device. If Dr. Shaw can issue a few simple system commands that work around this limitation (as she does in this very scene), then the machine could have just done without the stupid error message. (Yes, we get that it’s a mystery why Vickers would have her MedPod calibrated to a man, but really, that’s a throwaway clue.) Gender-specific procedures can’t take up so much room in memory that it was simpler to cut the potential lives it could save in half. You know, rather than outfit it with another hard drive.

Aside from the pointless “tension-building” wrong-gender plot point, there are still interface issues with this step. Why does she need to press the emergency button in the first place? The pod has a voice interface. Why can’t she just shout “Emergency!” or even better, “Help me!” Isn’t that more suited to an emergency situation? Why is a menu of procedures the default main screen? Shouldn’t it be a prompt to speak, and have the menu there for mute people or if silence is called for? And shouldn’t it provide a type-ahead control rather than a multi-facet selection list? OK, back to the action.

Desperate, Shaw presses a button that grants her manual control. She states “Surgery abdominal, penetrating injuries. Foreign body. Initiate.” The screen confirms these selections amongst options on screen. (They read “DIAGNOS, THERAP, SURGICAL, MED REC, SYS/MECH, and EMERGENCY”)

The pod then swings open saying, “Surgical procedure begins,” and tilting itself for easy access. Shaw injects herself with anesthetic and steps into the pod, which seals around her and returns to a horizontal position.

Why does Shaw need to speak in this stilted speech? In a panicked or medical emergency situation, proper computer syntax should be the last thing on a user’s mind. Let the patient shout the information however they need to, like “I’ve got an alien in my abdomen! I need it to be surgically removed now!” We know from the Sonic chapter that the use of natural language triggers an anthropomorphic sense in the user, which imposes some other design constraints to convey the system’s limitations, but in this case, the emergency trumps the needs of affordance subtleties.

Once inside the pod, a transparent display on the inside states that, “EMERGENCY PROC INITIATED.” Shaw makes some touch selections, which runs a diagnostic scan along the length of her body. The terrifying results display for her to see, with the alien body differentiated in magenta to contrast her own tissue, displayed in cyan.

Shaw shouts, “Get it out!!” It says, “Initiating anesthetics” before spraying her abdomen with a bile-yellow local anesthetic. It then says, “Commence surgical procedure.” (A note for the grammar nerds here: Wouldn’t you expect a machine to maintain a single part of speech for consistency? The first, “Initiating…” is a gerund, while the second, “Commence,” is an imperative.) Then, using lasers, the MedPod cuts through tissue until it reaches the foreign body. Given that the lasers can cut organic matter, and that the xenomorph has acid for blood, you have to hand it to the precision of this device. One slip could have burned a hole right through her spine. Fortunately it has a feather-light touch. Reaching in with a speculum-like device, it removes the squid-like alien in its amniotic sac.

OK. Here I have to return to the whole “ManPod” thing. Wouldn’t a scan have shown that this was, in fact, a woman? Why wouldn’t it stop the procedure if it really couldn’t handle working on the fairer sex? Should it have paused to have her sign away insurance rights? Could it really mistake her womb for a stomach? Wouldn’t it, believing her to be a man, presume the whole womb to be a foreign body and try to perform a hysterectomy rather than a delicate caesarian? ManPod, indeed.

After removing the alien, it waits around 10 seconds, showing it to her and letting her yank its umbilical cord, before she presses a few controls. The MedPod seals her up again with staples and opens the cover to let her sit up.

She gets off the table, rushes to the side of the MedPod, and places all five fingertips of her right hand on it, quickly twisting her hand clockwise. The interface changes to a red warning screen labeled “DECONTAMINATE.” She taps this to confirm and shouts, “Come on!” (Her vocal instruction does not feel like a formal part of the procedure and the machine does not respond differently.) To decontaminate, the pod seals up and a white mist fills the space.

OK. Since this is a MedPod, and it has something called a decontamination procedure, shouldn’t it actually test to see whether the decontamination worked? The user here has enacted emergency decontamination procedures, so it’s safe to say that this is a plague-level contagion. That’s doesn’t say to me: Spray it with a can of Raid and hope for the best. It says, “Kill it with fire.” We just saw, 10 seconds ago, that the MedPod can do a detailed, alien-detecting scan of its contents, so why on LV-223 would it not check to see if the kill-it-now-for-God’s-sake procedure had actually worked, and warn everyone within earshot that it hadn’t? Because someone needs to take additional measures to protect the ship, and take them, stat. But no, MedPod tucks the contamination under a white misty blanket, smiles, waves, and says, “OK, that’s taken care of! Thank you! Good day! Move along!”

For all of the goofiness that is this device, I’ll commend it for two things. The first is for pushing the notion forward of automated medicine. Yes, in this day and age, it’s kind of terrifying to imagine devices handling something as vital as life-saving surgery, but people in the future will likely find it terrifying that today we’d rather trust an error prone, bull-in-a-china-shop human to the task. And, after all, the characters have entrusted their lives to an android while they were in hypersleep for two years, so clearly that’s a thing they do.

Second, the gestural control to access the decontamination is well considered. It is a large gesture, requiring no great finesse on the part of the operator to find and press a sequence of keys, and one that is easy to execute quickly and in a panic. I’m absolutely not sure what percentage of procedures need the back-up safety of a kill-everything-inside mode, but presuming one is ever needed, this is a fine gesture to initiate that procedure. In fact, it could have been used in other interfaces around the ship, as we’ll see later with the escape pod interface.

I have the sense that in the original script, Shaw had to do what only a few very bad-ass people have been willing to do: perform life-saving surgery on themselves in the direst circumstances. Yes, it’s a bit of a stretch since she’s primarily an anthropologist and astronomer in the story, but give a girl a scalpel, hardcore anesthetics, and an alien embryo, and I’m sure she’ll figure out what to do. But pushing this bad-assery off to an automated device, loaded with constraints, ruins the moment and changes the scene from potentially awesome to just awful.

Given the inexplicable man-only settings, requiring a desperate patient to recall FORTRAN-esque syntax for spoken instructions, and the failure to provide any feedback about the destruction of an extinction-level pathogen, we must admit that the MedPod belongs squarely in the realm of goofy narrative technology and nowhere near the real world as a model of good interaction design.