Comparing Sci-Fi HUDs in 2024 Movies

6 Feb 2025 by Christopher Noessel

As in previous years, in preparation for awarding the Fritzes, I watched as many sci-fi movies as I could find across 2024. One thing that stuck out to me was the number of heads-up displays (HUDs) across these movies. There were a lot of them. So in advance of the awards, let’s look at and compare these. (Note the movies included here are not necessarily nominees for a Fritz award.)

I usually introduce the plot of every movie before I talk about it. This provides some context to understanding the interface. However, that will happen in the final Fritzes post. I’m going to skip that here. Still, it’s only fair to say there will be some spoilers as I describe these.

If you read Chapter 8 of Make It So: Interaction Lessons from Science Fiction, you’ll recall that I’d identified four categories of augmentation.

Sensor displays
Location awareness
Context awareness (objects, people)
Goal awareness

These four categories are presented in increasing level of sophistication. Let’s use these to investigate and compare five primary examples from 2024, in order of their functional sophistication.

Dune 2

True to the minimalism that permeates many of the film’s interfaces, the AR of this device has a rounded-rectangle frame from which hangs a measure of angular degrees to the right. There are a few ticks across the center of this screen (not visible in this particular screen shot). There is a row of blue characters across the bottom center. I can’t read Harkonnen, and though the characters change, I can’t quite decipher what most of them mean. But it does seem the leftmost character indicates azimuth and the rightmost character angular altitude of the glasses. Given the authoritarian nature of this House, it would make sense to have some augmentation naming the royal figures in view, but I think it’s a sensor display, which leaves the user with a lot of work to figure out how to use that information.

You might think this indicates some failing of the writer’s or FUI designers’ imagination. However, an important part of the history of Dune is a catastrophic conflict known as the Butlerian Jihad. This conflict involved devastating, large-scale wars against intelligent machines. As a result, machines with any degree of intelligence are considered sacrilege. So it’s not an oversight, but as a result, we can’t look to this as a model for how we might handle more sophisticated augmentations.

Alien: Romulus

A little past halfway through the movie, the protagonists finally get their hands on some weapons. In a fan-service scene similar to one between Ripley and Hicks from Aliens (1986), Tyler shows Rain how to hold an FAA44 pulse rifle. He also teaches her how to operate it. The “AA” stands for “aiming assist”, a kind of object awareness. (Tyler asserts this is what the colonial marines used, which kind of retroactively saps their badassery, but let’s move on.) Tyler taps a small display on the user-facing rear sight, and a white-on-red display illuminates. It shows a low-res video of motion happening before it. A square reticle with crosshairs shows where the weapon will hit. A label at the top indicates distance. A radar sweep at the bottom indicates movement in 360° plan view, a sensor display.

When Rain pulls the trigger halfway, the weapon quickly swings to aim at the target. There is no indication of how it would differentiate between multiple targets. It’s also unclear how Rain told it that the object in the crosshairs earlier is what she wants it to track now. Or how she might identify a friendly to avoid. Red is a smart choice for low-light situations as red is known to not interfere with night vision. Also it’s elegantly free of flourishes and fuigetry.

I’m not sure the halfway-trigger is the right activation mechanism. Yes, it allows the shooter to maintain a proper hold and remain ready with the weapon, and allows them not to have to look at the display to gain its assistance, but it also requires them to be in a calm, stable circumstance that allows for fine motor control. Does this mean that in very urgent, chaotic situations, users are just left to their own devices? Seems questionable.

Alien: Romulus is beholden to the handful of movies in the franchise that preceded it. Part of the challenge for its designers is to stay recognizably a part of the body of work that was established in 1979 while offering us something new. This weapon HUD stays visually simple, like the interfaces from the original two movies. It narratively explains how a civilian colonist with no weapons training can successfully defend herself against a full-frontal assault by a dozen of this universe’s most aggressive and effective killers. However, it leaves enough unexplained that it doesn’t really serve as a useful model.

The Wild Robot

HUD displays of artificially intelligent robots are always difficult to analyze. It’s hard to determine what’s an augmentation, here loosely defined as an overlay on some datastream created for a user’s benefit but explicitly not by that user. The alternative is a visualization of the AI’s own thoughts as they are happening. I’d much rather analyze these as augmentation provided for Roz, but it just doesn’t hold up to scrutiny that way. What we see in this film are visualizations of Roz’ thoughts.

Fresh after booting up, Roz searches for a “customer,” and kind of finds one in a crab. *The Wild Robot* (2024).

In the HUD, there is an unchanging frame around the outside. Static cyan circuit lines extend to the edge. (In the main image above, the screen-green is an anomaly.) A sphere rotates in the upper left unconnected to anything. A hexagonal grid on the left has some hexes which illuminate and blink unconnected to anything. The grid moves unrelated to anything. These are fuigetry and neither conveys information nor provides utility.

Inside that frame, we see Roz’ visualized thinking across many scenes.

Locus of attention—Many times we see a reticle indicating where she’s focused, oftentimes with additional callout details written in robot-script.
“Customer” recognition—(pictured) Since it happens early in the film, you might think this is a goofy error. The potential customer she has recognized is a crab. But later in the film, Roz learns the language common to the animals of the island. All the animals display a human-like intelligence, so it’s completely within the realm of possibility that this blue little crustacean could be her customer. Though why that customer needed a volumetric wireframe augmentation is very unclear.
X-ray vision—While looking around for a customer, she happens upon an egg. The edge detection indicates her attention. Then she performs scans that reveal the growing chick inside and a vital signs display.
Damage report—After being attacked by a bear, Roz does an internal damage check and she notes the damage on screen.
Escape alert—(pictured) When a big wave approaches the shore on which she is standing, Roz estimates the height of the wave to be five times her height. Her panic expresses itself in a red tint around the outside edge.
Project management—Roz adopts Brightbill and undertakes the mission to mother him—specifically to teach him to eat, swim, and fly. As she successfully teaches him each of these things, she checks it off by updating one of three graphics that represent the topics.
Language acquisition—(pictured) Of all the AR in this movie, this scene frustrates me the most. There is a sequence in which Roz goes torpid to focus on learning the animal language. Her eyes are open the entire time she captures samples and analyzes them. The AR shows word bubbles associated with individual animal utterances. At first those bubbles are filled with cyan-colored robo-ese script. Over the course of processing a year’s worth of samples, individual characters are slowly replaced in the utterances with bold, green, Latin characters. This display kind of conveys the story beat of “she’s figuring out the language,” but befits cryptography much more than acquisition of a new language.

If these were augmented reality, I’d have a lot of questions about why it wasn’t helping her more than it does. It might seem odd to think an AI might have another AI helping it, but humans have loads of systems that operate without explicit conscious thought, like preattentive processing, all the functions of our autonomic nervous system, sensory filtering, and recall, just to name a few. So I can imagine it would be a fine model for AI-supporting-AI.

Since it’s not augmented reality, it doesn’t really act as a model for real world designs except perhaps for its visual styling.

Borderlands

Claptrap is a little one-wheel robot that accompanies Lilith through her adventures on and around Pandora. We see things through his POV several times.

When Claptrap first sees Lilith, it’s from his HUD. Like Roz’ POV display in The Wild Robot, the outside edge of this view has a fixed set of lines and greebles that don’t change, not even for a sensor display. I wish those lines had some relationship to his viewport, but that’s just a round lens and the lines are vaguely like the edges of a gear.

Scrolling up from the bottom left is an impressive set of textual data. It shows that a DNA match has been made (remotely‽ What kind of resolution is Claptrap’s CCD?) and some data about Lilith from what I presume is a criminal justice data feed: Name and brief physical description. It’s person awareness.

Below that are readouts for programmed directive and possible directive tasks. They’re funny if you know the character. Tasks include “Supply a never-ending stream of hilarious jokes and one-liners to lighten the mood in tense situations” and “Distract enemies during combat. Prepare the Claptrap dance of confusion!” I also really like the last one “Take the bullets while others focus on being heroic.” It both foreshadows a later scene and touches on the problem raised with Dr. Strange’s Cloak of Levitation: How do our assistants let us be heroes?

At the bottom is the label “HYPERION 09 U1.2” which I think might be location awareness? The suffix changes once they get near the vault. Hyperion is a faction in the game. Not certain what it means in this context.

When driving in a chase sequence, his HUD gives him a warning about a column he should avoid. It’s not a great signal. It draws his attention but then essentially says “Good luck with that.” He has to figure out what object it refers to. (The motion tracking, admittedly, is a big clue.) But the label is not under the icon. It’s at the bottom left. If this were for a human, it would add a saccade to what needs to be a near-instantaneous feedback loop. Shouldn’t it be an outline or color overlay to make it wildly clear what and where the obstacle is? And maybe some augmentation on how to avoid it, like an arrow pointing right? As we see in a later scene (below) the HUD does have object detection and object highlighting. There it’s used to find a plot-critical clue. It’s just oddly not used here, you know, when the passengers’ lives are at risk.

When the group goes underground in search of the key to the Vault, Claptrap finds himself face to face with a gang of Psychos. The augmentation includes little animated red icons above the Psychos. Big Red Text summarizes “DANGER LEVEL: HIGH” across the middle, so you might think it’s demonstrating goal and context awareness. But Claptrap happens to be nigh-invulnerable, as we see moments later when he takes a thousand Psycho bullets without a scratch. In context, there’s no real danger. So,…holup. Who’s this interface for, then? Is it really aware of context?

When they visit Lilith’s childhood home, Claptrap finds a scrap of paper with a plot-critical drawing on it. The HUD shows a green outline around the paper. Text in the lower right tracks a “GARBAGE CATALOG” of objects in view with comments, “A PSYCHO WOULDN’T TOUCH THAT”, “LIFE-CHOICE QUESTIONING TRASH”, “VAULT HUNTER THROWBACK TRASH”. This interface gives a bit of comedy and leads to the Big Clue, but raises questions about consistency. It seems the HUDs in this film are narrativist.

In the movie, there are other HUDs like this one, for the Crimson Lance villains. They fly their hover-vehicles using them, but we don’t get nearly enough time to tease the parts apart.

Atlas

The HUD in Atlas happens when the titular character Atlas is strapped into an ARC9 mech suit, which has its own AGI named Smith. Some of the augmentations are communications between Smith and Atlas, but most are augmentations of the view before her. The viewport from the pilot’s seat is wide and the augmentations appear there.

On the way to evil android Harlan’s base, we see the frame of the HUD has azimuth and altitude indicators near the edge. There are a few functionless flourishes, like arcs at the left and right edges. Later we see object and person recognition (in this case, an android terrorist, Casca Decius). When Smith confirms they are hostile, the square reticles go from cyan to red, demonstrating context awareness.

Over the course of the movie Atlas has resisted Smith’s call to “sync” with him. At Harlan’s base, she is separated from the ARC9 unit for a while. But once she admits her past connection to Harlan, she and Smith become fully synched. She is reunited with the ARC9 unit and its features fully unlock.

As they tear through the base to stop the launch of some humanity-destroying warheads, they meet resistance from Harlan’s android army. This time the HUD wholly color codes the scene, making it extremely clear where the combatants are amongst the architecture.

Overlays indicate the highest priority combatants that, I suppose, might impede progress. A dashed arrow stretches through the scene indicating the route they must take to get to their goal. It focuses Atlas on their goal and obstacles, helping her decision-making around prioritization. It’s got rich goal awareness and works hard to proactively assist its user.

Despite being contrasting, the colors are well controlled so they do not vibrate. You might think that the luminance of the combatants and architecture might be flipped, but the ARC9 is bulletproof, so there’s no real danger from the gunfire. (Contrast Claptrap’s fake danger warning, above.) Saving humanity is the higher priority. So the brightest (yellow) means “do this”, the second brightest (cyan) means “through this” and darkest (red) means “there will be some nuisances en route.” The luminance is where it should be.
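For anyone who wants to check that brightness ordering rather than eyeball it, here is a minimal sketch using the standard sRGB relative-luminance formula. The RGB values are my assumption: fully saturated yellow, cyan, and red standing in for the film's actual palette, which I have not sampled.

```python
def srgb_to_linear(c: float) -> float:
    """Undo the sRGB transfer curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance (0 = black, 1 = white) of an 8-bit sRGB color."""
    r, g, b = (srgb_to_linear(v / 255) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


# Assumed stand-ins for the HUD palette: pure, fully saturated colors.
HUD_COLORS = {
    "route (yellow)": (255, 255, 0),
    "architecture (cyan)": (0, 255, 255),
    "combatants (red)": (255, 0, 0),
}

# Sort the layers from brightest to darkest.
ranked = sorted(
    HUD_COLORS,
    key=lambda name: relative_luminance(HUD_COLORS[name]),
    reverse=True,
)

for name in ranked:
    print(f"{name}: {relative_luminance(HUD_COLORS[name]):.4f}")
```

With these stand-in values the route ranks brightest, the architecture second, and the combatants darkest, which matches the priority order described above. Real frames would need sampled colors to confirm it.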

In the climactic fight with Harlan, the HUD even displays a predictive augmentation, illustrating where the fast-moving villain is likely to be when Atlas’ attacks land. This crucial augmentation helps her defeat the villain and save the day. I don’t think I’ve seen predictive augmentation outside of video games before.

If I was giving out an award for best HUD of 2024, Atlas would get it. It is the most fully-imagined HUD assistance across the year, and consistently, engagingly styled. If you are involved with modern design or the design of sci-fi interfaces, I highly recommend you check it out.

Stay tuned for the full Fritz awards, coming later this year.

Realtime story visualization

29 Jul 2020 by Christopher Noessel

Caveat: This is definitely me reading into things. Or even, inferring something that I’d like to see in the world. But why not?

Black Panther begins with a conversation between a son and father.

SON
Baba?
FATHER
Yes, my son?
SON
Tell me a story
FATHER
Which one?
SON
The story of home.

The conversation continues with the father describing the history of Wakanda. On screen, we see a lovely sequence of shapes that illustrate the story. A meteor strikes Africa and the nearby flora and fauna change. Five hands form a pentagram version of the four-handed carry grip to represent the five tribes. The hands shift to become warring tribespeople. Their armor. Their weapons. Their animals.

All these shapes are made from vibranium sand—gunmetal gray colored, sparkling particles, see the screen caps—that move and reform fluidly, with a unifying highlight of glowing blue.

Now, this opening sequence isn’t presented as an interface, or really, as anything in the diegesis at all. We understand it is exposition, for us in the audience. But what if it wasn’t? What if this is showing us a close up of a display that illustrates in real-time what the storyteller is saying? Something just over the shoulder of Baba that the child can watch?

The display would not be prerecorded, which requires the storyteller to match its fixed pace. (Presenters who have tried pecha-kucha style presentations of 20 slides, 20 seconds each will know how awkward this can be.) Instead, this display responds instantly to the storyteller’s tone and pace, allowing them to tailor the story to the responses of the audience: emphasizing the things that seem exciting, or heartwarming, or whatever the storyteller wants.

It’s a given in the MCU that Wakanda has developed the technology to control vibranium down to a very small scale, including levitating it, shaping it, and having it form materials of widely varying properties. Nearly all of the technology we see in the film is made from it. So, the diegetic technology for such a display is there.

It’s not that far a stretch from 2D technology we have now. The game Scribblenauts lets players type in phrases and *poof* that thing appears in the scene with your characters. I doubt it’s, like, dictionary-exhaustive, but the vast majority of things I and my son have typed in have been there.

Black panther? Check. (Well, it’s the large cat version, anyway.)
Huge pink Cthulhu? Check.
Teeny tiny singularity? Check!
Enraged plaid Beowulf? OK. Not that. But if enough people typed it in, I have a feeling it would eventually show up.

Pipe a speech-to-text engine into something like that, skin it with vibranium sand, and you’re most of the way there.

This unfortunate screen cap makes it look like Cthulhu’s about to take a dump in a birdbath.

The interface issues for such a thing probably center around 1. interpretation and 2. control.

1. Natural language understanding of the story

I work on a natural language AI system in my day job at IBM, and disambiguation is one of the major challenges we face: Teaching the systems enough about the world and language to understand what a user might have meant when they typed something like “deliveries tuesday.” But I work with real-world narrow artificial intelligence, and getting it to understand like a human might understand is a massive undertaking.

The MCU generally, and Wakanda in particular, has speculative, human-like Artificial General Intelligences (AGI) like J.A.R.V.I.S., F.R.I.D.A.Y., and Ultron, so all the disambiguation problems we face in the real world are a trivial issue. (Noting that Shuri’s AGI isn’t named in the film.)

AGI can interpret and design and render the story like some magical realtime scene painter in the same way a person would—only much, much faster—and would interpret the language in the same reasonable way. (Plus, I’m pretty sure the display has heard Baba tell this exact same myth before, so its confidence that it is displaying the right thing is even greater.)

2. Controlling the display

The other issue is controlling the display. How does Baba start and stop the rendering? How does it correct something it misunderstood, or change the styling? In the real world we have to work out escape sequences for opt-out systems (like “//” for comments in code) and wake words for opt-in systems (like “Hey, Google” or “Alexa”), but in the MCU we get to rely on the speculative AGI again. Just like a person would know to listen for cues when to start and stop, it can reasonably interpret commands like “pause display,” or “hold here” as we would expect of a person in a tech booth overseeing a theatrical performance.

***

Given the AGI in Wakanda, vibranium sand, and the render-almost-anything engines in the real world, we don’t even have to add anything to the diegesis to make it work, just make a new combination of existing parts.

So while there is zero evidence that this is a diegetic interface, I’m choosing to believe it is one, and hope somebody makes something like it one day.

Black Lives Matter: A first reading list

The Black Lives Matter movement needs to be much more than education—we need action to dismantle the unjust and racist systems it brings to light—but education can be a first place to start. So for this first post, let’s talk how to educate yourself on the issues at hand. This is especially for white people, since this can be so far out of our lived experience that the claims seem at first implausible.

Here biracial/black filmmaker Maria Breaux has given me permission to share the books she has shared with me, which are a kind of 101 syllabus. Pick one, any one, and read.

The New Jim Crow by Michelle Alexander
Stamped from the Beginning by Ibram X. Kendi
How to Be an Antiracist (also) by Ibram X. Kendi
So You Want to Talk about Race by Ijeoma Oluo
Just Mercy by Bryan Stevenson

In full disclosure I have not read any of these yet. (I’m a notoriously slow reader.) I’m on this journey, too. I’m starting with The New Jim Crow, because it seems the most painful to read.

New Jim Crow: Michelle Alexander at Dillard Nov. 28 – Antenna.Works

Deckard’s Photo Inspector

29 Apr 2020 by Christopher Noessel

Back to Blade Runner. I mean, the pandemic is still pandemicking, but maybe this will be a nice distraction while you shelter in place. Because you’re smart, sheltering in place as much as you can, and not injecting disinfectants. And, like so many other technologies in this film, this will take a while to deconstruct, critique, and reimagine.

Description

Doing his detective work, Deckard retrieves a set of snapshots from Leon’s hotel room, and he brings them home. Something in the one pictured above catches his eye, and he wants to investigate it in greater detail. He takes the photograph and inserts it in a black device he keeps in his living room.

Note: I’ll try and describe this interaction in text, but it is much easier to conceptualize after viewing it. Owing to copyright restrictions, I cannot upload this length of video with the original audio, so I have added pre-rendered closed captions to it, below. All dialogue in the clip is Deckard.

Deckard does digital forensics, looking for a lead.

He inserts the snapshot into a horizontal slit and turns the machine on. A thin, horizontal orange line glows on the left side of the front panel. A series of seemingly random-length orange lines begin to chase one another in a single-row space that stretches across the remainder of the panel and continue to do so throughout Deckard’s use of it. (Imagine a news ticker, running backwards, where the “headlines” are glowing amber lines.) This seems useless and an absolutely pointless distraction for Deckard, putting high-contrast motion in his peripheral vision, which fights for attention with the actual, interesting content down below.

If this is distracting you from reading, YOU SEE MY POINT.

After a second, the screen reveals a blue grid, behind which the scan of the snapshot appears. He stares at the image in the grid for a moment, and speaks a set of instructions, “Enhance 224 to 176.”

In response, three data points appear overlaying the image at the bottom of the screen. Each has a two-letter label and a four-digit number, e.g. “ZM 0000 NS 0000 EW 0000.” The NS and EW—presumably North-South and East-West coordinates, respectively—immediately update to read, “ZM 0000 NS 0197 EW 0334.” After updating the numbers, the screen displays a crosshairs, which target a single rectangle in the grid.

A new rectangle then zooms in from the edges to match the targeted rectangle, as the ZM number—presumably zoom, or magnification—increases. When the animated rectangle reaches the targeted rectangle, its outline blinks yellow a few times. Then the contents of the rectangle are enlarged to fill the screen, in a series of steps which are punctuated with sounds similar to a mechanical camera aperture. The enlargement is perfectly resolved. The overlay disappears until the next set of spoken commands. The system response between Deckard’s issuing the command and the device’s showing the final enlarged image is about 11 seconds.

Deckard studies the new image for a while before issuing another command. This time he says, “Enhance.” The image enlarges in similar clacking steps until he tells it, “Stop.”

Other instructions he is heard to give include “move in, pull out, track right, center in, pull back, center, and pan right.” Some include discrete instructions, such as, “Track 45 right” while others are relative commands that the system obeys until told to stop, such as “Go right.”

Using such commands he isolates part of the image that reveals an important clue, and he speaks the instruction, “Give me a hard copy right there.” The machine prints the image, which Deckard uses to help find the replicant pictured.

I’d like to point out one bit of sophistication before the critique. Deckard can issue a command with or without a parameter, and the inspector knows what to do. For example, “Track 45 right” and “Track right.” Without the parameter, it will just do the thing repeatedly until told to stop. That helps Deckard issue the same basic command both when he knows exactly where he wants to look and when he doesn’t know exactly what he’s looking for. That’s a nice feature of the language design.
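To make that optional-parameter idea concrete, here is a minimal sketch of a parser for this kind of command. Everything in it is my assumption rather than anything shown in the film: the verb list is just the words Deckard is heard to say, and `Command`, `parse`, and the `continuous` flag are hypothetical names. It only handles the single-parameter form. The two-number “Enhance 224 to 176” and “Stop” would need their own handling.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabulary, drawn from commands heard in the scene.
VERBS = {"track", "pan", "go", "enhance", "move", "pull", "center"}

# verb, then an optional number, then an optional modifier word.
PATTERN = re.compile(
    r"^(?P<verb>[a-z]+)(?:\s+(?P<amount>\d+))?(?:\s+(?P<modifier>[a-z]+))?$"
)


@dataclass
class Command:
    verb: str
    amount: Optional[int]     # None means no parameter was spoken
    modifier: Optional[str]   # e.g. "right", "in", "back"

    @property
    def continuous(self) -> bool:
        """With no amount, keep going until the user says stop."""
        return self.amount is None


def parse(utterance: str) -> Command:
    """Turn one spoken command into a Command, or raise ValueError."""
    text = utterance.strip().lower().rstrip(".")
    match = PATTERN.match(text)
    if not match or match["verb"] not in VERBS:
        raise ValueError(f"unrecognized command: {utterance!r}")
    amount = match["amount"]
    return Command(match["verb"], int(amount) if amount else None, match["modifier"])


print(parse("Track 45 right"))        # discrete: move a fixed amount
print(parse("Track right").continuous)  # relative: run until told to stop
```

The design choice worth noticing is that the same verb covers both cases, and the absence of a number is itself the signal for "keep going."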

But still, asking him to provide step-by-step instructions in this clunky way feels like some high-tech Big Trak. (I tried to find a reference that was as old as the film.) And that’s not all…

Some critiques, as it is

Can I go back and mention that amber distracto-light? Because it’s distracting. And pointless. I’m not mad. I’m just disappointed.
It sure would be nice if any of the numbers on screen made sense, and had any bearing on the numbers Deckard speaks, at any time during the interaction. For instance, the initial zoom (I checked in Photoshop) is around 304%, which is neither the 224 nor the 176 that Deckard speaks.
It might be that each square has a number, and he simply has to name the two squares at the extents of the zoom he wants, letting the machine find the extents, but where is the labeling? Did he have to memorize an address for each pixel? How does that work at arbitrary levels of zoom?
And if he’s memorized it, why show the overlay at all?
Why the seizure-inducing flashing in the transition sequences? Sure, I get that lots of technologies have unfortunate effects when constrained by mechanics, but this is digital.
Why is the printed picture so unlike the still image where he asks for a hard copy?
Gaze at the reflection in Ford’s hazel, hazel eyes, and it’s clear he’s playing Missile Command, rather than paying attention to this interface at all. (OK, that’s the filmmaker’s issue, not a part of the interface, but still, come on.)

The photo inspector: My interface is up HERE, Rick.

How might it be improved for 1982?

So if 1982 Ridley Scott was telling me in post that we couldn’t reshoot Harrison Ford, and we had to make it just work with what we had, here’s what I’d do…

Squash the grid so the cells match the 4:3 ratio of the NTSC screen. Overlay the address of each cell, while highlighting column and row identifiers at the edges. Have the first cell’s outline illuminate as he speaks it, and have the outline expand to encompass the second named cell. Then zoom, removing the cell labels during the transition. When at anything other than full view, display a map across four cells that shows the zoom visually in the context of the whole.

Rendered in glorious 4:3 NTSC dimensions.

With this interface, the structure of the existing conversation makes more sense. When Deckard said, “Enhance 203 to 608” the thing would zoom in on the mirror, and the small map would confirm.

The numbers wouldn’t match up, but it’s pretty obvious from the final cut that Scott didn’t care about that (or, more charitably, ran out of time). Anyway I would be doing this under protest, because I would argue this interaction needs to be fixed in the script.

How might it be improved for 2020?

What’s really nifty about this technology is that it’s not just a photograph. Look close in the scene, and Deckard isn’t just doing CSI Enhance! commands (or, to be less mocking, AI upscaling). He’s using the photo inspector to look around corners and at objects that are reconstructed from the smallest reflections. So we can think of the interaction like he’s controlling a drone through a 3D still life, looking for a lead to help him further the case.

With that in mind, let’s talk about the display.

Display

To redesign it, we have to decide at a foundational level how we think this works, because it will color what the display looks like. Is this all data that’s captured from some crazy 3D camera and available in the image? Or is it being inferred from details in the two-dimensional image? Let’s call the first the 3D capture, and the second the 3D inference.

If we decide this is a 3D capture, then all the data that he observes through the machine has the same degree of confidence. If, however, we decide this is a 3D inferrer, Deckard needs to treat the inferred data with more skepticism than the data the camera directly captured. The 3D inferrer is the harder problem, and raises some issues that we must deal with in modern AI, so let’s just say that’s the way this speculative technology works.

The first thing the display should do is make it clear what is observed and what is inferred. How you do this is partly a matter of visual design and style, but partly a matter of diegetic logic. The first pass would be to render everything in the camera frustum photo-realistically, and then render everything outside of that in a way that signals its confidence level. The comp below illustrates one way this might be done.

Modification of a pair of images found on Evermotion

In the comp, Deckard has turned the “drone” from the “actual photo,” seen off to the right, toward the inferred space on the left. The monochrome color treatment provides that first high-confidence signal.
In the scene, the primary inference would come from reading the reflections in the disco ball overhead lamp, maybe augmented with plans for the apartment that could be found online, or maybe purchase receipts for appliances, etc. Everything it can reconstruct from the reflection and high-confidence sources has solid black lines, a second-level signal.
The smaller knickknacks that are out of the reflection of the disco ball, and implied from other, less reflective surfaces, are rendered without the black lines and blurred. This provides a signal that the algorithm has a very low confidence in its inference.

This is just one (not very visually interesting) way to handle it, but should illustrate that, to be believable, the photo inspector shouldn’t have a single rendering style outside the frustum. It would need something akin to these levels to help Deckard instantly recognize how much he should trust what he’s seeing.
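As a rough sketch of how those levels might be expressed as rules, here is one way to map where a piece of scene data came from to how it gets drawn. The three tiers and their treatments follow the comp described above. The names (`Source`, `RenderStyle`, `style_for`) and the sample objects are hypothetical, made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Where a piece of the reconstructed scene came from."""
    OBSERVED = "inside the camera frustum"
    REFLECTION = "reconstructed from strong reflections or trusted records"
    IMPLIED = "implied by weaker, less reflective surfaces"


@dataclass(frozen=True)
class RenderStyle:
    full_color: bool       # False means the monochrome treatment
    black_outlines: bool   # solid black lines mark high-confidence inference
    blurred: bool          # blur marks very low confidence


# One treatment per confidence tier, following the comp's three levels.
STYLE_BY_SOURCE = {
    Source.OBSERVED: RenderStyle(full_color=True, black_outlines=False, blurred=False),
    Source.REFLECTION: RenderStyle(full_color=False, black_outlines=True, blurred=False),
    Source.IMPLIED: RenderStyle(full_color=False, black_outlines=False, blurred=True),
}


def style_for(source: Source) -> RenderStyle:
    """Look up how an object should be drawn given its provenance."""
    return STYLE_BY_SOURCE[source]


# Hypothetical scene contents, for illustration only.
scene = {
    "mirror": Source.OBSERVED,
    "refrigerator": Source.REFLECTION,
    "knickknack": Source.IMPLIED,
}

for name, source in scene.items():
    print(name, style_for(source))
```

A real system would probably carry a continuous confidence score rather than three buckets, but discrete tiers are easier for Deckard to read at a glance.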

Flat screen or volumetric projection?

Modern CGI loves big volumetric projections. (e.g. it was the central novum of last year’s Fritz winner, Spider-Man: Far From Home.) And it would be a wonderful juxtaposition to see Deckard in a holodeck-like recreation of Leon’s apartment, with all the visual treatments described above.

But…

Also seriously who wants a lamp embedded in a headrest?

…that would kind of spoil the mood of the scene. This isn’t just about Deckard’s finding a clue, we also see a little about who he is and what his life is like. We see the smoky apartment. We see the drab couch. We see the stack of old detective machines. We see the neon lights and annoying advertising lights swinging back and forth across his windows. Immersing him in a big volumetric projection would lose all this atmospheric stuff, and so I’d recommend keeping it either a small contained VP, like we saw in Minority Report, or just keep it a small flat screen.

OK, so we have an idea about how the display would (and shouldn’t) look. Let’s move on to talk about the inputs.

Inputs

To talk about inputs, then, we have to return to a favorite topic of mine, and that is the level of agency we want for the interaction. In short, we need to decide how much work the machine is doing. Is the machine just a manual tool that Deckard has to manipulate to get it to do anything? Or does it actively assist him? Or, lastly, can it even do the job while his attention is on something else—that is, can it act as an agent on his behalf? Sophisticated tools can be a blend of these modes, but for now, let’s look at them individually.

Manual Tool

This is how the photo inspector works in Blade Runner. It can do things, but Deckard has to tell it exactly what to do. But we can still improve it in this mode.

We could give him well-mapped physical controls, like a remote control for this conceptual drone. Flight controls wind up being a recurring topic on this blog (and even came up already in the Blade Runner reviews with the Spinners) so I could go on about how best to do that, but I think that a handheld controller would ruin the feel of this scene, like Deckard was sitting down to play a video game rather than do off-hours detective work.

Special edition made possible by our sponsor, Tom Nook.
(I hope we can pay this loan back.)

Similarly, we could talk about a gestural interface, using some of the synecdochic techniques we’ve seen before in Ghost in the Shell. But again, this would spoil the feel of the scene, having him look more like John Anderton in front of a tiny-TV version of Minority Report’s famous crime scrubber.

One of the things that gives this scene its emotional texture is that Deckard is drinking a glass of whiskey while doing his detective homework. It shows how low he feels. Throwing one back is clearly part of his evening routine, so much a habit that he does it despite being preoccupied about Leon’s case. How can we keep him on the couch, with his hand on the lead crystal whiskey glass, and still investigating the photo? Can he use it to investigate the photo?

Here I recommend a bit of ad-hoc tangible user interface. I first backworlded this for The Star Wars Holiday Special, but I think it could work here, too. Imagine that the photo inspector has a high-resolution camera on it, and the interface allows Deckard to declare any object that he wants as a control object. After the declaration, the camera tracks the object against a surface, using the changes to that object to control the virtual camera.

In the scene, Deckard can declare the whiskey glass as his control object, and the arm of his couch as the control surface. Of course the virtual space he’s in is bigger than the couch arm, but it could work like a mouse and a mousepad. He can just pick it up and set it back down again to extend motion.

This scheme takes into account all movement except vertical lift and drop. This could be a gesture or a spoken command (see below).

Going with this interaction model means Deckard can use the whiskey glass, allowing the scene to keep its texture and feel. He can still drink and get his detective on.
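To make the mouse-and-mousepad idea concrete, here is a minimal sketch of the tracking loop. Everything in it is an assumption for illustration: the `Pose` format the inspector's camera would report, the `gain`, and the class names. It covers sliding and turning the control object, and treats a lift as a clutch so the set-down doesn't register as motion. Vertical movement is left out, as it is in the scheme above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pose:
    """One camera observation of the control object (hypothetical format)."""
    x: float          # position along the control surface
    y: float          # position across the control surface
    angle: float      # rotation about the vertical axis, in degrees
    on_surface: bool  # False while the object is lifted


@dataclass
class VirtualCamera:
    x: float = 0.0
    y: float = 0.0
    yaw: float = 0.0  # degrees, kept in [0, 360)


class ControlObjectTracker:
    def __init__(self, camera: VirtualCamera, gain: float = 10.0):
        self.camera = camera
        self.gain = gain  # scene units per surface unit (assumed)
        self._last: Optional[Pose] = None

    def update(self, pose: Pose) -> None:
        if not pose.on_surface:
            # Lifted: forget the last pose so the set-down is not read as motion.
            self._last = None
            return
        if self._last is not None:
            # Apply only the change since the previous observation.
            # Translation is world-aligned here to keep the sketch short.
            self.camera.x += (pose.x - self._last.x) * self.gain
            self.camera.y += (pose.y - self._last.y) * self.gain
            turn = pose.angle - self._last.angle
            self.camera.yaw = (self.camera.yaw + turn) % 360
        self._last = pose
```

Because the tracker only ever applies deltas, Deckard can pick the glass up for a drink, set it down anywhere on the couch arm, and carry on from where the virtual camera already was.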

Assistant Tool

Indirect manipulation is helpful for when Deckard doesn’t know what he’s looking for. He can look around, and get close to things to inspect them. But when he knows what he’s looking for, he shouldn’t have to go find it. He should be able to just ask for it, and have the photo inspector show it to him. This requires that we presume some AI. And even though Blade Runner clearly includes General AI, let’s presume that that kind of AI has to be housed in a human-like replicant, and can’t be squeezed into this device. Instead, let’s just extend the capabilities of Narrow AI.

Some of this will be navigational and specific, “Zoom to that mirror in the background,” for instance, or, “Reset the orientation.” Some will be more abstract and content-specific, e.g. “Head to the kitchen” or “Get close to that red thing.” If it had gaze detection, he could even indicate a location by looking at it. “Get close to that red thing there,” for example, while looking at the red thing. Given the 3D inferrer nature of this speculative device, he might also want to trace the provenance of an inference, as in, “How do we know this chair is here?” This implies natural language generation as well as understanding.

There’s nothing stopping him from using the same general commands heard in the movie, but I doubt anyone would want to use those when they have commands like this and the object-on-hand controller available.

Ideally Deckard would have some general search capabilities as well, to ask questions and test ideas. “Where were these things purchased?” or subsequently, “Is there video footage from the stores where he purchased them?” or even, “What does that look like to you?” (The correct answer would be, “Well that looks like the mirror from the Arnolfini portrait, Ridley…I mean…Rick*”) It can do pattern recognition and provide as much extra information as it has access to, just like Google Lens or IBM Watson image recognition does.

*Left: The convex mirror in Leon’s 21st century apartment.
Right: The convex mirror in Arnolfini’s 15th century apartment

Finally, he should be able to ask after simple facts to see if the inspector knows or can find them. For example, “How many people are in the scene?”

All of this still requires that Deckard initiate the action, and we can augment it further with a little agentive thinking.

Agentive Tool

To think in terms of agents is to ask, “What can the system do for the user, but not requiring the user’s attention?” (I wrote a book about it if you want to know more.) Here, the AI should be working alongside Deckard. Not just building the inferences and cataloguing observations, but doing anomaly detection on the whole scene as it goes. Some of it is going to be pointless, like “Be aware the butter knife is from IKEA, while the rest of the flatware is Christofle Lagerfeld. Something’s not right, here.” But some of it Deckard will find useful. It would probably be up to Deckard to review summaries and decide which were worth further investigation.
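The butter knife observation is the kind of thing a very simple check could produce. As a hedged sketch, assume the inspector has already recognized objects and tagged each with a group and a brand. The `Item` record and `brand_outliers` function are made up for illustration, and real anomaly detection would be far broader than brand mismatches.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Item(NamedTuple):
    name: str
    group: str  # e.g. "flatware"
    brand: str


def brand_outliers(items: list[Item]) -> list[str]:
    """Flag items whose brand differs from the clear majority of their group."""
    by_group: dict[str, list[Item]] = defaultdict(list)
    for item in items:
        by_group[item.group].append(item)

    notes = []
    for group, members in by_group.items():
        counts = Counter(m.brand for m in members)
        brand, n = counts.most_common(1)[0]
        # Only call something odd if one brand clearly dominates the group.
        if n <= len(members) / 2:
            continue
        for m in members:
            if m.brand != brand:
                notes.append(
                    f"{m.name} is {m.brand}; the rest of the {group} is {brand}"
                )
    return notes
```

Notes like these would go into the summary list for Deckard to skim, rather than interrupting him.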

It should also be able to help him with his goals. For example, the police had Zhora’s picture on file. (And her portrait even rotates in the dossier we see at the beginning, so it knows what she looks like in 3D for very sophisticated pattern matching.) The moment the agent—while it was reverse ray tracing the scene and reconstructing the inferred space—detects any faces, it should run the face through a most wanted list, and specifically Deckard’s case files. It shouldn’t wait for him to find it. That again poses some challenges to the script. How do we keep Deckard the hero when the tech can and should have found Zhora seconds after being shown the image? It’s a new challenge for writers, but it’s becoming increasingly important for believability.

Though I’ve never figured out why she has a snake tattoo here (and it seems really important to the plot) but then when Deckard finally meets her, it has disappeared.
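One way to square proactive matching with keeping Deckard the hero is an alert threshold: the agent speaks up on its own only when a match is strong, but will report its best guess when asked. A minimal sketch follows, assuming some face recognizer has already produced similarity scores against the case files. The function names and the score dictionary are hypothetical, and the 0.66 value is borrowed from the threshold the inspector cites in the rewritten scene that follows.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Match:
    name: str
    score: float  # similarity in [0, 1]


ALERT_THRESHOLD = 0.66  # the threshold the inspector cites in the scene


def best_match(scores: dict[str, float]) -> Optional[Match]:
    """Return the closest case-file face, whatever its score."""
    if not scores:
        return None
    name = max(scores, key=lambda n: scores[n])
    return Match(name, scores[name])


def proactive_alert(
    scores: dict[str, float], threshold: float = ALERT_THRESHOLD
) -> Optional[Match]:
    """Interrupt the user only when the best match clears the threshold."""
    match = best_match(scores)
    if match is not None and match.score >= threshold:
        return match
    return None
```

With a 0.63 score for Zhora, `proactive_alert` stays quiet while `best_match` still answers the direct question, which is the gap Deckard fills in the scene.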

Scene

Interior. Deckard’s apartment. Night.
Deckard grabs a bottle of whiskey, a glass, and the photo from Leon’s apartment. He sits on his couch and places the photo on the coffee table.
Deckard
Photo inspector.
The machine on top of a cluttered end table comes to life.
Deckard
Let’s look at this.
He points to the photo. A thin line of light sweeps across the image. The scanned image appears on the screen, pulled in a bit from the edges. A label reads, “Extending scene,” and we see wireframe representations of the apartment outside the frame begin to take shape. A small list of anomalies begins to appear to the left. Deckard pours a few fingers of whiskey into the glass. He takes a drink before putting the glass on the arm of his couch. Small projected graphics appear on the arm facing the inspector.
Deckard
OK. Anyone hiding? Moving?
Photo inspector
No and no.
Deckard
Zoom to that arm and pin to the face.
He turns the glass on the couch arm counterclockwise, and the “drone” revolves around to show Leon’s face, with the shadowy parts rendered in blue.
Deckard
What’s the confidence?
Photo inspector
95.
On the side of the screen the inspector overlays Leon’s police profile.
Deckard
Unpin.
Deckard lifts his glass to take a drink. He moves from the couch to the floor to stare more intently and places his drink on the coffee table.
Deckard
New surface.
He turns the glass clockwise. The camera turns and he sees into a bedroom.
Deckard
How do we have this much inference?
Photo inspector
The convex mirror in the hall…
Deckard
Wait. Is that a foot? You said no one was hiding.
Photo inspector
The individual is not hiding. They appear to be sleeping.
Deckard rolls his eyes.
Deckard
Zoom to the face and pin.
The view zooms to the face, but the camera is level with her chin, making it hard to make out the face. Deckard tips the glass forward and the camera rises up to focus on a blue, wireframed face.
Deckard
That look like Zhora to you?
The inspector overlays her police file.
Photo inspector
63% of it does.
Deckard
Why didn’t you say so?
Photo inspector
My threshold is set to 66%.
Deckard
Give me a hard copy right there.
He raises his glass and finishes his drink.

This scene keeps the texture and tone of the original, and camps on the limitations of Narrow AI to let Deckard be the hero. And doesn’t have him programming a virtual Big Trak.

The Court of Idiots

31 Oct 2018 by Christopher Noessel

It’s Halloween, as if the news of the past week were not scary enough. Pipe bombs to Democratic leaders. The largest massacre of Jewish people on American soil in history. The murder of two black senior citizens by a white supremacist in Kentucky. Now let’s add to it with this nightmare scene from Idiocracy. Full disclosure: We’re covering technology as old as civilization here, so there won’t be any screen interfaces.

Joe is wheeled into the courtroom in a cage. There is a large gallery there, all of whom are booing him. One throws his milkshake at the accused. Others throw trash. The narrator says, “Joe was arrested for not paying his hospital bill and not having his IPP tattoo. He would soon discover that in the future, Justice was not only blind, but had become rather retarded as well.”

Joe is let out of his cage. The judge, identified by his name plate as The Honorable Hector “The Hangman,” stands at his bench in a spotlight in front of a wall of logos, grinning in anticipation at a new victim. He slams a massive gavel and shouts at the booing crowd, “Listen up! Now. I’m fixin’ to commensurate this trial here. [All of this is sic.] We gon’ see if we can’t come up with a verdict up in here. Now. Since y’all say y’ain’t got no money, we have proprietarily obtained you one of them court-appointed lawyers. So, put your hands together to give it up for Frito Pendejo!”

Frito, wearing a long-sleeve t-shirt with “ATTORNEY AT LAW” running down the sleeve, sits and looks at a paper on the counsel table, saying “Says here you, uh, robbed a hospital?! Why’d you do that?” Joe says, “But I’m not guilty!” Frito replies, “That’s not what the other lawyer said.”

When the trial officially starts, the judge says, “Shut up. Shut up! Now, prosecutor. Why you think he done it?” The prosecutor stands up and says, “K. Number one, your honor. Just look at him.” Most everyone in the courtroom, including Frito, laughs at this. Frito stands up and says, “He talks like a fag, too!” When more laughter dies down, the prosecutor continues, saying, “We’ve got all this, like, evidence, of how, like, this guy didn’t even pay at the hospital and I heard that he doesn’t even have his tattoo.” There are collective gasps. He continues, “I know! And I’m all, ‘You gotta be shitting me.’ But check this out, man, judge should be like, ‘Guilty!’ Peace.” The gallery erupts with applause and cheers. There is one guy wearing a helmet with a camera mounted to the top and he takes it all in dumbly.

When Frito stands to raise an objection, he says, “Your honor, I object…that this guy, also broke my apartment and shit.” There are gasps from the gallery, and Frito feels emboldened. “Yeah! And you know what else, I object that he’s not going to have any money to pay me after he pays back all the money he stole from the hospital!” This last bit is addressed to the gallery. They shout in anger. Frito finishes, saying, “And I object. I OBJECT that he interrupted me while I was watching, OW, MY BALLS! THAT IS NOT OK! I REST MY CASE.”

Joe stands and says, “Your Honor, I’m pretty sure we have a mistrial, here, sir.” Hector looks confused at this statement. Frito gets hostile and says, “I’m going to mistrial my foot up your ass if you don’t shut up!”

Joe ignores him and says, “Please listen!” The prosecuting attorney simply mocks him, “Please listen!” Hector laughs. The lawyers high five each other.

The narrator says, “Joe stated his case, logically and passionately. But his perceived effeminate voice only drew big gales of stupid laughter. Without adequate legal representation, Joe was given a stiff sentence.”

So…this nightmare

For most Americans, the drafting and adjudication of laws feels like something that happens “out there.” But upholding the law is something most citizens directly experience at some time in their lives, through jury duty or being part of a court case. So it may seem like familiar, everyday stuff.

Wait. I know this.

But take that less trodden path to ask why it is the way it is, and you’ll eventually find yourself at the foundations of civilization and in the throes of philosophy. A key premise and promise of civilized society is having a fair and rational place for citizens to air grievances, and institutions that make and enforce just laws. In practice it’s far from perfect, but I’d bet most folks agree it’s better than the alternatives of lawlessness, personal violence, revenge, tribal feuds, and vigilantism.

Laws and courts are institutions that are so old and foundational, it’s hard to remember that:

They were, in fact, invented. There was a time before them.
There were reasons for the way they were designed.
Their design, like all designs, isn’t perfect, and involves some trade-offs.
They evolve over time.
There are alternate designs to consider.

Idiocracy illustrates how important these norms are, and how tragic it can be when those norms are lost, and everyone involved is a fucking idiot. It’s not exactly monster rampage, body horror, or torture porn, but this scene is as scary a thing as you could ask for on Halloween.

Violence as a means

Frito threatens violence a lot. So do a lot of the people in Idiocracy. I’m not even sure if they mean it, but they think that threatening or beating or shooting someone into silence is a fine way to disagree. One of the key promises of civilization is to undo that might-makes-right bullshit that kept generations of peasants suffering while an aristocracy lived the high life peeing in hallways, flubbing courtly love, and stuffing birds into each other like meaty nesting-dolls. We’ve moved on.

Skepticism

It’s funny-terrifying that the case made against Joe is so stupid. There are crass appeals to emotion. Prosecution says that there’s “all this, like, evidence” but does not actually share any, like, evidence. Frito riles the gallery up with insults (“He talks like a fag, too!”) and personal grievances that aren’t germane to the case. (Ow My Balls! should not be evidence in, well, anything.) It’s emotional theater and the audience eats it up, because what they’re there for is hits to the amygdala bong: the emotional highs and lows that you might get from a sports show. People scream in cheers when their confirmation bias is rewarded, or in fury when there’s a bit of cognitive dissonance. There’s probably nothing like a Fair Witness left on the planet, and if Joe’s intelligence didn’t get him out of jail, he would wind up suffering in prison for no reason at all. The lack of skepticism, of applying doubt to all things (especially the things that thrill you emotionally), causes unnecessary suffering. Sure, it’s an unjust universe, but civilization was invented to try and hold a little light against that darkness. The citizenry of Idiocracy are gullible and blithely self-serving to a cruel fault.

Impartiality

The system that pits prosecution against defense is called “adversarial,” since each counsel presents an opposing case to an impartial judge or jury, who is presumed competent to sort through the competing claims to come to the truth. It’s contrasted with an inquisitorial (or nonadversarial) system where the judges run the line of inquiry. Congressional hearings, for example, are much more of an inquisitorial system. Congresspeople ask the questions, rather than the lawyers.

The adversarial system depends on the impartiality of the judge and jury. If they presume guilt or innocence from the start, why have the trial at all? (This is sometimes called a kangaroo court, where procedure and ethics are ignored, and the outcome is a foregone conclusion.) Only dispassionate, neutral participants can look past their feelers to find a fair and impartial truth. You need people capable of impartial, critical thinking to look past the tricks and their own cognitive biases. That’s out the window in Idiocracy.

Joe is presumed guilty by literally everyone involved. It’s not wholly their fault. Joe is brought in wearing prison fatigues. He is wheeled in a cage. On the surface, he looks guilty. Humans have an availability bias, wherein easily recalled information is believed to be true and representative, and for very stupid people, that can be strictly what they see. And Joe looks guilty. Everyone around me is shouting that he’s guilty. So, the idiot thinks, he must be guilty.

Posers, surface evidence, and meaning

One of the worst things about the scene—and this plays out in the movie generally—is how people’s status is based on stupid, surface markers. For instance: Why do they accept that the judge is smart? Well, he uses fancy, “faggy” words like “commensurate” (It’s a malaprop. He means commence. But the Idiocrats are too stupid to know this.) and “proprietarily.” (I think he means “properly,” but it’s marvelously hard to tell.) Investigating this moves us very quickly into the semiosis treadmill, whereby a sign comes slowly to become the signified, even when it stupidly contradicts the original thing that was signified.

How do courts protect against idiocy?

They don’t always. But generally, there are lots of checks against incompetence. Anyone can request a mistrial if it’s shown that counsel, a judge, or a jury is incompetent, prejudiced, or doesn’t follow the rules. Lawyers can be reviewed and disbarred. Remedy for judges varies by state, but judges can be impeached, or undergo judicial review and be removed from their office. Their cases can be overturned by higher courts. (Yes, even that one.)

But notably all these presume that the incompetence is abnormal, and that others are competent to review and judge their competence. When everyone either gets so stupid or so corrupt that they have no interest in checking each other for fairness, the system just collapses.

Can design do anything?

This blog is primarily targeted at people who work in technology and love sci-fi. (Like I do.) A natural question is: Can design do anything to help inoculate courts against Idiocracy? And normally, I try to offer some hopeful answer to these questions, but this one has me stumped. What design interventions could we impose to increase skepticism? Impartiality? Critical thinking? Understanding? Seriously, if you have something, I’d love to hear it, because as of right now, I’ve got nothing.

How’s that for a Halloween scare?

Trick or Treat

Any Trump acolyte who has gleefully shouted “Lock her up! Lock her up!” at one of their farrowing-crate rallies, despite Clinton’s exoneration by the FBI, despite Trump’s own ongoing litany of scandals (including that ongoing issue of security), is guilty of this same kind of mob mentality as these morons in Idiocracy. It is all raw hate without reason. Hits from the tribalism bong. It’s not just that the mob hasn’t heard reason, it’s not interested in reason, only in justification. Only in thinking of political conversations as hobo fights. Did our guy land a hit? Insult? Hur dur, yarp.

It’s pretty soul crushing. So let’s have some spooky, escapist fun tonight for self-care. But as of tomorrow we’ll have 6 days and we should hit it hard.
