Back to Blade Runner. I mean, the pandemic is still pandemicking, but maybe this will be a nice distraction while you shelter in place. Because you’re smart, sheltering in place as much as you can, and not injecting disinfectants. And, like so many other technologies in this film, this will take a while to deconstruct, critique, and reimagine.
Doing his detective work, Deckard retrieves a set of snapshots from Leon’s hotel room, and he brings them home. Something in the one pictured above catches his eye, and he wants to investigate it in greater detail. He takes the photograph and inserts it in a black device he keeps in his living room.
Note: I’ll try to describe this interaction in text, but it is much easier to conceptualize after viewing it. Owing to copyright restrictions, I cannot upload this length of video with the original audio, so I have added pre-rendered closed captions to it, below. All dialogue in the clip is Deckard.
He inserts the snapshot into a horizontal slit and turns the machine on. A thin, horizontal orange line glows on the left side of the front panel. A series of seemingly random-length orange lines begin to chase one another in a single-row space that stretches across the remainder of the panel, and continue to do so throughout Deckard’s use of it. (Imagine a news ticker, running backwards, where the “headlines” are glowing amber lines.) This seems like an absolutely pointless distraction for Deckard, putting high-contrast motion in his peripheral vision, where it fights for attention with the actual, interesting content down below.
After a second, the screen reveals a blue grid, behind which the scan of the snapshot appears. He stares at the image in the grid for a moment, and speaks an instruction: “Enhance 224 to 176.”
In response, three data points appear overlaying the image at the bottom of the screen. Each has a two-letter label and a four-digit number, e.g. “ZM 0000 NS 0000 EW 0000.” The NS and EW values (presumably North-South and East-West coordinates, respectively) immediately update to read, “ZM 0000 NS 0197 EW 0334.” After updating the numbers, the screen displays crosshairs, which target a single rectangle in the grid.
A new rectangle then zooms in from the edges to match the targeted rectangle, as the ZM number (presumably zoom, or magnification) increases. When the animated rectangle reaches the targeted rectangle, its outline blinks yellow a few times. Then the contents of the rectangle are enlarged to fill the screen in a series of steps, each punctuated with a sound like a mechanical camera aperture. The enlargement is perfectly resolved. The overlay disappears until the next set of spoken commands. The total response time, from Deckard’s issuing the command to the device’s showing the final enlarged image, is about 11 seconds.
Deckard studies the new image for a while before issuing another command. This time he says, “Enhance.” The image enlarges in similar clacking steps until he tells it, “Stop.”
Other instructions he is heard to give include “move in, pull out, track right, center in, pull back, center, and pan right.” Some are discrete instructions, such as “Track 45 right,” while others are relative commands that the system obeys until told to stop, such as “Go right.”
Using such commands he isolates part of the image that reveals an important clue, and he speaks the instruction, “Give me a hard copy right there.” The machine prints the image, which Deckard uses to help find the replicant pictured.
I’d like to point out one bit of sophistication before the critique. Deckard can issue a command with or without a parameter, and the inspector knows what to do. For example, “Track 45 right” versus “Track right.” Without the parameter, it will just keep doing the thing until told to stop. That lets Deckard issue the same basic command both when he knows exactly where he wants to look and when he doesn’t know exactly what he’s looking for. That’s a nice feature of the language design.
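To make that concrete, here’s a minimal sketch of how such a command grammar might be parsed. Everything here is speculative (the film never specifies a grammar, and the verb list is just a subset of what Deckard says); the point is only that one pattern can treat the magnitude as optional, with its absence meaning “keep going until told to stop.”

```python
# Hypothetical sketch of the Esper command grammar: each verb takes an
# optional magnitude. With a magnitude it runs once; without one it runs
# continuously until a "stop" command arrives.
import re

COMMAND = re.compile(
    r"(track|pan|move|pull|zoom)\s*(\d+)?\s*(left|right|in|out|back)?", re.I
)

def parse(utterance: str):
    """Return (verb, magnitude, direction); magnitude None means 'until stop'."""
    if utterance.strip().lower() == "stop":
        return ("stop", None, None)
    m = COMMAND.match(utterance.strip())
    if not m:
        return None
    verb, magnitude, direction = m.groups()
    return (verb.lower(), int(magnitude) if magnitude else None, direction)

print(parse("Track 45 right"))  # ('track', 45, 'right')   -> discrete move
print(parse("Track right"))     # ('track', None, 'right') -> continuous until "Stop"
```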
But still, asking him to provide step-by-step instructions in this clunky way feels like some high-tech Big Trak. (I tried to find a reference that was as old as the film.) And that’s not all…
Some critiques of the device as it is
- Can I go back and mention that amber distracto-light? Because it’s distracting. And pointless. I’m not mad. I’m just disappointed.
- It sure would be nice if any of the numbers on screen made sense, or had any bearing on the numbers Deckard speaks, at any time during the interaction. For instance, the initial zoom (I checked in Photoshop) is around 304%, which is neither the 224 nor the 176 that Deckard speaks.
- It might be that each square has a number, and he simply has to name the two squares at the corners of the zoom he wants, letting the machine find the extents. But where is the labeling? Did he have to memorize an address for each pixel? How does that work at arbitrary levels of zoom?
- And if he’s memorized it, why show the overlay at all?
- Why the seizure-inducing flashing in the transition sequences? Sure, I get that lots of technologies have unfortunate effects when constrained by mechanics, but this is digital.
- Why is the printed picture so unlike the still image where he asks for a hard copy?
- Gaze at the reflection in Ford’s hazel, hazel eyes, and it’s clear he’s playing Missile Command, rather than paying attention to this interface at all. (OK, that’s the filmmaker’s issue, not a part of the interface, but still, come on.)
How might it be improved for 1982?
So if 1982 Ridley Scott was telling me in post that we couldn’t reshoot Harrison Ford, and we had to make it just work with what we had, here’s what I’d do…
Squash the grid so the cells match the 4:3 ratio of the NTSC screen. Overlay the address of each cell, while highlighting column and row identifiers at the edges. Have the first cell’s outline illuminate as he speaks it, and have the outline expand to encompass the second named cell. Then zoom, removing the cell labels during the transition. When at anything other than full view, display a map across four cells that shows the zoom visually in the context of the whole.
With this interface, the structure of the existing conversation makes more sense. When Deckard said, “Enhance 203 to 608” the thing would zoom in on the mirror, and the small map would confirm.
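For the curious, here’s a sketch of the cell-addressing arithmetic this redesign assumes: cells numbered row-major across the squashed grid, so that two spoken numbers name the corners of the zoom rectangle. The grid dimensions are invented for illustration; the film gives us nothing to go on.

```python
# Hypothetical: cells numbered row-major, so "Enhance 203 to 608" names two
# cells whose bounding box becomes the zoom rectangle.
GRID_COLS, GRID_ROWS = 40, 30  # invented cell counts for illustration

def cell_to_col_row(cell: int):
    return cell % GRID_COLS, cell // GRID_COLS

def zoom_rect(first: int, second: int):
    """Bounding box (in cells) spanned by the two spoken cell addresses."""
    (c1, r1), (c2, r2) = cell_to_col_row(first), cell_to_col_row(second)
    return (min(c1, c2), min(r1, r2), max(c1, c2) + 1, max(r1, r2) + 1)

print(zoom_rect(203, 608))  # cell 203 = (3, 5), cell 608 = (8, 15) -> (3, 5, 9, 16)
```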
The numbers wouldn’t match up, but it’s pretty obvious from the final cut that Scott didn’t care about that (or, more charitably, ran out of time). Anyway I would be doing this under protest, because I would argue this interaction needs to be fixed in the script.
How might it be improved for 2020?
What’s really nifty about this technology is that it’s not just a photograph. Look close in the scene, and Deckard isn’t just doing CSI Enhance! commands (or, to be less mocking, AI upscaling). He’s using the photo inspector to look around corners and at objects that are reconstructed from the smallest reflections. So we can think of the interaction like he’s controlling a drone through a 3D still life, looking for a lead to help him further the case.
With that in mind, let’s talk about the display.
To redesign it, we have to decide at a foundational level how we think this works, because that decision will color what the display looks like. Is this all data that’s captured from some crazy 3D camera and available in the image? Or is it being inferred from details in the 2D image? Let’s call the first the 3D capture, and the second the 3D inference.
If we decide this is a 3D capture, then all the data Deckard observes through the machine has the same degree of confidence. If, however, we decide this is a 3D inference, Deckard needs to treat the inferred data with more skepticism than the data the camera directly captured. The 3D inference is the harder problem, and raises issues that we must deal with in modern AI, so let’s just say that’s the way this speculative technology works.
The first thing the display should do is make it clear what is observed and what is inferred. How you do this is partly a matter of visual design and style, but partly a matter of diegetic logic. The first pass would be to render everything in the camera frustum photo-realistically, and then render everything outside of it in a way that signals its confidence level. The comp below illustrates one way this might be done.
- In the comp, Deckard has turned the “drone” from the “actual photo,” seen off to the right, toward the inferred space on the left. The monochrome color treatment provides that first high-confidence signal.
- In the scene, the primary inference would come from reading the reflections in the disco ball overhead lamp, maybe augmented with plans for the apartment that could be found online, or maybe purchase receipts for appliances, etc. Everything it can reconstruct from the reflection and high-confidence sources has solid black lines, a second-level signal.
- The smaller knickknacks that are out of the reflection of the disco ball, and implied from other, less reflective surfaces, are rendered without the black lines and blurred. This provides a signal that the algorithm has a very low confidence in its inference.
This is just one (not very visually interesting) way to handle it, but should illustrate that, to be believable, the photo inspector shouldn’t have a single rendering style outside the frustum. It would need something akin to these levels to help Deckard instantly recognize how much he should trust what he’s seeing.
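If you wanted to pseudocode the idea, it might look like the sketch below. The thresholds, attribute names, and function are all invented; the point is only that render style is a function of provenance and confidence, not a single look.

```python
# A sketch of the rendering-style logic described above. All values invented.
from dataclasses import dataclass

@dataclass
class RenderStyle:
    photoreal: bool  # True only for pixels the camera actually captured
    outline: bool    # solid black lines signal high-confidence inference
    blur: float      # heavier blur signals lower confidence

def style_for(in_frustum: bool, confidence: float) -> RenderStyle:
    if in_frustum:
        return RenderStyle(photoreal=True, outline=False, blur=0.0)
    if confidence >= 0.8:  # e.g. reconstructed from the disco-ball reflection
        return RenderStyle(photoreal=False, outline=True, blur=0.0)
    # weak inference (knickknacks implied from dull surfaces): no lines, blurred
    return RenderStyle(photoreal=False, outline=False, blur=1.0 - confidence)

print(style_for(in_frustum=False, confidence=0.35))
# RenderStyle(photoreal=False, outline=False, blur=0.65)
```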
Flat screen or volumetric projection?
Modern CGI loves big volumetric projections. (e.g. it was the central novum of last year’s Fritz winner, Spider-Man: Far From Home.) And it would be a wonderful juxtaposition to see Deckard in a holodeck-like recreation of Leon’s apartment, with all the visual treatments described above.
…that would kind of spoil the mood of the scene. This isn’t just about Deckard’s finding a clue; we also see a little about who he is and what his life is like. We see the smoky apartment. We see the drab couch. We see the stack of old detective machines. We see the neon lights and annoying advertising lights swinging back and forth across his windows. Immersing him in a big volumetric projection would lose all this atmospheric stuff, so I’d recommend keeping it either a small, contained VP, like we saw in Minority Report, or just a small flat screen.
OK, so now that we have an idea of how the display should (and shouldn’t) look, let’s move on to the inputs.
To talk about inputs, then, we have to return to a favorite topic of mine, and that is the level of agency we want for the interaction. In short, we need to decide how much work the machine is doing. Is the machine just a manual tool that Deckard has to manipulate to get it to do anything? Or does it actively assist him? Or, lastly, can it even do the job while his attention is on something else—that is, can it act as an agent on his behalf? Sophisticated tools can be a blend of these modes, but for now, let’s look at them individually.
A manual tool is how the photo inspector works in Blade Runner: it can do things, but Deckard has to tell it exactly what to do. Still, we can improve it in this mode.
We could give him well-mapped physical controls, like a remote control for this conceptual drone. Flight controls wind up being a recurring topic on this blog (and even came up already in the Blade Runner reviews with the Spinners) so I could go on about how best to do that, but I think that a handheld controller would ruin the feel of this scene, like Deckard was sitting down to play a video game rather than do off-hours detective work.
Similarly, we could talk about a gestural interface, using some of the synecdochic techniques we’ve seen before in Ghost in the Shell. But again, this would spoil the feel of the scene, having him look more like John Anderton in front of a tiny-TV version of Minority Report’s famous crime scrubber.
One of the things that gives this scene its emotional texture is that Deckard is drinking a glass of whiskey while doing his detective homework. It shows how low he feels. Throwing one back is clearly part of his evening routine, so much a habit that he does it despite being preoccupied with Leon’s case. How can we keep him on the couch, with his hand on the lead crystal whiskey glass, and still investigating the photo? Can he use the glass itself to investigate the photo?
Here I recommend a bit of ad-hoc tangible user interface. I first backworlded this for The Star Wars Holiday Special, but I think it could work here, too. Imagine that the photo inspector has a high-resolution camera on it, and the interface allows Deckard to declare any object that he wants as a control object. After the declaration, the camera tracks the object against a surface, using the changes to that object to control the virtual camera.
In the scene, Deckard can declare the whiskey glass as his control object, and the arm of his couch as the control surface. Of course the virtual space he’s in is bigger than the couch arm, but it could work like a mouse and a mousepad. He can just pick it up and set it back down again to extend motion.
This scheme accounts for all movement except vertical lift and drop, which could be handled by a gesture or a spoken command (see below).
Going with this interaction model means Deckard can use the whiskey glass, allowing the scene to keep its texture and feel. He can still drink and get his detective on.
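A sketch of how that tracking might translate into camera movement, mouse-style, is below. The pose format and scale factor are assumptions; lifting the glass clutches like lifting a mouse, turning it orbits the camera, and tipping it (as in the rewritten scene later) raises or lowers the view.

```python
# Sketch of the object-on-hand controller: frame-to-frame changes in the
# tracked glass's pose on the couch arm drive the virtual camera.
# Pose values and the scale factor are illustrative assumptions.

def camera_delta(prev_pose, curr_pose, on_surface: bool, scale=50.0):
    """prev/curr pose: (x, y, rotation_deg, tilt_deg) of the tracked glass."""
    if not on_surface:
        return (0.0, 0.0, 0.0, 0.0)  # lifted like a mouse: clutch, no motion
    dx, dy = curr_pose[0] - prev_pose[0], curr_pose[1] - prev_pose[1]
    d_yaw = curr_pose[2] - prev_pose[2]    # turning the glass orbits the camera
    d_pitch = curr_pose[3] - prev_pose[3]  # tipping the glass raises/lowers the view
    return (dx * scale, dy * scale, d_yaw, d_pitch)

# Deckard turns the glass 30 degrees counterclockwise without sliding it:
print(camera_delta((0, 0, 0, 0), (0, 0, -30, 0), on_surface=True))
# (0.0, 0.0, -30, 0)
```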
Indirect manipulation is helpful for when Deckard doesn’t know what he’s looking for. He can look around, and get close to things to inspect them. But when he knows what he’s looking for, he shouldn’t have to go find it. He should be able to just ask for it, and have the photo inspector show it to him. This requires that we presume some AI. And even though Blade Runner clearly includes General AI, let’s presume that that kind of AI has to be housed in a human-like replicant, and can’t be squeezed into this device. Instead, let’s just extend the capabilities of Narrow AI.
Some of this will be navigational and specific: “Zoom to that mirror in the background,” for instance, or “Reset the orientation.” Some will be more abstract and content-specific, e.g. “Head to the kitchen” or “Get close to that red thing.” If it had gaze detection, he could even indicate a location by looking at it: “Get close to that red thing there,” for example, while looking at the red thing. Given the 3D-inference nature of this speculative device, he might also want to trace the provenance of an inference, as in, “How do we know this chair is here?” This implies natural language generation as well as understanding.
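To sketch what the Narrow AI might normalize these utterances into, here’s one hypothetical intent schema. The kinds, fields, and gaze coordinates are all invented for illustration.

```python
# Hypothetical intent schema for the inspector's spoken commands.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Intent:
    kind: str                           # "navigate", "provenance", "search", ...
    target: Optional[str] = None        # "mirror", "kitchen", "that red thing"
    gaze_point: Optional[tuple] = None  # screen coords captured when he says "there"

utterances = {
    "Zoom to that mirror in the background": Intent("navigate", "mirror"),
    "Head to the kitchen": Intent("navigate", "kitchen"),
    "Get close to that red thing there": Intent("navigate", "red thing", gaze_point=(412, 210)),
    "How do we know this chair is here?": Intent("provenance", "chair"),
}
print(utterances["How do we know this chair is here?"])
```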
There’s nothing stopping him from using the same general commands heard in the movie, but I doubt anyone would want to use those when they have commands like this and the object-on-hand controller available.
Ideally Deckard would have some general search capabilities as well, to ask questions and test ideas. “Where were these things purchased?” or subsequently, “Is there video footage from the stores where he purchased them?” or even, “What does that look like to you?” (The correct answer would be, “Well that looks like the mirror from the Arnolfini portrait, Ridley…I mean…Rick*”) It can do pattern recognition and provide as much extra information as it has access to, just like Google Lens or IBM Watson image recognition does.
Finally, he should be able to ask after simple facts to see whether the inspector knows them or can find them out. For example, “How many people are in the scene?”
All of this still requires that Deckard initiate the action, and we can augment it further with a little agentive thinking.
To think in terms of agents is to ask, “What can the system do for the user, but not requiring the user’s attention?” (I wrote a book about it if you want to know more.) Here, the AI should be working alongside Deckard. Not just building the inferences and cataloguing observations, but doing anomaly detection on the whole scene as it goes. Some of it is going to be pointless, like “Be aware the butter knife is from IKEA, while the rest of the flatware is Christofle Lagerfeld. Something’s not right, here.” But some of it Deckard will find useful. It would probably be up to Deckard to review summaries and decide which were worth further investigation.
It should also be able to help him with his goals. For example, the police had Zhora’s picture on file. (And her portrait even rotates in the dossier we see at the beginning, so it knows what she looks like in 3D for very sophisticated pattern matching.) The moment the agent—while it was reverse ray tracing the scene and reconstructing the inferred space—detects any faces, it should run the face through a most wanted list, and specifically Deckard’s case files. It shouldn’t wait for him to find it. That again poses some challenges to the script. How do we keep Deckard the hero when the tech can and should have found Zhora seconds after being shown the image? It’s a new challenge for writers, but it’s becoming increasingly important for believability.
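Here’s a toy sketch of that agentive loop. The detector and matcher are stand-ins, and the 66% threshold is borrowed from the rewritten scene that follows; the point is that matching runs continuously and flags hits without being asked.

```python
# Minimal sketch of the agentive loop, with toy stand-ins for the real
# detectors: as the inferred space fills in, every detected face is scored
# against Deckard's case files and flagged when it clears a threshold.

def match_score(face, suspect) -> float:
    # stand-in for 3D pattern matching against a rotating dossier portrait
    return 0.63 if (face == "face_in_bedroom" and suspect == "Zhora") else 0.0

def agentive_pass(detected_faces, case_files, threshold=0.66):
    hits, misses = [], []
    for face in detected_faces:
        for suspect in case_files:
            score = match_score(face, suspect)
            if score >= threshold:
                hits.append((suspect, face, score))    # interrupt Deckard
            elif score > 0:
                misses.append((suspect, face, score))  # log for later review
    return hits, misses

print(agentive_pass(["face_on_couch", "face_in_bedroom"], ["Zhora", "Leon"]))
# ([], [('Zhora', 'face_in_bedroom', 0.63)]) -- under threshold, so no alert,
# which is exactly the failure Deckard complains about in the scene below.
```

With all of that in place, here’s how I’d rewrite the scene: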
INT. DECKARD’S APARTMENT – NIGHT

Deckard grabs a bottle of whiskey, a glass, and the photo from Leon’s apartment. He sits on his couch and places the photo on the coffee table.

DECKARD
Photo inspector.

The machine on top of a cluttered end table comes to life.

DECKARD
Let’s look at this.

He points to the photo. A thin line of light sweeps across the image. The scanned image appears on the screen, pulled in a bit from the edges. A label reads, “Extending scene,” and we see wireframe representations of the apartment outside the frame begin to take shape. A small list of anomalies begins to appear to the left. Deckard pours a few fingers of whiskey into the glass. He takes a drink before putting the glass on the arm of his couch. Small projected graphics appear on the arm facing the inspector.

DECKARD
OK. Anyone hiding? Moving?

PHOTO INSPECTOR
No and no.

DECKARD
Zoom to that arm and pin to the face.

He turns the glass on the couch arm counterclockwise, and the “drone” revolves around to show Leon’s face, with the shadowy parts rendered in blue.

DECKARD
What’s the confidence?

On the side of the screen, the photo inspector overlays Leon’s police profile.

Deckard lifts his glass to take a drink. He moves from the couch to the floor to stare more intently and places his drink on the coffee table.

DECKARD
New surface.

He turns the glass clockwise. The camera turns and he sees into a bedroom.

DECKARD
How do we have this much inference?

PHOTO INSPECTOR
The convex mirror in the hall…

DECKARD
Wait. Is that a foot? You said no one was hiding.

PHOTO INSPECTOR
The individual is not hiding. They appear to be sleeping.

Deckard rolls his eyes.

DECKARD
Zoom to the face and pin.

The view zooms to the face, but the camera is level with her chin, making her features hard to make out. Deckard tips the glass forward and the camera rises up to focus on a blue, wireframed face.

DECKARD
That look like Zhora to you?

The inspector overlays her police file.

PHOTO INSPECTOR
63% of it does.

DECKARD
Why didn’t you say so?

PHOTO INSPECTOR
My threshold is set to 66%.

DECKARD
Give me a hard copy right there.

He raises his glass and finishes his drink.
This scene keeps the texture and tone of the original, and leans on the limitations of Narrow AI to let Deckard be the hero. And it doesn’t have him programming a virtual Big Trak.
Comments

Perhaps the scanning orange lines are an indication that the device is listening for commands. Maybe it’s reacting to ambient sound in the room?
Back in the day I had the head canon that the ‘photo’ Deckard slid into the machine was something other than an emulsion-on-paper image. I think the black margins with the orange detail and the shiny back gave me that impression. It also looks a bit thicker than photo paper. Anyway, I thought that it might be a holographic data storage unit with a ‘still frame’ on the front to help identify the captured scene for the user. Maybe nowadays it would be done with a light-field camera, or an array of them, rather than holography. The upshot being that there is far more data present in this object than there would be in a traditional photoemulsion print.
As to why that would be a thing one would want? Well, with light-field cameras one can alter the focal plane and do other tricks after the fact. The ‘photo’ may also allow for lossless reproduction in the future, like a combo of print and negative. Or it may involve encryption/watermarking/whatever to prove that an image is original as taken by a particular camera and has not been tweaked in whatever Deckard’s version of Photoshop is. There isn’t a whiff of any of this in the movie, of course, other than that the photo looks different than the other, more traditional snapshots on Deckard’s piano.
I am actually a big fan of the concept of digital paralinguistics. Any system that is listening to you _should_ indicate that it’s being attentive. But the best way to do that is to display some signal of its understanding. The orange line is too stochastic, unrelated to Deckard’s speech. 🙁 Even if it was an “I’m listening” signal, it would be a pretty poor one.
I do think there are hints that the 3D data is in the photo, and all of what you list is solid evidence that points that way. The only thing it would change would be the need to distinguish inferred from captured data.
“Back in the day I had the head canon that the ‘photo’ Deckard slid into the machine was something other than an emulsion-on-paper image.”
Aye, this is what I told myself was going on. It’s a physical storage device, no moving parts, with the information permanently (or at least, as permanently as reasonably possible) encoded within it. A printed picture on the front serves as a reminder of what data is stored in it, making a modern 3D storage medium look like something people were already used to, much as we force modern devices to look and act like the old technology we already know.
What is a replicant but a new technology, made by humans, thence constrained to an existing shape and format we are already used to?
My reading has always been that the photo is _meant_ to be a kind of polaroid holograph, and that the Esper unit is scanning the holo at the edges to “look around the corners”. While the information is in the holograph, it’s not easily accessible to the human eye, and requires advanced optics and digital processing to get that information out. The graininess of Zhora’s picture is meant to indicate the low quality of the information in the holo at those angles.
I think the flatness of the photo on screen is just down to there being no cheap, quickly realised way to portray the 3D-ness onscreen for such a short shot.
Incidental to this, I distinctly remember seeing a shot of a “photo” of Tyrell’s niece and (presumably) her mother in the initial release (European theatrical release) of the movie, showing a very short sequence of them moving, like a printed GIF, if that makes sense. At some point in the many versions subsequent, it got edited out, but I take it as evidence that consumer-level photos got very sophisticated in the 2019 of Blade Runner.
A problem I see with the Agentive Tool approach is that anomaly detection is something that, in 2020 at least, humans are good at and AIs are not.
We’re descended from natural intelligences who have had millions of years being selected for noticing the unusual rustle in the grass, the different ripple pattern in the water. While I’m not in any way an AI expert, I believe that current AI systems need masses of training data before they can, say, pick out possible cancers in a CAT scan.
Unless escaped replicants are really predictable in what they wear and how they act, and have been photographed dozens or hundreds of times before, the photo inspector won’t know what it’s looking for and will overload Deckard with false positives.
I have no idea why WordPress waited for me to approve this post. You’re a regular commenter! Anyway, you’re 100% right. Anomaly detection is much better done by humans rather than ANI at the moment. But Blade Runner does have flying cars and AGI in the diegesis, so we can presume ANI algorithms that are much smarter than what’s possible today.
Zhora either covered up her snake tattoo with make-up or had it lasered out before taking the job at the strip-joint. It did, after all, make her easily identifiable.
I am pretty sure the Esper machine is meant to be completely analog, not digital. In 1982, digital cameras paled in comparison to analog. It was not at all clear that digital would ever reach the quality and speed of optical systems.
Being analog has a lot of advantages, but comes with some artifacts. The first is that you can hear the machine as it physically moves during the work; try listening to the scene without looking at it, and you’ll hear that this is the case.
Second, an analog machine can go out of alignment and require precision adjustment. Being analog, it could continue to perform its function, but in a degraded state. The mysterious “useless and distracting” flickering line is explained by out-of-whack analog equipment: it was a signal to the audience (at least in 1982) that Deckard is using old, probably obsolete, equipment that he cannot afford to replace.
I would love to see this topic revisited with your vision for how the interface can be understood and improved on given *analog* technology.
Hey Abernathy. Thanks for the comment! It’s definitely an important point to remind the audience that Deckard is a retired Blade Runner working with older equipment. All that is important to his character, the story, and the world.
Note though that in the original, he speaks to the machine, so there’s some level of AI speech recognition. And then there are Replicants. It’s definitely a world with AI. So there are strong examples from the film showing that the Blade Runner world is not entirely analog.
That said, didn’t the section titled “How might it be improved for 1982?” already solve it that way?
By the way, I agree with the commenter who said they presumed the 3-D data was stored on the image. While a hologram is probably what Ridley Scott was envisioning, the simple parallax vision shown in the movie could have even been something as simple as a Victorian stereoscope.