Unity Vision

One of my favorite challenges in sci-fi is showing how alien an AI mind is. (It’s part of what makes Ex Machina so compelling, and the end of Her, and why Data from Star Trek: The Next Generation always read like a dopey, Pinocchio-esque narrative tool. But a full comparison is for another post.) Given that screen sci-fi is a medium of light, sound, and language, I really enjoy when filmmakers try to show how they see, hear, and process this information differently.

In Colossus: The Forbin Project, when Unity begins issuing demands, one of its first instructions is to outfit the Computer Programming Office (CPO) with wall-mounted video cameras that it can access and control. Once this network of cameras is installed, Forbin gives Unity a tour of the space, introducing it visually and spatially to a place it has only known as an abstract node network. During this tour, the audience is also introduced to Unity’s point-of-view, which includes an overlay consisting of several parts.

The first part is a white overlay of rule lines and MICR characters that cluster around the edge of the frame. These graphics do not change throughout the film, whether Unity is looking at Forbin in the CPO, carefully watching for signs of betrayal in a missile silo, or creepily keeping an “eye” on Forbin and Markham’s date for signs of deception.

In these last two screen grabs, you see the second part of the Unity POV, which is a focus indicator. It appears behind the white bits: a translucent blue overlay with a circular hole revealing true color. The hole shows where Unity is focusing. This indicator appears only occasionally, and it can change size and position. It operates independently of the camera’s optical zoom, as we see in the shots below from Forbin’s tour.

A first augmented computer PoV? 🥇

When writing about computer PoVs before, I have cited Westworld as the first augmented one, since we see things from The Gunslinger’s infrared-vision eyes in the persistence-hunting sequences. (2001: A Space Odyssey came out two years before Colossus, but its computer PoV shots are not augmented.) And Westworld came out three years after Colossus, so until it is unseated, I’m going to regard this as the first augmented computer PoV in cinema. (Even the usually-encyclopedic TVtropes doesn’t list this one at the time of publishing.) It probably blew audiences’ minds as it was.

“Colossus, I am Forbin.”

And as such, we should cut it a little slack for not meeting our more literate modern standards. It was forging new territory. Even allowing for that, though, it’s still pretty bad.

Real world computer vision

Though computer vision is always advancing, it’s safe to say that an AI would be looking at the flat images and seeking to understand the salient bits per its goals. In the case of self-driving cars, that means finding the road, reading signs and road markers, identifying objects and plotting their trajectories in relation to the vehicle’s own trajectory in order to avoid collisions, and wayfinding to the destination, all compared against known models of signs, conveyances, laws, maps, and databases. Any of these are good fodder for sci-fi visualization.

Source: Medium article about the state of computer vision in Russia, 2017.
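To make the trajectory-plotting part concrete, here is a toy sketch of my own, not anything from a real autonomy stack: extrapolate an object’s path from two detections at constant velocity, then check its closest approach to the vehicle’s own planned path. All names and numbers are invented for illustration; real systems use Kalman filters, maps, and much richer models.

```python
# Illustrative only: a constant-velocity conflict check of the kind a
# self-driving stack might run after its detector has located an object.
import numpy as np

def predict_positions(p0, p1, dt, horizon_steps):
    """Extrapolate a path from two observations taken dt seconds apart."""
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    velocity = (p1 - p0) / dt
    return [p1 + velocity * dt * k for k in range(1, horizon_steps + 1)]

def min_separation(path_a, path_b):
    """Closest approach between two predicted paths, step for step."""
    return min(np.linalg.norm(a - b) for a, b in zip(path_a, path_b))

# Ego vehicle heading straight up the y axis at 15 m/s...
ego = predict_positions((0.0, 0.0), (0.0, 1.5), dt=0.1, horizon_steps=20)
# ...and a pedestrian angling across its path at walking speed.
ped = predict_positions((2.0, 10.0), (1.9, 9.9), dt=0.1, horizon_steps=20)

if min_separation(ego, ped) < 2.0:  # 2 m safety bubble, made up
    print("Conflict predicted: brake or re-plan")
else:
    print("Clear to proceed")
```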

Unity’s concerns would be its goal of ending war, derived subgoals and plans to achieve those goals, constant scenario testing, how it is regarded by humans, identification of individuals, and the trustworthiness of those humans. There are plenty of things that could be augmented, but that would require more than we see here.

Unity Vision looks nothing like this

I don’t consider it worth detailing the specific characters in the white overlay, or backworlding some meaning into the rule lines, because the rule overlay does not change over the course of the movie. In the book Make It So: Interaction Design Lessons from Science Fiction (Chapter 8, Augmented Reality), I identified the types of awareness such overlays could show: sensor output, location awareness, context awareness, and goal awareness. But each of these requires change over time to be useful. This static overlay isn’t just pointless; it risks covering up important details that the AI might need.

Compare the computer vision of The Terminator.

Many times you can excuse computer-PoV shots as technical legacy, that is, a debugging tool that developers built for themselves while developing the AI, and which the AI now uses for itself. In this case, though, it’s heavily implied that Unity provided the specifications for this system itself, so that excuse doesn’t hold.

The focus indicator does change over time, but it indicates focus in a way that, again, obscures other information in the visual feed, and so is not in Unity’s interest. Color spaces are part of the way computers understand what they’re seeing, and there is no reason Unity should make that harder on itself, even if it is a super AI.
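To illustrate why that blue wash is self-sabotage, here is a tiny numpy experiment of mine (not anything from the film): alpha-blend a translucent blue over a random stand-in frame and compare how much per-channel color variation survives.

```python
# Toy illustration: a translucent tint linearly compresses the color
# range a vision system has left to work with.
import numpy as np

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64, 3)).astype(float)  # stand-in camera frame

alpha = 0.6                                   # opacity of the overlay, assumed
blue = np.array([40.0, 60.0, 220.0])          # overlay color, assumed
tinted = (1 - alpha) * frame + alpha * blue   # what Unity would "see"

print("per-channel std, raw:   ", frame.std(axis=(0, 1)).round(1))
print("per-channel std, tinted:", tinted.std(axis=(0, 1)).round(1))
# Each channel's spread shrinks to (1 - alpha) of the original: with a
# 60%-opaque overlay, only 40% of the color signal survives the tint.
```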

Largely extradiegetic

So, since a diegetic reading comes up empty, we have to look at this extradiegetically: as a tool for the audience to understand when they’re seeing through Unity’s eyes rather than the movie’s, and, via the focus indicator, what the AI is inspecting.

As such, it was probably pretty successful in the 1970s at instantly indicating computer-ness.

One reason is the typeface. The characters are derived from MICR, which stands for magnetic ink character recognition. It was established in the 1950s as a way to computerize check processing. Notably, the original font had only numerals and four control characters, no alphabetic ones.

Note also that these characters bear a stylistic resemblance to the ones seen in the film but are not the same. Compare the 0 character here with the one in the screenshots, where that character gets a blob in the lower-right stroke.

I want to give a shout-out to the filmmakers for not having this creeper scene focus on lascivious details, like butts or breasts. It’s a machine looking for signs of deception, and things like hands, microexpressions, and, so the song goes, kisses are more telling.

Still, MICR was a genuinely high-tech typeface of the time. The adult members of the audience would certainly have encountered the “weird” font in their personal lives while looking at checks, and likely understood its purpose, so it was a good choice for 1970, even if the details were off.

Another reason is the inscrutability of the lines. Why are they there, in just that way? Their inscrutability is the point. Most audiences regard technology and computers as having arcane reasons for being the way they are, and these rectilinear lines with odd greebles and nurnies invoke that same sensibility. All the whirring gizmos and bouncing bar charts of modern sci-fi interfaces exhibit the same kind of FUIgetry.

So for these reasons, while the overlay had little to do with the substance of computer vision, its heart was in the right place in invoking computer-y-ness.

Dat Ending

At the very end of the film, though, after Unity asserts that in time humans will come to love it, Forbin staunchly says, “Never.” Then the film passes into a sequence where it’s hard to tell whether it’s meant to be diegetic or not.

In the first beat, the screen breaks into four different camera angles of Forbin at once. (The overlay is still there, as if this were from a single camera.)

This says more about computer vision than even the FUIgetry.

This sense of multiples continues in the second beat, as multiple shots repeat in a grid. The grid is clipped to a big circle that shrinks to a point and ends the film in a moment of blackness before credits roll.

Since it happens right before the credits, and it has no precedent in the film, I read it as not part of the movie, but a title sequence. And that sucks. I wish wish wish this had been the standard Unity-view from the start. It illustrates that Unity is not gathering its information from a single stereoscopic image, like humans and most vertebrates do, but from multiple feeds simultaneously. That’s alien. Not even insectoid, but part of how this AI senses the world.


Dr. Strange’s augmented reality surgical assistant

We’re actually done with all of the artifacts from Doctor Strange. But there’s one last kind-of interface that’s worth talking about, and that’s when Strange assists with surgery on his own body.

After being shot with a soul-arrow by the zealot, Strange is in bad shape. He needs medical attention. He recovers his sling ring and creates a portal to the emergency room where he once worked. Stumbling with the pain, he manages to find Dr. Palmer and tell her he has a cardiac tamponade. They head to the operating theater and get Strange on the table.


When Strange passes out, his “spirit” is ejected from his body as an astral projection. Once he realizes what’s happened, he gathers his wits and turns to observe the procedure.


When Dr. Palmer approaches his body with a pericardiocentesis needle, Strange manifests so she can sense him and recommends that she aim “just a little higher.” At first she is understandably scared, but once he explains what’s happening, she gets back to business, and he acts as a virtual coach.


High Tech Binoculars

In Johnny Mnemonic we see two different types of binoculars with augmented reality overlays and other enhancements: Yakuz-oculars and LoTek-oculars.

Yakuz-oculars

The Yakuza binoculars are the last to be seen but also the simpler of the two. They look just like a pair of present-day binoculars, but this is the view when the leader surveys the LoTek bridge.


I assume that the characters here are Japanese? Anyone?

In the centre is a fixed-size green reticule. At the bottom right is what looks like the magnification factor. At the top left and bottom left are numbers, using Western digits, that change as the binoculars move. Without knowing what the labels are I can only guess that they could be azimuth and elevation angles, or distance and height to the centre of the reticule. (The latter implies some sort of rangefinder.)
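If the readouts really are distance and height from a rangefinder, the arithmetic would be simple. A sketch, resting purely on my assumption about what the unlabelled numbers mean: a slant range plus an elevation angle yields both values.

```python
# Hypothetical: how two readouts could be derived from a laser rangefinder
# (slant range) and the binoculars' elevation angle. Numbers are invented.
import math

def target_distance_height(slant_range_m, elevation_deg):
    """Ground distance to, and height above the viewer of, a ranged target."""
    theta = math.radians(elevation_deg)
    return slant_range_m * math.cos(theta), slant_range_m * math.sin(theta)

dist, height = target_distance_height(350.0, 12.0)
print(f"distance ≈ {dist:.0f} m, height ≈ {height:.0f} m")  # ≈ 342 m, ≈ 73 m
```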

Airport Security

After fleeing the Yakuza in the hotel, Johnny arrives in the Free City of Newark, and has to go through immigration control. This process appears to be entirely automated, starting with an electronic passport reader.


After that there is a security scanner, which is reminiscent of HAL from the film 2001: A Space Odyssey.


The green light runs over Johnny from top to bottom.

Grabby hologram

After Pepper tosses off the sexy bon mot “Work hard!” and leaves Tony to his Avengers initiative homework, Tony stands before the wall-high translucent displays projected around his room.

Amongst the videos, diagrams, metadata, and charts of the Tesseract panel, one item catches his attention. It’s the 3D depiction of the object, the tesseract itself, one of the Infinity Stones from the MCU. It is a cube rendered in a white wireframe, glowing cyan amidst the flat objects otherwise filling the display. It has an intense, cold-blue glow at its center. Small facing circles surround the eight corners, from which thin cyan rule lines extend a couple of decimeters and connect to small, facing, inscrutable floating-point numbers and glyphs.


Wanting to look closer at it, he reaches up and places fingers along the edge as if it were a material object, and swipes it away from the display. It rests in his hand as if it were a real thing. He studies it for a minute and flicks his thumb forward to quickly switch the orientation 90° around the Y axis.

Then he has an Important Thought and the camera cuts to Agent Coulson and Steve Rogers flying to the helicarrier.

So regular readers of this blog (or you know, fans of blockbuster sci-fi movies in general) may have a Spidey-sense that this feels somehow familiar as an interface. Where else do we see a character grabbing an object from a volumetric projection to study it? That’s right, that seminal insult-to-scientists-and-audiences alike, Prometheus. When David encounters the Alien Astrometrics VP, he grabs the wee earth from that display to nuzzle it for a little bit. Follow the link if you want that full backstory. Or you can just look and imagine it, because the interaction is largely the same: See display, grab glowing component of the VP and manipulate it.

Two anecdotes are not yet a pattern, but I’m glad to see this particular interaction again. I’m going to call it grabby holograms (capitulating a bit on adherence to the more academic term volumetric projection). We grow up having bodies and moving about in a 3D world, so the desire to grab and turn objects to understand them is quite natural. It does require that we stop thinking of displays as untouchable, uninterruptible movies and more like toy boxes, and it seems like more and more writers are catching on to this idea.

More graphics or more information?

Additionally, the fact that this object is the only 3D object in its display is a nice affordance that it can be grabbed. I’m not sure whether he could pull the frame containing the JOINT DARK ENERGY MISSION video to study it on the couch, but I’m fairly certain I knew that the tesseract was grabbable before Tony reached out.

On the other hand, I do wonder what Tony could have learned by looking at the VP cube so intently. There’s no information there. It’s just a pattern on the sides. The glow doesn’t change. The little glyph sticks attached to the edges are fuigets. He might be remembering something he once saw or read, but he didn’t need to flick it to get any new information. Maybe he has flicked a VP tesseract in the past?

Augmented “reality”

Rather, I would have liked to have seen those glyph sticks display some useful information, perhaps acting as leaders that connected the VP to related data in the main display. One corner’s line could lead to the Zero Point Extraction chart. Another to the lovely orange waveform display. This way Tony could hold the cube and glance at its related information. These are all augmented reality additions.

Augmented VP

Or, even better, he could do some things that are possible with VPs but not with AR. He should be able to scale it to be quite large or small, create arbitrary sections, or plan views. Maybe fan out depictions of all objects in the SHIELD database that are similarly glowy, stone-like, or that remind him of infinity. Maybe…there’s…a…connection…there! Or better yet, have a copy of JARVIS study the data to find correlations and likely connections to consider. We’ve seen these genuine VP interactions plenty of places (including Tony’s own workshop), so they’re part of the diegesis.

In any case, this simple setup works nicely, in which interaction with a cool medium helps underscore the gravity of the situation, the height of the stakes. Note to selves: The imperturbable Tony Stark is perturbed. Shit is going to get real.


Videoconferencing


Marty Sr. answers a call from a shady business colleague shortly after coming home. He takes the call in the den on the large video screen there. As he approaches the screen, he sees a crop of a Renoir painting, “Dance at Le Moulin de la Galette,” with a blinking legend “INCOMING CALL” along the bottom. When he answers it, the Renoir shrinks to a corner of the screen, revealing the live video feed with his correspondent. During the conversation, the Renoir disappears, and text appears near the bottom of the screen providing reminders about the speaker. This appears automatically, with no prompting from Marty Sr.

Needles, Douglas J.
Occupation: Sys Operations
Age: 47
Birthday: August 6, 1968
Address: 88 Oriole Rd, A6t
Wife: Lauren Anne
Children: Roberta, 23; Amy, 20
Food Preference: Steak, Mex
Food Dislike: Fish, Tuna
Drinks: Scotch, Beer
Hobbies: Avid Basketball Fan
Sports: Jogging, Slamball, Tennis
Politics: None

This is an augmented reality teleconference, as mentioned in Chapter 8 of Make It So: Interaction Design Lessons from Science Fiction. See more information in that chapter. In short, it’s a particularly good example of one type of augmentation that is very useful for people having to interact with networks of people much larger than Dunbar’s number equips us for. Unfortunately, the information appears in a distracting scroll across the bottom, and is not particularly pertinent to the conversation, so it could benefit from a bit of context awareness or a static high-resolution display to be really useful.
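For what that context awareness might look like, here is a minimal sketch with hypothetical field names and a crude keyword heuristic (mine, not anything from the film): surface only the dossier entries that the current utterance actually touches.

```python
# Sketch: filter a caller dossier down to what's relevant right now.
# A real system would need speech recognition and better matching than this.
CALLER_DOSSIER = {
    "occupation": "Sys Operations",
    "children": "Roberta, 23; Amy, 20",
    "drinks": "Scotch, Beer",
    "hobbies": "Avid basketball fan",
}

TOPIC_KEYWORDS = {
    "occupation": {"work", "job", "office", "project"},
    "children": {"kids", "daughter", "family"},
    "drinks": {"drink", "bar", "scotch", "beer"},
    "hobbies": {"game", "basketball", "playoffs"},
}

def relevant_fields(utterance: str) -> dict:
    """Return only the dossier entries the current utterance touches."""
    words = set(utterance.lower().split())
    return {field: CALLER_DOSSIER[field]
            for field, keys in TOPIC_KEYWORDS.items()
            if words & keys}

print(relevant_fields("Did you catch the basketball game last night?"))
# -> {'hobbies': 'Avid basketball fan'}
```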

Binoculars


Doc Brown uses some specialized binoculars to verify that Marty Jr. is at the scene according to plan. He flips them open and puts his eyes up to them. When we see his view, a reticle of green corners is placed around the closest individual in view. In the lower right-hand corner are three measurements: “DIST,” “gamma,” and “XYZ.” These numbers change continuously. A small pair of graphics at the bottom illustrates whether the reticle is to the left or right of center.


As discussed in Chapter 8 of Make It So, augmented reality systems like this can have several awarenesses, and this has some sensor display and people awareness. I’m not sure what use the sensor data is to Doc, and the people detector seems unable to track a single individual consistently.


So, a throwaway interface that doesn’t help much beyond looking gee-whiz (1989).

Iron Man HUD: 1st person view

In the prior post we catalogued the functions in the Iron HUD. Today we examine the 1st-person display.

When we first see the HUD, Tony is donning the Iron Man mask. Tony asks, “JARVIS, you there?” To which JARVIS replies, “At your service, sir.” Tony tells him to “Engage the heads-up display,” and we see the HUD initialize. It is a dizzying mixture of blue wireframe motion graphics. Some imply system functions, such as the reticle that pinpoints Tony’s eye. Most are small dashboard-like gauges that remain small and in Tony’s peripheral vision while the information is not needed, and become larger and more central when needed. These features are catalogued in another post, but we learn about them through two points-of-view: a first-person view, which shows us what Tony sees as if we were there, donning the mask in his stead, and a second-person view, which shows us Tony’s face overlaid against a dark background with floating graphics.

This post is about that first-person view. Specifically, it’s about the visual design and the four awarenesses it displays.


In the Augmented Reality chapter of Make It So, I identified four types of awareness seen in the survey of Augmented Reality displays:

  1. Sensor display
  2. Location awareness
  3. Context awareness
  4. Goal awareness

The Iron Man HUD illustrates all four, and together they make a useful framework for describing and critiquing the 1st-person view.
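As a reader aid, one way to think of the four awarenesses is as separate data layers that a HUD compositor draws from. This is my own modeling, with invented fields and values, not anything from the films:

```python
# Hypothetical data model: one layer per awareness type from
# Make It So, Chapter 8. Field names and values are illustrative.
from dataclasses import dataclass, field

@dataclass
class SensorDisplay:          # raw instrument readings
    altitude_m: float = 0.0
    power_pct: float = 100.0

@dataclass
class LocationAwareness:      # where the wearer is
    lat: float = 0.0
    lon: float = 0.0

@dataclass
class ContextAwareness:       # what's around the wearer
    tracked_objects: list = field(default_factory=list)

@dataclass
class GoalAwareness:          # how the current task is going
    objective: str = ""
    progress_pct: float = 0.0

@dataclass
class HUDFrame:               # everything the HUD renders this instant
    sensors: SensorDisplay
    location: LocationAwareness
    context: ContextAwareness
    goal: GoalAwareness

frame = HUDFrame(
    SensorDisplay(altitude_m=9144, power_pct=48),
    LocationAwareness(40.75, -73.98),
    ContextAwareness(tracked_objects=["missile"]),
    GoalAwareness(objective="redirect missile", progress_pct=30),
)
```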

Iron Man HUD: Just the functions

In the last post we went over the Iron HUD components. There is a great deal to say about the interactions and interface, but let’s just take a moment to recount everything that the HUD does over the Iron Man movies and The Avengers. Keep in mind that just as there are many iterations of the suit, there can be many iterations of the HUD, but since it’s largely display software controlled by JARVIS, the functions can very easily move between exosuits.

Gauges

Along the bottom of the HUD are some small gauges, which, though they change iconography across the properties, are consistently present.


For the most part they persist as tiny icons and are thereby hard to read, but when the suit reboots during a high-altitude freefall, we get to see giant versions of them, and can finally read what they are.


Iron Man HUD: A Breakdown

So this is going to take a few posts. You see, the next interface that appears in The Avengers is a video conference between Tony Stark in his Iron Man supersuit and his partner in romance and business, Pepper Potts, about switching Stark Tower from the electrical grid to their independent power source. Here’s what a still from the scene looks like.


So on the surface of this scene, it’s a communications interface.

But that chat exists inside an interface with a conceptual and interaction framework that has been laid down since the original Iron Man movie in 2008, and built upon with each sequel, one in 2010 and one in 2013. (With rumors aplenty for a fourth one…sometime.)

So to review the video chat, I first have to talk about the whole interface, and that has about 6 hours of prologue occurring across 4 years of cinema informing it. So let’s start, as I do with almost every interface, simply by describing it and its components.