Iron Man HUD: 1st person view

In the prior post we catalogued the functions in the Iron HUD. Today we examine the 1st-person display.

When we first see the HUD, Tony is donning the Iron Man mask. Tony asks JARVIS, “You there?” To which JARVIS replies, “At your service sir.” Tony tells him to “Engage the heads-up display”, and we see the HUD initialize. It is a dizzying mixture of blue wireframe motion graphics. Some imply system functions, such as the reticle that pinpoints Tony’s eye. Most are small dashboard-like gauges that remain small and in Tony’s peripheral vision while the information is not needed, and become larger and more central when needed. These features are catalogued in another post, but we learn about them through two points of view: a first-person view, which shows us what Tony sees as if we were there, donning the mask in his stead, and a second-person view, which shows us Tony’s face overlaid against a dark background with floating graphics.

This post is about that first-person view. Specifically it’s about the visual design and the four awarenesses it displays.

Avengers-missile-fetching04

In the Augmented Reality chapter of Make It So, I identified four types of awareness seen in the survey for Augmented Reality displays:

  1. Sensor display
  2. Location awareness
  3. Context awareness
  4. Goal awareness

The Iron Man HUD illustrates all four, and together they make a useful framework for describing and critiquing the 1st-person view.

Sensor display

When looking through the HUD “ourselves,” we can see that the HUD provides some airplane-like heads up instruments: Across the top is a horizontal compass with a thin white line for a needle. Below and to its left is a speed indicator, presented in terms of MACH. On the left side of the screen is a two-part altimeter with overlays indicating public, commercial, military, and aerospace layers of atmosphere, with a small blue tick mark indicating Tony’s current altitude.

There are just-in-time status indicators like that cyan text box on the right with its randomized rule line. The content within is all N -8 W -97 RNG EL, so, hard to tell what it means, but Tony’s a maker working with a prototype. It’s no surprise he takes some shortcuts in the interface since it’s not a commercial device. But we should note that it would reduce his cognitive load to not have to remember what those cryptic letters meant.

IronMan1_HUD08
You can just see the tops of these gauges at the bottom of this screen.

The exact sensor shown depends on the context and goal at hand.

Periphery and attention

A quick sidenote about peripheral vision and the detail of these gauges. Looking at them, it’s notable that they are small and quite detailed. That makes sense when he’s looking right at them, but when he’s not, those little gauges have to compete with all the big, swirling graphics vying for his attention in the main display. And when it comes to your peripheral vision, localized detail and motion are not enough, owing to the limits of our foveal extent. (Props to @pixelio for the heads-up on this one.)

You see, your brain tricks you into thinking that you can see really well across your entire field of vision. In fact, you can only see really well across a few degrees of that perceptual sphere, corresponding to the tiny area at the back of your eye called the fovea where all the really good photoreceptors concentrate. As your eyes dart around the scene before you, your brain puts all the snippets of detailed information together so it feels like a cohesive, well-detailed whole, but it’s ultimately just a hack. Take a look at this demonstration of the effect.

Screen Shot 2015-07-20 at 23.49.56
This only works if you view it live.

So, having those teeny little gauges dancing around as a signal of troubles ahead won’t really get Tony’s attention. He could develop habits of glancing at these things, but that’s a weak strategy, since this data is so mission-critical. If he misses it and forgets to check the gauges, he’s Iron Toast. Fortunately, JARVIS is once again our deus ex machina (in so many senses) because he is able to track where Tony is looking, and if he’s not looking at the wiggling gauge, JARVIS can choose to escalate the signal: Hide the air traffic data temporarily and show the problem in the main screen. Here, as in other mission-critical systems, attention management is crisis management. Now, for those of us working with pre-JARVIS tech, it’s rare today for a system to be able to

  • Track perceptual details of its users
  • Monitor a model of the user’s attention
  • Make the right call amongst competing priorities to escalate the right one

But if you could, it would be the smart and humane way to handle it.
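To make those three capabilities concrete, here is a minimal sketch of how such an escalation decision might be structured. Everything in it is an assumption for illustration: the `Alert` fields, the 2.5-degree "attended" radius, and the one-second grace period are hypothetical, and nothing here is drawn from the films or from any real eye-tracking product.

```python
from dataclasses import dataclass

# Assumed constants, chosen only for illustration.
FOVEAL_RADIUS_DEG = 2.5   # how close the gaze must be to count as "looking at it"
GRACE_PERIOD_S = 1.0      # how long a gauge may wiggle unnoticed before escalating


@dataclass
class Alert:
    name: str
    priority: int           # higher means more urgent (hypothetical scale)
    gauge_position: tuple   # (x, y) in degrees of visual angle from display center
    age_s: float = 0.0      # seconds since the alert was raised


def is_attended(gaze, alert, radius=FOVEAL_RADIUS_DEG):
    """True if the tracked gaze point falls on the alert's gauge."""
    dx = gaze[0] - alert.gauge_position[0]
    dy = gaze[1] - alert.gauge_position[1]
    return (dx * dx + dy * dy) ** 0.5 <= radius


def choose_escalation(gaze, alerts, grace=GRACE_PERIOD_S):
    """Pick the one alert to promote to the main display, or None.

    Only alerts that have gone unnoticed past the grace period compete,
    and the most urgent of those wins.
    """
    unattended = [
        a for a in alerts
        if a.age_s >= grace and not is_attended(gaze, a)
    ]
    if not unattended:
        return None
    return max(unattended, key=lambda a: a.priority)
```

Calling `choose_escalation` once per frame with the current gaze point would return at most one alert to enlarge, which keeps the main display from being flooded when several gauges complain at once.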

Location Awareness

As Tony prepares for his first flight, JARVIS gives him a bit of x-ray vision, displaying a wireframe view of the Santa Monica coastline with live air traffic control icons of aircraft in the vicinity. The overhead map updates, of course, in real time.

IronMan1_HUD17
If my Google Earth sleuthing is right, his view means he lives in the Malibu RV Park and this view is due East.

Context Awareness

Very quickly after we meet the HUD it shows its object recognition capabilities. As Tony sweeps his glance across his garage, complex reticles jump to each car. Split-seconds afterwards, the car’s outline is overlaid and some adjunct information about it is presented.

IronMan1_HUD10

This holds true as he’s in flight as well. When Tony passes by the Santa Monica pier, not only is the Pacific Wheel identified (as the Santa Monica Ferriswheel), but the interface shows him a Wikipedia-esque article for the thing as well.

IronMan1_HUD19

IronMan1_HUD21

While JARVIS might be tapping into location databases for both the car and the ferris wheel recognition, it’s more than that. In one scene we see him getting information on the Iron Patriot as it rockets away, and its location wouldn’t be on any real-time record for him to access.

Optical zoom

Too much detail

While this level of object detail is deeply impressive, it’s about as useful as reading Wikipedia pages hard-printed to transparencies while driving. The text is too small, too multilayered, and just pointless considering that JARVIS can tell him whatever he needs to know without even asking. Maybe he could indulge in pop-up pamphlets if he was on a long-haul flight from, say, Europe back home to the Malibu RV Park (see above), but wouldn’t Tony rather watch a movie while on Autopilot instead?

Goal awareness

Of course JARVIS is aware of Tony’s goals, and provides graphics customized to the task, whether that task is navigating flight through complex obstacle courses…

3D wayfinding

…taking down a bad guy with the next hit…

Suggested target points

…saving innocent bystanders who are freefalling from a plane…

Biometric analysis, target acquisition

…or instantly analyzing problems in an observed (and complicated) piece of machinery…

3D schematics of observed machinery with damage highlights

…JARVIS is there with the graphics to help illustrate, if not solve, the problem at hand. Most impressive, perhaps, is JARVIS’ ability to juggle all of these graphics and modes seamlessly to present just the right thing at the right time in real time. Tony never asks for a particular display; it just happens. If you needed no other proof of its strong artificial intelligence, this would be it.

Next up in the Iron HUD series: Compare and contrast the 2nd-person view.

Iron Man HUD: Just the functions

In the last post we went over the Iron HUD components. There is a great deal to say about the interactions and interface, but let’s just take a moment to recount everything that the HUD does over the Iron Man movies and The Avengers. Keep in mind that just as there are many iterations of the suit, there can be many iterations of the HUD, but since it’s largely display software controlled by JARVIS, the functions can very easily move between exosuits.

Gauges

Along the bottom of the HUD are some small gauges, which, though they change iconography across the properties, are consistently present.

IronMan1_HUD07

For the most part they persist as tiny icons and are thereby hard to read, but when the suit reboots in a high-altitude freefall, we get to see giant versions of them, and can read that they are:

IronMan1_HUD13
Tony can, at a glance or request, summon more detail for any of the gauges.
IronMan1_HUD12
Even different visualizations of similar information.

Object Recognition

In the 1st-person view we see that the HUD has a separate map in the lower-left, and object recognition/awareness.

IronMan1_HUD10
IronMan1_HUD11
In the 2nd-person view, we see even more layers of information about the identified objects, floating closer to Tony’s point of view.

Situational

Most of the HUD functions we see, though, are situational, brought up for Tony’s attention when JARVIS believes they are needed, or when Tony requests them. Following are screenshots that illustrate a moment when the situational function appeared. 

Iron Man

Iron Man 2

Iron Man 3

The Avengers

Some of these illustrate why I argue that JARVIS is the superhero, and Tony just the onboard manager, but rather than reverse engineering any particular function, for this post it is enough to document them and note that only the optical zoom seems to be an interactive function. This raises questions of how he initiated the mode and how he escapes the mode, but since we don’t see the mechanisms of control, it’s entirely arguable that JARVIS is just being his usual helpful self again.

Next up in the Iron HUD series: Let’s dive deeper into the first-person view.

Iron Man HUD: A Breakdown

So this is going to take a few posts. You see, the next interface that appears in The Avengers is a video conference between Tony Stark in his Iron Man supersuit and his partner in romance and business, Pepper Potts, about switching Stark Tower from the electrical grid to their independent power source. Here’s what a still from the scene looks like.

Avengers-Iron-Man-Videoconferencing01

So on the surface of this scene, it’s a communications interface.

But that chat exists inside of an interface with a conceptual and interaction framework that has been laid down since the original Iron Man movie in 2008, and built upon with each sequel, one in 2010 and one in 2013. (With rumors aplenty for a fourth one…sometime.)

So to review the video chat, I first have to talk about the whole interface, and that has about 6 hours of prologue occurring across 4 years of cinema informing it. So let’s start, as I do with almost every interface, simply by describing it and its components.

Exosuit

Iron Man is the name of the series of superpowered exosuits designed by Tony Stark. They range from the Mark I, a comparatively crude suit of armor built to escape imprisonment by terrorists, through the Mark XLVI, the armor seen in The Avengers: Age of Ultron. The suit acts as defense against nearly every type of weapon known. It has repulsor beams built into the palms and, in later models, the arc reactor mounted in the chest, which can be used to deliver concussive force. It allows the wearer to fly. Offensive weaponry varies between models, but has included a high-powered laser system, an auto-targeting minigun pod, and missiles. The suit can act semi-autonomously or via remote control. One of the models in The Avengers has parts that are seen to self-propel to Tony, targeting a beacon bracelet he wears, and self-assemble around him very quickly.

Marks1and43

Immersive display

Though Tony’s head is completely covered, he has a virtual reality display within his helmet. It is a full-field-of-vision, very high-resolution, full-color display that provides stereoscopic imaging. It allows Tony to see the world around him as if he were not wearing the helmet, and augments the view with goal-, person-, location-, and object-sensitive awareness.

The display varies a great deal, changing to the needs of the situation. But five icons persist in the lower part of the display. They seem to be: suit status, targeting and optics, radar, artificial horizon, and map.

An interpretive view of Tony’s experience, from Iron Man (2008).
A first-person view from within the HUD, Iron Man (2008).

There is much to critique about the readability of the complex layering and translucency, the limits of human perception, and the necessarily- (and strictly-) interpretive nature of what we as audience see, but let me save those three points for a later post. For now it’s enough to log the features as aspects of the system.

Head NUI

Though Tony could use his hands to interact with an interface projected into the augmented reality view around him, his hands are often occupied in controlling flight or in combat. For this reason the means of input are head gesture, eye gesture, and voice input. A bit more on each follows.

Elements within the HUD such as reticles around his eyes follow and track his head gestures. Other elements stay locked in place. The HUD can track his gaze perfectly, allowing him to designate targets for his weapons with a fixation. Using this perfect eye tracking, Tony can also speak about something he is looking at, either in the real world or in the interface, and the system understands exactly what he’s talking about.

In fact, Tony is able to speak fully natural language commands, and indeed, carry out full-Turing conversations with the suit because of the presence of…

Strong artificial intelligence: JARVIS

An on-board artificial intelligence known as JARVIS handles any information task Tony asks of it, monitors the surroundings, and anticipates informational needs. There is strong evidence that most of the functions of the suit are handled by JARVIS behind the scenes. The importance of the artificial intelligence to the function of the suit cannot be overstated. It’s difficult to imagine how most of the suit could function as it does without an artificial intelligence behind the scenes facilitating results and even guiding Tony. With this in mind it is instructive to reframe the AI as the thing being named the Iron Man, with Tony Stark being an onboard manager, or, more charitably, a command-and-control center. Who quips.

Next up in the Iron HUD series: Let’s review the functions of the suit.

Avengers-Iron-Man-Videoconferencing02



Ectogoggles

When the Ghostbusters are called to the Sedgewick Hotel, they track a ghost called Slimer from his usual haunt on the 12th floor to a ballroom. There Ray dons a pair of asymmetrical goggles that show him information about the “psycho-kinetic energy (PKE) valences” in the area. (The Ghostbusters wiki—and of course there is such a thing—identifies these alternately as paragoggles or ectogoggles.) He uses the goggles to peek from behind a curtain to look for Slimer.

Ghostbusters_binoculars_02

Far be it from this humble blog to try and reverse-engineer what PKE valences actually are, but let’s presume it generally means ghosts and ghost-related activity. Here’s an animated gif of the display for your ghostspotting pleasure.

Ghostoculars_gif

As he scans the room, we see a shot from his perspective. Five outputs augment the ordinary view the goggles offer.

1. A plan position indicator (like what you see on a radar) sweeps around and around in the upper left hand corner, but never displays anything (even when Slimer appears).

2. A bar graph on the left side wavers up and down until Slimer is spotted, when it jumps to maximum. The bar graph adheres to the basic visual principle of “up means more.” The bar graph is colored with a stoplight gradient, with red at the bottom, yellow in the middle, and a bright screen-green at the top. Note that the graph builds from the bottom until it hits maximum, when its glow slides to the top to fully illuminate only the uppermost block. This is a special “max” mode that strongly draws the user’s attention.

3. There is a 7-segment red LED number display just below the graph, which you might think is a numerical version of the same data, but we only see it increment steadily from 03094 to 03051 during the first scan, then after a cutaway to Ray’s face, we see it drop to 01325 and continue to increment steadily until it hits 01333, where it remains steady and begins to blink. It hits this maximum about half a second before the graph jumps to its max.

graph

4. In the very lower left is a red mode label reading “KER,” which blinks until the numbers hit 01333 in the second sequence, when KER disappears and is replaced with a steadily-glowing green “MAX.”

What the heck is KER? I don’t think there’s any diegetic answer. Ker might be an extradiegetic shout-out to Rick Kerrigan, who was production supervisor for Entertainment Effects Group / Boss Film Studios for the film, but that’s just a guess. Otherwise I got nothin’. Anyone else?

5. In the lower right is a blurry light that blinks red until Slimer is spotted, when it blinks the same screen-green as the bar graph, sweep, and MAX label.

Narratively, this is a tone interface that doesn’t add anything to the plot, and only helps us experience and understand how it is the busters do their busting. Because it is a tone interface, making the changes suggested below would help improve believability without affecting the plot.

Ghostbusters_binoculars_08

How to better support busting

The immediate improvements you could make to this as a “real” ghostbusting tool are fairly obvious:

  • Make the plan position indicator, you know, work.
  • Have the numbers match the graph, or, if they’re actually measuring different things, put the LED display on the other side of the view.
  • I’d change the graph color indicating no-PKE to black or dark gray. Red often connotes danger, and really, if there’s no PKE, you’re safe from the supernatural. Plus the blackbody radiation spectrum has a more physical reference and is therefore more immediate.
  • You could even lose the bar diagram—which requires looking away from the view—and replace it with a line around the view that changes color similarly. This puts the augmentation in the periphery.
  • Lose the distracting blinking red light entirely. It draws attention at a time when the Buster’s eyes need to be on the view, and it’s just duplicating information already provided in a better way by the graph.
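As a sketch of the peripheral-line idea, the function below maps a PKE reading to a border color, starting from dark gray for "nothing here" and warming up the way heated metal glows. The normalized 0.0 to 1.0 reading and the specific color stops are assumptions made up for this example; the film gives no units or scale for the sensor.

```python
# Hypothetical color stops, loosely following a heated-metal (blackbody-like) glow.
# Each entry is (normalized PKE level, (r, g, b)).
STOPS = [
    (0.0, (40, 40, 40)),      # no PKE: dark gray, reads as "safe"
    (0.4, (200, 0, 0)),       # red
    (0.75, (255, 200, 0)),    # yellow
    (1.0, (255, 255, 255)),   # maximum: white-hot
]


def border_color(level):
    """Map a normalized PKE reading (0.0 to 1.0) to an (r, g, b) border color."""
    level = min(1.0, max(0.0, level))  # clamp out-of-range sensor values
    for (lo, lo_rgb), (hi, hi_rgb) in zip(STOPS, STOPS[1:]):
        if level <= hi:
            t = (level - lo) / (hi - lo)  # position within this segment
            return tuple(round(a + (b - a) * t) for a, b in zip(lo_rgb, hi_rgb))
    return STOPS[-1][1]
```

Because the color changes continuously and fills the edge of the view, it can be picked up peripherally without the Buster taking his eyes off the room.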

But we can do those improvements better. In the augmented reality chapter of the book, I identified levels of awareness for these devices. The ectogoggles are an example of the simplest type, of sensor display, with the sweep giving an unfulfilled promise of the second type, location awareness. We can make even bigger improvements by considering the other levels, i.e. context and goal awareness.

Context Awareness

Context awareness implies a more sophisticated system with image recognition and display capabilities. Could the paragoggles help draw attention to where on the view the PKE is most concentrated, and how those readings are trending? Of course this wouldn’t be so important when the ghost is actually visible, but if it could lead his eyes to where the ghost is most likely going to be, it would be more useful and save him even the microseconds of an eye saccade.

A second aspect of context awareness is object or people recognition. If the goggles could recognize individual ghosts, the display could be improved with some information about this particular ghost (or its category) from a database. What’s its name? What methods have failed or worked in the past to control it? Even if it doesn’t know these things, it can provide an alert that it is an UNKNOWN ENTITY, which is spooky sounding and tells the Ghostbusters to be on high alert since anything could happen.

Goal awareness

Lastly, they could be improved with goal awareness. The Ghostbusters aren’t birdwatchers. They’re there to capture that ugly spud. Can it help guide each person as to the best time to gear up the proton packs (or do it for them), where to position themselves as well as the trap, and finally when and where to fire? Certainly someone as scatterbrained as Ray could use that kind of assistance.

Ghostbusters_binoculars_00

Section No6’s crappy sniper tech

GitS-Drone_gunner-01

GitS-Drone_gunner-12

Section 6 sends helicopters to assassinate Kusanagi and her team before they can learn the truth about Project 2501. We get a brief glimpse of the snipers, who wear full-immersion helmets with a large lens to the front of one side, connected by thick cables to ports in the roof of the helicopter. The snipers have their hands on long-barrel rifles mounted to posts. In these helmets they have full audio access to a command and control center that gives orders and receives confirmations.

GitS-profile-06

The helmets feature fully immersive displays that can show abstract data, such as the profiles and portraits of their targets.

GitS-Drone_gunner-06

GitS-Drone_gunner-07

These helmets also provide the snipers an augmented reality display that grants high powered magnification views overlaid with complex reticles for targeting. The reticles feature a spiraling indicator of "gyroscopic stabilization" and a red dot that appears in the crosshairs when the target has been held for a full second. The reticles do not provide any "layman" information in text, but rely solely on simple shapes that a well-trained sniper can see rather than read. The whole system has the ability to suppress the cardiovascular interference of the snipers, though no details are given as to how.
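The red-dot behavior described above is essentially a dwell timer. Here is a minimal sketch of one, assuming the system already knows, sample by sample, whether the crosshairs are on the target; the class name, the `update` signature, and the timestamp-in-seconds convention are all hypothetical.

```python
class LockIndicator:
    """Shows the red dot once the target has been held continuously."""

    def __init__(self, hold_s=1.0):
        self.hold_s = hold_s            # required continuous hold, in seconds
        self._on_target_since = None    # timestamp when the current hold began

    def update(self, t, on_target):
        """Feed one sample at time t (seconds); return True if the dot should show."""
        if not on_target:
            # Any drift off the target resets the hold.
            self._on_target_since = None
            return False
        if self._on_target_since is None:
            self._on_target_since = t
        return t - self._on_target_since >= self.hold_s
```

Resetting on any miss is the strict reading of "held for a full second." A more forgiving design might tolerate brief drifts, but the film gives no evidence either way.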

These features seem provocative, and a pretty sweet setup for a sniper: heightened vision, suppression of interference, aiming guides, and signals indicating a key status. But then, we see a camera on the bottom of the helicopter, mounted with actuators that allow it to move with a high (though not full) freedom of movement and precision. What’s this there for? It wouldn’t make sense for the snipers to be using it to aim. Their eyes are in the direction of their weapons.

GitS-Drone_gunner-02

This could be used for general surveillance of course, but the collection of technologies that we see here raises the question: If Section 6 has the technology to precisely control a camera, why doesn’t it apply that to the barrel of the weapon? And if it has the technology to know when the weapon is aimed at its target (showing a red dot), why does it let humans do the targeting?

Of course you want a human to make the choice to pull a trigger/activate a weapon, because we should not leave such a terrible, ethical, and deadly decision to an algorithm, but the other activities of targeting could clearly be handled, and handled better, by technology.

This again illustrates a problem that sci-fi has had with tech, one we saw in Section 6’s security details: How are heroes heroic if the machines can do the hard work? This interface retreats to simple augmentation rather than an agentive solution to bypass the conflict. Real-world designers will have to answer it more directly.

R-3000 “Spider tank” vision

GitS-spidertank-22

Section 6 stations a spider tank, hidden under thermoptic camouflage, to guard Project 2501. When Kusanagi confronts the tank, we see a glimpse of the video feed from its creepy, metal, recessed eye. This view is a screen-green image, overlaid with two reticles. The larger one with radial ticks shows where the weapon is pointing while the smaller one tracks the target.

I have often used the discrepancy between a weapon- and target-reticle to point out how far behind Hollywood is on the notion of agentive systems in the real world, but for the spider tank it’s very appropriate. The image processing is likely to be much faster than the actuators controlling the tank’s position and orientation. The two reticles illustrate what the tank’s AI is working on. This said, I cannot work out why there is only one weapon reticle when the tank has two barrels that move independently.
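The lag between the two reticles can be illustrated with a rate-limited pursuit: the target reticle jumps wherever image processing puts it, while the weapon reticle can only move as fast as the actuators allow. This is a one-axis sketch under invented assumptions (a single bearing in degrees, a fixed maximum slew rate, no wrap-around at 360 degrees), not a model of anything specified in the film.

```python
def slew_toward(weapon_deg, target_deg, max_rate_deg_s, dt_s):
    """Advance the weapon bearing one time step toward the target bearing.

    The step is clamped to what the (assumed) actuators can cover in dt_s
    seconds, so the weapon reticle trails a fast-moving target reticle.
    """
    error = target_deg - weapon_deg
    max_step = max_rate_deg_s * dt_s
    step = max(-max_step, min(max_step, error))
    return weapon_deg + step
```

With a slew rate of 20 degrees per second and a quarter-second step, a weapon 10 degrees off target closes only half the gap, which is exactly the moment when showing two separate reticles is informative.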

GitS-spidertank-13

GitS-spidertank-09

When the spider tank expends all of its ammunition, Kusanagi activates her thermoptic camouflage, and the tank begins to search for her. It switches from its protected white camera to a big-lens blue camera. On its processing screen, the targeting reticle disappears, and a smaller reticle appears with concentric, blinking white arcs. As Kusanagi strains to wrench open plating on the tank, her camouflage is compromised, allowing the tank to focus on her (though curiously, not to do anything like try and shake her off or slam her into the wall or something). As the tank’s confidence grows, more arcs appear, become thicker, and circle the center.

The amount of information on the augmentation layer is arbitrary, since it’s a machine using it and there are certainly other processes going on than what is visualized. If this was for a human user, there might be more or less augmentation necessary, depending on the amount of training they have and the goal awareness of the system. Certainly an actual crosshairs in the weapon reticle would help aim it very precisely.

GitS-spidertank-06