The characters in Johnny Mnemonic make quite a few video phone calls throughout the film, enough to be grouped in their own section on interfaces.
The first thing a modern viewer will note is that only one of the phones resembles a current day handheld mobile. This looks very strange today and it’s hard to imagine why we would ever give up our beloved iPhones and Androids. I’ll just observe that accurately predicting the future is difficult (and not really the point) and move on.
More interesting is the variety of phones used. In films from the 1950s to the 1990s, everyone uses a desk phone with a handset. (For younger readers: that is the piece you picked up and held next to your ear and mouth. There’s probably one in your parents’ house.) The only changes were the gradual replacement of rotary dials by keypads, and some cordless handsets. In 21st century films everyone uses a small sleek handheld box. But in Johnny Mnemonic every phone call uses a different interface.
First is the phone call Johnny makes from the New Darwin hotel.
As previously discussed, Johnny is lying in bed using a remote control to select numbers on the onscreen keypad. He is facing a large wall mounted TV/display screen, with what looks like a camera at the top. The camera is realistic but unusual: as Chapter 10 of Make It So notes, films very rarely show the cameras used in visual communication.
Jasper is a longtime friend of Theo’s who offers his home as a safe house for a time. Jasper’s civilian vehicle features a device on its dashboard that merits some attention. It is something like a small laptop computer, with a flat screen in a roughly pill-shaped black plastic frame mounted in the center of the dashboard. The top half of this screen shows a view from a backwards-facing camera mounted on the vehicle.
The opening shot of Johnny Mnemonic is a brightly coloured 3D graphical environment. It looks like an abstract cityscape, with buildings arranged in a rectangular grid and various 3D icons or avatars flying around. Text identifies this as the Internet of 2021, now cyberspace.
Strictly speaking this shot is not an interface. It is a visualization from the point of view of a calendar wake up reminder, which flies through cyberspace, then down a cable, to appear on a wall mounted screen in Johnny’s hotel suite. However, we will see later on that this is exactly the same graphical representation used by humans. As the very first scene of the film, it is important in establishing what the Internet looks like in this future world. It’s therefore worth discussing the “look” employed here, even though there isn’t any interaction.
Cyberspace is usually equated with 3D graphics and virtual reality in particular. Yet when you look into what is necessary to implement cyberspace, the graphics really aren’t that important.
MUDs and MOOs: ASCII Cyberspace
People have been building cyberspaces since the 1980s in the form of MUDs and MOOs. At first sight these look like old style games such as Adventure or Zork. To explore a MUD/MOO, you log on remotely using a terminal program. Every command and response is pure text, so typing “go north” might result in “You are in a church.” The difference between MUD/MOOs and Zork is that these are dynamic multiuser virtual worlds, not solitary-player games. Other people share the world with you and move through it, adventuring, building, or just chatting. Everyone has an avatar and every place has an appearance, but expressed in text as if you were reading a book.
guest>>@go #1914
Castle entrance
A cold and dark gatehouse, with moss-covered crumbling walls. A passage gives entry to the forbidding depths of Castle Aargh. You hear a strange bubbling sound and an occasional chuckle.
Obvious exits: path to Castle Aargh (#1871) enter to Bridge (#1916)
Most impressive of all, these are virtual worlds with built-in editing capabilities. All the “graphics” are plain text, and all the interactions, rules, and behaviours are programmed in a scripting language. The command line interface allows the equivalent of Emacs or VI to run, so the world and everything in it can be modified in real time by the participants. You don’t even have to restart the program. Here a character creates a new location within a MOO, to the “south” of the existing Town Square:
laranzu>>@dig MyNewHome
laranzu>>@describe here as “A large and spacious cave full of computers”
laranzu>>@dig north to Town Square
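To make the idea of an editable text world a little more concrete, here is a minimal Python sketch of rooms, exits, and live editing. It is only an illustration: the `Room` and `World` classes and their methods are hypothetical and are not how LambdaMOO or any real MUD server is implemented. As a simplification, `dig` links the new room back to its origin in one step, where the transcript above uses two separate @dig commands.

```python
# Hypothetical miniature text world. Names and behaviour are illustrative
# assumptions, not the object model of any real MUD/MOO server.
from dataclasses import dataclass, field

OPPOSITE = {"north": "south", "south": "north", "east": "west", "west": "east"}


@dataclass
class Room:
    name: str
    description: str = "You see nothing special."
    exits: dict = field(default_factory=dict)  # direction -> Room


class World:
    def __init__(self, start_name):
        self.rooms = {start_name: Room(start_name)}

    def dig(self, here, direction, new_name):
        """Create a room reachable from `here` and link it in both directions."""
        new_room = Room(new_name)
        self.rooms[new_name] = new_room
        self.rooms[here].exits[direction] = new_room
        new_room.exits[OPPOSITE[direction]] = self.rooms[here]
        return new_room

    def describe(self, name, text):
        """Change a room's description while the world keeps running."""
        self.rooms[name].description = text

    def look(self, name):
        room = self.rooms[name]
        exits = ", ".join(sorted(room.exits)) or "none"
        return f"{room.name}\n{room.description}\nObvious exits: {exits}"


world = World("Town Square")
world.dig("Town Square", "south", "MyNewHome")
world.describe("MyNewHome", "A large and spacious cave full of computers")
print(world.look("MyNewHome"))
```

Everything a visitor “sees” is a string assembled from ordinary data, which is why participants can reshape the world without restarting anything.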
The simplicity of the text interfaces leads people to think these are simple systems. They’re not. These cyberspaces have many of the legal complexities found in the real world. Can individuals be excluded from particular places? What can be done about abusive speech? How offensive can your public appearance be? Who is allowed to create new buildings, or modify existing ones? Is attacking an avatar a crime? Many 3D virtual reality system builders never progress that far, stopping when the graphics look good and the program rarely crashes. If you’re interested in cyberspace interface design, a long running textual cyberspace such as LambdaMOO or DragonMUD holds a wealth of experience about how to deal with all these messy human issues.
So why all the graphics?
So it turns out MUDs and MOOs are a rich, sprawling, complex cyberspace in text. Why then, in 1995, did we expect cyberspace to require 3D graphics anyway?
The 1980s saw two dimensional graphical user interfaces become well known with the Macintosh, and by the 1990s they were everywhere. The 1990s also saw high end 3D graphics systems becoming more common, the most prominent being from Silicon Graphics. It was clear that as prices came down personal computers would soon have similar capabilities.
At the time of Johnny Mnemonic, the world wide web had brought the Internet into everyday life. If web browsers with 2D GUIs were superior to the command line interfaces of telnet, FTP, and Gopher, surely a 3D cyberspace would be even better? Predictions of a 3D Internet were common in books such as Virtual Reality by Howard Rheingold and magazines such as Wired at the time. VRML, the Virtual Reality Markup/Modeling Language, was created in 1995 with the expectation that it would become the foundation for cyberspace, just as HTML had been the foundation of the world wide web.
Twenty years later, we know this didn’t happen. The solution to the unthinkable complexity of cyberspace was a return to the command line interface in the form of a Google search box.
Abstract or symbolic interfaces such as text command lines may look more intimidating or complicated than graphical systems. But if the graphical interface isn’t powerful enough to meet their needs, users will take the time to learn how the more complicated system works. And we’ll see later on that the cyberspace of Johnny Mnemonic is not purely graphical and does allow symbolic interaction.
Colonial One is a luxury passenger liner in commercial service until the war with the Cylons breaks out. The captain and co-pilot are not military pilots, and most passengers are dignitaries or VIPs visiting the Galactica for its unveiling as a museum.
Compared to military cockpits and the CIC aboard the Galactica, Colonial One’s cockpit has simple controls and an unsophisticated space-borne sensor system. Also unlike the Galactica or the Raptors, no one on Colonial One calls their space-borne sensor system the “Dradis”. At the center of each control console is a large gimbal-based horizon indicator.
The sensors show a simple 2-d representation of local space, with nearby contacts indicated as white dots. There is no differentiation between ‘enemy’ and ‘friendly’ contacts. Likewise, the image of a Cylon missile (shown above) is the same indicator as other ships. There is no clear explanation of what the small white dots on the background of the image are, or what the lines indicate.
When the Cylon fighters show up, the crew has some unknown way besides this screen of knowing the Cylons have just jumped into contact range, and that they have launched missiles at Colonial One. How the crew determines this isn’t shown, but both the crew and Apollo are confident that the assessment is correct.
When Laura Roslin tells the crew to send a message on a specific frequency before the missile attack, the crew uses the same keypad to send alphanumeric signals over a radio/faster-than-light (FTL) link as they use to enter information into their flight computers. The FTL link appears to connect every planet in the Colonies in real time: we don’t get any sense of delay between the attacks happening and the entire civilization reacting to them.
The largest usability concern here is Mode Switching, and making it clear whether the crew is entering information into the ship or into the radio. Given that we see the crew interact most with the ship itself, the following procedure would make the most sense:
Entering information into the ship is the primary ‘mode’
An explicit command to switch over to the radio link.
Crew enters the given information into the link
On ‘enter’, the interface flips back over to entering information into the ship.
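The procedure above can be sketched as a small state machine. This is a hedged illustration only: `SharedKeypad`, its method names, and the sample inputs are invented for the example, and nothing in the show tells us how Colonial One’s keypad is actually wired.

```python
# Hypothetical sketch of the mode-switching procedure; all names are assumptions.
from enum import Enum


class Mode(Enum):
    SHIP = "ship"
    RADIO = "radio"


class SharedKeypad:
    def __init__(self):
        self.mode = Mode.SHIP   # entering information into the ship is the primary mode
        self.buffer = []
        self.ship_log = []      # entries delivered to the flight computer
        self.radio_log = []     # entries delivered to the radio/FTL link

    def switch_to_radio(self):
        """Explicit command to switch over to the radio link."""
        self.buffer.clear()     # assumption: half-typed ship input is discarded
        self.mode = Mode.RADIO

    def press(self, key):
        self.buffer.append(key)

    def enter(self):
        """Deliver the buffer to the active target, then fall back to ship mode."""
        text = "".join(self.buffer)
        self.buffer.clear()
        if self.mode is Mode.RADIO:
            self.radio_log.append(text)
            self.mode = Mode.SHIP
        else:
            self.ship_log.append(text)
        return text


keypad = SharedKeypad()
for key in "HDG270":            # made-up flight computer entry
    keypad.press(key)
keypad.enter()

keypad.switch_to_radio()
for key in "47":                # made-up radio entry
    keypad.press(key)
keypad.enter()

print(keypad.ship_log, keypad.radio_log, keypad.mode)
```

Because the radio mode only ever lasts for one entry, the crew can never be left stranded in the less common mode without noticing.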
With a larger budget, the Dradis is a better system (at least with the improvements installed).
A large amount of space inside the cockpit is given over to communication controls and a receiver station. At the receiver station, Colonial One has a small printer attached to an automatic collector that prints off broadcast messages. The function and placement of the printer appears similar to weather printers on modern passenger jets.
The cockpit is very utilitarian, and the controls look well used. These are robust systems and look like they have been in place for a while. Despite the luxury associated with the passenger compartment, the crew have been granted no special luxuries or obvious assisting equipment to make their job more comfortable.
Consider a current (or, until very recently, current) pattern: the Space Shuttle has a very similar layout. The Shuttle is also intended to enter the atmosphere, which Colonial One is shown to have the equipment to do, and it maintains a 2.5D movement concept. Given that it’s a commercial ship with direct paths to follow, Colonial One does not need the complicated controls, shown to be very difficult to master, that are present on ships like the Viper.
Overall, a solid pattern
In-universe, this ship was not designed for combat, and is woefully unprepared for it when it arrives. The sensor system and the controls appear specialized for the job of ferrying high-paying customers from one planet to another through friendly space. Other ships also have the same level of manual controls and physical switches in the cockpit, though it is impossible to tell whether this is because Colonial One was built in the same era as the Galactica, or whether the builders wanted more reliability in the controls than ‘modern’ electronics provided.
As long as the pilots are as well trained as current-day commercial pilots, the banks of controls would provide solid spatial grouping and muscle memory. There might be some room to shrink the number of controls or group them better, but we lack the context to dig into that particular issue.
One minor fix would address the possibility of mode errors for the keypad. It is not obvious when the crew changes from “I want to enter information into Colonial One to change operating parameters” to “I want to send a message to someone else”. A clear way to indicate whether the keyboard is sending information to the ship or to the radio system would remove the possibility of a mode-switch error. Common options could be:
A large switch close by that changed the color of the lights
A bi-directional light with labels on which mode it’s in
or distinct separation between the Pilot’s keyboard and the Co-pilot’s keyboard
Of the three, a clear distinction between the pilot’s keyboard and the co-pilot’s keyboard would be the most secure, provided that there was a switch in case of emergency.
Colonial One copies many interface patterns from modern airliners. Since the airline industry has some of the best and most sophisticated UI design in practice right now, there are very few obvious recommendations to make, and credit should be given for how realistic it looks.
When she wonders about Chewbacca’s whereabouts, Malla first turns to the Imperial-issue Media Console. The device sits in the living space, and consists of a personal console and a large wall display. The wall display mirrors the CRT on the console. The console has a QWERTY keyboard, four dials, two gauges, a sliding card reader, a few red and green lights on the side, and a row of randomly-blinking white lights along the front.
Public Service Requests
As Malla approaches it, it is displaying an 8-bit kaleidoscope pattern and playing a standard-issue “electronics” sound. Malla presses a handful of buttons—here it’s important to note the difficulty of knowing what is being pressed when the hand we’re watching is covered in a mop—and then moves through a confusing workflow, where…
She presses five buttons
She waits a few seconds
As she is pressing four more buttons…
…the screen displays a 22-character string (a password? A channel designation?) ↑***3- ↓3&39÷ ↑%63&-:::↓
A screen flashes YOU HAVE REACHED TRAFFIC CONTROL in black letters on a yellow background
She presses a few more buttons, and another 23-character string appears on screen ↑***3- XOXOO OXOOX XOOXO-↑ (Note that the first six characters are identical to the first six characters of the prior code. What’s that mean? And what’s with all the Xs and Os? Kisses and hugs? A binary? I checked. It seems meaningless.)
An op-art psychedelic screen of orange waves on black for a few seconds
The Control Room of Jurassic Park has a basic video/audio feed to the Tour Explorers that a controller (or, in this case, John Hammond) can use to talk to the tour participants. He is able to switch to different cameras using the number keys on the keyboard attached to the monitor. The cameras themselves appear to be fixed in place.
We never see the cameras themselves in the Explorers, but we do see Malcolm tap on one of the cameras during the tour while Hammond is watching its feed, so they are visible to the riders.
Hammond occasionally speaks through his audio link, and can hear a constant audio feed from the Explorers. He has some kind of mute button (he makes a couple of disparaging comments that the other characters don’t appear to hear), but the feed from the Explorers is real-time. It isn’t obvious how he switches between the different Explorers’ audio feeds, or whether he hears both Explorers simultaneously.
Forgive me, as I am but a humble interaction designer (i.e., neither a professional visual designer nor video editor) but here’s my shot at a redesigned DuoMento, taking into account everything I’d noted in the review.
There’s only one click for Carl to initiate this test.
To decrease the risk of a false positive, this interface draws from a large category of concrete, visual and visceral concepts to be sent telepathically, and displays them visually.
It contrasts Carl’s brainwave frequencies (smooth and controlled) with Johnny’s (spiky and chaotic).
It reads both the brain of the sender and the receiver for some crude images from their visual cortex. (It would be better at this stage to have the actors wear some glowing attachment near a crown to show how this information was being read.)
These changes are the sort that even in passing would help tell a more convincing narrative by being more believable, and even illustrating how not-psychic Johnny really is.
Carl, a young psychic, has an application at home to practice and hone his mental powers. It’s not named in the film, so I’m going to call it DuoMento. We see DuoMento in use when Carl uses it to try and help Johnny find if he has any latent psychic talent. (Spoiler alert: It doesn’t work.)
DuoMento challenges its users with blind matching tests. For it, the “thought projector” (Carl) sits in a chair at a desk with a keyboard and a desktop monitor before him. The “thought receiver” (Johnny) sits in a chair facing the thought projector, unable to see either the desktop monitor or the large, wall-mounted screen behind him, which duplicates the image from the desktop monitor. To the receiver’s right hand is a small elevated panel of around 20 white push buttons.
For the test, two Hoyle playing cards appear on the screen side-by-side, face down. Carl presses a key on his keyboard, and one card flips over to reveal its face. Carl concentrates on the face-up card, attempting to project the identity of the card to Johnny. Johnny tries his best to receive the thought. It’s intense.
When Johnny feels he has an answer, he says, “I see…Ace of Spades,” and reaches forward and presses a button on the elevated panel. In response, the hidden card flips over as the ace of spades. An overlay appears on top of the two cards indicating if it was a match. Lacking any psychic abilities, Johnny gets a big label reading “NO MATCH,” accompanied by a buzzer sound. Carl resets it to a new card with three clicks on his keyboard.
Not very efficient
Why does it take Carl three clicks to reset the cards? You’d think on such a routine task it would be as simple as pressing [space bar]. Maybe you want to prevent accidental activation, but still that’s a key with a modifier, like shift+[space bar]. Best would be if Carl was also a telekinetic. Then he could just mentally push a switch and get some of that practice in. If that switch offered variable resistance it could increase with each…but I digress since he’s just a telepath.
A semi-questionable display
I get why there’s a side-by-side pair of cards. People are much better at these sorts of comparison tasks when objects are side-by-side. But ultimately, it conveys the wrong thing. Having a face down card that flips over implies that that face-down card is the one that Johnny’s trying to guess. But it’s not. The one that’s already turned over is the one he’s trying to guess. Better would be a graphic that implies he’s filling in the blank.
Better still are two separate screens: One for the projector with a single card displayed, and a second for the receiver with this same graphic prompting him to guess. This would require a little different setup when shooting the scene, with over-the-shoulder shots for each showing the different screen. But audiences are sophisticated enough to get that now. Different screens can show different things.
At first it seems like Johnny’s input panel is insufficient for the task. After all, there are 52 cards in a standard deck of cards and only 20 buttons. But having a set of 13 keys for the card ranks and 4 for the suit is easy enough, reduces the number of keys, and might even let him answer only the part he’s confident in if the image hasn’t quite come through.
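As a quick check of that arithmetic, here is a small Python sketch of a rank-plus-suit keypad. The `candidates` helper and the key labels are illustrative assumptions; the film never shows what the 20 buttons are labeled.

```python
# Hypothetical rank + suit keypad; the key labels are assumptions for illustration.
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Spades", "Hearts", "Diamonds", "Clubs"]
DECK = list(product(RANKS, SUITS))  # all 52 (rank, suit) pairs


def candidates(rank=None, suit=None):
    """Cards still consistent with whatever part of the answer was keyed in."""
    return [
        (r, s)
        for r, s in DECK
        if (rank is None or r == rank) and (suit is None or s == suit)
    ]


keys_needed = len(RANKS) + len(SUITS)
print(keys_needed)                      # 17 keys cover all 52 cards
print(len(candidates("A", "Spades")))   # 1: a full answer names one card
print(len(candidates(rank="A")))        # 4: rank only leaves four cards
print(len(candidates(suit="Spades")))   # 13: suit only leaves thirteen cards
```

Seventeen keys fit comfortably on a 20-button panel, with three left over for something like “clear” or “submit.”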
Does it help test for “sensitivity”?
Psychic powers are real in the world of Starship Troopers, so we’re not going to question that. Instead the question at hand will be: Is this the best test for psychic sensitivity?
I do wonder whether having a lit screen gives the receiver a reflection in the projector’s eyes to detect, even if unconsciously. An eagle-eyed receiver might be able to spot a color, or the difference between a face card and a number card. Better would be some way for the projector to cover his eyes while reading the subject, and dim that screen afterward.
The risk of false positives
More importantly, such a test would want to eliminate the chance that the receiver guessed correctly by chance. The more constrained and familiar the range of options, the more likely they are to get a false positive, which wouldn’t help anything except confidence, and even that would be false. I get that when designing skills-building interfaces, you want to start easy and get progressively more challenging. But it makes more sense to constrain the concepts being projected to things that are more concrete and progress to greater abstraction or more nuance. Start with “fire,” perhaps, and advance to “flicker” or “warmth.” For such thoughts, a video cue of a word randomly selected from that pool of concepts would make the most sense. And for cinematic directness (Starship Troopers was nothing if not direct) you should overlay the word onto the video cue as well.
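To put rough numbers on the false-positive risk, here is a short Python sketch. It assumes each guess is an independent, uniformly random pick from the pool of options, which is a simplification, and the function name is made up for the example.

```python
# Chance of at least one lucky hit, assuming independent uniform guesses.
from fractions import Fraction


def chance_of_lucky_hit(pool_size, trials=1):
    """Probability that pure guessing scores at least one match in `trials` tries."""
    miss = 1 - Fraction(1, pool_size)
    return 1 - miss ** trials


print(chance_of_lucky_hit(52))                      # 1/52 for a single card
print(float(chance_of_lucky_hit(52, trials=20)))    # about 0.32 over 20 cards
print(chance_of_lucky_hit(4))                       # 1/4 if only the suit is guessed
```

Under those assumptions, a non-psychic receiver who sits through 20 cards has roughly a one-in-three chance of scoring at least one “MATCH,” which is exactly the false confidence the test should avoid. A larger, open-ended pool of concepts pushes that figure down.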
The next design challenge then becomes how does the receiver provide to the system what, if anything, they’re receiving. Since the concepts would be open-ended, you need a language-input mechanism: ANSI keyboard for typing, or voice recognition.
Additionally, I’d add a brain-reading interface that was able to read his brain as he was attempting to receive. Then it could detect for the right state of mind, e.g. an alpha state, as well as areas of the brain that are being activated. Cinematically you could show a brain map, indicating the brain state in a range, the areas of the brain being activated. Having the map on hand for Johnny would let him know to relax and get into a receptive state. If Carl had the same map he could help prompt him.
In a movie you’d probably also want a crude image feed being “read” from Johnny’s thoughts. It might charmingly be some dumb, non-fire things, like scenes from his last jump ball game, Carmen’s face and cleavage, and to Carl’s shame, a recollection of the public humiliation suffered recently at his hand.
But if this interface (and telepathy) was real, you wouldn’t want to show that to Johnny, as it might cause distracting feedback loops, and you wouldn’t want to show it to Carl lest he betray when Johnny is getting close, and encourage Johnny’s zeroing in on the concept through subtle social cues instead of the desired psychic ones. Since it’s not real, let’s comp it up next, more cinematically.
When students want to know the results of their tests, they do so by a public interface. A large, tiled screen is mounted to a recessed section of wall in a courtyard. The display is divided into a grid of five columns and three rows. Each cell contains one student’s results for one test, as a percentage. One cell displays an ad for military service. Another provides a reminder for the upcoming sports game. Four keyboards are situated below the screens at waist level.
To find her score, Carmen approaches one of the keyboards and enters some identifying data. In response, the column above that keyboard displays her score and moves the data in the other cells up. There is no way to learn of one’s test scores privately. This hits Johnny particularly hard when he checks his scores to find he has earned 35% on his Math Final, a failing grade.
Worse, his friend Carl is able to walk up to the keyboard and with a few key presses, interrupt every other student looking at the grades, and fill the entire screen with Johnny’s score for all to see, with the failing number blinking red and white, ridiculing him before his peers. After a reprimand from Johnny, Carl returns the display to normal with the press of a button.
Is ANSI the right input?
The keyboard would be a pain to keep clean, and you’d figure that a student ID would be a unique-and-memorable enough token. Does an entire ANSI keyboard need to be there? Wouldn’t a number pad be enough? But why a manual input at all? Nowadays you’d expect some near-field communication, or biometric token, which would obviate the keyboard entirely.
Is publicizing grades OK?
So there are input and interaction improvements to be made, for sure. But there are more important issues to talk about here. Yes, students can accomplish one task with the interface well enough: Checking grades. But what about the giant, public output?
It’s fulfilling one of the dystopian goals of the fascist society in which the story takes place, which is that might makes right. Carl is a bully (even if Johnny’s friend) and in the culture of Starship Troopers, if he wants to increase Johnny’s public humiliation, why not? Johnny needs to study harder, take it on the chin, or make Carl stop. In this regard, the interface satisfies both the students’ task and the culture’s…um…values.
I originally wanted to counter that with a strong statement that, “But that’s not us.” After all, modern federal privacy laws in the United States forbid this public display as a violation of students’ privacy. (See FERPA laws.) But apparently not everyone believes this. A look on debate.org (at the time of writing) shows that opinion is perfectly split on the topic. I could lay out my thoughts on which side is better for learning, but it’s really beyond the scope of this blog to build a case for either side of Lakoff’s Moral Politics.
You’re Doing More Than You Think You’re Doing
But it’s worth noting the scope of these issues at hand. This seems at first to be an interface just about checking grades, but when you look at the ecosystem in which it operates, it actually illustrates and reinforces a culture’s core virtues. The interface is sometimes not just the interface. Its designers are more than flowchart monkeys.
Many characters in Ghost in the Shell have a particular cybernetic augmentation that lets them use specially-designed keyboards for input.
To control this input device, the user’s hands are replaced with cybernetic ones. Normally they look and behave like normal human hands. But when needed, the fingers of these each split into three separate mini-fingers, which can move independently. These 30 spidery fingerlets triple the number of digits at play, dancing across the keyboard at a blinding 24 positions per second.
The keyboards for which these hands were built have eight rows. The five rows nearest the user have single symbols. (QWERTY English?) Three rows farthest from the user have keys labeled with individual words. Six other keys at the top right are unlabeled. Each key glows cyan when pressed and is flush with the board itself. In this sense it works more like a touch panel than a keyboard. The board has around 100 keys in total.
What’s nifty about the keyboard itself is not the number of keys. Modern keyboards have about that many. What’s nifty is that you can see these keyboards are massively chorded, with screen captures from the film showing nine keys being pressed at once.
Let’s compare. (And here I owe a great mathematical debt of thanks to Nate Clinton for his mastery of combinatorics.) The keyboard I’m typing this blog post on has 104 keys, and can handle five keys being pressed at once, i.e., a base key like “S” and up to four modifier keys: shift, control, option, and command. If you do the math, this allows for 1600 different keypresses. That’s quite a large range of momentary inputs.
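One reading of that math that reproduces the 1600 figure is sketched below in Python. The assumption, which the post does not spell out, is that the four modifiers are set aside from the 104 keys and each of them can independently be held or not.

```python
# One way to arrive at 1600; treating the 4 modifiers as separate from the
# base keys is an assumption that happens to reproduce the quoted figure.
TOTAL_KEYS = 104
MODIFIERS = 4  # shift, control, option, command

base_keys = TOTAL_KEYS - MODIFIERS          # 100 keys that can act as the base key
modifier_states = 2 ** MODIFIERS            # each modifier is either held or not: 16
conventional_keypresses = base_keys * modifier_states

print(conventional_keypresses)              # 1600
```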
But on the tera-keyboard you’re able to press nine keys at once, and more importantly, it looks like any key can be chorded with any other key. If we’re conservative in the interpretation and presume that 9 keys must be pressed at once—leaving 6 fingerlets free to move into position for the next bit of input—that still adds up to 2,747,472,247,520 possible keypresses (≈2.7 trillion). That’s about nine orders of magnitude more than our measly 1600. At 24 keypresses per second, that’s a data rate of 6.5939334e+13 per second.
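These figures can be reproduced with Python’s standard library. The sketch assumes the count is “choose 9 keys out of 104,” since that is the key count that yields the number quoted above; the variable names are just for illustration.

```python
# Reproducing the chord count, assuming 9 keys chosen from 104.
import math

chords = math.comb(104, 9)
print(chords)                           # 2747472247520

per_second = chords * 24
print(per_second)                       # 65939333940480, i.e. about 6.59e13

orders_of_magnitude = math.log10(chords / 1600)
print(round(orders_of_magnitude, 1))    # 9.2

bits_per_chord = math.log2(chords)
print(round(bits_per_chord, 1))         # 41.3
```

The last line is an extra way of looking at the same number: picking one chord out of roughly 2.7 trillion possibilities corresponds to about 41 bits of information per chord.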
So, ok, yes, fast, but it only raises the question:
What exactly is being input?
It’s certainly more than just characters. Unicode’s 110,000 characters is a fraction of a fraction of this amount of data, and it covers most of the world’s scripts.
Is it words? Steven Pinker in his book The Language Instinct cites sources estimating the number of words in an educated person’s vocabulary is around 60,000. This excludes proper names, numbers, foreign words, any scientific terms, and acronyms, so it’s pretty conservative. Even if we double it, we’re still around the number of characters in Unicode. So even if the keyboard had one keypress for every word the user could possibly know and be thinking at any particular moment, the typist would only be using a fragment of its capacity.
The only thing that nears this level of data on a human scale is the human brain. With a common estimate of 100 billion neurons, the keyboard could be expressing the state of its user’s brain, 24 times a second, distinguishing between 10 different states of each neuron.
This also bypasses one of the concerns of introducing an input mechanism like this that requires active manipulation: The human brain doesn’t have the mechanisms to manage 30 digits and 9-key-chording at this rate. To get it to where it could manage this kind of task would need fairly massive rewiring of the brain of the user. (And if you could do that, why bother with the computer?)
But if it’s a passive device, simply taking “pictures” of the brain and sharing those pictures with the computer, it doesn’t require that the human be reengineered, just re-equipped. It requires a very smart computer system able to cope with and respond to that kind of input, but we see that exact kind of artificial intelligence elsewhere in the film.
Because of the form factor of hands and keyboard, it looks like a manual input device. But looking at the data throughput, the evidence suggests that it’s actually a brain interface, meant to keep the computer up to date with whatever the user is thinking at that exact moment and responding appropriately. For all the futurism seen in this film, this is perhaps the most futuristic, and perhaps the most surprising.