UX of Speculative Brain-Computer Inputs

So much of the technology in Black Panther appears to work by mental command (so far: Panther Suit 2.0, the Royal Talon, and the vibranium sand tables) that…

  • before we get into the Kimoyo beads, or the Cape Shields, or the remote driving systems…
  • before I have to dismiss these interactions as “a wizard did it” style non-designs
  • before I review other brain-computer interfaces in other shows…

…I wanted to check on the state of the art of brain-computer interfaces (or BCIs) and see how our understanding had advanced since I wrote the Brain interface chapter in the book, back in the halcyon days of 2012.

Note that I am deliberately avoiding the tech side of this question. I’m not going to talk about EEG, PET, MRI, and fMRI. (Though they’re linked in case you want to learn more.) Modern BCI technologies are evolving too rapidly to bother with an overview of them. They’ll change in the real world by the time I press “publish,” much less by the time you read this. And sci-fi tech is most often a black box anyway. But the human part of the human-computer interaction model changes much more slowly. We can look to the brain as a relatively unalterable component of the BCI question, leading us to two believability questions of sci-fi BCI.

  1. How can people express intent using their brains?
  2. How do we prevent accidental activation using BCI?

Let’s discuss each.

1. How can people express intent using their brains?

In the see-think-do loop of human-computer interaction…

  • See (perceive) has been a subject of visual, industrial, and auditory design.
  • Think has been a matter of human cognition as informed by system interaction and content design.
  • Do has long been a matter of some muscular movement that the system can detect, to start its matching input-process-output loop. Tap a button. Move a mouse. Touch a screen. Focus on something with your eyes. Hold your breath. These are all ways of “doing” with muscles.

The “bowtie” diagram I developed for my book on agentive tech.

But the first promise of BCI is to let that doing part happen with your brain. The brain isn’t a muscle, so what actions are BCI users able to take in their heads to signal to a BCI system what they want it to do? The answer to this question is partly physiological, about the way the brain changes as it goes about its thinking business.

Ah, the 1800s. Such good art. Such bad science.

Our brains are a dense network of bioelectric signals, chemicals, and blood flow. But it’s not chaos. It’s organized. It’s locally functionalized, meaning that certain parts of the brain are predictably activated when we think about certain things. But it’s not like the Christmas lights in Stranger Things, with one part lighting up discretely at a time. It’s more like an animated proportional symbol map, with lots of places lighting up at the same time to different degrees.

Illustrative composite of a gif and an online map demo.

The sizes and shapes of what’s lighting up may change slightly between people, but a basic map of healthy, undamaged brains will be similar to each other. Lots of work has gone on to map these functional areas, with researchers showing subjects lots of stimuli and noting what areas of the brain light up. Test enough of these subjects and you can build a pretty good functional map of concepts. Thereafter, you can take a “picture” of the brain, and you can cross-reference your maps to reverse-engineer what is being thought.
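To make the cross-referencing idea concrete, here is a toy sketch. It assumes a brain “picture” can be boiled down to a short list of activation strengths, one per region, and that each concept has a stored map in the same form. The concept names, the four-region layout, and every number are made up for illustration. Real decoding is far messier than this.

```python
import math

# Hypothetical functional maps: one activation strength per brain region,
# per concept. Four regions and all values are invented for illustration.
CONCEPT_MAPS = {
    "light":  [0.9, 0.1, 0.3, 0.0],
    "table":  [0.2, 0.8, 0.1, 0.4],
    "family": [0.1, 0.2, 0.9, 0.6],
}

def cosine_similarity(a, b):
    """Similarity of two activation patterns, ignoring overall intensity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def decode(snapshot, maps=CONCEPT_MAPS):
    """Rank concepts by how closely the snapshot resembles each stored map."""
    scores = {concept: cosine_similarity(snapshot, m) for concept, m in maps.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# A noisy "picture" of someone thinking about light.
snapshot = [0.8, 0.2, 0.25, 0.05]
ranking = decode(snapshot)
print(ranking[0][0])  # best-matching concept
```

Note that the result is a ranking, not a single answer. That matches the proportional-symbol-map picture: lots of places light up at once, to different degrees.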

From Jack Gallant’s semantic maps viewer.

Right now those pictures are pretty crude and slow, but so were the first actual photographs in the world. In 20–50 years, we may be able to wear baseball caps that provide much higher-resolution, real-time input of the concepts being thought. In the far future (or, say, the alternate history of the MCU) it is conceivable to read these things from a distance. (Though there are significant ethical questions involved in such a technology, this post is focused on questions of viability and interaction.)

From Jack Gallant’s semantic map viewer

Similarly, the brain maps we have cover only a small percentage of an average adult vocabulary. Jack Gallant’s semantic map viewer (pictured and linked above) shows the maps for about 140 concepts, and estimates of average active vocabulary are around 20,000 words, so we’re looking at less than a tenth of a tenth of what we can imagine (not even counting the infinite composability of language). But in the future we will not only have more concepts mapped, more confidently, but we will also have idiographs for each individual, like the personal dictionary in your smart phone.

All this is to say that our extant real world technology confirms that thoughts are a believable input for a system. This includes linguistic inputs like “Turn on the light” and “activate the vibranium sand table” and “Sincerely, Chris” and even imagining the desired change, like a light changing from dark to light. It might even include subconscious thoughts that have yet to be formed into words.

2. How do we prevent accidental activation?

But we know from personal experience that we don’t want all our thoughts to be acted on. Take, for example, those thoughts you have when you’re feeling hangry, or snarky, or dealing with a jerk-in-authority. Or those texts and emails that you’ve composed in the heat of the moment but wisely deleted before they got you in trouble.

If a speculative BCI is being read by a general artificial intelligence, it can manage that just like a smart human partner would.

He is composing a blog post, reasons the AGI, so I will just disregard his thought that he needs to pee.

And if there’s any doubt, an AGI can ask. “Did you intend me to include the bit about pee in the post?” Me: “Certainly not. Also BRB.” (Readers following the Black Panther reviews will note that AGI is available to Wakandans in the form of Griot.)

If AGI is unavailable to the diegesis (and it would significantly change any diegesis of which it is a part) then we need some way to indicate when a thought is intended as input and when it isn’t. Having that be some mode of thought feels complicated and error-prone, like when programmers have to write regular expressions that escape escape characters. Better, I think, is to use some secondary channel, like a bodily interaction. Touch forefinger and pinky together, for instance, and the computer understands you intend your thoughts as input.
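As a minimal sketch of that secondary channel, assume the system receives a stream of decoded thoughts, each stamped with whether the finger gesture was being held at the time. The `Reading` type and its field names are hypothetical, and the thought decoder itself is taken as given.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    thought: str        # decoded thought (the decoder is assumed to exist)
    gesture_held: bool  # True while forefinger and pinky are touching

def intended_inputs(readings):
    """Pass along only the thoughts that arrive while the gesture is held."""
    return [r.thought for r in readings if r.gesture_held]

stream = [
    Reading("I need to pee", False),
    Reading("turn on the light", True),
    Reading("that guy is a jerk", False),
    Reading("Sincerely, Chris", True),
]
commands = intended_inputs(stream)
print(commands)
```

The thinking itself stays unconstrained. Deliberateness lives entirely in the body channel, which is why this feels less error-prone than a special “mode of thought.”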

So, for any BCI that appears in sci-fi, we would want to look for the presence or absence of AGI as a reasonableness interpreter, and, barring that, for some alternate-channel mechanism for indicating deliberateness. We would also hope to see some feedback and correction loops to understand the nuances of the edge-case interactions, but these are rare in sci-fi.

Even more future-full

This all points to the question of what seeing/perceiving via a BCI might be. A simple example might be a disembodied voice that only the user can hear.

A woman walks alone at night. Lost in thought, she hears her AI whisper to her thoughts, “Ada, be aware that a man has just left a shadowy doorstep and is following, half a block behind you. Shall I initialize your shock shoes?”

What other than language can be written to the brain in the far future? Images? Movies? Ideas? A suspicion? A compulsion? A hunch? How will people know what are their own thoughts and what has been placed there from the outside? I look forward to the stories and shows that illustrate new ideas, and warn us of the dark pitfalls.

Internet 2021

The opening shot of Johnny Mnemonic is a brightly coloured 3D graphical environment. It looks like an abstract cityscape, with buildings arranged in a rectangular grid and various 3D icons or avatars flying around. Text identifies this as the Internet of 2021, now cyberspace.

Internet 2021 display

Strictly speaking this shot is not an interface. It is a visualization from the point of view of a calendar wake up reminder, which flies through cyberspace, then down a cable, to appear on a wall mounted screen in Johnny’s hotel suite. However, we will see later on that this is exactly the same graphical representation used by humans. As the very first scene of the film, it is important in establishing what the Internet looks like in this future world. It’s therefore worth discussing the “look” employed here, even though there isn’t any interaction.

Cyberspace is usually equated with 3D graphics and virtual reality in particular. Yet when you look into what is necessary to implement cyberspace, the graphics really aren’t that important.

MUDs and MOOs: ASCII Cyberspace

People have been building cyberspaces since the 1980s in the form of MUDs and MOOs. At first sight these look like old style games such as Adventure or Zork. To explore a MUD/MOO, you log on remotely using a terminal program. Every command and response is pure text, so typing “go north” might result in “You are in a church.” The difference between MUD/MOOs and Zork is that these are dynamic multiuser virtual worlds, not solitary-player games. Other people share the world with you and move through it, adventuring, building, or just chatting. Everyone has an avatar and every place has an appearance, but expressed in text as if you were reading a book.

guest>>@go #1914
Castle entrance
A cold and dark gatehouse, with moss-covered crumbling walls. A passage gives entry to the forbidding depths of Castle Aargh. You hear a strange bubbling sound and an occasional chuckle.

Obvious exits:
path to Castle Aargh (#1871)
enter to Bridge (#1916)

Most impressive of all, these are virtual worlds with built-in editing capabilities. All the “graphics” are plain text, and all the interactions, rules, and behaviours are programmed in a scripting language. The command line interface allows the equivalent of Emacs or VI to run, so the world and everything in it can be modified in real time by the participants. You don’t even have to restart the program. Here a character creates a new location within a MOO, to the “south” of the existing Town Square:

laranzu>>@dig MyNewHome
laranzu>>@describe here as "A large and spacious cave full of computers"
laranzu>>@dig north to Town Square
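To show why a world like this can be edited without a restart, here is a rough Python analogy. This is not MOO code and does not reproduce any real MUD/MOO command set. The `World` class and its `dig`, `link`, and `go` methods are hypothetical names for illustration. The point is only that rooms and exits are ordinary data that participants can add to while the world is running.

```python
class World:
    """A toy text world: rooms are plain data, so they can change at run time."""

    def __init__(self):
        # room name -> {"description": str, "exits": {direction: room name}}
        self.rooms = {}

    def dig(self, name, description=""):
        """Create a new room while the world is live."""
        self.rooms[name] = {"description": description, "exits": {}}

    def link(self, origin, direction, destination):
        """Add a one-way exit from origin to destination."""
        self.rooms[origin]["exits"][direction] = destination

    def go(self, current, direction):
        """Return (new location, text shown to the player)."""
        exits = self.rooms[current]["exits"]
        if direction not in exits:
            return current, "You can't go that way."
        destination = exits[direction]
        return destination, self.rooms[destination]["description"]

world = World()
world.dig("Town Square", "A busy square.")

# Later, with the world still running, a participant adds a home to the south.
world.dig("MyNewHome", "A large and spacious cave full of computers")
world.link("Town Square", "south", "MyNewHome")
world.link("MyNewHome", "north", "Town Square")

location, text = world.go("Town Square", "south")
print(location)
print(text)
```

Nothing here involves graphics. The “appearance” of a place is just its description string.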

The simplicity of the text interfaces leads people to think these are simple systems. They’re not. These cyberspaces have many of the legal complexities found in the real world. Can individuals be excluded from particular places? What can be done about abusive speech? How offensive can your public appearance be? Who is allowed to create new buildings, or modify existing ones? Is attacking an avatar a crime? Many 3D virtual reality system builders never progress that far, stopping when the graphics look good and the program rarely crashes. If you’re interested in cyberspace interface design, a long running textual cyberspace such as LambdaMOO or DragonMUD holds a wealth of experience about how to deal with all these messy human issues.

So why all the graphics?

So it turns out MUDs and MOOs are a rich, sprawling, complex cyberspace in text. Why then, in 1995, did we expect cyberspace to require 3D graphics anyway?

The 1980s saw two dimensional graphical user interfaces become well known with the Macintosh, and by the 1990s they were everywhere. The 1990s also saw high end 3D graphics systems becoming more common, the most prominent being from Silicon Graphics. It was clear that as prices came down personal computers would soon have similar capabilities.

At the time of Johnny Mnemonic, the world wide web had brought the Internet into everyday life. If web browsers with 2D GUIs were superior to the command line interfaces of telnet, FTP, and Gopher, surely a 3D cyberspace would be even better? Predictions of a 3D Internet were common in books such as Virtual Reality by Howard Rheingold and magazines such as Wired at the time. VRML, the Virtual Reality Markup/Modeling Language, was created in 1995 with the expectation that it would become the foundation for cyberspace, just as HTML had been the foundation of the world wide web.

Twenty years later, we know this didn’t happen. The solution to the unthinkable complexity of cyberspace was a return to the command line interface in the form of a Google search box.

Abstract or symbolic interfaces such as text command lines may look more intimidating or complicated than graphical systems. But if the graphical interface isn’t powerful enough to meet their needs, users will take the time to learn how the more complicated system works. And we’ll see later on that the cyberspace of Johnny Mnemonic is not purely graphical and does allow symbolic interaction.

The secret of the tera-keyboard


Many characters in Ghost in the Shell have a particular cybernetic augmentation that lets them use specially-designed keyboards for input.


To control this input device, the user’s hands are replaced with cybernetic ones. Normally they look and behave like normal human hands. But when needed, each finger splits into three separate mini-fingers, which can move independently. These 30 spidery fingerlets triple the number of digits at play, dancing across the keyboard at a blinding 24 positions per second.


The tera-keyboard

The keyboards for which these hands were built have eight rows. The five rows nearest the user have single symbols. (QWERTY English?) Three rows farthest from the user have keys labeled with individual words. Six other keys at the top right are unlabeled. Each key glows cyan when pressed and is flush with the board itself. In this sense it works more like a touch panel than a keyboard. The board has around 100 keys in total.


What’s nifty about the keyboard itself is not the number of keys. Modern keyboards have about that many. What’s nifty is that you can see these keyboards are massively chorded, with screen captures from the film showing nine keys being pressed at once.


Let’s compare. (And here I owe a great mathematical debt of thanks to Nate Clinton for his mastery of combinatorics.) The keyboard I’m typing this blog post on has 104 keys, and can handle five keys being pressed at once, i.e., a base key like “S” and up to four modifier keys: shift, control, option, and command. If you do the math (100 base keys times the 16 possible combinations of four modifiers), this allows for 1600 different keypresses. That’s quite a large range of momentary inputs.

But on the tera-keyboard you’re able to press nine keys at once, and more importantly, it looks like any key can be chorded with any other key. If we’re conservative in the interpretation and presume that 9 keys must be pressed at once (leaving 6 fingerlets free to move into position for the next bit of input), that still adds up to 2,747,472,247,520 possible keypresses (≈2.7 trillion), which is the number of ways to choose 9 keys from 104. That’s about nine orders of magnitude more than our measly 1600. At 24 keypresses per second, that’s a data rate of roughly 6.6 × 10^13 possible keypresses per second.
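For anyone who wants to check the arithmetic, here is a short sketch. One assumption to flag: the 2.7 trillion figure only comes out if the chord is chosen from 104 keys (the count on my own keyboard) rather than the roughly 100 visible on the tera-keyboard, so the code uses 104.

```python
import math

# Conventional keyboard: 104 keys, 4 of them modifiers, leaving 100 base keys.
base_keys = 104 - 4
modifier_states = 2 ** 4  # each of the 4 modifiers is either up or down
conventional = base_keys * modifier_states

# Tera-keyboard: any 9 keys pressed at once.
# ASSUMPTION: 104 keys, the count that reproduces the figure in the text.
chords = math.comb(104, 9)

# Distinct chord possibilities per second at 24 keypresses per second.
per_second = chords * 24

print(conventional)                                   # 1600
print(chords)                                         # 2747472247520
print(per_second)                                     # 65939333940480
print(round(math.log10(chords / conventional)))       # 9 (orders of magnitude)
```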


So, ok, yes, fast, but it only raises the question:

What exactly is being input?

It’s certainly more than just characters. Unicode’s 110,000 characters are a fraction of a fraction of this amount of data, and they cover most of the world’s scripts.

Is it words? Steven Pinker in his book The Language Instinct cites sources estimating that the number of words in an educated person’s vocabulary is around 60,000. This excludes proper names, numbers, foreign words, any scientific terms, and acronyms, so it’s pretty conservative. Even if we double it, we’re still around the number of characters in Unicode. So even if the keyboard had one keypress for every word the user could possibly know and be thinking at any particular moment, the typist would only be using a fragment of its capacity.


The only thing that nears this level of data on a human scale is the human brain. With a common estimate of 100 billion neurons, the keyboard could be expressing the state of its user’s brain, 24 times a second, distinguishing between 10 different states of each neuron.

This also bypasses one of the concerns of introducing an input mechanism like this that requires active manipulation: The human brain doesn’t have the mechanisms to manage 30 digits and 9-key-chording at this rate. To get it to where it could manage this kind of task would need fairly massive rewiring of the brain of the user. (And if you could do that, why bother with the computer?)

But if it’s a passive device, simply taking “pictures” of the brain and sharing those pictures with the computer, it doesn’t require that the human be reengineered, just re-equipped. It requires a very smart computer system able to cope with and respond to that kind of input, but we see that exact kind of artificial intelligence elsewhere in the film.

The “secret”

Because of the form factor of hands and keyboard, it looks like a manual input device. But looking at the data throughput, the evidence suggests that it’s actually a brain interface, meant to keep the computer up to date with whatever the user is thinking at that exact moment and responding appropriately. For all the futurism seen in this film, this is perhaps the most futuristic, and perhaps the most surprising.