On Phone Menus and the Blowing of Gaskets

January 2, 2013

(This blog post is re-published from an earlier blog of mine called “avatar puppetry” – the nonverbal internet. I’ll be phasing out that earlier blog, so I’m migrating a few of those earlier posts here before I trash it.)

This blog post is only tangentially related to avatars and body language. But it does relate to the larger subject of communication technology that fails to accommodate normal human behavior and the rules of natural language.

But first, an appetizer. Check out this video for a phone menu for callers to the Tennessee State Mental Hospital:




A Typical Scenario

You’ve probably had this experience. You call a company or service to ask about your bill, or to make a general inquiry. You are dumped into a sea of countless menu options given by a recorded message (I say countless, because you usually don’t know how many options you have to listen to – will it stop at 5? Or will I have to listen to 10?). None of the options apply to you. Or maybe some do. You’re not really sure. You hope – you pray – that you will be given the option to speak to a representative, a living, breathing, thinking, soft and cuddly human. After several agonizing minutes (by now you’ve forgotten most of the long-winded options) you realize that there is no option to speak to a human. Or at least you think there is no option. You’re not really sure.

Your blood pressure has now reached levels that warrant medical attention. If you still have rational neurons firing, you get the notion to press “0”. And the voice says, “Please wait to speak to a phone representative”. You collapse in relief. The voice continues: “this call may be recorded for quality assurance”. Yeah, right. (I think I remember once actually hearing the message say, “this call may be recorded……because…we care”. Okay, now that is gasket-blowing material.)

Why Conversation Matters

I don’t think I need to go into this any further. Just do a search on “phone menu” (or “phone tree”) and “frustration”, or something like that, and follow the scent and you’ll find plenty of blog posts on the subject.

How would I best characterize this problem? I could talk about it from an economic point of view. For instance, it costs a company a lot more to hire real people than to hook up an automated answering service or an interactive voice response (IVR) system. But companies also have to weigh the negative impact of a large percentage of irate customers. And too few companies look at this as a Design problem. Ah, there it is again: that ever-present normalizer and humanizer of technology: Design. It’s invisible when it works well, and that’s why it is such an unsung hero.

The Hyper-Linearity of Non-Interactive Verbal Messages

The nature of this design problem, I believe, is that these phone menus give a large amount of verbal information (words, sentences, options, numbers, etc.) which take time to explain. They are laid out in a sequential order.

There is no way to jump ahead, to interrupt the monolog, or to ask it for clarification, as you would in a normal conversation. You are stuck in time – rigid, linear time, with no escape. (At least that’s what it feels like: there are usually options to hit special keys to go to the previous menu or pop out entirely, etc. But who knows what those keys are? And the dreaded fear of getting disconnected is enough to keep people like me staying within the lines, gritting my teeth, and being obedient, although that means I have the potential to become the McDonald’s gunman who makes the headlines the next morning.)

Compare this with a conversation with a phone representative: normal human dialog involves interruptions, clarifications, repetitions, mirroring (the “mm’s”, “hmm’s”, “ah’s”, “ok’s”, “uh-huh’s”, and such – the audible equivalent of eye-contact and head-nods), and all the affordances that you get from the prosody of speech. Natural conversations continually adapt to the situation. These adaptive, conversational dynamics are absent from the braindead phone robots. And their soft, soothing voices don’t help – in fact they only make me want to kill them that much harder.

There are two solutions:

1. Full-blown Artificial Intelligence, allowing the robot voice to “hear” your concerns and questions, and drill down, with your help, to the crux of the problem. But I’m afraid that AI has a way to go before this is possible. And even if it is almost possible, the good psychologists, interaction designers, and human-user interface experts don’t seem to be running the show. They are outnumbered by the techno-geeks with low EQ, and little understanding of human psychology. Left-brainers gather the power and influence, and run the machines – computer-wise and business-wise – because they are good with the numbers, and rarely blow a gasket. The right-brained skill set ends up stuck on the periphery, almost by its very nature. I’m waiting for this revolution I keep hearing about – the Revenge of the Right Brain. So far, I still keep hitting brick walls built with left-brained mortar. But I digress.

2. Visual interfaces. By having all the options laid out in a visual space, the user’s eyes can jump around (much more quickly than a robot can utter the options). Thus, if the layout is designed well (a rarity in the internet junkyard) the user can quickly see, “Ah, I have five options. Maybe I want to choose option 4 – I will select ‘more information about option 4’ to make sure”. All of this can happen within a matter of seconds. You could almost say that the interface affords a kind of body language that the user reads and acts upon immediately.

Consider the illustration below for a company’s phone tree which I found on the internet (I blacked-out the company name and phone number). Wouldn’t it be nice if you could just take a glance at this visual diagram and jump to the choice you want? If you’re like me, your eyes will jump straight to the bottom where the choice to speak to a representative is. (Of course it’s at the bottom).

This picture says it all. But of course. We each have two eyes, each with millions of photoreceptors: simultaneity, parallelism, instant grok. But since I’m talking about telephones, the solution has to be found within the modality of audio alone, trapped in time. And in that case, there is no other solution than an advanced AI program that can understand your question, read your prosodic body language, and respond to the flow of the conversation, thus collapsing time.

…and since that’s not coming for a while, there’s another choice: a meat puppet – one of those very expensive communication units that burn calories, and require a salary. What a nuisance.


Uncanny Charlie

August 18, 2012

The subway system in Boston has a mascot named “Charlie”, a cartoon character who rides the train and reminds people to use the “Charlie Card”. With the exception of his face, he looks like a normal airbrushed graphic of a guy with a hat. But his face? Uh, it’s f’d up.

In case you don’t know yet about the Uncanny Valley, it refers to a graph devised by a Japanese robot maker. The graph shows typical reactions to human likeness in robots and other simulations. The more realistic the robot (or computer generated character) the more CREEPY it becomes….

..until it is so utterly realistic that you are fooled, and you respond to it as if it were a living human. But watch out. If the eyes do something wacky or scary, or if something else reveals the fact that it is just an animated corpse…DOWN you fall…. into the valley.

Anyway, I have a theory about the uncanny valley: it is just a specific example of a more general phenomenon that occurs when incompatible levels of realism are juxtaposed in a single viewing experience. So for instance, an animated film in which the character motions are realistic – but their faces are abstract – can be creepy. How about a computer animation in which the rendering is super-realistic, but the motions are stiff and artificial? Creepola. A cartoon character where one aspect is stylized and other aspects are realistic looks…not right. That’s Charlie’s issue.

Stylized faces are everywhere:

But when an artist takes a stylized line-drawn graphic of a face and renders it with shading, I consider this to be a visual language blunder. The exception to this rule of thumb is demonstrated by artists who purposefully juxtapose styles and levels of realism, for artistic impact, such as the post-modern painter David Salle.

The subject of levels of realism and accessibility in graphic design is covered in McCloud’s Understanding Comics. The image-reading eyebrain can adjust its zone of suspension of disbelief to accommodate a particular level of stylism/realism. But in general, it cannot easily handle having that zone bifurcated.

Charlie either needs a face transplant to match his jacket and hat, or else he needs to start wearing f’d-up clothes to match his f’d-up face.


Nano Avatars

June 8, 2012

(This blog post is re-published from an earlier blog of mine called “avatar puppetry” – the nonverbal internet. I’ll be phasing out that earlier blog, so I’m migrating a few of those earlier posts here before I trash it.)

———————–

The other day, Jeremy Owen Turner told me about NanoArt. Here’s a cool nano art piece by Yong Qing Fu, described in Chemistry World.


We started imagining a nano virtual world. Jeremy pontificates on avatars as works of art, avatars that can take on alternate forms, including nano art. I started thinking about what an avatar that consisted of a molecule might be like.

Some illustrations of the hemoglobin molecule look a bit like the flying spaghetti monster. Which reminds me, Cory Linden’s avatar in Second Life is based on the flying spaghetti monster.


We’ve seen avatars hanging out among virtual molecules, but what about avatars that ARE molecules? Stephanie H. Chanteau and James M. Tour of Rice University created anthropomorphic molecules.


But I’m not so interested in how people make anthropomorphic molecules. I’m interested in avatars that live a molecule’s life. Check this out…

A scanning tunneling microscope (STM) is set up in a magnificent auditorium.


The microscope’s subject matter is projected onto a giant video screen. An audience of thousands watches as a team of five molecule-avatar controllers sit with computer mice and keyboards and mingle in a virtual world that is actually not virtual. In the middle of all the flamboyant machinery is a tiny nano-stage, a performance dance floor where five molecules show something rather strange and new.

Since the STM can be used for atom manipulation as well as visioning (a consequence of the Observer Effect), the very technology for seeing the avatars is used to control them.

The audience collectively winces as the avatars try to, um, walk. Okay, maybe walking isn’t the right word. What exactly do these avatars do? They combine to form supermolecules. They jump and twitch. They split and reform. They blink and chirp. They fall off the edge of the stage and accidentally get stuck on carbon atoms. It may not be elegant. But hey, it would be so cool to watch.

When the performance is done, the avatars take a bow…or something. The audience applauds with a standing ovation. A new genre is born. Constraints define creative boundaries and therefore creativity. And the limited repertoire of molecular interactions defines the social vocabulary of these agents. Kind of reminds me of Flatland.

Avatars are embodiments of humans (or human intention) in virtual worlds.

“Seeing” a molecule is a problematic term, in the same sense that “seeing” a planet in a distant star system is a problematic term. It’s not “seeing” on a human scale. It’s prosthetic seeing. And so, just like a software-based virtual world, there must be a renderer.


Our most distant ancestor is a molecule that accidentally replicated and thus started the upward avalanche that is called Evolution. Dennett’s intentional stance can be applied on all levels of the biosphere. Molecular avatars represent the most basic and primitive expression of agentry. And unlike the constraints of C++, Havok, and OpenGL, in virtual world software programs, the constraints in this molecular world are real.

It may yield some insights about the fundamentals of interaction.


Just Because It’s Visual Doesn’t Mean It’s Better

May 24, 2012

I’ve been renting a lot of cars lately because my own car died. And so I get to see a lot of the interiors of American cars. Car design is generally more user-friendly than computer interfaces – for the simple reason that when you make a mistake on a computer interface and the computer crashes, you will not die.

As cars become increasingly computerized, the “body language” starts to get wonky, even in aspects that are purely mechanical.

In a car I recently rented, I was looking for the emergency brake. The body language of most of the cars I’ve used offers an emergency brake just to the right of my seat in the form of a lever that I pull up. Body language between human bodies is mostly unconscious. If a human-manufactured tool is designed well, its body language is also mostly unconscious: it is natural. Anyway…I could not find an emergency brake in the usual place in this particular car. So I looked in the next logical place: near the floor to the left of the foot pedals. There I saw the following THING:

I wanted to check to make sure it was the brake, so that I wouldn’t inadvertently pop open the hood or the cap of the gas tank. So I peered more closely at the symbol on this particular THING, and I asked myself the following question:

What the F?

Once I realized that this was indeed the emergency brake, I decided that a simple word would have sufficed.

In some cars, the “required action” is written on the brake:


Illiterate Icon Artists

I was reminded of an episode in one of the companies I was working for, where an “icon artist” was hired to build the visual symbols for several buttons on a computer interface. He had devised a series of icons that were meant to provide visual language counterparts to basic actions that we typically do on computer interfaces. He came up with novel and aesthetic symbols. But….UN-READABLE.

I suggested he just put the words on the icons, because the majority of computer users know English, and if they don’t know English, they could always open up a dictionary. Basically, this guy’s clever icons had no counterpart to the rest of the world. They were his own invention – they were UNDISCOVERABLE.

Moral of the story:

Designed body language should correspond to “natural affordances”: the expectations and readability of the natural world. If that is not possible, use historical conventions (by now there is plenty of reference material on visual symbols, and I suspect there are ways to check for the relative “universality” of certain symbols).

In both cases, whether using words or visuals, literacy is needed.

Put in another way:

It is impossible to invent a visual language from scratch, because the only one who can visually “read” it is the creator. If it does not commute, it is not language. This applies to visual icons as much as it does to words.

As technology becomes more and more computerized (like cars) we have less and less opportunity to take advantage of natural affordances. Eventually, it will be possible to set the emergency brake by touching a tiny red button, or by uttering a message into a microphone. Thankfully, emergency brakes are still very physical, and I get to FEEL the pressure of that brake as I push it in, or pop it off….

that is…if I can ever find the damn thing.


Screensharing: Don’t Look at Me

January 11, 2012

Imagine discussing a project you are doing with a small group: a web site, a drawing, a contraption you are building; whatever. You would not expect the people to be looking at your face the whole time. Much of the time you will all be gazing around at different parts of the project. You may be pointing your fingers around, using terms like “this”, “that”, “here” and “there”.

When people have their focus on something separate from their own bodies, that thing becomes an extension of their bodies. Bodymind is not bound by skin. And collaborating, communicating bodyminds meld on an object of common interest.

TeleKinesics

The internet is dispersing our workspaces globally, and the same is happening to our bodies.

The anthropologist Ray Birdwhistell coined the term “kinesics”, referring to the interpretation, science, or study of body language.

I invented a word: “telekinesics”. I define it as, “the science of body language as conducted over remote distances via some medium, including the internet” (ref)

My primary interest is the creation of body language using remote manifestations of ourselves, such as with avatars and other visual-interactive forms. I don’t consider video conferencing as a form of virtual body language, because it is essentially a re-creation of one’s literal appearances and sounds. It is an extension of telephony.

But it is virtual in one sense: it is remote from your real body.

Video conferencing and applications like Skype are extremely useful. I use Skype all the time to chat with friends or colleagues. Seeing my collaborator’s face helps tremendously to fill in the missing nonverbal signals in telephony. But if the subject of conversation is a project we are working on, then “face-time” is not helpful. We need to enter into, and embody, the space of our collaboration.

Screen Sharing

This is why screen sharing is so useful. Screen sharing happens when you flip a switch on your Skype (or whatever) application that changes the output signal from your camera to your computer screen. Your mouse cursor becomes a tiny Vanna White – annotating, referencing, directing people’s gazes.

Michael Braun, in the blog post: Screen Sharing for Face Time, says that seeing your chat partner is not always helpful, while screen sharing “has been shown to increase productivity. When remote participants had access to a shared workspace (for example, seeing the same spreadsheet or computer program), then their productivity improved. This is not especially surprising to anyone who has tried to give someone computer help over the phone. Not being able to see that person’s screen can be maddening, because the person needing help has to describe everything and the person giving help has to reconstruct the problem in her mind.”

Many software applications include cute features like collaborative drawing spaces, intended for co-collaborators to co-create, co-communicate, and to co-mess up each other’s co-work. The interaction design (from what I’ve seen) is generally awkward. But more to the point: we don’t yet have a good sense of how people can and should interact in such collaborative virtual spaces. The technology is still frothing like tadpole eggs.

Some proponents of gestural theory believe that one reason speech emerged out of gestural communication was because it freed up the “talking hands” so that they could do physical work – so our mouths started to do the talking. Result: we can put our hands to work, look at our work, and talk about it, all at the same time.

Screen sharing may be a natural evolutionary trend – a continuing thread to this ancient activity – as manifested in the virtual world of internet communications.

 

 

