High Fidelity: Body Language through “Telekinesics”

June 2, 2013

Human communication demonstrates the usual punctuated equilibria of any natural evolutionary system. From hand gestures to grunts to telephones to email and beyond, human communication has not only evolved, but splintered off into many modalities and degrees of asynchrony.

I recently had the great fortune to join a company that is working on the next great surge in human communication: High Fidelity, Inc. This company is bringing together several new technologies to make this happen.

So, what is the newest evolutionary surge in human communication? I would describe it using a term from Virtual Body Language (page 22):

Telekinesics is a word invented to denote… “the study of all emerging nonverbal practices across the internet”, by adding the prefix tele to Birdwhistell’s term, kinesics. It could easily be confused with “telekinesis”: the ability to cause movement at a distance through the mind alone (the words differ by only one letter). But hey, these two phenomena are not so different anyway, so a slip of the tongue wouldn’t be such a bad thing. Telekinesics may be defined as “the science of body language as conducted over remote distances via some medium, including the internet”.

And now it’s not just science, but practice: body language is ready to go online…in realtime.

And when I say “realtime” – I mean, pretty damn fast, compared to most things that zip (or try to zip) across the internet. And when we’re talking about subtle head nods, changes in eye contact, fluctuations in your voice, and shoulder shrugs, fast is not just a nicety, it is a necessity – for clear communication using a body.

Here’s Ryan Downe showing an early stage of avatar head movement using Google Glass.

Philip Rosedale, the founder of High Fidelity, often talks about how cool it would be for my avatar to walk up to your avatar and give it a little shoulder-shove, or a fist-bump, or an elbow-nudge, or a hug…and for your avatar to respond with a slight – but noticeable – movement.

It would appear that human touch (or at least the visual/audible representation of human touch) is on the verge of becoming a reality – through telekinesics. Of all the modalities and senses that we use to communicate, touch is the most primal: we share it with the oldest microorganisms.

When touch is manifest on the internet, along with highly crafted virtual environments, maybe, just maybe, we will have reached that stage in human evolution when we can have a meaningful, intimate exchange – even if one person is in Shanghai and the other is in Chicago.

And that means people can stop flying around the world and burning fossil fuels in order to have 2-hour-long business meetings. And that means reducing our carbon footprint. And that means we might have a better chance of not pissing off Mother Earth to the degree that she has a spontaneous fever and shrugs us off like pesky fleas.

Which would really suck.

So…keep an eye on what we’re doing at High Fidelity, and get ready for the next evolutionary step in human communication. It just might be necessary for our survival.

On Phone Menus and the Blowing of Gaskets

January 2, 2013

(This blog post is re-published from an earlier blog of mine called “avatar puppetry” – the nonverbal internet. I’ll be phasing out that earlier blog, so I’m migrating a few of those earlier posts here before I trash it.)

This blog post is only tangentially related to avatars and body language. But it does relate to the larger subject of communication technology that fails to accommodate normal human behavior and the rules of natural language.

But first, an appetizer. Check out this video for a phone menu for callers to the Tennessee State Mental Hospital:


A Typical Scenario

You’ve probably had this experience. You call a company or service to ask about your bill, or to make a general inquiry. You are dumped into a sea of countless menu options given by a recorded message (I say countless, because you usually don’t know how many options you have to listen to – will it stop at 5? Or will I have to listen to 10?). None of the options apply to you. Or maybe some do. You’re not really sure. You hope – you pray – that you will be given the option to speak to a representative, a living, breathing, thinking, soft and cuddly human. After several agonizing minutes (by now you’ve forgotten most of the long-winded options) you realize that there is no option to speak to a human. Or at least you think there is no option. You’re not really sure.

Your blood pressure has now reached levels that warrant medical attention. If you still have rational neurons firing, you get the notion to press “0”. And the voice says, “Please wait to speak to a phone representative”. You collapse in relief. The voice continues: “this call may be recorded for quality assurance”. Yeah, right. (I think I remember once actually hearing the message say, “this call may be recorded……because…we care”. Okay, now that is gasket-blowing material.)

Why Conversation Matters

I don’t think I need to go into this any further. Just do a search on “phone menu” (or “phone tree”) and “frustration”, or something like that, and follow the scent and you’ll find plenty of blog posts on the subject.

How would I best characterize this problem? I could talk about it from an economic point of view. For instance, it costs a company a lot more to hire real people than to hook up an automated answering service or an interactive voice response (IVR) system. But companies also have to weigh the negative impact of a large percentage of irate customers. And too few companies look at this as a Design problem. Ah, there it is again: that ever-present normalizer and humanizer of technology: Design. It’s invisible when it works well, and that’s why it is such an unsung hero.

The Hyper-Linearity of Non-Interactive Verbal Messages

The nature of this design problem, I believe, is that these phone menus give a large amount of verbal information (words, sentences, options, numbers, etc.) which take time to explain. They are laid out in a sequential order.

There is no way to jump ahead, to interrupt the monolog, or to ask it for clarification, as you would in a normal conversation. You are stuck in time – rigid, linear time, with no escape. (At least that’s what it feels like: there are usually options to hit special keys to go to the previous menu or pop out entirely, etc. But who knows what those keys are? And the dreaded fear of getting disconnected is enough to keep people like me staying within the lines, gritting my teeth, and being obedient, although that means I have the potential to become the McDonald’s gunman who makes the headlines the next morning.)
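To put a rough number on that linearity, here is a minimal sketch. The option names and reading times are invented for illustration (they don't come from any real phone system), and the `MenuOption` and `wait_until_heard` names are hypothetical. All it shows is that the wait for a given option is the sum of every option read out before it, plus the option itself.

```python
from dataclasses import dataclass


@dataclass
class MenuOption:
    key: str        # the digit the caller would press
    label: str      # what the recording says
    seconds: float  # assumed time the recording takes to read this option


def wait_until_heard(menu, key):
    """Seconds a caller listens before the option with `key` has been read out."""
    elapsed = 0.0
    for option in menu:
        elapsed += option.seconds  # every earlier option must be sat through
        if option.key == key:
            return elapsed
    raise KeyError(key)


# Hypothetical menu; the human is, of course, at the bottom.
menu = [
    MenuOption("1", "Billing", 6.0),
    MenuOption("2", "Technical support", 7.0),
    MenuOption("3", "Store hours", 5.0),
    MenuOption("4", "Account changes", 6.0),
    MenuOption("0", "Speak to a representative", 8.0),
]

print(wait_until_heard(menu, "0"))  # 32.0
```

With these made-up numbers, the caller who only wants a human listens for 32 seconds, and that is for a single menu level. Nested menus stack the waits on top of each other.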

Compare this with a conversation with a phone representative: normal human dialog involves interruptions, clarifications, repetitions, mirroring (the “mm’s”, “hmm’s”, “ah’s”, “ok’s”, “uh-huh’s”, and such – the audible equivalent of eye-contact and head-nods), and all the affordances that you get from the prosody of speech. Natural conversations continually adapt to the situation. These adaptive, conversational dynamics are absent from the braindead phone robots. And their soft, soothing voices don’t help – in fact they only make me want to kill them that much harder.

There are two solutions:

1. Full-blown Artificial Intelligence, allowing the robot voice to “hear” your concerns, questions, and drill down, with your help, to the crux of the problem. But I’m afraid that AI has a way to go before this is possible. And even if it is almost possible, the good psychologists, interaction designers, and human-user interface experts don’t seem to be running the show. They are outnumbered by the techno-geeks with low EQ, and little understanding of human psychology. Left-brainers gather the power and influence, and run the machines – computer-wise and business-wise – because they are good with the numbers, and rarely blow a gasket. The right-brained skill set ends up stuck on the periphery, almost by its very nature. I’m waiting for this revolution I keep hearing about – the Revenge of the Right Brain. So far, I still keep hitting brick walls built with left-brained mortar. But I digress.

2. Visual interfaces. By having all the options laid out in a visual space, the user’s eyes can jump around (much more quickly than a robot can utter the options). Thus, if the layout is designed well (a rarity in the internet junkyard) the user can quickly see, “ah, I have five options. Maybe I want to choose option 4 – I will select ‘more information about option 4’ to make sure”. All of this can happen within a matter of seconds. You could almost say that the interface affords a kind of body language that the user reads and acts upon immediately.

Consider the illustration below for a company’s phone tree which I found on the internet (I blacked-out the company name and phone number). Wouldn’t it be nice if you could just take a glance at this visual diagram and jump to the choice you want? If you’re like me, your eyes will jump straight to the bottom where the choice to speak to a representative is. (Of course it’s at the bottom).

This picture says it all. But of course. We each have two eyes, each with millions of photoreceptors: simultaneity, parallelism, instant grok. But since I’m talking about telephones, the solution has to be found within the modality of audio alone, trapped in time. And in that case, there is no other solution than an advanced AI program that can understand your question, read your prosodic body language, and respond to the flow of the conversation, thus collapsing time.

…and since that’s not coming for a while, there’s another choice: a meat puppet – one of those very expensive communication units that burn calories, and require a salary. What a nuisance.

“Consider Including” Google Stupidity and Arrogance

November 12, 2011

A little off-topic here, but I just can’t resist taking another jab at The Google.

I am a gmail user, but more recently I have considered switching.

Every so often, I notice a new gmail feature. Google is usually kind enough to let me know that a new feature has been introduced, such as offering me the option to try the “new look”, although after I say “no thank you” which I always do, I keep getting notifications to try the “new look”, even though I had already said “no thank you” to the “new look”. Thanks Google, but please STOP TELLING ME ABOUT YOUR “NEW LOOK”.

And then there is the little yellow “Important” symbol that one day magically appeared next to some of my messages. When I roll over the symbol I see the text, “Important mainly because of the people in the conversation”.

Yo Google: how ’bout if I decide what’s important.

One person in the Google forums complained about gmail tagging her message as: “Important mainly because of the words in the message”. She says, “Can we stop with the idiotic messages from Google, as if our paternalistic uncle was looking out for us?”


But that’s not what I want to talk about: I want to talk about a feature which is the ultimate example of Google developers trying to be oh so clever but just coming across as stupid. I’m talking about the text that appears when I’m composing an email to someone, which says, “Consider including: John, Rebecca…” And so on.

Peter Thomas, one of the many bloggers who has complained about this ridiculous feature, summarizes it:

“When you type an e-mail, Gmail comes up with a list of people that you may like to also copy it to. Let’s pause and just think about this. You are writing an e-mail, generally the first thing that you do is to type in the address of the person (or people) you are writing to. Gmail has a useful feature that scans your previous mails, so typing ‘Pe’ will bring up ‘Peter Thomas’ as an option. So far so good….”

…but then, gmail offers a list of people that you may consider including as recipients of your email, based on simple association. Hello? What if I am emailing a colleague to complain about the boss? I certainly don’t want to include the boss, and it scares me that his name is sitting up there, a mouse-click away from disaster. Or what if I am plotting a surprise birthday party for Beth? Including Beth is specifically NOT what I want to do.

And…what if the person is DEAD?

I found this on the Google forums:

I deleted my dead friend as a contact which was traumatic enough, but having google STILL suggesting I include her when there’s honestly nothing I’d like better than to be able to include her BECAUSE SHE’S DEAD.  How do I make this stop?!?!?!

Note to Google:

Please get out of the business of reading our minds. You suck at it.

Peter Thomas concludes: “This “feature” is bad enough to have merited me writing to Google asking them to remove it, or at least make it optional. Their support forums are full of people saying the same. It will be interesting to see whether or not they listen.”

Do a search for “consider including”, and you’ll come across several people railing against this act of stupidity from Google. My blog post is not original. Yet I feel compelled to add another voice to the chorus.

Do I have any conclusions or insights? Not really, other than my opinion that any good thing can turn bad when it gets too big and too powerful. Google is generally a good thing. But I think Google is getting too big and too powerful. And I am getting smaller and less powerful, in relative terms. I want to be completely in charge of how I communicate with my friends and colleagues.

The fact that Google is brimming with young, clever, cocky geeks does not make for an agreeable form of world domination.


Without a Body, Our Conversations Bifurcate

August 23, 2011

While talking on the phone or texting with a friend, it is impossible to give your friend visual signals that indicate understanding, affirmation, confusion, or levels of attention. These indicators are typically provided by head motions, facial expressions, hand movements, and posturing. In natural face-to-face interaction, these signals happen in real time, and they are coverbal: they are often tightly synchronized with the words being exchanged.

You may have had the following experience: you are exchanging texts in an online chat with a friend. There is a long period of no response after you send a text. Did you annoy your friend? Maybe your friend has gone to the bathroom? Is your friend still thinking about what you said? One problem that ensues is cross-dialog: during the silent period, you may change the subject by issuing a new text, but unknowingly, your friend had been writing some text as a response to your last text on the previous topic. You get that text, and – relieved that you didn’t annoy your friend – you quickly switch to the previous topic. Meanwhile, your friend has just begun to respond to your text on the new topic. The conversation bifurcates – simply due to a lack of nonverbal signaling.

Like frogs in boiling water, most of us are not aware that our bodies are slowly dissolving as we engage increasingly in text-based communication, which is often asynchronous (or at least running at lower than conversation-rates). My theory: new forms of body language are emerging in the absence of our real bodies. Smart design of visual/interactive interfaces can adapt to this natural evolution. I don’t see it as a choice. It’s simply a part of our evolution – our adaptability.

Jill Chivers, in the blog “I’m Listening – the Power and Magic of Listening in Everyday Lives”, makes a great case for reaching for the phone when repeated email pings are not getting through to someone, or for going face-to-face when phone calls are left unanswered.

Call her old-fashioned, call her a Luddite. But she is simply suggesting that we all need to stay connected in ways that maximize our body language. It’s not an anti-technology stance. In fact, I would argue that we need more technology and smarter technology – just that it has to be the kind of technology that manifests embodiment over the internet – in whatever forms it takes. Without bodies, virtual or otherwise, and without the synchrony of realtime bodies, voices, and some stream of co-presence, we tend to fragment into text-like pieces.

Some people like deconstructing themselves into textual fragments. Sometimes I like it – I can hide behind my well-crafted words. But I don’t like the fact that I like it. I don’t want to like it any more than I do. I would prefer to like connecting with people more in realtime, like I used to – before the world was wired.

