High Fidelity: Body Language through “Telekinesics”

June 2, 2013

Human communication demonstrates the usual punctuated equilibria of any natural evolutionary system. From hand gestures to grunts to telephones to email and beyond, human communication has not only evolved, but splintered off into many modalities and degrees of asynchrony.

I recently had the great fortune to join a company that is working on the next great surge in human communication: High Fidelity, Inc. This company is bringing together several new technologies to make this happen.

So, what is the newest evolutionary surge in human communication? I would describe it using a term from Virtual Body Language (page 22):

Telekinesics is a word invented to denote “the study of all emerging nonverbal practices across the internet”, by adding the prefix tele to Birdwhistell’s term kinesics. It could easily be confused with “telekinesis”: the ability to cause movement at a distance through the mind alone (the words differ by only one letter). But hey, these two phenomena are not so different anyway, so a slip of the tongue wouldn’t be such a bad thing. Telekinesics may be defined as “the science of body language as conducted over remote distances via some medium, including the internet”.

And now it’s not just science, but practice: body language is ready to go online…in realtime.

And when I say “realtime” – I mean, pretty damn fast, compared to most things that zip (or try to zip) across the internet. And when we’re talking about subtle head nods, changes in eye contact, fluctuations in your voice, and shoulder shrugs, fast is not just a nicety, it is a necessity – for clear communication using a body.

Here’s Ryan Downe showing an early stage of avatar head movement using Google Glass.

Philip Rosedale, the founder of High Fidelity, often talks about how cool it would be for my avatar to walk up to your avatar and give it a little shoulder-shove, or a fist-bump, or an elbow-nudge, or a hug…and for your avatar to respond with a slight – but noticeable – movement.

It would appear that human touch (or at least the visual/audible representation of human touch) is on the verge of becoming a reality – through telekinesics. Of all the modalities and senses that we use to communicate, touch is the most primal: we share it with the oldest microorganisms.

When touch is manifest on the internet, along with highly-crafted virtual environments, maybe, just maybe, we will have reached that stage in human evolution when we can have a meaningful, intimate exchange – even if one person is in Shanghai and the other is in Chicago.

And that means people can stop having to fly around the world and burn fossil fuels in order to have 2-hour-long business meetings. And that means reducing our carbon footprint. And that means we might have a better chance of not pissing-off Mother Earth to the degree that she has a spontaneous fever and shrugs us off like pesky fleas.

Which would really suck.

So…keep an eye on what we’re doing at High Fidelity, and get ready for the next evolutionary step in human communication. It just might be necessary for our survival.

Nano Avatars

June 8, 2012

The other day, Jeremy Owen Turner told me about NanoArt. Here’s a cool nano art piece by Yong Qing Fu, described in Chemistry World.


We started imagining a nano virtual world. Jeremy pontificates on avatars as works of art, avatars that can take on alternate forms, including nano art. I started thinking about what an avatar that consisted of a molecule might be like.

Some illustrations of the hemoglobin molecule look a bit like the flying spaghetti monster. Which reminds me, Cory Linden’s avatar in Second Life is based on the flying spaghetti monster.


We’ve seen avatars hanging out among virtual molecules


but what about avatars that ARE molecules? Stephanie H. Chanteau and James M. Tour of Rice University created anthropomorphic molecules.


But I’m not so interested in how people make anthropomorphic molecules. I’m interested in avatars that live a molecule’s life. Check this out…

A scanning tunneling microscope (STM) is set up in a magnificent auditorium.


The microscope’s subject matter is projected onto a giant video screen. An audience of thousands watches as a team of five molecule-avatar controllers sit with computer mice and keyboards and mingle in a virtual world that is actually not virtual. In the middle of all the flamboyant machinery is a tiny nano-stage, a performance dance floor where five molecules show something rather strange and new.

Since the STM can be used for atom manipulation as well as visioning (a consequence of the Observer Effect), the very technology for seeing the avatars is used to control them.

The audience collectively winces as the avatars try to, um, walk. Okay, maybe walking isn’t the right word. What exactly do these avatars do? They combine to form supermolecules. They jump and twitch. They split and reform. They blink and chirp. They fall off the edge of the stage and accidentally get stuck on carbon atoms. It may not be elegant. But hey, it would be so cool to watch.

When the performance is done, the avatars take a bow…or something. The audience applauds with a standing ovation. A new genre is born. Constraints define creative boundaries and therefore creativity. And the limited repertoire of molecular interactions defines the social vocabulary of these agents. Kind of reminds me of Flatland.

Avatars are embodiments of humans (or human intention) in virtual worlds.

“Seeing” a molecule is a problematic term, in the same sense that “seeing” a planet in a distant star system is a problematic term. It’s not “seeing” on a human scale. It’s prosthetic seeing. And so, just like a software-based virtual world, there must be a renderer.


Our most distant ancestor is a molecule that accidentally replicated and thus started the upward avalanche that is called Evolution. Dennett’s intentional stance can be applied on all levels of the biosphere. Molecular avatars represent the most basic and primitive expression of agentry. And unlike the constraints of C++, Havok, and OpenGL, in virtual world software programs, the constraints in this molecular world are real.

It may yield some insights about the fundamentals of interaction.

Voice as Puppeteer

May 5, 2012

According to Gestural Theory, verbal language emerged from the primal energy of the body, from physical and vocal gestures.


The human mind is at home in a world of abstract symbols – a virtual world separated from the gestural origins of those symbols. An evolution from the analog to the digital continues today with the flood of the internet over earth’s geocortex. Our thoughts are awash in the alphabet: a digital artifact that arose from a gestural past. It’s hard to imagine that the mind could have created the concepts of Self, God, Logic, and Math: belief structures so deep in our wiring – generated over millions of years of genetic, cultural, and neural evolution. I’m not even sure if I fully believe that these structures are non-eternal and human-fabricated. Since the Copernican Revolution yanked humans out from the center of the universe, it continues to progressively kick down the pedestals of hubris. But, being humans, we cannot stop this trajectory of virtuality, even as we become more aware of it as such.

I’ve observed something about the birth of online virtual worlds, and the foundational technologies involved. One of the earliest online virtual worlds was Onlive Traveler, which used realtime voice.


My colleague Steve DiPaola invented some techniques for Traveler that caused the voice to animate the floating faces that served as avatars.

But as online virtual worlds started to proliferate, they incorporated the technology of chat rooms – textual conversations. One quirky side-effect of this was the collision of computergraphical humanoid 3D models with text-chat. These are strange bedfellows indeed – occupying vastly different cognitive dimensions.


Many of us worked our craft to make these bedfellows not so strange, such as the techniques that I invented with Chuck Clanton at There.com, called Avatar Centric Communication.

Later, voice was introduced to There.com. I invented a technique for There.com voice chat, and later re-implemented a variation for Second Life, for voice-triggered gesticulation.

Imagine the uncanny valley of hearing real voices coming from avatars with no associated animation. When I first witnessed this in a demo, the avatars came across as propped-up corpses with telephone speakers attached to their heads. Being so tuned-in to body language as I am, I got up on the gesticulation soap box and started a campaign to add voice-triggered animation. As an added visual aid, I created the sound wave animation that appears above avatar heads for both There and SL…


Gesticulation is the physical-visual counterpart to vocal energy – we gesticulate when we speak – moving our eyebrows, head, hands, etc. – and it’s almost entirely unconscious. Since humans are so verbally-oriented, and since we expect our bodies to produce natural body language to correspond to our spoken communications, we should expect the same of our avatars. This is the rationale for avatar gesticulation.

I think that a new form of puppeteering is on the horizon. It will use the voice. And it won’t just take sound signal amplitudes as input, as I did with voice-triggered gesticulation. It will parse the actual words and generate gestural emblems as well as gesticulations. And just as we will be able to layer filters onto our voices to mask our identities or role-play as certain characters, we will also be able to filter our body language to mimic the physical idiolects of Egyptians, Native Americans, Sicilians, four-year-old Chinese girls, and 90-year-old Ethiopian men.
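To make the amplitude-driven idea concrete, here is a minimal Python sketch. It is not the There.com or Second Life code; the function names, the frame format (lists of float samples between -1 and 1), and the threshold value are all assumptions chosen for illustration. It fires a gesture trigger each time the loudness of the voice rises across a threshold.

```python
import math

def rms(frame):
    """Root-mean-square amplitude of one frame of audio samples (floats in -1..1)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def gesticulation_triggers(frames, threshold=0.2):
    """Return the indices of frames where loudness rises across the threshold.

    Triggering on the rising edge fires one gesture per vocal burst
    instead of one gesture per loud frame. The threshold is a hypothetical value.
    """
    triggers = []
    was_loud = False
    for i, frame in enumerate(frames):
        is_loud = rms(frame) >= threshold
        if is_loud and not was_loud:
            triggers.append(i)
        was_loud = is_loud
    return triggers
```

A real system would probably smooth the signal and pick gestures by intensity, but even this crude edge detector gives an avatar something to do with its hands when its owner starts talking.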

Digital-alphabetic-technological humanity reaches down to the gestural underbelly and invokes the primal energy of communication. It’s a reversal of the gesture-to-words vector of Gestural Theory.

And it’s the only choice we have for transmitting natural language over the geocortex, because we are sitting on top of a thousands-year-old heap of alphabetic evolution.

Seven Hundred Puppet Strings

March 31, 2012

The human body has about seven hundred muscles. Some of them are in the digestive tract, and make their living by pushing food along from sphincter to sphincter. Yum! These muscles are controlled by the autonomic nervous system.

Other muscles are in charge of holding the head upright while walking. Others are in charge of furrowing the brow when a situation calls for worry. The majority of these muscles are controlled without conscious effort. Even when we do make a conscious movement (like waving a hand at Bonnie), the many arm muscles involved just do the right thing without our having to think about what each muscle is doing. The command region of the brain says, “wave at Bonnie”, and everything just happens like magic. Unless Bonnie scowls and looks the other way, in which case, the brow furrows, and is sometimes accompanied by grumbling vocalizations.

The avatar equivalent of unconscious muscle control is a pile of procedural software and animation scripts that are designed to “do the right thing” when the human avatar controller makes a high-level command, like <walk>, or <do_the_coy_shoulder_move>, or <wave_at, “Bonnie”>. Sometimes, an avatar controller might want to get a little more nuanced: <walk_like, “Alfred Hitchcock”>; <wave_wildly_at, “Bonnie”>. I have pontificated about the art of puppeteering avatars in the following two web sites:


Also this interview with me by Andrea Romeo discusses some of the ideas about avatar puppetry that he and I have been bantering around for about a year now.

The question of how much control to apply on your virtual self has been rolling around in my head ever since I started writing avatar code for There.com and Second Life. Avatar control code is like a complex marionette system, where every “muscle” of the avatar has a string attached to it. But instead of all strings having equal importance, these strings are arranged in a hierarchical structure.

The avatar controller may not necessarily want or need to have access to every muscle’s puppet string. The question is: which puppet strings does the avatar controller want to control at any given time, and…how?

I’ve been thinking about how to make a system that allows a user to shift up and down the hierarchy, in the same way that our brains shift focus among different motion regimes.
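One way to picture strings arranged in a hierarchy is a small tree, sketched here in Python. The class name, the string names, and the single "tension" value per string are all hypothetical, and this is only an illustration of the idea, not any shipped avatar code. Tugging a string high in the tree pulls everything beneath it; tugging a leaf moves one "muscle".

```python
class PuppetString:
    """A node in a hypothetical control hierarchy; leaves stand for single 'muscles'."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.tension = 0.0

    def tug(self, amount):
        """Pull this string; the pull is passed down to every string beneath it."""
        self.tension = amount
        for child in self.children:
            child.tug(amount)

    def find(self, name):
        """Depth-first search by name, so a user can grab a string at any level."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

    def leaves(self):
        """All the bottom-level strings under this one."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]

arm = PuppetString("right_arm", [PuppetString("shoulder"), PuppetString("elbow"), PuppetString("wrist")])
head = PuppetString("head", [PuppetString("neck"), PuppetString("brow")])
body = PuppetString("body", [head, arm])

body.find("right_arm").tug(0.8)  # high-level pull: the whole arm responds
body.find("brow").tug(0.3)       # low-level pull: one muscle only
```

Shifting up and down the hierarchy then amounts to choosing which node to grab.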


The movements – communicative and otherwise – that our future avatars make in virtual spaces may be partially generated through live motion-capture, but in most cases, there will be substitutions, modifications, and deconstructions of direct motion capture. Brian Rotman sez:

“Motion capture technology, then, allows the communicational, instrumental, and affective traffic of the body in all its movements, openings, tensings, foldings, and rhythms into the orbit of “writing”.”

Becoming Beside Ourselves, page 47

Thus, body language will be alphabetized and textified for efficient traversal across the geocortex. This will give us the semantic knobs needed to puppeteer our virtual selves – at a distance. And to engage the semiotic process.

If I need my avatar to run up a hill to watch out for a hovercraft, or to walk into the next room to attend another business meeting, I don’t want to have to literally ambulate here in my tiny apartment to generate this movement in my avatar. I would be slamming myself against the walls and waking up the neighbors. The answer to generating the full repertoire of avatar behavior is hierarchical puppeteering. And on many levels. I may want my facial expressions, head movements, and hand movements to be captured while explaining something to my colleagues in remote places, but when I have to take a bio-break, or cough, or sneeze, I’ll not want that to be broadcast over the geocortex.

And I expect the avatar code to do my virtual breathing for me.

And when my avatar eats ravioli, I will want its virtual digestive tract to just do its thing, and make a little avatar poop when it’s done digesting. These autonomic inner workings are best left to code. Everything else should have a string, and these strings should be clustered in many combinations for me to tug at many different semantic levels. I call this Hierarchical Puppetry.

Here’s a journal article I wrote called Hierarchical Puppetry.

Screensharing: Don’t Look at Me

January 11, 2012

Imagine discussing a project you are doing with a small group: a web site, a drawing, a contraption you are building; whatever. You would not expect the people to be looking at your face the whole time. Much of the time you will all be gazing around at different parts of the project. You may be pointing your fingers around, using terms like “this”, “that”, “here” and “there”.

When people have their focus on something separate from their own bodies, that thing becomes an extension of their bodies. Bodymind is not bound by skin. And collaborating, communicating bodyminds meld on an object of common interest.


The internet is dispersing our workspaces globally, and the same is happening to our bodies.

The anthropologist Ray Birdwhistell coined the term “kinesics”, referring to the interpretation, science, or study of body language.

I invented a word: “telekinesics”. I define it as, “the science of body language as conducted over remote distances via some medium, including the internet” (ref)

My primary interest is the creation of body language using remote manifestations of ourselves, such as with avatars and other visual-interactive forms. I don’t consider video conferencing as a form of virtual body language, because it is essentially a re-creation of one’s literal appearances and sounds. It is an extension of telephony.

But it is virtual in one sense: it is remote from your real body.

Video conferencing and applications like Skype are extremely useful. I use Skype all the time to chat with friends or colleagues. Seeing my collaborator’s face helps tremendously to fill in the missing nonverbal signals in telephony. But if the subject of conversation is a project we are working on, then “face-time” is not helpful. We need to enter into, and embody, the space of our collaboration.

Screen Sharing

This is why screen sharing is so useful. Screen sharing happens when you flip a switch on your Skype (or whatever) application that changes the output signal from your camera to your computer screen. Your mouse cursor becomes a tiny Vanna White – annotating, referencing, directing people’s gazes.

Michael Braun, in the blog post: Screen Sharing for Face Time, says that seeing your chat partner is not always helpful, while screen sharing “has been shown to increase productivity. When remote participants had access to a shared workspace (for example, seeing the same spreadsheet or computer program), then their productivity improved. This is not especially surprising to anyone who has tried to give someone computer help over the phone. Not being able to see that person’s screen can be maddening, because the person needing help has to describe everything and the person giving help has to reconstruct the problem in her mind.”

Many software applications include cute features like collaborative drawing spaces, intended for co-collaborators to co-create, co-communicate, and to co-mess up each other’s co-work. The interaction design (from what I’ve seen) is generally awkward. But more to the point: we don’t yet have a good sense of how people can and should interact in such collaborative virtual spaces. The technology is still frothing like tadpole eggs.

Some proponents of gestural theory believe that one reason speech emerged out of gestural communication was because it freed up the “talking hands” so that they could do physical work – so our mouths started to do the talking. Result: we can put our hands to work, look at our work, and talk about it, all at the same time.

Screen sharing may be a natural evolutionary trend – a continuing thread to this ancient activity – as manifested in the virtual world of internet communications.



Virtual Sentience Requires a Gaze

November 28, 2011

I was speaking with my colleague Michael Nixon at the School of Interactive Arts and Technology. We were talking about body language in non-human animated characters. He commented that before you can imbue a virtual character with apparent sentience, it has to have the ability to GAZE – in other words, look at something. In other words, it has a head with eyes. Or maybe just a head. Or… a “head”.

Here’s the thing about gaze: it pokes out of the local (“lonely”) coordinate system of the character and into the global (“social”) coordinate system of the world and other sentient beings. Gaze is the psychic vector that connects a character with the world. The character “places its gaze upon the world”. Luxo Jr is a great example of imbuing an otherwise inanimate object with sentience (and lots of personality besides) by using body language such as gaze.
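In code, poking out of the local coordinate system mostly means turning a world-space target into angles for the character's own head. Here is a minimal Python sketch under stated assumptions: positions are (x, y, z) tuples, y is up, a yaw of zero faces +z, and positive yaw turns toward +x. The function name and these conventions are illustrative only, and a real character would also clamp the angles to what a neck can do.

```python
import math

def gaze_angles(head_pos, target_pos):
    """Yaw and pitch (radians) that aim a head at a world-space target.

    Assumed convention: y is up, yaw 0 faces +z, positive yaw turns toward +x,
    positive pitch looks upward.
    """
    dx = target_pos[0] - head_pos[0]
    dy = target_pos[1] - head_pos[1]
    dz = target_pos[2] - head_pos[2]
    yaw = math.atan2(dx, dz)
    pitch = math.atan2(dy, math.hypot(dx, dz))
    return yaw, pitch
```

The target can be anything in the shared world: a ball, a doorway, or another character's head, which is where gaze becomes social.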

I have observed something missing in video conferencing. Gaze. Notice in this set of four images how the video chat participants cannot make eye-contact with each other. This is because they are not sharing the same physical 3D space. Nor are they sharing the same virtual 3D space!

Gaze is one of the most powerful communicative elements of natural language, along with the musicality of speech, and of course facial and bodily gesture. This is especially true among groups of young single people in which hormones are flying, and flirtation, coyness, and jealousy create a symphony of psychic vectors…

At There.com, I designed the initial avatar gaze system. With the help of Chuck Clanton, I created an “intimacam”, which aimed perpendicular to the consensual gaze of the avatars, and zoomed-in closer when the avatar heads came closer to each other.
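The geometry of a camera like this can be sketched in a few lines of Python. This is a guess at the general idea, not the There.com implementation: heads are reduced to (x, z) points on the ground plane, and the names `zoom` and `min_dist` are hypothetical parameters. The camera sits off the midpoint between the two heads, on the perpendicular to the line joining them, at a distance that shrinks as the heads approach.

```python
import math

def intimacam_position(head_a, head_b, zoom=1.5, min_dist=0.5):
    """Place a camera beside two facing heads, perpendicular to their gaze line.

    Heads are (x, z) points on the ground plane. The camera distance is
    proportional to the head separation, so it moves in as the heads do,
    but never closer than min_dist.
    """
    ax, az = head_a
    bx, bz = head_b
    mid_x, mid_z = (ax + bx) / 2.0, (az + bz) / 2.0
    dx, dz = bx - ax, bz - az
    sep = math.hypot(dx, dz)
    if sep == 0.0:
        # Degenerate case: the heads coincide, so any side will do.
        return (mid_x, mid_z + min_dist)
    # Rotate the unit gaze direction by 90 degrees to get the perpendicular.
    px, pz = -dz / sep, dx / sep
    dist = max(min_dist, zoom * sep)
    return (mid_x + px * dist, mid_z + pz * dist)
```

Aiming the camera back at the midpoint then frames both faces in profile.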

The greatest animators have known about the power of gaze for as long as the craft has existed. This highly-social component of body language has a mathematical manifestation in the virtual spaces of cartoons, computer games, and virtual worlds. And it is one of the many elements that will become refined and codified and included into the virtual body language of the internet.

Human communication is migrating over to the internet – the geo-cortex of posthumanity. Text is leading the way. Body language has some catching up to do. Brian Rotman has some interesting things to say along these lines in his book, Becoming Beside Ourselves.

We can learn a lot from Pixar animators, as well as psychologists and actors, as we develop virtual worlds and collaborative workspaces.


In response to my earlier post, Laban-for-animators expert Leslie Bishko made this comment:

“My .2c – breath promotes the illusion of sentience, gaze promotes the illusion of interaction and relationship!”

New Discovery at Max Planck

October 29, 2011

I came across this article in Science Daily:

Talk to the Virtual Hands: Body Language of Both Speaker and Listener Affects Success in Virtual Reality Communication Game

Researchers at the Max Planck Institute found that “…virtual communication usually lacks the body gestures so common in face-to-face interactions”.


The researchers found that “…the lack of gestural information from both speaker and listener limits successful communication in virtual environments.”

That’s quite an insight.

They also found that “participants move much less in a virtual environment than they do in the ‘real world.’”