Voice as Puppeteer

May 5, 2012

(This blog post is re-published from an earlier blog of mine called “avatar puppetry” – the nonverbal internet. I’ll be phasing out that earlier blog, so I’m migrating a few of those earlier posts here before I trash it).

———————–

According to Gestural Theory, verbal language emerged from the primal energy of the body, from physical and vocal gestures.

The human mind is at home in a world of abstract symbols – a virtual world separated from the gestural origins of those symbols. An evolution from the analog to the digital continues today with the flood of the internet over earth’s geocortex. Our thoughts are awash in the alphabet: a digital artifact that arose from a gestural past. It’s hard to imagine that the mind could have created the concepts of Self, God, Logic, and Math: belief structures so deep in our wiring – generated over millions of years of genetic, cultural, and neural evolution. I’m not even sure if I fully believe that these structures are non-eternal and human-fabricated. Since the Copernican Revolution yanked humans out from the center of the universe, it continues to progressively kick down the pedestals of hubris. But, being humans, we cannot stop this trajectory of virtuality, even as we become more aware of it as such.

I’ve observed something about the birth of online virtual worlds, and the foundational technologies involved. One of the earliest online virtual worlds was Onlive Traveler, which used realtime voice.

My colleague, Steve DiPaola, invented some techniques for Traveler that caused the voice to animate the floating faces that served as avatars.

But as online virtual worlds started to proliferate, they incorporated the technology of chat rooms – textual conversations. One quirky side-effect of this was the collision of computergraphical humanoid 3D models with text-chat. These are strange bedfellows indeed – occupying vastly different cognitive dimensions.

Many of us worked our craft to make these bedfellows not so strange, such as the techniques that I invented with Chuck Clanton at There.com, called Avatar Centric Communication.

Later, voice was introduced to There.com. I invented a technique for There.com voice chat, and later re-implemented a variation for Second Life, for voice-triggered gesticulation.

Imagine the uncanny valley of hearing real voices coming from avatars with no associated animation. When I first witnessed this in a demo, the avatars came across as propped-up corpses with telephone speakers attached to their heads. Being so tuned-in to body language as I am, I got up on the gesticulation soap box and started a campaign to add voice-triggered animation. As an added visual aid, I created the sound wave animation that appears above avatar heads for both There and SL…

Gesticulation is the physical-visual counterpart to vocal energy – we gesticulate when we speak – moving our eyebrows, head, hands, etc. – and it’s almost entirely unconscious. Since humans are so verbally-oriented, and since we expect our bodies to produce natural body language to correspond to our spoken communications, we should expect the same of our avatars. This is the rationale for avatar gesticulation.

I think that a new form of puppeteering is on the horizon. It will use the voice. And it won’t just take sound signal amplitudes as input, as I did with voice-triggered gesticulation. It will parse the actual words and generate gestural emblems as well as gesticulations. And just as we will be able to layer filters onto our voices to mask our identities or role-play as certain characters, we will also be able to filter our body language to mimic the physical idiolects of Egyptians, Native Americans, Sicilians, four-year-old Chinese girls, and 90-year-old Ethiopian men.
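To make the amplitude-driven idea concrete, here is a minimal sketch of how a voice signal could trigger gesticulation. It is not the code used in There.com or Second Life; the function names, the threshold, and the refractory period are all assumptions chosen for illustration. It measures the loudness of each short audio frame and starts a gesture whenever the loudness rises across a threshold.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one frame of audio samples in [-1, 1]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def gesture_triggers(frames, threshold=0.2, refractory=3):
    """Return the indices of frames at which a gesture should start.

    A trigger fires when the amplitude rises across `threshold`.
    After a trigger, the next `refractory` frames cannot fire again,
    so the avatar does not twitch on every syllable.
    (Both parameters are hypothetical tuning values.)
    """
    triggers = []
    previous = 0.0
    cooldown = 0
    for i, frame in enumerate(frames):
        level = rms(frame)
        if cooldown > 0:
            cooldown -= 1
        elif previous < threshold <= level:
            triggers.append(i)
            cooldown = refractory
        previous = level
    return triggers


# Silence, a burst of speech, a second burst too soon after, then a later burst.
frames = [[0.0] * 4, [0.5] * 4, [0.5] * 4, [0.0] * 4, [0.6] * 4,
          [0.0] * 4, [0.0] * 4, [0.0] * 4, [0.7] * 4]
print(gesture_triggers(frames))
```

A sketch like this only knows how loud the voice is. Parsing the actual words, as described above, would need speech recognition layered on top.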

Digital-alphabetic-technological humanity reaches down to the gestural underbelly and invokes the primal energy of communication. It’s a reversal of the gesture-to-words vector of Gestural Theory.

And it’s the only choice we have for transmitting natural language over the geocortex, because we are sitting on top of a thousands-year-old heap of alphabetic evolution.


Virtual Sentience Requires a Gaze

November 28, 2011

(This blog post is re-published from an earlier blog of mine called “avatar puppetry” – the nonverbal internet.  I originally wrote it in September of 2009. I’ll be phasing out that earlier blog, so I’m migrating a few of those earlier posts here before I trash it).

———————

I was speaking with my colleague Michael Nixon at the School of Interactive Art and Technology. We were talking about body language in non-human animated characters. He commented that before you can imbue a virtual character with apparent sentience, it has to have the ability to GAZE – in other words, look at something. That means it has a head with eyes. Or maybe just a head. Or… a “head”.

Here’s the thing about gaze: it pokes out of the local (“lonely”) coordinate system of the character and into the global (“social”) coordinate system of the world and other sentient beings. Gaze is the psychic vector that connects a character with the world. The character “places its gaze upon the world”. Luxo Jr is a great example of imbuing an otherwise inanimate object with sentience (and lots of personality besides) by using body language such as gaze.
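The local-versus-global idea can be sketched in a few lines. This is only an illustration, not code from any particular virtual world: the function name and the axis convention (y is up, a yaw of zero faces +z, positive yaw turns toward +x) are assumptions. Given where the head is and which way the body faces in world space, it works out how far the head must turn and tilt, in the character's own frame, to look at a target out in the world.

```python
import math


def gaze_angles(head_pos, head_yaw, target_pos):
    """Return (yaw, pitch) in radians, relative to the character's facing.

    head_pos and target_pos are (x, y, z) points in world coordinates.
    head_yaw is the character's facing direction in world space.
    Assumed convention: y is up, yaw 0 faces +z, positive yaw turns toward +x.
    """
    dx = target_pos[0] - head_pos[0]
    dy = target_pos[1] - head_pos[1]
    dz = target_pos[2] - head_pos[2]

    # Direction to the target, expressed in the global ("social") frame.
    world_yaw = math.atan2(dx, dz)
    pitch = math.atan2(dy, math.hypot(dx, dz))

    # Bring the yaw back into the local ("lonely") frame, wrapped to [-pi, pi).
    local_yaw = (world_yaw - head_yaw + math.pi) % (2 * math.pi) - math.pi
    return local_yaw, pitch


# A character at the origin facing +z looks at something ahead and to the side.
yaw, pitch = gaze_angles((0.0, 0.0, 0.0), 0.0, (1.0, 0.0, 1.0))
print(math.degrees(yaw), math.degrees(pitch))
```

In practice the resulting angles would be clamped to what a neck and eyes can plausibly do, and the rest of the turn handed to the body.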

I have observed something missing in video conferencing. Gaze. Notice in this set of four images how the video chat participants cannot make eye-contact with each other. This is because they are not sharing the same physical 3D space. Nor are they sharing the same virtual 3D space!

Gaze is one of the most powerful communicative elements of natural language, along with the musicality of speech, and of course facial and bodily gesture. This is especially true among groups of young single people in which hormones are flying, and flirtation, coyness, and jealousy create a symphony of psychic vectors…


At There.com, I designed the initial avatar gaze system. With the help of Chuck Clanton, I created an “intimacam”, which aimed perpendicular to the consensual gaze of the avatars, and zoomed in closer when the avatar heads came closer to each other.
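The geometry of a camera like this can be sketched as follows. This is a guess at the general idea, not the There.com implementation: the function name, the y-up convention, and the `distance_scale` and `min_distance` parameters are all assumptions. The camera sits on a horizontal line perpendicular to the line between the two heads, looks at the midpoint, and moves in as the heads approach each other.

```python
import math


def intimacam(head_a, head_b, distance_scale=1.5, min_distance=0.5):
    """Return (camera_position, look_at_point) for two avatar heads.

    head_a and head_b are (x, y, z) world positions, with y up.
    The camera distance shrinks as the heads get closer together.
    (distance_scale and min_distance are hypothetical tuning values.)
    """
    ax, ay, az = head_a
    bx, by, bz = head_b
    midpoint = ((ax + bx) / 2, (ay + by) / 2, (az + bz) / 2)

    # Horizontal direction of the shared gaze line between the heads.
    dx, dz = bx - ax, bz - az
    separation = math.hypot(dx, dz)

    # Rotate that direction 90 degrees to get the perpendicular.
    if separation == 0:
        perpendicular = (1.0, 0.0)  # arbitrary choice when the heads coincide
    else:
        perpendicular = (-dz / separation, dx / separation)

    distance = max(min_distance, distance_scale * separation)
    camera = (midpoint[0] + perpendicular[0] * distance,
              midpoint[1],
              midpoint[2] + perpendicular[1] * distance)
    return camera, midpoint


# Two avatars facing each other, two meters apart at head height.
camera, look_at = intimacam((0.0, 1.7, 0.0), (2.0, 1.7, 0.0))
print(camera, look_at)
```

Tying the camera distance to head separation is what produces the zoom: as the avatars lean in, the frame tightens around them.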

The greatest animators have known about the power of gaze for as long as the craft has existed. This highly-social component of body language has a mathematical manifestation in the virtual spaces of cartoons, computer games, and virtual worlds. And it is one of the many elements that will become refined and codified and included into the virtual body language of the internet.

Human communication is migrating over to the internet – the geo-cortex of posthumanity. Text is leading the way. Body language has some catching up to do. Brian Rotman has some interesting things to say along these lines in his book, Becoming Beside Ourselves.

We can learn a lot from Pixar animators, as well as psychologists and actors, as we develop virtual worlds and collaborative workspaces.

————–

In response to my earlier post, Laban-for-animators expert Leslie Bishko made this comment:

“My .2c – breath promotes the illusion of sentience, gaze promotes the illusion of interaction and relationship!”


New Discovery at Max Planck

October 29, 2011

I came across this article in Science Daily:

Talk to the Virtual Hands: Body Language of Both Speaker and Listener Affects Success in Virtual Reality Communication Game

Researchers at the Max Planck Institute found that “…virtual communication usually lacks the body gestures so common in face-to-face interactions”.

Usually?

The researchers found that “…the lack of gestural information from both speaker and listener limits successful communication in virtual environments.”

That’s quite an insight.

They also found that “participants move much less in a virtual environment than they do in the ‘real world.’”

Remarkable.


The Tail Wagging the Brain

October 14, 2011

Our beloved dog Higgs died a few months ago. Higgs was a very special dog; full of life, full of love. Higgs and I had established an intimate body language connection for over ten years. He changed my brain.

My smiles were his tail wags; his tail wags were my smiles. Because of neuroplasticity, the ability of our brains to adapt and adjust, we were able to fuse semiotically across species lines.

This communication across species lines is analogous to people and software interpreting signals across the internet. We have invented new forms of punctuation to make up for a lack of physical expression in emails and text chats. I would say the same is true for 3D games and virtual worlds. But avatars, no matter how awesome-looking, are terribly clunky as instruments for realtime expression.

Meanwhile, new forms of punctuation have been invented: small, packaged symbols. They are quick to create, and they travel efficiently across the internet. Smileys and emoticons have more currency and emotional leverage than avatars, because they live in typographical soil: an ecosystem that is still much more established and pervasive than virtual worlds. Perhaps text will continue to become more electric, dynamic, intelligent, and integrated with graphical interfaces, such that smileys will evolve into avatars.

The internet is accelerating our posthuman evolution. We will come to have a deeper understanding of our animal cousins – because the primal affordances of the biosphere will be better-understood. Wha? you might say.  Jaron Lanier has already been talking about this kind of stuff for a long time – this idea that (with virtual reality) we will be able to “become” lobsters or snakes or cloud-sized creatures. I mention Jaron in a previous post, and the ways in which our bodymaps adjust to posthuman communication.

It’s not just about imagination: it’s about communicating and having a form of body language that is compatible with the internet. More and more of our communication is migrating to the internet. And since living languages evolve (including body languages) the new ecology of the internet will fertilize new forms of gesture, sound, moving text, and other dynamical forms.

What does this have to do with tails and brains?

Higgs and I had established a body language bond. New kinds of body language bonds are emerging as we interact through the internet. Our brains are adapting.

Micha Cardenas became a Dragon in Second Life for 365 hours straight. What happened to her brain? I can imagine that people who spend large portions of their lives as Furries with animated tails have dreams of expressing with their tails and ears, like the Na’vi. These ideas are covered more thoroughly in The Tail Wagging the Brain.

Speaking Dolphin

Researchers from Aberdeen University and the Polytechnic University of Catalonia found that dolphins use discrete units of body language as they swim together near the surface of water. They observed efficiency in these signals, similar to what occurs in frequently-used words in human verbal language.

As human natural language goes online, and as our body language gets processed, data-compressed, and alphabetized for efficient traversal over the internet, we may start to see more patterns of our embodied language that resemble those created by dolphins, and many other social species besides. The background communicative buzz of the biosphere may start to make more sense in the process of whittling our own communicative energy down to its essential features, and being able to analyze it digitally. With a universal body language alphabet, we might someday be able to animate our skin like cephalopods, or speak “dolphin”, using our tails, as we lope across the virtual waves.


Without a Body, Our Conversations Bifurcate

August 23, 2011

While talking on the phone or texting with a friend, it is impossible to give your friend visual signals that indicate understanding, affirmation, confusion, or levels of attention. These indicators are typically provided by head motions, facial expressions, hand movements, and posturing. In natural face-to-face interaction, these signals happen in real time, and they are coverbal; they are often tightly-synchronized with the words being exchanged.


You may have had the following experience: you are exchanging texts in an online chat with a friend. There is a long period of no response after you send a text. Did you annoy your friend? Maybe your friend has gone to the bathroom? Is your friend still thinking about what you said? One problem that ensues is cross-dialog: during the silent period, you may change the subject by issuing a new text, but unknowingly, your friend had been writing some text as a response to your last text on the previous topic. You get that text, and – relieved that you didn’t annoy your friend – you quickly switch to the previous topic. Meanwhile, your friend has just begun to respond to your text on the new topic. The conversation bifurcates – simply due to a lack of nonverbal signaling.

Like frogs in boiling water, most of us are not aware that our bodies are slowly dissolving as we engage increasingly in text-based communication, which is often asynchronous (or at least running at lower than conversation-rates). My theory: new forms of body language are emerging in the absence of our real bodies. Smart design of visual/interactive interfaces can adapt to this natural evolution. I don’t see it as a choice. It’s simply a part of our evolution – our adaptability.

Jill Chivers, in the blog “I’m Listening – the Power and Magic of Listening in Everyday Lives”, makes a great case for reaching for the phone when repeated email pings are not getting through to someone, or for going face-to-face when phone calls are left unanswered.

Call her old-fashioned, call her a Luddite. But she is simply suggesting that we all need to stay connected in ways that maximize our body language. It’s not an anti-technology stance. In fact, I would argue that we need more technology and smarter technology – just that it has to be the kind of technology that manifests embodiment over the internet – in whatever forms it takes. Without bodies, virtual or otherwise, and without the synchrony of realtime bodies, voices, and some stream of co-presence, we tend to fragment into text-like pieces.

Some people like deconstructing themselves into textual fragments. Sometimes I like it – I can hide behind my well-crafted words. But I don’t like the fact that I like it. I don’t want to like it any more than I do. I would prefer to like connecting with people more in realtime, like I used to – before the world was wired.

Finally, here’s a relevant piece by Si Dawson

http://sidawson.org/2010/06/talking-by-text-sucks-how.html

