How Does Artificial Life Avoid the Uncanny Valley?

July 6, 2015

The following creepy humanoids provide ample reason to fear artificial intelligence:

[Image: screenshot of creepy humanoids]

This is just one example of virtual humans that would be appropriate in a horror movie. There are many others. Here’s my question: why are there so many creepy humans in computer animation?

The uncanny problem is not necessarily due to the AI itself: it’s usually the result of failed attempts at generating appropriate body language for the AI. As I point out in the Gestural Turing Test: “intelligence has a body”. And nothing ruins a good AI more than terrible body language. And yes, when I say “body language”, I include the sound, rhythm, timbre, and prosody of the voice (which is produced in the body).

Simulated body language can steer clear of the uncanny valley with some simple rules of thumb:

1. Don’t simulate humans unless you absolutely have to.

2. Use eye contact between characters. This is not rocket science, folks.

3. Cartoonify. Less visual detail leaves more to the imagination and less that can go wrong.

4. Do the work to make your AI express itself using emotional cues. Don’t be lazy about it.

Shameless plug: Wiglets are super-cartoony non-humanoid critters that avoid the uncanny valley, and use emotional cues, like eye contact, proxemic movements, etc.

These videos show how wiglets move and act.

Artificial Life was invented partly as a way to get around a core problem of AI: humans are the most sophisticated and complex animals on Earth. Simulating them in a realistic way is nearly impossible, because we can always detect a fake. Getting it wrong (which is almost always the case) results in something creepy, scary, clumsy, or just plain useless.

In contrast, simulating non-human animals (starting with simple organisms and working up the chain of emergent complexity) is a pragmatic program for scientific research – not to mention developing consumer products, toys, games, and virtual companions.

We’ll get to believable artificial humans some day.

Meanwhile…

I am having a grand old time making virtual animals using simulated physics, genetics, and a touch of AI. No lofty goals here. With a good dose of imagination (people have plenty of it), it only takes a teaspoon of AI (crafted just right) to make a compelling experience – to make something feel and act sentient. And with the right blend of body language, responsiveness, and interactivity, imagination can fill-in all the missing details.

Alan Turing understood the role of the observer, and this is why he chose a behaviorist approach to asking the question: “what is intelligence?”

Artificial Intelligence is founded on the anthropomorphic notion that human minds are the pinnacle of intelligence on Earth. But hubris can sometimes get in the way of progress. Artificial Life, on the other hand, recognizes that intelligence originates from deep within ancient Earth. We are well-advised to understand it (and simulate it) as a way to better understand ourselves, and how we came to be who we are.

It’s also not a bad way to avoid the uncanny valley.

On Phone Menus and the Blowing of Gaskets

January 2, 2013

(This blog post is re-published from an earlier blog of mine called “avatar puppetry” – the nonverbal internet. I’ll be phasing out that earlier blog, so I’m migrating a few of those earlier posts here before I trash it).

This blog post is only tangentially related to avatars and body language. But it does relate to the larger subject of communication technology that fails to accommodate normal human behavior and the rules of natural language.

But first, an appetizer. Check out this video for a phone menu for callers to the Tennessee State Mental Hospital:

http://www.youtube.com/watch?v=zjABiLYrKKE


A Typical Scenario

You’ve probably had this experience. You call a company or service to ask about your bill, or to make a general inquiry. You are dumped into a sea of countless menu options given by a recorded message (I say countless, because you usually don’t know how many options you have to listen to – will it stop at 5? Or will I have to listen to 10?). None of the options apply to you. Or maybe some do. You’re not really sure. You hope – you pray, that you will be given the option to speak to a representative, a living, breathing, thinking, soft and cuddly human. After several agonizing minutes (by now you’ve forgotten most of the long-winded options) you realize that there is no option to speak to a human. Or at least you think there is no option. You’re not really sure.

Your blood pressure has now reached levels that warrant medical attention. If you still have rational neurons firing, you get the notion to press “0”. And the voice says, “Please wait to speak to a phone representative”. You collapse in relief. The voice continues: “this call may be recorded for quality assurance”. Yeah, right. (I think I remember once actually hearing the message say, “this call may be recorded……because…we care”. Okay, now that is gasket-blowing material.)

Why Conversation Matters

I don’t think I need to go into this any further. Just do a search on “phone menu” (or “phone tree”) and “frustration”, or something like that, and follow the scent and you’ll find plenty of blog posts on the subject.

How would I best characterize this problem? I could talk about it from an economic point of view. For instance, it costs a company a lot more to hire real people than to hook up an automated answering service or an interactive voice response (IVR) system. But companies have to also weigh the negative impact of a large percentage of irate customers. But too few companies look at this as a Design problem. Ah, there it is again: that ever-present normalizer and humanizer of technology: Design. It’s invisible when it works well, and that’s why it is such an unsung hero.

The Hyper-Linearity of Non-Interactive Verbal Messages

The nature of this design problem, I believe, is that these phone menus give a large amount of verbal information (words, sentences, options, numbers, etc.) which take time to explain. They are laid out in a sequential order.

There is no way to jump ahead, to interrupt the monolog, or to ask it for clarification, as you would in a normal conversation. You are stuck in time – rigid, linear time, with no escape. (At least that’s what it feels like: there are usually options to hit special keys to go to the previous menu or pop out entirely, etc. But who knows what those keys are? And the dreaded fear of getting disconnected is enough to keep people like me staying within the lines, gritting teeth, and being obedient, although that means I have the potential to become the McDonald’s gunman who makes the headlines the next morning.)
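To put a rough number on that linearity, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the menu prompts, the speaking rate of 2.5 words per second, and the function name `seconds_until_heard`. It simply adds up how long a caller has to listen before a given option has been fully spoken.

```python
# A hypothetical phone menu: (key to press, spoken prompt).
# These prompts are invented for illustration, not taken from a real system.
MENU = [
    ("1", "For billing questions, press one."),
    ("2", "For technical support, press two."),
    ("3", "For store hours and locations, press three."),
    ("0", "To speak to a representative, press zero."),
]

WORDS_PER_SECOND = 2.5  # assumed speaking rate of the recorded voice


def seconds_until_heard(menu, key, words_per_second=WORDS_PER_SECOND):
    """Return how long a caller listens before the prompt for `key` ends.

    The menu is spoken strictly in order, so the wait for an option is the
    sum of the durations of every prompt up to and including that option.
    """
    elapsed = 0.0
    for option_key, prompt in menu:
        elapsed += len(prompt.split()) / words_per_second
        if option_key == key:
            return elapsed
    raise KeyError(f"menu has no option {key!r}")


if __name__ == "__main__":
    for option_key, _ in MENU:
        wait = seconds_until_heard(MENU, option_key)
        print(f"option {option_key}: heard after about {wait:.1f} s")
```

The wait grows with every option placed ahead of the one you want, which is why putting "speak to a representative" last is the worst case for the caller who needs it.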

Compare this with a conversation with a phone representative: normal human dialog involves interruptions, clarifications, repetitions, mirroring (the “mm’s”, “hmm’s”, “ah’s”, “ok’s”, “uh-huh’s”, and such – the audible equivalent of eye-contact and head-nods), and all the affordances that you get from the prosody of speech. Natural conversations continually adapt to the situation. These adaptive, conversational dynamics are absent from the braindead phone robots. And their soft, soothing voices don’t help – in fact they only make me want to kill them that much harder.

There are two solutions:

1. Full-blown Artificial Intelligence, allowing the robot voice to “hear” your concerns, questions, and drill down, with your help, to the crux of the problem. But I’m afraid that AI has a way to go before this is possible. And even if it is almost possible, the good psychologists, interaction designers, and human-user interface experts don’t seem to be running the show. They are outnumbered by the techno-geeks with low EQ, and little understanding of human psychology. Left-brainers gather the power and influence, and run the machines – computer-wise and business-wise – because they are good with the numbers, and rarely blow a gasket. The right-brained skill set ends up stuck on the periphery, almost by its very nature. I’m waiting for this revolution I keep hearing about – the Revenge of the Right Brain. So far, I still keep hitting brick walls built with left-brained mortar. But I digress.

2. Visual interfaces. By having all the options laid out in a visual space, the user’s eyes can jump around (much more quickly than a robot can utter the options). Thus, if the layout is designed well (a rarity in the internet junkyard) the user can quickly see, “Ah, I have five options. Maybe I want to choose option 4. I will select ‘more information about option 4’ to make sure.” All of this can happen within a matter of seconds. You could almost say that the interface affords a kind of body language that the user reads and acts upon immediately.

Consider the illustration below for a company’s phone tree which I found on the internet (I blacked-out the company name and phone number). Wouldn’t it be nice if you could just take a glance at this visual diagram and jump to the choice you want? If you’re like me, your eyes will jump straight to the bottom where the choice to speak to a representative is. (Of course it’s at the bottom).

This picture says it all. But of course. We each have two eyes, each with millions of photoreceptors: simultaneity, parallelism, instant grok. But since I’m talking about telephones, the solution has to be found within the modality of audio alone, trapped in time. And in that case, there is no other solution than an advanced AI program that can understand your question, read your prosodic body language, and respond to the flow of the conversation, thus collapsing time.

…and since that’s not coming for a while, there’s another choice: a meat puppet – one of those very expensive communication units that burn calories, and require a salary. What a nuisance.


A Future Man Experiences Sex as a Female

April 20, 2012

I am a heterosexual male, happily married, and by most accounts, normal and healthy. This blog post is a what-if, extrapolating upon the idea of having a virtual body…..

THE MIND

Frank Zappa said that the dirtiest part of your body is your mind. It is hard to disagree with this. Your mind is capable of generating some serious filth (unless you never bathe, in which case, it is possible that parts of your body may actually be dirtier than your mind).

Obviously, the body has something to do with sex. But there is indeed a psychological, cognitive, emotional, imaginative dimension. It seems that these mental aspects of sex become more important as we get older. One obvious reason: aging. Entropy! Deteriorating, wrinkling, flabbifying, and weakening our bodies. But our aging minds are often as sharp as ever, and capable of higher dimensions of love and romance (and filth). It’s a shame that youth must be wasted on the young. I am referring to us in our earlier years when we had great bodies and great physical strength…but OH how immature we were.

Ray Kurzweil and other futurists suggest that virtual reality will be fully integrated into our lives in the future. One could also assume that virtual sex will continue from its current occasional manifestations of phone sex, sexting, and avatar play in virtual worlds. There are already non-technological forms of virtual reality such as imaginative play, role-playing, etc. It’s only recently that technology has evolved enough to enhance the experience (or ruin it…depending on your vantage point).

Fantastic Sex at Age 100

The difference between mortality and immortality will become fuzzier in the future. Humans may achieve a certain kind of immortality by having their brains uploaded into a virtual reality when they are physically dead (or transformed into a cyborg, whichever comes first). This of course is based on the assumption that one can still experience a continuous life, having nothing left but a brain, and that this brain can be uploaded to some renewable medium…highly debatable at this early juncture. But let’s roll with it anyway. I can imagine that a 100-year-old future human might engage in sex with all the vigor and muscle tone associated with youth (think Jake Sully in Avatar who got his legs back as a Na’vi). Think of this youthful sex…but with the imagination, wisdom, and capacity for love that only a 100-year-old could possess.

I’m a software guy, not a hardware guy, so I can’t say much about nanobots and teledildonics and other technological enhancements of human physicality. But I can imagine that given the appropriate virtual reality enhancements, I could experience something akin to being a female. If nanobots are indeed a part of our future, they might be able to stimulate the brain chemistry and bodily sensation associated with female thoughts and feelings.

Is this a good thing? It is a bit creepy. But I say it is a good thing. Here’s why: human imagination has no limits. Human creativity knows no bounds. The desire to understand how others experience the world is based on empathy and natural social bonding. Technology can be used for this purpose.

An earlier blog post I wrote explores the question of how we might experience non-human embodiment, and body language, through future virtual reality technology. Within the realm of human society, there are still a lot of experiences and perspectives that can be shared. It might help us understand each other a bit better. Empathy could be technologically enhanced, generated through simulation and virtuality.

And it might make for some awesome sex.

One can only imagine. (That’ll have to do for now).

Here’s a piece by Robert Weiss about the pros and cons of virtual sex.


“Consider Including” Google Stupidity and Arrogance

November 12, 2011

A little off-topic here, but I just can’t resist taking another jab at The Google.

I am a gmail user, but more recently I have considered switching.

Every so often, I notice a new gmail feature. Google is usually kind enough to let me know that a new feature has been introduced, such as offering me the option to try the “new look”, although after I say “no thank you” which I always do, I keep getting notifications to try the “new look”, even though I had already said “no thank you” to the “new look”. Thanks Google, but please STOP TELLING ME ABOUT YOUR “NEW LOOK”.

And then there is the little yellow “Important” symbol that one day magically appeared next to some of my messages. When I roll over the symbol I see the text, “Important mainly because of the people in the conversation”.

Yo Google: how ’bout if I decide what’s important.

One person in the Google forums complained about gmail tagging her message as: “Important mainly because of the words in the message”. She says, “Can we stop with the idiotic messages from Google, as if our paternalistic uncle was looking out for us?”

Consider

But that’s not what I want to talk about: I want to talk about a feature which is the ultimate example of Google developers trying to be oh so clever but just coming across as stupid. I’m talking about the text that appears when I’m composing an email to someone, which says, “Consider including: John, Rebecca…” And so on.

Peter Thomas, one of the many bloggers who has complained about this ridiculous feature, summarizes it:

“When you type an e-mail, Gmail comes up with a list of people that you may like to also copy it to. Let’s pause and just think about this. You are writing an e-mail, generally the first thing that you do is to type in the address of the person (or people) you are writing to. Gmail has a useful feature that scans your previous mails, so typing “Pe” will bring up “Peter Thomas” as an option. So far so good….”

…but then, gmail offers a list of people that you may consider including as recipients of your email, based on simple association. Hello? What if I am emailing a colleague to complain about the boss? I certainly don’t want to include the boss, and it scares me that his name is sitting up there, a mouse-click away from disaster. Or what if I am plotting a surprise birthday party for Beth? Including Beth is specifically NOT what I want to do.
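To see why “simple association” goes wrong so easily, here is a minimal Python sketch of a naive co-occurrence recommender. This is not Gmail’s actual algorithm, which I have no knowledge of; the function name `suggest_recipients`, the sample mail history, and the counting rule are all assumptions made for illustration.

```python
from collections import Counter


def suggest_recipients(past_emails, current_recipients, limit=3):
    """Suggest people who previously shared an email with the current recipients.

    past_emails: list of recipient lists from earlier messages (hypothetical data).
    current_recipients: names already typed into the "To" field.
    The ranking is plain co-occurrence counting; it knows nothing about
    what the message says or why someone was left off on purpose.
    """
    current = set(current_recipients)
    counts = Counter()
    for recipients in past_emails:
        recipients = set(recipients)
        if recipients & current:
            # Everyone else on that earlier email becomes a candidate.
            counts.update(recipients - current)
    return [name for name, _ in counts.most_common(limit)]


if __name__ == "__main__":
    history = [
        ["colleague", "boss"],
        ["colleague", "boss"],
        ["colleague", "beth"],
    ]
    # Writing to the colleague alone still surfaces the boss and Beth.
    print(suggest_recipients(history, ["colleague"]))
```

A counter like this has no way to tell a status report from a complaint about the boss, or a lunch invitation from a surprise party plan, which is exactly the problem.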

And…what if the person is DEAD?

I found this on the Google forums:

“I deleted my dead friend as a contact which was traumatic enough, but having google STILL suggesting I include her when there’s honestly nothing I’d like better than to be able to include her BECAUSE SHE’S DEAD. How do I make this stop?!?!?!”

Note to Google:

Please get out of the business of reading our minds. You suck at it.

Peter Thomas concludes: “This “feature” is bad enough to have merited me writing to Google asking them to remove it, or at least make it optional. Their support forums are full of people saying the same. It will be interesting to see whether or not they listen.”

Do a search for “consider including”, and you’ll come across several people railing against this act of stupidity from Google. My blog post is not original. Yet I feel compelled to add another voice to the chorus.

Do I have any conclusions or insights? Not really, other than my opinion that any good thing can turn bad when it gets too big and too powerful. Google is generally a good thing. But I think Google is getting too big and too powerful. And I am getting smaller and less powerful, in relative terms. I want to be completely in charge of how I communicate with my friends and colleagues.

The fact that Google is brimming with young, clever, cocky geeks does not make for an agreeable form of world domination.

IMHO.


Watson’s Avatar: Just Abstract Art?

February 17, 2011

This video describes the visual design of the “avatar” for Watson – the Jeopardy-playing AI that recently debuted on the Jeopardy show.

This is a lovely example of generative art. Fun to watch as it swirls and swarms and shimmers. But I do not think it is a masterpiece of avatar design – or even information design in general. The most successful information design, in my opinion, employs natural affordances – the property of expressing the function or true state of an animal or thing. Natural affordances are the product of millions of years of evolution. Graphical user interfaces, no matter how clever and pretty, rarely come close to offering the multimodal stimuli that allow a farmer to read the light of the sky to predict rain, or for a spouse to sense the sincerity of her partner’s words by watching his head motions and changes in gaze.

Watson’s avatar, like many other attempts at visualizing emotion, intent, or states of human communication, uses arbitrary visual effects. They may look cool, but they do not express anything very deep.

…although Ebroodle thinks there is something pretty deep going on with Watson, as in… world domination.

Despite my criticism, I do commend Joshua Davis, the artist who developed the avatar. It is difficult to design non-human visual representations of human expression and communication. But it is a worthy effort, considering the rampant uncanny valley effect that has infected so much virtual human craft for so long, caused by artists (or non-artists) trying to put a literal human face on something artificial.

What Was Watson Thinking?

Watson’s avatar takes the form of a sphere with a swarm of particles that swirl around it. The particles migrate up to the top when Watson is confident about its answer, and to the bottom when it is unsure. Four different colors are used to indicate levels of confidence. Green means very confident. Sounds pretty arbitrary. I’ve never been a fan of color for indicating emotion or states of mind – it is overused, and ultimately arbitrary. Too many other visual affordances are underutilized (such as styles of motion).
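The mapping described above is simple enough to sketch in a few lines of Python. This is only my guess at the logic, not the code behind the actual avatar: the function name `avatar_state`, the even four-way split of the confidence range, and every color except green are assumptions.

```python
# Only "green means very confident" comes from the description of the avatar.
# The other three colors are placeholders invented for this sketch.
LEVEL_COLORS = ["red", "orange", "yellow", "green"]


def avatar_state(confidence):
    """Map a confidence value in [0, 1] to (swarm height, color).

    Height runs from -1.0 (particles at the bottom of the sphere, unsure)
    to +1.0 (particles at the top, confident). The color is picked by
    splitting the confidence range into four equal bands (an assumption).
    """
    c = min(max(confidence, 0.0), 1.0)  # clamp out-of-range input
    height = 2.0 * c - 1.0
    level = min(int(c * 4), 3)  # confidence of exactly 1.0 stays in the top band
    return height, LEVEL_COLORS[level]


if __name__ == "__main__":
    for confidence in (0.1, 0.4, 0.6, 0.95):
        print(confidence, avatar_state(confidence))
```

Seeing it written out makes the arbitrariness plain: one number drives both position and color, and nothing about the way the particles move carries any meaning.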

Contradiction Between Visual and Audible Realism

Here’s something to ponder: Watson’s avatar is very abstract and arty. But Watson’s voice is realistic … and kinda droll. I think Watson’s less-than-perfect speech creates a sonic uncanny valley effect. Does the abstraction of Watson’s visual face help this problem, or make it more noticeable?

Is the uncanny valley effect aggravated when there is a discrepancy between visual and audible realism? I can say with more confidence that the same is true when visual realism is not met with complementary behavioral realism (as I discuss in my book).

Am I saying that Watson should have a realistic human face – to match its voice? Not at all! That would be a disaster. But this doesn’t mean that its maker can craft abstract shapes and motions with reckless abandon. Indeed, the perception of shapes and colors changing over time – accompanied by sound – is the basis for all body language interpretation – it penetrates deep into the ancient communicative energy of planet Earth. Body language is the primary communication channel of humans as well as all animals. Understanding these ancient affordances is a good way to become good at information design.

Hmm – I just got an image of Max Headroom in my mind. Max Headroom had an electronic stutter in his voice as well as in his visual manifestation. Audio and video were complementary. It would be kinda fun to see Max as Watson’s avatar.

What do you think, my dear Watson?