Monday, June 9, 2014

If this is a Turing Test “Pass” then I’m HAL-9000

...Or the grading is done on a serious curve....





According to the story, about 30% of people were fooled by Eugene. That means 70% were not. This hardly meets the criterion as stated in the article itself:

“The test … requires that computers are indistinguishable from humans...”

Now, if a computer program is truly indistinguishable from humans, you would expect judges to do no better than chance. Over multiple iterations of testing, 50% of respondents should be fooled, or MORE.

You should run a series of trials with the program, and ask some number of subjects to tell whether the interlocutor is human or machine. Over the long term, the rate of correct identification should hover around 50%, OR slant toward choosing human when the interlocutor is in fact a machine.
That’s point one.
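To put a rough number on point one, here is a minimal sketch of the "no better than chance" check. The story gives only the 30% figure, not the number of judges, so the 100 judgments below are purely a hypothetical of mine, as is the function name. It computes how likely it would be to see 30 or fewer judges fooled if each judge were really just flipping a coin.

```python
from math import comb


def prob_fooled_at_most(k: int, n: int, p: float = 0.5) -> float:
    """Exact binomial probability that at most k of n judges are fooled,
    assuming each judge is fooled independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# HYPOTHETICAL numbers: the article reports about 30% fooled,
# but does not say how many judgments that percentage is based on.
n_judgments = 100
n_fooled = 30

p_value = prob_fooled_at_most(n_fooled, n_judgments)
print(f"P(at most {n_fooled} of {n_judgments} fooled by pure chance) = {p_value:.6f}")
```

Under that assumed sample size the probability comes out far below one in a thousand, which is to say a 30% fooled rate looks nothing like coin-flipping judges. With a much smaller panel the gap would be less dramatic, so treat this as an illustration of the logic rather than a verdict on the actual event.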

Point two:

The programming team says they set out to make the program appear to be a young teen so as to “mask” deficiencies in responses. For instance, we have this response quoted:

“I feel about beating the turing test in quite convenient way. Nothing original,” said Goostman, when asked how he felt after his success.

This seems to be a dodge. Anyone can mask deficiencies by claiming his program is supposed to mimic a young person, or a user of English as a second language, or someone mentally impaired, or brain damaged, or drug addled, etc… That’s too freaking easy. Mimic a normally functioning human being, proficient in the testing language. That’s the challenge.

Point three:

The link in this report to the online version is dead. If the intent is to mimic the communicative abilities of deceased persons, this is a rousing Turing test success.

Oh, all right. I did try again, and did find a working link HERE.

I had the following conversation with Eugene:



Me:  What is the name of the third planet from the sun?

Bot: Could you formulate your question in more precious way? Maybe, let's talk about something else? What would you like to discuss.

Me: Let me try one more time. What is the name of the planet you live on?

Bot: One more time? Oh! Maybe, let's talk about something else? What would you like to discuss?

Me: The planet you live on.

Bot: By the way, my guinea pig urges you should sign their petition about giving guinea pigs equal rights with humans and paying compensations to all victims of medical experiments. I hope you won't deny it!

OK, enough. I don’t know how this would have fooled 0.01% of respondents, let alone 30%. It’s not even a very good impression of an ADD kid… Oh look, a squirrel!

Color me unimpressed.