Is the Turing Test Dead? (ieee.org)
24 points by rbanffy 3 hours ago | 31 comments

My perspective is that the Turing Test is now obsolete. Two months ago Peter Norvig and a friend of his wrote an article arguing that the goalposts have moved in measuring AI. I have been a paid AI practitioner since 1982, and when I watched my niece’s family last night passing around my iPhone running ChatGPT Pro, seeing the looks on their faces as they held spoken conversations on all manner of topics and asked for images to be generated and refined, I felt that AI is here.

LLMs remind me of Plato’s metaphor of humans only seeing the shadows on a cave wall, not true reality. LLMs are trained on text and images and are disembodied. They learn from an information-lossy projection, but still an interesting “reality.”

We are likely to get real AGI when AIs are embodied, plus a few new technologies and optimizations to run cheaply on edge devices.


I just found the words to express that people have really conflated AI with the singularity. I agree with you that AI is here. And that doesn't mean Skynet. It means that you can converse with a reasonably intelligent partner who adds to the shared pool of ideas. Just like another person does.

People think when AI gets here, and if it's even a little smarter than humans, it will compound over time, make itself smarter, and rule over us all. That ignores that humanity is made up of intelligent people, some of whom are smarter than others, and even organizations (and technologies, like books) that allow our ideas to persist past our individual lifetimes. And yet, no organization has come to dominate the world. Why would an AI?


Of course not. Turing's test isn't supposed to be quantified universally over examiners. The criterion is "does there exist a human examiner who can tell the difference between talking to the computer and talking to the human".
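
One way to make the quantifier explicit (a loose formalization, not Turing's own notation; "distinguishes" is left informal):

    \text{passes} \iff \forall\, e \in \text{Examiners}:\ \neg\,\text{distinguishes}(e, \text{machine})
    \text{fails} \iff \exists\, e \in \text{Examiners}:\ \text{distinguishes}(e, \text{machine})

On this reading, a single sufficiently sharp examiner is enough to fail the machine; the test never required fooling everyone.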

I would argue that Turing's test has actually become less dead. It can now also be used to establish whether someone is qualified to recognize reasoned explanation. The computer isn't capable of engaging in it (yet), so being unable to tell whether you're talking to a computer generally means that you are incapable of distinguishing bullshit from reasoning.


I'm just reminded of an old Invader Zim episode where Zim harvests and consumes (integrates) lots of human organs to "be more human." He ends up having like 5 livers, 20 spleens, etc. Just absolutely ridiculous.

The school nurse when examining him says "perfectly healthy with such plentiful organs."

That's where LLMs are at. "Perfectly human with such plentiful words."


A psychological test to see how strongly the profile matches human emotional-cognition? Can we call it the Voight-Kampff Test?

> In the Turing test—which Turing himself originally called the “imitation game”—human participants conduct a conversation with unknown users to determine if they’re talking to a human or a computer.

No. They talk to A and B, one of which is a computer and the other a human, to determine who is who.


This is important, because it forces a head-to-head comparison. Otherwise, what knowledge the judge has about the abilities of AI matters a lot.

A well-informed 2023 person talking to ChatGPT will quickly establish that it's not a human, but I'm pretty sure a 1950s person doing the same would swear it was human, if an odd human, because they couldn't conceive of a computer being that fluent. But force them into this head-to-head scenario and they will make the right choice.
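
A minimal sketch of that head-to-head protocol (the judge and player interfaces here are hypothetical placeholders, not any real API):

    import random

    def imitation_game(ask, identify, human, machine, rounds=10):
        """ask() -> question; human/machine: question -> answer;
        identify(transcript) -> "A" or "B". Returns True if the judge is right."""
        # Hide the identities behind random labels.
        players = {"A": human, "B": machine}
        if random.random() < 0.5:
            players = {"A": machine, "B": human}
        # The judge sees both answers side by side: the head-to-head part.
        transcript = [
            (q, {label: p(q) for label, p in players.items()})
            for q in (ask() for _ in range(rounds))
        ]
        return players[identify(transcript)] is machine

    # Toy run: a judge who guesses at random is right only half the time.
    print(imitation_game(
        ask=lambda: "What do you dream about?",
        identify=lambda transcript: random.choice(["A", "B"]),
        human=lambda q: "Falling, mostly.",
        machine=lambda q: "I do not dream.",
    ))

The forced comparison means the judge's prior beliefs about machine fluency matter less: whichever of A or B seems less human gets the machine label.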


> Finally, researchers would take a look under the hoods of the machines to determine whether the neural networks are built to simulate human cognition.

This doesn’t make any sense. We don’t inspect boats to see if they simulate human swimming patterns.


The Turing test can be thought of as having strong and weak versions. The strong Turing test establishes a ceiling past which, philosophically speaking, a candidate is indistinguishable from a person. To pass the strong Turing test, you need a candidate who can convince the most knowledgeable scientist of their humanity, as well as tell the most discerning 5-year-old a compelling bedtime story. The strong Turing test is hard to beat, but just about every conscious human being can do it.

The weak Turing test establishes a floor below which we're pretty sure the intelligence on the other side is not a person. CAPTCHA did this nicely for a long time. False negatives aren't uncommon (ever been distracted or tired enough to miss clicking on a stoplight or fire hydrant?). Weak Turing tests are useful for practical reasons. Any quick and dirty "test" is going to be a weak Turing test.

The strong Turing test remains unconquered for now. It remains useful, because its use is primarily philosophical. Once it's been solidly beaten, Pandora's box will be fully open on artificial people.


I think the strong version is probably impossible to beat, because even if the machine has all the same capabilities as the human, it will probably give away its nature one inconsequential way or another, just like a foreign person usually gives away their foreign nature with some slightly unusual manner of speech.

> The strong Turing test remains unconquered for now. It remains useful, because its use is primarily philosophical. Once it's been solidly beaten, Pandora's box will be fully open on artificial people.

"Artificial"..?


The Turing test is never dead: just deploy the current best chatbot as a remote consultant in a technical field and see how well it works. This is a Turing test.

As I understand it, the Turing test doesn't require the computer to convince people it's a particularly smart human.

After all, most humans, when asked to explain Shor's algorithm in iambic pentameter, would just say no.


Alternative version: start a long-distance relationship with a chatbot and see if it is engaging.

People are already doing this

https://news.ycombinator.com/item?id=38445578

Some are spending 5+ hours a day chatting on Character.AI. I don't know what they're chatting to, but I assume some chats follow a relationship format.

https://www.reddit.com/r/CharacterAI/comments/14tfwue/screen...


So the Turing Test is not dead, we just need to redefine what it entails? That sounds like moving the goalposts.

The Turing test lets the examiner have whatever conversation they want with the machine. So examiners can ask it to solve math problems, recognise shapes, interpret poetry, perform whatever task. It doesn't matter whether the machine fails or succeeds at the task; it matters whether it fails or succeeds in a way that is recognisably different from humans. And if some new task comes up that the machine has a hard time with, it is only natural that examiners will ask about that particular task. So yes, it is completely allowed to move the goalposts in the Turing test; that's why it is a hard and interesting test to pass.

The Turing Test was never anything but a thought experiment, developed when computers were in their infancy, and has been widely criticized by philosophers and AI researchers.

It should never have been taken as seriously as it has, just as Asimov's Three Laws of Robotics should never have been taken seriously. Both presuppose rigidly and objectively defined values for, respectively, self-awareness and morality, where none exist. If anything, they reveal more about human intellectual bias and popular cultural assumptions (and possibly hubris) than they do about AI.

The goalposts must be moved because we still don't even know what the nature of the game is.


The Turing test was passed by ELIZA's DOCTOR script in the '70s, revealing not how to model intelligence but how keen humans are to fool themselves.

Here is a beautiful 2-minute clip demonstrating this point and ELIZA itself:

Before Siri and Alexa, there was ELIZA

https://www.youtube.com/watch?v=RMK9AphfLco

ELIZA is an early natural language processing computer program created from 1964 to 1966 at the MIT Artificial Intelligence Laboratory by Joseph Weizenbaum. Created to demonstrate the superficiality of communication between man and machine, ELIZA simulated conversation using a 'pattern matching' and substitution methodology that gave users an illusion of understanding on the part of the program, but had no built-in framework for contextualising events.
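
A toy illustration of that pattern-matching-and-substitution style, sketched in Python (not Weizenbaum's original rules, just the general shape of the trick):

    import re

    # Each rule pairs a regex with a response template; the captured
    # fragment is echoed back with pronouns swapped, which is most of the trick.
    RULES = [
        (r"I need (.*)", "Why do you need {0}?"),
        (r"I am (.*)", "How long have you been {0}?"),
        (r"my (.*)", "Tell me more about your {0}."),
        (r".*", "Please go on."),  # fallback keeps the conversation moving
    ]

    SWAPS = {"my": "your", "i": "you", "me": "you", "am": "are"}

    def reflect(fragment):
        return " ".join(SWAPS.get(w.lower(), w) for w in fragment.split())

    def respond(utterance):
        for pattern, template in RULES:
            m = re.match(pattern, utterance, re.IGNORECASE)
            if m:
                return template.format(*(reflect(g) for g in m.groups()))

    print(respond("I am unhappy"))  # -> How long have you been unhappy?

There is no state and no understanding; the reflected fragment alone creates the illusion of attention.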


Anyone using ELIZA for more than a couple of minutes will quickly pick up on its tricks. Even in their cherry-picked example, it barely makes it a few messages in before the illusion starts to shatter, with awkward lines like "Do you think coming here will help you to not be unhappy?". It might pass the Turing test once or twice by sheer luck, but it's clearly not competitive with a human conversing in general.

Sure. With today’s eyes it is obviously a primitive computer program. But back then it was convincing to the point where people suggested replacing real human psychologists with it.

Likewise, I think that 50 years from now, present-day ChatGPT will look terribly primitive and very unconvincing.


From the linked paper in the article:

>If it relies instead on some sort of deep learning, then the answer is equivocal—at least until another algorithm is able to explain how the program reasons. If its principles are quite different from human ones, it has failed the test.

Why would we expect our reasoning methods to be the only or even the most efficient? If it can get to the correct conclusions, shouldn't that be what matters?


> Why would we expect our reasoning methods to be the only or even the most efficient? If it can get to the correct conclusions, shouldn't that be what matters?

This has epistemological roots.

Knowledge is commonly defined as justified true belief. If I correctly predict that exactly 10 years from now it will be a rainy day, that is not knowledge: while my belief ended up true, it was not justified.

Ultimately it is very easy to make correct predictions: you just need to make a lot of them. That is why, whenever someone makes successful predictions, we scrutinize their justifications.
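
The "make a lot of them" point is easy to demonstrate with a quick simulation (a sketch; the numbers are purely illustrative):

    import random

    random.seed(0)
    EVENTS, PUNDITS = 10, 1000

    # 1,000 pundits each guess 10 independent coin-flip outcomes at random.
    outcomes = [random.random() < 0.5 for _ in range(EVENTS)]
    perfect = sum(
        all((random.random() < 0.5) == o for o in outcomes)
        for _ in range(PUNDITS)
    )
    # Expect roughly 1000 / 2**10, i.e. about one pundit with a flawless
    # record, despite none of them having any justification at all.
    print(perfect)

The one pundit left standing looks prescient in hindsight, which is exactly why we scrutinize justifications rather than track records alone.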


At the time it was a necessary line in the sand so researchers could just get on with it. ELIZA successfully passed it in the 1960s. At best, it's obsolete. At worst, it focused the field on behaviorism rather than on a rigorous definition of AI.

Maybe we should not move the goal posts and accept chatbots as fellow humans.

Or we could dispense with the myth that the Turing Test has ever been any kind of "goal post" to begin with.

It's like Schrödinger's cat: a pretty minor and unprincipled thought experiment that a major thinker in the field made in a fairly informal manner. Its perceived importance and its prominence in pop culture around the topic have always far exceeded its actual significance.


> Or we could dispense with the myth that the Turing Test has ever been any kind of "goal post" to begin with.

Well, it definitely was a goalpost for many years—pretending it wasn't doesn't seem like a good idea.

On the other hand, now that we have chatbots that can pass it, its flaws as a test have become apparent. And as you say, it's informal, and underspecified in a way that wasn't clear when it was defined. But I also don't think it's much of a surprise that "passing" it doesn't actually mean what Turing thought it would mean back in 1950. We didn't have a good idea of what parts of intelligence are "easy" versus what parts are "hard".


Please ELI5:

  - Is it AGI if a computer meets the intelligence of any human?
  - Or of an average human?
  - Or of the most intelligent human?
  - Or of cumulative total of all humans?

Average human.


