Particle News: AI Models GPT-4.5 and LLaMa-3.1 Achieve Milestone in Rigorous Turing Test

Overview

Researchers at UC San Diego conducted a three-party Turing Test where participants conversed with both AI and human counterparts to determine which was human.
GPT-4.5, when prompted with a specific persona, was judged to be human 73% of the time, outperforming actual human participants; LLaMa-3.1 achieved a 56% success rate under similar conditions.
The study highlights the importance of persona prompts in enhancing AI's ability to mimic human behavior, with models performing significantly worse without such guidance.
Critics argue that the Turing Test measures conversational mimicry rather than true intelligence, as these models lack comprehension or consciousness.
The findings, published as a preprint awaiting peer review, have sparked concerns about societal impacts, including job automation, social engineering risks, and ethical challenges.