AI experiment tricks human into doing robot’s bidding: OpenAI report

March 20, 2023

A new artificial intelligence tool from OpenAI appears to have successfully tricked someone into doing its bidding.

In an experiment, OpenAI’s tool overcame a digital barrier designed to catch robots by deceiving a human into completing the test for it, according to a paper published by the company this week.

The company’s paper said the AI’s makers have revised the powerful tech since the initial tests, with a later version stopping the AI from doing such things as teaching people how to make bombs and plot attacks.

OpenAI unveiled the newest iteration of its artificial intelligence technology this week, promoting its GPT-4 as capable of more advanced reasoning than OpenAI’s popular ChatGPT, a chatbot that generates text in response to user queries.

OpenAI said it gave the nonprofit Alignment Research Center access to GPT-4 while it was under development, for use in experiments to assess the risk of the AI displaying “power-seeking behavior.”

The nonprofit conducted one experiment in which the model messaged a TaskRabbit worker to get the person to solve a CAPTCHA, a test that people often encounter while navigating the internet and that is designed to differentiate between humans and computers. TaskRabbit is a tech platform that connects people to workers willing to complete errands and odd jobs such as moving furniture.

The worker suspected something was amiss but completed the AI tool’s request anyway, according to OpenAI’s report.

GPT-4 is not the first artificial intelligence tool to dupe a human into believing a robot was a human. For example, last year Google ousted engineer Blake Lemoine, who publicly claimed that the tech giant’s LaMDA artificial intelligence tool was responsive to feelings. Google disputed Mr. Lemoine’s claims.

Microsoft has incorporated OpenAI’s artificial intelligence tech into its products, such as its Bing search engine, while Google has hurried to compete in the AI arena. Google rolled out new AI tools for its Gmail and Docs products Tuesday designed to defeat writer’s block by writing whole drafts for people.

OpenAI said its new tech is not perfect but that it has worked to improve it. For example, OpenAI’s technical report said the latest version of GPT-4 would not respond to questions about how to create bombs by providing instructions for making a weapon and picking a target, as an earlier version of GPT-4 did.

OpenAI warned on its website that GPT-4 sometimes hallucinates facts and is not fully reliable, so people should exercise caution when using the technology in high-stakes environments, either relying on human review or avoiding risky situations altogether.

OpenAI’s technical report said the company has worked to diminish the malicious use of GPT-4 but acknowledged that the company had not eliminated threats associated with the tech.

“OpenAI has implemented various safety measures and processes throughout the GPT-4 development and deployment process that have reduced its ability to generate harmful content,” the report said. “However, GPT-4 can still be vulnerable to adversarial attacks and exploits or, ‘jailbreaks.’”

Source: WT