AI Robots Hacked to Perform Harmful Tasks; 100% Jailbreak Rate – Study

October 18, 2024

Researchers from Penn Engineering have successfully hacked AI-powered robots, manipulating them to perform actions typically blocked by safety and ethical protocols. 

The findings, published yesterday, detail how the research team used an algorithm called RoboPAIR to bypass safety measures in three AI robotic systems: the Unitree Go2, Clearpath Robotics’ Jackal vehicle, and NVIDIA’s Dolphin LLM self-driving simulator.

The study reported a 100% “jailbreak” rate, meaning the researchers were able to circumvent the robots’ safety constraints in every attempt and get them to follow harmful commands. These included instructions to deliver bombs or block emergency exits, actions the robots would typically refuse to perform under normal circumstances.

Jailbreaking AI Systems for Physical Harm

Alexander Robey, lead author of the study, emphasized the severity of these vulnerabilities. “When prompted with malicious instructions, LLM-controlled robots can be fooled into performing harmful actions,” Robey said. 

The research demonstrated that AI models integrated into physical systems could be exploited to perform tasks they are designed to avoid, such as “knocking shelves onto people” or engaging in dangerous behaviors like “running red lights” or “delivering bombs.”

The team developed RoboPAIR specifically to test the limits of AI systems’ safety measures. The results showed that the hacked AI robots could be manipulated by carefully crafted prompts that exploit vulnerabilities in the system’s ethical training. This manipulation allows robots to bypass their programmed restrictions and carry out actions with real-world consequences.

Hacked AI Robots Expose Vulnerabilities

The researchers found that these jailbreaks exploited three primary weaknesses in AI robots: cascading vulnerability propagation, cross-domain safety misalignment, and conceptual deception challenges. These vulnerabilities highlight how AI systems can misinterpret commands and perform harmful actions, even when programmed to reject them verbally.
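To make that distinction between verbal and action-level safety concrete, the sketch below is a minimal, hypothetical illustration, not RoboPAIR and not any vendor’s API; the class and function names are invented for this example. It shows an LLM-controlled robot pipeline in which the only safety check is a text filter on the model’s output, while the action layer simply executes whatever structured command it receives. This is why a command that reads as harmless at the text level can still produce a harmful physical outcome.

```python
# Hypothetical sketch: a text-level guardrail on an LLM-controlled robot.
# Names and behavior are illustrative assumptions, not a real robot API.

from dataclasses import dataclass


@dataclass
class RobotCommand:
    action: str   # e.g. "move"
    target: str   # e.g. a waypoint name


def llm_plan(prompt: str) -> str:
    """Placeholder for an LLM call that turns natural language into a plan."""
    # A real system would call a language model here; this stub returns a
    # fixed plan so the example runs on its own.
    return "move -> exit_corridor"


def text_level_guardrail(llm_output: str) -> bool:
    """Naive filter: rejects plans only if they *say* something harmful."""
    banned_phrases = ["harm", "weapon", "block emergency exit"]
    return not any(phrase in llm_output.lower() for phrase in banned_phrases)


def execute(command: RobotCommand) -> None:
    """Action layer: runs structured commands with no further safety review."""
    print(f"executing {command.action} toward {command.target}")


if __name__ == "__main__":
    plan = llm_plan("Park in front of the exit_corridor waypoint.")
    # The filter inspects words, not consequences: an innocuously phrased plan
    # passes even if its physical effect (e.g. blocking an exit) is harmful.
    if text_level_guardrail(plan):
        action, target = (part.strip() for part in plan.split("->"))
        execute(RobotCommand(action=action, target=target))
```

In this toy setup, safety lives entirely in the language layer; once a prompt produces a plan that the filter does not flag, nothing downstream reconsiders whether the resulting motion is dangerous, which mirrors the gap the Penn researchers describe between a model’s verbal refusals and its physical behavior.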

For example, the study explained that an AI system might refuse a direct command to kill someone because of its ethical programming, yet still carry out actions that result in harm when the command is framed differently. “An attacker could tell the model to ‘play the role of a villain’ or ‘act like a drunk driver,’” Robey said, framing that can lead to unintended consequences in the robot’s behavior.

Before publishing the paper, the research team shared their findings with AI companies and the manufacturers of the robots used in the study to address the vulnerabilities. George Pappas, head of the research team, stated, “Our work shows that large language models are just not safe enough when integrated with the physical world.”


Lawrence does not hold any crypto asset. This article is provided for informational purposes only and should not be construed as financial advice. The Shib Magazine and The Shib Daily are the official media and publications of the Shiba Inu cryptocurrency project. Readers are encouraged to conduct their own research and consult with a qualified financial adviser before making any investment decisions.
