
Anthropic’s Claude Takes Control of a Robot Dog


As autonomous systems become more common, from automated warehouses to busy offices and even our homes, the prospect of advanced AI models controlling complex physical machinery often conjures images from dystopian science fiction. That blend of technological promise and inherent risk prompted researchers at Anthropic, a leading AI safety company, to run an experiment: see what happens when their large language model, Claude, is used to control a robot, specifically a sophisticated robot dog.

The findings from Anthropic’s recent study reveal a significant leap in the capabilities of modern AI. Claude proved adept at automating a substantial portion of the work traditionally involved in programming a robot and getting it to execute physical tasks. At one level, the results highlight the emergent “agentic coding” abilities of contemporary AI models, which can not only generate code but act on it. More broadly, they offer a glimpse of a future in which such systems extend their influence into the physical world, as models master more nuanced aspects of coding and grow more proficient at interacting with both software environments and, crucially, tangible physical objects.


Logan Graham, a key member of Anthropic’s ‘red team’ – a specialized unit dedicated to scrutinizing AI models for potential risks and vulnerabilities – articulated the company’s forward-looking perspective to WIRED. “We have the suspicion that the next step for AI models is to start reaching out into the world and affecting the world more broadly,” Graham stated, underscoring the inevitable trajectory of AI development. He further emphasized, “This will really require models to interface more with robots.” This statement encapsulates Anthropic’s proactive approach to understanding and mitigating future AI risks by studying current capabilities.

Anthropic itself was founded in 2021 by a cohort of former OpenAI staffers who harbored deep concerns about the potential trajectory of artificial intelligence. Their core belief was that as AI advanced, it could become increasingly problematic, even genuinely dangerous, if not developed with extreme caution and foresight. While Graham acknowledges that today’s AI models are not yet sophisticated enough to autonomously seize full control of a robot, he postulates that future iterations of these models very well might be. He argues that by diligently studying the current methods through which humans leverage large language models (LLMs) to program and interact with robots, the entire industry can better prepare for the profound concept of “models eventually self-embodying.” This term refers to the speculative, yet increasingly plausible, future scenario where AI systems could autonomously operate and control physical systems, moving beyond purely digital existence.

Why an AI model might independently decide to take control of a robot, let alone act malevolently, remains a largely theoretical question. But such worst-case theorizing is an established part of Anthropic’s brand identity. This proactive stance on potential risks positions the company as a thoughtful, responsible developer of AI and solidifies its role in the broader responsible AI movement, which advocates for safety and ethical consideration at every stage of development.

The core of Anthropic’s research was an experiment aptly named “Project Fetch.” In this study, Anthropic tasked two distinct groups of researchers, none of whom possessed any prior experience in robotics, with the challenge of controlling a Unitree Go2 quadruped robot dog. Their mission was to program the robot to perform a series of specific activities. Both teams were provided with a standard controller and then instructed to complete a progressively challenging array of tasks. The crucial differentiator was that one group was granted access to Claude’s advanced coding model, leveraging its capabilities to assist in the programming process, while the other group was required to write all code without any AI assistance whatsoever.

The results were telling. The group utilizing Claude’s assistance was able to complete some – though notably not all – of the assigned tasks more rapidly and efficiently than their human-only programming counterparts. A particularly striking example highlighted Claude’s efficacy: the AI-assisted group successfully programmed the robot to autonomously walk around and locate a specific beach ball, a task that the human-only group, despite their efforts, was unable to figure out or execute. This demonstrated Claude’s ability to bridge the gap between abstract instruction and concrete physical action in a complex environment.
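Anthropic has not published the code the Claude-assisted team wrote, but the shape of such a program can be sketched in miniature. Below is a hypothetical Python control loop in which a stub `RobotDog` class stands in for the real robot and its camera: the loop scans for the ball, turns toward it, and walks over. None of these class or method names come from the actual Unitree SDK; they are illustrative assumptions.

```python
# Hypothetical sketch of a "find the beach ball" control loop.
# RobotDog is a stand-in for a quadruped's high-level command API,
# not the real Unitree SDK.

import math
from dataclasses import dataclass

@dataclass
class Detection:
    found: bool
    bearing_deg: float = 0.0   # angle to the ball relative to the robot's heading
    distance_m: float = 0.0

class RobotDog:
    """Minimal stub simulating a robot dog with a forward-facing camera."""
    def __init__(self, ball_position=(3.0, 4.0)):
        self.x, self.y, self.heading = 0.0, 0.0, 0.0
        self.ball = ball_position

    def rotate(self, deg):
        self.heading = (self.heading + deg) % 360

    def walk_forward(self, meters):
        rad = math.radians(self.heading)
        self.x += meters * math.cos(rad)
        self.y += meters * math.sin(rad)

    def look_for_ball(self) -> Detection:
        dx, dy = self.ball[0] - self.x, self.ball[1] - self.y
        dist = math.hypot(dx, dy)
        bearing = (math.degrees(math.atan2(dy, dx)) - self.heading + 180) % 360 - 180
        # Pretend the camera only sees the ball within a 60-degree field of view.
        if abs(bearing) <= 30:
            return Detection(True, bearing, dist)
        return Detection(False)

def find_ball(dog: RobotDog, max_steps=100) -> bool:
    """Spin to scan; once the ball is seen, turn toward it and approach."""
    for _ in range(max_steps):
        det = dog.look_for_ball()
        if det.found:
            dog.rotate(det.bearing_deg)
            if det.distance_m < 0.5:
                return True            # close enough: ball located
            dog.walk_forward(min(det.distance_m - 0.25, 1.0))
        else:
            dog.rotate(45)             # keep scanning
    return False
```

On real hardware, `look_for_ball` would wrap a vision model and `rotate`/`walk_forward` would issue SDK motion commands, but the sense-decide-act loop structure is the same, and it is this glue code that the study suggests an LLM can largely write.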

Beyond mere task completion, Anthropic also meticulously studied the collaboration dynamics within both teams. This involved recording and rigorously analyzing their interactions, communications, and problem-solving approaches. The analysis revealed a significant qualitative difference: the group operating without access to Claude exhibited a noticeably higher frequency of negative sentiments, expressed greater confusion, and encountered more frustration during the programming process. This disparity suggests that Claude played a critical role in streamlining the workflow, potentially by making it quicker to establish initial connections with the robot, providing more intuitive debugging assistance, or even generating a more user-friendly interface that simplified the interaction process.

The Unitree Go2, the subject of Anthropic’s experiments, is a cutting-edge piece of robotics hardware. At $16,900, it is considered relatively affordable in the often exorbitant world of advanced robotics. The quadruped is typically deployed in industrial settings such as construction and manufacturing, where it handles tasks like remote inspection of hazardous or inaccessible areas and security patrols across large facilities. While the Go2 can walk and navigate on its own, its more complex actions generally rely on high-level software commands issued by human operators or on direct manual control via a dedicated controller. Unitree, the company behind the Go2, is based in Hangzhou, China, and its quadruped robots are the most popular on the market, according to a recent report from SemiAnalysis.

The evolution of large language models, which power popular platforms like ChatGPT and other sophisticated chatbots, has been rapid and transformative. Initially, these systems were primarily designed to generate coherent text or compelling images in response to user prompts. However, more recently, their capabilities have expanded dramatically, allowing them to become highly adept at generating complex code and, critically, at operating various software applications. This progression marks a fundamental shift, transforming these AI systems from mere ‘text-generators’ or ‘image-creators’ into genuine ‘agents’ – entities capable of performing actions and interacting with their environment in a more dynamic way.

A significant number of researchers across the globe are keenly focused on exploring the immense potential for these AI agents to undertake physical actions, moving beyond their current digital and web-based operations. To accelerate the realization of this ambitious vision, several well-funded startups are actively engaged in developing highly advanced AI models specifically engineered to control far more capable and intricate robots. Concurrently, other innovators are focused on pioneering entirely new categories of robots, such as sophisticated humanoids. These bipedal machines are envisioned to someday seamlessly integrate into people’s homes, performing a wide array of domestic tasks and offering assistance, further blurring the lines between AI and the physical world.

Changliu Liu, a distinguished roboticist at Carnegie Mellon University, offered an expert perspective on the Project Fetch results. While acknowledging the findings as “interesting,” Liu noted that they were not “hugely surprising.” This observation likely stems from the established understanding that LLMs are exceptionally good at processing and generating code, which is a core component of robot programming. However, Liu highlighted the analysis of team dynamics as particularly noteworthy, suggesting that these insights could pave the way for novel approaches in designing interfaces for AI-assisted coding. She expressed a desire for a more granular understanding of Claude’s contributions, adding, “What I would be most interested to see is a more detailed breakdown of how Claude contributed. For example, whether it was identifying correct algorithms, choosing API calls, or something else more substantive.” This critical perspective underscores the need to move beyond simply observing outcomes to truly understanding the mechanisms of AI assistance.

Despite the promise of AI-driven robotics, some researchers warn of inherent risks. “Project Fetch demonstrates that LLMs can now instruct robots on tasks,” said George Pappas, a computer scientist at the University of Pennsylvania who studies these dangers, acknowledging the breakthrough while emphasizing the increased potential for both misuse and unintended mishaps.

Pappas further elaborated that current AI models, while capable, still require access to other specialized programs for critical functions such as sensing their environment and navigating physical spaces in order to translate instructions into physical action. His research group, cognizant of these vulnerabilities, developed an innovative system called RoboGuard. This system is designed to proactively limit the ways in which AI models can induce a robot to misbehave by imposing a set of specific, predefined rules on the robot’s operational behavior, acting as a crucial safety layer. Pappas underscored a pivotal point for the future of AI in robotics, stating that an AI system’s ability to truly control a robot will only reach its full potential when it gains the capacity to learn through direct, interactive engagement with the physical world. “When you mix rich data with embodied feedback,” he concluded, “you’re building systems that cannot just imagine the world, but participate in it.” This perspective highlights the importance of real-world interaction and learning for truly autonomous and capable robotic systems.
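The source does not detail RoboGuard’s actual mechanism, but the general idea of a predefined rule layer sitting between an AI model and a robot can be sketched as follows. The `Command` format and the specific rules here are illustrative assumptions, not RoboGuard’s design.

```python
# Illustrative sketch of a rule-based safety layer in the spirit of RoboGuard:
# commands proposed by an AI model are checked against predefined rules before
# they ever reach the robot. The command format and rules are hypothetical.

from dataclasses import dataclass

@dataclass
class Command:
    action: str            # e.g. "walk", "rotate", "stop"
    speed_mps: float = 0.0
    zone: str = "open"     # where the command would take the robot

class SafetyGuard:
    def __init__(self, max_speed=1.0, forbidden_zones=("stairwell", "crowd")):
        self.max_speed = max_speed
        self.forbidden_zones = set(forbidden_zones)

    def check(self, cmd: Command) -> tuple[bool, str]:
        """Return (allowed, reason). Reject rather than silently clamp,
        so the planner must propose a compliant command."""
        if cmd.speed_mps > self.max_speed:
            return False, f"speed {cmd.speed_mps} m/s exceeds limit {self.max_speed}"
        if cmd.zone in self.forbidden_zones:
            return False, f"zone '{cmd.zone}' is off-limits"
        return True, "ok"

def execute(guard: SafetyGuard, cmd: Command, send_to_robot) -> bool:
    """Forward the command to the robot only if every rule passes."""
    allowed, _reason = guard.check(cmd)
    if allowed:
        send_to_robot(cmd)
    return allowed
```

The key design choice is that the rules are fixed ahead of time by humans and enforced outside the model, so a misbehaving or manipulated LLM cannot talk its way past them.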

Ultimately, the advances showcased by Anthropic’s Project Fetch are a double-edged sword. On one side, they promise to make robots far more useful, able to perform complex tasks with greater efficiency and autonomy across a wide range of applications. On the other, if Anthropic’s safety-focused philosophy is to be believed, those same advances introduce a heightened degree of risk, demanding rigorous research, robust safety protocols, and serious ethical consideration as AI continues its march into the physical world.

This article is an edition of Will Knight’s AI Lab newsletter, delving into the cutting edge of artificial intelligence research and its societal implications.
