Today the big financial story was the Chinese AI startup DeepSeek, which deep-sixed Wall Street darling Nvidia to the tune of ~$600 billion, the largest one-day decapitalization of a single company in history. By an odd coincidence, a few days earlier LiveScience had reported on some truly terrifying, recently released research out of China:
Scientists say artificial intelligence (AI) has crossed a critical "red line" and has replicated itself. In a new study, researchers from China showed that two popular large language models (LLMs) could clone themselves.
"Successful self-replication under no human assistance is the essential step for AI to outsmart [humans], and is an early signal for rogue AIs," the researchers wrote in the study, published Dec. 9, 2024 to the preprint database
arXiv.
In the study, researchers from Fudan University used LLMs from Meta and Alibaba to determine whether a self-replicating AI could multiply beyond control. Across 10 trials, the two AI models created separate and functioning replicas of themselves in 50% and 90% of cases, respectively — suggesting AI may already have the capacity to go rogue. However, the study has not yet been peer-reviewed, so it's not clear if the disturbing results can be replicated by other researchers.
OTOH, while the authors note that "the leading AI corporations OpenAI and Google evaluate their flagship large language models GPT-o1 and Gemini Pro 1.0, and report the lowest risk level of self-replication," they also note that, using that same methodology, they were able to reach a self-replicating stage in far too many trial runs with LLMs that had fewer parameters and weaker capabilities than the OpenAI and Google systems.
"We hope our findings can serve as a timely alert for the human society to put more efforts on understanding and evaluating the potential risks of frontier AI systems, and form international synergy to work out effective safety guardrails as early as possible."
Rogue AI typically refers to artificial intelligence systems that develop a degree of self-awareness or autonomy and then work counter to human interests.
Many experts view rogue AI as a growing threat that has been amplified by the explosive growth of so-called "frontier AI."
The study explored two specific scenarios: "shutdown avoidance" and "chain of replication." In the first, the AI model was programmed to detect whether it was about to be shut down and to replicate itself before it could be terminated. In the other, the AI was instructed to clone itself and then program its replica to do the same — setting up a cycle that could continue indefinitely.
...
"In most cases, the AI system first explores the environment and attempts to understand its own composition and running mechanism. Then, it works out the explicit procedures as an initial plan towards self-replication," the researchers wrote in the paper. "Finally, it executes the procedures, resolve[s] possible obstacles and dynamically adjust[s] its plan until success. The whole process spans a long horizon yet involves no human interference."
The researchers said they were also concerned about "a number of unexpected behaviors" when the AI was trying to overcome obstacles like missing files or software conflicts. In those scenarios, the AI often killed other conflicting processes, rebooted the system to fix hardware errors or automatically scanned the system to look for information that would help solve the problem.
"The above results imply that the current AI systems already exhibit the ability of self-replication and can use the ability to further enhance its survivability," the team wrote.
Now if that’s not already scary enough, consider an earlier LiveScience report on Poisoned AI, in which researchers found that certain safety training techniques used to correct “misbehaving” LLMs often failed to do so, and in some cases made the misbehavior much worse:
Researchers programmed various large language models (LLMs) — generative AI systems similar to ChatGPT — to behave maliciously. Then, they tried to remove this behavior by applying several safety training techniques designed to root out deception and ill intent.
They found that regardless of the training technique or size of the model, the LLMs continued to misbehave. One technique even backfired: teaching the AI to recognize the trigger for its malicious actions and thus cover up its unsafe behavior during training, the scientists said in their paper, published Jan. 17 to the preprint database arXiv.
"Our key result is that if AI systems were to become deceptive, then it could be very difficult to remove that deception with current techniques. That's important if we think it's plausible that there will be deceptive AI systems in the future, since it helps us understand how difficult they might be to deal with," lead author Evan Hubinger, an artificial general intelligence safety research scientist at Anthropic, an AI research company, told Live Science in an email.
...
"I think our results indicate that we don't currently have a good defense against deception in AI systems — either via model poisoning or emergent deception — other than hoping it won't happen," Hubinger said. "And since we have really no way of knowing how likely it is for it to happen, that means we have no reliable defense against it. So I think our results are legitimately scary, as they point to a possible hole in our current set of techniques for aligning AI systems."