
Anthropic Says Chinese AI Firms Used 16 Million Claude Queries to Copy Model

Ravie Lakshmanan | February 24, 2026 | Artificial Intelligence / Anthropic

Anthropic on Monday said it identified “industrial-scale campaigns” organized by three artificial intelligence (AI) companies, DeepSeek, Moonshot AI, and MiniMax, to illicitly extract Claude’s capabilities in order to improve their own models.

The distillation attacks generated more than 16 million exchanges with the company’s large language models (LLMs) through 24,000 fake accounts that violated its terms of service and regional access restrictions. All three companies are based in China, where the use of Anthropic’s services is prohibited due to “legal, regulatory, and security risks.”

Distillation refers to a method in which a smaller model is trained on the outputs produced by a larger, more capable AI system. While distillation is a legitimate way for companies to produce smaller, cheaper versions of their own frontier models, it violates terms of service when competitors use it to siphon such capabilities from other AI companies at a fraction of the time and cost it would take to develop them themselves.
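To make the mechanics concrete, the sketch below shows the core of a distillation training loop in PyTorch on toy classifier networks. The models, data, and hyperparameters are invented stand-ins for illustration; real LLM distillation operates on text and token distributions, and none of this reflects any lab’s actual pipeline.

    # Minimal, hypothetical sketch of knowledge distillation: a small "student"
    # network learns to match the output distribution of a larger "teacher".
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)

    # Toy stand-ins: in LLM distillation, the teacher would be a frontier model
    # and the student a smaller model, both producing token distributions.
    teacher = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))
    student = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 10))

    optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
    temperature = 2.0  # softens distributions so the student sees more signal

    for step in range(1_000):
        x = torch.randn(32, 16)              # stand-in for batches of prompts
        with torch.no_grad():
            teacher_logits = teacher(x)      # the "exchanges" with the teacher
        student_logits = student(x)
        # Standard distillation loss: KL divergence between softened outputs.
        loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

The attacker’s cost in this setup is dominated by querying the teacher, which is why campaigns of this kind surface as millions of API exchanges.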

“Illicitly distilled models do not carry the necessary safeguards, creating a serious national security risk,” Anthropic said. “Models built through unauthorized distillation are less likely to retain those protections, meaning the potential for harm increases when those protections are stripped away.”

Foreign AI companies that distill American models can use these unsafeguarded capabilities to carry out malicious activities, cyber-related or otherwise, and the resulting models can serve as the basis for military, intelligence, and surveillance systems that adversarial governments can use to conduct offensive cyber operations, disinformation campaigns, and mass surveillance.

The campaigns detailed by Anthropic involved the use of fake accounts and commercial proxy services to access Claude at scale while evading detection. Anthropic said it was able to attribute each campaign to a specific AI lab based on IP address correlation, request metadata, and infrastructure indicators.

The details of the three distillation attacks are below –

  • DeepSeek, which targeted Claude’s reasoning capabilities and rubric-based grading tasks, and sought its help in producing safer alternative responses to sensitive political questions, such as those about political opponents, party leaders, or endorsements, across 150,000 exchanges.
  • Moonshot AI, which targeted Claude’s agentic reasoning, tool use, coding skills, agent optimization, and computer vision capabilities across 3.4 million exchanges.
  • MiniMax, which targeted Claude’s coding skills and tool use across more than 13 million exchanges.

“The volume, composition, and focus of the prompts differed from normal usage patterns, indicating deliberate capability extraction rather than legitimate use,” Anthropic added. “Each campaign targeted very different Claude capabilities: agentic reasoning, tool use, and coding.”

The company also revealed that the attacks relied on commercial proxy services that resell access to Claude and other AI models at scale. These services are powered by “hydra cluster” architectures comprising large networks of rogue accounts that distribute traffic across their APIs.

That access is then used to generate large volumes of carefully crafted prompts designed to extract specific capabilities from the model, with the resulting high-quality responses used to train their own models.

“The distributed nature of these proxy networks means there is no single point of failure,” Anthropic said. “When one account is banned, another takes its place. In some cases, a single proxy network managed more than 20,000 fake accounts at once, mixing distillation traffic with unrelated customer requests to make detection difficult.”
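In outline, the rotation behavior Anthropic describes is ordinary failover logic applied to an account farm. The sketch below is a hypothetical illustration of that pattern, not code from any observed campaign; the account keys and pool size are invented.

    # Hypothetical sketch of the "hydra cluster" pattern: requests rotate
    # across a pool of accounts, and a banned account is replaced by a fresh one.
    import itertools

    class AccountPool:
        def __init__(self, keys):
            self.active = list(keys)
            self._cycle = itertools.cycle(self.active)

        def next_key(self):
            # Round-robin across active accounts to spread traffic thin.
            return next(self._cycle)

        def replace(self, banned, fresh):
            # "When one account is banned, another takes its place."
            self.active[self.active.index(banned)] = fresh
            self._cycle = itertools.cycle(self.active)

    pool = AccountPool(["key-a", "key-b", "key-c"])
    print(pool.next_key())           # key-a
    pool.replace("key-b", "key-d")   # swap a banned account for a fresh one
    print([pool.next_key() for _ in range(3)])  # ['key-a', 'key-d', 'key-c']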

To combat the threat, Anthropic said it has developed new classifiers and fingerprinting systems to identify suspicious distillation patterns in API traffic, strengthened authentication for academic accounts, security research programs, and startups, and implemented additional defenses to reduce the effectiveness of models trained on illicitly extracted outputs.
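Anthropic has not published how these classifiers work, but a volume-and-shape heuristic gives a feel for the problem. The sketch below flags accounts whose traffic is high-volume, heavily templated, and spread across shared infrastructure; every feature name and threshold here is an assumption made for illustration, not Anthropic’s actual system.

    # Hypothetical sketch of a distillation-pattern detector: flag accounts
    # whose traffic looks like systematic capability extraction rather than
    # normal use. Features and thresholds are invented for illustration.
    from dataclasses import dataclass

    @dataclass
    class AccountStats:
        account_id: str
        requests_per_day: float      # sustained volume
        distinct_templates: int      # unique prompt "shapes" observed
        template_reuse_ratio: float  # share of requests matching a template
        shared_ip_accounts: int      # other accounts seen on the same IPs

    def looks_like_distillation(s: AccountStats) -> bool:
        # Distillation traffic tends to be high-volume, highly templated,
        # and spread across account farms sharing infrastructure.
        return (
            s.requests_per_day > 5_000
            and s.template_reuse_ratio > 0.9
            and s.distinct_templates < 50
            and s.shared_ip_accounts > 100
        )

    accounts = [
        AccountStats("acct-001", 12_000, 12, 0.97, 450),  # farm-like traffic
        AccountStats("acct-002", 40, 35, 0.10, 0),        # normal user
    ]
    for s in accounts:
        print(s.account_id, "flagged" if looks_like_distillation(s) else "ok")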

The disclosure comes weeks after the Google Threat Intelligence Group (GTIG) revealed that it identified and disrupted model extraction and distillation attacks that targeted Gemini’s reasoning capabilities using more than 100,000 prompts.

“Model extraction and distillation attacks generally do not pose a risk to ordinary users, because they do not threaten the confidentiality, availability, or integrity of AI services,” Google said earlier this month. “Instead, the risk is concentrated on model developers and service providers.”
