Prevent Distillation Attacks: Detection Strategies Revealed
Recent investigations have uncovered large-scale operations by three AI research laboratories (DeepSeek, Moonshot, and MiniMax) that illicitly harvested Claude's outputs to improve their own models. Together they generated over 16 million interactions through roughly 24,000 fraudulent accounts, violating terms of service and access restrictions.
Understanding Distillation Attacks
The technique employed by these labs is known as “distillation.” In distillation, a less capable “student” model is trained on the outputs of a more advanced “teacher” model, inheriting much of the teacher's behavior at a fraction of the cost. While distillation has legitimate uses, its malicious application poses significant risks: competitors can obtain powerful AI capabilities from other labs without developing them from scratch, saving both time and resources.
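To make the mechanism concrete, the core of distillation is a loss that pushes the student's output distribution toward the teacher's. The sketch below is illustrative only (plain Python, toy logits); it shows the standard soft-label objective, not any lab's actual training pipeline:

```python
import math

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """Soft-label distillation loss: cross-entropy between the teacher's
    and the student's temperature-softened output distributions.
    Minimizing this trains the student to imitate the teacher."""
    def softmax(logits, t):
        exps = [math.exp(x / t) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    p = softmax(teacher_logits, temperature)  # teacher's "soft labels"
    q = softmax(student_logits, temperature)  # student's predictions
    # Cross-entropy H(p, q); smallest exactly when q matches p.
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

A temperature above 1 softens the teacher's distribution so that the relative probabilities of wrong answers (which carry much of the teacher's "knowledge") still influence the student.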
The Evolving Threat
The intensity and sophistication of these operations are increasing. The potential dangers extend beyond individual labs, affecting the entire industry and geopolitical landscape. Addressing these distillation attacks necessitates prompt, coordinated action among industry stakeholders, policymakers, and the global AI community.
Implications of Illicit Distillation
- Models acquired through distillation often lack essential safeguards, creating national security risks.
- Leading companies like Anthropic build safeguards into their systems to prevent misuse, such as assistance with bioweapon development.
- Illicitly distilled models are unlikely to maintain these protections, facilitating the proliferation of dangerous technologies.
These threats can empower authoritarian regimes, allowing them to use AI for offensive cyber operations and mass surveillance. If distilled models become open-source, the risks multiply, expanding capabilities beyond governmental control.
Distillation Attacks and Export Controls
Anthropic has long advocated for export controls to preserve the U.S. position in AI. Distillation attacks undermine these controls. Foreign labs can gain competitive advantages against U.S. innovations while avoiding the constraints imposed by export regulations. Additionally, these attacks give a misleading impression that export controls are ineffective.
Identifying the Campaigns
Each of the three identified campaigns used fraudulent accounts and proxy services to access Claude extensively. They employed unusual prompt structures and usage patterns, indicative of deliberate capability extraction.
DeepSeek’s Campaign
- Scale: Over 150,000 exchanges
- Focus: Reasoning capabilities across various tasks
- Actions: Employed load balancing to enhance reliability and evade detection
Moonshot’s Campaign
- Scale: Over 3.4 million exchanges
- Focus: Agentic reasoning and tool usage
- Actions: Utilized multiple types of accounts to obscure activities
MiniMax’s Campaign
- Scale: Over 13 million exchanges
- Focus: Agentic coding and orchestration
- Actions: Showed rapid adaptability in response to model updates
Accessing Advanced Models
To bypass restrictions, the laboratories used commercial proxy services that provision fraudulent accounts at scale, building expansive networks that spread API traffic across many identities. This structure reduces detection risk: when one account is banned, another can quickly replace it.
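From the defender's side, accounts in such a network can sometimes be linked because they collide on stable request characteristics even as individual account IDs rotate. The sketch below is a hypothetical illustration, not Anthropic's actual detection system: the `(account_id, fingerprint)` pairs and the idea of fingerprinting on, say, client headers plus prompt template are assumptions for the example.

```python
from collections import defaultdict

def cluster_accounts_by_fingerprint(requests):
    """Group accounts that share a request fingerprint.

    `requests` is a list of (account_id, fingerprint) pairs, where the
    fingerprint is any stable request characteristic (hypothetically, a
    hash of client headers plus prompt template). Accounts operated by
    one proxy network tend to collide on such fingerprints even after
    bans force the network to rotate account IDs."""
    by_fp = defaultdict(set)
    for account, fingerprint in requests:
        by_fp[fingerprint].add(account)
    # Only fingerprints shared by multiple accounts suggest a network.
    return [accounts for accounts in by_fp.values() if len(accounts) > 1]
```

Linking accounts this way lets a defender ban the whole cluster rather than playing whack-a-mole with individual identities.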
Recognizing Distillation Patterns
A typical sign of a distillation attack is a high volume of repetitive prompts targeting a specific capability. When thousands of near-identical requests arrive in a short window, the pattern strongly suggests the traffic is not legitimate usage.
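A crude version of this signal can be sketched as counting how often a normalized prompt template repeats within one observation window. This is a toy heuristic, not a production detector; real systems would also weigh timing, account relationships, and semantic (not just textual) similarity:

```python
from collections import Counter

def flag_distillation_bursts(prompts, threshold=1000):
    """Return prompt templates that repeat at least `threshold` times
    within one observation window, with their counts."""
    def normalize(prompt):
        # Collapse case and whitespace so trivially varied prompts
        # ("Solve X", "solve  x") map to the same template.
        return " ".join(prompt.lower().split())

    counts = Counter(normalize(p) for p in prompts)
    return {template: n for template, n in counts.items() if n >= threshold}
```

In practice the threshold and window size would be tuned against baseline traffic so that legitimate repeated use (e.g., a popular app sending a fixed system prompt) is not flagged.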
Response Strategies
- Detection: Developing classifiers and systems to recognize patterns indicative of distillation attacks.
- Intelligence Sharing: Collaborating with other AI labs and relevant authorities to enhance awareness of threats.
- Access Controls: Strengthening verification processes for accounts commonly exploited by fraudsters.
- Countermeasures: Introducing safeguards to diminish the value of model outputs for distillation purposes.
Addressing these severe challenges requires comprehensive cooperation across the AI ecosystem. Initiatives from all sectors are essential for effectively combating the risks posed by distillation attacks.