DDoS

Definition

A distributed denial-of-service (DDoS) attack in this context is a coordinated saturation attack on LLM inference endpoints or RAG retrieval layers, intended to exhaust API token quotas, GPU memory, or vector database compute. The central architectural trade-off is in rate limiting: aggressive limits contain cost and protect capacity, but they raise the risk of false positives that block legitimate power users.
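The rate-limiting trade-off can be made concrete with a token-bucket limiter, sketched below in Python. This is a minimal in-memory illustration, not the implementation of any tool listed in this entry: the class names `TokenBucket` and `RateLimiter` and the parameters `capacity` and `refill_rate` are hypothetical. A larger `capacity` tolerates the bursts a legitimate power user produces, while a lower `refill_rate` caps the sustained load an attacker can generate.

```python
import time


class TokenBucket:
    """One client's bucket: `capacity` bounds bursts, `refill_rate` bounds sustained load."""

    def __init__(self, capacity, refill_rate, now=None):
        self.capacity = capacity
        self.refill_rate = refill_rate  # units restored per second
        self.level = capacity           # start full so the first burst is allowed
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, cost=1.0, now=None):
        """Refill for the elapsed time, then spend `cost` if enough is available."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        self.level = min(self.capacity, self.level + elapsed * self.refill_rate)
        self.updated_at = now
        if cost <= self.level:
            self.level -= cost
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client ID (hypothetical in-memory store)."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.buckets = {}

    def allow(self, client_id, cost=1.0, now=None):
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_rate, now)
            self.buckets[client_id] = bucket
        return bucket.allow(cost, now)


# Usage: a burst of 5 is accepted, the 6th request in the same instant is rejected.
limiter = RateLimiter(capacity=5, refill_rate=1.0)
burst = [limiter.allow("client-a", now=0.0) for _ in range(6)]
```

In a multi-instance deployment the bucket state would have to live in a shared store rather than a Python dictionary. The `now` argument exists only so the behavior can be exercised deterministically.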

Disambiguation

In AI systems, a DDoS attack manifests as 'Token Exhaustion' or 'Semantic Flooding' rather than as a traditional TCP/IP packet storm.
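Because token exhaustion is driven by how many tokens each request consumes rather than by how many requests arrive, one possible defense is to meter tokens per client instead of counting requests. The sketch below assumes a fixed time window and charges each request its worst-case cost (prompt tokens plus the requested output ceiling) up front, refunding whatever goes unused. The class `TokenQuota`, its methods, and the budget figures are illustrative assumptions, not part of any listed tool.

```python
class TokenQuota:
    """Fixed-window token budget per client (hypothetical in-memory sketch)."""

    def __init__(self, budget_per_window, window_seconds):
        self.budget = budget_per_window
        self.window_seconds = window_seconds
        self.usage = {}  # client_id -> (window index, tokens charged so far)

    def _current(self, client_id, now):
        """Return the window index and the tokens already charged in that window."""
        window = int(now // self.window_seconds)
        seen_window, used = self.usage.get(client_id, (window, 0))
        if seen_window != window:
            used = 0  # a new window started, so the budget resets
        return window, used

    def reserve(self, client_id, prompt_tokens, max_output_tokens, now):
        """Charge the worst-case cost up front; reject if it would exceed the budget."""
        window, used = self._current(client_id, now)
        cost = prompt_tokens + max_output_tokens
        if used + cost > self.budget:
            return False
        self.usage[client_id] = (window, used + cost)
        return True

    def refund(self, client_id, unused_tokens, now):
        """Return tokens that were reserved but not generated."""
        window, used = self._current(client_id, now)
        self.usage[client_id] = (window, max(0, used - unused_tokens))


# Usage: three large requests in one window; the third exceeds the 10,000-token budget.
quota = TokenQuota(budget_per_window=10_000, window_seconds=60)
first = quota.reserve("bot", prompt_tokens=500, max_output_tokens=4000, now=0)
second = quota.reserve("bot", prompt_tokens=500, max_output_tokens=4000, now=1)
third = quota.reserve("bot", prompt_tokens=500, max_output_tokens=4000, now=2)
```

Reserving the worst case is deliberately conservative: it stops a client from queuing many long generations at once, at the price of occasionally rejecting a legitimate request whose actual output would have been short.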

Visual Metaphor

"A crowd of automated bots filling up every seat in a library and checking out every book simultaneously, preventing actual researchers from accessing information."

Key Tools

Cloudflare AI Gateway
Redis (Rate Limiting)
NeMo Guardrails
Kong API Gateway
Upstash

Related Connections

Rate Limiting (Primary Mitigation Strategy)
Token Quota (Resource Constraint)
Semantic Cache (Resiliency Component)
