Vertical Scaling

Definition

The process of increasing the computational resources of a single node (specifically GPU VRAM, system RAM, and CPU cores) to accommodate high-dimensional vector embeddings, larger local LLMs, or memory-intensive indexing operations in a RAG pipeline.
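
As a rough illustration of why memory capacity drives this decision, the raw footprint of a flat, uncompressed embedding matrix can be estimated as vector count times dimension times bytes per value. The function name and the example figures below are assumptions chosen for illustration. Real indexes (graph structures, metadata, replicas) add overhead on top of this number.

```python
def embedding_memory_gib(num_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Raw size of a dense embedding matrix in GiB (float32 by default).

    Counts only the vector values themselves; index structures and
    metadata are not included.
    """
    return num_vectors * dim * bytes_per_value / 2**30


# Hypothetical workload: 10 million 1536-dimensional float32 vectors.
raw_gib = embedding_memory_gib(10_000_000, 1536)
print(f"{raw_gib:.1f} GiB")  # prints "57.2 GiB"
```

If an estimate like this exceeds the RAM of the current node, a larger node (scaling up) is one option; the alternative is to shard the index across machines.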

Disambiguation

Scaling 'Up' (more powerful hardware on a single machine) vs. Scaling 'Out' (adding more machines).

Visual Metaphor

"Upgrading a single library bookshelf to be taller and wider to hold more massive, complex encyclopedias, rather than adding more shelves to the room."

Key Tools

NVIDIA CUDA
vLLM
Pinecone (s1/p1 pod types)
Docker (Resource Limits)
Kubernetes (Vertical Pod Autoscaler)

Related Connections

Horizontal Scaling (Alternative architectural strategy)
Vector Indexing (Component heavily reliant on memory capacity)
VRAM Bottleneck (Primary constraint addressed by vertical scaling)
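
The VRAM bottleneck can be made concrete with a back-of-the-envelope check: model weights need roughly parameter count times bytes per parameter. The sketch below is a minimal estimate under stated assumptions. The helper names and the 20% headroom reserved for KV cache and activations are hypothetical placeholders, not measured values.

```python
# Approximate storage cost per parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_vram_gb(num_params: float, precision: str = "fp16") -> float:
    """VRAM needed for the model weights alone, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits_on_gpu(num_params: float, precision: str, vram_gb: float,
                headroom: float = 0.2) -> bool:
    """True if the weights fit while leaving `headroom` of VRAM free.

    The headroom fraction is an assumed allowance for KV cache,
    activations, and runtime overhead; tune it for a real workload.
    """
    return weight_vram_gb(num_params, precision) <= vram_gb * (1 - headroom)


# Hypothetical checks against a single 24 GB GPU.
print(weight_vram_gb(7e9, "fp16"))     # 14.0
print(fits_on_gpu(7e9, "fp16", 24))    # True
print(fits_on_gpu(13e9, "fp16", 24))   # False
print(fits_on_gpu(13e9, "int4", 24))   # True
```

When the check fails, the vertical-scaling options are a GPU with more VRAM or a lower-precision copy of the same model on the existing card.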
