Back to Learn
Intermediate

Vertical Scaling

The process of increasing the computational resources—specifically GPU VRAM, RAM, and CPU cores—of a single node to accommodate high-dimensional vector embeddings, larger local LLMs, or memory-intensive indexing operations in a RAG pipeline.

Definition

The process of increasing the computational resources—specifically GPU VRAM, RAM, and CPU cores—of a single node to accommodate high-dimensional vector embeddings, larger local LLMs, or memory-intensive indexing operations in a RAG pipeline.

Disambiguation

Scaling 'Up' (beefier hardware) vs Scaling 'Out' (more machines).

Visual Metaphor

"Upgrading a single library bookshelf to be taller and wider to hold more massive, complex encyclopedias, rather than adding more shelves to the room."

Conceptual Overview

The process of increasing the computational resources—specifically GPU VRAM, RAM, and CPU cores—of a single node to accommodate high-dimensional vector embeddings, larger local LLMs, or memory-intensive indexing operations in a RAG pipeline.

Disambiguation

Scaling 'Up' (beefier hardware) vs Scaling 'Out' (more machines).

Visual Analog

Upgrading a single library bookshelf to be taller and wider to hold more massive, complex encyclopedias, rather than adding more shelves to the room.

Related Articles