Definition
The process of increasing the computational resources—specifically GPU VRAM, RAM, and CPU cores—of a single node to accommodate high-dimensional vector embeddings, larger local LLMs, or memory-intensive indexing operations in a RAG pipeline.
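As a rough illustration of when a single node needs more memory, the sketch below estimates the raw footprint of a flat, uncompressed float32 embedding store and checks it against a node's RAM. The function names, the 20% headroom figure, and the example corpus size are illustrative assumptions, not part of any particular vector database; real indexes add their own overhead on top of the raw vectors.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def embedding_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw size of a dense embedding matrix (float32 by default)."""
    return num_vectors * dim * bytes_per_value


def fits_on_node(required_bytes: int, node_ram_bytes: int, headroom: float = 0.2) -> bool:
    """True if the data fits while leaving `headroom` (assumed fraction) of RAM free."""
    return required_bytes <= node_ram_bytes * (1 - headroom)


# Hypothetical corpus: 10 million chunks embedded at 1536 dimensions.
required = embedding_bytes(10_000_000, 1536)

print(f"raw embeddings: {required / GIB:.1f} GiB")
print("fits in 64 GiB node: ", fits_on_node(required, 64 * GIB))
print("fits in 128 GiB node:", fits_on_node(required, 128 * GIB))
```

Under these assumptions the vectors alone need roughly 57 GiB, so a 64 GiB node falls short once headroom is reserved, while a 128 GiB node holds them comfortably. That gap is the kind of case where scaling the single node up is an option.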
Disambiguation
Scaling up gives one machine more powerful hardware; scaling out spreads the workload across more machines.
Visual Analog
Upgrading a single library bookshelf to be taller and wider to hold more massive, complex encyclopedias, rather than adding more shelves to the room.