Definition
The process of partitioning unstructured text into smaller, semantically meaningful segments to fit within LLM context windows and optimize retrieval precision. It requires an architectural trade-off between granular retrieval (small chunks reduce noise) and semantic coherence (large chunks preserve context).
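The trade-off above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `chunk_by_sentences`, the `max_words` parameter, and the regex-based sentence splitting are all assumptions made for this example. It packs whole sentences into chunks up to a word budget, so that a small budget yields granular chunks and a large budget keeps more surrounding context together.

```python
import re


def chunk_by_sentences(text: str, max_words: int = 50) -> list[str]:
    """Pack whole sentences into chunks of at most `max_words` words.

    Hypothetical helper for illustration. Sentence boundaries are
    approximated by splitting after '.', '!' or '?' followed by whitespace.
    A single sentence longer than `max_words` is kept whole, not cut.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")

    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks: list[str] = []
    current: list[str] = []  # sentences in the chunk being built
    count = 0                # words in the chunk being built

    for sentence in sentences:
        n = len(sentence.split())
        # Close the current chunk if adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n

    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "One two three. Four five. Six seven eight nine. Ten."
small = chunk_by_sentences(sample, max_words=3)  # more, narrower chunks
large = chunk_by_sentences(sample, max_words=5)  # fewer, broader chunks
```

A smaller `max_words` produces more, narrower chunks (less noise per retrieved chunk), while a larger value keeps neighboring sentences together (more context per chunk). A word count is only a rough stand-in for a token budget; a real pipeline would likely measure size with the tokenizer of the target model.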
Disambiguation
Not memory grouping in the psychological sense, and not database sharding; refers specifically to splitting text for vectorization.
Visual Analog
Slicing a long loaf of bread into individual slices so they can fit into a toaster.