Definition
Vector representations of text extracted from the hidden states of LLaMA-based architectures. They map input strings to points in a high-dimensional space, enabling semantic similarity search within retrieval-augmented generation (RAG) pipelines. When the same model handles both embedding and generation, they offer strong contextual alignment; the trade-off is higher computational overhead and larger vector dimensions than specialized encoder-only models such as BERT.
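As a minimal sketch of how such an embedding can be extracted, the following assumes the Hugging Face `transformers` library and mean-pools the last hidden layer; the checkpoint name is a placeholder for any LLaMA-family model:

```python
# Minimal sketch: extracting a sentence embedding from a LLaMA-style decoder,
# assuming the Hugging Face `transformers` library. The checkpoint name is a
# placeholder; substitute any LLaMA-family model you have access to.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def embed(text: str) -> torch.Tensor:
    """Return one vector for `text` by mean-pooling the last hidden layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # last_hidden_state: (batch, seq_len, hidden_dim); hidden_dim is 4096
    # for the 7B LLaMA variants, far larger than BERT-base's 768.
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)
```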
Related Concepts
- Cosine Similarity (mathematical metric used to compare LLaMA embedding proximity)
- Vector Database (storage and indexing infrastructure for the resulting embeddings)
- Pooling Strategy (prerequisite method for converting token-level hidden states into a single sentence-level vector; see the sketch after this list)
- Decoder-only Architecture (the underlying model structure from which the embeddings are extracted)
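The sketch below illustrates the Pooling Strategy and Cosine Similarity items, assuming PyTorch, token-level hidden states of shape (batch, seq_len, hidden_dim), and an attention mask of shape (batch, seq_len); the function names are hypothetical:

```python
# Illustrative pooling strategies and cosine-similarity comparison, assuming
# PyTorch, token-level hidden states of shape (batch, seq_len, hidden_dim),
# and an attention mask of shape (batch, seq_len). Names are hypothetical.
import torch
import torch.nn.functional as F

def mean_pool(hidden: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Average the token vectors, ignoring padding positions."""
    mask = mask.unsqueeze(-1).type_as(hidden)      # (batch, seq_len, 1)
    summed = (hidden * mask).sum(dim=1)            # (batch, hidden_dim)
    counts = mask.sum(dim=1).clamp(min=1e-9)       # tokens per sequence
    return summed / counts

def last_token_pool(hidden: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Take the final non-padding token's vector."""
    lengths = mask.long().sum(dim=1) - 1           # index of last real token
    return hidden[torch.arange(hidden.size(0)), lengths]

def similarity(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between two batches of sentence-level vectors."""
    return F.cosine_similarity(a, b, dim=-1)
```

Mean pooling is a common default; last-token pooling is often favored for decoder-only models because, under causal attention, only the final token has attended to the entire input.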
Disambiguation
Using the model's internal latent space to represent data, rather than using the model to generate text strings.
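A hypothetical side-by-side of these two uses of one LLaMA-style model, again assuming Hugging Face `transformers` with a placeholder checkpoint:

```python
# Hypothetical contrast between the two uses of one LLaMA-style model,
# assuming Hugging Face `transformers`; the checkpoint is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
lm = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tok("What is retrieval-augmented generation?", return_tensors="pt")

# Generative use: the model emits new text tokens.
generated = lm.generate(**inputs, max_new_tokens=50)
print(tok.decode(generated[0], skip_special_tokens=True))

# Representational use: the latent space is read out as a vector instead.
with torch.no_grad():
    out = lm(**inputs, output_hidden_states=True)
embedding = out.hidden_states[-1].mean(dim=1)  # a point in space, not a string
```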
Visual Analog
A high-fidelity GPS coordinate in a multi-dimensional semantic map where 'King' and 'Queen' are physically located near each other.