Definition
A neural network layer that dynamically weights the relevance of each input token to every other token, letting an LLM or Agent focus on contextually critical information at the cost of compute that grows quadratically, O(n²), with sequence length.
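The weighting and the quadratic cost described above can be sketched as single-head scaled dot-product attention; this is a minimal NumPy illustration (the function name and toy shapes are assumptions for the example, not from the source):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal single-head attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # (n, n) score matrix -> O(n^2)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V, weights                      # weighted sum of values

rng = np.random.default_rng(0)
n, d = 5, 8                                          # 5 tokens, 8-dim embeddings
X = rng.normal(size=(n, d))
out, w = scaled_dot_product_attention(X, X, X)       # self-attention: Q = K = V = X
print(out.shape, w.shape)                            # (5, 8) (5, 5)
```

The (n, n) weight matrix is where the quadratic trade-off comes from: every token attends to every other token.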
- Self-Attention (Component)
- Transformer Architecture (Prerequisite)
- Context Window (Prerequisite)
- Cross-Attention (Component)
Disambiguation
Differential weighting of tokens via softmax-normalized dot products, not cognitive focus or user engagement.
Visual Analog
A theatrical spotlight that automatically adjusts its brightness and focus on specific actors (tokens) based on their importance to the current scene's dialogue.