Definition
Knowledge distillation is a model compression technique in which a smaller 'Student' model is trained to replicate the behavior, output distributions, or reasoning paths of a larger, more capable 'Teacher' model. In AI agents and retrieval-augmented generation (RAG) systems, this reduces inference latency and cost while aiming to retain as much of the frontier model's performance as possible.
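The training signal behind "replicating output distributions" is usually a blend of the ordinary hard-label loss and a soft-target loss that pulls the student's distribution toward the teacher's, with a temperature that smooths both. The sketch below is a minimal PyTorch illustration of that combined loss, not tied to any particular model pair; the logits, labels, temperature, and weighting are all placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend hard-label cross-entropy with a soft-target term that matches
    the student to the teacher's temperature-softened output distribution."""
    # Soften both distributions; a higher temperature spreads probability
    # mass over more classes, exposing the teacher's relative preferences.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence of the student from the teacher's softened distribution,
    # rescaled by T^2 so its gradients stay comparable to the hard-label term.
    soft_loss = F.kl_div(log_soft_student, soft_teacher,
                         reduction="batchmean") * (temperature ** 2)

    # Ordinary cross-entropy against ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Toy usage: random tensors stand in for real teacher and student outputs.
if __name__ == "__main__":
    batch, num_classes = 4, 10
    student_logits = torch.randn(batch, num_classes, requires_grad=True)
    teacher_logits = torch.randn(batch, num_classes)
    labels = torch.randint(0, num_classes, (batch,))
    loss = distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()
    print(f"combined distillation loss: {loss.item():.4f}")
```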
Related Concepts
- Teacher-Student Architecture (Prerequisite)
- Synthetic Data Generation (Component; see the sketch after this list)
- Quantization (Alternative Optimization Technique)
- Logit Matching (Component)
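In LLM, agent, and RAG settings, distillation often works on text rather than logits: the teacher generates completions (or reasoning traces) for a pool of prompts, and the student is fine-tuned on that synthetic dataset. The sketch below only builds the fine-tuning file; `query_teacher`, the prompt pool, and the output path are hypothetical placeholders, not a specific provider's API.

```python
import json
from pathlib import Path

def query_teacher(prompt: str) -> str:
    """Hypothetical stand-in for the teacher model. A real pipeline would call
    a frontier LLM here; the canned reply just keeps the sketch runnable."""
    return f"[teacher answer for: {prompt}]"

def build_distillation_dataset(prompts: list[str], out_path: str) -> None:
    """Collect teacher completions and write prompt/completion pairs as JSONL,
    ready for supervised fine-tuning of the student."""
    with Path(out_path).open("w", encoding="utf-8") as f:
        for prompt in prompts:
            record = {"prompt": prompt, "completion": query_teacher(prompt)}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

if __name__ == "__main__":
    # Assumed prompt pool; in practice these would be sampled from real agent
    # or RAG traffic so the student sees the distribution it must serve.
    seed_prompts = [
        "Summarize the retrieved passages about knowledge distillation.",
        "Plan the tool calls needed to answer: 'What changed in release 2.1?'",
    ]
    build_distillation_dataset(seed_prompts, "teacher_traces.jsonl")
```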
Disambiguation
Refers to transferring model intelligence between neural networks, not to purifying datasets or to the chemical process of distillation.
Visual Analog
A master chef creating a simplified, high-speed 'cheat sheet' that lets a line cook replicate a signature dish without needing the chef's decades of experience.