Definition
Conceptual Overview
A high-precision second-stage ranking technique that scores relevance by passing a query and a candidate document through the model together, so the two can interact during encoding. This captures nuanced semantic interactions and significantly improves ranking accuracy, at the cost of much higher latency than vector-based Bi-Encoders, whose document vectors can be computed ahead of time. For that reason it is applied only to the short list of candidates returned by a faster first stage.
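The two-stage flow described above can be sketched in a few lines. This is a minimal illustration, not a real model: `rerank`, `overlap_score`, and `ordered_pair_score` are hypothetical names, and the two scorers are toy stand-ins (word overlap for the cheap first stage, in-order bigram matching for the costly pair scorer). The point is only the shape of the pipeline: a cheap score over every candidate, then an expensive query-plus-document score over the survivors.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    candidates: Sequence[str],
    first_stage_score: Callable[[str, str], float],
    pair_score: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Shortlist with a cheap scorer, then re-order the shortlist with a costly one."""
    # Stage 1: cheap scoring over every candidate (stand-in for vector search).
    shortlist = sorted(
        candidates, key=lambda doc: first_stage_score(query, doc), reverse=True
    )[:top_k]
    # Stage 2: the expensive scorer sees query and document together,
    # but only for the top_k survivors.
    scored = [(doc, pair_score(query, doc)) for doc in shortlist]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy stand-ins (assumptions for illustration, NOT real models).
def overlap_score(query: str, doc: str) -> float:
    """Cheap: number of shared lowercase words, ignoring word order."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def ordered_pair_score(query: str, doc: str) -> float:
    """Costlier toy: number of query bigrams that occur in the document in order."""
    q = query.lower().split()
    d = doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    return float(sum(1 for bigram in zip(q, q[1:]) if bigram in doc_bigrams))


docs = [
    "bank river erosion study",
    "river bank erosion causes flooding",
    "erosion of trust in a bank",
    "cooking with cast iron",
]
results = rerank("river bank erosion", docs, overlap_score, ordered_pair_score, top_k=3)
for doc, score in results:
    print(f"{score:.1f}  {doc}")
```

In this toy run the first stage cannot separate the first two documents, because both contain all three query words. The pair scorer, which looks at the query and document together, moves the document with the matching word order to the top. The unrelated document never reaches the second stage, which is where the latency saving comes from.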
Disambiguation
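Not to be confused with 'late interaction', where the query and document are encoded separately and meet only at the end, in a final scoring step. The technique defined here uses 'joint encoding', where query and document tokens interact in every transformer layer.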
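To make the contrast concrete, here is a minimal sketch of a late-interaction score in the MaxSim style: each query token vector is matched to its best document token vector, and the maxima are summed. The function name `late_interaction_score` and the 2-dimensional vectors are made up for illustration; real systems use learned, higher-dimensional embeddings.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def late_interaction_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """For each query token, take its best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Hypothetical token vectors. The document vectors do not depend on the query,
# so they could be computed once and stored ahead of time.
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.25]]

score = late_interaction_score(query_vecs, doc_vecs)
print(score)
```

The query and document only meet inside `late_interaction_score`, after both have already been encoded. Joint encoding has no equivalent of the precomputed `doc_vecs`: the model receives the query and document as one input, so every candidate requires a fresh forward pass.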
Visual Analog
A jeweler examining individual gemstones with a loupe after a mechanical sorter has already filtered out the obvious gravel.