Definition
Natural Questions (NQ) is a large-scale benchmark dataset of real, anonymized Google Search queries, each paired with a Wikipedia page annotated for answers. It is used to train and evaluate Retrieval-Augmented Generation (RAG) and Open-Domain Question Answering (ODQA) systems. It tests whether an end-to-end pipeline can find the relevant Wikipedia passages and extract a precise answer from them. The trade-off is that its queries are realistic, but its knowledge source is a static Wikipedia snapshot, so answers to time-sensitive questions can go stale.
- Open-Domain Question Answering (ODQA) (Primary Task Context)
- Exact Match (EM) (Evaluation Metric)
- Dense Passage Retrieval (DPR) (Prerequisite Model Architecture)
- Recall@k (Retrieval Performance Metric)
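The two metrics listed above can be sketched in a few lines of Python. This is a minimal illustration, not the official NQ evaluation script. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles) for Exact Match, and an "answer string appears in a top-k passage" hit criterion for Recall@k. The function names and the toy passages are hypothetical.

```python
from __future__ import annotations

import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def recall_at_k(ranked_passages: list[str], gold_answers: list[str], k: int) -> bool:
    """True if any of the top-k passages contains a gold answer string.

    Assumption: a substring match on normalized text counts as a hit.
    """
    golds = [g for g in (normalize_answer(a) for a in gold_answers) if g]
    top_k = [normalize_answer(p) for p in ranked_passages[:k]]
    return any(g in p for p in top_k for g in golds)


# Toy data, invented for illustration only (not taken from NQ).
gold = ["Charles Darwin"]
ranked = [
    "The Galapagos Islands are an archipelago in the Pacific Ocean.",
    "On the Origin of Species was published by Charles Darwin in 1859.",
]

em_full = exact_match("Charles Darwin.", gold)   # punctuation is ignored
em_partial = exact_match("Darwin", gold)         # partial answers do not count
hit_at_1 = recall_at_k(ranked, gold, k=1)        # answer is not in the top passage
hit_at_2 = recall_at_k(ranked, gold, k=2)        # answer appears in the second passage
```

Averaging these per-question booleans over an evaluation set gives the EM and Recall@k scores. Reporting both separates retriever failures (low Recall@k) from reader failures (low EM even when the answer passage was retrieved).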
Disambiguation
Not to be confused with SQuAD. NQ uses information-seeking queries, asked by users who had not seen the answer beforehand, whereas SQuAD questions were written by annotators while reading the passage that contains the answer.
Visual Analog
The 'Mystery Shopper' test for AI: A collection of real, unscripted questions from the public used to verify if the library's retrieval system and the librarian's comprehension actually work under pressure.