SmartFAQs.ai

Glossary

Definitions and synonyms for key RAG concepts, organized by category.

Advanced Concepts

Understanding cause-effect

Finding relevant code snippets

"What-if" analysis

Transferring knowledge to smaller models

Combining multiple models

Merging embeddings/outputs

Learning from rankings

Virtual environment testing

Non-text data retrieval

Creating training examples

Using tests as metrics

Advanced Embedding Techniques

Vectors capturing surrounding context

Unified space for multiple languages

Full-dimensional continuous vectors

Reducing vector dimensions while preserving relationships

Models trained on specialized corpora (medical, legal)

Number of dimensions in vector (e.g., 768, 1536)

Models fine-tuned for specific retrieval tasks

Multiple representations per document (title, content, etc.)

Cross-language vector representations

Compressing vectors by 97% through subvector coding

Reducing precision of vectors (4-bit, 8-bit)

**Sparse Vectors** - High-dimensional vectors with mostly zeros

SPLADE

SParse Lexical AnD Expansion embedding combining sparse and dense approaches

Techniques to reduce storage requirements
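
As an illustration of the compression entries above, here is a minimal Python sketch of 8-bit scalar quantization with a per-vector scale. The function names and random toy vectors are illustrative, not from any particular library.

```python
import numpy as np

def quantize_int8(vectors: np.ndarray):
    """Scalar-quantize float32 vectors to int8, returning the codes plus the
    scale needed to approximately reconstruct the originals."""
    # One symmetric scale per vector keeps the sketch short; per-dimension
    # scales are also common in practice.
    scale = np.abs(vectors).max(axis=1, keepdims=True) / 127.0
    codes = np.round(vectors / scale).astype(np.int8)
    return codes, scale

def dequantize_int8(codes: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Reconstruct approximate float vectors from the int8 codes."""
    return codes.astype(np.float32) * scale

vectors = np.random.randn(4, 768).astype(np.float32)    # toy 768-dim embeddings
codes, scale = quantize_int8(vectors)
approx = dequantize_int8(codes, scale)
print("storage ratio:", codes.nbytes / vectors.nbytes)   # ~0.25 (8-bit vs 32-bit)
print("max reconstruction error:", np.abs(vectors - approx).max())
```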

Advanced Retrieval & Learning

Lifelong learning systems

Multi-language adaptation

Adapting to specific domains

Learning from minimal examples

Learning from new data

Tracking information relevance over time

Updating stale information

Learning to learn quickly

Real-time model updates

Knowledge transfer across domains

No task-specific examples

Advanced Retrieval Methods

Contextualized Late Interaction over BERT

ColBERT applied to multimodal (vision) content

Broadening query scope

Neural dense retrieval approach

LLM-generated synthetic documents

Refining retrieval in steps

Retrieving across multiple documents

Multiple reformulations of single query

Enriching query with synonyms and related terms

Rewriting queries for better matching

Transforming queries to improve retrieval

Breaking complex queries into parts

Adding alternative terms

Architectures & Models

BERT

Bidirectional Encoder Representations from Transformers

BM25

Best Matching 25

DPR

Dense Passage Retrieval

HNSW

Hierarchical Navigable Small World

HyDE

Hypothetical Document Embeddings

IVF

Inverted File

PQ

Product Quantization

RAFT

Retrieval-Augmented Fine-Tuning

SPLADE

SParse Lexical AnD Expansion

TF-IDF

Term Frequency-Inverse Document Frequency

Benchmarks & Datasets

BEIR

Diverse information retrieval benchmark

Late interaction evaluation

Multi-hop question answering

MS-MARCO

Large-scale IR benchmark

Long-form QA benchmark

Biomedical question answering

Large-scale QA dataset

Cross-lingual QA benchmark

Compliance & Ethics

Systematic discrimination

Identifying unfair patterns

CCPA

California Consumer Privacy Act

Understanding model decisions

Measuring equality

FINRA

Financial regulatory compliance

GDPR

General Data Protection Regulation

HIPAA

Health Insurance Portability and Accountability Act

Ethical AI principles

SOX

Sarbanes-Oxley compliance

Clear system explanation

Context & Token Management

Dividing content for token limits

Maintaining important information

Cutting text to fit limits

Context supporting response claims

Tokens in query and context

Tokens in generated response

Reusing previously computed prompts

Fixed-size moving context window

Condensing text to save tokens

Allocated tokens for retrieval and generation

Maximizing value per token

Maximum tokens LLM can process
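
A minimal sketch of a token-budget allocator that packs retrieved chunks under a context limit. Token counts are approximated by whitespace splitting, and the function name and numbers are illustrative; a real system would use the model's tokenizer.

```python
def fit_context(chunks, max_context_tokens=3000, reserved_for_answer=500):
    """Greedily pack retrieved chunks into a token budget, truncating the
    last chunk if needed to respect the context window."""
    budget = max_context_tokens - reserved_for_answer
    selected, used = [], 0
    for chunk in chunks:
        tokens = chunk.split()                 # crude token approximation
        if used + len(tokens) <= budget:
            selected.append(chunk)
            used += len(tokens)
        else:
            remaining = budget - used
            if remaining > 0:
                selected.append(" ".join(tokens[:remaining]))  # truncate to fit
            break
    return selected

docs = ["alpha " * 80, "beta " * 80]            # two ~80-token chunks
packed = fit_context(docs, max_context_tokens=120, reserved_for_answer=20)
print([len(c.split()) for c in packed])          # e.g. [80, 20]
```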

Core RAG

ANN

Approximate Nearest Neighbor

IR

Information Retrieval

LLM

Large Language Model

NLP

Natural Language Processing

QA

Question Answering

RAG

Retrieval-Augmented Generation

Data & Context Management

Storing context externally

Previous interactions for context

Retrieval from multiple systems

Querying across distributed knowledge sources

Persistent user preferences and patterns

Session and long-term memory integration

Dynamic knowledge base updates

Curating important memories

Short-term conversation state

Real-time document and response streaming

Data & Privacy

RBAC and ABAC

Tracking system actions

Recording user permissions

Removing identifying information

Tracking data origin and transformations

Source and history tracking

Data storage encryption

Network data encryption

Identifying personally identifiable information

Removing sensitive information

Replacing sensitive data

Data Structures

Sorted tree structure

Connected nodes and edges

Key-value storage

Priority queue structure

Semantic network

Graph with attributes

First-in-first-out

Probabilistic balanced structure

Prefix tree for strings

Database Features

Transaction guarantees

Conditional document selection

Redundancy and failover

Storing non-vector data

Isolated data per tenant

Data partition within database

Copying data across nodes

Handling growing data

Distributing data across partitions

Access control per tenant

Document & Data Management

Redundant content between consecutive chunks

Number of tokens or characters per segment

Breaking documents into manageable pieces for embedding

Complete collection of documents in knowledge base

Breaking text into logical units

Repository for original/source documents

Tracking changes and updates to source material

Character or token-based uniform segmentation

Adding contextual information (tags, dates, source)

Content-aware splitting based on meaning

**Agentic Chunking** - LLM-assisted intelligent document splitting

Pulling readable content from various formats

Dividing content while preserving context
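
A minimal sketch of fixed-size chunking with overlap, counting whitespace tokens for simplicity. Names and sizes are illustrative; semantic or agentic chunkers refine the same idea with content-aware boundaries.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size chunks that share `overlap` tokens with the
    previous chunk, so context is preserved across chunk boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    tokens = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break
    return chunks

doc = "word " * 1000
pieces = chunk_text(doc, chunk_size=200, overlap=40)
print(len(pieces), len(pieces[0].split()))   # 6 chunks, 200 tokens in the first
```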

Document Processing

Web scraping

Document parsing and chunking

Vision-language model (VLM) OCR

Open-source vision-language model (VLM) OCR

Vision-language model (VLM) OCR

Open-source vision-language model (VLM) OCR

PDF content extraction

PDF text extraction

Browser automation

OCR text recognition

Document parsing and chunking

Domain Applications

API and documentation retrieval

FAQ and ticket automation

Infrastructure documentation

Organizational information system

Market data and compliance

Fact-checking and verification

Contract and precedent analysis

**Clinical RAG** - Patient records and guidelines

Property and regulation information

Academic literature integration

Embedding Fundamentals

Angle-based similarity metric between vectors

Similarity calculation for normalized vectors

Neural network converting text to vectors

Multi-dimensional space where vectors are positioned

Numerical vector representations of text capturing semantic meaning

Straight-line distance between vectors

Contextual understanding of text beyond keywords

Finding documents by meaning rather than keywords

Scaling vectors to unit length

Numeric array encoding semantic content
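
A minimal sketch of semantic search over toy embeddings using vector normalization and cosine similarity. The 4-dimensional vectors stand in for real embedding-model output.

```python
import numpy as np

def cosine_similarity(query: np.ndarray, docs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and a matrix of document
    vectors: scale everything to unit length, then take dot products."""
    query = query / np.linalg.norm(query)
    docs = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    return docs @ query

# Toy 4-dimensional "embeddings"; a real system would call an embedding model.
doc_vectors = np.array([[0.9, 0.1, 0.0, 0.0],
                        [0.0, 0.8, 0.2, 0.0],
                        [0.1, 0.0, 0.9, 0.1]])
query_vector = np.array([1.0, 0.1, 0.0, 0.0])

scores = cosine_similarity(query_vector, doc_vectors)
print(scores.argsort()[::-1])   # document indices ranked by semantic similarity
```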

Embedding Models

Open-source high-performance embeddings

CLIP

Vision-language embedding model

Commercial multilingual embedding model

Embeddings trained on large-scale datasets

Embeddings from Meta's LLaMA models

Open-source efficient embeddings

3,072-dimensional embedding model

1,536-dimensional embedding model

Framework for semantic textual similarity

Embeddings from Voyage AI models

Error Analysis

Incorrect source citation

New documents not retrievable

Confident false information

Retrieved docs not supporting query

Fabricated information not in context

Contradictory information

Adversarial attack via input

Relevant docs ranked too low

Missing relevant documents

Meaning divergence between query and context

Evaluation Tools & Frameworks

Agent behavior tracking

AI observability platform

LLM evaluation with code

Production monitoring and evaluation

LangChain evaluation platform

Prompt testing and comparison

Data validation for outputs

RAGAS

RAG evaluation framework

Foundational Terms

Query combined with retrieved context before generation

Maximum amount of text an LLM can process (tokens)

Information sources outside the LLM's training data

LLM component that synthesizes responses from retrieved context

Anchoring generated responses in retrieved facts to reduce hallucinations

Process of preparing and storing documents for retrieval

Structured or unstructured collection of documents and data

System component responsible for fetching relevant documents

Technique combining information retrieval with generative AI for grounded responses
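
A minimal retrieve-then-generate sketch tying these terms together (knowledge base, retriever, augmented prompt, generator). Here `embed` and `generate` are hypothetical placeholders for an embedding model and an LLM call.

```python
import numpy as np

KNOWLEDGE_BASE = [
    "RAG combines retrieval with generation.",
    "Embeddings map text to vectors.",
    "BM25 is a lexical ranking function.",
]

def embed(text: str) -> np.ndarray:
    # Placeholder: hash characters into a small vector. A real system would
    # use a trained embedding model instead.
    vec = np.zeros(64)
    for i, ch in enumerate(text.lower()):
        vec[(i + ord(ch)) % 64] += 1.0
    return vec / (np.linalg.norm(vec) or 1.0)

def retrieve(query: str, k: int = 2) -> list[str]:
    # Retriever: rank knowledge-base documents by cosine similarity.
    q = embed(query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda d: -float(embed(d) @ q))
    return ranked[:k]

def generate(prompt: str) -> str:
    # Placeholder for the generator (an LLM call in a real pipeline).
    return f"[LLM answer grounded in a prompt of {len(prompt)} characters]"

query = "What does RAG do?"
context = retrieve(query)
augmented_prompt = ("Answer using only this context:\n"
                    + "\n".join(context)
                    + f"\n\nQuestion: {query}")
print(generate(augmented_prompt))
```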

Generation & Response Metrics

How well response addresses question

Semantic similarity to expected answer

N-gram overlap with reference

Correctness of source attribution

Logical flow and readability

Addressing all query aspects

Predictions matching reference exactly

Accuracy of claims in response

Response grounded only in retrieved context

Recall-oriented understudy evaluation

Meaning-based comparison

Infrastructure & Deployment

Request routing and management

Content delivery network

AWS, Azure, GCP

Reusing database connections

Docker containers

Multi-machine setup

Placing systems closer to users

Adding more servers

Container orchestration

Distributing requests

Independent service components

On-premise deployment

Using more powerful servers

Intelligent RAG Patterns

Iterative query refinement

Dynamic strategy selection based on query type

Autonomous agent-driven retrieval decisions

Multiple retrieval paths in single query

Post-generation error checking

Retaining interaction history

Rapid adaptation with few examples

Coordination between specialized agents

Self-reflective improvement mechanisms

Internal critique and iteration

Self-assessment of relevance

Model self-evaluates and critiques own outputs

RAG using external tools/APIs

Long-Context Handling

Extended text handling

100K+ token support

Token allocation strategy

Updating high-value information

Preferring recent documents

Importance weighting

Choosing what to retain

Time-aware retrieval

Machine Learning Concepts

Focusing on relevant parts

BERT

Bidirectional Encoder Representations

Two-part model architecture

Adapting pre-trained models

GPT

Generative Pre-trained Transformer

Deep learning models

Learning useful features

Elements attending to each other

Using knowledge from one task for another

Attention-based models

Metrics

AUC

Area Under Curve

BLEU

BiLingual Evaluation Understudy

EM

Exact Match

F1

F1 Score

MAP

Mean Average Precision

MRR

Mean Reciprocal Rank

NDCG

Normalized Discounted Cumulative Gain

ROC

Receiver Operating Characteristic

ROUGE

Recall-Oriented Understudy for Gisting Evaluation

Monitoring & Observability

Infrastructure monitoring

LLM error analysis

Metrics visualization

Token usage and cost tracking

Application performance

Observability and evaluation

Metrics collection

Multimodal RAG

Vector representation of audio

Speech-to-text

Text-to-image, image-to-text

Scanned PDFs and photos

Vector representations of images

Finding similar images

Handling text, images, video, audio

Single space for multiple modalities

Extracting frames and audio

Model for image embeddings

Optimization Techniques

Grouping queries for efficiency

Storing frequent results

Minimizing expenses

Terminating search early

Improving retrieval speed

Speeding up responses

Distributed caching

Using cheaper models first

Reusing computed contexts

Optimizing query execution

In-memory cache layer

More queries per second

Reducing token usage

Orchestration & Framework Libraries

Multi-agent conversation framework

Agent team orchestration

Declarative LLM programming

End-to-end RAG framework

Comprehensive LLM orchestration framework

Stateful workflow graphs

Document indexing and retrieval framework

Lightweight agent coordination

.NET LLM integration

Performance Concepts

Data transfer rate

Complexity classification

Performance limiting factor

Model prediction time

Response time

Storage requirement

Performance measurement

Algorithm memory usage

Operations per unit time

Algorithm speed analysis

Personalization & Memory

Monitoring user interactions

Segmenting by user attributes

Recent interaction memory

Extreme user customization

Organizing remembered information

System for custom ranking

Long-term knowledge storage

Current conversation context

Learning user preferences

Custom vector representations

Platforms & Tools

AWS

Amazon Web Services

Microsoft Azure

CLI

Command Line Interface

FOSS

Free and Open-Source Software

GCP

Google Cloud Platform

GUI

Graphical User Interface

JSON

JavaScript Object Notation

REST

Representational State Transfer

SDK

Software Development Kit

Prompting Techniques

A/B Testing

Comparing prompt variants

Step-by-step reasoning prompts

Including retrieved documents in prompt

Runtime prompt modification

Including examples in prompt

Crafting clear task instructions

Enhancing prompt with context

Testing prompt effectiveness

Improving prompt quality

Reusable prompt structure

Managing prompt variations

Base instructions for LLM behavior

Query or request from user
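
A minimal sketch of a context-injection prompt template with one few-shot example. The template wording, variable names, and numbering scheme are illustrative, not a standard.

```python
# Template for injecting retrieved context into the prompt, with a system-style
# instruction and a single few-shot example.
PROMPT_TEMPLATE = """You are a helpful assistant. Answer strictly from the provided context.
If the context does not contain the answer, say "I don't know."

Example:
Context: The warranty period is 12 months.
Question: How long is the warranty?
Answer: 12 months.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    # Number each retrieved chunk so the answer (and any citations) can refer to it.
    context = "\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(retrieved_chunks))
    return PROMPT_TEMPLATE.format(context=context, question=question)

print(build_prompt("What is the return policy?",
                   ["Returns are accepted within 30 days.",
                    "Refunds take 5 business days."]))
```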

RAG Variants & Techniques

RAG with enhanced retrieval techniques

Fundamental RAG pattern

Separate, independently upgradeable components

Integrated single system

Basic retrieve-then-generate pipeline

Classical single-stage retrieval approach

Ranking & Re-ranking

Model that scores query-document pairs jointly

Using BERT-like models to score pairs

Separate encoders for query and document

ML models for optimal ranking

NDCG@K

Normalized Discounted Cumulative Gain ranking metric

Fraction of top-k results that are relevant

Algorithms determining result order

Reordering retrieved results by relevance

Fraction of all relevant docs in top-k

Assigning confidence to document relevance

Minimum score for document inclusion

Returning k highest-ranked documents
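
A minimal re-ranking sketch that scores query-document pairs, applies a score threshold, and returns the top-k. The keyword-overlap scorer is a stand-in for a real cross-encoder model; all names and data here are illustrative.

```python
def rerank(query: str, candidates: list[str], score_fn, top_k: int = 3,
           threshold: float = 0.0) -> list[tuple[str, float]]:
    """Re-rank first-stage candidates: score each (query, document) pair,
    drop anything under the threshold, and keep the top-k by score."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    kept = [(doc, s) for doc, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:top_k]

def keyword_overlap(query: str, doc: str) -> float:
    # Placeholder pairwise scorer: fraction of query terms present in the
    # document. A production system would use a cross-encoder here.
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)

docs = ["vector databases store embeddings",
        "bananas are yellow",
        "embeddings enable semantic search"]
print(rerank("how do embeddings work", docs, keyword_overlap, top_k=2, threshold=0.1))
```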

Ranking Algorithms

BM25

Probabilistic ranking function (builds on TF-IDF)

k1 (term saturation), b (length norm)

Adjusting for document length

Standard BM25 implementation

Probability-based scoring

Ordering by relevance

How often term appears

Preventing TF dominance

TF-IDF

Term Frequency-Inverse Document Frequency
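
A minimal BM25 scoring sketch showing how k1 (term-frequency saturation) and b (document-length normalization) enter the formula. The tiny corpus and function name are illustrative.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query with BM25."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N          # average document length
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)     # document frequency
        if df == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
        numerator = tf[term] * (k1 + 1)
        denominator = tf[term] + k1 * (1 - b + b * len(doc_terms) / avgdl)
        score += idf * numerator / denominator
    return score

corpus = [doc.split() for doc in [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "the quick brown fox",
]]
query = "cat mat".split()
print([bm25_score(query, doc, corpus) for doc in corpus])  # first doc scores highest
```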

Retrieval Metrics

Are top results ranked in order of relevance?

Does context contain info needed for answer?

Harmonic mean of precision and recall

F1@K

F1 score at top-k results

Average precision across queries

Average rank of first relevant result

MRR@K

MRR considering only top-k results

NDCG@K

NDCG computed over the top-k results

Fraction of retrieved results that are relevant

Fraction of all relevant documents retrieved

Numerical measure of document relevance

Embedding-based relevance measure
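
A minimal sketch of Precision@K, Recall@K, and reciprocal rank computed over a toy ranking and ground-truth relevance set; the document IDs are illustrative.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    top_k = retrieved[:k]
    return sum(1 for doc in top_k if doc in relevant) / k

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant documents that appear in the top-k."""
    top_k = retrieved[:k]
    return sum(1 for doc in relevant if doc in top_k) / len(relevant)

def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result (0 if none is retrieved)."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

retrieved = ["d3", "d7", "d1", "d9"]    # system ranking
relevant = {"d1", "d2"}                  # ground-truth relevant set
print(precision_at_k(retrieved, relevant, k=3))   # 1/3
print(recall_at_k(retrieved, relevant, k=3))      # 1/2
print(reciprocal_rank(retrieved, relevant))       # 1/3 (first hit at rank 3)
```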

Search Techniques

Searching across all text fields

Retrieving at different granularity levels

Combining keyword and semantic search

Traditional text matching using terms

ColBERT-style token-level interactions

Sequential retrieval steps with refinement

Search using deep learning models

Scoring document relevance to query

Combining ranking lists from multiple retrievers

Meaning-based search (vs. keyword)

Combining sparse (BM25) and dense (embeddings) methods
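
A minimal Reciprocal Rank Fusion sketch for hybrid search, merging a keyword (e.g., BM25) ranking with a semantic ranking; the document IDs are illustrative and k=60 is the commonly cited default constant.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists from multiple retrievers: each document accumulates
    1 / (k + rank) from every list, then documents are sorted by total score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_results = ["d1", "d4", "d2"]    # e.g., BM25 ranking
semantic_results = ["d2", "d1", "d5"]   # e.g., dense-embedding ranking
print(reciprocal_rank_fusion([keyword_results, semantic_results]))
# d1 and d2 rise to the top because both retrievers rank them highly
```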

Security Threats

Crafted attack inputs

Corrupting training/knowledge bases

Denial-of-service attacks

Circumventing safety guardrails

Stealing model knowledge

Malicious input attacks

Controlling request volume

Selection & Filtering

Filtering by field values

Initial broad retrieval stage

Removing duplicate or near-duplicate results

Connecting mentions to knowledge base entities

Multi-dimensional filtering

Selecting documents by attributes

Identifying entities in text

Modifying queries for better results

Combining results from multiple sources

Specialized Retrieval Approaches

Understanding cause-effect relationships

Identifying clusters in knowledge graphs

Synthesizing info across multiple sources

Graph of connected entities

**Knowledge-Graph-Aware Retrieval** - Using entity relationships

Incorporating structured knowledge

Multi-step logical inference

Following relationships for context

Retrieving from tables, databases, knowledge graphs

Storage Technologies

Index for dense vectors

Vectors across servers

Network structure indexing

Combined sparse and dense

Vectors stored in RAM

Mapping terms to documents

Vectors on disk

Index for sparse vectors

Hierarchical indexing

Optimized structure for vector storage

System-Level Metrics

Uptime and reliability

Token usage and infrastructure costs

Time to generate response

Storage and RAM requirements

Time from query to results

Performance as system grows

Queries processed per unit time

Number of tokens consumed

Techniques & Patterns

API

Application Programming Interface

CRAG

Corrective RAG

ETL

Extract Transform Load

MCP

Model Context Protocol

NER

Named Entity Recognition

OCR

Optical Character Recognition

RRF

Reciprocal Rank Fusion

Self-RAG

Self-Reflective RAG

VQA

Visual Question Answering

Text Processing

Standardizing letter case

Text standardization

Identifying text language

Converting to base form

Reducing words to root form

Removing common words

Breaking text into tokens

Cleaning spacing
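
A minimal text-preprocessing sketch combining case normalization, whitespace cleanup, tokenization, and stop-word removal. The stop-word list is deliberately tiny and illustrative; real pipelines use larger, language-specific lists (or skip stop-word removal for dense retrieval).

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}

def preprocess(text: str) -> list[str]:
    """Lowercase, normalize whitespace, tokenize on word characters,
    and drop stop words."""
    text = text.lower()
    text = re.sub(r"\s+", " ", text).strip()    # whitespace normalization
    tokens = re.findall(r"[a-z0-9]+", text)     # simple word tokenization
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("  The Quick   Brown Fox jumps over the lazy dog. "))
# ['quick', 'brown', 'fox', 'jumps', 'over', 'lazy', 'dog']
```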

Use Case Specific

Conversational interface

Creating source attributions

Finding relevant documents

Verifying claims

Question answering

Suggesting content

Meaning-based search

Condensing document content

Vector Database Platforms

Lightweight open-source embedded database

Full-text search with vector support

FAISS

Facebook's high-performance similarity search library

Modern vector database with multi-modal support

Enterprise open-source vector database for massive scale

Vector capabilities in MongoDB

Managed vector database with hybrid search

Vector extension for PostgreSQL

Rust-based high-performance vector database

PostgreSQL with vector support

Specialized database optimized for storing and querying embeddings

Alternative term for vector database

Open-source vector database with GraphQL API

Vector Indexing & Search Algorithms

Fast inexact similarity search

Search over dense embeddings

HNSW

Hierarchical Navigable Small World graph indexing

Data structure for efficient keyword retrieval

IVF

Inverted File indexing for clustering

Finding k most similar items

Locating closest points in vector space

Finding items similar to query

Similarity-based retrieval using embeddings

Weak-AND for efficient pruning
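
A minimal exact k-nearest-neighbor search sketch over random toy vectors. ANN indexes such as HNSW or IVF approximate this result much faster at scale; the brute-force version below is the reference behaviour, and all names and data are illustrative.

```python
import numpy as np

def knn_search(query: np.ndarray, index: np.ndarray, k: int = 5):
    """Exact k-nearest-neighbor search by Euclidean distance: compute the
    distance from the query to every indexed vector, then take the k smallest."""
    distances = np.linalg.norm(index - query, axis=1)
    nearest = np.argsort(distances)[:k]
    return nearest, distances[nearest]

rng = np.random.default_rng(0)
index_vectors = rng.normal(size=(10_000, 128)).astype(np.float32)   # toy index
query_vector = rng.normal(size=128).astype(np.float32)

ids, dists = knn_search(query_vector, index_vectors, k=3)
print(ids, dists)
```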

Vector Similarity Metrics

Angle-based metric

Angle between vectors

Inner product of vectors

Straight-line distance

Bit-level differences

Set overlap measure

Euclidean distance norm

Grid-based distance
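
A minimal sketch implementing the similarity and distance metrics above with NumPy; the sample vectors are illustrative.

```python
import numpy as np

def cosine(a, b):       # angle-based similarity, in [-1, 1]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def dot_product(a, b):  # unnormalized inner product
    return float(np.dot(a, b))

def euclidean(a, b):    # straight-line (L2) distance
    return float(np.linalg.norm(a - b))

def manhattan(a, b):    # grid-based (L1) distance
    return float(np.abs(a - b).sum())

def hamming(a, b):      # bit-level differences between binary vectors
    return int(np.count_nonzero(a != b))

def jaccard(a, b):      # set-overlap measure between binary vectors
    a, b = a.astype(bool), b.astype(bool)
    return float((a & b).sum() / (a | b).sum())

u = np.array([1.0, 2.0, 3.0])
v = np.array([2.0, 1.0, 3.0])
print(cosine(u, v), dot_product(u, v), euclidean(u, v), manhattan(u, v))

bits_a = np.array([1, 0, 1, 1, 0])
bits_b = np.array([1, 1, 1, 0, 0])
print(hamming(bits_a, bits_b), jaccard(bits_a, bits_b))
```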

Workflow & Automation Platforms

Data pipeline orchestration

Rapid RAG development

SaaS integration platform

Visual workflow automation

Workflow orchestration

Low-code automation