What is a vector database in one sentence?

A vector database stores numeric embeddings and, given a query vector, quickly returns the stored items whose vectors are most similar — the backbone of semantic search and retrieval-augmented generation.

How is it different from a normal database?

A relational database answers exact-match and range queries on structured columns. A vector database answers "what is most similar to this?" over high-dimensional vectors, using similarity measures such as cosine similarity or dot product.

What are some vector databases?

Dedicated stores include Pinecone, Weaviate, Qdrant, Milvus, and Chroma. Postgres with pgvector and Elasticsearch/OpenSearch also offer vector search, often good enough to avoid a separate system.

How does approximate nearest-neighbor search work?

Approximate nearest-neighbor search trades a small amount of accuracy for huge speed gains. Algorithms like HNSW build a navigable graph so you avoid comparing the query against every stored vector.

Do I need a vector database for RAG?

You need vector search, not necessarily a dedicated product. For small corpora, pgvector or an in-memory index is fine. Dedicated databases earn their keep at scale, with filtering, and with high query volume.


What is a Vector Database?

A vector database is a system built to store high-dimensional vectors — the numeric fingerprints that machine-learning models produce for text, images, and audio — and to answer one specific question fast: given this query vector, which stored vectors are most similar? It is the engine behind semantic search, recommendations, and retrieval-augmented generation.

The core idea

A machine-learning model can turn a sentence, an image, or a product description into a list of a few hundred to a few thousand numbers: an embedding that captures meaning. Items with similar meanings end up close together in that high-dimensional space. A vector database exists to exploit that geometry: store millions of these vectors, then answer "find the closest ones to this query vector" in milliseconds. Closeness is measured with cosine similarity, dot product, or Euclidean distance.
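To make the three measures concrete, here is a minimal sketch in plain Python. The vectors are made-up three-dimensional toys (real embeddings have hundreds or thousands of dimensions), and the function names are illustrative rather than taken from any particular library.

```python
import math

def dot_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product of the length-normalized vectors, in the range [-1, 1]."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy vectors with made-up values, for illustration only.
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]  # points the same way as query, twice as long
orthogonal = [0.0, 1.0, 0.0]      # shares no component with query

for name, vec in [("same_direction", same_direction), ("orthogonal", orthogonal)]:
    print(name,
          "cosine:", round(cosine_similarity(query, vec), 3),
          "dot:", dot_product(query, vec),
          "euclidean:", round(euclidean_distance(query, vec), 3))
```

Cosine similarity ignores vector length, so it treats same_direction as a perfect match even though the Euclidean distance between the two is not zero. Which measure is appropriate usually depends on how the embedding model was trained.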

Why a regular database is not enough

Relational databases are excellent at exact matches, ranges, and joins: "all orders over $500 placed last week." They are poorly suited to "which of these ten million paragraphs is most semantically similar to this question," because an exact answer means comparing the query against every stored vector and ranking by distance, a brute-force scan whose cost grows linearly with the size of the corpus. Vector databases solve it with purpose-built index structures and approximate nearest-neighbor algorithms that skip the vast majority of comparisons.
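The brute-force scan described above can be sketched in a few lines. This is an illustration with random vectors and a hypothetical function name, not how any particular database implements search. The point is that every query touches every stored vector.

```python
import heapq
import math
import random

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query, vectors, k):
    """Exact top-k search: scores every stored vector, so the work
    grows linearly with the number of vectors."""
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)  # list of (score, index), best first

# Random stand-ins for embeddings; sizes are assumptions chosen to run quickly.
random.seed(0)
dim = 8
vectors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(1000)]
query = vectors[42]  # query with a stored vector, so it should rank first

top = brute_force_search(query, vectors, k=3)
for score, index in top:
    print(index, round(score, 3))
```

At a thousand vectors this is instant. At ten million vectors with a thousand dimensions each, the same loop performs on the order of ten billion multiplications per query, which is the cost that index structures are designed to avoid.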

How ANN search works

Approximate nearest-neighbor (ANN) search accepts a small loss of accuracy in exchange for a large gain in speed. The most common index, HNSW (Hierarchical Navigable Small World), builds a layered graph you can hop across to reach the right neighborhood without touching most vectors. Other approaches include IVF (inverted file indexes, which cluster vectors first and then search only the nearest clusters) and product quantization (compressing vectors so that more of them fit in memory). The trade-off you tune is recall (the share of true nearest neighbors the index actually returns) versus latency versus memory.
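A full HNSW implementation is too long to show here, so the sketch below illustrates the simpler IVF idea instead: cluster the vectors once, then at query time scan only the clusters nearest to the query. The data is random, the parameter names (n_lists, n_probe) and sizes are assumptions for illustration, and production libraries are far more optimized than this.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))

def assign(vectors, centroids):
    """Group vector ids by their nearest centroid (the 'inverted lists')."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        lists[nearest(v, centroids)].append(i)
    return lists

def build_ivf(vectors, n_lists, iterations=5, seed=0):
    """A few rounds of k-means, then a final assignment of ids to clusters."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_lists)]
    dim = len(vectors[0])
    for _ in range(iterations):
        lists = assign(vectors, centroids)
        for c, ids in enumerate(lists):
            if ids:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(vectors[i][d] for i in ids) / len(ids)
                                for d in range(dim)]
    return centroids, assign(vectors, centroids)

def ivf_search(query, vectors, centroids, lists, k, n_probe):
    """Scan only the n_probe clusters whose centroids are nearest the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in order[:n_probe] for i in lists[c]]
    ranked = sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))
    return ranked[:k], len(candidates)

def exact_search(query, vectors, k):
    return sorted(range(len(vectors)), key=lambda i: sq_dist(query, vectors[i]))[:k]

random.seed(1)
vectors = [[random.gauss(0, 1) for _ in range(8)] for _ in range(2000)]
query = [random.gauss(0, 1) for _ in range(8)]
centroids, lists = build_ivf(vectors, n_lists=20)

exact = exact_search(query, vectors, k=10)
for n_probe in (1, 5, 20):
    approx, scanned = ivf_search(query, vectors, centroids, lists, k=10, n_probe=n_probe)
    recall = len(set(approx) & set(exact)) / len(exact)
    print(f"n_probe={n_probe:2d}  scanned={scanned:4d}  recall={recall:.2f}")
```

Raising n_probe scans more vectors and recovers more of the true neighbors. Probing every cluster is equivalent to the exact scan. That dial is one concrete form of the recall versus latency trade-off.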

Where it fits in an AI stack

The dominant use case in 2026 is retrieval-augmented generation: documents are split into chunks, each chunk is embedded and stored, and at query time the most relevant chunks are retrieved and handed to a large language model as context. Beyond RAG, vector databases power semantic search, deduplication, recommendation engines, anomaly detection, and image similarity. Many also support metadata filtering, so you can combine "most similar" with "and tagged finance, written after January."
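The retrieval step of that pipeline can be sketched end to end in plain Python. Everything here is a stand-in: toy_embed is a hypothetical hashed bag-of-words function standing in for a real embedding model, the chunker is a naive fixed-size word splitter, the store is an in-memory list, and the documents and tags are invented for the example.

```python
import math
import re
import zlib

DIM = 256  # toy dimensionality, an assumption for this sketch

def toy_embed(text):
    """Hypothetical stand-in for an embedding model:
    hashed bag-of-words, normalized to unit length."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def chunk(text, max_words=12):
    """Naive chunking: fixed-size runs of words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def index_documents(documents):
    """Split each document into chunks and store each chunk with its vector and tag."""
    store = []
    for doc in documents:
        for piece in chunk(doc["text"]):
            store.append({"text": piece, "tag": doc["tag"], "vector": toy_embed(piece)})
    return store

def retrieve(store, question, k=2, tag=None):
    """Metadata filter first, then rank the survivors by dot product
    (equal to cosine similarity here, since vectors are unit length)."""
    q = toy_embed(question)
    candidates = [r for r in store if tag is None or r["tag"] == tag]
    candidates.sort(key=lambda r: sum(a * b for a, b in zip(q, r["vector"])), reverse=True)
    return candidates[:k]

documents = [
    {"tag": "finance", "text": "Quarterly revenue grew while operating costs stayed flat."},
    {"tag": "finance", "text": "The invoice payment terms are thirty days from receipt."},
    {"tag": "engineering", "text": "The deploy pipeline runs tests before every release."},
]
store = index_documents(documents)

question = "What are the invoice payment terms?"
hits = retrieve(store, question, k=1, tag="finance")

# The retrieved chunks become context for the language model.
prompt = "Context:\n" + "\n".join(h["text"] for h in hits) + "\n\nQuestion: " + question
print(prompt)
```

In a real system the embedding function would be a trained model and the list would be a vector index, but the shape of the flow (chunk, embed, store, filter, rank, build the prompt) stays the same.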

Dedicated vs. bolt-on

You do not always need a new piece of infrastructure. Postgres with the pgvector extension adds vector columns and ANN indexes to a database you may already run, which keeps your data in one place. Dedicated systems — Pinecone, Weaviate, Qdrant, Milvus, Chroma — earn their cost at large scale, high query volume, or when you need advanced filtering and horizontal sharding. The right answer depends on corpus size and traffic, not hype.
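One way to ground the corpus-size question is a back-of-envelope memory estimate. The sketch below counts only the raw vectors, assuming uncompressed 32-bit floats and an assumed dimensionality of 1,024. Index structures and metadata add overhead on top, which this does not attempt to estimate.

```python
def raw_vector_bytes(n_vectors, dim, bytes_per_value=4):
    """Bytes for the raw vectors alone (float32 by default),
    excluding any index or metadata overhead."""
    return n_vectors * dim * bytes_per_value

# Corpus sizes and dimensionality are illustrative assumptions.
for n in (100_000, 10_000_000, 1_000_000_000):
    gib = raw_vector_bytes(n, dim=1024) / 2**30
    print(f"{n:>13,} vectors x 1024 dims: about {gib:,.1f} GiB")
```

The smallest case fits comfortably in the memory of an ordinary database server, while the largest does not fit on a single ordinary machine without compression or sharding.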

At QUANT LAB

When we build AI integration features, the vector store is a deliberate decision, not a default. We start by asking how big the corpus is, how often it changes, and how many queries per second the feature must serve. For most teams a Postgres-based index is the pragmatic starting point; we reach for a dedicated database only when the numbers demand it. Either way, the embedding model, chunking strategy, and data pipeline feeding it matter far more to quality than the brand on the box.

