What is a Vector Database?
A vector database is a system built to store high-dimensional vectors — the numeric fingerprints that machine-learning models produce for text, images, and audio — and to answer one specific question fast: given this query vector, which stored vectors are most similar? It is the engine behind semantic search, recommendations, and retrieval-augmented generation.
The core idea
A machine-learning model can turn a sentence, an image, or a product description into a list of a few hundred to a few thousand numbers: an embedding that captures meaning. Items with similar meaning end up close together in that high-dimensional space. A vector database exists to make use of that geometry: store millions of these vectors, then answer "find the closest ones to this query vector" in milliseconds. Closeness is measured with cosine similarity or dot product (higher means more similar) or with Euclidean distance (lower means more similar).
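To make the three measures concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and their names (`cat`, `kitten`, `invoice`) are made up for illustration; real embeddings come from a model and have far more dimensions.

```python
import math

def dot(a, b):
    """Dot product: larger means more similar (sensitive to vector length)."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b, in [-1, 1]; ignores vector length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    """Straight-line distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy "embeddings", chosen by hand for illustration only.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

for name, other in (("kitten", kitten), ("invoice", invoice)):
    print(
        f"cat vs {name}: "
        f"cosine={cosine_similarity(cat, other):.3f}  "
        f"dot={dot(cat, other):.3f}  "
        f"euclidean={euclidean_distance(cat, other):.3f}"
    )
```

When vectors are normalized to unit length, all three measures produce the same ranking, so the choice mostly matters for embeddings whose lengths carry information.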
Why a regular database is not enough
Relational databases are excellent at exact matches, ranges, and joins: "all orders over $500 placed last week." They are poorly suited to "which of these ten million paragraphs is most semantically similar to this question," because conventional indexes cannot order rows by similarity to an arbitrary query vector. Answering exactly means comparing the query against every stored vector and ranking by distance, a brute-force scan whose cost grows linearly with both the number of vectors and their dimensionality. Vector databases solve it with purpose-built index structures and approximate nearest-neighbor algorithms that skip the vast majority of comparisons.
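The brute-force scan described above can be sketched in a few lines. This is an illustrative baseline, not how any particular database implements it; the random vectors, the sizes, and the function name `brute_force_search` are assumptions made for the example.

```python
import heapq
import math
import random

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query, vectors, k=3):
    """Score every stored vector against the query and keep the k best.

    Cost is O(N * d) per query: N stored vectors, d dimensions each.
    """
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)  # list of (score, index), best first

# Hypothetical data: 2,000 random 64-dimensional vectors.
random.seed(0)
store = [[random.gauss(0, 1) for _ in range(64)] for _ in range(2_000)]

query = store[42]  # query with a stored vector, so it should match itself
results = brute_force_search(query, store, k=3)
for score, index in results:
    print(f"index={index}  cosine={score:.3f}")
```

Every query touches all stored vectors, which is acceptable for a few thousand rows and is the cost that approximate indexes are designed to avoid at larger scale.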
How ANN search works
Approximate nearest-neighbor (ANN) search accepts a small loss of recall (it may occasionally miss one of the true nearest neighbors) in exchange for a large gain in speed. The most common index, HNSW (Hierarchical Navigable Small World), builds a layered graph you can hop across to reach the right neighborhood without touching most vectors. Other approaches include IVF (inverted file indexes that cluster vectors first and search only the closest clusters) and product quantization (compressing vectors to fit more in memory). The trade-off you tune is recall versus latency versus memory.
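As a rough illustration of the IVF idea, the sketch below clusters vectors with a few rounds of k-means and then searches only the `nprobe` clusters nearest the query. It is a toy in plain Python under assumed parameters (1,000 random 16-dimensional vectors, 10 clusters), not the implementation used by any specific library; all function names here are hypothetical.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroids(v, centroids, n=1):
    """Indices of the n centroids closest to v, nearest first."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
    return order[:n]

def assign(vectors, centroids):
    """Put each vector's index into the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        lists[nearest_centroids(v, centroids)[0]].append(i)
    return lists

def build_ivf(vectors, n_lists, iters=5, seed=0):
    """A few rounds of k-means, then a final assignment into inverted lists."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_lists)]
    for _ in range(iters):
        for c, members in enumerate(assign(vectors, centroids)):
            if members:  # keep the old centroid if a cluster went empty
                columns = zip(*(vectors[i] for i in members))
                centroids[c] = [sum(col) / len(members) for col in columns]
    return centroids, assign(vectors, centroids)

def ivf_search(query, vectors, centroids, lists, k=5, nprobe=2):
    """Scan only the nprobe lists nearest the query; return (top-k ids, vectors scanned)."""
    candidates = [i for c in nearest_centroids(query, centroids, nprobe) for i in lists[c]]
    top = sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:k]
    return top, len(candidates)

def exact_search(query, vectors, k=5):
    return sorted(range(len(vectors)), key=lambda i: sq_dist(query, vectors[i]))[:k]

random.seed(1)
data = [[random.gauss(0, 1) for _ in range(16)] for _ in range(1000)]
centroids, lists = build_ivf(data, n_lists=10)
query = [random.gauss(0, 1) for _ in range(16)]
truth = exact_search(query, data)

for nprobe in (1, 3, 10):
    found, scanned = ivf_search(query, data, centroids, lists, nprobe=nprobe)
    recall = len(set(found) & set(truth)) / len(truth)
    print(f"nprobe={nprobe:>2}  scanned={scanned:>4}  recall@5={recall:.2f}")
```

Raising `nprobe` scans more vectors and recovers more of the true neighbors; probing every list degenerates to the exact scan. That dial is one concrete form of the recall-versus-latency trade-off.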
Where it fits in an AI stack
The dominant use case in 2026 is retrieval-augmented generation: documents are split into chunks, each chunk is embedded and stored, and at query time the most relevant chunks are retrieved and handed to a large language model as context. Beyond RAG, vector databases power semantic search, deduplication, recommendation engines, anomaly detection, and image similarity. Many also support metadata filtering, so you can combine "most similar" with "and tagged finance, written after January."
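The chunk, embed, store, retrieve flow, together with a metadata filter, can be sketched as follows. Everything here is a stand-in: `toy_embed` is a hashed bag-of-words that captures word overlap rather than meaning (a real system would call an embedding model), and `TinyVectorStore`, `chunk`, and the sample documents are hypothetical names and data invented for the example.

```python
import hashlib
import math
import re
from dataclasses import dataclass

DIM = 256  # toy dimensionality, chosen arbitrarily for this sketch

def toy_embed(text):
    """Stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def chunk(text, max_words=40):
    """Split a document into fixed-size word windows (the simplest chunking strategy)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

@dataclass
class Record:
    text: str
    vector: list
    metadata: dict

class TinyVectorStore:
    def __init__(self):
        self.records = []

    def add_document(self, text, **metadata):
        for piece in chunk(text):
            self.records.append(Record(piece, toy_embed(piece), metadata))

    def search(self, question, k=2, where=None):
        """Return the k most similar chunks whose metadata matches `where`."""
        q = toy_embed(question)
        where = where or {}
        hits = [
            (sum(a * b for a, b in zip(q, r.vector)), r)
            for r in self.records
            if all(r.metadata.get(key) == value for key, value in where.items())
        ]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [record for _, record in hits[:k]]

store = TinyVectorStore()
store.add_document("Quarterly revenue grew because subscription renewals increased.", tag="finance")
store.add_document("The office kitchen will be closed for renovation next week.", tag="facilities")
store.add_document("Revenue recipes: how to cook with a budget.", tag="cooking")

# Retrieve context restricted to finance, then build the prompt for the language model.
context = store.search("Why did revenue grow this quarter?", k=1, where={"tag": "finance"})
prompt = "Answer using this context:\n" + "\n".join(r.text for r in context)
print(prompt)
```

The filter is applied before scoring here for simplicity. Whether a real system filters before, during, or after the ANN search varies by product and affects both recall and latency.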
Dedicated vs. bolt-on
You do not always need a new piece of infrastructure. Postgres with the pgvector extension adds vector columns and ANN indexes to a database you may already run, which keeps your data in one place. Dedicated systems — Pinecone, Weaviate, Qdrant, Milvus, Chroma — earn their cost at large scale, high query volume, or when you need advanced filtering and horizontal sharding. The right answer depends on corpus size and traffic, not hype.
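A back-of-the-envelope calculation helps ground the corpus-size question. The sketch below estimates storage for the raw vectors alone, assuming 32-bit floats and a hypothetical 768-dimensional embedding; index structures, metadata, and replication add more on top, and the helper names are made up for this example.

```python
def raw_vector_bytes(num_vectors, dims, bytes_per_value=4):
    """Storage for the raw vectors alone, assuming float32 (4 bytes per value)."""
    return num_vectors * dims * bytes_per_value

def to_gib(num_bytes):
    return num_bytes / 2**30

# Hypothetical corpus sizes at an assumed 768 dimensions per vector.
for n in (100_000, 10_000_000, 1_000_000_000):
    size = to_gib(raw_vector_bytes(n, 768))
    print(f"{n:>13,} vectors x 768 dims = {size:,.1f} GiB")
```

Numbers like these do not pick a product by themselves, but they show quickly whether a corpus fits comfortably in the memory of a single database server or is headed toward sharding and compression.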
At QUANT LAB
When we build AI integration features, the vector store is a deliberate decision, not a default. We start by asking how big the corpus is, how often it changes, and how many queries per second the feature must serve. For most teams a Postgres-based index is the pragmatic starting point; we reach for a dedicated database only when the numbers demand it. Either way, the embedding model, chunking strategy, and data pipeline feeding it matter far more to quality than the brand on the box.