Vector Databases Explained: What They Are and When to Use Them
Your Regular Database Can't Do What a Vector Database Does
Most developers hit this wall when building their first AI feature. You have embeddings โ high-dimensional float arrays representing semantic meaning โ and you need to find the ones closest to a query embedding. You try storing them in Postgres. The first query works. Then you have 50,000 vectors and similarity search takes four seconds. Then 500,000 and it times out entirely.
That's not a Postgres problem. That's a fundamental mismatch between what relational databases are built for and what semantic search requires. Vector databases exist to solve exactly this problem.
This post explains what vector databases actually are under the hood, how they differ from regular databases, and when you actually need one versus when pgvector is enough.
๐ฏ Quick Answer (30-Second Read)
- What it is: A database optimized for storing and searching high-dimensional vectors using approximate nearest-neighbor (ANN) algorithms
- When to use it: Any AI feature doing semantic search, RAG, recommendation, or similarity matching at scale
- Main benefit: Millisecond similarity search across millions of vectors โ something a relational DB cannot do efficiently
- Main limitation: Not a replacement for your primary database โ vectors need metadata stored alongside them
- Recommendation: Start with pgvector on Supabase, migrate to a dedicated vector DB when query latency becomes a measurable problem
What a Vector Database Actually Is
A vector database is a storage and retrieval system built specifically around one operation: given a query vector, find the N vectors in the database most similar to it. Everything about the architecture โ the indexing algorithms, the storage layout, the query engine โ is optimized for this single operation.
Similarity is measured as distance in high-dimensional space. The most common metrics are cosine similarity (angle between vectors, used for text embeddings) and Euclidean distance (straight-line distance, used for image embeddings). Two pieces of text with similar meaning will have embeddings that are close together by these measures. Two unrelated pieces of text will be far apart.
The hard problem is doing this search fast. A brute-force approach โ compare your query vector against every vector in the database โ is called exact nearest-neighbor search. It is perfectly accurate but scales as O(n). At a million vectors, it is unusably slow.
How Vector Indexes Work
Vector databases solve the speed problem with Approximate Nearest-Neighbor (ANN) indexes. These trade a small amount of accuracy for massive speed gains.
The two most common index types are:
HNSW (Hierarchical Navigable Small World) builds a multi-layer graph where each node connects to its nearest neighbors. At query time, the algorithm navigates the graph starting from a high-level entry point and progressively refines the search. HNSW has excellent query speed and recall but is memory-intensive and slow to build.
IVF (Inverted File Index) clusters vectors into buckets using k-means. At query time, only the nearest clusters are searched rather than the entire dataset. IVF is faster to build and uses less memory than HNSW but has lower recall if the cluster count is not tuned well.
Most production vector databases expose both options. HNSW is the default recommendation for most use cases.
The Right Tool vs The Wrong Tool
The right approach is matching the tool to your scale and query pattern. If you are building a RAG pipeline with under 100,000 documents and your team already runs Postgres, pgvector is the right answer. It gives you vector search inside your existing database with no new infrastructure. You get SQL joins between your vectors and metadata, familiar tooling, and Supabase hosts it with zero ops overhead.
When your query latency starts degrading โ typically around 500,000 to 1 million vectors with complex filters โ that is the signal to evaluate a dedicated vector database. Not before.
The wrong approach is architecting for scale you do not have. Developers who add Pinecone or Weaviate to a project with 10,000 vectors are paying operational and financial overhead for no measurable benefit. A vector database is infrastructure. Infrastructure has a maintenance cost. Do not add it until the problem it solves is real.
The other wrong approach is treating your vector database as your only database. Vector databases store vectors and metadata. They do not replace your relational database for transactional data, user records, billing, or anything requiring ACID guarantees. Your stack needs both โ the question is whether the vector layer needs to be a dedicated service.
My Take
The reason vector databases feel confusing to most developers is that they look like databases but behave like search indexes โ and most developers have never had to think carefully about how search indexes work. The best outcome is a team that treats vector search as a first-class infrastructure concern: they measure recall, they monitor query latency, they re-index when their embedding model changes. The worst outcome is a RAG pipeline where someone picked Pinecone on day one because it was in a tutorial, never set up recall metrics, and is now paying $400/month for a service they could run on pgvector with better observability. Right now, the industry is converging toward hybrid stores โ Postgres extensions like pgvector closing the gap with dedicated services for most workloads, while dedicated vector DBs push into multi-modal and billion-scale territory. Where this is heading: the distinction between vector database and operational database will largely disappear for sub-100M vector workloads within two years. Build for the problem you have today.
Comparison Table
| Tool | Type | Best For | Scaling Ceiling | Hosted Option |
|---|---|---|---|---|
| pgvector | Postgres extension | Early-stage, existing Postgres stack | ~1M vectors | Supabase |
| Pinecone | Managed vector DB | Teams wanting zero ops | Very high | Yes (only) |
| Qdrant | Dedicated vector DB | Performance + filtering + self-host | Very high | Yes + self-host |
| Weaviate | Dedicated vector DB | Multi-modal, GraphQL API | Very high | Yes + self-host |
| Chroma | Embedded vector DB | Local dev, prototyping | Low | No |
| Redis VSS | In-memory vector search | Low-latency, real-time retrieval | Medium | Yes |
Real Developer Use Case
A developer building a documentation search feature for a B2B SaaS started with pgvector on Supabase. The corpus was 80,000 document chunks. Average query latency at launch: 180ms. Acceptable.
Eighteen months later, the corpus grew to 1.2 million chunks across multiple customers. Query latency climbed to 1.4 seconds. The fix was not switching to Pinecone โ it was adding an HNSW index in pgvector with the right ef_construction and m parameters, and filtering by customer_id before the vector search to reduce the search space per query.
Result: latency back to 220ms with no new infrastructure. The lesson is that most vector search performance problems are index tuning problems, not database selection problems. Know your index before you change your stack.
Frequently Asked Questions
What is the difference between a vector database and a traditional database?
A traditional relational database is optimized for exact lookups, joins, and transactional writes. A vector database is optimized for approximate similarity search over high-dimensional float arrays. They solve different problems. Most production AI applications need both โ a relational database for structured data and a vector layer for semantic retrieval.
Do I need a vector database for RAG?
Not necessarily. If your document corpus is under 500,000 chunks, pgvector on Postgres handles RAG retrieval well with proper indexing. Dedicated vector databases like Qdrant or Pinecone make sense when you need sub-10ms latency at millions of vectors, multi-tenant isolation at scale, or advanced filtering that Postgres cannot express efficiently.
What is HNSW and why does it matter?
HNSW (Hierarchical Navigable Small World) is the index algorithm that makes fast approximate nearest-neighbor search possible. It builds a graph structure over your vectors so similarity queries navigate to the right neighborhood without scanning every vector. Most vector databases and pgvector support HNSW. Tuning m (number of connections) and ef_construction (build accuracy) is the most impactful performance lever you have.
Is Pinecone worth the cost?
Pinecone is expensive relative to self-hosted options but eliminates all operational overhead. It makes sense for teams without infrastructure expertise who need a production-ready vector search service immediately. For teams comfortable with Postgres or running their own containers, pgvector or Qdrant will give better cost efficiency with comparable performance at most scales.
Can a vector database store regular data too?
Vector databases support metadata fields alongside vectors โ strings, numbers, booleans โ which you can filter on during search. But they are not designed for complex relational queries, transactions, or schema evolution. Use them for what they are built for: vector storage and similarity retrieval. Keep your source-of-truth data in a relational database.
Conclusion
Vector databases exist because similarity search over high-dimensional embeddings is a fundamentally different problem from anything a relational database was designed to solve. If you are building an AI feature that needs semantic search โ RAG, recommendations, duplicate detection, image similarity โ you need a vector layer in your stack.
Start with pgvector. It handles more than most developers expect. Add a dedicated vector database when latency data tells you to, not when a tutorial tells you to. Understand your index before you change your infrastructure.
Related reads: RAG Explained: How AI Apps Answer Questions on Your Data ยท How Developers Use AI to Build Apps Faster ยท How to Create a SaaS with Next.js and Supabase