pgvector doesn't scale
It’s funny how engineers (myself included) assume that if you can store vectors in Postgres, you should. The logic feels sound: one database, one backup, one mental model. But the moment you hit scale, that convenience quietly turns into a trap.
In Alex Jacobs’s piece “The Case Against pgvector,” he writes that you “pick an index type and then never rebalance, so recall can drift.” That line hit me because it names the hidden friction: Postgres was built for structured queries, not high-dimensional vector search. Jacobs shows how building an index on millions of vectors can consume “10+ GB of RAM and hours of build time” on a production database.
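For context, here is what that build step looks like. This is a sketch using pgvector's documented index syntax; the table and column names (`documents`, `embedding`) are hypothetical, and the parameter values are illustrative, not recommendations.

```sql
-- Hypothetical table: millions of rows with an `embedding` vector column.
-- A build at this scale is where the RAM and build-time costs show up.

-- IVFFlat partitions vectors into lists at build time; the lists are never
-- rebalanced as data changes, which is the recall drift Jacobs describes.
CREATE INDEX ON documents
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 1000);

-- HNSW avoids stale list centroids but costs more memory and build time.
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);
```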
Then comes the filtering trap. Jacobs points out that if you want “only published documents” combined with similarity search, the order of filtering matters. Apply the filter before the vector scan and the query stays fast. Apply it after, and the same query can take seconds instead of milliseconds. That gap is invisible in prototypes but painful in production.
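A sketch of the trap, reusing the hypothetical `documents`/`embedding` names. The exact plan Postgres picks depends on the planner and your indexes, so treat this as illustration rather than guaranteed behavior.

```sql
-- The natural query: similarity search plus a metadata filter.
SELECT id
FROM documents
WHERE status = 'published'
ORDER BY embedding <=> $1   -- distance to the query vector
LIMIT 10;

-- With a plain ANN index, the nearest-neighbor scan runs first and the
-- status filter is applied to whatever comes back. If most neighbors are
-- unpublished, the scan has to dig much deeper (or returns short results),
-- and latency jumps from milliseconds to seconds.

-- One common workaround: a partial index scoped to the filtered subset,
-- so the filter is effectively applied before the vector scan.
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WHERE status = 'published';
```

The workaround only holds while your filters are few and known in advance; every new filter combination wants its own partial index, which is exactly the maintenance burden the next paragraph is about.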
The takeaway is clear. Convenience is not a strategy. If your vector workload grows beyond the trivial, use a dedicated vector database. The single-system story looks tidy on a diagram but often costs you far more in latency and maintenance.