pgvector doesn't scale
It’s funny how engineers (myself included) assume that if you can store vectors in Postgres, you should. The logic feels sound: one database, one backup, one mental model. But the moment you hit scale, that convenience quietly turns into a trap.
In Alex Jacobs’s piece “The Case Against pgvector,” he writes that you “pick an index type and then never rebalance, so recall can drift.” That line hit me because it names the hidden friction: Postgres was built for structured queries, not high-dimensional vector search. Jacobs shows how building an index on millions of vectors can consume “10+ GB of RAM and hours of build time” on a production database.
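For context, here is what that build step looks like. This is a sketch using pgvector's documented index syntax; the table and column names (`documents`, `embedding`) are hypothetical, and the parameter values are illustrative, not recommendations.

```sql
-- Hypothetical table: millions of rows with an `embedding` vector column.
-- A build at this scale is where the RAM and build-time costs show up.

-- IVFFlat partitions vectors into lists at build time; the lists are never
-- rebalanced as data changes, which is the recall drift Jacobs describes.
CREATE INDEX ON documents
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 1000);

-- HNSW avoids stale list centroids but costs more memory and build time.
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);
```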
Then comes the filtering trap. Jacobs points out that if you want “only published documents” combined with similarity search, the order of filtering matters. Apply the filter before the vector scan and the query stays fast. Apply it after, and the same query can take seconds instead of milliseconds. That gap is invisible in prototypes but painful in production.
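A sketch of the trap, reusing the hypothetical `documents`/`embedding` names. The exact plan Postgres picks depends on the planner and your indexes, so treat this as illustration rather than guaranteed behavior.

```sql
-- The natural query: similarity search plus a metadata filter.
SELECT id
FROM documents
WHERE status = 'published'
ORDER BY embedding <=> $1   -- distance to the query vector
LIMIT 10;

-- With a plain ANN index, the nearest-neighbor scan runs first and the
-- status filter is applied to whatever comes back. If most neighbors are
-- unpublished, the scan has to dig much deeper (or returns short results),
-- and latency jumps from milliseconds to seconds.

-- One common workaround: a partial index scoped to the filtered subset,
-- so the filter is effectively applied before the vector scan.
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WHERE status = 'published';
```

The workaround only holds while your filters are few and known in advance; every new filter combination wants its own partial index, which is exactly the maintenance burden the next paragraph is about.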
The takeaway is clear. Convenience is not a strategy. If your vector workload grows beyond the trivial, use a dedicated vector database. The single-system story looks tidy on a diagram but often costs you far more in latency and maintenance.