Every LLM project starts with the same question: which vector database? The honest answer is that the choice usually matters less than whether the team can observe and operate what they pick. But some setups genuinely fit certain shapes of work better than others — and the wrong pick can quietly cap a project at the prototype stage.
After shipping Qdrant, pgvector, and Weaviate in production, we keep returning to the same short list of questions before choosing one for a new project.
Latency vs. freshness
Most teams ask about search latency first and about write throughput last. In practice, the reverse is often more important: if the corpus updates daily and indexes rebuild in batches, the stale window users experience (how long a new document stays invisible to search) is the metric that matters, not p50 query time.
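To make the point concrete, here is a minimal sketch of measuring that stale window under a batch-rebuild schedule. The `stale_window` helper is hypothetical, written for this illustration, not part of any vector-database client:

```python
from datetime import datetime, timedelta

def stale_window(write_time: datetime, rebuild_times: list[datetime]) -> timedelta:
    """How long a write stays invisible to search: the gap between the
    write and the next index rebuild that picks it up.
    (Hypothetical helper for illustration only.)"""
    upcoming = [t for t in rebuild_times if t >= write_time]
    if not upcoming:
        raise ValueError("no rebuild scheduled after this write")
    return min(upcoming) - write_time

# A document written at 09:30 with a nightly rebuild at midnight stays
# stale for 14.5 hours -- orders of magnitude above any query latency.
rebuilds = [datetime(2024, 1, d, 0, 0) for d in (1, 2, 3)]
gap = stale_window(datetime(2024, 1, 1, 9, 30), rebuilds)
print(gap)
```

A p50 of 20 ms is irrelevant next to a 14-hour staleness gap; shrinking the rebuild cadence moves the user-visible metric far more than tuning the index.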
pgvector fits when the data already lives in Postgres and transactional consistency is non-negotiable. Qdrant fits when filtering plus vector search at scale is the primary need. Weaviate fits when first-class hybrid retrieval should come out of the box rather than be wired by hand.
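"Hybrid retrieval" here means fusing a keyword-relevance score with a vector-similarity score. The toy sketch below shows the common weighted-fusion idea; it is illustrative Python, not Weaviate's implementation, and the `alpha` weight and crude overlap scorer are assumptions standing in for BM25:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def keyword_score(query: str, doc: str) -> float:
    """Crude token-overlap score -- a stand-in for BM25."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def hybrid_score(query: str, doc: str,
                 q_vec: list[float], d_vec: list[float],
                 alpha: float = 0.5) -> float:
    """Weighted fusion: alpha blends vector and keyword relevance."""
    return alpha * cosine(q_vec, d_vec) + (1 - alpha) * keyword_score(query, doc)

score = hybrid_score("vector database", "a vector database guide",
                     [1.0, 0.0], [1.0, 0.0])
print(score)
```

Wiring this by hand means owning the tokenizer, the score normalization, and the weight tuning; a database with first-class hybrid search exposes the blend as a single query parameter instead.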
Operational weight
Self-hosted clusters look cheaper on a spreadsheet. After three months of 2 AM pages, managed options start looking reasonable. Pick the one the team is willing to operate, not the one with the best benchmark page.