How vector search works

Vector search relies on dense embeddings: high-dimensional float arrays in which semantic similarity maps to geometric proximity.

The math

Given a query vector $q$ and document vectors $d_1, d_2, \ldots, d_n$, the cosine similarity is

$$\text{sim}(q, d_i) = \frac{q \cdot d_i}{\|q\| \, \|d_i\|}$$

For unit-normalised vectors both norms equal 1, so this collapses to the dot product $q \cdot d_i$, which modern hardware can chew through at hundreds of millions of comparisons per second.
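
To make the collapse to a dot product concrete, here is a minimal sketch that computes the similarity both ways. The helper names (`dot`, `norm`, `cosineSimilarity`, `normalize`) and the two sample vectors are illustrative assumptions, not part of any library.

```typescript
// Dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Euclidean (L2) norm of a vector.
function norm(a: number[]): number {
  return Math.sqrt(dot(a, a));
}

// Cosine similarity, written exactly as in the formula above.
function cosineSimilarity(q: number[], d: number[]): number {
  return dot(q, d) / (norm(q) * norm(d));
}

// Scale a vector to unit length so later comparisons need only a dot product.
function normalize(a: number[]): number[] {
  const n = norm(a);
  if (n === 0) throw new Error('cannot normalise the zero vector');
  return a.map((x) => x / n);
}

// Hypothetical sample vectors, each of length 5.
const queryVec = [3, 4];
const docVec = [4, 3];

// Full formula: 24 / (5 * 5).
const viaFormula = cosineSimilarity(queryVec, docVec);
// Normalise once, then a plain dot product gives the same value.
const viaDot = dot(normalize(queryVec), normalize(docVec));
```

Normalising at write time means the two square roots and the division are paid once per vector rather than once per comparison, which is why stored embeddings are often unit-normalised.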

Pipeline

flowchart LR
  Q[Query text] --> E[Embedder]
  E --> V[Query vector]
  V --> S[ANN index search]
  S --> R[Top-k results]
  R --> O[Reranker]
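
The ANN index search step approximates an exact nearest-neighbour scan. As a rough baseline for what it speeds up, the sketch below scores every document and keeps the best k. The `Doc` and `Hit` shapes, the `exactTopK` name, and the three-document corpus are assumptions made for illustration, and the vectors are assumed to be unit-normalised so that the dot product equals cosine similarity.

```typescript
interface Doc {
  id: string;
  vector: number[];
}

interface Hit {
  id: string;
  score: number;
}

// Exact top-k: score every document, sort by score descending, keep the first k.
function exactTopK(query: number[], docs: Doc[], k: number): Hit[] {
  return docs
    .map((doc) => ({
      id: doc.id,
      // Dot product; equals cosine similarity only for unit-length vectors.
      score: doc.vector.reduce((sum, x, i) => sum + x * query[i], 0),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Hypothetical toy corpus of unit-length 2-D vectors.
const corpus: Doc[] = [
  { id: 'a', vector: [1, 0] },
  { id: 'b', vector: [0.6, 0.8] },
  { id: 'c', vector: [0, 1] },
];

const hits = exactTopK([0.8, 0.6], corpus, 2);
```

This scan costs one dot product per document on every query. An ANN index gives up a little recall to avoid visiting most of the corpus.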

In code

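import { embed } from './embeddings';
import { db } from './db';

// Return the k documents nearest to the query by cosine distance.
async function search(query: string, k = 10) {
  // embed() takes a batch of texts; only one is sent, so keep the first vector.
  const [vec] = await embed([query]);
  return db.execute(
    // Ascending cosine distance puts the most similar rows first.
    'SELECT id, text FROM docs ORDER BY embedding <=> $1 LIMIT $2',
    [vec, k],
  );
}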

The <=> operator is pgvector's cosine-distance operator (1 minus cosine similarity), so ordering by it ascending returns the most similar rows first. For a 1M-row corpus, an HNSW index keeps this query under 10 ms p95.
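
For the planner to use an HNSW index with `<=>`, the index has to be built with pgvector's matching operator class, `vector_cosine_ops`. The sketch below builds the two statements involved. The `hnswStatements` helper and the index naming scheme are hypothetical; the `m`, `ef_construction`, and `hnsw.ef_search` values shown are pgvector's documented defaults, not tuned settings.

```typescript
// Build the pgvector statements for a cosine-distance HNSW index.
// Table and column names are interpolated as identifiers, so they must come
// from trusted code, never from user input.
function hnswStatements(table: string, column: string, efSearch = 40): string[] {
  return [
    // vector_cosine_ops pairs with the <=> operator used in the query.
    `CREATE INDEX IF NOT EXISTS ${table}_${column}_hnsw ` +
      `ON ${table} USING hnsw (${column} vector_cosine_ops) ` +
      `WITH (m = 16, ef_construction = 64)`,
    // Higher ef_search trades latency for recall at query time.
    `SET hnsw.ef_search = ${efSearch}`,
  ];
}

const statements = hnswStatements('docs', 'embedding');
```

Run the statements through whatever SQL client the project already uses. Actual latency depends on the vector dimension, the hardware, and the `ef_search` setting, so it is worth measuring on your own data.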
