
Overview

Every Rivestack database comes with pgvector pre-installed. pgvector adds vector similarity search to PostgreSQL, letting you store embeddings from OpenAI, Cohere, Hugging Face, or any other model alongside your relational data.
No installation or server configuration is needed — you only need to enable the extension in each database, as shown below.

Enable the extension

pgvector is pre-installed, but you need to enable it in your database:
CREATE EXTENSION IF NOT EXISTS vector;
Verify it’s active:
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';

Store embeddings

Create a table with a vector column

CREATE TABLE documents (
    id BIGSERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding VECTOR(1536)  -- OpenAI ada-002 dimensions
);
The VECTOR(n) type stores n-dimensional vectors. Common dimension sizes:
Model                           Dimensions
OpenAI text-embedding-ada-002   1536
OpenAI text-embedding-3-small   1536
OpenAI text-embedding-3-large   3072
Cohere embed-english-v3.0       1024
BGE-small-en                    384
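If you are adding embeddings to a table that already exists, an ALTER TABLE works just as well (the articles table name here is illustrative):
ALTER TABLE articles ADD COLUMN embedding VECTOR(1536);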

Insert vectors

INSERT INTO documents (content, embedding)
VALUES (
    'Rivestack is a managed PostgreSQL service',
    '[0.1, 0.2, 0.3, ...]'  -- your embedding vector
);

From Python with OpenAI

import psycopg2
import openai

# Generate embedding
response = openai.embeddings.create(
    model="text-embedding-ada-002",
    input="Rivestack is a managed PostgreSQL service"
)
embedding = response.data[0].embedding

# Store in Rivestack
conn = psycopg2.connect("postgresql://user:pass@host:5432/mydb?sslmode=require")
cur = conn.cursor()
cur.execute(
    "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
    # str(embedding) renders the list as '[0.1, 0.2, ...]', which pgvector parses as a vector literal
    ("Rivestack is a managed PostgreSQL service", str(embedding))
)
conn.commit()
conn.close()
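Stringifying the list works, but the companion pgvector Python package (pip install pgvector) can register a native adapter so you pass arrays directly. A minimal sketch, assuming that package is installed and reusing the embedding generated above:
import numpy as np
import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect("postgresql://user:pass@host:5432/mydb?sslmode=require")
register_vector(conn)  # adapt NumPy arrays to the vector type on this connection

cur = conn.cursor()
cur.execute(
    "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
    ("Rivestack is a managed PostgreSQL service", np.array(embedding))
)
conn.commit()
conn.close()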

Query vectors

Find the 5 most similar documents using cosine distance:
SELECT id, content, embedding <=> '[0.1, 0.2, 0.3, ...]' AS distance
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3, ...]'
LIMIT 5;

Distance operators

Operator   Distance metric          Use case
<->        L2 (Euclidean)           General purpose
<=>        Cosine                   Text embeddings (most common)
<#>        Negative inner product   Fastest when vectors are normalized
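All of these operators return a distance, so smaller means more similar. If you need a similarity score instead, derive it in the query — for cosine, similarity is 1 minus the distance:
SELECT id, content,
       1 - (embedding <=> '[0.1, 0.2, 0.3, ...]') AS cosine_similarity
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3, ...]'
LIMIT 5;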

Create an index

For tables with more than a few thousand rows, create an index to speed up queries:
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 200);
HNSW provides fast, approximate nearest-neighbor search with good recall. Tuning parameters:
Parameter         Default   Description
m                 16        Max connections per node. Higher = better recall, more memory.
ef_construction   64        Build-time search width. Higher = better recall, slower builds.
Set the search parameter at query time:
SET hnsw.ef_search = 100;  -- default is 40
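SET changes ef_search for the rest of the session. To raise it for a single query without affecting other statements on the same connection, scope it to a transaction with SET LOCAL:
BEGIN;
SET LOCAL hnsw.ef_search = 100;
SELECT id, content FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3, ...]'
LIMIT 5;
COMMIT;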

IVFFlat index

CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
IVFFlat partitions vectors into lists and samples existing rows to train them, so build the index after loading your data. A good starting point is lists = rows / 1000 for tables up to 1M rows, and sqrt(rows) above that.
SET ivfflat.probes = 10;  -- default is 1, higher = better recall

Which index to use?

                 HNSW             IVFFlat
Recall           Higher           Lower
Build speed      Slower           Faster
Query speed      Fast             Fast
Memory           More             Less
Recommendation   Default choice   Large datasets with limited memory

RAG example

A complete retrieval-augmented generation (RAG) pattern:
import psycopg2
import openai

def search(query: str, top_k: int = 5):
    # 1. Generate embedding for the query
    response = openai.embeddings.create(
        model="text-embedding-ada-002",
        input=query
    )
    query_embedding = response.data[0].embedding

    # 2. Search Rivestack for similar documents
    conn = psycopg2.connect("postgresql://user:pass@host:5432/mydb?sslmode=require")
    cur = conn.cursor()
    cur.execute("""
        SELECT content, embedding <=> %s::vector AS distance
        FROM documents
        ORDER BY embedding <=> %s::vector
        LIMIT %s
    """, (str(query_embedding), str(query_embedding), top_k))

    results = cur.fetchall()
    conn.close()
    return results

# 3. Use results as context for your LLM
results = search("How does Rivestack handle backups?")
context = "\n".join([row[0] for row in results])
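To close the loop, hand the retrieved context to a chat model. A minimal sketch using the same openai client — the model name and prompt wording here are illustrative, not prescribed:
response = openai.chat.completions.create(
    model="gpt-4o-mini",  # illustrative; use whichever chat model you prefer
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: How does Rivestack handle backups?"}
    ]
)
print(response.choices[0].message.content)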

Performance tips

  • Always create an index for tables over a few thousand rows. Without an index, pgvector scans every row.
  • Use HNSW as your default index type. It offers the best recall/speed trade-off.
  • Increase ef_search if recall is too low. Start with 100 and increase as needed.
  • Batch inserts for loading large datasets. Use COPY or multi-row INSERT for best throughput (see the sketch after this list).
  • Choose the right distance operator for your use case. Cosine (<=>) is the most common for text embeddings.
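For the batching tip above, psycopg2's execute_values sends many rows per statement instead of one INSERT per row. A minimal sketch, assuming cur and conn are open as in the earlier examples and rows is a list of (content, embedding) tuples with embeddings already stringified:
from psycopg2.extras import execute_values

# rows = [("first doc", str(embedding_1)), ("second doc", str(embedding_2)), ...]
execute_values(
    cur,
    "INSERT INTO documents (content, embedding) VALUES %s",
    rows
)
conn.commit()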