You don’t always need an entity graph and typed extractions. Sometimes you just want “given a query, return the most relevant chunks of this user’s past conversations.” Synap does this too — its ContextResponse is queryable like a vector store, with the bonus that retrieval is automatically scoped to the user.

from maximem_synap import MaximemSynapSDK, ContextType
from openai import AsyncOpenAI

# Async OpenAI client for the chat step below; reads OPENAI_API_KEY from the environment.
openai = AsyncOpenAI()

sdk = MaximemSynapSDK()
await sdk.initialize()  # top-level await: run inside your async entry point

async def search_user_history(user_id: str, customer_id: str, query: str, top_k: int = 8):
    """Pure RAG: return raw memory chunks ranked by relevance to `query`."""
    ctx = await sdk.user.context.fetch(
        user_id=user_id,
        customer_id=customer_id,
        search_query=[query],
        max_results=top_k,
        types=[ContextType.FACTS, ContextType.EPISODES],
        mode="accurate",                # graph-aware ranking
    )

    # Merge facts and episodes into one ranked list. Note: this assumes fact
    # confidence and episode significance are on a comparable (e.g. 0-1) scale.
    hits = (
        [(f.confidence, "fact", f.content) for f in ctx.facts] +
        [(e.significance, "episode", e.summary) for e in ctx.episodes]
    )
    hits.sort(key=lambda h: h[0], reverse=True)
    return hits[:top_k]


async def rag_chat(user_id: str, customer_id: str, user_message: str) -> str:
    hits = await search_user_history(user_id, customer_id, user_message)

    citations = "\n".join(
        f"[{i+1}] ({kind}) {text}" for i, (_score, kind, text) in enumerate(hits)
    )
    system_prompt = (
        "Answer the user's question using ONLY the citations below. "
        "Cite by number. If the answer isn't in the citations, say you don't know.\n\n"
        f"Citations:\n{citations}"
    )

    completion = await openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    )
    return completion.choices[0].message.content
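
To try it end to end, here is a minimal driver for notebook-style experimentation. It is a sketch: the user and customer IDs are hypothetical placeholders, and it assumes sdk.initialize() from the setup block has already been awaited.

import asyncio

async def main() -> None:
    # Hypothetical IDs; substitute real ones from your tenant.
    answer = await rag_chat("user-123", "cust-456", "What did we decide about the Q3 launch?")
    print(answer)

asyncio.run(main())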

Why this works without your own vector DB

  • Synap embeds every document at ingest time and stores the vectors in its own vector engine.
  • mode="accurate" combines vector similarity with graph traversal, so a query about “the project Bob is leading” pulls memories about Bob even if Bob isn’t named in the query verbatim (see the sketch after this list).
  • All retrievals are scoped — you cannot accidentally fetch another user’s data.
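
To make the graph-traversal point concrete, here is a minimal sketch reusing the same fetch call from above; the IDs are hypothetical placeholders, and the comment restates the documented behaviour rather than verified output:

ctx = await sdk.user.context.fetch(
    user_id="user-123",
    customer_id="cust-456",
    search_query=["the project Bob is leading"],
    types=[ContextType.FACTS],
    mode="accurate",
)
# Graph-aware ranking can surface facts linked to the "Bob" entity node
# even when their text never mentions the query terms verbatim.
for fact in ctx.facts:
    print(fact.confidence, fact.content)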

When this isn’t enough

If you have non-conversational sources (PDFs, knowledge base articles, web pages) that you want to retrieve over, you have two options:
  1. Ingest them into Synap with document_type="document" at the CUSTOMER or CLIENT scope. They become part of the same retrieval surface as user history.
  2. Keep a separate vector DB for the document corpus and merge results at the application layer, using Synap for memory only (see the sketch after this list).
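
A minimal sketch of option 2, reusing search_user_history() from above; query_doc_index() is a hypothetical stand-in for whatever index already holds your document corpus (pgvector, Pinecone, and so on):

async def hybrid_search(user_id: str, customer_id: str, query: str, top_k: int = 8):
    # Memory hits come from Synap and are already scoped to the user.
    memory_hits = await search_user_history(user_id, customer_id, query, top_k)

    # query_doc_index() is a placeholder: assume it synchronously returns
    # (score, text) pairs from your existing document index.
    doc_hits = [(score, "doc", text) for score, text in query_doc_index(query, top_k)]

    # Naive merge by raw score. This assumes the two score scales are roughly
    # comparable; in practice you may want a cross-encoder re-ranker here.
    merged = sorted(memory_hits + doc_hits, key=lambda h: h[0], reverse=True)
    return merged[:top_k]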
Option 1 is simpler and gets you scope-aware retrieval for free. Option 2 makes sense if you already have a doc corpus indexed and don’t want to re-ingest.

Comparing to a raw vector DB

|                                      | Pure pgvector / Pinecone                   | Synap                     |
| ------------------------------------ | ------------------------------------------ | ------------------------- |
| Setup                                | DB + embed model + retrieval pipeline      | pip install maximem-synap |
| Per-user isolation                   | You build it                               | Native (user_id scope)    |
| Multi-source merge (chat + docs)     | Manual                                     | Native                    |
| Re-ranking by recency / significance | Manual                                     | Built in                  |
| Cost                                 | Pay for compute + storage + embedding API  | Per-call pricing          |

If “I just want a vector DB” is genuinely all you want, pgvector is cheaper. If you’re going to end up building scope isolation, multi-source merge, and re-ranking on top of pgvector, you’re rebuilding Synap.