Overview

Context fetch is how your AI agent gets relevant context from stored memories. When your agent needs to know a user’s preferences, recall past conversations, or understand organizational context, it calls the retrieval API. Synap returns a structured ContextResponse containing facts, preferences, episodes, and emotions ranked by relevance to the query. Context fetch is designed to sit in your agent’s hot path — the fast mode targets 50-100ms latency, making it suitable for real-time conversation flows.

Conversation Context

The primary retrieval interface is sdk.conversation.context.fetch(). It returns context relevant to a specific conversation, enriched with memories from the user’s broader history.
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["project deadlines", "Q2 planning"],
    max_results=10,
    types=["facts", "preferences"],
    mode="fast"
)

print(f"Found {len(context.facts)} facts, {len(context.preferences)} preferences")

Parameter Reference

conversation_id
str
required
The identifier of the conversation to retrieve context for. Must be a valid UUID string — non-UUID strings will cause a ServiceUnavailableError. This scopes the retrieval to memories associated with this conversation plus broader user/customer context.
search_query
List[str]
One or more semantic search queries that drive retrieval. The queries are embedded and matched against stored memories using vector similarity. When multiple queries are provided, results are merged and re-ranked. If omitted, retrieval returns the most recent and relevant context without semantic filtering.
max_results
int
default:"10"
Maximum number of results to return across all types. The actual number returned may be less if fewer relevant results exist.
types
List[str]
Filter which memory types to include in the response. Valid values: "facts", "preferences", "episodes", "emotions", "temporal", "all". If omitted, all types are returned.
mode
str
default:"fast"
The retrieval mode. Controls the depth of search and ranking. See Retrieval Modes below.

Retrieval Modes

Synap offers two retrieval modes that trade off latency against comprehensiveness.
| Aspect | fast | accurate |
| --- | --- | --- |
| Latency | ~50-100ms | ~200-500ms |
| Search method | Vector similarity only | Vector + graph traversal + re-ranking |
| Best for | Real-time chat, low-latency requirements | Complex queries, relationship-aware context |
| Relationship awareness | Limited | Full entity-relationship traversal |
| Ranking | Cosine similarity | Multi-signal ranking (similarity + recency + graph centrality) |
Start with fast mode. Switch to accurate when you need relationship-aware context, such as queries that span multiple entities (“What did Alice say about the project Bob is leading?”).

When to Use Each Mode

Use fast when...

  • You are in a real-time conversation flow
  • The query is about a single topic or entity
  • Latency budget is under 100ms
  • You are retrieving frequently (e.g., every turn)

Use accurate when...

  • The query involves relationships between entities
  • You need context spanning multiple conversations
  • You are building a comprehensive summary
  • Latency budget allows 200-500ms
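
The heuristics above can be folded into a small helper. This is an illustrative sketch, not SDK behavior: the 200ms threshold and the entity-count check are assumptions based on the latency and relationship-awareness guidance in the table.

```python
# Sketch: pick a retrieval mode from the guidance above. The threshold and
# the multi-entity check are illustrative assumptions, not part of the SDK.
def choose_mode(latency_budget_ms: int, entity_count: int) -> str:
    """Return "fast" or "accurate" based on latency budget and query shape."""
    if latency_budget_ms < 200:
        return "fast"      # accurate mode's 200-500ms would exceed the budget
    if entity_count > 1:
        return "accurate"  # relationship-aware ranking helps multi-entity queries
    return "fast"
```

For example, `choose_mode(80, 1)` returns `"fast"`, while `choose_mode(400, 2)` returns `"accurate"`.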

Response Structure

The ContextResponse object contains structured memory types and metadata.
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["dietary preferences"],
    mode="fast"
)

Facts

Facts are discrete, verified pieces of information extracted from memories.
for fact in context.facts:
    print(f"[{fact.confidence:.0%}] {fact.content}")
    print(f"  Source: {fact.source}")
    print(f"  Extracted: {fact.extracted_at}")
id
UUID
required
Unique identifier for this fact.
content
str
required
The fact content in natural language (e.g., “User is a vegetarian”).
confidence
float
required
Confidence score between 0.0 and 1.0. Higher values indicate stronger evidence from source material.
source
str
required
Identifier of the memory this fact was extracted from.
extracted_at
datetime
required
Timestamp when this fact was extracted.
metadata
dict
Additional metadata about the extraction.

Preferences

Preferences capture user likes, dislikes, and stated preferences.
for pref in context.preferences:
    print(f"{pref.content} (confidence: {pref.confidence:.0%})")
Preferences follow the same field structure as facts, with content expressing the preference in natural language (e.g., “Prefers boutique hotels over large chains”).

Episodes

Episodes represent summarized narrative segments from past interactions.
for episode in context.episodes:
    print(f"Episode: {episode.content}")
    print(f"  From: {episode.source}")

Emotions

Emotions capture detected emotional states and sentiment from interactions.
for emotion in context.emotions:
    print(f"Emotion: {emotion.content} ({emotion.confidence:.0%})")

Response Metadata

Every ContextResponse includes metadata about the retrieval operation.
meta = context.metadata
print(f"Correlation ID: {meta.correlation_id}")
print(f"Source: {meta.source}")          # "cache" or "cloud"
print(f"TTL: {meta.ttl_seconds}s")
print(f"Compaction applied: {meta.compaction_applied}")
correlation_id
str
required
Unique ID for this retrieval request. Log this for debugging and support inquiries.
ttl_seconds
int
required
How long this response is valid in the local cache. After this duration, the SDK will fetch fresh results from Synap Cloud.
source
str
required
Whether this response was served from the local "cache" or fetched from "cloud". Cached responses are faster but may be slightly stale.
compaction_applied
bool
required
Whether context compaction was applied to the response. If true, the context has been compressed to fit within token budgets.

Scoped Retrieval

In addition to conversation-level retrieval, Synap provides scope-specific interfaces for retrieving memories at the user, customer, and client levels.

User Context

Retrieve all memories scoped to a specific user, across all their conversations.
user_context = await sdk.user.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["travel preferences"],
    max_results=20,
    mode="accurate"
)

# Returns facts, preferences, episodes, emotions
# scoped to the user associated with this conversation

Customer Context

Retrieve memories shared across all users within a customer (organization/tenant).
customer_context = await sdk.customer.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["engineering team OKRs"],
    max_results=15,
    mode="accurate"
)

Client Context

Retrieve memories at the broadest scope — across all customers and users within your Synap client.
client_context = await sdk.client.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["product roadmap"],
    max_results=10,
    mode="fast"
)
Scoped retrieval respects Synap’s scope hierarchy. User context includes user-scoped memories. Customer context includes customer-scoped memories visible to all users in that customer. Client context includes client-wide memories. Higher scopes never leak memories from narrower scopes unless those memories were explicitly created at the broader scope. See Memory Scopes for details.
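
If you fetch context at more than one scope, the same memory can surface twice. A sketch for deduplicating by `id`, using plain dicts as stand-ins for the fact objects described above:

```python
# Sketch: merge facts retrieved at different scopes, keeping the first
# occurrence of each id. Dicts stand in for the SDK's typed fact objects.
def merge_facts(*fact_lists):
    """Merge fact lists in priority order, deduplicating by id."""
    seen, merged = set(), []
    for facts in fact_lists:
        for fact in facts:
            if fact["id"] not in seen:
                seen.add(fact["id"])
                merged.append(fact)
    return merged
```

Pass the narrower scope first (e.g., `merge_facts(user_context.facts, customer_context.facts)`) so user-specific facts take priority over organization-wide ones.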

The types Filter

Use the types parameter to retrieve only specific memory types. This reduces response size and processing time when you only need certain kinds of context.
# Only retrieve facts and preferences
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    types=["facts", "preferences"],
    mode="fast"
)

# Only retrieve temporal/episode context
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    types=["episodes", "temporal"],
    mode="accurate"
)

# Explicitly request all types (same as omitting the parameter)
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    types=["all"],
    mode="fast"
)

Search Queries

The search_query parameter drives semantic search. Queries are embedded using Synap’s embedding model and matched against stored memory vectors.

Single Query

context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["What are the user's dietary restrictions?"]
)

Multiple Queries

When you provide multiple queries, Synap runs each independently and merges the results with deduplication and re-ranking.
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=[
        "dietary preferences and restrictions",
        "favorite restaurants and cuisines",
        "food allergies"
    ],
    max_results=15
)
Use multiple queries to broaden recall when a single query might miss relevant memories. For example, a user asking “What should I eat?” might benefit from queries about dietary preferences, allergies, and favorite cuisines simultaneously.
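
One way to apply this is to fan a single user message out into related query variants before calling `fetch()`. The keyword map below is a deliberately naive illustration; in practice you might ask an LLM to generate the variants.

```python
# Sketch: broaden recall by expanding one user message into several related
# queries. The keyword trigger and variant list are illustrative only.
FOOD_QUERY_VARIANTS = [
    "dietary preferences and restrictions",
    "food allergies",
    "favorite restaurants and cuisines",
]

def expand_queries(user_message: str) -> list[str]:
    """Return the original message plus related query variants."""
    queries = [user_message]
    if any(word in user_message.lower() for word in ("eat", "food", "dinner")):
        queries.extend(FOOD_QUERY_VARIANTS)
    return queries
```

The resulting list can be passed directly as `search_query`; Synap merges and deduplicates the per-query results.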

No Query (Recency-Based)

When search_query is omitted, retrieval returns the most recent and contextually relevant memories for the conversation without semantic filtering.
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    max_results=5
)

The .raw Property

For forward compatibility with future Synap API changes, every response object exposes a .raw property containing the unprocessed API response as a dictionary.
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    mode="fast"
)

# Access raw response for forward compatibility
raw = context.raw
print(raw.keys())
The .raw property is useful when Synap adds new fields to the API response that have not yet been mapped to typed SDK properties. You can access new fields immediately without waiting for an SDK update.

Full Example: System Prompt Injection

The most common use case for context fetch is injecting contextual memories into your LLM’s system prompt.
from openai import AsyncOpenAI

openai = AsyncOpenAI()  # assumes OPENAI_API_KEY is set and `sdk` is an initialized Synap client

async def chat_with_memory(conversation_id: str, user_message: str):
    # Fetch relevant context
    context = await sdk.conversation.context.fetch(
        conversation_id=conversation_id,
        search_query=[user_message],
        max_results=10,
        mode="fast"
    )

    # Build memory context string
    memory_lines = []

    if context.facts:
        memory_lines.append("## Known Facts")
        for fact in context.facts:
            if fact.confidence >= 0.7:
                memory_lines.append(f"- {fact.content}")

    if context.preferences:
        memory_lines.append("\n## User Preferences")
        for pref in context.preferences:
            memory_lines.append(f"- {pref.content}")

    if context.episodes:
        memory_lines.append("\n## Relevant Past Interactions")
        for episode in context.episodes:
            memory_lines.append(f"- {episode.content}")

    memory_context = "\n".join(memory_lines) if memory_lines else "No prior context available."

    # Inject into system prompt
    system_prompt = f"""You are a helpful assistant with access to the user's memory.

Use the following context to personalize your responses. Do not mention
that you are reading from a memory system.

{memory_context}"""

    # Call LLM
    response = await openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message}
        ]
    )

    assistant_message = response.choices[0].message.content

    # Ingest the new conversation turn for future memory
    await sdk.memories.create(
        document=f"User: {user_message}\nAssistant: {assistant_message}",
        document_type="ai-chat-conversation",
        user_id="user_12345",
        mode="long-range"
    )

    return assistant_message
For long-running conversations, combine context.fetch() with get_context_for_prompt() for a hybrid approach: compacted history provides broad context while retrieval provides query-specific details. See the Context Compaction guide for examples of this pattern.
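
The prompt-assembly half of that hybrid pattern can be sketched with plain inputs. Here `compacted_history` and `facts` are ordinary strings and lists; in practice they would come from `get_context_for_prompt()` and `context.fetch()` respectively.

```python
# Sketch of the hybrid pattern: compacted history supplies broad context,
# retrieved facts supply query-specific detail. Inputs are plain values here.
def build_hybrid_prompt(compacted_history: str, facts: list[str], user_message: str) -> str:
    """Assemble a system-prompt body from compacted history plus retrieved facts."""
    sections = ["## Conversation So Far", compacted_history]
    if facts:
        sections += ["## Retrieved Details"] + [f"- {f}" for f in facts]
    sections += ["## Current Message", user_message]
    return "\n".join(sections)
```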

Next Steps

Entity Resolution

Learn how entities are resolved to improve retrieval quality.

Context Compaction

Compress context to reduce LLM token costs.

Ingestion

Feed more data into Synap to enrich retrieval results.

Memory Scopes

Understand how scope filtering affects retrieval boundaries.