Overview

Context fetch is how your AI agent gets relevant context from stored memories. When your agent needs to know a user’s preferences, recall past conversations, or understand organizational context, it calls the retrieval API. Synap returns a structured ContextResponse containing facts, preferences, episodes, and emotions ranked by relevance to the query. Context fetch is designed to sit in your agent’s hot path — the fast mode targets 50-100ms latency, making it suitable for real-time conversation flows.

Conversation Context

The primary retrieval interface is sdk.conversation.context.fetch(). It returns context relevant to a specific conversation, enriched with memories from the user’s broader history.
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["project deadlines", "Q2 planning"],
    max_results=10,
    types=["facts", "preferences"],
    mode="fast"
)

print(f"Found {len(context.facts)} facts, {len(context.preferences)} preferences")

Parameter Reference

conversation_id
str
required
The identifier of the conversation to retrieve context for. Must be a valid UUID string — non-UUID strings will cause a ServiceUnavailableError. This scopes the retrieval to memories associated with this conversation plus broader user/customer context.
search_query
List[str]
One or more semantic search queries that drive retrieval. The queries are embedded and matched against stored memories using vector similarity. When multiple queries are provided, results are merged and re-ranked. If omitted, retrieval returns the most recent and relevant context without semantic filtering.
max_results
int
default:"10"
Maximum number of results to return across all types. The actual number returned may be less if fewer relevant results exist.
types
List[str]
Filter which memory types to include in the response. Valid values: "facts", "preferences", "episodes", "emotions", "temporal", "all". If omitted, all types are returned.
mode
str
default:"fast"
The retrieval mode. Controls the depth of search and ranking. See Retrieval Modes below.

Retrieval Modes

Synap offers two retrieval modes that trade off latency against comprehensiveness.
| Aspect | fast | accurate |
| --- | --- | --- |
| Latency | ~50-100ms | ~200-500ms |
| Search method | Vector similarity only | Vector + graph traversal + re-ranking |
| Best for | Real-time chat, low-latency requirements | Complex queries, relationship-aware context |
| Relationship awareness | Limited | Full entity-relationship traversal |
| Ranking | Cosine similarity | Multi-signal ranking (similarity + recency + graph centrality) |
Start with fast mode. Switch to accurate when you need relationship-aware context, such as queries that span multiple entities (“What did Alice say about the project Bob is leading?”).

When to Use Each Mode

Use fast when...

  • You are in a real-time conversation flow
  • The query is about a single topic or entity
  • Latency budget is under 100ms
  • You are retrieving frequently (e.g., every turn)

Use accurate when...

  • The query involves relationships between entities
  • You need context spanning multiple conversations
  • You are building a comprehensive summary
  • Latency budget allows 200-500ms
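
The heuristics above can be folded into a small helper. This is an illustrative sketch, not SDK behavior: the 200ms threshold and the entity-count check are assumptions based on the latency and relationship-awareness guidance in the table.

```python
# Sketch: pick a retrieval mode from the guidance above. The threshold and
# the multi-entity check are illustrative assumptions, not part of the SDK.
def choose_mode(latency_budget_ms: int, entity_count: int) -> str:
    """Return "fast" or "accurate" based on latency budget and query shape."""
    if latency_budget_ms < 200:
        return "fast"      # accurate mode's 200-500ms would exceed the budget
    if entity_count > 1:
        return "accurate"  # relationship-aware ranking helps multi-entity queries
    return "fast"
```

For example, `choose_mode(80, 1)` returns `"fast"`, while `choose_mode(400, 2)` returns `"accurate"`.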

Response Structure

The ContextResponse object contains structured memory types and metadata.
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["dietary preferences"],
    mode="fast"
)

Facts

Facts are discrete, verified pieces of information extracted from memories.
for fact in context.facts:
    print(f"[{fact.confidence:.0%}] {fact.content}")
    print(f"  Source: {fact.source}")
    print(f"  Extracted: {fact.extracted_at}")
id
UUID
required
Unique identifier for this fact.
content
str
required
The fact content in natural language (e.g., “User is a vegetarian”).
confidence
float
required
Confidence score between 0.0 and 1.0. Higher values indicate stronger evidence from source material.
source
str
required
Identifier of the memory this fact was extracted from.
extracted_at
datetime
required
Timestamp when this fact was extracted.
metadata
dict
Additional metadata about the extraction.

Preferences

Preferences capture user likes, dislikes, and stated preferences.
for pref in context.preferences:
    print(f"{pref.content} (confidence: {pref.confidence:.0%})")
Preferences follow the same field structure as facts, with content expressing the preference in natural language (e.g., “Prefers boutique hotels over large chains”).

Episodes

Episodes represent summarized narrative segments from past interactions.
for episode in context.episodes:
    print(f"Episode: {episode.content}")
    print(f"  From: {episode.source}")

Emotions

Emotions capture detected emotional states and sentiment from interactions.
for emotion in context.emotions:
    print(f"Emotion: {emotion.content} ({emotion.confidence:.0%})")

Response Metadata

Every ContextResponse includes metadata about the retrieval operation.
meta = context.metadata
print(f"Correlation ID: {meta.correlation_id}")
print(f"Source: {meta.source}")          # "cache" or "cloud"
print(f"TTL: {meta.ttl_seconds}s")
print(f"Compaction applied: {meta.compaction_applied}")
correlation_id
str
required
Unique ID for this retrieval request. Log this for debugging and support inquiries.
ttl_seconds
int
required
How long this response is valid in the local cache. After this duration, the SDK will fetch fresh results from Synap Cloud.
source
str
required
Whether this response was served from the local "cache" or fetched from "cloud". Cached responses are faster but may be slightly stale.
compaction_applied
bool
required
Whether context compaction was applied to the response. If true, the context has been compressed to fit within token budgets.

Scoped Retrieval

In addition to conversation-level retrieval, Synap provides scope-specific interfaces for retrieving memories at the user, customer, and client levels.

User Context

Retrieve all memories scoped to a specific user, across all their conversations.
user_context = await sdk.user.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["travel preferences"],
    max_results=20,
    mode="accurate"
)

# Returns facts, preferences, episodes, emotions
# scoped to the user associated with this conversation

Customer Context

Retrieve memories shared across all users within a customer (organization/tenant).
customer_context = await sdk.customer.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["engineering team OKRs"],
    max_results=15,
    mode="accurate"
)

Client Context

Retrieve memories at the broadest scope — across all customers and users within your Synap client.
client_context = await sdk.client.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["product roadmap"],
    max_results=10,
    mode="fast"
)
Scoped retrieval respects Synap’s scope hierarchy. User context includes user-scoped memories. Customer context includes customer-scoped memories visible to all users in that customer. Client context includes client-wide memories. Higher scopes never leak memories from narrower scopes unless those memories were explicitly created at the broader scope. See Memory Scopes for details.
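
If you fetch context at more than one scope, the same memory can surface twice. A sketch for deduplicating by `id`, using plain dicts as stand-ins for the fact objects described above:

```python
# Sketch: merge facts retrieved at different scopes, keeping the first
# occurrence of each id. Dicts stand in for the SDK's typed fact objects.
def merge_facts(*fact_lists):
    """Merge fact lists in priority order, deduplicating by id."""
    seen, merged = set(), []
    for facts in fact_lists:
        for fact in facts:
            if fact["id"] not in seen:
                seen.add(fact["id"])
                merged.append(fact)
    return merged
```

Pass the narrower scope first (e.g., `merge_facts(user_context.facts, customer_context.facts)`) so user-specific facts take priority over organization-wide ones.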

The types Filter

Use the types parameter to retrieve only specific memory types. This reduces response size and processing time when you only need certain kinds of context.
# Only retrieve facts and preferences
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    types=["facts", "preferences"],
    mode="fast"
)

# Only retrieve temporal/episode context
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    types=["episodes", "temporal"],
    mode="accurate"
)

# Explicitly request all types (same as omitting the parameter)
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    types=["all"],
    mode="fast"
)

Search Queries

The search_query parameter drives semantic search. Queries are embedded using Synap’s embedding model and matched against stored memory vectors.

Single Query

context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=["What are the user's dietary restrictions?"]
)

Multiple Queries

When you provide multiple queries, Synap runs each independently and merges the results with deduplication and re-ranking.
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    search_query=[
        "dietary preferences and restrictions",
        "favorite restaurants and cuisines",
        "food allergies"
    ],
    max_results=15
)
Use multiple queries to broaden recall when a single query might miss relevant memories. For example, a user asking “What should I eat?” might benefit from queries about dietary preferences, allergies, and favorite cuisines simultaneously.
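
One way to apply this is to fan a single user message out into related query variants before calling `fetch()`. The keyword map below is a deliberately naive illustration; in practice you might ask an LLM to generate the variants.

```python
# Sketch: broaden recall by expanding one user message into several related
# queries. The keyword trigger and variant list are illustrative only.
FOOD_QUERY_VARIANTS = [
    "dietary preferences and restrictions",
    "food allergies",
    "favorite restaurants and cuisines",
]

def expand_queries(user_message: str) -> list[str]:
    """Return the original message plus related query variants."""
    queries = [user_message]
    if any(word in user_message.lower() for word in ("eat", "food", "dinner")):
        queries.extend(FOOD_QUERY_VARIANTS)
    return queries
```

The resulting list can be passed directly as `search_query`; Synap merges and deduplicates the per-query results.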

No Query (Recency-Based)

When search_query is omitted, retrieval returns the most recent and contextually relevant memories for the conversation without semantic filtering.
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    max_results=5
)

The .raw Property

For forward compatibility with future Synap API changes, every response object exposes a .raw property containing the unprocessed API response as a dictionary.
context = await sdk.conversation.context.fetch(
    conversation_id="3f6b1a2c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
    mode="fast"
)

# Access raw response for forward compatibility
raw = context.raw
print(raw.keys())
The .raw property is useful when Synap adds new fields to the API response that have not yet been mapped to typed SDK properties. You can access new fields immediately without waiting for an SDK update.

Full Example: System Prompt Injection

The most common use case for context fetch is injecting contextual memories into your LLM’s system prompt.
from openai import AsyncOpenAI

openai = AsyncOpenAI()  # assumes OPENAI_API_KEY is set and `sdk` is an initialized Synap client

async def chat_with_memory(conversation_id: str, user_message: str):
    # Fetch relevant context
    context = await sdk.conversation.context.fetch(
        conversation_id=conversation_id,
        search_query=[user_message],
        max_results=10,
        mode="fast"
    )

    # Build memory context string
    memory_lines = []

    if context.facts:
        memory_lines.append("## Known Facts")
        for fact in context.facts:
            if fact.confidence >= 0.7:
                memory_lines.append(f"- {fact.content}")

    if context.preferences:
        memory_lines.append("\n## User Preferences")
        for pref in context.preferences:
            memory_lines.append(f"- {pref.content}")

    if context.episodes:
        memory_lines.append("\n## Relevant Past Interactions")
        for episode in context.episodes:
            memory_lines.append(f"- {episode.content}")

    memory_context = "\n".join(memory_lines) if memory_lines else "No prior context available."

    # Inject into system prompt
    system_prompt = f"""You are a helpful assistant with access to the user's memory.

Use the following context to personalize your responses. Do not mention
that you are reading from a memory system.

{memory_context}"""

    # Call LLM
    response = await openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message}
        ]
    )

    assistant_message = response.choices[0].message.content

    # Ingest the new conversation turn for future memory
    await sdk.memories.create(
        document=f"User: {user_message}\nAssistant: {assistant_message}",
        document_type="ai-chat-conversation",
        user_id="user_12345",
        mode="long-range"
    )

    return assistant_message
For long-running conversations, combine context.fetch() with get_context_for_prompt() for a hybrid approach: compacted history provides broad context while retrieval provides query-specific details. See the Context Compaction guide for examples of this pattern.
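
The prompt-assembly half of that hybrid pattern can be sketched with plain inputs. Here `compacted_history` and `facts` are ordinary strings and lists; in practice they would come from `get_context_for_prompt()` and `context.fetch()` respectively.

```python
# Sketch of the hybrid pattern: compacted history supplies broad context,
# retrieved facts supply query-specific detail. Inputs are plain values here.
def build_hybrid_prompt(compacted_history: str, facts: list[str], user_message: str) -> str:
    """Assemble a system-prompt body from compacted history plus retrieved facts."""
    sections = ["## Conversation So Far", compacted_history]
    if facts:
        sections += ["## Retrieved Details"] + [f"- {f}" for f in facts]
    sections += ["## Current Message", user_message]
    return "\n".join(sections)
```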

Next Steps

Entity Resolution

Learn how entities are resolved to improve retrieval quality.

Context Compaction

Compress context to reduce LLM token costs.

Ingestion

Feed more data into Synap to enrich retrieval results.

Memory Scopes

Understand how scope filtering affects retrieval boundaries.