Overview
Context fetch is how your AI agent gets relevant context from stored memories. When your agent needs to know a user’s preferences, recall past conversations, or understand organizational context, it calls the retrieval API. Synap returns a structured ContextResponse containing facts, preferences, episodes, and emotions ranked by relevance to the query.
Context fetch is designed to sit in your agent’s hot path — the fast mode targets 50-100ms latency, making it suitable for real-time conversation flows.
Conversation Context
The primary retrieval interface is sdk.conversation.context.fetch(). It returns context relevant to a specific conversation, enriched with memories from the user’s broader history.
Parameter Reference
- Conversation ID: The identifier of the conversation to retrieve context for. Must be a valid UUID string — non-UUID strings will cause a ServiceUnavailableError. This scopes the retrieval to memories associated with this conversation plus broader user/customer context.
- search_query: One or more semantic search queries that drive retrieval. The queries are embedded and matched against stored memories using vector similarity. When multiple queries are provided, results are merged and re-ranked. If omitted, retrieval returns the most recent and relevant context without semantic filtering.
- Result limit: Maximum number of results to return across all types. The actual number returned may be less if fewer relevant results exist.
- types: Filter which memory types to include in the response. Valid values: "facts", "preferences", "episodes", "emotions", "temporal", "all". If omitted, all types are returned.
- Retrieval mode: Controls the depth of search and ranking. See Retrieval Modes below.
Retrieval Modes
Synap offers two retrieval modes that trade off latency against comprehensiveness.
| Aspect | fast | accurate |
|---|---|---|
| Latency | ~50-100ms | ~200-500ms |
| Search method | Vector similarity only | Vector + graph traversal + re-ranking |
| Best for | Real-time chat, low-latency requirements | Complex queries, relationship-aware context |
| Relationship awareness | Limited | Full entity-relationship traversal |
| Ranking | Cosine similarity | Multi-signal ranking (similarity + recency + graph centrality) |
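The “multi-signal ranking” row can be made concrete with a toy scoring function. The weights, decay curve, and signal names here are illustrative assumptions, not Synap’s actual ranking formula.

```python
from datetime import datetime, timezone


def multi_signal_score(similarity, last_seen, centrality,
                       w_sim=0.6, w_recency=0.25, w_graph=0.15):
    """Toy blend of the three accurate-mode signals (weights assumed).

    similarity: cosine similarity in [0, 1]
    last_seen:  when the memory was last updated
    centrality: normalized graph centrality in [0, 1]
    """
    age_days = (datetime.now(timezone.utc) - last_seen).days
    recency = 1.0 / (1.0 + age_days)  # decays toward 0 as memories age
    return w_sim * similarity + w_recency * recency + w_graph * centrality


fresh = multi_signal_score(0.8, datetime.now(timezone.utc), 0.5)
stale = multi_signal_score(0.8, datetime(2020, 1, 1, tzinfo=timezone.utc), 0.5)
print(fresh > stale)  # True: equal similarity, but the fresher memory wins
```

The point of the sketch: two memories with identical vector similarity can rank differently once recency and graph centrality are blended in, which is what distinguishes accurate mode from pure cosine ranking.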
When to Use Each Mode
Use fast when...
- You are in a real-time conversation flow
- The query is about a single topic or entity
- Latency budget is under 100ms
- You are retrieving frequently (e.g., every turn)
Use accurate when...
- The query involves relationships between entities
- You need context spanning multiple conversations
- You are building a comprehensive summary
- Latency budget allows 200-500ms
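The guidance above can be condensed into a small helper. The thresholds mirror the bullets; the function itself is illustrative and not part of the SDK.

```python
def choose_mode(latency_budget_ms, relational_query=False,
                cross_conversation=False):
    """Pick a retrieval mode following the guidance above (illustrative)."""
    needs_depth = relational_query or cross_conversation
    if needs_depth and latency_budget_ms >= 200:
        return "accurate"  # graph traversal + re-ranking fits the budget
    return "fast"          # real-time path, ~50-100ms


print(choose_mode(80))                          # fast
print(choose_mode(500, relational_query=True))  # accurate
```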
Response Structure
The ContextResponse object contains structured memory types and metadata.
Facts
Facts are discrete, verified pieces of information extracted from memories. Each fact includes:
- Unique identifier for this fact.
- The fact content in natural language (e.g., “User is a vegetarian”).
- Confidence score between 0.0 and 1.0. Higher values indicate stronger evidence from source material.
- Identifier of the memory this fact was extracted from.
- Timestamp when this fact was extracted.
- Additional metadata about the extraction.
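The fact fields above can be modeled as a small record, for example when filtering by confidence before prompting. The attribute names below follow the descriptions but are assumptions, not confirmed SDK names.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Fact:
    """Illustrative shape of a fact record (attribute names assumed)."""
    id: str
    content: str
    confidence: float          # 0.0-1.0; higher = stronger evidence
    source_memory_id: str
    extracted_at: datetime
    metadata: dict = field(default_factory=dict)


facts = [
    Fact("f1", "User is a vegetarian", 0.95, "m1", datetime(2024, 5, 1)),
    Fact("f2", "User may prefer window seats", 0.40, "m2", datetime(2024, 5, 2)),
]

# Keep only well-supported facts before injecting them into a prompt.
trusted = [f for f in facts if f.confidence >= 0.7]
print([f.content for f in trusted])  # ['User is a vegetarian']
```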
Preferences
Preferences capture user likes, dislikes, and stated preferences. The content field expresses the preference in natural language (e.g., “Prefers boutique hotels over large chains”).
Episodes
Episodes represent summarized narrative segments from past interactions.
Emotions
Emotions capture detected emotional states and sentiment from interactions.
Response Metadata
Every ContextResponse includes metadata about the retrieval operation.
Unique ID for this retrieval request. Log this for debugging and support inquiries.
How long this response is valid in the local cache. After this duration, the SDK will fetch fresh results from Synap Cloud.
Whether this response was served from the local "cache" or fetched from "cloud". Cached responses are faster but may be slightly stale.
Whether context compaction was applied to the response. If true, the context has been compressed to fit within token budgets.
Scoped Retrieval
In addition to conversation-level retrieval, Synap provides scope-specific interfaces for retrieving memories at the user, customer, and client levels.
User Context
Retrieve all memories scoped to a specific user, across all their conversations.
Customer Context
Retrieve memories shared across all users within a customer (organization/tenant).
Client Context
Retrieve memories at the broadest scope — across all customers and users within your Synap client.
Scoped retrieval respects Synap’s scope hierarchy. User context includes user-scoped memories. Customer context includes customer-scoped memories visible to all users in that customer. Client context includes client-wide memories. Higher scopes never leak memories from narrower scopes unless those memories were explicitly created at the broader scope. See Memory Scopes for details.
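The hierarchy rules above can be sketched as a filter over tagged memories: each memory carries the scope it was created at, and a retrieval at a given level returns only memories created at that level. The data shapes and names here are illustrative.

```python
MEMORIES = [
    {"id": "m1", "scope": "user",     "owner": "user-1"},
    {"id": "m2", "scope": "customer", "owner": "acme"},
    {"id": "m3", "scope": "client",   "owner": "my-client"},
]


def fetch_scope(memories, scope, owner):
    """Return only memories created at the requested scope (illustrative).

    A customer-level fetch does not leak user-level memories: a memory is
    visible at a scope only if it was explicitly created at that scope.
    """
    return [m for m in memories
            if m["scope"] == scope and m["owner"] == owner]


print([m["id"] for m in fetch_scope(MEMORIES, "customer", "acme")])  # ['m2']
```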
The types Filter
Use the types parameter to retrieve only specific memory types. This reduces response size and processing time when you only need certain kinds of context.
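For example, with the hypothetical response shape used here, requesting only facts and preferences trims the payload. In the real API the filtering happens server-side; this sketch just shows the effect.

```python
response = {  # hypothetical ContextResponse rendered as a dict
    "facts": [{"content": "User is a vegetarian"}],
    "preferences": [{"content": "Prefers boutique hotels over large chains"}],
    "episodes": [{"summary": "Planned a trip last spring"}],
    "emotions": [{"label": "excited"}],
}


def apply_types_filter(response, types):
    """Keep only the requested memory types ('all' keeps everything)."""
    if "all" in types:
        return response
    return {k: v for k, v in response.items() if k in types}


slim = apply_types_filter(response, ["facts", "preferences"])
print(sorted(slim))  # ['facts', 'preferences']
```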
Search Queries
The search_query parameter drives semantic search. Queries are embedded using Synap’s embedding model and matched against stored memory vectors.
Single Query
Multiple Queries
When you provide multiple queries, Synap runs each independently and merges the results with deduplication and re-ranking.
No Query (Recency-Based)
When search_query is omitted, retrieval returns the most recent and contextually relevant memories for the conversation without semantic filtering.
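The merge-and-dedup behavior described under Multiple Queries can be sketched as follows; the per-query score lists and the re-ranking rule (keep each memory’s best score, sort descending) are illustrative, not Synap’s actual algorithm.

```python
def merge_results(per_query_results):
    """Merge ranked result lists from several queries, dedup by memory id,
    keep each memory's best score, then re-rank (illustrative)."""
    best = {}
    for results in per_query_results:
        for memory_id, score in results:
            if score > best.get(memory_id, 0.0):
                best[memory_id] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


query_a = [("m1", 0.92), ("m2", 0.71)]
query_b = [("m2", 0.88), ("m3", 0.64)]
print(merge_results([query_a, query_b]))
# [('m1', 0.92), ('m2', 0.88), ('m3', 0.64)]
```

Note that m2 appears in both result lists but survives once, with its higher score from query_b.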
The .raw Property
For forward compatibility with future Synap API changes, every response object exposes a .raw property containing the unprocessed API response as a dictionary.
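As a minimal sketch, assuming the typed response simply wraps the server payload as a dict (an assumption about SDK internals):

```python
class ContextResponseSketch:
    """Illustrative wrapper showing the .raw escape hatch."""

    def __init__(self, payload):
        self._payload = payload

    @property
    def raw(self):
        # Unprocessed API response; new server fields show up here
        # before the SDK maps them to typed properties.
        return self._payload


resp = ContextResponseSketch({"facts": [], "brand_new_field": {"x": 1}})
print(resp.raw["brand_new_field"])  # {'x': 1}
```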
The .raw property is useful when Synap adds new fields to the API response that have not yet been mapped to typed SDK properties. You can access new fields immediately without waiting for an SDK update.
Full Example: System Prompt Injection
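A minimal, self-contained sketch of the pattern: the response shape and the helper name are assumptions, and in a real flow the context dict would come from sdk.conversation.context.fetch() and the rendered prompt would be passed to your LLM client.

```python
def render_system_prompt(context, base="You are a helpful assistant."):
    """Flatten fetched context into a system prompt (illustrative)."""
    lines = [base, "", "What you know about this user:"]
    for fact in context.get("facts", []):
        lines.append(f"- Fact: {fact['content']}")
    for pref in context.get("preferences", []):
        lines.append(f"- Preference: {pref['content']}")
    return "\n".join(lines)


context = {  # hypothetical fetched ContextResponse as a dict
    "facts": [{"content": "User is a vegetarian"}],
    "preferences": [{"content": "Prefers boutique hotels over large chains"}],
}
prompt = render_system_prompt(context)
print(prompt.splitlines()[-1])
# - Preference: Prefers boutique hotels over large chains
```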
The most common use case for context fetch is injecting contextual memories into your LLM’s system prompt.
Next Steps
Entity Resolution
Learn how entities are resolved to improve retrieval quality.
Context Compaction
Compress context to reduce LLM token costs.
Ingestion
Feed more data into Synap to enrich retrieval results.
Memory Scopes
Understand how scope filtering affects retrieval boundaries.