Runtime ingestion is the process of feeding data into Synap as it is generated during live agent interactions. This is the primary ingestion path for most applications. After each conversation turn — or at the end of a conversation — your application calls the SDK to send the content through Synap’s ingestion pipeline, where it is extracted, resolved, and stored as structured memory.
Unlike bootstrap ingestion, which handles bulk historical data, runtime ingestion is designed for low-latency, non-blocking operation. Your agent never waits for ingestion to complete before responding to the user.
Runtime ingestion is asynchronous. The SDK call returns immediately with an ingestion_id. The pipeline processes the content in the background, and memories become available for retrieval within seconds (fast mode) to minutes (long-range mode).
The runtime ingestion flow integrates naturally into your agent’s conversation loop:
1. Your agent receives a user message
   The user sends a message to your agent through your application's interface: a chat widget, API, mobile app, or other channel.
2. Your agent retrieves context from Synap
   Before generating a response, the agent calls Synap to fetch relevant memories and context. This step is covered in detail in Agent Interactions.
3. Your agent generates and delivers a response
   The agent calls an LLM with the retrieved context and conversation history, generates a response, and sends it to the user.
4. Your agent ingests the conversation turn
   After the response is delivered, the agent calls sdk.memories.create() to ingest the conversation turn. This call returns immediately and does not block the user experience.
5. Synap processes in the background
   The ingestion pipeline extracts entities, resolves references, detects preferences, and stores structured memories. These memories become available for future retrieval queries.
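The ordering of the steps above can be sketched as a single turn handler. This is a minimal sketch, not Synap's reference implementation: `handle_turn` and its four callable parameters (`retrieve_context`, `generate_reply`, `deliver`, `ingest`) are hypothetical names introduced here so the sequence can be shown without assuming any SDK signature beyond what this page describes. In a real agent, `ingest` would presumably wrap sdk.memories.create().

```python
from typing import Awaitable, Callable


async def handle_turn(
    user_message: str,
    retrieve_context: Callable[[str], Awaitable[str]],
    generate_reply: Callable[[str, str], Awaitable[str]],
    deliver: Callable[[str], Awaitable[None]],
    ingest: Callable[[str], Awaitable[None]],
) -> str:
    """Run one conversation turn in the order described above (hypothetical helper)."""
    # Step 2: fetch relevant memories before calling the LLM.
    context = await retrieve_context(user_message)

    # Step 3: generate the reply and deliver it to the user first.
    reply = await generate_reply(user_message, context)
    await deliver(reply)

    # Step 4: ingest the complete turn only after the user has the response.
    turn = f"User: {user_message}\nAssistant: {reply}"
    await ingest(turn)
    return reply
```

Passing the four operations in as callables keeps the ordering testable with fakes and keeps the handler independent of any particular retrieval or LLM client.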
The core SDK method for runtime ingestion is sdk.memories.create():
    from synap import Synap

    sdk = Synap(api_key="your_api_key")

    result = await sdk.memories.create(
        document="User: Can you remind me what we decided about the migration timeline?\n"
                 "Assistant: In our last conversation, you and the team agreed to begin the "
                 "database migration on March 15th, with a two-week buffer for testing.",
        document_type="ai-chat-conversation",
        user_id="user_123",
        customer_id="acme_corp",
        mode="fast"
    )

    print(f"Ingestion ID: {result.ingestion_id}")
    # Returns immediately -- processing happens in the background
For most agent applications, you will use ai-chat-conversation almost exclusively during runtime. Other document types are more common in bootstrap ingestion or specialized pipelines.
Runtime ingestion supports two processing modes that control the depth of extraction:
Fast mode
Performs basic chunking, lightweight entity extraction, and vector embedding. Processing completes in seconds. Best for real-time chat logging where speed matters more than extraction depth.

    await sdk.memories.create(
        document="User: What time is the standup?\nAssistant: Daily standup is at 9:30 AM Pacific.",
        document_type="ai-chat-conversation",
        user_id="user_123",
        customer_id="acme_corp",
        mode="fast"
    )

Use fast mode for: routine conversations, high-throughput pipelines, non-critical context.
Long-range mode
Runs the full extraction pipeline: deep entity resolution, relationship mapping, preference detection, emotional analysis, and advanced categorization. Processing takes seconds to minutes. Best for important conversations that should be thoroughly analyzed.

    await sdk.memories.create(
        document="User: We need to revisit the Project Atlas timeline. Sarah from engineering "
                 "said the Q3 deadline is not feasible. Can we push to Q4 and bring in two "
                 "more engineers from the infrastructure team?\n"
                 "Assistant: I'll note that. So the proposal is to shift Project Atlas to Q4 "
                 "and augment the team with infrastructure engineers. Should I also note "
                 "Sarah's concern about the original timeline?",
        document_type="ai-chat-conversation",
        user_id="user_123",
        customer_id="acme_corp",
        mode="long-range"
    )

Use long-range mode for: strategic conversations, onboarding sessions, complex discussions with multiple entities and decisions.
The user_id and customer_id parameters determine where the memory is stored in the scope hierarchy. This directly affects who can retrieve the memory later.
    # User-scoped: only visible when retrieving for this specific user
    await sdk.memories.create(
        document="User: I prefer bullet points over paragraphs.\nAssistant: Got it!",
        document_type="ai-chat-conversation",
        user_id="user_123",
        customer_id="acme_corp",
        mode="fast"
    )

    # Customer-scoped: visible to all users in the organization
    await sdk.memories.create(
        document="Company handbook update: All PTO requests must be submitted two weeks in advance.",
        document_type="document",
        customer_id="acme_corp",
        mode="long-range"
    )

    # Client-scoped: visible to all users across all customers
    await sdk.memories.create(
        document="Product changelog: Version 2.5 adds support for custom webhooks.",
        document_type="document",
        mode="fast"
    )
For a full explanation of the scope hierarchy, see Memory Scopes.
Always format conversations with clear speaker labels (User: and Assistant:). The ingestion pipeline uses these labels to identify who said what, which is critical for accurate preference detection and entity attribution.
    # Good: clear speaker labels
    document = "User: I need the report by Friday.\nAssistant: I'll have it ready by Thursday evening."

    # Bad: no speaker context
    document = "I need the report by Friday. I'll have it ready by Thursday evening."
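If your application stores turns as structured messages rather than a single string, a small formatter can produce the labeled transcript. The sketch below assumes a role/content message shape (common in chat APIs, but not something this page specifies); `format_conversation` and `SPEAKER_LABELS` are hypothetical names, not part of the Synap SDK.

```python
from typing import Iterable, Mapping

# Hypothetical mapping from chat roles to the speaker labels recommended above.
SPEAKER_LABELS = {"user": "User", "assistant": "Assistant"}


def format_conversation(messages: Iterable[Mapping[str, str]]) -> str:
    """Render role/content messages as 'User:' / 'Assistant:' labeled lines."""
    lines = []
    for message in messages:
        role = message["role"]
        if role not in SPEAKER_LABELS:
            # Skip system or tool messages rather than guessing a label.
            continue
        lines.append(f"{SPEAKER_LABELS[role]}: {message['content'].strip()}")
    return "\n".join(lines)
```

The returned string can be passed as the document argument of an ingestion call.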
Use consistent user and customer IDs
Ensure that user_id and customer_id are consistent across all ingestion calls for the same user and organization. Inconsistent IDs fragment the memory store, creating isolated pockets of context that cannot be retrieved together. Derive these IDs from your application’s auth system.
Use fast mode for routine real-time chat
For standard conversational turns, fast mode provides the best balance of speed and extraction quality. Reserve long-range mode for high-value conversations where deep extraction justifies the additional processing time.
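One way to apply this rule consistently is to centralize the mode decision in a single function. The sketch below is illustrative only: the tag names in `HIGH_VALUE_TAGS` and the `select_mode` helper are hypothetical, and how a conversation gets tagged as high-value is left to your application.

```python
from typing import Iterable

# Hypothetical tags an application might attach to a conversation.
HIGH_VALUE_TAGS = frozenset({"onboarding", "strategy", "planning"})


def select_mode(conversation_tags: Iterable[str]) -> str:
    """Return "long-range" for high-value conversations, otherwise "fast"."""
    if HIGH_VALUE_TAGS & set(conversation_tags):
        return "long-range"
    return "fast"
```

The result can be passed directly as the mode argument, so routine turns default to fast mode unless something explicitly marks them as worth deeper extraction.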
Ingest after the response is delivered
Always ingest the conversation turn after your agent has responded, not before. This ensures the ingested content includes both the user message and the agent’s response, providing complete context for future retrieval.
Handle ingestion errors gracefully
Runtime ingestion should never block or crash your agent. Wrap ingestion calls in error handling and log failures for later investigation. A missed ingestion is recoverable; a crashed agent is not.
    try:
        await sdk.memories.create(
            document=conversation_turn,
            document_type="ai-chat-conversation",
            user_id=user_id,
            customer_id=customer_id,
            mode="fast"
        )
    except Exception as e:
        logger.warning(f"Ingestion failed, will retry: {e}")
        # Queue for retry or log for manual follow-up
Set document_id for idempotent retries
If you implement retry logic for failed ingestion calls, always set a document_id to prevent duplicate memories. Derive the ID from your conversation or message identifier.
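A minimal sketch of this pattern follows, under a few stated assumptions: `make_document_id` and `ingest_with_retry` are hypothetical helpers, `document_id` is assumed to be accepted as a keyword argument of sdk.memories.create() (the text above names the field but not the signature), and the broad `except Exception` stands in for whatever error types the SDK actually raises.

```python
import asyncio
import uuid


def make_document_id(conversation_id: str, turn_index: int) -> str:
    """Derive a stable ID so a retried turn maps to the same document."""
    name = f"conversation:{conversation_id}:turn:{turn_index}"
    # uuid5 is deterministic: the same name always yields the same UUID.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, name))


async def ingest_with_retry(sdk, document, document_id, attempts=3, base_delay=0.5, **scope):
    """Retry sdk.memories.create() with the same document_id on every attempt."""
    for attempt in range(attempts):
        try:
            return await sdk.memories.create(
                document=document,
                document_type="ai-chat-conversation",
                document_id=document_id,  # assumed keyword name, see lead-in
                mode="fast",
                **scope,  # e.g. user_id=..., customer_id=...
            )
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller log or queue the failure
            # Exponential backoff before the next attempt.
            await asyncio.sleep(base_delay * 2 ** attempt)
```

Because the ID is derived from the conversation and turn identifiers rather than generated per call, every retry of the same turn carries the same `document_id`.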