Ingestion is how raw content becomes structured memory in Synap. Whether you feed in conversation turns as they happen or load historical data in bulk, every document passes through the same pipeline — categorized, extracted, chunked, resolved against existing entities, and stored. This page explains that shared pipeline once, then covers the two ingestion paths: runtime (live, per-turn) and bootstrap (bulk, backfill).
Your application produces content all the time — live conversations, support tickets, product docs, CRM records. Ingestion is the process that turns that raw content into structured, retrievable memory. There are two paths into Synap, and they share the same underlying pipeline:
Runtime ingestion feeds content in as it is generated during live agent interactions, one turn at a time.
Bootstrap ingestion loads pre-existing data in bulk — historical conversations, documentation, migrations from another system.
Both paths converge on the same processing pipeline. The difference is how you call them and what defaults make sense for each.
Every document you send — via either path — flows through the same stages before it becomes a memory you can retrieve:
1
Categorization
The pipeline reads the document’s document_type and selects the appropriate extraction logic. A chat conversation, an email, and a PDF are each processed differently.
2
Extraction
Synap analyzes the content to pull out the things worth remembering: facts, decisions, preferences, action items, and the entities involved. The depth of this step depends on the ingestion mode (see below).
3
Chunking
Larger documents are segmented into coherent units so that retrieval can return precise, relevant passages rather than whole documents.
4
Entity resolution
Extracted entities (people, projects, organizations) are matched against entities already in the store, so references to “Sarah” or “Project Atlas” link to the same entity across many documents. See Entity Resolution.
5
Storage
The resulting structured memories are written to the correct scope, indexed for both vector and graph retrieval, and made available to future queries.
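The five stages can be pictured as a chain of small functions. The sketch below is a toy model for intuition only and is not Synap's implementation: every function name, the substring-based extraction rule, and the in-memory `ENTITIES` and `STORE` objects are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Memory:
    text: str
    entity_ids: list[str] = field(default_factory=list)


# Toy in-memory stand-ins for the entity store and the memory store (assumptions).
ENTITIES: dict[str, str] = {}  # normalized name -> entity id
STORE: list[Memory] = []


def categorize(document_type: str) -> str:
    # Stage 1: choose processing logic from document_type.
    return "dialogue" if document_type == "ai-chat-conversation" else "generic"


def extract_mentions(document: str, candidates: list[str]) -> list[str]:
    # Stage 2 (toy): report which candidate names appear in the text.
    lowered = document.lower()
    return [name for name in candidates if name.lower() in lowered]


def chunk(document: str, strategy: str) -> list[str]:
    # Stage 3 (toy): one chunk per line for dialogue, one per paragraph otherwise.
    separator = "\n" if strategy == "dialogue" else "\n\n"
    return [part.strip() for part in document.split(separator) if part.strip()]


def resolve(mention: str) -> str:
    # Stage 4: reuse the id of a known entity, or register a new one.
    key = mention.strip().lower()
    if key not in ENTITIES:
        ENTITIES[key] = f"entity_{len(ENTITIES) + 1}"
    return ENTITIES[key]


def ingest(document: str, document_type: str, candidates: list[str]) -> list[Memory]:
    strategy = categorize(document_type)
    mentions = extract_mentions(document, candidates)
    chunks = chunk(document, strategy)
    entity_ids = [resolve(mention) for mention in mentions]
    memories = [Memory(text=c, entity_ids=entity_ids) for c in chunks]
    STORE.extend(memories)  # Stage 5: persist the structured memories.
    return memories
```

Ingesting two documents that both mention "Sarah" (in any casing) yields memories that share one entity id, which is the property entity resolution is meant to provide.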
Ingestion is asynchronous on both paths. The SDK call returns quickly with an identifier, and processing continues in the background. Memories become available for retrieval once the pipeline finishes — fast mode completes sooner than long-range mode.
A single parameter, mode, controls how deeply the extraction stage analyzes each document. The same two modes are available on both ingestion paths.
fast
Performs lightweight chunking, entity extraction, and vector embedding. Best when speed matters more than extraction depth: routine conversational turns, high-throughput logging, non-critical context.
long-range
Runs the full extraction pipeline: deep entity resolution, relationship mapping, preference detection, and advanced categorization. Best for high-value content that deserves thorough analysis. This is the default.
The ingestion mode you pick here is distinct from the retrieval mode you pick when querying. For how depth maps to query behavior (fast = vector + graph; accurate = vector + graph + LLM subquery decomposition + reranking), see Retrieval Modes.
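One way to keep the ingestion mode choice consistent across a codebase is to centralize it in a small helper. This is a minimal sketch, not part of the SDK: `choose_mode`, the `path` labels, and the `high_value` flag are hypothetical names. Only the two mode strings come from this page.

```python
VALID_MODES = ("fast", "long-range")


def choose_mode(path: str, high_value: bool = False) -> str:
    """Pick an ingestion mode for a call.

    path is "runtime" or "bootstrap" (labels used on this page, not SDK values).
    """
    if path not in ("runtime", "bootstrap"):
        raise ValueError(f"unknown ingestion path: {path!r}")
    # Bulk loads and high-value content get the full extraction pipeline.
    if path == "bootstrap" or high_value:
        return "long-range"
    # Routine live turns favor speed over extraction depth.
    return "fast"
```

The returned string would be passed as the `mode` argument of an ingestion call.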
user_id and customer_id determine where a memory is stored in the scope hierarchy, which in turn controls who can retrieve it later. On B2C agents, customer_id is resolved automatically and you only pass user_id; on B2B agents you pass both. The same scoping rules apply on both ingestion paths. See Memory Scopes for the full hierarchy.
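The B2C versus B2B rule can be enforced in one place before any ingestion call. The helper below is a sketch under stated assumptions: `scope_kwargs` and its `b2b` flag are hypothetical, and how your application knows whether an agent is B2B is up to you.

```python
from typing import Optional


def scope_kwargs(
    user_id: Optional[str],
    customer_id: Optional[str] = None,
    *,
    b2b: bool,
) -> dict:
    """Build the scoping arguments for an ingestion call."""
    kwargs = {}
    if user_id:
        kwargs["user_id"] = user_id
    if b2b:
        # B2B agents pass the customer explicitly.
        if not customer_id:
            raise ValueError("B2B agents must pass customer_id")
        kwargs["customer_id"] = customer_id
    # On B2C agents customer_id is resolved automatically, so it is omitted.
    return kwargs
```

The result can be unpacked into the call, for example `create(document=..., **scope_kwargs("user_123", "acme_corp", b2b=True))`.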
Runtime ingestion feeds content into Synap as it is generated during live agent interactions. This is the primary ingestion path for most applications. After each conversation turn (or at the end of a conversation) your application calls sdk.memories.create() to send the turn through the pipeline. The call returns immediately, so your agent never waits for ingestion before responding to the user. fast is the natural choice here, but you must pass it explicitly because mode defaults to long-range.
1
Your agent receives a user message
Through a chat widget, API, mobile app, or other channel.
2
Your agent retrieves context from Synap
Before responding, the agent fetches relevant memories. See Context End to End.
3
Your agent generates and delivers a response
It calls an LLM with the retrieved context and conversation history, then replies to the user.
4
Your agent ingests the turn
After the response is delivered, the agent calls sdk.memories.create(). This returns immediately and does not block the user experience.
5
Synap processes in the background
The pipeline extracts, resolves, and stores structured memories, which become available for future retrieval.
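The five steps above can be sketched as a single turn handler. This is an illustrative outline rather than the SDK's prescribed pattern: `retrieve_context`, `generate_reply`, and `ingest_turn` are hypothetical stand-ins for your retrieval call, your LLM call, and `sdk.memories.create()`. Scheduling ingestion as a background task is one possible design that keeps even the request round trip off the response path.

```python
import asyncio

INGESTED = []  # records what the stand-in ingestion call received


async def retrieve_context(user_id: str, message: str) -> list:
    # Stand-in for fetching relevant memories before responding.
    return ["(retrieved memory)"]


async def generate_reply(message: str, context: list) -> str:
    # Stand-in for the LLM call.
    return f"Echo: {message}"


async def ingest_turn(user_id: str, document: str) -> None:
    # Stand-in for sdk.memories.create(...); the sleep simulates network latency.
    await asyncio.sleep(0.01)
    INGESTED.append(document)


async def handle_turn(user_id: str, message: str):
    context = await retrieve_context(user_id, message)   # step 2
    reply = await generate_reply(message, context)       # step 3
    document = f"User: {message}\nAssistant: {reply}"
    # Step 4: schedule ingestion without awaiting it, so the reply is not delayed.
    task = asyncio.create_task(ingest_turn(user_id, document))
    return reply, task
```

The caller should keep a reference to the returned task (or await it during shutdown) so that it is not garbage-collected before it completes.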
import uuid

from maximem_synap import MaximemSynapSDK

sdk = MaximemSynapSDK(api_key="your_api_key")

result = await sdk.memories.create(
    document="User: Can you remind me what we decided about the migration timeline?\n"
             "Assistant: In our last conversation, you and the team agreed to begin the "
             "database migration on March 15th, with a two-week buffer for testing.",
    document_type="ai-chat-conversation",
    user_id="user_123",
    customer_id="acme_corp",  # B2B only -- auto-resolved on B2C
    mode="fast",
)

print(f"Ingestion ID: {result.ingestion_id}")
# Returns immediately -- processing happens in the background
| Parameter | Required | Description |
| --- | --- | --- |
| document | Yes | The text to ingest. For conversations, include both user and assistant messages with speaker labels. |
| document_type | Yes | The type of content. Determines how the pipeline processes the document. |
| user_id | No | The user this content belongs to. Determines user-scope storage. |
| customer_id | No | The customer organization (B2B only; auto-resolved on B2C). Determines customer-scope storage. |
| mode | No | fast or long-range. Defaults to long-range; runtime callers typically pass fast. |
| document_id | No | Unique identifier for idempotency. Prevents duplicate ingestion on retry. |
| document_created_at | No | Timestamp override. Defaults to the current time for runtime ingestion (usually correct). |
Always format conversations with clear User: and Assistant: labels. The pipeline uses them to identify who said what, which is critical for accurate preference detection and entity attribution.
# Good: clear speaker labels
document = "User: I need the report by Friday.\nAssistant: I'll have it ready by Thursday evening."

# Bad: no speaker context
document = "I need the report by Friday. I'll have it ready by Thursday evening."
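If your application stores turns as structured messages, a small formatter can produce the labeled text consistently. This is a minimal sketch: `format_conversation` and the `(role, text)` tuple shape are assumptions about your own data model, not part of the SDK.

```python
def format_conversation(messages: list) -> str:
    """Render (role, text) pairs as labeled lines, one speaker per line."""
    labels = {"user": "User", "assistant": "Assistant"}
    lines = []
    for role, text in messages:
        label = labels.get(role.lower())
        if label is None:
            # Fail loudly rather than ingest a turn with an ambiguous speaker.
            raise ValueError(f"unsupported role: {role!r}")
        lines.append(f"{label}: {text.strip()}")
    return "\n".join(lines)
```

The returned string is suitable as the `document` argument for a conversation.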
Use consistent user and customer IDs
Keep user_id (and customer_id on B2B) consistent across all calls for the same user and organization. Inconsistent IDs fragment the store into isolated pockets of context that cannot be retrieved together. Derive them from your auth system.
Ingest after the response is delivered
Ingest the turn after your agent has responded, not before, so the ingested content includes both the user message and the agent’s reply — complete context for future retrieval.
Handle ingestion errors gracefully
Runtime ingestion should never block or crash your agent. Wrap calls in error handling and log failures. A missed ingestion is recoverable; a crashed agent is not.
try:
    await sdk.memories.create(
        document=conversation_turn,
        document_type="ai-chat-conversation",
        user_id=user_id,
        customer_id=customer_id,
        mode="fast",
    )
except Exception as e:
    logger.warning(f"Ingestion failed, will retry: {e}")
    # Queue for retry or log for manual follow-up
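The "queue for retry" step could take the form of a bounded retry loop with exponential backoff. The sketch below is one possible approach, not SDK behavior: `ingest_with_retry` is a hypothetical helper, and `send` is assumed to be a zero-argument async callable that wraps your ingestion call.

```python
import asyncio
import logging

logger = logging.getLogger("ingestion")


async def ingest_with_retry(send, *, attempts: int = 3, base_delay: float = 0.5) -> bool:
    """Call send() up to `attempts` times; return True on success, False otherwise."""
    for attempt in range(1, attempts + 1):
        try:
            await send()
            return True
        except Exception as exc:
            # Never let an ingestion failure propagate into the agent.
            logger.warning("Ingestion attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                # Exponential backoff: base_delay, 2x, 4x, ...
                await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    return False
```

A `False` result is the signal to log the turn for manual follow-up. Pair retries with a stable `document_id` so a repeated call does not create duplicate memories.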
Set document_id for idempotent retries
If you retry failed ingestion calls, set a document_id to prevent duplicate memories. Derive it from your conversation or message identifier.
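A deterministic identifier guarantees that a retry carries the same `document_id` as the original attempt. The sketch below derives one with a name-based UUID. The namespace URL and the `conversation_id:turn_index` naming scheme are assumptions for illustration; any stable scheme built from your own identifiers works.

```python
import uuid

# Hypothetical namespace for this application's ingestion ids.
INGESTION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/synap-ingestion")


def turn_document_id(conversation_id: str, turn_index: int) -> str:
    """Return the same id every time for the same conversation turn."""
    return str(uuid.uuid5(INGESTION_NAMESPACE, f"{conversation_id}:{turn_index}"))
```

Because the id depends only on the conversation and turn, it can be recomputed at retry time instead of being stored alongside the failed request.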
Bootstrap ingestion loads pre-existing data into Synap in bulk. Before your agent goes live (or alongside live operation) you often need to seed it with historical context: past conversations, product documentation, knowledge base articles, customer records. You call sdk.memories.batch_create() with many documents at once. Because this data is historical and processed in the background, long-range is the natural default.
Use bootstrap ingestion whenever you need to load a significant volume of existing data:
Migrating from another system — a custom memory solution, a competing product, or an in-house knowledge base.
Loading historical conversations — past chat logs, support tickets, or email threads.
Seeding product documentation — docs, FAQs, help center articles, internal wikis.
Backfilling customer data — CRM records, customer profiles, organizational context.
Populating shared knowledge — company policies, SOPs, reference material at customer or client scope.
Bootstrap loads run at BOOTSTRAP priority in the ingestion queue: below real-time ingestion but above maintenance tasks. Your live agent keeps operating normally while historical data is processed, and you do not need to finish bootstrapping before going live. The two paths can run simultaneously, with real-time ingestion served first.
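The ordering described above can be modeled with a small priority queue. This toy is only an illustration and does not describe how Synap's queue is built: the `Priority` levels other than BOOTSTRAP are assumed names, and `IngestionQueue` is hypothetical.

```python
import heapq
from enum import IntEnum
from itertools import count


class Priority(IntEnum):
    REALTIME = 0      # assumed name for live, per-turn ingestion
    BOOTSTRAP = 1     # bulk loads
    MAINTENANCE = 2   # assumed name for background upkeep


class IngestionQueue:
    def __init__(self):
        self._heap = []
        self._sequence = count()  # tie-breaker keeps FIFO order within a priority

    def put(self, priority: Priority, job: str) -> None:
        heapq.heappush(self._heap, (priority, next(self._sequence), job))

    def get(self) -> str:
        # Lowest priority value first, oldest first within the same priority.
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

In this model a live turn submitted after a large backfill is still dequeued before the remaining backfill jobs.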
sdk.memories.batch_create() accepts multiple documents in a single call, reducing per-call overhead and enabling server-side throughput optimizations. Each document is a CreateMemoryRequest supporting the same fields as sdk.memories.create().
from maximem_synap import MaximemSynapSDK, CreateMemoryRequest

sdk = MaximemSynapSDK(api_key="your_api_key")

result = await sdk.memories.batch_create(
    documents=[
        CreateMemoryRequest(
            document="User: How do I reset my password?\nAssistant: Go to Settings > Security > Reset Password.",
            document_type="ai-chat-conversation",
            document_id="migration_001",
            document_created_at="2024-03-15T10:30:00Z",
            user_id="user_456",
            customer_id="acme_corp",  # B2B only -- auto-resolved on B2C
            mode="long-range",
        ),
        CreateMemoryRequest(
            document="User: What integrations do you support?\nAssistant: We support Slack, Jira, and GitHub.",
            document_type="ai-chat-conversation",
            document_id="migration_002",
            document_created_at="2024-03-16T14:20:00Z",
            user_id="user_789",
            customer_id="acme_corp",
            mode="long-range",
        ),
    ],
    fail_fast=False,
)

print(f"Succeeded: {result.succeeded}")
print(f"Failed: {result.failed}")
| Parameter | Required | Description |
| --- | --- | --- |
| documents | Yes | List of CreateMemoryRequest objects to ingest (max 100 per call). |
| fail_fast | No | If True, the whole batch fails on the first error. If False (default), errors are collected and returned alongside successful results. |
Each CreateMemoryRequest supports the same fields as a single create call — including document_id for idempotency and document_created_at for temporal accuracy.
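Preparing a backfill usually means mapping source rows onto these fields and splitting the result into calls of at most 100 documents. The sketch below shows one way to do that with plain dictionaries. The source row keys (`id`, `text`, `created`, `user_id`) are hypothetical, and it is an assumption that the resulting dictionaries can be passed as keyword arguments to `CreateMemoryRequest`.

```python
from datetime import datetime, timezone

MAX_BATCH_SIZE = 100  # per-call limit stated in the table above


def to_iso8601(epoch_seconds: float) -> str:
    """Convert a Unix timestamp to an ISO 8601 UTC string."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def to_request_fields(row: dict) -> dict:
    """Map one source row (hypothetical schema) onto ingestion fields."""
    return {
        "document": row["text"],
        "document_type": "ai-chat-conversation",
        "document_id": f"migration_{row['id']}",      # stable id from the source key
        "document_created_at": to_iso8601(row["created"]),  # original time, not load time
        "user_id": row["user_id"],
        "mode": "long-range",
    }


def batches(items: list, size: int = MAX_BATCH_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each yielded slice corresponds to one `batch_create()` call.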
1
Preserve original timestamps
Always set document_created_at to the document’s original creation time. Without it, Synap defaults to the ingestion time, which distorts temporal ordering. If a user later asks “What did we discuss last March?”, accurate timestamps are essential for correct retrieval.
2
Use document IDs for idempotency
Assign a unique document_id to every document, ideally derived from your source system’s primary key (e.g. migration_{source_id}). If a batch is interrupted, you can safely retry it — already-ingested documents are skipped, preventing duplicates and making it easy to trace memories back to their source.
3
Use long-range mode for historical data
Bootstrap data benefits from thorough extraction. long-range (the default for batch) performs deep entity resolution, relationship mapping, and preference detection. The extra processing time is fine for background loads.
4
Validate and organize before loading
Clean your historical data first: drop empty conversations, strip PII you should not store, ensure timestamps are ISO 8601, and confirm each document carries the correct scope. Incorrect scoping is difficult to fix later — you would have to re-ingest. Start with a small test batch and verify scoping, timestamps, and entity resolution before the full load.
5
Monitor ingestion progress
For large loads, track progress with sdk.memories.status().
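The arguments and return shape of the status call are not shown on this page, so the sketch below stays generic. `wait_for_ingestion` is a hypothetical helper that polls any async callable returning a state string, and the state names `"completed"` and `"failed"` are assumptions. In practice `fetch_status` would wrap the real status call.

```python
import asyncio

TERMINAL_STATES = {"completed", "failed"}  # assumed state names


async def wait_for_ingestion(fetch_status, *, interval: float = 2.0, timeout: float = 300.0) -> str:
    """Poll fetch_status() until it reports a terminal state or the timeout passes."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        state = await fetch_status()
        if state in TERMINAL_STATES:
            return state
        if loop.time() >= deadline:
            raise TimeoutError(f"ingestion still {state!r} after {timeout} seconds")
        # Wait between polls so monitoring does not add load of its own.
        await asyncio.sleep(interval)
```

For a large load, polling per batch rather than per document keeps the number of status requests small.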
Keep concurrency modest. While the API accepts many parallel batch requests, excessive concurrency causes queue backpressure and slows processing for all ingestion types. Add a short delay between batches during the initial bulk load.
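A semaphore is one simple way to cap how many batch requests are in flight at once. This is a minimal sketch with assumed defaults: `submit_all` is hypothetical, `submit` stands in for a coroutine that sends one batch (for example a wrapper around `batch_create()`), and the concurrency and delay values are placeholders to tune for your load.

```python
import asyncio


async def submit_all(batches, submit, *, max_concurrency: int = 2, delay: float = 0.5) -> list:
    """Send every batch with at most max_concurrency requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(batch):
        async with semaphore:
            result = await submit(batch)
            # Short pause before releasing the slot, to space out requests.
            await asyncio.sleep(delay)
            return result

    # gather preserves input order, so results line up with batches.
    return await asyncio.gather(*(worker(batch) for batch in batches))
```

Raising `max_concurrency` gradually while watching processing times is a safer way to find a workable setting than starting high.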