Unlike bootstrap ingestion, which handles bulk historical data, runtime ingestion is designed for low-latency, non-blocking operation. Your agent never waits for ingestion to complete before responding to the user.
Runtime ingestion is asynchronous. The SDK call returns immediately with an ingestion_id. The content is processed in the background, and memories become available for retrieval shortly after.

How it works

The runtime ingestion flow integrates naturally into your agent’s conversation loop:
1. Your agent receives a user message

The user sends a message to your agent through your application's interface: a chat widget, API, mobile app, or other channel.

2. Your agent retrieves context from Synap

Before generating a response, the agent calls Synap to fetch relevant memories and context. This step is covered in detail in Agent Interactions.

3. Your agent generates and delivers a response

The agent calls an LLM with the retrieved context and conversation history, generates a response, and sends it to the user.

4. Your agent ingests the conversation turn

After the response is delivered, the agent calls sdk.memories.create() to ingest the conversation turn. This call returns immediately; it does not block the user experience.

5. Synap processes in the background

Synap analyzes the content and stores relevant memories. These memories become available for future retrieval queries.

The SDK call

The core SDK method for runtime ingestion is sdk.memories.create():
from synap import Synap

sdk = Synap(api_key="your_api_key")

result = await sdk.memories.create(
    document="User: Can you remind me what we decided about the migration timeline?\n"
             "Assistant: In our last conversation, you and the team agreed to begin the "
             "database migration on March 15th, with a two-week buffer for testing.",
    document_type="ai-chat-conversation",
    user_id="user_123",
    customer_id="acme_corp",
    mode="fast"
)

print(f"Ingestion ID: {result.ingestion_id}")
# Returns immediately -- processing happens in the background

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| document | Yes | The text content to ingest. For conversations, include both user and assistant messages with speaker labels. |
| document_type | Yes | The type of content being ingested. Determines how Synap processes the document. |
| user_id | No | The user this content belongs to. Determines user-scope storage. |
| customer_id | No | The customer organization. Determines customer-scope storage. |
| mode | No | Ingestion mode: fast or long-range. Defaults to fast for runtime ingestion. |
| document_id | No | Unique identifier for idempotency. Prevents duplicate ingestion on retry. |
| document_created_at | No | Timestamp override. Defaults to the current time for runtime ingestion (usually correct). |
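When ingesting a conversation that happened in the past (for example, replaying a missed turn), document_created_at lets you preserve the original timestamp instead of the default "now". A minimal sketch of building such a timestamp; the exact accepted format is an assumption here, so check the SDK reference:

```python
from datetime import datetime, timezone

# When the conversation actually occurred (UTC). Passing this as
# document_created_at keeps replayed memories in chronological order.
occurred_at = datetime(2024, 3, 1, 14, 30, tzinfo=timezone.utc)

# An ISO 8601 string is a common wire format for timestamp fields.
document_created_at = occurred_at.isoformat()
print(document_created_at)  # 2024-03-01T14:30:00+00:00
```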

Document types

Synap supports a range of document types, each with specialized extraction logic:
| Document Type | Description |
| --- | --- |
| ai-chat-conversation | Chat conversations between a user and an AI agent. The most common type for runtime ingestion. |
| document | General text documents, articles, or notes. |
| email | Email content, including headers and body. |
| pdf | PDF document content (text extracted). |
| image | Image descriptions or OCR-extracted text. |
| audio | Transcribed audio content. |
| meeting-transcript | Meeting transcriptions with multiple speakers. |
For most agent applications, you will use ai-chat-conversation almost exclusively during runtime. Other document types are more common in bootstrap ingestion or specialized workflows.
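For multi-speaker types such as meeting-transcript, the same speaker-labeling principle applies as for chat conversations: prefix each line with the speaker's name so statements can be attributed correctly. A sketch of assembling such a document (the transcript content and the commented-out SDK call are illustrative):

```python
# Hypothetical transcript turns; speaker labels help attribute
# statements, just as "User:"/"Assistant:" labels do for chat.
turns = [
    ("Alice", "Let's lock the migration date."),
    ("Bob", "March 15th works, with a two-week testing buffer."),
]
document = "\n".join(f"{speaker}: {text}" for speaker, text in turns)

# Then ingest with the matching document type, e.g.:
# await sdk.memories.create(document=document,
#                           document_type="meeting-transcript",
#                           customer_id="acme_corp", mode="long-range")
print(document)
```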

Ingestion modes

Runtime ingestion supports two processing modes that control the depth of extraction.

Fast mode is optimized for speed. It is the best choice for real-time chat logging, where low latency matters more than extraction depth.
await sdk.memories.create(
    document="User: What time is the standup?\nAssistant: Daily standup is at 9:30 AM Pacific.",
    document_type="ai-chat-conversation",
    user_id="user_123",
    customer_id="acme_corp",
    mode="fast"
)
Use fast mode for: routine conversations, high-throughput workloads, non-critical context.

Long-range mode performs deeper extraction at the cost of additional processing time. Reserve it for high-value content where extraction depth matters more than latency.

For a detailed comparison of the two modes, see Fast Mode.

Scoping

The user_id and customer_id parameters determine where the memory is stored in the scope hierarchy. This directly affects who can retrieve the memory later.
# User-scoped: only visible when retrieving for this specific user
await sdk.memories.create(
    document="User: I prefer bullet points over paragraphs.\nAssistant: Got it!",
    document_type="ai-chat-conversation",
    user_id="user_123",
    customer_id="acme_corp",
    mode="fast"
)

# Customer-scoped: visible to all users in the organization
await sdk.memories.create(
    document="Company handbook update: All PTO requests must be submitted two weeks in advance.",
    document_type="document",
    customer_id="acme_corp",
    mode="long-range"
)

# Client-scoped: visible to all users across all customers
await sdk.memories.create(
    document="Product changelog: Version 2.5 adds support for custom webhooks.",
    document_type="document",
    mode="fast"
)
For a full explanation of the scope hierarchy, see Memory Scopes.

The typical agent loop

Here is how runtime ingestion fits into a standard agent conversation loop:
from synap import Synap
from openai import AsyncOpenAI

sdk = Synap(api_key="synap_api_key")
openai_client = AsyncOpenAI(api_key="openai_api_key")

async def handle_message(user_message: str, user_id: str, customer_id: str):
    """Handle a single user message with memory-enabled context."""

    # Step 1: Retrieve relevant context from Synap
    context = await sdk.conversation.context.fetch(
        user_id=user_id,
        customer_id=customer_id,
        query=user_message,
        mode="fast"
    )

    # Step 2: Build the prompt with retrieved memories
    system_prompt = (
        "You are a helpful assistant. Use the following context from previous "
        "conversations to inform your response:\n\n"
        f"{context.formatted_context}\n\n"
        "If the context is not relevant, respond based on your general knowledge."
    )

    # Step 3: Generate the response
    response = await openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message}
        ]
    )

    assistant_message = response.choices[0].message.content

    # Step 4: Ingest the conversation turn (non-blocking)
    await sdk.memories.create(
        document=f"User: {user_message}\nAssistant: {assistant_message}",
        document_type="ai-chat-conversation",
        user_id=user_id,
        customer_id=customer_id,
        mode="fast"
    )

    return assistant_message
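The example above awaits the ingestion call before returning. Since the call returns as soon as the document is accepted, that is usually fine; if you want to remove even that round trip from the response path, you can schedule ingestion as a background task. A sketch using asyncio.create_task, where the ingest coroutine is a stand-in for the real sdk.memories.create call:

```python
import asyncio

events = []  # records ordering, to show the reply is not blocked

async def ingest(document: str) -> None:
    # Stand-in for: await sdk.memories.create(document=document, ...)
    await asyncio.sleep(0.05)  # simulate network latency
    events.append("ingested")

async def handle_message(user_message: str) -> str:
    assistant_message = f"Echo: {user_message}"  # placeholder for the LLM call
    # Schedule ingestion in the background instead of awaiting it inline.
    # In production, keep a reference to the task (and handle its errors)
    # so it is not garbage-collected before it finishes.
    asyncio.create_task(
        ingest(f"User: {user_message}\nAssistant: {assistant_message}")
    )
    events.append("replied")
    return assistant_message  # returns before ingestion completes

async def main() -> str:
    reply = await handle_message("hi")
    await asyncio.sleep(0.1)  # let the background task finish before exit
    return reply

print(asyncio.run(main()))  # Echo: hi
print(events)               # ['replied', 'ingested']
```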
For a more detailed walkthrough of this pattern, see Agent Interactions.

Best practices

Always format conversations with clear speaker labels (User: and Assistant:). Synap uses these labels to correctly attribute statements to each participant, which improves the quality of stored memories.
# Good: clear speaker labels
document = "User: I need the report by Friday.\nAssistant: I'll have it ready by Thursday evening."

# Bad: no speaker context
document = "I need the report by Friday. I'll have it ready by Thursday evening."
Ensure that user_id and customer_id are consistent across all ingestion calls for the same user and organization. Inconsistent IDs fragment the memory store, creating isolated pockets of context that cannot be retrieved together. Derive these IDs from your application’s auth system.
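One way to enforce consistency is a single helper that maps your auth layer's token claims to Synap IDs, so every ingestion and retrieval call site uses the same derivation. A sketch, where the claim names (sub, org_id) are assumptions about your auth system:

```python
def synap_ids(claims: dict) -> tuple[str, str]:
    """Derive (user_id, customer_id) from auth token claims.

    Centralizing this mapping ensures every ingestion and retrieval
    call uses identical IDs for the same user and organization.
    """
    return f"user_{claims['sub']}", claims["org_id"]

user_id, customer_id = synap_ids({"sub": "123", "org_id": "acme_corp"})
print(user_id, customer_id)  # user_123 acme_corp
```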
For standard conversational turns, fast mode provides the best balance of speed and extraction quality. Reserve long-range mode for high-value conversations where deep extraction justifies the additional processing time.
Always ingest the conversation turn after your agent has responded, not before. This ensures the ingested content includes both the user message and the agent’s response, providing complete context for future retrieval.
Runtime ingestion should never block or crash your agent. Wrap ingestion calls in error handling and log failures for later investigation. A missed ingestion is recoverable; a crashed agent is not.
try:
    await sdk.memories.create(
        document=conversation_turn,
        document_type="ai-chat-conversation",
        user_id=user_id,
        customer_id=customer_id,
        mode="fast"
    )
except Exception as e:
    logger.warning(f"Ingestion failed, will retry: {e}")
    # Queue for retry or log for manual follow-up
If you implement retry logic for failed ingestion calls, always set a document_id to prevent duplicate memories. Derive the ID from your conversation or message identifier.
await sdk.memories.create(
    document=conversation_turn,
    document_type="ai-chat-conversation",
    document_id=f"turn_{conversation_id}_{turn_number}",
    user_id=user_id,
    customer_id=customer_id,
    mode="fast"
)
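Combining the two practices above, a retry wrapper can safely re-attempt failed ingestion, because a stable document_id makes repeated calls idempotent. A sketch with exponential backoff; the flaky stand-in below simulates transient network failures in place of a real sdk.memories.create call:

```python
import asyncio

async def ingest_with_retry(create, max_attempts: int = 3, base_delay: float = 0.01):
    """Retry an ingestion call with exponential backoff.

    `create` is a zero-argument coroutine function wrapping
    sdk.memories.create(..., document_id=...); the stable document_id
    makes repeated attempts safe.
    """
    for attempt in range(max_attempts):
        try:
            return await create()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            await asyncio.sleep(base_delay * 2 ** attempt)

# Demo: a stand-in that fails twice, then succeeds.
calls = {"n": 0}

async def flaky_create():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ingestion_abc"

print(asyncio.run(ingest_with_retry(flaky_create)))  # ingestion_abc
```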

Next steps

Bootstrap Ingestion

Load historical data in bulk before or alongside runtime operation.

Agent Interactions

The full retrieve-generate-ingest loop for memory-enabled agents.

SDK Ingestion Guide

Detailed SDK reference for all ingestion methods and parameters.

Fast Mode

Understand the tradeoffs of fast vs long-range ingestion modes.