Overview

Every conversation in Synap follows a predictable lifecycle: context starts empty, grows as turns accumulate, may be compacted when it becomes too large, and ultimately produces long-term memories when the conversation ends. Understanding this lifecycle is essential for tuning your memory architecture and ensuring your agents have the right context at the right time. This page walks through each stage of a conversation’s context lifecycle, from the first user message to final memory storage.

Lifecycle Stages

1. Conversation Starts

When a new conversation begins, the context window is empty. The SDK creates a new conversation scope identified by a unique conversation_id. No memories have been retrieved yet, and no turns have been recorded.
# A new conversation begins with an empty context
import uuid

conversation_id = "conv_" + uuid.uuid4().hex

2. First Turn — Retrieval and Initial Context

The first user message triggers a retrieval call to the Synap Cloud. Relevant long-term memories are fetched based on the user’s message, their identity (user and customer scope), and any organizational context available at the client scope. The agent uses these retrieved memories to build the initial context — a rich starting point that includes relevant facts, preferences, past interactions, and organizational knowledge. This is the moment where long-term memory meets the current conversation.
# Retrieve relevant context for the first turn
context = await sdk.conversation.context.fetch(
    conversation_id=conversation_id,
    user_id="user_abc",
    customer_id="cust_xyz",
    messages=[{"role": "user", "content": user_message}]
)
# context.memories contains relevant long-term memories
# context.system_prompt contains the enriched system prompt

3. Conversation Progresses — Context Grows

Each subsequent turn adds to the context window. Both the conversation history (user messages and assistant responses) and any newly retrieved memories contribute to the growing context. At each turn, the SDK may perform additional retrieval calls if the conversation topic shifts or if new entities are mentioned (see the sketch after this list). The context window now contains:
  • The full conversation history (all prior turns)
  • Retrieved long-term memories (from the initial and subsequent retrievals)
  • Any system instructions or organizational context
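To make this concrete, here is a minimal sketch of a follow-up turn, assuming the same fetch call shown above accepts the accumulated message history; the history list and the message variables are illustrative, not part of the SDK.
# Illustrative sketch: a later turn passes the accumulated history
# so retrieval can account for any newly mentioned entities.
# (`history`, `assistant_reply`, and `next_user_message` are assumed
# to be maintained by your application, not by the SDK.)
history.append({"role": "assistant", "content": assistant_reply})
history.append({"role": "user", "content": next_user_message})

context = await sdk.conversation.context.fetch(
    conversation_id=conversation_id,
    user_id="user_abc",
    customer_id="cust_xyz",
    messages=history  # full history; retrieval may add fresh memories
)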

4. Token Count Rises

As turns accumulate, the total token count of the context window increases. For long conversations — customer support sessions, multi-step workflows, in-depth discussions — this can grow to tens of thousands of tokens. The SDK tracks the token count internally and compares it against configurable thresholds. At this stage, the context is still within limits and no action is needed.
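As a rough illustration of this bookkeeping, the sketch below compares a token estimate against a threshold. Both estimate_tokens and the threshold check are hypothetical stand-ins; the SDK performs this tracking internally.
# Hypothetical sketch of the internal bookkeeping: estimate the
# context size and compare it against the configured threshold.
MAX_CONTEXT_TOKENS = 8000  # mirrors retrieval.context_budget.max_context_tokens

def estimate_tokens(messages):
    # Crude approximation: roughly 4 characters per token for English text
    return sum(len(m["content"]) for m in messages) // 4

if estimate_tokens(history) > MAX_CONTEXT_TOKENS:
    # At this point the SDK would trigger compaction (next stage)
    pass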

5. Compaction Trigger

When the context exceeds the configured threshold, the compaction process is triggered. Compaction extracts the essential information from the conversation so far — key facts, decisions, unresolved questions, user preferences expressed during the conversation — and compresses the context into a significantly smaller representation. Compaction can be triggered by:
  • Token limit: the context exceeds max_context_tokens (configurable per instance)
  • Turn count: the conversation exceeds max_turns_before_compaction
  • Adaptive trigger: the system detects diminishing relevance in older turns
# Compaction happens automatically, but can also be triggered manually
compacted = await sdk.conversation.context.compact(
    conversation_id=conversation_id,
    strategy="extract_essential"
)

6. Post-Compaction — Rebuilt Context

After compaction, the context window is rebuilt from three sources:
  1. Compressed context — the compacted summary of all prior turns
  2. Recent turns — the last N turns are kept in full (not compressed) to maintain conversational flow
  3. Retrieved memories — a fresh retrieval pass ensures relevant long-term memories are still included
The result is a context window that fits within the token budget while preserving the most important information. The conversation continues seamlessly — the user and agent experience no interruption.
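Conceptually, the rebuilt window can be pictured as the sketch below. The summary and recent_turns fields on the compacted result, and the memories field on the fetch result, are assumptions for illustration; only compact() and fetch() appear in the SDK examples above.
# Illustrative assembly of the post-compaction window.
# (`compacted.summary`, `compacted.recent_turns`, and `fresh.memories`
# are assumed field names, not confirmed SDK attributes.)
fresh = await sdk.conversation.context.fetch(
    conversation_id=conversation_id,
    user_id="user_abc",
    customer_id="cust_xyz",
    messages=compacted.recent_turns
)
new_window = {
    "compressed_context": compacted.summary,  # summary of all prior turns
    "recent_turns": compacted.recent_turns,   # last N turns kept in full
    "memories": fresh.memories,               # fresh retrieval pass
}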

7. Conversation Ends — Long-term Memory Storage

When the conversation ends (explicitly via sdk.conversation.end() or implicitly via session timeout), the final state of the conversation is ingested into long-term memory. The ingestion pipeline processes the conversation content through multiple stages: categorization, extraction, chunking, entity resolution, and organization. The extracted memories are stored in the vector and graph stores, scoped to the appropriate user and customer.
# End the conversation and trigger long-term memory ingestion
await sdk.conversation.end(
    conversation_id=conversation_id,
    ingest=True  # default: True
)

Lifecycle Diagram

The following diagram shows the full lifecycle of context within a single conversation:
Start → Retrieve → Generate → Ingest → [Repeat] → Compact? → Continue → End → Store Long-term
  │                                        │            │                        │
  │    ┌───────────────────────────────────┘            │                        │
  │    │  (loop for each turn)                          │                        │
  │    │                                                │                        │
  │    ▼                                                ▼                        ▼
  │  Context grows with each turn              Compressed context         Memories extracted
  │  Retrieval adds long-term memories         + recent turns             and persisted in
  │  Token count increases                     + fresh retrieval          vector + graph stores
  │                                            = new context window

Conversational Context vs. Long-term Memory

A conversation feeds both short-term and long-term memory paths:

Short-term Path (Compaction)

During the conversation, compaction summarizes older turns to keep the context window manageable. This information exists only for the duration of the conversation. It is not persisted beyond the session.

Long-term Path (Ingestion)

When the conversation ends, the full content is processed through the ingestion pipeline. Extracted facts, preferences, episodes, emotions, and temporal events become persistent memories available to all future conversations.
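One way to see the difference in practice: memories ingested at the end of one conversation surface in the retrieval call of the next. A sketch, reusing the fetch call from step 2 (the new conversation ID and message are illustrative):
# In a *later* conversation, the ingested memories come back via retrieval.
later_context = await sdk.conversation.context.fetch(
    conversation_id="conv_later_001",  # a brand-new conversation (illustrative ID)
    user_id="user_abc",                # same user scope
    customer_id="cust_xyz",            # same customer scope
    messages=[{"role": "user", "content": "Where did we leave off?"}]
)
# later_context.memories now includes facts extracted from the
# previous conversation's ingestion pass.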
The key insight is that compaction and ingestion serve different purposes:
Aspect  | Compaction (Short-term)          | Ingestion (Long-term)
When    | During conversation              | After conversation ends
Purpose | Keep context within token budget | Create persistent memories
Scope   | Current conversation only        | Available across all future conversations
Output  | Compressed summary               | Structured memories (facts, entities, episodes)
Storage | In-memory / session              | Vector store + graph store

Configuring Thresholds

You can tune the compaction behavior through your MACA (Memory Architecture Configuration) settings:
Set max_context_tokens to control when compaction triggers based on token count. A higher limit allows longer uncompacted conversations but uses more tokens per LLM call.
retrieval:
  context_budget:
    max_context_tokens: 8000
Set max_turns_before_compaction to trigger compaction after a fixed number of turns, regardless of token count. Useful for predictable compaction behavior.
retrieval:
  compaction:
    max_turns_before_compaction: 20
Enable adaptive compaction to let the system decide when to compact based on content relevance analysis. The system monitors the relevance of older turns and compacts when they fall below a threshold.
retrieval:
  compaction:
    strategy: adaptive
    relevance_threshold: 0.3
Start with the default thresholds and adjust based on your use case. Customer support conversations may benefit from higher turn count limits (users reference earlier parts of the conversation), while quick Q&A interactions can use aggressive compaction.

Next Steps

Long-term Context Lifecycle

Learn how memories persist and evolve across conversations.

Context Compaction

Deep dive into compaction strategies and algorithms.

SDK: Context Compaction

Programmatic control over compaction in your application.