Overview
Every conversation in Synap follows a predictable lifecycle: context starts empty, grows as turns accumulate, may be compacted when it becomes too large, and ultimately produces long-term memories when the conversation ends. Understanding this lifecycle is essential for tuning your memory architecture and ensuring your agents have the right context at the right time. This page walks through each stage of a conversation’s context lifecycle, from the first user message to final memory storage.

Lifecycle Stages
Conversation Starts
When a new conversation begins, the context window is empty. The SDK creates a new conversation scope identified by a unique `conversation_id`. No memories have been retrieved yet, and no turns have been recorded.

First Turn — Retrieval and Initial Context
The first user message triggers a retrieval call to the Synap Cloud. Relevant long-term memories are fetched based on the user’s message, their identity (user and customer scope), and any organizational context available at the client scope.

The agent uses these retrieved memories to build the initial context — a rich starting point that includes relevant facts, preferences, past interactions, and organizational knowledge. This is the moment where long-term memory meets the current conversation.
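The first-turn flow can be sketched as follows. This is a minimal illustration, not the actual Synap SDK API: `build_initial_context`, the memory strings, and the message-dict shape are all assumptions made for the example.

```python
def build_initial_context(system_instructions, retrieved_memories, first_message):
    """Assemble the initial context for the first turn of a conversation.

    Retrieved long-term memories are injected ahead of the user's first
    message so the agent starts with relevant background. Illustrative
    sketch only; the real SDK performs retrieval server-side.
    """
    context = [{"role": "system", "content": system_instructions}]
    for memory in retrieved_memories:
        # Each retrieved memory becomes a labeled system-level entry.
        context.append({"role": "system", "content": f"[memory] {memory}"})
    context.append({"role": "user", "content": first_message})
    return context

# Example: two memories were retrieved for this user's first message.
memories = ["User prefers email over phone", "Customer is on the Pro plan"]
ctx = build_initial_context("You are a support agent.", memories, "Hi, I need help.")
```

The key ordering decision is that memories precede the user message, so the model reads the background before the question.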
Conversation Progresses — Context Grows
Each subsequent turn adds to the context window. Both the conversation history (user messages and assistant responses) and any newly retrieved memories contribute to the growing context.

At each turn, the SDK may perform additional retrieval calls if the conversation topic shifts or if new entities are mentioned. The context window now contains:
- The full conversation history (all prior turns)
- Retrieved long-term memories (from the initial and subsequent retrievals)
- Any system instructions or organizational context
Token Count Rises
As turns accumulate, the total token count of the context window increases. For long conversations — customer support sessions, multi-step workflows, in-depth discussions — this can grow to tens of thousands of tokens.

The SDK tracks the token count internally and compares it against configurable thresholds. At this stage, the context is still within limits and no action is needed.
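Internally, tracking boils down to estimating the context's token count each turn. The sketch below uses a rough characters-per-token heuristic as a stand-in; the actual SDK's tokenizer and counting method are not specified here.

```python
def estimate_tokens(turns):
    """Rough token estimate using a ~4-characters-per-token heuristic.

    A real implementation would use the model's actual tokenizer;
    this approximation is only for illustration.
    """
    return sum(len(turn["content"]) for turn in turns) // 4

turns = [
    {"role": "user", "content": "Where is my order?" * 10},
    {"role": "assistant", "content": "Let me check that for you." * 10},
]
total = estimate_tokens(turns)
```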
Compaction Trigger
When the context exceeds the configured threshold, the compaction process is triggered. Compaction extracts the essential information from the conversation so far — key facts, decisions, unresolved questions, user preferences expressed during the conversation — and compresses the context into a significantly smaller representation.

Compaction can be triggered by:
- Token limit: the context exceeds `max_context_tokens` (default: configurable per instance)
- Turn count: the conversation exceeds `max_turns_before_compaction`
- Adaptive trigger: the system detects diminishing relevance in older turns
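The three triggers can be combined into a single check, evaluated each turn. The function below is an illustrative sketch: the threshold defaults and the relevance score are placeholder assumptions, not Synap's real values or API.

```python
def compaction_trigger(token_count, turn_count, avg_old_turn_relevance,
                       max_context_tokens=8000,
                       max_turns_before_compaction=40,
                       relevance_floor=0.3):
    """Return the reason compaction fires this turn, or None.

    Checks the three documented triggers in order: token limit,
    turn count, then the adaptive relevance signal. All default
    thresholds here are illustrative placeholders.
    """
    if token_count > max_context_tokens:
        return "token_limit"
    if turn_count > max_turns_before_compaction:
        return "turn_count"
    if avg_old_turn_relevance < relevance_floor:
        return "adaptive"
    return None
```

Checking the cheap numeric limits before the adaptive signal keeps the common case fast; the relevance analysis only matters when the hard limits have not fired.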
Post-Compaction — Rebuilt Context
After compaction, the context window is rebuilt from three sources:
- Compressed context — the compacted summary of all prior turns
- Recent turns — the last N turns are kept in full (not compressed) to maintain conversational flow
- Retrieved memories — a fresh retrieval pass ensures relevant long-term memories are still included
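The three-source rebuild can be sketched as a simple assembly step. Everything here — the function name, the `[summary]`/`[memory]` labels, and the default of keeping the last 4 turns — is an assumption for illustration, not the SDK's actual behavior.

```python
def rebuild_context(compressed_summary, all_turns, fresh_memories, keep_last_n=4):
    """Rebuild the context window after compaction from its three sources:

    1. the compressed summary of all prior turns,
    2. freshly retrieved long-term memories,
    3. the last N turns kept verbatim for conversational flow.
    """
    context = [{"role": "system", "content": f"[summary] {compressed_summary}"}]
    for memory in fresh_memories:
        context.append({"role": "system", "content": f"[memory] {memory}"})
    context.extend(all_turns[-keep_last_n:])  # recent turns, uncompressed
    return context

# Example: a 10-turn conversation compacted down to summary + 4 recent turns.
turns = [{"role": "user", "content": f"turn {i}"} for i in range(10)]
new_ctx = rebuild_context("Order issue discussed; refund pending.",
                          turns, ["User prefers email"], keep_last_n=4)
```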
Conversation Ends — Long-term Memory Storage
When the conversation ends (explicitly via `sdk.conversation.end()` or implicitly via session timeout), the final state of the conversation is ingested into long-term memory.

The ingestion pipeline processes the conversation content through the full multi-stage pipeline: categorization, extraction, chunking, entity resolution, and organization. The extracted memories are stored in the vector and graph stores, scoped to the appropriate user and customer.

Lifecycle Diagram
The following diagram shows the full lifecycle of context within a single conversation:

Conversational Context vs. Long-term Memory
A conversation feeds both short-term and long-term memory paths:

Short-term Path (Compaction)
During the conversation, compaction summarizes older turns to keep the context window manageable. This information exists only for the duration of the conversation. It is not persisted beyond the session.
Long-term Path (Ingestion)
When the conversation ends, the full content is processed through the ingestion pipeline. Extracted facts, preferences, episodes, emotions, and temporal events become persistent memories available to all future conversations.
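The long-term path's stage ordering can be sketched as below. Each stage here is a stub that only records its name; the real pipeline transforms the content server-side, and nothing about this function reflects the actual Synap implementation.

```python
def ingest_conversation(conversation_text):
    """Run conversation content through the multi-stage ingestion pipeline.

    The stage names follow the documented order: categorization,
    extraction, chunking, entity resolution, organization. Each stage
    is a stub here; a real implementation transforms the record.
    """
    stages = ["categorization", "extraction", "chunking",
              "entity_resolution", "organization"]
    record = {"content": conversation_text, "stages_completed": []}
    for stage in stages:
        # Placeholder: a real stage would enrich or restructure `record`.
        record["stages_completed"].append(stage)
    return record

record = ingest_conversation("User: hi\nAssistant: hello")
```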
| Aspect | Compaction (Short-term) | Ingestion (Long-term) |
|---|---|---|
| When | During conversation | After conversation ends |
| Purpose | Keep context within token budget | Create persistent memories |
| Scope | Current conversation only | Available across all future conversations |
| Output | Compressed summary | Structured memories (facts, entities, episodes) |
| Storage | In-memory / session | Vector store + graph store |
Configuring Thresholds
You can tune the compaction behavior through your MACA (Memory Architecture Configuration) settings:

Token Limit
Set `max_context_tokens` to control when compaction triggers based on token count. A higher limit allows longer uncompacted conversations but uses more tokens per LLM call.

Turn Count Trigger
Set `max_turns_before_compaction` to trigger compaction after a fixed number of turns, regardless of token count. Useful for predictable compaction behavior.

Adaptive Compaction
Enable adaptive compaction to let the system decide when to compact based on content relevance analysis. The system monitors the relevance of older turns and compacts when they fall below a threshold.
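Taken together, the three settings above form a small configuration object. The sketch below groups them as a dataclass; the field names follow this page, but the defaults and the `adaptive_compaction` flag name are placeholders, not Synap's actual configuration schema.

```python
from dataclasses import dataclass

@dataclass
class CompactionConfig:
    """Illustrative MACA compaction settings.

    Field names follow the documentation; the default values are
    placeholder assumptions, not the SDK's real defaults.
    """
    max_context_tokens: int = 8000
    max_turns_before_compaction: int = 40
    adaptive_compaction: bool = False

# Example: raise the token limit and enable adaptive compaction.
config = CompactionConfig(max_context_tokens=12000, adaptive_compaction=True)
```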
Next Steps
Long-term Context Lifecycle
Learn how memories persist and evolve across conversations.
Context Compaction
Deep dive into compaction strategies and algorithms.
SDK: Context Compaction
Programmatic control over compaction in your application.