Overview
Every conversation in Synap follows a predictable lifecycle: context starts empty, grows as turns accumulate, may be compacted when it becomes too large, and ultimately produces long-term memories when the conversation ends. Understanding this lifecycle is essential for tuning your memory architecture and ensuring your agents have the right context at the right time. This page walks through each stage of a conversation’s context lifecycle, from the first user message to final memory storage.

Lifecycle Stages
Conversation Starts
When a new conversation begins, the context window is empty. The SDK creates a new conversation scope identified by a unique `conversation_id`. No memories have been retrieved yet, and no turns have been recorded.

First Turn — Retrieval and Initial Context
The first user message triggers a retrieval call to the Synap Cloud. Relevant long-term memories are fetched based on the user’s message, their identity (user and customer scope), and any organizational context available at the client scope.

The agent uses these retrieved memories to build the initial context — a rich starting point that includes relevant facts, preferences, past interactions, and organizational knowledge. This is the moment where long-term memory meets the current conversation.
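The first-turn flow can be sketched as follows. This is a minimal illustration, not the actual Synap SDK API: `build_initial_context`, the memory strings, and the message-dict shape are all assumptions made for the example.

```python
def build_initial_context(system_instructions, retrieved_memories, first_message):
    """Assemble the initial context for the first turn of a conversation.

    Retrieved long-term memories are injected ahead of the user's first
    message so the agent starts with relevant background. Illustrative
    sketch only; the real SDK performs retrieval server-side.
    """
    context = [{"role": "system", "content": system_instructions}]
    for memory in retrieved_memories:
        # Each retrieved memory becomes a labeled system-level entry.
        context.append({"role": "system", "content": f"[memory] {memory}"})
    context.append({"role": "user", "content": first_message})
    return context

# Example: two memories were retrieved for this user's first message.
memories = ["User prefers email over phone", "Customer is on the Pro plan"]
ctx = build_initial_context("You are a support agent.", memories, "Hi, I need help.")
```

The key ordering decision is that memories precede the user message, so the model reads the background before the question.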
Conversation Progresses — Context Grows
Each subsequent turn adds to the context window. Both the conversation history (user messages and assistant responses) and any newly retrieved memories contribute to the growing context.

At each turn, the SDK may perform additional retrieval calls if the conversation topic shifts or if new entities are mentioned. The context window now contains:
- The full conversation history (all prior turns)
- Retrieved long-term memories (from the initial and subsequent retrievals)
- Any system instructions or organizational context
Token Count Rises
As turns accumulate, the total token count of the context window increases. For long conversations — customer support sessions, multi-step workflows, in-depth discussions — this can grow to tens of thousands of tokens.

The SDK tracks the token count internally and compares it against configurable thresholds. At this stage, the context is still within limits and no action is needed.
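Internally, tracking boils down to estimating the context's token count each turn. The sketch below uses a rough characters-per-token heuristic as a stand-in; the actual SDK's tokenizer and counting method are not specified here.

```python
def estimate_tokens(turns):
    """Rough token estimate using a ~4-characters-per-token heuristic.

    A real implementation would use the model's actual tokenizer;
    this approximation is only for illustration.
    """
    return sum(len(turn["content"]) for turn in turns) // 4

turns = [
    {"role": "user", "content": "Where is my order?" * 10},
    {"role": "assistant", "content": "Let me check that for you." * 10},
]
total = estimate_tokens(turns)
```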
Compaction Trigger
When the context exceeds the configured threshold, the compaction process is triggered. Compaction extracts the essential information from the conversation so far — key facts, decisions, unresolved questions, user preferences expressed during the conversation — and compresses the context into a significantly smaller representation.

Compaction can be triggered by:
- Token limit: the context exceeds `max_context_tokens` (default: configurable per instance)
- Turn count: the conversation exceeds `max_turns_before_compaction`
- Adaptive trigger: the system detects diminishing relevance in older turns
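The three triggers can be combined into a single check, evaluated each turn. The function below is an illustrative sketch: the threshold defaults and the relevance score are placeholder assumptions, not Synap's real values or API.

```python
def compaction_trigger(token_count, turn_count, avg_old_turn_relevance,
                       max_context_tokens=8000,
                       max_turns_before_compaction=40,
                       relevance_floor=0.3):
    """Return the reason compaction fires this turn, or None.

    Checks the three documented triggers in order: token limit,
    turn count, then the adaptive relevance signal. All default
    thresholds here are illustrative placeholders.
    """
    if token_count > max_context_tokens:
        return "token_limit"
    if turn_count > max_turns_before_compaction:
        return "turn_count"
    if avg_old_turn_relevance < relevance_floor:
        return "adaptive"
    return None
```

Checking the cheap numeric limits before the adaptive signal keeps the common case fast; the relevance analysis only matters when the hard limits have not fired.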
Post-Compaction — Rebuilt Context
After compaction, the context window is rebuilt from three sources:
- Compressed context — the compacted summary of all prior turns
- Recent turns — the last N turns are kept in full (not compressed) to maintain conversational flow
- Retrieved memories — a fresh retrieval pass ensures relevant long-term memories are still included
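The three-source rebuild can be sketched as a simple assembly step. Everything here — the function name, the `[summary]`/`[memory]` labels, and the default of keeping the last 4 turns — is an assumption for illustration, not the SDK's actual behavior.

```python
def rebuild_context(compressed_summary, all_turns, fresh_memories, keep_last_n=4):
    """Rebuild the context window after compaction from its three sources:

    1. the compressed summary of all prior turns,
    2. freshly retrieved long-term memories,
    3. the last N turns kept verbatim for conversational flow.
    """
    context = [{"role": "system", "content": f"[summary] {compressed_summary}"}]
    for memory in fresh_memories:
        context.append({"role": "system", "content": f"[memory] {memory}"})
    context.extend(all_turns[-keep_last_n:])  # recent turns, uncompressed
    return context

# Example: a 10-turn conversation compacted down to summary + 4 recent turns.
turns = [{"role": "user", "content": f"turn {i}"} for i in range(10)]
new_ctx = rebuild_context("Order issue discussed; refund pending.",
                          turns, ["User prefers email"], keep_last_n=4)
```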
Conversation Ends — Long-term Memory Storage
When the conversation ends (explicitly via `sdk.conversation.end()` or implicitly via session timeout), the final state of the conversation is ingested into long-term memory.

The ingestion pipeline processes the conversation content through the full multi-stage pipeline: categorization, extraction, chunking, entity resolution, and organization. The extracted memories are stored in the vector and graph stores, scoped to the appropriate user and customer.

Lifecycle Diagram
The following diagram shows the full lifecycle of context within a single conversation:

Conversational Context vs. Long-term Memory
A conversation feeds both short-term and long-term memory paths:

Short-term Path (Compaction)
During the conversation, compaction summarizes older turns to keep the context window manageable. This information exists only for the duration of the conversation. It is not persisted beyond the session.
Long-term Path (Ingestion)
When the conversation ends, the full content is processed through the ingestion pipeline. Extracted facts, preferences, episodes, emotions, and temporal events become persistent memories available to all future conversations.
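The long-term path's stage ordering can be sketched as below. Each stage here is a stub that only records its name; the real pipeline transforms the content server-side, and nothing about this function reflects the actual Synap implementation.

```python
def ingest_conversation(conversation_text):
    """Run conversation content through the multi-stage ingestion pipeline.

    The stage names follow the documented order: categorization,
    extraction, chunking, entity resolution, organization. Each stage
    is a stub here; a real implementation transforms the record.
    """
    stages = ["categorization", "extraction", "chunking",
              "entity_resolution", "organization"]
    record = {"content": conversation_text, "stages_completed": []}
    for stage in stages:
        # Placeholder: a real stage would enrich or restructure `record`.
        record["stages_completed"].append(stage)
    return record

record = ingest_conversation("User: hi\nAssistant: hello")
```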
| Aspect | Compaction (Short-term) | Ingestion (Long-term) |
|---|---|---|
| When | During conversation | After conversation ends |
| Purpose | Keep context within token budget | Create persistent memories |
| Scope | Current conversation only | Available across all future conversations |
| Output | Compressed summary | Structured memories (facts, entities, episodes) |
| Storage | In-memory / session | Vector store + graph store |
Configuring Thresholds
You can tune the compaction behavior through your MACA (Memory Architecture Configuration) settings:

Token Limit
Set `max_context_tokens` to control when compaction triggers based on token count. A higher limit allows longer uncompacted conversations but uses more tokens per LLM call.

Turn Count Trigger
Set `max_turns_before_compaction` to trigger compaction after a fixed number of turns, regardless of token count. Useful for predictable compaction behavior.

Adaptive Compaction
Enable adaptive compaction to let the system decide when to compact based on content relevance analysis. The system monitors the relevance of older turns and compacts when they fall below a threshold.
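Taken together, the three settings above form a small configuration object. The sketch below groups them as a dataclass; the field names follow this page, but the defaults and the `adaptive_compaction` flag name are placeholders, not Synap's actual configuration schema.

```python
from dataclasses import dataclass

@dataclass
class CompactionConfig:
    """Illustrative MACA compaction settings.

    Field names follow the documentation; the default values are
    placeholder assumptions, not the SDK's real defaults.
    """
    max_context_tokens: int = 8000
    max_turns_before_compaction: int = 40
    adaptive_compaction: bool = False

# Example: raise the token limit and enable adaptive compaction.
config = CompactionConfig(max_context_tokens=12000, adaptive_compaction=True)
```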
Next Steps
Long-term Context Lifecycle
Learn how memories persist and evolve across conversations.
Context Compaction
Deep dive into compaction strategies and algorithms.
SDK: Context Compaction
Programmatic control over compaction in your application.