How compaction works
Compaction analyzes the full conversation, identifies what is essential, and produces a compressed representation that preserves the information your agent needs to maintain coherent, personalized responses.
Analyze the conversation
The compaction engine reads the full conversation history, identifying facts, decisions, preferences, emotional shifts, and the current state of the discussion.
Extract essential information
Key information is extracted and categorized: facts that have been established, decisions that have been made, preferences that have been expressed, and the current topic and emotional tone.
Compress into target format
The extracted information is compressed into the target token budget using the selected strategy. A quality validation score is computed to ensure critical information is preserved.
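The three steps above can be sketched as a small pipeline. This is purely illustrative: the engine's internals are not public, so these functions are stand-ins (the keyword rules and character-based budget are assumptions for the sketch).

```python
# Minimal illustration of the analyze -> extract -> compress pipeline.
# All three functions are stand-ins, not the actual engine.

def analyze(history: list) -> list:
    # Step 1: read the full history (here, just drop empty turns).
    return [turn for turn in history if turn.strip()]

def extract(turns: list) -> dict:
    # Step 2: categorize essential information (trivial keyword rules).
    return {
        "facts": [t for t in turns if " is " in t],
        "decisions": [t for t in turns if "decided" in t],
    }

def compress(extracted: dict, budget: int) -> str:
    # Step 3: fit the extracted items into a budget (approximated here
    # as a character budget rather than tokens).
    text = " ".join(extracted["facts"] + extracted["decisions"])
    return text[:budget]

history = ["User is based in Portland.", "", "User decided to proceed with Option B."]
print(compress(extract(analyze(history)), budget=80))
```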
Compaction strategies
Synap provides four compaction strategies, each optimizing for a different balance between compression and detail retention:

| Strategy | Compression Ratio | Output Size | Best For |
|---|---|---|---|
| conservative | ~70% of original | Largest | Short conversations where high detail is needed. Preserves most of the original context with minimal information loss. |
| balanced | ~40% of original | Medium | General-purpose use. Good balance between compression and detail. Recommended for most applications. |
| aggressive | ~15% of original | Smallest | Long conversations or cost-sensitive applications. Preserves only the most critical facts and decisions. |
| adaptive | Cloud decides | Varies | Synap analyzes the conversation and selects the optimal strategy automatically. Recommended default. |
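As a quick back-of-the-envelope check, the approximate ratios in the table translate into output sizes like this (illustrative arithmetic only; actual ratios vary per conversation):

```python
# Approximate compression ratios from the strategy table (illustrative).
STRATEGY_RATIOS = {
    "conservative": 0.70,
    "balanced": 0.40,
    "aggressive": 0.15,
}

def expected_output_tokens(original_tokens: int, strategy: str) -> int:
    """Rough estimate of compacted size for a fixed-ratio strategy."""
    return round(original_tokens * STRATEGY_RATIOS[strategy])

for strategy in STRATEGY_RATIOS:
    print(strategy, expected_output_tokens(10_000, strategy))
```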
How adaptive strategy works
The adaptive strategy analyzes several signals to choose the optimal compression level:
- Conversation length: Longer conversations get more aggressive compression
- Information density: Conversations with many facts and decisions get less aggressive compression to preserve detail
- Repetition: Conversations with redundant exchanges get more aggressive compression
- Recency: Recent turns are weighted more heavily than older ones
- Token budget: The target token count influences how aggressively the engine compresses
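Synap does not publish the adaptive heuristic, but a toy version of the same signal-weighing idea might look like the following (the thresholds and weights are invented for illustration):

```python
def pick_strategy(turn_count: int, facts_per_turn: float, repetition: float) -> str:
    """Toy stand-in for the adaptive heuristic: long or repetitive
    conversations compress harder; information-dense ones compress less.
    All weights and thresholds here are made up for illustration."""
    score = 0.0
    if turn_count > 50:         # conversation length signal
        score += 1.0
    score += repetition         # 0.0-1.0 share of redundant exchanges
    score -= facts_per_turn     # information density pushes toward detail
    if score >= 1.0:
        return "aggressive"
    if score >= 0.0:
        return "balanced"
    return "conservative"

print(pick_strategy(turn_count=120, facts_per_turn=0.2, repetition=0.5))  # aggressive
```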
What gets extracted
During compaction, the engine identifies and preserves five categories of essential information:
Facts
Factual statements established during the conversation. These are the foundation of compacted context — what has been confirmed, stated, or agreed upon.
“User is based in Portland. They work at Acme Corp as a senior engineer. They started in January.”
Decisions
Decisions made during the conversation. These capture what was agreed, chosen, or determined.
“User decided to proceed with Option B. Meeting scheduled for next Thursday.”
Preferences
Preferences expressed during the conversation, including communication style, topic interests, and behavioral patterns.
“User prefers concise responses. They like bullet-point summaries. They dislike overly formal language.”
Summary Narrative
A natural language summary of the conversation arc — what happened, in what order, and where things stand now.
“The user asked about migration options, discussed pricing, and chose the enterprise plan.”
Current State
The current topic, active questions, and unresolved threads. This ensures the agent knows where the conversation stands.
“Currently discussing: implementation timeline. Open question: when can the team start?”
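The five categories map naturally onto a typed container. A sketch, using the examples above (the class and field names are assumptions, not the SDK's actual schema):

```python
from dataclasses import dataclass

@dataclass
class CompactedContext:
    """Illustrative container for the five extracted categories."""
    facts: list            # established factual statements
    decisions: list        # what was agreed, chosen, or determined
    preferences: list      # expressed style and topic preferences
    summary: str           # narrative of the conversation arc
    current_state: str     # active topic and unresolved threads

ctx = CompactedContext(
    facts=["User is based in Portland.", "They work at Acme Corp."],
    decisions=["User decided to proceed with Option B."],
    preferences=["User prefers concise responses."],
    summary="Asked about migration options, discussed pricing, chose the enterprise plan.",
    current_state="Currently discussing: implementation timeline.",
)
print(len(ctx.facts))
```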
Quality validation
Every compaction result includes quality metrics so you can verify that the compression preserved critical information:

| Field | Type | Description |
|---|---|---|
| validation_score | float (0.0-1.0) | Overall quality score. Higher means more information was preserved. Scores above 0.7 are generally considered good. |
| validation_passed | bool | Whether the compaction meets the minimum quality threshold. false if critical information was lost. |
| original_token_count | int | Token count of the original conversation. |
| compacted_token_count | int | Token count of the compacted output. |
| compression_ratio | float | Ratio of compacted to original size (e.g., 0.35 means 35% of the original). |
| preserved_facts_count | int | Number of facts preserved in the compacted output. |
| preserved_decisions_count | int | Number of decisions preserved. |
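A sketch of how those fields might be checked client-side. The 0.7 threshold and field names follow the table above; the result dict is a stand-in for a real response:

```python
def passes_quality(result: dict, min_score: float = 0.7) -> bool:
    """Accept a compaction only if the engine marked it as passed and
    the score clears our own minimum."""
    return result["validation_passed"] and result["validation_score"] >= min_score

# Stand-in result using the field names from the table above.
result = {
    "validation_score": 0.84,
    "validation_passed": True,
    "original_token_count": 12_000,
    "compacted_token_count": 4_200,
    "compression_ratio": 0.35,
}
print(passes_quality(result))  # True
```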
Output formats
Compacted context can be returned in two formats:
- Structured: Typed fields that you can access programmatically. Best for applications that build custom LLM prompts.
- Injection-ready: A pre-formatted text block intended to be inserted directly into an LLM prompt.
Code examples
Basic compaction
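The exact SDK surface is documented in the SDK reference, so this sketch stubs a client with an assumed `compact()` method and response shape; swap in the real client in your own code:

```python
class StubSynapClient:
    """Stand-in for the real SDK client (interface is assumed)."""
    def compact(self, messages, strategy="adaptive", target_tokens=500):
        text = " ".join(m["content"] for m in messages)
        return {"summary": text[: target_tokens * 4], "strategy": strategy}

client = StubSynapClient()  # real code: construct the actual Synap client here
result = client.compact(
    messages=[
        {"role": "user", "content": "I work at Acme Corp as a senior engineer."},
        {"role": "assistant", "content": "Noted. How can I help today?"},
    ],
    strategy="balanced",
)
print(result["strategy"], len(result["summary"]))
```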
Compaction with quality check
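One useful pattern: if `validation_passed` comes back false, retry with a gentler strategy. Sketched below against a stubbed client (the response shape follows the quality-validation table; the client itself is a stand-in):

```python
# Fall back toward less aggressive strategies on validation failure.
FALLBACKS = {"aggressive": "balanced", "balanced": "conservative"}

def compact_with_quality(client, messages, strategy="aggressive"):
    """Retry with progressively gentler strategies until validation passes
    or no gentler strategy remains."""
    while True:
        result = client.compact(messages, strategy=strategy)
        if result["validation_passed"] or strategy not in FALLBACKS:
            return result
        strategy = FALLBACKS[strategy]

class StubClient:
    # Pretend aggressive compaction loses too much information.
    def compact(self, messages, strategy):
        return {"strategy": strategy, "validation_passed": strategy != "aggressive"}

result = compact_with_quality(StubClient(), messages=[], strategy="aggressive")
print(result["strategy"])  # balanced
```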
Using compacted context in a conversation flow
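A common flow keeps the last few turns verbatim and compacts everything older. This sketch fakes the compaction call with simple truncation so the shape of the flow is clear (character budget and prompt layout are assumptions):

```python
def build_prompt(history: list, recent_n: int = 4, budget_chars: int = 200) -> str:
    """Compact everything except the last few turns, then assemble the
    prompt from the compacted summary plus verbatim recent turns."""
    older, recent = history[:-recent_n], history[-recent_n:]
    compacted = " ".join(older)[:budget_chars]  # stand-in for a real compact() call
    return "Context summary: " + compacted + "\n\n" + "\n".join(recent)

history = [f"turn {i}" for i in range(10)]
prompt = build_prompt(history)
print(prompt.splitlines()[0])
```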
When to compact
Compaction vs. retrieval
Compaction and retrieval serve different purposes and are complementary:

| Aspect | Context Compaction | Memory Retrieval |
|---|---|---|
| Input | Current conversation history | Query against stored memories |
| Scope | Single conversation | All memories across all conversations |
| Purpose | Reduce token usage for current conversation | Bring relevant past knowledge into current conversation |
| Output | Compressed version of current conversation | Ranked memories from vector and graph stores |
| When to use | Conversation is too long for LLM context | Agent needs knowledge from past interactions |
A typical flow uses both together:
- Retrieve relevant memories from past conversations
- Compact the current conversation if it is long
- Combine retrieved memories + compacted context + recent turns into the LLM prompt
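Those three steps, sketched end to end. Every function body here is a stand-in for the real retrieval and compaction calls, and the prompt layout is an assumption:

```python
def retrieve_memories(query: str) -> list:
    # Stand-in for vector/graph retrieval against stored memories.
    return ["User chose the enterprise plan last quarter."]

def compact(history: list, budget: int = 200) -> str:
    # Stand-in for the compaction API (simple truncation here).
    return " ".join(history)[:budget]

def assemble_prompt(query: str, history: list, recent_n: int = 2) -> str:
    memories = retrieve_memories(query)        # step 1: retrieve
    summary = compact(history[:-recent_n])     # step 2: compact older turns
    recent = history[-recent_n:]               # step 3: combine with recent turns
    return "\n".join(
        ["Relevant memories:"] + memories
        + ["Conversation summary:", summary]
        + ["Recent turns:"] + recent
    )

print(assemble_prompt("timeline", ["turn 1", "turn 2", "turn 3", "turn 4"]))
```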
Next steps
Memories & Context
See how compaction fits into the broader memory lifecycle.
Memory Architecture
Configure token budgets and retrieval settings in MACA.
SDK: Context Compaction
Full SDK reference for compaction methods and parameters.
Storage Infrastructure
Understand the storage engines that power memory retrieval.