
General

Synap is a managed memory platform for AI agents. It provides a complete pipeline for ingesting conversations and documents, extracting structured knowledge (facts, preferences, episodes, emotions, temporal events), resolving entities across conversations, and retrieving relevant context when your agent needs it.

Instead of building and maintaining your own vector database, retrieval pipeline, and entity resolution system, you integrate the Synap SDK into your application and let the platform handle the rest. Your agent gets long-term, structured memory with a few lines of code.
Traditional RAG systems retrieve raw document chunks based on similarity search. Synap goes several steps further:
  • Structured extraction: Synap does not just store chunks. It extracts typed knowledge — facts, preferences, episodes, emotions, and temporal events — with confidence scores.
  • Entity resolution: Mentions of the same entity across conversations (e.g., “John”, “my manager”, “John Smith”) are linked to a single canonical entity.
  • Scoped retrieval: Memories are scoped to users, customers, and organizations. Each user gets their own memory without manual isolation logic.
  • Context compaction: Long conversation histories are automatically summarized while preserving key information, reducing token usage.
  • Managed pipeline: No vector databases to deploy, no embedding models to tune, no retrieval pipelines to build.
Synap is designed with a zero-trust security model:
  • Encryption in transit: All connections (REST and gRPC) use TLS 1.3 and are authenticated with API keys.
  • Encryption at rest: All stored data is encrypted at rest using AES-256.
  • Instance isolation: Each instance has its own storage namespace. Memories from one instance are never accessible from another.
  • Scope isolation: Within an instance, memories are scoped to users and customers. A user can only access memories in their scope chain.
  • Credential management: API keys are hashed before storage. API keys can be rotated at any time via the Dashboard or API.
See Authentication for details on the credential lifecycle.
Synap Cloud is currently available in US East (Virginia) and EU West (Frankfurt). Additional regions are planned based on demand. Contact [email protected] for region-specific requirements or data residency needs.

SDK

The official Synap SDK is available for Python 3.9+. It is fully async, built on asyncio, and available via pip:
pip install maximem-synap
TypeScript and Go SDKs are on the roadmap. In the meantime, you can use any language that supports HTTP/JSON to call the REST API directly. Check the Changelog for updates on new language support.
All SDK methods are async and must be called with await inside an async context. This design ensures your application never blocks on network I/O. If you need to call the SDK from synchronous code, use asyncio.run():
import asyncio
from synap import SynapSDK

sdk = SynapSDK(api_key="synap_abc123...", instance_id="inst_abc123")

# From synchronous code
result = asyncio.run(sdk.memories.create(
    document="User prefers dark mode.",
    document_type="ai-chat-conversation",
    user_id="user_123",
    mode="fast",
))
The SDK raises typed exceptions that map to HTTP error codes. Catch specific exceptions for fine-grained error handling:
from synap.exceptions import (
    SynapAuthError,       # 401, 403
    SynapNotFoundError,   # 404
    SynapRateLimitError,  # 429
    SynapServerError,     # 500, 503
    SynapValidationError, # 400
)

try:
    context = await sdk.conversation.context.fetch(
        conversation_id="conv_abc123",
        search_query=["user preferences"],
    )
except SynapRateLimitError as e:
    # Automatic retry with backoff is built into the SDK.
    # This exception is raised only after all retries are exhausted.
    print(f"Retry after {e.retry_after}s")
except SynapNotFoundError:
    print("Conversation not found")
The SDK automatically retries 429, 500, and 503 errors with exponential backoff. See Error Handling for the full reference.
Synap is framework-agnostic. The SDK operates independently of your LLM orchestration layer. Common integration patterns:
  • LangChain: Use sdk.conversation.context.fetch() in a custom retriever, then pass the context to your chain.
  • LlamaIndex: Use sdk.conversation.context.compacted() with format="system_prompt" and inject it into your query engine.
  • Direct: Call the SDK from your application code and pass context to any LLM API.
See the First Integration guide for detailed examples.
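Whichever pattern you choose, the fetched context usually ends up as prompt text. A minimal sketch of that last step follows; note that the item shape (the "type" and "content" fields) is an assumption for illustration, not the documented Synap response schema:

```python
def to_system_prompt(context_items: list[dict]) -> str:
    """Render fetched context items as a system-prompt block.

    Illustrative helper only: the 'type' and 'content' field names
    are assumptions, not Synap's documented response schema.
    """
    lines = ["Relevant user memory:"]
    for item in context_items:
        lines.append(f"- ({item['type']}) {item['content']}")
    return "\n".join(lines)

# Example with made-up context items:
items = [
    {"type": "preference", "content": "User prefers dark mode."},
    {"type": "fact", "content": "User's manager is John Smith."},
]
prompt = to_system_prompt(items)
```

The resulting string can be passed as a system message to any LLM API, a LangChain chain, or a LlamaIndex query engine.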

Memory

By default, memories are stored indefinitely (no TTL). You can configure retention policies per instance in the MACA configuration:
  • Time-based: Set storage.retention.default_ttl_days to automatically expire memories after a number of days.
  • Capacity-based: Set storage.retention.max_memories_per_scope to limit the number of memories per scope. When the limit is reached, the oldest low-confidence memories are evicted first.
You can also delete individual memories via the Delete Memory endpoint.
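As a sketch, the two retention knobs above might look like this in a MACA configuration update. The key names come from the bullets above; the values and the surrounding dict shape are illustrative, not platform defaults:

```python
# Illustrative retention settings; values are examples, not defaults.
retention_config = {
    "storage": {
        "retention": {
            "default_ttl_days": 90,           # expire memories after 90 days
            "max_memories_per_scope": 50_000, # then evict oldest low-confidence first
        }
    }
}
```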
Yes. Use the DELETE /v1/memories/{memory_id} endpoint to permanently delete a specific memory. Deletion removes the memory from both the vector store and graph store. Entity references are updated but the entities themselves are not deleted, as they may be referenced by other memories.
await sdk.memories.delete("mem_a1b2c3d4e5f67890")
Deletion is permanent and cannot be undone. See the Memory API for details.
The ingestion mode controls the speed-quality tradeoff of the extraction pipeline:
  • fast: highest speed, good quality. Best for real-time chat ingestion and high-volume streams.
  • accurate: moderate speed, highest quality. Best for important documents, support tickets, and onboarding conversations.
  • batch: queued, high quality. Best for bulk historical imports and off-peak processing.
fast uses lighter extraction models and skips some entity resolution steps. accurate uses the full pipeline with thorough entity resolution and re-ranking. batch queues documents for processing during off-peak hours for maximum throughput.
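The tradeoff can be encoded as a small routing helper. This is our own illustrative convention, not an SDK feature:

```python
def choose_mode(realtime: bool = False, bulk_import: bool = False) -> str:
    """Pick an ingestion mode per the tradeoffs above (illustrative only)."""
    if bulk_import:
        return "batch"    # maximum throughput, queued off-peak
    if realtime:
        return "fast"     # lowest latency, lighter extraction
    return "accurate"     # full pipeline with entity resolution and re-ranking

# Pass the result as mode= to sdk.memories.create(...)
```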
If you provide a document_id in the create memory request, Synap checks for duplicates. If a document with the same ID has already been ingested, the request is rejected with a 409 Conflict error.

If you do not provide a document_id, the document is ingested as a new record. The extraction pipeline may produce duplicate memories if the content overlaps with previously ingested documents. Entity resolution helps by linking entities across documents, but the memories themselves are stored independently.

For production use, we recommend always providing a document_id for deduplication.
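One way to get stable document_ids is to derive them from the content itself, so retries and re-imports hash to the same ID and trigger the 409 instead of creating duplicates. The hashing scheme below is our own convention, not something Synap mandates:

```python
import hashlib

def content_doc_id(content: str, prefix: str = "doc") -> str:
    """Deterministic document_id derived from the document text (illustrative)."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}_{digest}"

# Re-ingesting identical content yields the same ID, so Synap's
# duplicate check rejects the second attempt with 409 Conflict.
```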

Configuration

Configuration changes in Synap are designed for zero-downtime application:
  • Graceful mode (default): Synap waits for in-flight requests to complete before switching to the new configuration. No requests see mixed behavior.
  • Immediate mode: The new configuration takes effect immediately. Active requests may see a mix of old and new behavior during the transition window.
  • Canary mode: The new configuration is gradually rolled out to a percentage of traffic before full deployment.
See Apply Configuration for details on apply modes.
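For canary mode, the request might be shaped like the sketch below. This is a hypothetical example: the endpoint path is modeled on the rollback endpoint shown in these docs, and the "apply_mode" and "canary_percent" field names are assumptions; consult Apply Configuration for the real schema.

```python
import json
import urllib.request

# Hypothetical sketch of applying a configuration in canary mode.
instance_id = "inst_abc123"
config_id = "maca_cfg_001"
body = {
    "applied_by": "user_admin_001",
    "apply_mode": "canary",
    "canary_percent": 10,  # assumed knob: share of traffic on the new config
}
req = urllib.request.Request(
    f"https://api.synap.maximem.ai/instances/{instance_id}"
    f"/memory-architecture/apply/{config_id}",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Authorization": "Bearer synap_abc123",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send the request; omitted here.
```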
If a configuration is rejected during review, it cannot be applied. The rejection notes are stored alongside the configuration version for reference. To proceed, you have two options:
  1. Submit a new version: Create a new configuration update addressing the reviewer’s feedback. This creates a new version that goes through the review process again.
  2. Request re-review: If the rejection was a mistake, the reviewer can submit a new review with decision: "approve" on the same config ID (if the status allows it).
Rejected configurations remain visible in the version history for audit purposes.
When you apply a configuration, the response includes a rollback_config_id pointing to the previously active configuration. To rollback:
import httpx

resp = httpx.post(
    f"https://api.synap.maximem.ai/instances/{instance_id}/memory-architecture/rollback/{rollback_config_id}",
    headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
    json={
        "applied_by": "user_admin_001",
        "reason": "New config causing elevated error rates",
    },
)
Rollback re-applies the previous configuration. Memories ingested under the bad configuration are retained — only the processing rules change going forward. See Rollback Configuration.

Billing and Usage

Synap usage is measured across three dimensions:
  • API calls: Each HTTP request to the API counts as one API call. Batch endpoints count as a single call regardless of batch size.
  • Token usage: LLM tokens consumed during ingestion (extraction, categorization) and retrieval (re-ranking, compaction). Input and output tokens are tracked separately.
  • Storage: Total memories stored across all instances. Measured as a monthly peak.
Use the Dashboard Analytics to monitor your usage in real time.
Each HTTP request to any Synap API endpoint counts as one API call, including:
  • Memory ingestion (single and batch)
  • Context fetch and compaction
  • Configuration operations
  • Dashboard queries
  • Analytics queries
  • Status checks
Webhook deliveries do not count as API calls.

Troubleshooting

Common causes and solutions:
  1. Missing or malformed API key: Ensure the header is Authorization: Bearer synap_... with the Bearer prefix.
  2. Expired or rotated key: Check the Dashboard to verify the key is still active.
  3. Wrong instance: The API key may not have access to the instance you are targeting.
See Error Codes for the full list of auth-related errors.
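A quick way to rule out cause 1 is to construct the request and inspect the header before sending. The memory ID below is a placeholder; the point is the header shape:

```python
import urllib.request

api_key = "synap_abc123"  # placeholder key

req = urllib.request.Request(
    "https://api.synap.maximem.ai/v1/memories/mem_a1b2c3d4e5f67890",
    headers={"Authorization": f"Bearer {api_key}"},  # Bearer prefix is required
)
# A malformed header such as {"Authorization": api_key} (no Bearer prefix)
# is the most common cause of 401 responses.
```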
If context fetch returns empty results when you expect matches:
  1. Check ingestion status: Verify the ingestion completed successfully via GET /v1/memories/{ingestion_id}/status. Memories are not retrievable until ingestion completes.
  2. Check scope: Memories are scoped to the user/customer that was specified during ingestion. Context fetch only returns memories within the conversation’s scope chain.
  3. Check confidence threshold: Memories with confidence below the MACA threshold (default 0.7) are discarded during ingestion.
  4. Check memory types: If you are filtering by types in the fetch request, ensure the desired types are included.
  5. Check context budget: If the budget is very small, only the highest-ranked memories may fit.
Use the Dashboard monitoring tools to inspect the ingestion pipeline and stored memories for debugging.
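Cause 3 is the one that most often surprises people: an extraction that never cleared the confidence threshold was never stored at all, so no retrieval setting will surface it. A toy filter illustrating the effect of the default 0.7 threshold (the extraction dicts are made up for illustration):

```python
MACA_DEFAULT_THRESHOLD = 0.7  # default confidence threshold noted above

def retained(extractions: list[dict],
             threshold: float = MACA_DEFAULT_THRESHOLD) -> list[dict]:
    """Simulate ingestion-time filtering: below-threshold extractions are
    discarded, so they can never appear in context fetch results."""
    return [e for e in extractions if e["confidence"] >= threshold]

extractions = [
    {"content": "User prefers dark mode.", "confidence": 0.92},
    {"content": "User may like jazz.", "confidence": 0.55},
]
kept = retained(extractions)  # only the high-confidence extraction survives
```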
Steps for diagnosing retrieval problems:
  1. Get the correlation ID: Note the X-Correlation-Id from the fetch response.
  2. Check analytics: Use GET /v1/analytics/latency?operation=context_fetch to see if latency is abnormal.
  3. Try different modes: Switch from fast to accurate mode to see if graph traversal finds additional results.
  4. Broaden the query: Try more general search queries or remove type filters.
  5. Check compaction: If the context was recently compacted, some memories may have been summarized away. Use format: "full" to see both the narrative and structured extractions.
If the issue persists, contact support with the correlation ID and instance ID.
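For step 2, the analytics check is an ordinary GET. A sketch of building the query URL with the parameter named in the step (any additional parameters would be assumptions):

```python
from urllib.parse import urlencode

# Build the latency-analytics URL for the context_fetch operation.
base = "https://api.synap.maximem.ai/v1/analytics/latency"
url = f"{base}?{urlencode({'operation': 'context_fetch'})}"
# Include the X-Correlation-Id from step 1 when reporting results to support.
```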