Run through this checklist before every production deployment, not just the first one. Configuration changes, SDK upgrades, and new features each warrant a fresh review.

Authentication and Security

Credential management is the foundation of a secure Synap integration. A compromised API key gives an attacker full access to your instance’s memory store.
1

API key stored in a secrets manager

Never hardcode API keys in source code, commit them to version control in environment files, or bake them into Docker images. Use a dedicated secrets manager:
  • AWS: Secrets Manager or SSM Parameter Store
  • GCP: Secret Manager
  • Azure: Key Vault
  • Self-hosted: HashiCorp Vault
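# Good: loaded from secrets manager at runtime
import boto3
from maximem_synap import MaximemSynapSDK  # top-level import path is an assumption; check your SDK version

def get_api_key() -> str:
    """Fetch the Synap API key from AWS Secrets Manager."""
    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId="synap/api-key")
    return response["SecretString"]

sdk = MaximemSynapSDK(
    api_key=get_api_key()
)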
API key is stored in a secrets manager (not in code, .env files, or container images)
2

Webhook signature verification implemented

If you receive webhooks from Synap, always verify the signature before processing the payload. Unverified webhooks can be spoofed by attackers.
import hmac
import hashlib

def verify_webhook_signature(payload: bytes, signature: str, secret: str) -> bool:
    expected = hmac.new(
        secret.encode(), payload, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature)
Webhook signature verification is implemented and tested
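To cover the "tested" half of this item, one option is to sign a sample payload yourself and confirm that a tampered body or a wrong secret is rejected. The sketch below repeats the verification function so it runs on its own. The sha256= prefix follows the example above; the sign_payload helper, the secret, and the payload are made up for illustration.

```python
import hashlib
import hmac


def sign_payload(payload: bytes, secret: str) -> str:
    """Build a signature in the same "sha256=<hex>" form the verifier expects."""
    digest = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_webhook_signature(payload: bytes, signature: str, secret: str) -> bool:
    """Compare expected and received signatures in constant time."""
    return hmac.compare_digest(sign_payload(payload, secret), signature)


# Illustrative values only; a real secret comes from your secrets manager.
secret = "test-webhook-secret"
body = b'{"event": "ingestion.failed"}'
good_signature = sign_payload(body, secret)

assert verify_webhook_signature(body, good_signature, secret)
assert not verify_webhook_signature(body + b" ", good_signature, secret)   # tampered body
assert not verify_webhook_signature(body, good_signature, "wrong-secret")  # wrong secret
```

Verify against the raw request bytes, before any JSON parsing, since re-serializing the body can change whitespace and invalidate the signature.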
3

API key rotation schedule established

API keys should be rotated periodically. You can have multiple active keys per instance, so rotation is zero-downtime: generate a new key, roll it out, then revoke the old one.
  • Recommended rotation cadence: every 90 days for standard deployments, every 30 days for high-security environments
  • Document the rotation procedure in your team’s runbook
  • Automate rotation if possible (e.g., via a cron job or CI/CD step)
API key rotation schedule is established and documented
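If rotation is automated, the scheduled job needs a way to decide whether a key is due. A minimal sketch, assuming you record each key's creation time yourself (the rotation_due helper is hypothetical and not part of the SDK):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def rotation_due(
    created_at: datetime,
    cadence_days: int = 90,
    now: Optional[datetime] = None,
) -> bool:
    """Return True once the key is at least cadence_days old."""
    now = now or datetime.now(timezone.utc)
    return now - created_at >= timedelta(days=cadence_days)


created = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Standard deployment: 90-day cadence.
print(rotation_due(created, 90, now=created + timedelta(days=89)))
print(rotation_due(created, 90, now=created + timedelta(days=90)))

# High-security environment: 30-day cadence.
print(rotation_due(created, 30, now=created + timedelta(days=45)))
```

A cron job or CI/CD step could run this check daily and open a ticket or start the rotation procedure when it returns True.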

SDK Configuration

Proper SDK configuration ensures your integration performs well under production load and does not generate excessive logging or resource usage.
1

Log level set appropriately

In production, set log_level to "WARNING" or "ERROR". The "DEBUG" and "INFO" levels generate high-volume output that degrades performance and can expose sensitive information in log aggregators.
config = SDKConfig(
    log_level="WARNING"    # Not "DEBUG" or "INFO" in production
)
log_level is set to "WARNING" or "ERROR" (not "DEBUG" or "INFO")
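One way to keep a stray "DEBUG" setting out of production is to clamp the level when configuration is built. The sketch below is an illustration only: the resolve_log_level helper and the environment names are assumptions, and the returned string is what you would pass as log_level.

```python
ALLOWED_PRODUCTION_LEVELS = {"WARNING", "ERROR"}


def resolve_log_level(environment: str, requested: str) -> str:
    """Return the requested level, falling back to WARNING in production
    when the request is more verbose than the checklist allows."""
    level = requested.upper()
    if environment == "production" and level not in ALLOWED_PRODUCTION_LEVELS:
        return "WARNING"
    return level


print(resolve_log_level("production", "debug"))
print(resolve_log_level("staging", "debug"))
print(resolve_log_level("production", "error"))
```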
2

Timeouts configured for your SLA

Default timeouts are suitable for most applications, but review them against your latency requirements:
| Timeout | Default | Guidance |
| --- | --- | --- |
| connect | 5s | Increase to 10s if your infrastructure has high network latency |
| read | 30s | Decrease for latency-sensitive paths; increase for large batch operations |
| write | 10s | Usually sufficient; increase for large document ingestion |
| stream_idle | 60s | gRPC streaming idle timeout; increase for low-traffic streams |
config = SDKConfig(
    timeouts=TimeoutConfig(
        connect=5.0,
        read=30.0,
        write=10.0,
        stream_idle=60.0
    )
)
Timeouts are reviewed and aligned with your application’s SLA requirements
3

Retry policy tuned

The default retry policy (3 attempts, exponential backoff with jitter) is appropriate for most use cases. Adjust if needed:
  • High-throughput systems: Reduce max_attempts to 2 to avoid retry storms
  • Critical operations: Increase max_attempts to 5 for reliability
  • Low-latency paths: Reduce backoff_max to limit total retry time
config = SDKConfig(
    retry_policy=RetryPolicy(
        max_attempts=3,
        backoff_base=1.0,
        backoff_max=10.0,
        backoff_jitter=True     # Always enable jitter in production
    )
)
Retry policy is reviewed and tuned for your workload profile
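To see how retry settings affect worst-case latency, it helps to work out the backoff schedule by hand. The sketch below is not the SDK's implementation and rests on three assumptions: max_attempts counts the initial call, the delay doubles on each retry starting at backoff_base and is capped at backoff_max, and jitter scales each delay by a random factor between 0 and 1.

```python
import random
from typing import List, Optional


def backoff_delays(
    max_attempts: int,
    backoff_base: float,
    backoff_max: float,
    jitter: bool = False,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Delays (seconds) slept between attempts under the assumed doubling rule."""
    rng = rng or random.Random()
    delays = []
    for retry_index in range(max_attempts - 1):  # no sleep after the last attempt
        delay = min(backoff_max, backoff_base * (2 ** retry_index))
        if jitter:
            delay = rng.uniform(0, delay)
        delays.append(delay)
    return delays


def worst_case_seconds(
    max_attempts: int, read_timeout: float, backoff_base: float, backoff_max: float
) -> float:
    """Upper bound: every attempt hits the read timeout, every delay is un-jittered."""
    waits = backoff_delays(max_attempts, backoff_base, backoff_max)
    return max_attempts * read_timeout + sum(waits)


print(backoff_delays(3, 1.0, 10.0))            # default policy
print(worst_case_seconds(3, 30.0, 1.0, 10.0))  # defaults with a 30s read timeout
print(worst_case_seconds(2, 5.0, 1.0, 10.0))   # tighter settings for a low-latency path
```

Under these assumptions the defaults allow a single call to take over a minute and a half in the worst case, which is why latency-sensitive paths usually need a lower read timeout or fewer attempts.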
4

Cache backend enabled

The SQLite cache backend significantly improves retrieval performance for repeated queries. Ensure it is enabled:
config = SDKConfig(
    cache_backend="sqlite"     # Not None
)
cache_backend is set to "sqlite" for production performance
5

Session timeout configured

The session_timeout_minutes setting controls how long an authenticated session lasts before requiring re-authentication. The default is appropriate for most cases, but adjust based on your security requirements:
  • Standard applications: 60-480 minutes (1-8 hours)
  • High-security environments: 5-30 minutes
  • Long-running batch processes: 720-1440 minutes (12-24 hours)
session_timeout_minutes is configured appropriately (range: 5-1440)
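A small guard at configuration time can catch values outside the documented range before they reach the SDK. This is a sketch; the validate_session_timeout helper is hypothetical, and only the 5-1440 bounds come from the checklist.

```python
MIN_SESSION_MINUTES = 5
MAX_SESSION_MINUTES = 1440


def validate_session_timeout(minutes: int) -> int:
    """Return minutes unchanged if it falls inside the 5-1440 range."""
    if not MIN_SESSION_MINUTES <= minutes <= MAX_SESSION_MINUTES:
        raise ValueError(
            f"session_timeout_minutes must be between {MIN_SESSION_MINUTES} "
            f"and {MAX_SESSION_MINUTES}, got {minutes}"
        )
    return minutes


print(validate_session_timeout(60))    # standard application
print(validate_session_timeout(1440))  # long-running batch process
```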

Memory Architecture

Synap generates each Instance’s memory configuration automatically from the use-case file you upload. Before going to production, make sure that file reflects the agent you are actually deploying.
1

Use-case file accurately describes the production agent

The use-case Markdown you uploaded at instance creation drives every memory decision: which categories are extracted, how scopes are partitioned, what retention behavior applies. Review it now and re-upload an updated version if the agent’s purpose, audience, or compliance requirements have shifted since you created the Instance.
Use-case file reflects the production agent’s behavior, audience, and compliance posture
2

Verify retrieval quality on representative queries

Before going live, run at least 10 representative production queries against the Instance and confirm the returned memories are relevant and complete, so you catch retrieval drift before users do.
Retrieval quality validated on at least 10 representative queries
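A lightweight harness makes this check repeatable across deployments. The sketch below scores a set of queries by whether the returned text mentions the keywords you expect. The fetch callable is a placeholder for your own wrapper around the retrieval call, and the keyword-match criterion is a deliberately simple assumption; the sample queries and memories are invented.

```python
from typing import Callable, Dict, List, Sequence


def retrieval_hit_rate(
    cases: Dict[str, Sequence[str]],
    fetch: Callable[[str], List[str]],
) -> float:
    """Return the fraction of queries whose results contain every expected keyword."""
    if not cases:
        raise ValueError("provide at least one query")
    hits = 0
    for query, expected_keywords in cases.items():
        combined = " ".join(fetch(query)).lower()
        if all(keyword.lower() in combined for keyword in expected_keywords):
            hits += 1
    return hits / len(cases)


# Stub standing in for a real retrieval call; replace with your own wrapper.
_FAKE_MEMORIES = {
    "What plan is the customer on?": ["Customer upgraded to the Pro plan in March"],
    "Preferred contact method?": ["Prefers phone calls over chat"],
}


def fake_fetch(query: str) -> List[str]:
    return _FAKE_MEMORIES.get(query, [])


cases = {
    "What plan is the customer on?": ["pro plan"],
    "Preferred contact method?": ["email"],
}
print(f"hit rate: {retrieval_hit_rate(cases, fake_fetch):.0%}")
```

Keeping the query set in version control lets you rerun the same check after use-case file changes or SDK upgrades.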
3

Confirm the Instance is in the active state

Check the Dashboard to confirm your Instance has moved from initializing to active and that its memory architecture has been generated and applied. Do not start production traffic on an Instance that is still initializing.
Instance status is active and ready to accept traffic

Error Handling

Robust error handling ensures your application degrades gracefully when Synap encounters issues, rather than crashing or returning empty responses.
1

All SynapError subtypes caught appropriately

Handle transient and permanent errors differently:
from maximem_synap.errors import (
    SynapError,
    NetworkTimeoutError,
    RateLimitError,
    ServiceUnavailableError,
    InvalidInputError,
    AuthenticationError,
)

try:
    context = await sdk.conversation.context.fetch(
        conversation_id=conv_id,
        search_query=[query],
        mode="fast"
    )
except (NetworkTimeoutError, ServiceUnavailableError) as e:
    # Transient: retry or fall back to no-memory mode
    logger.warning(
        "Synap unavailable (transient), proceeding without memory: %s "
        "(correlation_id=%s)", e, e.correlation_id
    )
    context = None
except RateLimitError as e:
    # Transient: respect retry_after
    logger.warning(
        "Rate limited, retry after %s seconds (correlation_id=%s)",
        e.retry_after_seconds, e.correlation_id
    )
    context = None
except InvalidInputError as e:
    # Permanent: fix the request
    logger.error(
        "Invalid request to Synap: %s (correlation_id=%s)",
        e, e.correlation_id
    )
    raise
except AuthenticationError as e:
    # Permanent: credentials issue
    logger.critical(
        "Synap auth failed: %s (correlation_id=%s)",
        e, e.correlation_id
    )
    raise
Error handling distinguishes between transient and permanent errors
2

Errors logged with correlation_id

Every SynapError includes a correlation_id field. Always log it, for transient and permanent errors alike: it is the primary identifier Synap support uses to trace issues.
except SynapError as e:
    logger.error(
        "Synap error: %s (correlation_id=%s)",
        e, e.correlation_id
    )
All error logs include the correlation_id from the Synap error
3

Graceful degradation implemented

Your application should continue functioning when Synap is unavailable — just without memory context. This is the single most important resilience pattern.
async def get_memory_context(sdk, conversation_id, query):
    """Retrieve memory context, returning None if unavailable."""
    try:
        return await sdk.conversation.context.fetch(
            conversation_id=conversation_id,
            search_query=[query],
            max_results=5,
            mode="fast"
        )
    except SynapError as e:
        logger.warning(
            "Memory retrieval failed, proceeding without context: %s "
            "(correlation_id=%s)", e, e.correlation_id
        )
        return None

# In your chat handler:
context = await get_memory_context(sdk, conv_id, user_message)

if context and context.facts:
    # Build enriched prompt with memories
    system_prompt = build_prompt_with_memories(context)
else:
    # Fall back to generic prompt -- your app still works
    system_prompt = build_generic_prompt()
Application continues working (without memory) when Synap is unavailable
4

Rate limit handling with retry_after

When you receive a RateLimitError, respect the retry_after_seconds field before retrying:
except RateLimitError as e:
    await asyncio.sleep(e.retry_after_seconds)
    # Retry the operation
Rate limit errors are handled with proper backoff using retry_after_seconds
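The fragment above can be expanded into a bounded retry loop. The sketch below uses a stand-in RateLimitError class so it runs without the SDK; only the retry_after_seconds attribute is taken from the documentation. The call_with_rate_limit_retry helper, the attempt limit, and the wait cap are assumptions.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Stand-in for the SDK's RateLimitError so this sketch runs on its own."""

    def __init__(self, retry_after_seconds: float):
        super().__init__(f"rate limited, retry after {retry_after_seconds}s")
        self.retry_after_seconds = retry_after_seconds


async def call_with_rate_limit_retry(
    operation: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    max_wait_seconds: float = 30.0,
) -> T:
    """Run operation, sleeping for the server-provided delay between attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller degrade gracefully
            # Cap the wait so one request cannot stall a user-facing path.
            await asyncio.sleep(min(exc.retry_after_seconds, max_wait_seconds))
    raise AssertionError("unreachable")
```

Capping both the number of attempts and the wait keeps rate limiting from turning into unbounded latency. When the final attempt fails, the caller can fall back to the no-memory path described under graceful degradation.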

Monitoring

Observability is critical for understanding how your Synap integration performs in production and catching issues before they impact users.
1

Dashboard analytics reviewed regularly

The Synap Dashboard provides real-time analytics for each instance:
  • API call volume and success rate
  • Memory counts by category and scope
  • Ingestion throughput and processing latency
  • Retrieval latency percentiles (P50, P95, P99)
Establish a regular review cadence (at least weekly).
[Figure: Dashboard analytics overview showing API volume, memory counts, and latency]
Dashboard analytics are reviewed on a regular schedule
2

Webhooks configured for critical events

Set up webhooks to receive notifications for important events:
  • ingestion.failed — ingestion pipeline errors
  • credential.expiring — credentials approaching expiration
  • config.applied — configuration changes
  • retention.cleanup — memory retention cleanup completed
See Dashboard Webhooks for setup instructions.
Webhooks are configured for critical operational events
3

P95 latency baseline established

Before deploying to production, establish latency baselines by running representative queries in staging:
| Operation | Expected P95 | Alert Threshold |
| --- | --- | --- |
| memories.create() (fast) | <100ms | >300ms |
| context.fetch() (fast) | <150ms | >500ms |
| context.fetch() (accurate) | <600ms | >1500ms |
| memories.batch_create() | <500ms | >2000ms |
Adjust thresholds based on your staging measurements.
P95 latency baselines are established from staging measurements
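If your staging run produces raw latency samples rather than percentiles, P95 can be computed directly. The sketch below uses the nearest-rank method, which is one of several common percentile definitions. The sample values are invented, and the 500 ms threshold is the context.fetch() (fast) alert threshold from the table above.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of
    the samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Invented staging measurements in milliseconds for context.fetch() (fast).
latencies_ms = [82, 95, 101, 88, 140, 97, 110, 93, 620, 105]

p95 = percentile(latencies_ms, 95)
print(f"P95 = {p95} ms, alert = {p95 > 500}")
```

With only ten samples a single outlier determines P95, so collect enough measurements for the baseline to be stable before setting alert thresholds from it.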
4

Error rate alerts configured

Set up alerts in your monitoring system (Datadog, PagerDuty, CloudWatch, etc.) for:
  • Synap API error rate exceeding 1% over 5 minutes
  • Authentication failures (any occurrence)
  • Rate limit hits exceeding your expected threshold
  • Retrieval returning zero results when memories are expected
Error rate alerts are configured in your monitoring platform
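Most monitoring platforms evaluate the first rule for you. If you need to compute it in application code, a sliding window is enough. The sketch below is a minimal in-process version with a hypothetical ErrorRateWindow class; it assumes timestamps arrive in non-decreasing order and that the caller supplies them, for example from time.monotonic().

```python
from collections import deque
from typing import Deque, Tuple


class ErrorRateWindow:
    """Tracks call outcomes over a sliding time window."""

    def __init__(self, window_seconds: float = 300.0, threshold: float = 0.01):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._events: Deque[Tuple[float, bool]] = deque()  # (timestamp, is_error)

    def record(self, timestamp: float, is_error: bool) -> None:
        self._events.append((timestamp, is_error))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop events that have aged out of the window.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            self._events.popleft()

    def error_rate(self, now: float) -> float:
        self._evict(now)
        if not self._events:
            return 0.0
        errors = sum(1 for _, is_error in self._events if is_error)
        return errors / len(self._events)

    def should_alert(self, now: float) -> bool:
        """True when the error rate over the window exceeds the threshold."""
        return self.error_rate(now) > self.threshold
```

The defaults mirror the first alert above: more than 1% errors over 5 minutes.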
5

Cost tracking enabled

If your Synap plan includes usage-based pricing, track your usage against budget:
  • API call volume (ingestion + retrieval)
  • Storage usage (vector + graph)
  • Bandwidth usage
The Dashboard provides usage breakdowns on the billing page.
Usage and cost tracking is enabled and reviewed regularly

Performance

Optimization ensures your integration meets latency requirements and minimizes unnecessary resource usage.
1

Using fast mode for latency-sensitive paths

Use mode="fast" for any operation in the critical path of user-facing requests. Reserve mode="accurate" for background tasks, research queries, or paths where the user is willing to wait.
# Real-time chat: use fast mode
context = await sdk.conversation.context.fetch(
    conversation_id=conv_id,
    search_query=[query],
    mode="fast"           # ~50-100ms
)

# Background analysis: use accurate mode
context = await sdk.conversation.context.fetch(
    conversation_id=conv_id,
    search_query=[query],
    mode="accurate"       # ~200-500ms, better precision
)
Fast mode is used for all latency-sensitive code paths
2

Batch ingestion for bulk operations

When ingesting multiple documents, use batch_create() instead of multiple create() calls:
# Good: single batch call
await sdk.memories.batch_create(
    documents=[
        {"document": doc1, "document_type": "document", "user_id": "user_1"},
        {"document": doc2, "document_type": "email", "user_id": "user_1"},
        {"document": doc3, "document_type": "pdf", "user_id": "user_2"},
    ],
    fail_fast=False    # Continue processing even if one document fails
)

# Avoid: N sequential calls
for doc in documents:
    await sdk.memories.create(document=doc, ...)  # Slower, more API calls
Batch ingestion is used for all bulk operations
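For very large imports you may still want to split the work into several batch calls. The checklist does not state a maximum batch size, so the size used below is a placeholder to tune against your own limits, and the chunked helper is hypothetical. A minimal chunking sketch:

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of items with at most batch_size elements each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


documents = [f"doc-{i}" for i in range(5)]
for batch in chunked(documents, batch_size=2):  # batch_size is a placeholder value
    print(batch)  # each batch would be passed to one batch_create() call
```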
3

Context compaction enabled for long conversations

For conversations that span many turns, use context compaction to keep the context within your LLM’s token budget:
result = await sdk.conversation.context.compact(
    conversation_id=conv_id,
    strategy="adaptive",      # Automatically adjusts compression level
    target_tokens=2000
)

compacted = await sdk.conversation.context.get_compacted(
    conversation_id=conv_id,
    format="injection-ready"  # Ready to insert into your LLM prompt
)
| Strategy | Compression | Best For |
| --- | --- | --- |
| conservative | ~70% retention | Important conversations, legal/compliance |
| balanced | ~40% retention | General use |
| aggressive | ~15% retention | Very long conversations, cost optimization |
| adaptive | Variable | Recommended default; adjusts based on content |
Context compaction is configured for conversations that may exceed token budgets
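The retention figures in the table allow a rough estimate of whether a fixed strategy will fit a token budget. The sketch below treats those approximate percentages as exact ratios, which is an assumption, and the least_aggressive_fit helper is hypothetical. Use it for back-of-the-envelope planning only; adaptive remains the recommended default.

```python
from typing import Optional

# Approximate retention ratios from the strategy table, gentlest first.
RETENTION = {"conservative": 0.70, "balanced": 0.40, "aggressive": 0.15}


def least_aggressive_fit(current_tokens: int, target_tokens: int) -> Optional[str]:
    """Return the gentlest fixed strategy whose estimated output fits the target,
    or None if even the aggressive strategy is estimated to overshoot."""
    for strategy, retention in RETENTION.items():
        if current_tokens * retention <= target_tokens:
            return strategy
    return None


print(least_aggressive_fit(2500, 2000))   # 2500 * 0.70 = 1750 fits
print(least_aggressive_fit(4000, 2000))   # 2800 overshoots, 1600 fits
print(least_aggressive_fit(20000, 2000))  # 20000 * 0.15 = 3000 still overshoots
```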
4

Cache is enabled

Verify the cache backend is active and functioning:
stats = sdk.cache.stats()
print(f"Cache entries: {stats['total_entries']}")
print(f"Cache size: {stats['total_bytes']} bytes")
print(f"Backends: {stats['backends']}")
cache.stats() is synchronous and returns a dict with the keys enabled, client_id, base_path, total_entries, and total_bytes, plus per-backend stats under backends. If enabled is False, or total_entries stays at 0 over time, the cache backend is not engaged; check SDKConfig.cache_backend.
Cache backend is enabled and accumulating entries
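The rule above can be turned into a small health check for a startup probe or a periodic job. The sketch below operates on a plain dict shaped like the cache.stats() result described above. The cache_is_engaged helper is hypothetical and the sample values are invented.

```python
from typing import Any, Dict


def cache_is_engaged(stats: Dict[str, Any]) -> bool:
    """True when the cache reports itself enabled and holds at least one entry."""
    return bool(stats.get("enabled")) and stats.get("total_entries", 0) > 0


# Invented sample shaped like the documented cache.stats() keys.
sample_stats = {
    "enabled": True,
    "total_entries": 42,
    "total_bytes": 18_432,
    "backends": {},
}

print(cache_is_engaged(sample_stats))
print(cache_is_engaged({"enabled": True, "total_entries": 0}))
print(cache_is_engaged({"enabled": False, "total_entries": 10}))
```

A freshly started process will legitimately report zero entries, so run this check after some traffic has passed rather than at boot.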

Operational Readiness

Beyond code and configuration, production readiness requires documented procedures and team alignment.
1

Team roles assigned appropriately

In the Synap Dashboard, assign roles based on the principle of least privilege:
| Role | Capabilities | Assign To |
| --- | --- | --- |
| Owner | Full access, billing, delete instance | Engineering lead, CTO |
| Admin | Config changes, key management, analytics | Senior engineers, DevOps |
| Developer | Read analytics, view config (no changes) | All developers |
| Viewer | Read-only Dashboard access | Product managers, support |
Team members have appropriate roles (not everyone is Owner)
2

Credential rotation runbook documented

Document the step-by-step procedure for rotating API keys:
  1. Generate new key in Dashboard
  2. Update secrets manager with new key
  3. Deploy application with updated secret reference
  4. Verify new key is working (check Dashboard for API calls)
  5. Revoke old key after grace period (48 hours)
Credential rotation runbook is documented and accessible to the operations team
3

Support channel documented

Ensure your team knows how to get help:
Support channels are documented and the team knows how to report issues

Quick Summary

Use this condensed checklist for quick pre-deployment reviews:
| Area | Item | Status |
| --- | --- | --- |
| Security | API key in secrets manager | |
| Security | Webhook signatures verified | |
| Security | API key rotation scheduled | |
| SDK | Log level set to WARNING/ERROR | |
| SDK | Timeouts match SLA | |
| SDK | Cache enabled (sqlite) | |
| Memory | Use-case file matches the production agent | |
| Memory | Retrieval quality validated on representative queries | |
| Memory | Instance status is active | |
| Errors | Transient vs permanent handling | |
| Errors | Graceful degradation | |
| Errors | correlation_id in logs | |
| Monitoring | Dashboard reviewed regularly | |
| Monitoring | Webhooks for critical events | |
| Monitoring | Latency alerts configured | |
| Performance | Fast mode for user-facing paths | |
| Performance | Batch ingestion for bulk ops | |
| Operations | Roles assigned (least privilege) | |
| Operations | Rotation + rollback runbooks | |

Next Steps

Migrate from Competitors

Map your existing memory system (Mem0, Zep, Letta, SuperMemory) to Synap.

Monitoring and Analytics

Deep dive into Dashboard analytics and monitoring capabilities.

Error Handling

Complete reference for all Synap error types and handling patterns.

Webhooks

Configure webhooks for real-time event notifications.