Run through this checklist before every production deployment, not just the first one. Configuration changes, SDK upgrades, and new features each warrant a fresh review.

Authentication and Security

Credential management is the foundation of a secure Synap integration. A compromised bootstrap key or API key gives an attacker full access to your instance’s memory store.
1. Bootstrap key stored in a secrets manager

Never hardcode bootstrap keys in source code, environment files committed to version control, or Docker images. Use a proper secrets manager:
  • AWS: Secrets Manager or SSM Parameter Store
  • GCP: Secret Manager
  • Azure: Key Vault
  • Self-hosted: HashiCorp Vault
# Good: loaded from secrets manager at runtime
import os

import boto3

from maximem_synap import MaximemSynapSDK  # adjust to your SDK's import path

def get_bootstrap_token():
    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId="synap/bootstrap-token")
    return response["SecretString"]

sdk = MaximemSynapSDK(
    instance_id=os.environ["SYNAP_INSTANCE_ID"],
    bootstrap_token=get_bootstrap_token()
)
Bootstrap key is stored in a secrets manager (not in code, .env files, or container images)
2. Credentials source configured for production

Set credentials_source appropriately for your deployment environment:
  • "file" (default): Credentials stored on local filesystem at storage_path. Suitable for persistent servers.
  • "env": Credentials loaded from environment variables. Suitable for containers, serverless, and ephemeral environments.
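If you select the source programmatically, a small helper keeps the decision explicit. A minimal sketch (the helper name and the container-detection heuristic are illustrative; adapt the check to your infrastructure):

```python
import os

def choose_credentials_source() -> str:
    """Pick a credentials_source value based on the runtime environment.

    Heuristic only: the presence of a Kubernetes service-host variable is
    treated as a container signal; adjust for your own deployment model.
    """
    if os.environ.get("KUBERNETES_SERVICE_HOST"):
        return "env"
    return "file"

# Usage sketch: SDKConfig(credentials_source=choose_credentials_source())
```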
credentials_source is set to match your deployment model
3. Webhook signature verification implemented

If you receive webhooks from Synap, always verify the signature before processing the payload. Unverified webhooks can be spoofed by attackers.
import hmac
import hashlib

def verify_webhook_signature(payload: bytes, signature: str, secret: str) -> bool:
    expected = hmac.new(
        secret.encode(), payload, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature)
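Before go-live, exercise the verifier with a locally signed payload. A minimal self-test (the event body and secret are made-up values; the function is repeated so the snippet runs standalone):

```python
import hashlib
import hmac

def verify_webhook_signature(payload: bytes, signature: str, secret: str) -> bool:
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature)

# Sign a payload the way the sender would, then confirm it round-trips.
payload = b'{"event": "ingestion.failed"}'
secret = "whsec_test"
good = "sha256=" + hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()

assert verify_webhook_signature(payload, good, secret)
assert not verify_webhook_signature(payload, "sha256=deadbeef", secret)
```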
Webhook signature verification is implemented and tested
4. API key rotation schedule established

API keys should be rotated periodically. Synap supports graceful rotation with a 48-hour overlap window where both old and new keys are valid.
  • Recommended rotation cadence: every 90 days for standard deployments, every 30 days for high-security environments
  • Document the rotation procedure in your team’s runbook
  • Automate rotation if possible (e.g., via a cron job or CI/CD step)
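To make the cadence enforceable, compute the due date from the last rotation and check it in a scheduled job. A sketch using the cadences above (function names are illustrative):

```python
from datetime import date, timedelta

# Cadences from this checklist: 90 days standard, 30 days high-security.
ROTATION_DAYS = {"standard": 90, "high_security": 30}

def next_rotation_due(last_rotated: date, profile: str = "standard") -> date:
    """Date by which the API key should next be rotated."""
    return last_rotated + timedelta(days=ROTATION_DAYS[profile])

def rotation_overdue(last_rotated: date, today: date,
                     profile: str = "standard") -> bool:
    """True when the key has outlived its rotation window."""
    return today > next_rotation_due(last_rotated, profile)
```

Run the check from the same cron job or CI step that performs the rotation, and alert when `rotation_overdue` returns True.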
API key rotation schedule is established and documented
5. Bootstrap key revoked after initial setup

The bootstrap key is consumed on first use, but revoking it explicitly in the Dashboard confirms it cannot be reused (even if the consumption state is somehow lost).
Bootstrap key has been explicitly revoked in the Dashboard after successful initialization

SDK Configuration

Proper SDK configuration ensures your integration performs well under production load and does not generate excessive logging or resource usage.
1. Log level set appropriately

In production, set log_level to "WARNING" or "ERROR". The "DEBUG" and "INFO" levels generate high-volume output that degrades performance and can expose sensitive information in log aggregators.
config = SDKConfig(
    log_level="WARNING"    # Not "DEBUG" or "INFO" in production
)
log_level is set to "WARNING" or "ERROR" (not "DEBUG" or "INFO")
2. Timeouts configured for your SLA

Default timeouts are suitable for most applications, but review them against your latency requirements:
| Timeout | Default | Guidance |
|---|---|---|
| connect | 5s | Increase to 10s if your infrastructure has high network latency |
| read | 30s | Decrease for latency-sensitive paths; increase for large batch operations |
| write | 10s | Usually sufficient; increase for large document ingestion |
| stream_idle | 60s | gRPC streaming idle timeout; increase for low-traffic streams |
config = SDKConfig(
    timeouts=TimeoutConfig(
        connect=5.0,
        read=30.0,
        write=10.0,
        stream_idle=60.0
    )
)
Timeouts are reviewed and aligned with your application’s SLA requirements
3. Retry policy tuned

The default retry policy (3 attempts, exponential backoff with jitter) is appropriate for most use cases. Adjust if needed:
  • High-throughput systems: Reduce max_attempts to 2 to avoid retry storms
  • Critical operations: Increase max_attempts to 5 for reliability
  • Low-latency paths: Reduce backoff_max to limit total retry time
config = SDKConfig(
    retry_policy=RetryPolicy(
        max_attempts=3,
        backoff_base=1.0,
        backoff_max=10.0,
        backoff_jitter=True     # Always enable jitter in production
    )
)
Retry policy is reviewed and tuned for your workload profile
4. Cache backend enabled

The SQLite cache backend significantly improves retrieval performance for repeated queries. Ensure it is enabled:
config = SDKConfig(
    cache_backend="sqlite"     # Not None
)
cache_backend is set to "sqlite" for production performance
5. Session timeout configured

The session_timeout_minutes setting controls how long an authenticated session lasts before requiring re-authentication. The default is appropriate for most cases, but adjust based on your security requirements:
  • Standard applications: 60-480 minutes (1-8 hours)
  • High-security environments: 5-30 minutes
  • Long-running batch processes: 720-1440 minutes (12-24 hours)
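One way to keep these choices consistent across services is a small profile table validated against the supported range. A sketch (profile names and values are illustrative starting points):

```python
# Suggested session timeouts in minutes, within the supported 5-1440 range.
SESSION_TIMEOUTS = {
    "standard": 240,          # 4 hours
    "high_security": 15,
    "batch": 1440,            # 24 hours
}

def session_timeout_for(profile: str) -> int:
    """Look up and validate the session timeout for a deployment profile."""
    minutes = SESSION_TIMEOUTS[profile]
    if not 5 <= minutes <= 1440:
        raise ValueError(f"{minutes} is outside the supported 5-1440 range")
    return minutes

# Usage sketch: SDKConfig(session_timeout_minutes=session_timeout_for("standard"))
```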
session_timeout_minutes is configured appropriately (range: 5-1440)

Memory Architecture

Your MACA configuration directly impacts memory quality, retrieval accuracy, and storage costs. Review it thoroughly before production.
1. MACA config reviewed and approved

Do not deploy with the default configuration. Create a config tailored to your use case and have it reviewed by your team. Refer to the Configuring Memory Guide for detailed guidance.
MACA config has been customized for your use case (not using defaults)
2. Dry-run tested before applying

Always validate configuration changes with a dry run before applying:
result = await admin_client.config.validate(
    instance_id="inst_xxx",
    config_yaml=config_content,
    dry_run=True
)
assert result.valid, f"Config validation failed: {result.errors}"
Config changes have been validated with dry_run=True before applying
3. Rollback plan documented

Know which config version you will roll back to if issues arise. Document:
  • The current stable version number
  • The rollback command or procedure
  • Expected impact of rolling back
  • Who is authorized to execute the rollback
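One lightweight way to keep this actionable is to encode the plan next to your deployment scripts, so the four items above never go stale. A sketch (the structure and every value shown are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConfigRollbackPlan:
    """The four items this checklist asks you to document, as data."""
    stable_version: int       # current known-good config version
    procedure: str            # rollback command or Dashboard steps
    expected_impact: str      # what reverting changes for users
    authorized: tuple         # who may execute the rollback

PLAN = ConfigRollbackPlan(
    stable_version=12,        # example value
    procedure="Roll back to stable_version via the Admin API or Dashboard",
    expected_impact="Retrieval behavior reverts to the version-12 settings",
    authorized=("on-call lead", "platform admin"),
)
```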
Rollback plan is documented and the team knows how to execute it
4. Retention policy set

Unbounded memory growth increases storage costs and can degrade retrieval quality over time. Set a reasonable max_memory_age_days based on your use case:
storage:
  retention:
    max_memory_age_days: 365    # Not 0 (unlimited) unless justified
Retention policy is set to prevent unbounded memory growth
5. Context budget aligned with LLM context window

Ensure retrieval.context_budget.max_tokens does not exceed the space available in your LLM’s context window after accounting for the system prompt, user message, and expected response length.
| LLM Window | System Prompt | User + Response | Available for Memories |
|---|---|---|---|
| 8K | ~500 tokens | ~3K tokens | ~4K tokens |
| 32K | ~500 tokens | ~8K tokens | ~8-16K tokens |
| 128K | ~500 tokens | ~16K tokens | ~16-32K tokens |
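This arithmetic can be checked in code when you assemble prompts. A minimal sketch (the 10% safety margin for tokenizer variance is an assumption, not an SDK feature):

```python
def memory_token_budget(window: int, system_prompt: int,
                        user_plus_response: int,
                        safety_margin: float = 0.1) -> int:
    """Tokens left for injected memories after fixed prompt overhead."""
    available = window - system_prompt - user_plus_response
    return max(0, int(available * (1 - safety_margin)))

# 8K window, ~500-token system prompt, ~3K for user message + response:
budget = memory_token_budget(8_000, 500, 3_000)
```

Set retrieval.context_budget.max_tokens to at most this value for your smallest target model.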
context_budget.max_tokens fits within your LLM’s available context space

Error Handling

Robust error handling ensures your application degrades gracefully when Synap encounters issues, rather than crashing or returning empty responses.
1. All SynapError subtypes caught appropriately

Handle transient and permanent errors differently:
from maximem_synap.errors import (
    SynapError,
    NetworkTimeoutError,
    RateLimitError,
    ServiceUnavailableError,
    InvalidInputError,
    AuthenticationError,
)

try:
    context = await sdk.conversation.context.fetch(
        conversation_id=conv_id,
        search_query=[query],
        mode="fast"
    )
except (NetworkTimeoutError, ServiceUnavailableError) as e:
    # Transient: retry or fall back to no-memory mode
    logger.warning(
        "Synap unavailable (transient), proceeding without memory: %s "
        "(correlation_id=%s)", e, e.correlation_id
    )
    context = None
except RateLimitError as e:
    # Transient: respect retry_after
    logger.warning(
        "Rate limited, retry after %s seconds (correlation_id=%s)",
        e.retry_after_seconds, e.correlation_id
    )
    context = None
except InvalidInputError as e:
    # Permanent: fix the request
    logger.error("Invalid request to Synap: %s", e)
    raise
except AuthenticationError as e:
    # Permanent: credentials issue
    logger.critical("Synap auth failed: %s", e)
    raise
Error handling distinguishes between transient and permanent errors
2. Transient errors logged with correlation_id

Every SynapError includes a correlation_id field. Always log it — this is the primary identifier Synap support uses to trace issues.
except SynapError as e:
    logger.error(
        "Synap error: %s (correlation_id=%s)",
        e, e.correlation_id
    )
All error logs include the correlation_id from the Synap error
3. Graceful degradation implemented

Your application should continue functioning when Synap is unavailable — just without memory context. This is the single most important resilience pattern.
async def get_memory_context(sdk, conversation_id, query):
    """Retrieve memory context, returning None if unavailable."""
    try:
        return await sdk.conversation.context.fetch(
            conversation_id=conversation_id,
            search_query=[query],
            max_results=5,
            mode="fast"
        )
    except SynapError as e:
        logger.warning(
            "Memory retrieval failed, proceeding without context: %s",
            e
        )
        return None

# In your chat handler:
context = await get_memory_context(sdk, conv_id, user_message)

if context and context.facts:
    # Build enriched prompt with memories
    system_prompt = build_prompt_with_memories(context)
else:
    # Fall back to generic prompt -- your app still works
    system_prompt = build_generic_prompt()
Application continues working (without memory) when Synap is unavailable
4. Rate limit handling with retry_after

When you receive a RateLimitError, respect the retry_after_seconds field before retrying:
except RateLimitError as e:
    await asyncio.sleep(e.retry_after_seconds)
    # Retry the operation
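A reusable wrapper keeps this logic out of individual call sites. A sketch; in real code pass RateLimitError from maximem_synap.errors as error_type (it is injected as a parameter here so the snippet stays self-contained):

```python
import asyncio

async def with_rate_limit_retry(op, *, max_attempts: int = 3,
                                error_type: type = Exception):
    """Run an async operation, sleeping for retry_after_seconds between tries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await op()
        except error_type as e:
            if attempt == max_attempts:
                raise
            # Respect the server-provided backoff before retrying.
            await asyncio.sleep(getattr(e, "retry_after_seconds", 1.0))
```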
Rate limit errors are handled with proper backoff using retry_after_seconds

Monitoring

Observability is critical for understanding how your Synap integration performs in production and catching issues before they impact users.
1. Dashboard analytics reviewed regularly

The Synap Dashboard provides real-time analytics for each instance:
  • API call volume and success rate
  • Memory counts by category and scope
  • Ingestion throughput and processing latency
  • Retrieval latency percentiles (P50, P95, P99)
Establish a regular review cadence (at least weekly).
[Screenshot: Dashboard analytics overview showing API volume, memory counts, and latency]
Dashboard analytics are reviewed on a regular schedule
2. Webhooks configured for critical events

Set up webhooks to receive notifications for important events:
  • ingestion.failed — ingestion pipeline errors
  • credential.expiring — credentials approaching expiration
  • config.applied — configuration changes
  • retention.cleanup — memory retention cleanup completed
See Dashboard Webhooks for setup instructions.
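A receiver can route these events to the right response. A sketch dispatching on the event types above (payload shape and handler actions are illustrative; verify the webhook signature, as covered in the Authentication section, before dispatching):

```python
def on_ingestion_failed(event: dict) -> str:
    # Ingestion pipeline errors usually warrant paging or a retry queue.
    return f"page on-call: {event.get('reason', 'unknown reason')}"

def on_credential_expiring(event: dict) -> str:
    # Feed directly into your key-rotation runbook.
    return "open rotation ticket"

HANDLERS = {
    "ingestion.failed": on_ingestion_failed,
    "credential.expiring": on_credential_expiring,
    # "config.applied" and "retention.cleanup" may only need audit logging.
}

def dispatch(event: dict) -> str:
    handler = HANDLERS.get(event.get("type"))
    return handler(event) if handler else "logged"
```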
Webhooks are configured for critical operational events
3. P95 latency baseline established

Before deploying to production, establish latency baselines by running representative queries in staging:
| Operation | Expected P95 | Alert Threshold |
|---|---|---|
| memories.create() (fast) | <100ms | >300ms |
| context.fetch() (fast) | <150ms | >500ms |
| context.fetch() (accurate) | <600ms | >1500ms |
| memories.batch_create() | <500ms | >2000ms |
Adjust thresholds based on your staging measurements.
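The standard library is enough to compute the baseline from staging samples. A minimal sketch:

```python
import statistics

def p95_ms(samples_ms: list) -> float:
    """95th-percentile latency from a list of measurements in milliseconds."""
    return statistics.quantiles(samples_ms, n=100, method="inclusive")[94]

def within_baseline(samples_ms: list, alert_threshold_ms: float) -> bool:
    """True when the measured P95 stays under the alert threshold."""
    return p95_ms(samples_ms) < alert_threshold_ms
```

Record per-operation timings around your SDK calls in staging, then seed your alert thresholds from these numbers.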
P95 latency baselines are established from staging measurements
4. Error rate alerts configured

Set up alerts in your monitoring system (Datadog, PagerDuty, CloudWatch, etc.) for:
  • Synap API error rate exceeding 1% over 5 minutes
  • Authentication failures (any occurrence)
  • Rate limit hits exceeding your expected threshold
  • Retrieval returning zero results when memories are expected
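The first condition ("error rate exceeding 1% over 5 minutes") can also be approximated in-process with a sliding window when a metrics platform is not yet wired up. A sketch (class name and eviction strategy are illustrative):

```python
import time
from collections import deque

class ErrorRateWindow:
    """Flag when the error rate over a sliding window exceeds a threshold
    (defaults: 1% over 5 minutes, matching the first alert condition)."""

    def __init__(self, window_seconds: float = 300.0, threshold: float = 0.01):
        self.window = window_seconds
        self.threshold = threshold
        self.events = deque()    # (timestamp, is_error) pairs

    def record(self, is_error: bool, now: float = None) -> None:
        now = time.monotonic() if now is None else now
        self.events.append((now, is_error))
        # Evict events that have aged out of the window.
        while self.events and self.events[0][0] < now - self.window:
            self.events.popleft()

    def should_alert(self) -> bool:
        if not self.events:
            return False
        errors = sum(1 for _, is_err in self.events if is_err)
        return errors / len(self.events) > self.threshold
```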
Error rate alerts are configured in your monitoring platform
5. Cost tracking enabled

If your Synap plan includes usage-based pricing, track your usage against budget:
  • API call volume (ingestion + retrieval)
  • Storage usage (vector + graph)
  • Bandwidth usage
The Dashboard provides usage breakdowns on the billing page.
Usage and cost tracking is enabled and reviewed regularly

Performance

Optimization ensures your integration meets latency requirements and minimizes unnecessary resource usage.
1. Using fast mode for latency-sensitive paths

Use mode="fast" for any operation in the critical path of user-facing requests. Reserve mode="accurate" for background tasks, research queries, or paths where the user is willing to wait.
# Real-time chat: use fast mode
context = await sdk.conversation.context.fetch(
    conversation_id=conv_id,
    search_query=[query],
    mode="fast"           # ~50-100ms
)

# Background analysis: use accurate mode
context = await sdk.conversation.context.fetch(
    conversation_id=conv_id,
    search_query=[query],
    mode="accurate"       # ~200-500ms, better precision
)
Fast mode is used for all latency-sensitive code paths
2. Batch ingestion for bulk operations

When ingesting multiple documents, use batch_create() instead of multiple create() calls:
# Good: single batch call
await sdk.memories.batch_create(
    documents=[
        {"document": doc1, "document_type": "document", "user_id": "user_1"},
        {"document": doc2, "document_type": "email", "user_id": "user_1"},
        {"document": doc3, "document_type": "pdf", "user_id": "user_2"},
    ],
    fail_fast=False    # Continue processing even if one document fails
)

# Avoid: N sequential calls
for doc in documents:
    await sdk.memories.create(document=doc, ...)  # Slower, more API calls
Batch ingestion is used for all bulk operations
3. Context compaction enabled for long conversations

For conversations that span many turns, use context compaction to keep the context within your LLM’s token budget:
result = await sdk.conversation.context.compact(
    conversation_id=conv_id,
    strategy="adaptive",      # Automatically adjusts compression level
    target_tokens=2000
)

compacted = await sdk.conversation.context.get_compacted(
    conversation_id=conv_id,
    format="injection-ready"  # Ready to insert into your LLM prompt
)
| Strategy | Compression | Best For |
|---|---|---|
| conservative | ~70% retention | Important conversations, legal/compliance |
| balanced | ~40% retention | General use |
| aggressive | ~15% retention | Very long conversations, cost optimization |
| adaptive | Variable | Recommended default; adjusts based on content |
Context compaction is configured for conversations that may exceed token budgets
4. Cache is enabled

Verify the cache backend is active and functioning:
stats = await sdk.cache.stats()
print(f"Cache entries: {stats.entry_count}")
print(f"Hit rate: {stats.hit_rate:.1%}")
A healthy cache should show a hit rate above 20% for typical applications. If the hit rate is near 0%, your query patterns may be too diverse for caching to help.
Cache backend is enabled and showing a healthy hit rate

Operational Readiness

Beyond code and configuration, production readiness requires documented procedures and team alignment.
1. Team roles assigned appropriately

In the Synap Dashboard, assign roles based on the principle of least privilege:
| Role | Capabilities | Assign To |
|---|---|---|
| Owner | Full access, billing, delete instance | Engineering lead, CTO |
| Admin | Config changes, key management, analytics | Senior engineers, DevOps |
| Developer | Read analytics, view config (no changes) | All developers |
| Viewer | Read-only Dashboard access | Product managers, support |
Team members have appropriate roles (not everyone is Owner)
2. Credential rotation runbook documented

Document the step-by-step procedure for rotating API keys:
  1. Generate new key in Dashboard
  2. Update secrets manager with new key
  3. Deploy application with updated secret reference
  4. Verify new key is working (check Dashboard for API calls)
  5. Revoke old key after grace period (48 hours)
Credential rotation runbook is documented and accessible to the operations team
3. Config rollback runbook documented

Document the procedure for rolling back a MACA configuration:
  1. Identify the last known good version number
  2. Execute rollback via Admin API or Dashboard
  3. Verify new requests are using the rolled-back config
  4. Post-mortem on why the config change caused issues
Config rollback runbook is documented with the last known good version number
4. Support channel documented

Ensure your team knows how to get help:
Support channels are documented and the team knows how to report issues

Quick Summary

Use this condensed checklist for quick pre-deployment reviews:
| Area | Item | Status |
|---|---|---|
| Security | Bootstrap key in secrets manager | |
| Security | Webhook signatures verified | |
| Security | API key rotation scheduled | |
| SDK | Log level set to WARNING/ERROR | |
| SDK | Timeouts match SLA | |
| SDK | Cache enabled (sqlite) | |
| Memory | MACA config reviewed (not defaults) | |
| Memory | Dry-run tested | |
| Memory | Retention policy set | |
| Errors | Transient vs permanent handling | |
| Errors | Graceful degradation | |
| Errors | correlation_id in logs | |
| Monitoring | Dashboard reviewed regularly | |
| Monitoring | Webhooks for critical events | |
| Monitoring | Latency alerts configured | |
| Performance | Fast mode for user-facing paths | |
| Performance | Batch ingestion for bulk ops | |
| Operations | Roles assigned (least privilege) | |
| Operations | Rotation + rollback runbooks | |

Next Steps

Migration Guide

Upgrade SDK versions and migrate configurations safely.

Monitoring and Analytics

Deep dive into Dashboard analytics and monitoring capabilities.

Error Handling

Complete reference for all Synap error types and handling patterns.

Webhooks

Configure webhooks for real-time event notifications.