# Graceful Degradation

> What to do when Synap is unreachable. Cache, fallback prompts, queued retries.

Synap should make your agent better when it's available, and not break your agent when it isn't. Treat retrieval and ingestion as **best-effort** in the hot path: never let them stop the LLM from generating a response.

## The shape of "good enough" degradation

```python
import asyncio
import logging
from maximem_synap import MaximemSynapSDK, SynapError, SynapTransientError

sdk = MaximemSynapSDK(api_key=...)
log = logging.getLogger(__name__)

async def safe_fetch_context(conversation_id: str, query: str):
    """Always returns something, even if it's empty."""
    try:
        return await asyncio.wait_for(
            sdk.conversation.context.fetch(
                conversation_id=conversation_id,
                search_query=[query],
                mode="fast",
                max_results=8,
            ),
            timeout=YOUR_CONVERSATIONAL_BUDGET_SECONDS,   # illustrative: tune to your own conversational budget
        )
    except asyncio.TimeoutError:
        log.warning("synap_context_timeout conv=%s", conversation_id)
        return None
    except SynapTransientError as e:
        log.warning("synap_transient err=%s correlation_id=%s", e, e.correlation_id)
        return None
    except SynapError as e:
        log.error("synap_unexpected err=%s correlation_id=%s", e, e.correlation_id)
        return None


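# Hold strong references to in-flight ingest tasks. The event loop keeps only
# weak references, so an unreferenced task can be garbage-collected before it finishes.
_background_tasks: set = set()


async def handle_turn(user_id: str, customer_id: str, conversation_id: str, msg: str) -> str:
    ctx = await safe_fetch_context(conversation_id, msg)

    if ctx is None:
        # Degraded mode: call the LLM without memory rather than 500ing
        memory_block = ""
        log.info("turn_degraded user=%s", user_id)
    else:
        memory_block = "\n".join(f"- {f.content}" for f in ctx.facts[:5])

    reply = await call_llm(memory_block, msg)

    # Ingest in the background; never await it in the hot path.
    # safe_ingest handles its own retries and replay queueing.
    task = asyncio.create_task(safe_ingest(user_id, customer_id, conversation_id, msg, reply))
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return reply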


async def safe_ingest(user_id, customer_id, conversation_id, msg, reply, _retries=0):
    try:
        await sdk.memories.create(
            document=f"User: {msg}\nAssistant: {reply}",
            document_type="ai-chat-conversation",
            user_id=user_id,
            customer_id=customer_id,
            metadata={"conversation_id": conversation_id},
        )
    except SynapTransientError as e:
        if _retries < 3:
            await asyncio.sleep(2 ** _retries)
            return await safe_ingest(user_id, customer_id, conversation_id, msg, reply, _retries + 1)
        log.error("synap_ingest_dropped after retries user=%s msg_excerpt=%r correlation_id=%s",
                  user_id, msg[:80], e.correlation_id)
        # Optionally: enqueue for an out-of-band replayer
        await enqueue_for_replay(user_id, customer_id, conversation_id, msg, reply)
    except SynapError as e:
        log.error("synap_ingest_permanent err=%s correlation_id=%s", e, e.correlation_id)
```

## What to watch in production

Three metrics that should be on your dashboard from day one:

| Metric                            | What it tells you                             | Page if                                     |
| --------------------------------- | --------------------------------------------- | ------------------------------------------- |
| `synap_context_timeout_rate`      | Are users seeing degraded responses?          | `> 1%` over 5 min                           |
| `synap_ingest_dropped_rate`       | Are you losing memory?                        | `> 0.1%` over 1 hour                        |
| `synap_correlation_ids_in_errors` | Sample of `correlation_id` values for support | Always log; sample 5% to your error tracker |

Every `SynapError` exposes `e.correlation_id`, so log it on every failure. When you contact support, they will need that ID.
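
The two rate metrics in the table are ratios over a time window. If your metrics system does not compute them for you, a sliding-window counter is one way to approximate them in-process. This is a sketch only: `WindowedRate` is a hypothetical helper, not part of the Synap SDK, and the thresholds in the comments are the ones from the table above.

```python
import time
from collections import deque


class WindowedRate:
    """Fraction of recorded events that failed within the last `window_seconds`."""

    def __init__(self, window_seconds: float, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock          # injectable so tests can control time
        self._events = deque()       # (timestamp, failed) pairs, oldest first

    def record(self, failed: bool) -> None:
        self._events.append((self._clock(), failed))
        self._evict()

    def rate(self) -> float:
        self._evict()
        if not self._events:
            return 0.0
        failures = sum(1 for _, failed in self._events if failed)
        return failures / len(self._events)

    def _evict(self) -> None:
        # Drop events older than the window.
        cutoff = self._clock() - self._window
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()


# One tracker per metric, with the windows from the table.
context_timeouts = WindowedRate(window_seconds=5 * 60)   # page if rate() > 0.01
ingest_drops = WindowedRate(window_seconds=60 * 60)      # page if rate() > 0.001
```

Call `record(True)` where the wrapper logs a timeout or a dropped ingest, and `record(False)` on success. In a multi-process deployment these counters are per-process, so exporting raw counts to a shared metrics backend is usually the more robust route.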

## Don't do these things

* **Don't fail the request on a Synap timeout.** The LLM can answer without memory. The user gets a slightly worse response. Failing the request gets you an outage.
* **Don't retry permanent errors.** `InvalidInputError`, `ContextNotFoundError`, `AuthenticationError` won't get better with retries. Fix the input or the credentials.
* **Don't block on ingestion in the hot path.** Always background it. The user shouldn't wait for memory persistence to see the next assistant message.
* **Don't catch and swallow without logging.** Every catch should at minimum log the `correlation_id`. Silent swallows make production debugging hopeless.

## Where the SDK already retries for you

`SynapTransientError` subclasses (`NetworkTimeoutError`, `RateLimitError`, `ServiceUnavailableError`, `AgentUnavailableError`) are retried automatically inside the SDK using the configured `RetryPolicy`. By the time one of these reaches your `except` block, the SDK has already tried 2-3 times. Retries in your own wrapper are therefore a belt-and-suspenders measure for outages that outlast the SDK's retry window.
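
Because the SDK has already retried quickly, an outer retry is most useful when it waits noticeably longer between attempts. The sketch below is a generic outer loop with capped exponential backoff and full jitter. `retry_with_backoff` is a hypothetical helper, the delay values are illustrative rather than recommended, and in real code `retry_on` would be `SynapTransientError`.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    op: Callable[[], Awaitable[T]],
    *,
    retry_on: type[BaseException],
    attempts: int = 3,
    base_delay: float = 5.0,    # illustrative: longer than a typical in-SDK retry
    max_delay: float = 60.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Call `op` up to `attempts` times, sleeping between failures.

    Only exceptions matching `retry_on` are retried; anything else propagates
    immediately. The last failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return await op()
        except retry_on:
            if attempt == attempts - 1:
                raise
            # Exponential backoff capped at max_delay, with full jitter so
            # many clients recovering at once do not retry in lockstep.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            await sleep(random.uniform(0, ceiling))
    raise RuntimeError("attempts must be at least 1")
```

Run this only in background work such as ingestion. In the hot path, a fetch that has failed after the SDK's own retries is better served by falling back to degraded mode than by waiting longer.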

## Going further

* **Patterns:** [Replay Conversation History](/patterns/replay-history)
* **Cookbook:** [Voice Concierge](/cookbook/voice-concierge) · [Uber: Customer Support](/cookbook/consumer-uber)
* **Guides:** [Production Checklist](/guides/production-checklist)