Synap publishes very few hard numbers. The two below are the ones the SDK actually enforces. Everything else — latency, throughput, rate-limit ceilings — depends on your Instance, your use-case, and your workload, and is best read off the Dashboard as you run.

SDK timeouts

The SDK ships with two default timeouts, one per retrieval mode:
Mode        Default timeout
fast        8000 ms
accurate    30000 ms
Both are configurable per call via SDKConfig.timeouts.
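As a rough illustration of how a per-mode default and a per-call override might combine, the sketch below models the timeouts as a plain dict keyed by mode. The dict shape, DEFAULT_TIMEOUTS_MS, and resolve_timeout_ms are assumptions made for this example; the actual structure of SDKConfig.timeouts is not shown on this page.

```python
# Documented default timeouts per retrieval mode, in milliseconds.
DEFAULT_TIMEOUTS_MS = {"fast": 8000, "accurate": 30000}


def resolve_timeout_ms(mode, overrides=None):
    """Return the timeout for `mode`, preferring a caller-supplied override.

    Hypothetical helper: it mirrors the documented defaults but is not
    part of the SDK.
    """
    if mode not in DEFAULT_TIMEOUTS_MS:
        raise ValueError(f"unknown retrieval mode: {mode!r}")
    if overrides and mode in overrides:
        return overrides[mode]
    return DEFAULT_TIMEOUTS_MS[mode]
```

Under these assumptions, resolve_timeout_ms("accurate", {"accurate": 45000}) picks the override, while resolve_timeout_ms("fast") falls back to the 8000 ms default.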

Retrieval modes

sdk.conversation.context.fetch() accepts mode="fast" (default) or mode="accurate".
  • fast — lower-latency path. Queries the vector store and the graph store.
  • accurate — higher-quality path. Queries the vector store and the graph store, and additionally runs LLM subquery decomposition and reranking on the candidate set.
Both modes pull from the same underlying memory; accurate simply spends more compute to widen and re-order the result set.
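One way to choose between the two modes is to compare the caller's latency budget with the accurate timeout: if the caller cannot afford to wait that long in the worst case, stay on fast. The sketch below is only an illustration of that idea. pick_mode and fetch_with_budget are hypothetical helpers, and `fetch` stands for any callable that accepts a `mode` keyword (such as sdk.conversation.context.fetch); its other arguments are not documented here, so they are passed through untouched.

```python
ACCURATE_TIMEOUT_MS = 30000  # documented default timeout for mode="accurate"


def pick_mode(latency_budget_ms, accurate_timeout_ms=ACCURATE_TIMEOUT_MS):
    """Pick "accurate" only when the budget covers its worst-case timeout."""
    if latency_budget_ms >= accurate_timeout_ms:
        return "accurate"
    return "fast"


def fetch_with_budget(fetch, latency_budget_ms, **kwargs):
    """Call `fetch` with a mode derived from the latency budget.

    `fetch` is assumed to accept `mode` as a keyword argument; all other
    keyword arguments are forwarded unchanged.
    """
    return fetch(mode=pick_mode(latency_budget_ms), **kwargs)
```

The threshold is deliberately conservative. A service with measured accurate latencies well below the timeout could pass a smaller accurate_timeout_ms taken from its own Dashboard numbers.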

Where to see live numbers

Per-Instance latency, request volume, and rate-limit headroom are visible in Dashboard → Usage. Use that view for capacity planning rather than any number quoted in docs; your workload is the source of truth. Document/message size limits and rate ceilings are likewise set per Instance rather than as fixed platform constants, and appear in the same view. If you need a ceiling raised, contact [email protected].

Behavior under load

  • The SDK auto-retries transient errors (rate limits, service-unavailable) using RetryPolicy with exponential backoff. You generally don’t need to wrap calls in your own retry loop.
  • When retries are exhausted on a rate-limited call, the SDK raises RateLimitError. Catch it to fall back gracefully or surface a user-visible error.
  • If you consistently hit RateLimitError for an Instance, your per-Instance ceiling is tuneable — contact [email protected].
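The graceful-fallback pattern described above can be sketched as follows. Because the SDK already retries internally, the caller catches RateLimitError once and does not loop. The RateLimitError class defined here is a stand-in so the example runs on its own; in real code you would import the SDK's exception instead (its import path is not shown on this page). fetch_or_fallback is a hypothetical helper, not part of the SDK.

```python
class RateLimitError(Exception):
    """Stand-in for the SDK's RateLimitError, defined here for illustration."""


def fetch_or_fallback(fetch, fallback=None, **kwargs):
    """Call `fetch` once and degrade gracefully if the rate limit persists.

    Returns a (result, ok) pair. `ok` is False when the SDK's own retries
    were exhausted and `fallback` was returned instead.
    """
    try:
        return fetch(**kwargs), True
    except RateLimitError:
        # No retry loop here: the SDK has already backed off and retried.
        return fallback, False
```

A caller might use the `ok` flag to decide between serving a response without retrieved context and surfacing a "try again shortly" message to the user.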

Status and monitoring