This guide walks you through integrating Synap into your application. You will learn how to initialize the SDK in popular Python web frameworks, bridge async/sync execution models, and wire Synap into your LLM provider’s generation pipeline.

By the end of this page, your application will have a working memory-augmented agent pattern: retrieve context from Synap, inject it into the LLM prompt, generate a response, and ingest the conversation back into Synap.
FastAPI is the most common framework for Synap integrations. Use the lifespan event for initialization and shutdown, and access the SDK instance from your route handlers.
from contextlib import asynccontextmanager

from fastapi import FastAPI, Depends
from maximem_synap import MaximemSynapSDK

sdk = None


@asynccontextmanager
async def lifespan(app: FastAPI):
    global sdk
    sdk = MaximemSynapSDK()  # reads SYNAP_API_KEY from env
    await sdk.initialize()
    yield
    await sdk.shutdown()


app = FastAPI(lifespan=lifespan)


def get_sdk() -> MaximemSynapSDK:
    """Dependency that provides the initialized SDK."""
    if sdk is None:
        raise RuntimeError("SDK not initialized")
    return sdk


@app.post("/chat")
async def chat(
    message: str,
    user_id: str,
    customer_id: str,
    conversation_id: str,
    synap: MaximemSynapSDK = Depends(get_sdk)
):
    # 1. Retrieve relevant context
    context = await synap.conversation.context.fetch(
        conversation_id=conversation_id,
        user_id=user_id,
        customer_id=customer_id,
        messages=[{"role": "user", "content": message}]
    )

    # 2. Build prompt with retrieved memories
    system_prompt = build_system_prompt(context.memories)

    # 3. Call your LLM (see LLM integration below)
    response = await generate_response(system_prompt, message)

    # 4. Ingest the turn for long-term memory
    await synap.memories.create(
        content=f"User: {message}\nAssistant: {response}",
        user_id=user_id,
        customer_id=customer_id,
        metadata={"conversation_id": conversation_id}
    )

    return {"response": response}
Use FastAPI’s dependency injection (Depends(get_sdk)) to keep your route handlers clean and testable. In tests, you can override the dependency with a mock SDK.
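The mock-SDK idea can be sketched without a running server. The example below is a minimal illustration, not the SDK's own test tooling: `chat_handler` is a simplified, hypothetical stand-in for the route body, and `make_mock_sdk` is a hypothetical helper that fakes only the two calls used on this page (`conversation.context.fetch` and `memories.create`). The shape of the returned context (a `.memories` list of objects with `.content`) is assumed from the examples on this page.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock


async def chat_handler(message: str, user_id: str, synap) -> dict:
    """Simplified, hypothetical stand-in for the /chat route body."""
    context = await synap.conversation.context.fetch(
        user_id=user_id,
        messages=[{"role": "user", "content": message}],
    )
    facts = [m.content for m in context.memories]
    response = f"(reply using {len(facts)} memories)"
    await synap.memories.create(
        content=f"User: {message}\nAssistant: {response}",
        user_id=user_id,
    )
    return {"response": response}


def make_mock_sdk(memories: list) -> MagicMock:
    """Build a mock that fakes only the two SDK calls the handler uses."""
    sdk = MagicMock()
    sdk.conversation.context.fetch = AsyncMock(
        return_value=SimpleNamespace(
            memories=[SimpleNamespace(content=m) for m in memories]
        )
    )
    sdk.memories.create = AsyncMock()
    return sdk


mock_sdk = make_mock_sdk(["Prefers metric units"])
result = asyncio.run(chat_handler("How tall is Everest?", "user_1", mock_sdk))
```

In a FastAPI test suite, the same mock would typically be registered with `app.dependency_overrides[get_sdk] = lambda: mock_sdk` before issuing requests through a test client, and removed afterwards.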
Flask is synchronous by default, so you need to bridge to Synap’s async SDK. Use the app factory pattern and initialize the SDK at startup.
Using loop.run_until_complete() in Flask blocks the worker thread during async operations. For production Flask deployments with high concurrency, consider migrating to FastAPI or running the async SDK calls in a thread pool.
Use instrumentation.ts to initialize the provider once at server startup. The synap.wrap() call in route handlers is lightweight — the provider instance is shared across requests.
// instrumentation.ts
import { createSynap } from "@maximem/synap-vercel-adk";

let _synap: Awaited<ReturnType<typeof createSynap>> | null = null;

export async function register() {
  if (process.env.NEXT_RUNTIME === "nodejs") {
    _synap = await createSynap({ apiKey: process.env.SYNAP_API_KEY });
    await _synap.listen(); // open gRPC anticipation stream
  }
}

export function getSynap() {
  if (!_synap) throw new Error("Synap not initialized");
  return _synap;
}

// app/api/chat/route.ts
import { streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { getSynap } from "@/instrumentation";

export const runtime = "nodejs"; // required for gRPC; Edge is also supported without gRPC

export async function POST(req: Request) {
  const { messages, userId, conversationId } = await req.json();

  const model = getSynap().wrap(anthropic("claude-sonnet-4-6"), {
    userId,
    conversationId,
  });

  const result = streamText({ model, messages });
  return result.toDataStreamResponse();
}
The listen() call is guarded to no-op in Edge Runtime automatically, so the same code works in both Node.js and Edge routes. gRPC is only active in Node.js.
In Django, initialize the SDK in your AppConfig.ready() method. Async views can await the SDK directly; in synchronous views, bridge to the async SDK with asgiref.sync.async_to_sync or asyncio.run().
The Synap SDK is async-native. If your application uses a synchronous framework, you need to bridge the async calls. Here are the recommended patterns:
asyncio.run() — Simplest approach
Use for scripts, CLI tools, and simple synchronous applications. Creates a new event loop for each call.
import asyncio

from maximem_synap import MaximemSynapSDK

sdk = MaximemSynapSDK()  # reads SYNAP_API_KEY from env
asyncio.run(sdk.initialize())

# Later, in synchronous code:
context = asyncio.run(sdk.conversation.context.fetch(...))
asyncio.run() creates a new event loop each time. This is fine for low-frequency calls but adds overhead for high-throughput applications.
Dedicated event loop — Best for web frameworks
Create a single event loop at startup and reuse it for all SDK calls. This is the pattern shown in the Flask and Django examples above.
import asyncio

# At startup
loop = asyncio.new_event_loop()

# For each call
result = loop.run_until_complete(sdk.some_async_method())
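One caveat with a shared loop: an asyncio event loop is not thread-safe, and calling `run_until_complete()` on a loop that is already running raises `RuntimeError`. For multi-threaded servers, one possible variation is to run the loop in its own background thread and submit coroutines with `asyncio.run_coroutine_threadsafe()`. The sketch below uses only the standard library; `BackgroundLoop` and `fake_fetch` are hypothetical names, with `fake_fetch` standing in for an async SDK call.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


class BackgroundLoop:
    """Owns one event loop that runs forever in a daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro, timeout: Optional[float] = None):
        """Submit a coroutine from any thread and block until it finishes."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self) -> None:
        """Stop the loop, wait for its thread, and release resources."""
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fake_fetch(user_id: str) -> str:
    """Hypothetical stand-in for an async SDK call."""
    await asyncio.sleep(0.01)
    return f"context for {user_id}"


bridge = BackgroundLoop()

# Several worker threads share the same loop safely.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda uid: bridge.run(fake_fetch(uid)), ["a", "b", "c"]))

bridge.close()
```

Each calling thread still blocks while it waits for its own result, but the coroutines themselves run concurrently on the single shared loop.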
asgiref.sync.async_to_sync - Django-specific
If you are using Django’s async views alongside sync code, asgiref provides utilities for bridging:
from asgiref.sync import async_to_sync

# Wrap an async SDK method for use in sync code
fetch_context = async_to_sync(sdk.conversation.context.fetch)
context = fetch_context(conversation_id="conv_123", ...)
Full example of a memory-augmented agent using OpenAI’s gpt-4o:
from openai import AsyncOpenAI
from maximem_synap import MaximemSynapSDK

openai_client = AsyncOpenAI(api_key="sk-...")
synap_sdk = MaximemSynapSDK()  # reads SYNAP_API_KEY from env


async def memory_augmented_chat(
    user_message: str,
    user_id: str,
    customer_id: str,
    conversation_id: str
) -> str:
    # Step 1: Retrieve relevant context from Synap
    context = await synap_sdk.conversation.context.fetch(
        conversation_id=conversation_id,
        user_id=user_id,
        customer_id=customer_id,
        messages=[{"role": "user", "content": user_message}]
    )

    # Step 2: Build the system prompt with retrieved memories
    memory_block = "\n".join([
        f"- {memory.content}" for memory in context.memories
    ])

    system_prompt = f"""You are a helpful assistant with access to the following
information about the user and their organization:

{memory_block}

Use this information to provide personalized, contextual responses.
If the user's question relates to something in the memories above,
reference it naturally in your response."""

    # Step 3: Call OpenAI with the enriched prompt
    response = await openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message}
        ],
        temperature=0.7
    )
    assistant_message = response.choices[0].message.content

    # Step 4: Ingest the conversation turn for long-term memory
    await synap_sdk.memories.create(
        content=f"User: {user_message}\nAssistant: {assistant_message}",
        user_id=user_id,
        customer_id=customer_id,
        metadata={
            "conversation_id": conversation_id,
            "model": "gpt-4o",
            "source": "chat"
        }
    )

    return assistant_message
Full example of a memory-augmented agent using Anthropic’s Claude:
from anthropic import AsyncAnthropic
from maximem_synap import MaximemSynapSDK

anthropic_client = AsyncAnthropic(api_key="sk-ant-...")
synap_sdk = MaximemSynapSDK()  # reads SYNAP_API_KEY from env


async def memory_augmented_chat(
    user_message: str,
    user_id: str,
    customer_id: str,
    conversation_id: str
) -> str:
    # Step 1: Retrieve relevant context from Synap
    context = await synap_sdk.conversation.context.fetch(
        conversation_id=conversation_id,
        user_id=user_id,
        customer_id=customer_id,
        messages=[{"role": "user", "content": user_message}]
    )

    # Step 2: Build the system prompt with retrieved memories
    memory_block = "\n".join([
        f"- {memory.content}" for memory in context.memories
    ])

    system_prompt = f"""You are a helpful assistant with access to the following
information about the user and their organization:

{memory_block}

Use this information to provide personalized, contextual responses.
Reference relevant memories naturally when they apply to the user's question."""

    # Step 3: Call Claude with the enriched prompt
    response = await anthropic_client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        system=system_prompt,
        messages=[
            {"role": "user", "content": user_message}
        ]
    )
    assistant_message = response.content[0].text

    # Step 4: Ingest the conversation turn for long-term memory
    await synap_sdk.memories.create(
        content=f"User: {user_message}\nAssistant: {assistant_message}",
        user_id=user_id,
        customer_id=customer_id,
        metadata={
            "conversation_id": conversation_id,
            "model": "claude-sonnet-4-20250514",
            "source": "chat"
        }
    )

    return assistant_message
If your application uses the Vercel AI SDK, @maximem/synap-vercel-adk wraps any LanguageModelV1 model as a middleware. Context retrieval and memory writes happen automatically, so no manual fetch/inject/ingest loop is needed.

Setup (run once at startup)
// instrumentation.ts (Next.js) or a top-level module such as lib/synap.ts
import { createSynap } from "@maximem/synap-vercel-adk";

export const synap = await createSynap({
  apiKey: process.env.SYNAP_API_KEY,
});

// Optional: open gRPC stream for real-time anticipation cache updates
await synap.listen();
Usage in a route handler or server action
import { generateText, streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { synap } from "@/lib/synap";

const model = synap.wrap(anthropic("claude-sonnet-4-6"), {
  userId: "user-123",
  conversationId: "conv-456",
});

// generateText: context injected, memory written automatically
const { text } = await generateText({
  model,
  messages: [{ role: "user", content: userMessage }],
});

// streamText works identically
const result = streamText({ model, messages });
Next.js App Router (streaming)
// app/api/chat/route.ts
import { streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { synap } from "@/lib/synap";

export async function POST(req: Request) {
  const { messages, userId, conversationId } = await req.json();

  const model = synap.wrap(anthropic("claude-sonnet-4-6"), {
    userId,
    conversationId,
  });

  const result = streamText({ model, messages });
  return result.toDataStreamResponse();
}
synap.wrap() returns a standard LanguageModelV1 — it is compatible with all Vercel AI SDK functions (generateText, streamText, generateObject, etc.) and any framework built on top of the AI SDK.
When the writeMemory option is enabled (the default), the middleware fires a background memory write after each generation, so the write never adds latency to the response. Set writeMemory: false to disable it.
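The same fire-and-forget idea can be applied to the manual Python loop shown earlier on this page. The following is a rough sketch using only asyncio, not a description of how the middleware is implemented: `ingest_in_background` is a hypothetical helper, and `fake_memory_write` is a hypothetical stand-in for an ingestion call such as `synap_sdk.memories.create(...)`.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# Keep strong references so pending tasks are not garbage-collected.
_background_tasks: set = set()


def _on_done(task: asyncio.Task) -> None:
    """Drop the finished task and log any failure instead of raising."""
    _background_tasks.discard(task)
    if not task.cancelled() and task.exception() is not None:
        logger.warning("Background memory write failed: %s", task.exception())


def ingest_in_background(coro) -> asyncio.Task:
    """Schedule a memory write on the running loop without awaiting it."""
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_on_done)
    return task


stored: list = []


async def fake_memory_write(content: str) -> None:
    """Hypothetical stand-in for an async ingestion call."""
    await asyncio.sleep(0.05)
    stored.append(content)


async def handle_turn(message: str) -> str:
    response = f"echo: {message}"
    # The response is returned before the write completes.
    ingest_in_background(fake_memory_write(f"User: {message}\nAssistant: {response}"))
    return response


async def main():
    response = await handle_turn("hi")
    stored_at_return = len(stored)  # write has not finished yet
    await asyncio.gather(*_background_tasks)  # drain before shutdown
    return response, stored_at_return


response, stored_at_return = asyncio.run(main())
```

Draining the pending tasks before the event loop closes, as `main()` does here, avoids losing writes that were still in flight at shutdown.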
Call your LLM provider with the enriched prompt. The model generates a response informed by the user’s history, preferences, and organizational context.
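Several examples on this page call a `build_system_prompt` helper without defining it. One possible shape is sketched below. It assumes each memory exposes a `.content` string, as in the provider examples; the `Memory` dataclass, the `max_memories` cap, and the prompt wording are illustrative assumptions rather than part of the SDK.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Memory:
    """Hypothetical stand-in for a retrieved memory; only .content is used."""
    content: str


BASE_PROMPT = "You are a helpful assistant."


def build_system_prompt(memories: Iterable[Memory], max_memories: int = 20) -> str:
    """Return a system prompt, falling back to a plain one when no memories exist."""
    # Skip blank entries so the prompt never contains empty bullets.
    lines = [f"- {m.content.strip()}" for m in memories if m.content.strip()]
    if not lines:
        return BASE_PROMPT

    # Cap the number of memories to keep the prompt size bounded.
    memory_block = "\n".join(lines[:max_memories])
    return (
        f"{BASE_PROMPT} You have access to the following information "
        "about the user and their organization:\n\n"
        f"{memory_block}\n\n"
        "Use this information to provide personalized, contextual responses."
    )


prompt = build_system_prompt([Memory("Prefers metric units"), Memory("   ")])
empty_prompt = build_system_prompt([])
```

Returning the plain base prompt for an empty list means the same helper works whether or not retrieval succeeded.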
Synap should enhance your application, not be a single point of failure. Design your integration to degrade gracefully when Synap is unavailable.
import logging

logger = logging.getLogger(__name__)


async def chat_with_graceful_degradation(user_message: str, **kwargs) -> str:
    memories = []

    # Attempt to retrieve context, but don't fail if Synap is down
    try:
        context = await synap_sdk.conversation.context.fetch(
            conversation_id=kwargs["conversation_id"],
            user_id=kwargs["user_id"],
            customer_id=kwargs["customer_id"],
            messages=[{"role": "user", "content": user_message}]
        )
        memories = context.memories
    except Exception as e:
        logger.warning(f"Synap retrieval failed, proceeding without context: {e}")

    # Generate response (with or without memories)
    system_prompt = build_system_prompt(memories)  # handles empty list gracefully
    response = await generate_response(system_prompt, user_message)

    # Attempt to ingest, but don't fail if Synap is down
    try:
        await synap_sdk.memories.create(
            content=f"User: {user_message}\nAssistant: {response}",
            user_id=kwargs["user_id"],
            customer_id=kwargs["customer_id"],
            metadata={"conversation_id": kwargs["conversation_id"]}
        )
    except Exception as e:
        logger.warning(f"Synap ingestion failed: {e}")

    return response
Always wrap Synap SDK calls in try/except blocks in production. Network issues, rate limits, or service disruptions should not prevent your application from responding to users. The LLM can still generate useful responses without memory context — it just won’t be personalized.
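A try/except only helps when a call fails quickly. A slow or hung request can still stall the response unless the SDK enforces its own timeout, which this page does not specify. The sketch below bounds the wait with `asyncio.wait_for`; `fetch_with_timeout` is a hypothetical helper, the 0.5 second default is an arbitrary assumption, and the three `*_fetch` coroutines are stand-ins for a retrieval call.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def fetch_with_timeout(fetch_coro, timeout_s: float = 0.5, default=None):
    """Await a retrieval coroutine; return `default` on timeout or any error."""
    try:
        return await asyncio.wait_for(fetch_coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        logger.warning("Synap retrieval timed out after %.2fs", timeout_s)
    except Exception as exc:
        logger.warning("Synap retrieval failed: %s", exc)
    return default


async def slow_fetch():
    """Stand-in for a retrieval call that takes too long."""
    await asyncio.sleep(1.0)
    return ["memory"]


async def fast_fetch():
    """Stand-in for a retrieval call that succeeds promptly."""
    return ["memory"]


async def failing_fetch():
    """Stand-in for a retrieval call that fails outright."""
    raise ConnectionError("service unavailable")


async def main():
    return (
        await fetch_with_timeout(slow_fetch(), timeout_s=0.05, default=[]),
        await fetch_with_timeout(fast_fetch(), default=[]),
        await fetch_with_timeout(failing_fetch(), default=[]),
    )


timed_out, ok, failed = asyncio.run(main())
```

In the graceful-degradation function above, the retrieval call could be wrapped as `await fetch_with_timeout(synap_sdk.conversation.context.fetch(...), default=None)`, falling back to an empty memory list when the result is `None`.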