
Prerequisites

Before you begin, make sure you have:
  • Python 3.10+ installed on your machine
  • A Synap account with access to the Dashboard
  • An instance created in the Dashboard (see Quickstart if you haven’t done this yet). For best results, upload a Use-Case Markdown file when creating your instance — see Use-Case Markdown for the template and authoring guide.
  • An OpenAI API key (or any LLM provider — we use OpenAI in this tutorial for simplicity)
This tutorial assumes basic familiarity with Python async/await and REST APIs. If you are new to async Python, check out the asyncio documentation first.
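If you want a quick refresher before diving in, the async/await pattern used throughout this tutorial boils down to awaiting coroutines inside an event loop. A minimal, self-contained example using only the standard library:

```python
import asyncio

async def fetch_greeting(name: str) -> str:
    # Simulate a non-blocking I/O call (e.g., an HTTP request)
    await asyncio.sleep(0)
    return f"Hello, {name}!"

async def main() -> list[str]:
    # Run two coroutines concurrently; results come back in call order
    return await asyncio.gather(
        fetch_greeting("alice"),
        fetch_greeting("bob"),
    )

results = asyncio.run(main())
print(results)  # ['Hello, alice!', 'Hello, bob!']
```

Every handler in this tutorial follows the same shape: `await` the SDK and LLM calls so the server can serve other requests while waiting on I/O.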

1

Set Up Your Project

Create a new directory for your project and install the required dependencies:
mkdir synap-chatbot && cd synap-chatbot
python -m venv venv
source venv/bin/activate
Install the SDK and supporting libraries:
pip install maximem-synap openai fastapi uvicorn
Your project will have the following structure:
synap-chatbot/
  startup.py      # SDK initialization and lifecycle
  main.py         # FastAPI application with chat endpoint
  .env            # Environment variables (not committed)
2

Configure Your Environment

Create a .env file with your credentials. You will need three values:
  • SYNAP_INSTANCE_ID: The instance ID from your Dashboard (e.g., inst_a1b2c3d4e5f67890)
  • SYNAP_BOOTSTRAP_TOKEN: The bootstrap key generated for your instance
  • OPENAI_API_KEY: Your OpenAI API key
.env
SYNAP_INSTANCE_ID=inst_a1b2c3d4e5f67890
SYNAP_BOOTSTRAP_TOKEN=your-bootstrap-token-here
OPENAI_API_KEY=sk-your-openai-key-here
Never commit .env files to version control. Add .env to your .gitignore immediately. In production, use a secrets manager (AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault) instead of environment files.
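For local development you can also load the `.env` file from Python instead of exporting variables by hand (the `python-dotenv` package does this for you in practice). A minimal sketch of the idea, using only the standard library:

```python
import os

def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines from .env-style text, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values

# In a real app you would read the file, e.g. Path(".env").read_text()
sample = """\
# credentials
SYNAP_INSTANCE_ID=inst_a1b2c3d4e5f67890
OPENAI_API_KEY=sk-your-openai-key-here
"""
parsed = parse_env(sample)

# Apply without clobbering values already set in the environment
for key, value in parsed.items():
    os.environ.setdefault(key, value)
```

Note that `os.environ.setdefault` keeps any value already exported in the shell, so real environment variables always win over the file.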
3

Initialize the SDK

Create a startup.py module that manages the SDK lifecycle. This module will be imported by your FastAPI application.
startup.py
from maximem_synap import MaximemSynapSDK, SDKConfig
import os

sdk = MaximemSynapSDK(
    instance_id=os.environ["SYNAP_INSTANCE_ID"],
    bootstrap_token=os.environ["SYNAP_BOOTSTRAP_TOKEN"],
    config=SDKConfig(
        cache_backend="sqlite",
        log_level="INFO"
    )
)

async def init():
    """Bootstrap credentials and establish connection to Synap Cloud."""
    await sdk.initialize()

async def cleanup():
    """Flush pending operations and close connections."""
    await sdk.shutdown()
Key points about this setup:
  • cache_backend="sqlite" enables local caching for faster repeated retrievals.
  • log_level="INFO" is appropriate for development. Switch to "WARNING" or "ERROR" in production.
  • The sdk object is a module-level singleton. Import it from any module and it will reference the same initialized instance.
The bootstrap token is consumed on first use. After the initial bootstrap, the SDK persists credentials locally and no longer needs the token. You can safely remove it from your environment after the first successful run.
4

Create the FastAPI Application with Lifespan

Now create the main.py file. Start with the application lifespan manager, which ensures the SDK initializes on startup and shuts down cleanly when the server stops.
main.py
import os
from contextlib import asynccontextmanager
from fastapi import FastAPI
from pydantic import BaseModel
from openai import AsyncOpenAI
from startup import sdk, init, cleanup

# --- Lifespan Management ---

@asynccontextmanager
async def lifespan(app: FastAPI):
    """Initialize Synap SDK on startup, shut down on exit."""
    await init()
    print("Synap SDK initialized. Ready to serve requests.")
    yield
    await cleanup()
    print("Synap SDK shut down cleanly.")

app = FastAPI(
    title="Synap Chatbot",
    description="A memory-enabled chatbot powered by Synap",
    lifespan=lifespan
)
openai_client = AsyncOpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

# --- Request/Response Models ---

class ChatRequest(BaseModel):
    message: str
    conversation_id: str
    user_id: str
    customer_id: str | None = None

class ChatResponse(BaseModel):
    response: str
    memories_used: int
The lifespan context manager is the recommended way to manage startup/shutdown in modern FastAPI applications (v0.95+). It replaces the older @app.on_event("startup") and @app.on_event("shutdown") hooks.
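Under the hood, the lifespan pattern is plain `contextlib` machinery: everything before the `yield` runs once at startup, everything after it runs once at shutdown. A framework-free sketch of the same control flow:

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []

@asynccontextmanager
async def lifespan():
    events.append("startup")   # runs once before requests are served
    yield
    events.append("shutdown")  # runs once when the server stops

async def serve():
    # FastAPI enters this context manager for you around the app's lifetime
    async with lifespan():
        events.append("handling requests")

asyncio.run(serve())
print(events)  # ['startup', 'handling requests', 'shutdown']
```

This is why exceptions during `init()` prevent the server from accepting requests: the `yield` is never reached, so the app never starts serving.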
5

Build the Chat Endpoint

Add the chat endpoint to main.py. This endpoint performs four operations in sequence:
  1. Retrieve relevant memories from Synap
  2. Build a system prompt enriched with memory context
  3. Call the LLM with the enriched prompt
  4. Ingest the conversation turn back into Synap for future memory
main.py (continued)
@app.post("/chat", response_model=ChatResponse)
async def chat(req: ChatRequest):
    # -------------------------------------------------------
    # Step 1: Retrieve relevant memories for this conversation
    # -------------------------------------------------------
    context = await sdk.conversation.context.fetch(
        conversation_id=req.conversation_id,
        search_query=[req.message],
        max_results=5,
        types=["facts", "preferences"],
        mode="fast"
    )

    # -------------------------------------------------------
    # Step 2: Build system prompt with memory context
    # -------------------------------------------------------
    memory_lines = []
    for fact in context.facts:
        memory_lines.append(
            f"- {fact.content} (confidence: {fact.confidence:.0%})"
        )
    for pref in context.preferences:
        memory_lines.append(f"- User preference: {pref.content}")

    memory_block = "\n".join(memory_lines) if memory_lines else (
        "No prior context available."
    )

    system_prompt = f"""You are a helpful assistant with memory.

Known information about this user:
{memory_block}

Use this context naturally in your responses. Do not explicitly mention
that you are reading from a memory system -- just be naturally informed."""

    # -------------------------------------------------------
    # Step 3: Call the LLM
    # -------------------------------------------------------
    response = await openai_client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": req.message}
        ],
        temperature=0.7,
        max_tokens=1024
    )
    assistant_message = response.choices[0].message.content

    # -------------------------------------------------------
    # Step 4: Ingest this conversation turn for future memory
    # -------------------------------------------------------
    await sdk.memories.create(
        document=f"User: {req.message}\nAssistant: {assistant_message}",
        document_type="ai-chat-conversation",
        user_id=req.user_id,
        customer_id=req.customer_id,
        mode="fast",
        metadata={
            "conversation_id": req.conversation_id,
            "source": "chatbot-tutorial"
        }
    )

    return ChatResponse(
        response=assistant_message,
        memories_used=len(memory_lines)
    )
Let’s break down each step:
The sdk.conversation.context.fetch() call searches Synap’s vector and graph stores for memories relevant to the user’s message. Key parameters:
  • search_query: A list of strings used for semantic search. Passing the user’s message ensures we find contextually relevant memories.
  • max_results=5: Limits context to the top 5 most relevant memories, keeping the prompt concise.
  • types=["facts", "preferences"]: Retrieves only facts and preferences. Other types include temporal_events, relationships, and entities.
  • mode="fast": Uses the fast retrieval path (~50-100ms). Use "accurate" (~200-500ms) when precision matters more than latency.
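Because `mode` is a per-call parameter, you can choose it per request, e.g. from a latency budget. The helper below is purely illustrative (it is not part of the SDK), using the rough numbers above:

```python
def pick_retrieval_mode(latency_budget_ms: int) -> str:
    """Illustrative helper (not part of the SDK): choose a retrieval mode
    from a per-request latency budget. "accurate" can take ~200-500ms,
    so only pick it when the budget allows."""
    return "accurate" if latency_budget_ms >= 500 else "fast"

print(pick_retrieval_mode(100))  # fast
print(pick_retrieval_mode(800))  # accurate
```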
The retrieved memories are formatted as bullet points and injected into the system prompt. This gives the LLM access to user-specific context without modifying the conversation history. The confidence score (e.g., 92%) is included to help the LLM weigh how certain each piece of information is. You can omit confidence scores if you prefer a cleaner prompt.
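The formatting in Step 2 is plain string building, so it is easy to pull into a testable helper. A standalone sketch, where the `Fact` and `Preference` dataclasses stand in for the objects the SDK returns:

```python
from dataclasses import dataclass

@dataclass
class Fact:              # stand-in for the SDK's fact type
    content: str
    confidence: float

@dataclass
class Preference:        # stand-in for the SDK's preference type
    content: str

def build_memory_block(facts, preferences, include_confidence=True) -> str:
    """Format retrieved memories as the bullet list injected into the prompt."""
    lines = [
        f"- {f.content} (confidence: {f.confidence:.0%})" if include_confidence
        else f"- {f.content}"
        for f in facts
    ]
    lines += [f"- User preference: {p.content}" for p in preferences]
    return "\n".join(lines) if lines else "No prior context available."

block = build_memory_block(
    [Fact("Planning a trip to Japan next month", 0.92)],
    [Preference("Prefers concise answers")],
)
print(block)
```

Factoring this out also makes it easy to cap the block's length or drop low-confidence facts before they reach the prompt.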
A standard OpenAI chat completion call. The system prompt now contains personalized context, so the LLM can respond as if it “remembers” the user. This works with any LLM provider — replace the OpenAI call with your preferred provider.
After generating a response, the full conversation turn is sent back to Synap. The ingestion pipeline automatically extracts structured knowledge (facts, preferences, entities, temporal events) from the text.
  • document_type="ai-chat-conversation" tells the pipeline to expect a conversational format with User: and Assistant: turns.
  • mode="fast" processes asynchronously without blocking the response.
  • metadata attaches arbitrary key-value pairs for later filtering and audit.
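Since `document_type="ai-chat-conversation"` expects `User:`/`Assistant:` turns, it helps to centralize that formatting rather than rebuild the f-string at every call site. A small sketch matching the format used in Step 4:

```python
def format_turns(turns: list[tuple[str, str]]) -> str:
    """Render (user, assistant) pairs in the User:/Assistant: layout that
    the ai-chat-conversation document type expects."""
    return "\n".join(
        f"User: {user}\nAssistant: {assistant}" for user, assistant in turns
    )

doc = format_turns([
    ("What should I pack?", "Light layers and comfortable shoes."),
])
print(doc)
```

The same helper then works unchanged if you later batch several turns into a single ingestion call.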
6

Add a Health Check Endpoint

Good practice for production deployments — add a health check that verifies the SDK is connected:
main.py (continued)
@app.get("/health")
async def health():
    try:
        stats = await sdk.cache.stats()
        return {
            "status": "healthy",
            "synap_connected": True,
            "cache_entries": stats.entry_count
        }
    except Exception as e:
        return {
            "status": "degraded",
            "synap_connected": False,
            "error": str(e)
        }
7

Run and Test

Load your environment variables and start the server:
# Load environment variables (set -a auto-exports everything sourced)
set -a; source .env; set +a

# Start the FastAPI server
uvicorn main:app --reload --port 8000
You should see output confirming the SDK has initialized:
INFO:     Started server process
Synap SDK initialized. Ready to serve requests.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000
Now test with a few conversation turns:
# First message -- no memories exist yet
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Hi! I am planning a trip to Japan next month. Any tips?",
    "conversation_id": "conv_001",
    "user_id": "user_alice"
  }'
{
  "response": "Japan is wonderful! What kind of experience are you looking for...",
  "memories_used": 0
}
# Second message, in a new conversation -- memories are scoped to the
# user, so context from the first turn carries over
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What should I pack?",
    "conversation_id": "conv_002",
    "user_id": "user_alice"
  }'
{
  "response": "Since you're heading to Japan next month, here's what I'd recommend packing...",
  "memories_used": 2
}
Notice that memories_used increases as Synap accumulates knowledge about the user. The assistant naturally references the Japan trip even though it was mentioned in a different conversation.
8

Verify in the Dashboard

Open the Synap Dashboard and navigate to your instance. You should see:
  • API call counts reflecting your test requests
  • Memory counts showing extracted facts, preferences, and entities
  • Ingestion history with the conversation turns you sent
[Screenshot: Dashboard showing API calls and memory counts]

Complete Code

Here is the final startup.py for reference (the complete main.py is the concatenation of the snippets from Steps 4-6 above):
startup.py
from maximem_synap import MaximemSynapSDK, SDKConfig
import os

sdk = MaximemSynapSDK(
    instance_id=os.environ["SYNAP_INSTANCE_ID"],
    bootstrap_token=os.environ["SYNAP_BOOTSTRAP_TOKEN"],
    config=SDKConfig(
        cache_backend="sqlite",
        log_level="INFO"
    )
)

async def init():
    """Bootstrap credentials and establish connection to Synap Cloud."""
    await sdk.initialize()

async def cleanup():
    """Flush pending operations and close connections."""
    await sdk.shutdown()

What’s Next?

You have a working memory-enabled chatbot. Here are the natural next steps to make it production-ready:

Configuring Memory

Tune what gets extracted, how it is stored, and how retrieval ranking works with MACA configuration.

Multi-User Scoping

Set up memory isolation for multi-tenant applications with user, customer, and client scopes.

Context Compaction

Manage long conversations by compacting context to fit within your LLM’s token budget.

Production Checklist

Security, performance, and monitoring best practices before going live.