Metric Types

Synap tracks the following metric types for each instance:
| Metric | Description | Unit |
| --- | --- | --- |
| api_call | Total API calls made | count |
| context_fetch | Context retrieval requests | count |
| context_compact | Context compaction operations | count |
| llm_input_tokens | Tokens sent to LLM during ingestion and retrieval | tokens |
| llm_output_tokens | Tokens generated by LLM during extraction and compaction | tokens |
| vector_queries | Queries executed against the vector store | count |
| graph_queries | Queries executed against the graph store | count |
| memories_stored | New memories written to storage | count |

Rollup Buckets

Metrics are aggregated into time-based rollup buckets with different retention periods:
| Bucket | Granularity | Retention |
| --- | --- | --- |
| minute | 1-minute intervals | 30 days |
| hour | 1-hour intervals | 1 year |
| day | 1-day intervals | Indefinite |
Minute-level data is available for the last 30 days. For historical analysis beyond 30 days, use hourly or daily rollups.
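The retention rules above can be turned into a small helper that picks the finest bucket still available for a given start time. This is a minimal sketch, not part of the Synap API: the function name `finest_bucket` is hypothetical, and "1 year" is assumed to mean 365 days.

```python
from datetime import datetime, timedelta, timezone

# Retention windows from the rollup table; None means kept indefinitely.
# Assumption: "1 year" is treated as 365 days.
RETENTION = [
    ("minute", timedelta(days=30)),
    ("hour", timedelta(days=365)),
    ("day", None),
]


def finest_bucket(start: datetime, now: datetime) -> str:
    """Return the finest-grained bucket whose retention still covers `start`."""
    age = now - start
    for bucket, retention in RETENTION:
        if retention is None or age <= retention:
            return bucket
    return "day"


now = datetime(2025, 1, 15, tzinfo=timezone.utc)
print(finest_bucket(now - timedelta(days=7), now))    # minute
print(finest_bucket(now - timedelta(days=90), now))   # hour
print(finest_bucket(now - timedelta(days=800), now))  # day
```

The result can be passed as the bucket query parameter, so a query for older data does not request a granularity that has already expired.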

Latency Percentiles

Performance metrics include latency percentiles for timing-sensitive operations:
| Percentile | Description |
| --- | --- |
| avg | Mean latency across all requests in the bucket |
| p50 | Median latency (50th percentile) |
| p95 | 95th percentile latency: 95% of requests are at least this fast |
| p99 | 99th percentile latency: tail latency for worst-case analysis |
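To make the difference between these statistics concrete, the sketch below computes them for a small set of latency samples. It uses the nearest-rank definition of a percentile, which is an assumption: this page does not state how Synap computes percentiles, and the sample values are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_stats(samples):
    """Summarize samples with the same keys as a latency data point."""
    return {
        "avg": sum(samples) / len(samples),
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "p99": percentile(samples, 99),
        "count": len(samples),
    }


# Illustrative latencies in milliseconds with two slow outliers.
latencies_ms = [40, 42, 45, 47, 50, 52, 55, 60, 95, 180]
stats = latency_stats(latencies_ms)
print(stats)
```

With these samples the median stays at 50 while the two outliers pull the mean up to about 66.6 and set p95 and p99 to 180, which is why tail percentiles are tracked separately from the average.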

Query Metrics

Retrieve aggregated metrics for an instance.
GET /v1/analytics/metrics

Query Parameters

instance_id (string, required): The instance ID to query metrics for.
metric (string, required): The metric type to query. See Metric Types for valid values.
bucket (string, required): Rollup bucket granularity: minute, hour, or day.
start (string, required): Start of the time range. ISO 8601 format.
end (string, required): End of the time range. ISO 8601 format.
group_by (string, optional): Grouping dimension. Options: user_id, customer_id, document_type, mode.
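The parameters above can be assembled into a request URL programmatically. The following is a minimal sketch using only the Python standard library; the helper name `metrics_url` is hypothetical, and the host is taken from the curl examples on this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.synap.maximem.ai"  # host as used in the curl examples
VALID_BUCKETS = {"minute", "hour", "day"}


def metrics_url(instance_id, metric, bucket, start, end, group_by=None):
    """Build the GET /v1/analytics/metrics URL from the documented parameters."""
    if bucket not in VALID_BUCKETS:
        raise ValueError(f"bucket must be one of {sorted(VALID_BUCKETS)}")
    params = {
        "instance_id": instance_id,
        "metric": metric,
        "bucket": bucket,
        "start": start,
        "end": end,
    }
    if group_by is not None:
        params["group_by"] = group_by  # optional, so only sent when provided
    return f"{BASE_URL}/v1/analytics/metrics?{urlencode(params)}"


url = metrics_url(
    "inst_f1e2d3c4b5a69078",
    "api_call",
    "hour",
    "2025-01-15T00:00:00Z",
    "2025-01-15T23:59:59Z",
)
print(url)
```

`urlencode` percent-encodes the colons in the timestamps, which is equivalent to the unencoded form used in the curl examples. The request still needs the Authorization header.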

Response

instance_id (string): The instance these metrics belong to.
metric (string): The metric type queried.
bucket (string): The rollup bucket used.
start (string): Start of the queried time range.
end (string): End of the queried time range.
data_points (array): Array of metric data points.
summary (object): Summary statistics across the entire time range.

Example: API Call Volume

curl -X GET "https://api.synap.maximem.ai/v1/analytics/metrics?instance_id=inst_f1e2d3c4b5a69078&metric=api_call&bucket=hour&start=2025-01-15T00:00:00Z&end=2025-01-15T23:59:59Z" \
  -H "Authorization: Bearer synap_your_key_here"
Response
{
  "instance_id": "inst_f1e2d3c4b5a69078",
  "metric": "api_call",
  "bucket": "hour",
  "start": "2025-01-15T00:00:00Z",
  "end": "2025-01-15T23:59:59Z",
  "data_points": [
    { "timestamp": "2025-01-15T00:00:00Z", "value": 42, "group": null },
    { "timestamp": "2025-01-15T01:00:00Z", "value": 38, "group": null },
    { "timestamp": "2025-01-15T02:00:00Z", "value": 15, "group": null },
    { "timestamp": "2025-01-15T08:00:00Z", "value": 156, "group": null },
    { "timestamp": "2025-01-15T09:00:00Z", "value": 203, "group": null },
    { "timestamp": "2025-01-15T10:00:00Z", "value": 187, "group": null }
  ],
  "summary": {
    "total": 1247,
    "average": 51.96,
    "min": 8,
    "max": 203
  }
}
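Once the response is parsed, the data_points array can be scanned for peaks. The sketch below works on an abbreviated copy of the example response; the helper names `peak_point` and `points_above` are hypothetical and not part of any Synap SDK.

```python
import json

# Abbreviated copy of the example response (three of its data points).
response_body = """
{
  "metric": "api_call",
  "bucket": "hour",
  "data_points": [
    {"timestamp": "2025-01-15T02:00:00Z", "value": 15, "group": null},
    {"timestamp": "2025-01-15T09:00:00Z", "value": 203, "group": null},
    {"timestamp": "2025-01-15T10:00:00Z", "value": 187, "group": null}
  ]
}
"""


def peak_point(payload):
    """Return the data point with the highest value."""
    return max(payload["data_points"], key=lambda point: point["value"])


def points_above(payload, threshold):
    """Return the timestamps of all buckets whose value exceeds threshold."""
    return [p["timestamp"] for p in payload["data_points"] if p["value"] > threshold]


payload = json.loads(response_body)
peak = peak_point(payload)
print(peak["timestamp"], peak["value"])  # 2025-01-15T09:00:00Z 203
print(points_above(payload, 100))
```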

Query Latency

Retrieve latency percentiles for timing-sensitive operations.
GET /v1/analytics/latency

Query Parameters

instance_id (string, required): The instance ID to query latency for.
operation (string, required): The operation to measure. Options: context_fetch, context_compact, ingestion, memory_get.
bucket (string, required): Rollup bucket granularity: minute, hour, or day.
start (string, required): Start of the time range (ISO 8601).
end (string, required): End of the time range (ISO 8601).

Response

instance_id (string): The instance ID.
operation (string): The operation measured.
data_points (array): Array of latency data points.

Example: Context Fetch Latency

curl -X GET "https://api.synap.maximem.ai/v1/analytics/latency?instance_id=inst_f1e2d3c4b5a69078&operation=context_fetch&bucket=hour&start=2025-01-15T00:00:00Z&end=2025-01-15T23:59:59Z" \
  -H "Authorization: Bearer synap_your_key_here"
Response
{
  "instance_id": "inst_f1e2d3c4b5a69078",
  "operation": "context_fetch",
  "bucket": "hour",
  "start": "2025-01-15T00:00:00Z",
  "end": "2025-01-15T23:59:59Z",
  "data_points": [
    {
      "timestamp": "2025-01-15T08:00:00Z",
      "avg": 52.3,
      "p50": 45.0,
      "p95": 98.7,
      "p99": 187.2,
      "count": 156
    },
    {
      "timestamp": "2025-01-15T09:00:00Z",
      "avg": 48.1,
      "p50": 42.0,
      "p95": 91.3,
      "p99": 162.8,
      "count": 203
    },
    {
      "timestamp": "2025-01-15T10:00:00Z",
      "avg": 55.7,
      "p50": 47.0,
      "p95": 112.4,
      "p99": 245.1,
      "count": 187
    }
  ]
}
Monitor p95 and p99 latencies to catch tail latency issues before they impact user experience. If p99 regularly exceeds 500ms for context_fetch, consider switching to mode: "fast" or increasing your context budget to reduce re-ranking overhead.
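The check described above can be automated. This is a minimal sketch, assuming the latency values in data_points are in milliseconds and treating "regularly" as the share of buckets over the threshold; the function name `p99_breach_ratio` is hypothetical.

```python
def p99_breach_ratio(data_points, threshold_ms=500.0):
    """Return the fraction of buckets whose p99 latency exceeds threshold_ms."""
    if not data_points:
        return 0.0
    breaches = sum(1 for point in data_points if point["p99"] > threshold_ms)
    return breaches / len(data_points)


# p99 values copied from the example response above.
data_points = [
    {"timestamp": "2025-01-15T08:00:00Z", "p99": 187.2},
    {"timestamp": "2025-01-15T09:00:00Z", "p99": 162.8},
    {"timestamp": "2025-01-15T10:00:00Z", "p99": 245.1},
]

print(p99_breach_ratio(data_points))                      # 0.0
print(p99_breach_ratio(data_points, threshold_ms=180.0))  # 2 of 3 buckets
```

None of the example buckets crosses 500 ms, so the default threshold reports no breaches. Lowering the threshold to 180 ms flags the 08:00 and 10:00 buckets.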

Query Token Usage

Retrieve LLM token consumption for cost tracking and optimization.
GET /v1/analytics/tokens

Query Parameters

instance_id (string, required): The instance ID.
bucket (string, required): Rollup bucket: minute, hour, or day.
start (string, required): Start of the time range (ISO 8601).
end (string, required): End of the time range (ISO 8601).
group_by (string, optional): Grouping dimension: operation (ingestion, retrieval, or compaction), model, or user_id.

Response

data_points (array): Array of token usage data points.
summary (object): Summary across the time range.

Example: Daily Token Usage by Operation

curl -X GET "https://api.synap.maximem.ai/v1/analytics/tokens?instance_id=inst_f1e2d3c4b5a69078&bucket=day&start=2025-01-10T00:00:00Z&end=2025-01-15T23:59:59Z&group_by=operation" \
  -H "Authorization: Bearer synap_your_key_here"
Response
{
  "instance_id": "inst_f1e2d3c4b5a69078",
  "bucket": "day",
  "start": "2025-01-10T00:00:00Z",
  "end": "2025-01-15T23:59:59Z",
  "data_points": [
    {
      "timestamp": "2025-01-15T00:00:00Z",
      "input_tokens": 124500,
      "output_tokens": 31200,
      "total_tokens": 155700,
      "estimated_cost_usd": 0.47,
      "group": "ingestion"
    },
    {
      "timestamp": "2025-01-15T00:00:00Z",
      "input_tokens": 89300,
      "output_tokens": 12100,
      "total_tokens": 101400,
      "estimated_cost_usd": 0.31,
      "group": "retrieval"
    },
    {
      "timestamp": "2025-01-15T00:00:00Z",
      "input_tokens": 45200,
      "output_tokens": 8700,
      "total_tokens": 53900,
      "estimated_cost_usd": 0.16,
      "group": "compaction"
    }
  ],
  "summary": {
    "total_input_tokens": 259000,
    "total_output_tokens": 52000,
    "total_tokens": 311000,
    "total_estimated_cost_usd": 0.94
  }
}
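For cost tracking, the grouped data points can be rolled up per operation. The sketch below uses the values from the example response; the helper name `totals_by_group` is hypothetical.

```python
from collections import defaultdict

# Data points copied from the example response above.
data_points = [
    {"input_tokens": 124500, "output_tokens": 31200,
     "estimated_cost_usd": 0.47, "group": "ingestion"},
    {"input_tokens": 89300, "output_tokens": 12100,
     "estimated_cost_usd": 0.31, "group": "retrieval"},
    {"input_tokens": 45200, "output_tokens": 8700,
     "estimated_cost_usd": 0.16, "group": "compaction"},
]


def totals_by_group(points):
    """Sum tokens and estimated cost for each group across all data points."""
    totals = defaultdict(lambda: {"tokens": 0, "cost_usd": 0.0})
    for point in points:
        entry = totals[point["group"]]
        entry["tokens"] += point["input_tokens"] + point["output_tokens"]
        entry["cost_usd"] += point["estimated_cost_usd"]
    return dict(totals)


totals = totals_by_group(data_points)
grand_total = sum(entry["tokens"] for entry in totals.values())

# Print groups from largest to smallest token consumer.
for group, entry in sorted(totals.items(), key=lambda kv: kv[1]["tokens"], reverse=True):
    share = entry["tokens"] / grand_total
    print(f"{group}: {entry['tokens']} tokens ({share:.1%}), ${entry['cost_usd']:.2f}")
```

The per-group totals add up to the 311000 tokens and 0.94 USD reported in the summary object, with ingestion accounting for about half of the usage.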

Usage Summary

Get a high-level usage summary for an instance, suitable for dashboard display.
GET /v1/analytics/summary

Query Parameters

instance_id (string, required): The instance ID.
period (string, optional): Time period for the summary: 24h, 7d, 30d, or 90d. Defaults to 24h.

Response

{
  "instance_id": "inst_f1e2d3c4b5a69078",
  "period": "24h",
  "api_calls": {
    "total": 1247,
    "change_pct": 12.3
  },
  "memories": {
    "total_stored": 58432,
    "created_in_period": 347,
    "change_pct": 8.1
  },
  "context_fetches": {
    "total": 892,
    "avg_latency_ms": 52.3,
    "p95_latency_ms": 98.7,
    "change_pct": 15.7
  },
  "compactions": {
    "total": 45,
    "avg_compression_ratio": 5.2,
    "avg_validation_score": 0.93
  },
  "tokens": {
    "total_input": 259000,
    "total_output": 52000,
    "estimated_cost_usd": 0.94,
    "change_pct": 5.4
  },
  "health": {
    "status": "healthy",
    "error_rate_pct": 0.3,
    "avg_latency_ms": 52.3
  }
}
The change_pct fields show percentage change compared to the previous equivalent period. For example, if period is 24h, the change is compared to the previous 24 hours.
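The previous period's value can be recovered from a current value and its change_pct. This is a sketch under the assumption that change_pct uses the standard definition, (current - previous) / previous * 100; the page does not spell out the formula, and the function names are hypothetical.

```python
def change_pct(current, previous):
    """Percentage change of current relative to previous."""
    if previous == 0:
        raise ValueError("percentage change is undefined when previous is 0")
    return (current - previous) / previous * 100


def previous_value(current, pct):
    """Invert change_pct to recover the previous period's value."""
    if pct == -100:
        raise ValueError("a change of -100% cannot be inverted")
    return current / (1 + pct / 100)


# From the example response: 1247 API calls, up 12.3% on the previous 24 hours.
previous_calls = previous_value(1247, 12.3)
print(round(previous_calls))  # roughly 1110 calls in the previous 24 hours
```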

Dashboard Analytics

The Dashboard provides visual analytics through two additional routes:

Current Configuration Detail

GET /dashboard/configs/{instance_id}
Returns configuration metadata alongside usage stats for the current config version, including how many memories were processed under this config and any performance changes since the config was applied.

Configuration History

GET /dashboard/configs/{instance_id}/history
Returns version history with performance comparisons between versions. This is useful for understanding the impact of configuration changes on latency, token usage, and memory quality.
Use the configuration history endpoint to perform before/after analysis when testing configuration changes. Compare token usage, latency percentiles, and memory quality scores across config versions to validate improvements.