Metric Types
Synap tracks the following metric types for each instance:
| Metric | Description | Unit |
| --- | --- | --- |
| api_call | Total API calls made | count |
| context_fetch | Context retrieval requests | count |
| context_compact | Context compaction operations | count |
| llm_input_tokens | Tokens sent to the LLM during ingestion and retrieval | tokens |
| llm_output_tokens | Tokens generated by the LLM during extraction and compaction | tokens |
| vector_queries | Queries executed against the vector store | count |
| graph_queries | Queries executed against the graph store | count |
| memories_stored | New memories written to storage | count |
Rollup Buckets
Metrics are aggregated into time-based rollup buckets with different retention periods:
| Bucket | Granularity | Retention |
| --- | --- | --- |
| minute | 1-minute intervals | 30 days |
| hour | 1-hour intervals | 1 year |
| day | 1-day intervals | Indefinite |
Minute-level data is available for the last 30 days. For historical analysis beyond 30 days, use hourly or daily rollups.
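The retention rules above can be applied client-side when deciding which granularity to request. A minimal sketch (this helper is illustrative, not part of any Synap SDK) that picks the finest bucket whose retention window still covers the requested start time:

```python
from datetime import datetime, timedelta, timezone

# Retention windows from the rollup table above; None means kept indefinitely.
RETENTION = {
    "minute": timedelta(days=30),
    "hour": timedelta(days=365),
    "day": None,
}

def choose_bucket(start: datetime, now: datetime) -> str:
    """Return the finest granularity that can serve data back to `start`."""
    age = now - start
    for bucket in ("minute", "hour", "day"):
        limit = RETENTION[bucket]
        if limit is None or age <= limit:
            return bucket
    return "day"
```

For example, a 7-day range can use minute rollups, a 90-day range falls back to hourly, and anything older than a year must use daily rollups.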
Latency Percentiles
Performance metrics include latency percentiles for timing-sensitive operations:
| Percentile | Description |
| --- | --- |
| avg | Mean latency across all requests in the bucket |
| p50 | Median latency (50th percentile) |
| p95 | 95th percentile latency; most requests are faster than this |
| p99 | 99th percentile latency; tail latency for worst-case analysis |
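To make the reported figures concrete, here is how percentiles relate to raw latency samples. This sketch uses the nearest-rank convention; Synap's exact interpolation scheme is not documented here, so treat the helper as illustrative:

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest value covering p percent of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Ten hypothetical context_fetch latencies (ms) within one bucket.
latencies = [12.0, 15.0, 18.0, 22.0, 30.0, 41.0, 55.0, 80.0, 120.0, 250.0]
stats = {
    "avg": sum(latencies) / len(latencies),
    "p50": percentile(latencies, 50),
    "p95": percentile(latencies, 95),
    "p99": percentile(latencies, 99),
}
```

Note how a single 250 ms outlier leaves p50 untouched but dominates p95 and p99, which is why the tail percentiles are the ones to watch.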
Query Metrics
Retrieve aggregated metrics for an instance.
GET /v1/analytics/metrics
Query Parameters
instance_id: The instance ID to query metrics for.
metric: The metric type to query. See Metric Types for valid values.
bucket: Rollup bucket granularity: minute, hour, or day.
start: Start of the time range. ISO 8601 format.
end: End of the time range. ISO 8601 format.
group_by: Optional grouping dimension. Options: user_id, customer_id, document_type, mode.
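The parameters above are passed as a standard query string. A minimal sketch of assembling the request URL (the base URL and parameter values are taken from the curl example on this page):

```python
from urllib.parse import urlencode

BASE = "https://api.synap.maximem.ai/v1/analytics/metrics"

params = {
    "instance_id": "inst_f1e2d3c4b5a69078",
    "metric": "api_call",
    "bucket": "hour",
    "start": "2025-01-15T00:00:00Z",
    "end": "2025-01-15T23:59:59Z",
}
# urlencode percent-encodes the colons in the ISO 8601 timestamps.
url = f"{BASE}?{urlencode(params)}"
```

Send the result with any HTTP client, attaching your API key in the Authorization header as shown in the curl example.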
Response
instance_id: The instance these metrics belong to.
start: Start of the queried time range.
end: End of the queried time range.
data_points: Array of metric data points. Each data point contains:
  timestamp: Start of this bucket interval (ISO 8601).
  value: Aggregated metric value for this interval.
  group: Group key if group_by was specified; null otherwise.
summary: Summary statistics across the entire time range:
  total: Sum of all values across the time range.
  average: Mean value per bucket.
  min: Minimum value in any single bucket.
  max: Maximum value in any single bucket.
Example: API Call Volume
curl -X GET "https://api.synap.maximem.ai/v1/analytics/metrics?instance_id=inst_f1e2d3c4b5a69078&metric=api_call&bucket=hour&start=2025-01-15T00:00:00Z&end=2025-01-15T23:59:59Z" \
-H "Authorization: Bearer synap_your_key_here"
{
  "instance_id": "inst_f1e2d3c4b5a69078",
  "metric": "api_call",
  "bucket": "hour",
  "start": "2025-01-15T00:00:00Z",
  "end": "2025-01-15T23:59:59Z",
  "data_points": [
    { "timestamp": "2025-01-15T00:00:00Z", "value": 42, "group": null },
    { "timestamp": "2025-01-15T01:00:00Z", "value": 38, "group": null },
    { "timestamp": "2025-01-15T02:00:00Z", "value": 15, "group": null },
    { "timestamp": "2025-01-15T08:00:00Z", "value": 156, "group": null },
    { "timestamp": "2025-01-15T09:00:00Z", "value": 203, "group": null },
    { "timestamp": "2025-01-15T10:00:00Z", "value": 187, "group": null }
  ],
  "summary": {
    "total": 1247,
    "average": 51.96,
    "min": 8,
    "max": 203
  }
}
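Per-bucket statistics can be recomputed client-side from data_points. The sketch below uses the six buckets shown in the example response; since the response is an excerpt (the full day has more buckets), the local total differs from summary.total, but max matches:

```python
# Data points copied from the example response above.
data_points = [
    {"timestamp": "2025-01-15T00:00:00Z", "value": 42},
    {"timestamp": "2025-01-15T01:00:00Z", "value": 38},
    {"timestamp": "2025-01-15T02:00:00Z", "value": 15},
    {"timestamp": "2025-01-15T08:00:00Z", "value": 156},
    {"timestamp": "2025-01-15T09:00:00Z", "value": 203},
    {"timestamp": "2025-01-15T10:00:00Z", "value": 187},
]
values = [p["value"] for p in data_points]
excerpt_total = sum(values)  # 641 for the six buckets shown
excerpt_max = max(values)    # 203, matching summary.max
```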
Query Latency
Retrieve latency percentiles for timing-sensitive operations.
GET /v1/analytics/latency
Query Parameters
instance_id: The instance ID to query latency for.
operation: The operation to measure. Options: context_fetch, context_compact, ingestion, memory_get.
bucket: Rollup bucket granularity: minute, hour, or day.
start: Start of the time range (ISO 8601).
end: End of the time range (ISO 8601).
Response
data_points: Array of latency data points. Each data point contains:
  timestamp: Start of this bucket interval.
  avg: Mean latency in milliseconds.
  p50: Median (50th percentile) latency in milliseconds.
  p95: 95th percentile latency in milliseconds.
  p99: 99th percentile latency in milliseconds.
  count: Number of operations in this interval.
Example: Context Fetch Latency
curl -X GET "https://api.synap.maximem.ai/v1/analytics/latency?instance_id=inst_f1e2d3c4b5a69078&operation=context_fetch&bucket=hour&start=2025-01-15T00:00:00Z&end=2025-01-15T23:59:59Z" \
-H "Authorization: Bearer synap_your_key_here"
{
  "instance_id": "inst_f1e2d3c4b5a69078",
  "operation": "context_fetch",
  "bucket": "hour",
  "start": "2025-01-15T00:00:00Z",
  "end": "2025-01-15T23:59:59Z",
  "data_points": [
    {
      "timestamp": "2025-01-15T08:00:00Z",
      "avg": 52.3,
      "p50": 45.0,
      "p95": 98.7,
      "p99": 187.2,
      "count": 156
    },
    {
      "timestamp": "2025-01-15T09:00:00Z",
      "avg": 48.1,
      "p50": 42.0,
      "p95": 91.3,
      "p99": 162.8,
      "count": 203
    },
    {
      "timestamp": "2025-01-15T10:00:00Z",
      "avg": 55.7,
      "p50": 47.0,
      "p95": 112.4,
      "p99": 245.1,
      "count": 187
    }
  ]
}
Monitor p95 and p99 latencies to catch tail latency issues before they impact user experience. If p99 regularly exceeds 500ms for context_fetch, consider switching to mode: "fast" or increasing your context budget to reduce re-ranking overhead.
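One way to act on this guidance is a simple client-side check that flags buckets whose p99 exceeds a threshold. A sketch (the 500 ms figure mirrors the context_fetch guidance above; tune it for your workload):

```python
P99_THRESHOLD_MS = 500.0

def slow_buckets(data_points: list[dict], threshold: float = P99_THRESHOLD_MS) -> list[str]:
    """Return timestamps of buckets whose p99 latency exceeds the threshold."""
    return [p["timestamp"] for p in data_points if p["p99"] > threshold]
```

Run it over the data_points array of a latency response; any returned timestamps are candidates for switching to mode: "fast" or revisiting the context budget.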
Query Token Usage
Retrieve LLM token consumption for cost tracking and optimization.
GET /v1/analytics/tokens
Query Parameters
instance_id: The instance ID to query token usage for.
bucket: Rollup bucket granularity: minute, hour, or day.
start: Start of the time range (ISO 8601).
end: End of the time range (ISO 8601).
group_by: Optional grouping: operation (e.g., ingestion, retrieval, compaction), model, user_id.
Response
data_points: Array of token usage data points. Each data point contains:
  timestamp: Start of the bucket interval.
  input_tokens: LLM input tokens consumed.
  output_tokens: LLM output tokens generated.
  total_tokens: Sum of input and output tokens.
  estimated_cost_usd: Estimated cost in USD based on the current pricing model.
  group: Group key if group_by was specified; null otherwise.
summary: Summary across the time range:
  total_input_tokens: Total input tokens consumed.
  total_output_tokens: Total output tokens generated.
  total_tokens: Total tokens (input + output).
  total_estimated_cost_usd: Total estimated cost in USD.
Example: Daily Token Usage by Operation
curl -X GET "https://api.synap.maximem.ai/v1/analytics/tokens?instance_id=inst_f1e2d3c4b5a69078&bucket=day&start=2025-01-10T00:00:00Z&end=2025-01-15T23:59:59Z&group_by=operation" \
-H "Authorization: Bearer synap_your_key_here"
{
  "instance_id": "inst_f1e2d3c4b5a69078",
  "bucket": "day",
  "start": "2025-01-10T00:00:00Z",
  "end": "2025-01-15T23:59:59Z",
  "data_points": [
    {
      "timestamp": "2025-01-15T00:00:00Z",
      "input_tokens": 124500,
      "output_tokens": 31200,
      "total_tokens": 155700,
      "estimated_cost_usd": 0.47,
      "group": "ingestion"
    },
    {
      "timestamp": "2025-01-15T00:00:00Z",
      "input_tokens": 89300,
      "output_tokens": 12100,
      "total_tokens": 101400,
      "estimated_cost_usd": 0.31,
      "group": "retrieval"
    },
    {
      "timestamp": "2025-01-15T00:00:00Z",
      "input_tokens": 45200,
      "output_tokens": 8700,
      "total_tokens": 53900,
      "estimated_cost_usd": 0.16,
      "group": "compaction"
    }
  ],
  "summary": {
    "total_input_tokens": 259000,
    "total_output_tokens": 52000,
    "total_tokens": 311000,
    "total_estimated_cost_usd": 0.94
  }
}
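The summary block is simply the sum of the grouped data points, as can be verified against the single day shown in the example above:

```python
# Grouped data points copied from the example response above.
data_points = [
    {"input_tokens": 124500, "output_tokens": 31200, "estimated_cost_usd": 0.47},
    {"input_tokens": 89300, "output_tokens": 12100, "estimated_cost_usd": 0.31},
    {"input_tokens": 45200, "output_tokens": 8700, "estimated_cost_usd": 0.16},
]
summary = {
    "total_input_tokens": sum(p["input_tokens"] for p in data_points),
    "total_output_tokens": sum(p["output_tokens"] for p in data_points),
}
summary["total_tokens"] = summary["total_input_tokens"] + summary["total_output_tokens"]
summary["total_estimated_cost_usd"] = round(
    sum(p["estimated_cost_usd"] for p in data_points), 2
)
```

This kind of local re-aggregation is useful when attributing cost to a single group across many buckets.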
Usage Summary
Get a high-level usage summary for an instance, suitable for dashboard display.
GET /v1/analytics/summary
Query Parameters
instance_id: The instance ID to summarize.
period: Time period for the summary: 24h, 7d, 30d, 90d. Defaults to 24h.
Response
{
  "instance_id": "inst_f1e2d3c4b5a69078",
  "period": "24h",
  "api_calls": {
    "total": 1247,
    "change_pct": 12.3
  },
  "memories": {
    "total_stored": 58432,
    "created_in_period": 347,
    "change_pct": 8.1
  },
  "context_fetches": {
    "total": 892,
    "avg_latency_ms": 52.3,
    "p95_latency_ms": 98.7,
    "change_pct": 15.7
  },
  "compactions": {
    "total": 45,
    "avg_compression_ratio": 5.2,
    "avg_validation_score": 0.93
  },
  "tokens": {
    "total_input": 259000,
    "total_output": 52000,
    "estimated_cost_usd": 0.94,
    "change_pct": 5.4
  },
  "health": {
    "status": "healthy",
    "error_rate_pct": 0.3,
    "avg_latency_ms": 52.3
  }
}
The change_pct fields show percentage change compared to the previous equivalent period. For example, if period is 24h, the change is compared to the previous 24 hours.
Dashboard Analytics
The Dashboard provides visual analytics through two additional routes:
Current Configuration Detail
GET /dashboard/configs/{instance_id}
Returns configuration metadata alongside usage stats for the current config version, including how many memories were processed under this config and any performance changes since the config was applied.
Configuration History
GET /dashboard/configs/{instance_id}/history
Returns version history with performance comparisons between versions — useful for understanding the impact of configuration changes on latency, token usage, and memory quality.
Use the configuration history endpoint to perform before/after analysis when testing configuration changes. Compare token usage, latency percentiles, and memory quality scores across config versions to validate improvements.
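A before/after comparison between two config versions can be reduced to relative deltas per metric. This is an illustrative sketch: the metric names mirror fields from this page, but the shape of the history payload is an assumption, not the documented response:

```python
def compare_versions(before: dict, after: dict) -> dict:
    """Relative change (percent) for each numeric metric shared by two snapshots."""
    return {
        k: round((after[k] - before[k]) / before[k] * 100, 1)
        for k in before
        if k in after and before[k] != 0
    }

# Hypothetical snapshots for two config versions.
v1 = {"p95_latency_ms": 98.7, "total_tokens": 311000}
v2 = {"p95_latency_ms": 84.2, "total_tokens": 298500}
delta = compare_versions(v1, v2)  # negative values mean a reduction
```

For latency and token metrics, negative deltas indicate the new config is an improvement; pair this with memory quality scores before rolling a change out broadly.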