Entity Resolution (ER) is Synap’s ability to identify and link mentions of the same real-world entity across different conversations and documents. When a user says “John”, “Mr. Smith”, “my manager”, and “the person I met at the conference”, Synap determines whether these all refer to the same individual and links them to a single canonical entity. This builds a coherent knowledge graph over time, even as the ways people refer to entities vary naturally.
Beyond simple deduplication, entity resolution serves as a master data management layer for your AI agents. The entity registry is the single source of truth for the people, organizations, products, and concepts your application encounters. As conversations accumulate, the registry grows into an organizational knowledge base that improves retrieval accuracy, enables entity-centric queries, and provides a foundation for building structured knowledge on top of unstructured conversations.
Entity resolution runs automatically during ingestion. No additional SDK calls are needed. Every document that passes through the ingestion pipeline has its entities extracted and resolved before storage.
Traditional AI applications treat each conversation as isolated text. Over hundreds or thousands of interactions, the same entities appear under different names, in different contexts, and from different users. Without entity resolution, your agent has no way to connect “the CEO” mentioned in one conversation with “Maria Garcia” mentioned in another.
The entity registry solves this by:
Consolidating identity: All references to the same real-world entity converge on a single canonical record, regardless of how they were originally mentioned
Building organizational knowledge over time: Each conversation enriches the registry with new aliases, context, and relationships, making future resolution more accurate
Enabling entity-centric retrieval: Instead of searching by keywords, you can retrieve all memories associated with a specific entity across all conversations and users
Providing auditability: The registry tracks when each entity was first seen, last referenced, and how it has been resolved, giving you a clear provenance trail
Entity resolution is a multi-step process that runs as part of the ingestion pipeline:
1. Extract entities from text
The ingestion pipeline identifies entity mentions in the incoming content. Entities include people, organizations, products, locations, and other named references. Each entity mention is extracted with its surrounding context.
2. Search the entity registry
Each extracted entity is matched against the Instance’s entity registry. The search follows the scope chain (USER, CUSTOMER, CLIENT, WORLD), checking narrowest scopes first. Matching uses both exact text comparison and semantic similarity via vector embeddings.
3. Resolve or register
If a match is found, the entity mention is linked to the existing canonical entity. If no match is found, the entity is auto-registered at CUSTOMER scope for future lookups. Ambiguous matches (multiple possible candidates) can be queued for human review.
4. Apply canonical names
Resolved entities receive a canonical_name that is consistent across all references. This canonical name is stored alongside the extracted memory, enabling precise retrieval by entity.
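The resolve-or-register step above can be sketched with a small in-memory registry. This is an illustrative sketch only, not Synap's implementation: the `Entity` class and `resolve_or_register` function are hypothetical names, mentions are assumed to be already extracted, and matching is reduced to case-insensitive name and alias comparison (no embeddings, scopes, or review queue).

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical, simplified registry record.
    canonical_name: str
    aliases: set[str] = field(default_factory=set)
    scope: str = "customer"


def resolve_or_register(mention: str, registry: list[Entity]) -> tuple[str, str]:
    """Return (canonical_name, outcome) for one already-extracted mention."""
    key = mention.casefold()
    for entity in registry:
        known = {entity.canonical_name.casefold()}
        known |= {alias.casefold() for alias in entity.aliases}
        if key in known:
            # Match found: link the mention to the existing canonical entity.
            return entity.canonical_name, "resolved"
    # No match: auto-register at customer scope so later mentions resolve.
    registry.append(Entity(canonical_name=mention))
    return mention, "registered"


registry = [Entity("John Smith", aliases={"Mr. Smith", "JS"})]
first = resolve_or_register("Mr. Smith", registry)
second = resolve_or_register("Acme Corp", registry)
third = resolve_or_register("acme corp", registry)
print(first, second, third)
```

The third call resolves because the second call registered "Acme Corp", which is the behavior the steps describe: an unknown entity is registered once and reused afterwards.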
The entity registry is a per-Instance database of known entities. It functions as the master data store for all entities your application encounters. Each registry entry contains:
| Field | Description |
| --- | --- |
| canonical_name | The authoritative name for this entity (e.g., “John Smith”) |
| aliases | Known alternative names and references (e.g., “John”, “Mr. Smith”, “JS”) |
| entity_type | Category: person, organization, product, location, concept, etc. |
| scope | The scope level where this entity is registered (user, customer, client, world) |
| embedding | A 384-dimensional vector embedding for semantic matching |
| metadata | Arbitrary metadata (role, department, relationship to user, etc.) |
| created_at | When this entity was first registered |
| last_seen | When this entity was last referenced in an ingestion |
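To make the shape of a registry entry concrete, the fields above could be modeled as a dataclass. The class name `RegistryEntry`, the Python types, the zero-vector default, and the validation rules are assumptions made for illustration; the actual storage schema is not described here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

EMBEDDING_DIM = 384  # dimensionality stated in the field list
SCOPES = {"user", "customer", "client", "world"}


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class RegistryEntry:
    # Hypothetical in-memory model of one registry entry.
    canonical_name: str
    entity_type: str
    scope: str
    aliases: list[str] = field(default_factory=list)
    # Placeholder zero vector; a real entry would hold a model embedding.
    embedding: list[float] = field(default_factory=lambda: [0.0] * EMBEDDING_DIM)
    metadata: dict[str, Any] = field(default_factory=dict)
    created_at: datetime = field(default_factory=_now)
    last_seen: datetime = field(default_factory=_now)

    def __post_init__(self) -> None:
        if self.scope not in SCOPES:
            raise ValueError(f"unknown scope: {self.scope}")
        if len(self.embedding) != EMBEDDING_DIM:
            raise ValueError(f"embedding must have {EMBEDDING_DIM} dimensions")


entry = RegistryEntry(
    canonical_name="John Smith",
    entity_type="person",
    scope="customer",
    aliases=["John", "Mr. Smith", "JS"],
    metadata={"role": "Engineering Team Lead"},
)
print(entry.canonical_name, entry.scope, len(entry.embedding))
```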
The registry is searched following the scope chain, narrowest first:
1. USER scope: Entities specific to this user
2. CUSTOMER scope: Entities shared within the organization
3. CLIENT scope: Entities shared across your application
4. WORLD scope: Global entities
This ordering means that if a user has a personal contact named “Alex” and the company also has an employee named “Alex”, the user-scoped entity takes priority in that user’s context. The customer-scoped entity remains available for other users in the same organization.
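The "Alex" example can be traced with a minimal narrowest-first lookup. This is a sketch under stated assumptions: the dictionary layout, the `lookup` function, the entity ids, and the placeholder owner keys for CLIENT and WORLD scope are all hypothetical, and only exact name matching is shown.

```python
from typing import Optional

SCOPE_CHAIN = ("user", "customer", "client", "world")  # narrowest first

# Hypothetical registry keyed by (scope, owner id, lowercased name).
registry = {
    ("user", "user_123", "alex"): "alex-personal-contact",
    ("customer", "acme_corp", "alex"): "alex-acme-employee",
}


def lookup(name: str, user_id: str, customer_id: str) -> Optional[str]:
    """Return the first entity found while walking the scope chain."""
    owners = {
        "user": user_id,
        "customer": customer_id,
        "client": "app",  # placeholder owner key for client scope
        "world": "*",     # placeholder owner key for world scope
    }
    for scope in SCOPE_CHAIN:
        hit = registry.get((scope, owners[scope], name.casefold()))
        if hit is not None:
            return hit
    return None


for_owner = lookup("Alex", "user_123", "acme_corp")
for_colleague = lookup("Alex", "user_456", "acme_corp")
print(for_owner, for_colleague)
```

The user who owns the personal contact gets the user-scoped entity, while a colleague in the same organization falls through to the customer-scoped one.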
Synap uses multiple matching strategies to resolve entities, applied in order of confidence:
Exact match
The extracted entity name exactly matches a canonical name or alias in the registry.
    Input: "John Smith"
    Registry: canonical_name="John Smith"
    Result: Exact match (confidence: 1.0)
Alias match
The extracted entity matches a known alias of a registered entity.
    Input: "Mr. Smith"
    Registry: canonical_name="John Smith", aliases=["Mr. Smith", "JS"]
    Result: Alias match (confidence: 0.95)
Semantic match
The entity’s vector embedding (384 dimensions) is compared against registry embeddings using cosine similarity. This catches cases where the surface form is different but the meaning is the same.
    Input: "my team lead from engineering"
    Registry: canonical_name="John Smith", metadata={"role": "Engineering Team Lead"}
    Result: Semantic match (confidence: 0.82)
Contextual match
The surrounding context of the entity mention is used to disambiguate. If multiple registry entries match by name, the context helps pick the right one.
    Input: "Alex from the billing department called"
    Registry:
      - canonical_name="Alex Chen", metadata={"department": "Engineering"}
      - canonical_name="Alex Rivera", metadata={"department": "Billing"}
    Result: Contextual match → Alex Rivera (confidence: 0.88)
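The ordering of the first three strategies can be sketched as a cascade that stops at the first strategy that succeeds. This is a simplified illustration, not the actual matcher: the `match` function, the dictionary layout, and the 0.8 semantic threshold are assumptions, the toy vectors have 3 dimensions instead of 384 for readability, and contextual disambiguation is left out. The 1.0 and 0.95 confidences are taken from the examples above.

```python
import math
from typing import Optional


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match(mention: str, mention_vec: list[float], entry: dict,
          semantic_threshold: float = 0.8) -> Optional[tuple[str, float]]:
    """Try strategies from most to least certain; return (strategy, confidence)."""
    if mention == entry["canonical_name"]:
        return ("exact", 1.0)
    if mention in entry["aliases"]:
        return ("alias", 0.95)
    score = cosine_similarity(mention_vec, entry["embedding"])
    if score >= semantic_threshold:  # assumed threshold
        return ("semantic", round(score, 2))
    return None


entry = {
    "canonical_name": "John Smith",
    "aliases": ["Mr. Smith", "JS"],
    "embedding": [0.6, 0.8, 0.0],  # toy vector
}

exact = match("John Smith", [0.0, 0.0, 1.0], entry)
alias = match("Mr. Smith", [0.0, 0.0, 1.0], entry)
semantic = match("my team lead from engineering", [0.8, 0.6, 0.0], entry)
miss = match("the weather", [0.0, 0.0, 1.0], entry)
print(exact, alias, semantic, miss)
```

Checking cheap, high-certainty strategies first means the embedding comparison only runs when the surface form gives no answer.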
When the resolution pipeline encounters an entity that does not match any existing registry entry, it automatically registers the entity at CUSTOMER scope. This means:
The system learns new entities organically as conversations happen
Future mentions of the same entity will resolve to the auto-registered entry
No manual entity management is required for common use cases
Auto-registered entities can be promoted, edited, or merged through the review queue
Conversation 1: "I had a call with Sarah from the partner team."
→ No match found → Auto-registers "Sarah" at CUSTOMER scope
    canonical_name: "Sarah"
    entity_type: "person"
    metadata: {"context": "partner team"}
Conversation 2: "Sarah mentioned the Q3 timeline is shifting."
→ Matches auto-registered "Sarah" → Links to same canonical entity
Conversation 3: "Sarah Chen confirmed the new deadline."
→ Matches "Sarah" → Updates canonical_name to "Sarah Chen", adds alias "Sarah"
Auto-registration happens at CUSTOMER scope by default because it provides the right balance: entities are shared within an organization (so all users in that org benefit) but isolated from other organizations (preventing cross-tenant entity leakage).
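The three-conversation "Sarah" example can be reproduced with a toy registry. This sketch rests on a deliberately naive assumption: a fuller name whose first token equals a known single-token name refers to the same person. The `Entity` class and `observe` function are hypothetical, and a real resolver would weigh context before merging.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical, simplified registry record.
    canonical_name: str
    entity_type: str = "person"
    scope: str = "customer"  # auto-registration default described above
    aliases: set[str] = field(default_factory=set)


def observe(mention: str, registry: list[Entity]) -> Entity:
    """Link a mention to the registry, enriching or registering as needed."""
    tokens = mention.split()
    for entity in registry:
        if mention == entity.canonical_name or mention in entity.aliases:
            return entity
        # Naive enrichment rule: "Sarah Chen" upgrades a known "Sarah".
        if len(tokens) > 1 and tokens[0] == entity.canonical_name:
            entity.aliases.add(entity.canonical_name)
            entity.canonical_name = mention
            return entity
    entity = Entity(canonical_name=mention)
    registry.append(entity)
    return entity


registry: list[Entity] = []
first = observe("Sarah", registry)        # conversation 1: auto-registered
second = observe("Sarah", registry)       # conversation 2: links to same entity
third = observe("Sarah Chen", registry)   # conversation 3: enriches the entry
print(third.canonical_name, third.aliases, len(registry))
```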
When the resolution pipeline encounters an ambiguous match — where multiple registry entries are plausible candidates — the entity is placed in a review queue for human review rather than making an incorrect automatic resolution.
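One way to decide when a match counts as ambiguous is to compare the top two candidate scores. The sketch below is an assumption-laden illustration: the `route` function, the 0.75 minimum confidence, and the 0.05 margin are hypothetical values, not documented settings. It links the best candidate while flagging it, in line with the note on this page that unresolved queue items do not block ingestion.

```python
review_queue: list[dict] = []  # hypothetical stand-in for the review queue


def route(mention: str, candidates: list[tuple[str, float]],
          min_confidence: float = 0.75, margin: float = 0.05) -> dict:
    """Decide what to do with a mention given (canonical_name, confidence) pairs."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if not ranked or ranked[0][1] < min_confidence:
        # Nothing plausible: treat the mention as a new entity.
        return {"action": "register", "entity": mention, "flagged": False}
    best_name, best_score = ranked[0]
    # Ambiguous when the runner-up is within `margin` of the best score.
    ambiguous = len(ranked) > 1 and best_score - ranked[1][1] < margin
    if ambiguous:
        review_queue.append({"mention": mention, "candidates": ranked})
    return {"action": "link", "entity": best_name, "flagged": ambiguous}


clear = route("Alex from billing", [("Alex Chen", 0.60), ("Alex Rivera", 0.88)])
close = route("Alex", [("Alex Chen", 0.84), ("Alex Rivera", 0.88)])
unknown = route("Priya", [])
print(clear, close, unknown, len(review_queue))
```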
Entity resolution happens transparently during ingestion. You do not need to make any special calls:
    from synap import Synap

    sdk = Synap(api_key="your_api_key")

    # The awaited calls below must run inside an async function
    # (or an asyncio-aware REPL).

    # ER happens automatically during ingestion
    await sdk.memories.create(
        document="John Smith from Acme Corp called about the Q4 report.",
        document_type="ai-chat-conversation",
        user_id="user_123",
        customer_id="acme_corp"
    )

    # Future mentions of "John", "Mr. Smith", "JS" will resolve to the same entity
    await sdk.memories.create(
        document="Mr. Smith followed up on the Q4 numbers. He wants the final version by Friday.",
        document_type="ai-chat-conversation",
        user_id="user_123",
        customer_id="acme_corp"
    )

    # When retrieving, entities are already resolved
    context = await sdk.user.context.fetch(
        user_id="user_123",
        customer_id="acme_corp"
    )

    # Memories about "John Smith" and "Mr. Smith" are linked to the same canonical entity
    for fact in context.facts:
        if fact.entities:
            for entity in fact.entities:
                print(f"Entity: {entity.canonical_name}")  # "John Smith"
You can retrieve all memories associated with a specific entity:
    # Find all memories related to a specific entity
    memories = await sdk.memories.search(
        entity="John Smith",
        customer_id="acme_corp",
        max_results=20
    )

    for memory in memories:
        print(f"[{memory.type}] {memory.content}")
        print(f"  Source: {memory.source.document_type} at {memory.source.created_at}")
The more context you include in ingested documents, the better entity resolution works. Full names, roles, and departments help distinguish between entities with similar names.
Instead of: “Alex said the deadline is Friday.”
Prefer: “Alex Rivera from the billing team said the deadline is Friday.”
Use consistent customer_id values
Entity resolution relies on scope boundaries. Ensure you use consistent customer_id values across all ingestion calls for the same organization. Inconsistent IDs will fragment the entity registry.
Review ambiguous matches promptly
The review queue catches edge cases that automatic resolution cannot handle. Review these regularly to maintain entity registry quality. Unresolved queue items do not block ingestion — they use the best available match and flag it for review.
Let the system learn
Auto-registration is designed to build the entity registry organically. Avoid manually populating the registry for every possible entity. Instead, let natural conversations populate it and use the review queue to catch errors.
Treat the registry as master data
The entity registry is not just a deduplication tool — it is your application’s master data store for entities. Invest in keeping it clean: merge duplicates, correct canonical names, and enrich metadata. The quality of entity resolution improves directly with registry quality.