Overview
Entity Resolution (ER) is the process of identifying when different mentions in your data refer to the same real-world entity. When a user says “my manager Sarah”, “S. Chen”, and “Sarah Chen” across different conversations, Synap’s ER system recognizes these as the same person and links them to a single canonical entity. Entity resolution runs automatically during ingestion. No explicit SDK calls are required to trigger or configure it. As you ingest more data through sdk.memories.create(), Synap continuously builds and refines its entity registry, improving resolution accuracy over time.
How It Works
When a document enters the ingestion pipeline, entity resolution occurs as part of the extraction stage. Here is what happens behind the scenes:
Entity Extraction
The ingestion pipeline identifies entity mentions in the document — people, organizations, projects, locations, products, and other named entities. Each mention is extracted with surrounding context to aid disambiguation.
Scope-Based Lookup
For each extracted entity, Synap searches the entity registry using a narrowest-first scope chain: USER, then CUSTOMER, then CLIENT, then WORLD. The search stops at the first scope where a match is found.
Matching and Resolution
The ER system uses a combination of exact string matching, normalized name matching, and semantic similarity (via 384-dimensional vector embeddings) to find potential matches in the registry. Matches above the confidence threshold are resolved.
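The three matching strategies can be pictured as a cascade. The sketch below is illustrative only and does not show Synap's internals: the ordering of the checks, the `SIMILARITY_THRESHOLD` value, the `normalize` and `match` helpers, and the 3-dimensional toy vectors (standing in for the 384-dimensional embeddings) are all assumptions made for this example.

```python
import math
import re

# Hypothetical threshold; the actual value used by Synap is not documented here.
SIMILARITY_THRESHOLD = 0.85

def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def match(mention: str, mention_vec: list[float], registry: dict[str, list[float]]):
    """Return (canonical_name, method) for the best match, or None.

    `registry` maps canonical names to their embedding vectors.
    """
    # 1. Exact string match.
    if mention in registry:
        return mention, "exact"
    # 2. Normalized name match.
    target = normalize(mention)
    for name in registry:
        if normalize(name) == target:
            return name, "normalized"
    # 3. Semantic similarity, accepted only above the threshold.
    best_name, best_score = None, 0.0
    for name, vec in registry.items():
        score = cosine(mention_vec, vec)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= SIMILARITY_THRESHOLD:
        return best_name, "semantic"
    return None

# Toy registry with made-up vectors.
registry = {"Sarah Chen": [0.9, 0.1, 0.0], "Project Atlas": [0.0, 0.2, 0.9]}
```

With this toy registry, `match("S. Chen", [0.88, 0.15, 0.05], registry)` falls through the first two checks and resolves through the similarity step, while a mention whose vector is far from every entry returns `None`.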
Canonical Name Assignment
When a match is found, the extracted entity’s canonical_name is set to the registry entry’s canonical form. All future references resolve to this same canonical entity, enabling cross-conversation linking.
The Scope Chain
Synap searches for entity matches across scopes in a strict order, from narrowest to broadest:

| Priority | Scope | Description | Example |
|---|---|---|---|
| 1 | USER | Entities visible only to a specific user | “my wife” resolves to “Emily” for user_alice but “Maria” for user_bob |
| 2 | CUSTOMER | Entities shared across all users in a customer/organization | “the CEO” resolves to “Jane Smith” for all users at Acme Corp |
| 3 | CLIENT | Entities shared across all customers within your Synap client | Product names, internal project codenames |
| 4 | WORLD | Global entities (reserved for future use) | Well-known public entities |
The scope chain stops at the first match. If “Project Atlas” exists at both USER and CUSTOMER scope, the USER-scope entry is used. This ensures personal context is not overridden by organizational context.
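The narrowest-first lookup can be sketched in a few lines. This is a minimal illustration, not Synap's implementation: the `lookup` function, the nested-dictionary registry layout, and the sample canonical names are hypothetical, and the lookup uses plain key equality instead of the real matching logic.

```python
from typing import Optional

# Narrowest-first order, as listed in the table above.
SCOPE_CHAIN = ["USER", "CUSTOMER", "CLIENT", "WORLD"]

def lookup(mention: str, registries: dict[str, dict[str, str]]) -> Optional[tuple[str, str]]:
    """Return (scope, canonical_name) from the first scope that has a match."""
    for scope in SCOPE_CHAIN:
        canonical = registries.get(scope, {}).get(mention)
        if canonical is not None:
            return scope, canonical  # stop at the first (narrowest) match
    return None

# Hypothetical registries as seen by one user; the sample names are made up.
registries = {
    "USER": {"Project Atlas": "Project Atlas (personal)"},
    "CUSTOMER": {
        "Project Atlas": "Project Atlas (company)",
        "the CEO": "Jane Smith",
    },
}

print(lookup("Project Atlas", registries))
print(lookup("the CEO", registries))
```

Because the loop returns on the first hit, the USER-scope entry for “Project Atlas” shadows the CUSTOMER-scope one, while “the CEO” falls through to CUSTOMER scope.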
Auto-Registration
When the ER system encounters an entity that does not match any existing registry entry, it automatically registers the entity at CUSTOMER scope. This is a deliberate design choice with important implications:
- The system improves over time. The first mention of “Sarah Chen” creates a registry entry. Every subsequent mention of “S. Chen”, “Sarah”, or “Chen” can now resolve to the same canonical entity.
- No manual entity setup required. You do not need to pre-populate an entity registry. The system bootstraps itself from your data.
- CUSTOMER scope is the default. Auto-registered entities are visible to all users within the same customer, which is the right default for most organizational entities (team members, projects, products).
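The resolve-or-register behavior described above can be approximated as follows. This is a sketch under stated assumptions: the `EntityRegistry` class, its `resolve_or_register` method, and the case-insensitive key are hypothetical stand-ins, and real matching is richer than a dictionary lookup.

```python
class EntityRegistry:
    """Toy in-memory registry keyed by (scope, customer_id, lowercased name)."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str, str], str] = {}

    def resolve_or_register(self, mention: str, customer_id: str) -> tuple[str, bool]:
        """Return (canonical_name, was_registered).

        A miss registers the mention at CUSTOMER scope, so later mentions
        from any user of the same customer resolve to it.
        """
        key = ("CUSTOMER", customer_id, mention.lower())
        if key in self._entries:
            return self._entries[key], False
        self._entries[key] = mention  # first mention becomes the canonical form
        return mention, True

registry = EntityRegistry()
first = registry.resolve_or_register("Sarah Chen", "acme")    # new entry
second = registry.resolve_or_register("sarah chen", "acme")   # resolves to it
other = registry.resolve_or_register("Sarah Chen", "globex")  # separate customer
```

In this sketch the second call resolves to the entry created by the first, while the same name under a different customer is registered separately.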
What Happens to Resolved Entities
When an entity is resolved, the extracted entity’s canonical_name field is populated with the registry’s canonical form. This has downstream effects throughout Synap:
- Graph storage: Resolved entities are stored as nodes in the graph with edges connecting them to related entities, facts, and memories. Because they share a canonical identity, information about “Sarah Chen” from different conversations is linked in the graph.
- Retrieval quality: When you retrieve context with sdk.conversation.context.fetch(), the accurate mode traverses entity relationships in the graph. Because entities are resolved, asking about “S. Chen” returns facts from conversations that mentioned “Sarah Chen”.
- Cross-conversation linking: Facts extracted from one conversation (“Sarah Chen is the engineering lead”) become accessible when retrieving context for a different conversation that mentions “S. Chen”.
Scope-Based Visibility
Entities registered at a given scope are visible only to lookups within that scope and narrower scopes.
Scope-based visibility ensures that private entity mappings (like personal nicknames or relationship references) remain private to the user who created them, while organizational knowledge is shared across the team.
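One way to read the visibility rule is as a predicate over an entity and the caller making the lookup. The sketch below is an assumption-laden illustration: the `Entity` and `Caller` types, the `owner_id` field, and the `is_visible` function are hypothetical names, not part of the Synap SDK.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Entity:
    canonical_name: str
    scope: str                       # "USER", "CUSTOMER", "CLIENT", or "WORLD"
    owner_id: Optional[str] = None   # user id for USER scope, customer id for CUSTOMER scope

@dataclass(frozen=True)
class Caller:
    user_id: str
    customer_id: str

def is_visible(entity: Entity, caller: Caller) -> bool:
    """USER entities are private to their user, CUSTOMER entities are shared
    within one customer, and CLIENT/WORLD entities are visible to everyone."""
    if entity.scope == "USER":
        return entity.owner_id == caller.user_id
    if entity.scope == "CUSTOMER":
        return entity.owner_id == caller.customer_id
    return entity.scope in ("CLIENT", "WORLD")

alice = Caller(user_id="user_alice", customer_id="acme")
bob = Caller(user_id="user_bob", customer_id="acme")

wife = Entity("Emily", scope="USER", owner_id="user_alice")
ceo = Entity("Jane Smith", scope="CUSTOMER", owner_id="acme")
```

Here Alice's private mapping is invisible to Bob, while the CUSTOMER-scope entry is visible to both because they belong to the same customer.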
Review Queue
In some cases, the ER system encounters ambiguous matches where the confidence is above a minimum threshold but below the auto-resolution threshold. These cases are placed in a review queue for human review. Ambiguous matches typically occur when:
- Two entities have similar names but may be different people (e.g., “John Smith” in two different departments)
- A nickname could map to multiple canonical entities
- An entity’s context is insufficient to determine the correct match
The review queue will be manageable from the Synap Dashboard in a future release. Until then, ambiguous matches that fall below the auto-resolution threshold are auto-registered as new entities. As the system gathers more context about these entities, it may merge them automatically during later ingestions.
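The two thresholds split match confidence into three bands. The following sketch shows that banding only; the numeric values of `MIN_THRESHOLD` and `AUTO_RESOLVE_THRESHOLD` and the `decide` function are hypothetical, since the actual thresholds are not stated here.

```python
# Hypothetical values chosen for illustration.
MIN_THRESHOLD = 0.60
AUTO_RESOLVE_THRESHOLD = 0.85

def decide(confidence: float) -> tuple[str, bool]:
    """Return (action, is_ambiguous) for a candidate match.

    - At or above the auto-resolution threshold: resolve to the existing entity.
    - Between the two thresholds: ambiguous, so a review-queue candidate.
      Per the current behavior described above, it is registered as a new entity.
    - Below the minimum threshold: not a match, register a new entity.
    """
    if confidence >= AUTO_RESOLVE_THRESHOLD:
        return "resolve", False
    if confidence >= MIN_THRESHOLD:
        return "register_new", True
    return "register_new", False
```

The ambiguous flag is what a future review queue could use to surface cases for a human, while the action reflects what happens to the entity today.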
Practical Example
The following example demonstrates how entity resolution enriches retrieval results across multiple conversations. In this scenario, across three ingestions, Synap:
- Registered “Sarah Chen” during the first ingestion
- Resolved “S. Chen” to “Sarah Chen” during the second ingestion
- Resolved “Sarah” to “Sarah Chen” during the third ingestion (CUSTOMER scope match)
- Linked all extracted facts to the same canonical entity, enabling comprehensive retrieval
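The flow in the list above can be imitated with a small, self-contained simulation. It does not call the Synap SDK and is not how Synap matches names: the `compatible` alias rule (token equality or a single-letter initial) is a deliberately crude stand-in for the real matching, `ingest` is a hypothetical helper, and the second and third facts are made-up sample data.

```python
from collections import defaultdict

def compatible(mention: str, canonical: str) -> bool:
    """Toy alias rule: each mention token must equal a canonical token
    or be a single-letter initial of one."""
    canon_tokens = canonical.lower().split()
    tokens = [t.rstrip(".") for t in mention.lower().split()]
    if not tokens:
        return False
    for token in tokens:
        is_initial = len(token) == 1
        if not any(c == token or (is_initial and c.startswith(token)) for c in canon_tokens):
            return False
    return True

registry: list[str] = []                 # canonical names at CUSTOMER scope
facts: dict[str, list[str]] = defaultdict(list)

def ingest(mention: str, fact: str) -> str:
    """Resolve the mention, auto-registering it on a miss, and link the fact."""
    canonical = next((name for name in registry if compatible(mention, name)), None)
    if canonical is None:
        canonical = mention
        registry.append(canonical)
    facts[canonical].append(fact)
    return canonical

ingest("Sarah Chen", "is the engineering lead")   # first ingestion: registers
ingest("S. Chen", "approved the Q3 roadmap")      # second: resolves
ingest("Sarah", "prefers async standups")         # third: resolves

print(registry)
print(facts["Sarah Chen"])
```

After the three calls the registry holds a single canonical entry, and all three facts hang off it, which is what lets a later query about “S. Chen” surface facts recorded under “Sarah Chen”.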
Entity resolution is fully automatic. As you ingest more data, Synap builds a richer entity registry that improves future resolutions. The more context the system has about an entity, the better it becomes at resolving ambiguous references.
Impact on Retrieval Modes
Entity resolution primarily benefits the accurate retrieval mode, which traverses entity relationships in the graph. The fast mode relies on vector similarity alone and benefits from ER indirectly (resolved entities share embedding neighborhoods).
| Retrieval Mode | ER Benefit |
|---|---|
| fast | Indirect: resolved entities have better vector alignment |
| accurate | Direct: graph traversal follows entity relationships across conversations |
The accurate mode benefits most from a well-populated entity registry, because graph traversal then has more resolved entities and relationships to follow.
Next Steps
Ingesting Memories
Feed more data to build a richer entity registry.
Retrieving Memories
Query context that leverages resolved entities.
Scopes
Understand how scopes affect entity visibility and memory access.
Entities and Resolution
Deep dive into entity resolution concepts and architecture.