Tiered Entity Enrichment
Tiered Entity Enrichment is a data processing strategy designed to balance the need for comprehensive knowledge about entities (such as people or companies) with the constraints of API rate limits and processing costs^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
This approach recognizes that while an AI agent may encounter thousands of entities, not all of them require the same level of detail. By sorting entities into "tiers," the system can prioritize deep research for high-value subjects while applying lighter processing to the rest, keeping the knowledge base rich and up to date without exhausting external resources^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
Implementation
The core logic for this strategy is typically implemented within an enrichment skill, such as the enrich skill found in systems like GBrain^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md]. This skill functions as a router that determines how much effort to expend on a specific entity.
The Three Tiers
The enrichment process generally classifies entities into three distinct levels:
- Tier 1: Data Cleaning. The most basic level of processing. Here, the system focuses on normalizing raw data to ensure consistency. This might involve formatting names, standardizing dates, or correcting typos in existing descriptions without fetching new information^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Tier 2: Passive Enrichment. A moderate level of enhancement. The system leverages information already present within the internal knowledge base to flesh out the entity's profile. For example, it might link an entity to existing meetings, projects, or documents where it is mentioned, building a profile based solely on "local" context^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Tier 3: External Deep Search. The most intensive level, reserved for high-priority or high-frequency entities. This involves making external API calls (e.g., to LinkedIn, Crunchbase, or search engines) to actively fetch new, up-to-date information from the web^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md]. Because these calls are costly and rate-limited, this tier is applied selectively.
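The three tiers can be sketched as three small handlers. This is only an illustration and not GBrain's actual code: the `Tier`, `Entity`, `clean`, `passive_enrich`, and `deep_search` names are hypothetical, Tier 1 is reduced to whitespace normalization, and the external lookup is passed in as a plain callable so that no real API is assumed.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Callable


class Tier(IntEnum):
    CLEAN = 1        # Tier 1: normalize existing data only
    PASSIVE = 2      # Tier 2: enrich from the local knowledge base
    DEEP_SEARCH = 3  # Tier 3: call external sources


@dataclass
class Entity:
    name: str
    description: str = ""
    mentions: list[str] = field(default_factory=list)  # ids of local documents
    external_facts: dict[str, str] = field(default_factory=dict)


def clean(entity: Entity) -> Entity:
    """Tier 1: collapse stray whitespace; no new information is fetched."""
    entity.name = " ".join(entity.name.split())
    entity.description = " ".join(entity.description.split())
    return entity


def passive_enrich(entity: Entity, local_docs: dict[str, str]) -> Entity:
    """Tier 2: link the entity to local documents that mention its name."""
    needle = entity.name.lower()
    entity.mentions = sorted(
        doc_id for doc_id, text in local_docs.items() if needle in text.lower()
    )
    return entity


def deep_search(entity: Entity, fetch: Callable[[str], dict[str, str]]) -> Entity:
    """Tier 3: merge facts returned by an injected (hypothetical) external lookup."""
    entity.external_facts.update(fetch(entity.name))
    return entity
```

Injecting `fetch` keeps the expensive step swappable, so a stub can stand in during testing and a rate-limited client can be used in production.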
Routing Logic
To determine which tier an entity falls into, the system evaluates specific triggers:
- Frequency: Entities that appear often in daily interactions (e.g., frequent collaborators) are flagged as higher priority.
- Recency: Entities recently mentioned or encountered may require updated information.
- Relevance: Entities directly related to active projects or "Most Important Tasks" (MITs) are often escalated to Tier 3 to ensure the user has the most current context^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
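One way the triggers above might combine into a tier decision is sketched below. The source does not specify thresholds or how the signals are weighted, so everything here is an assumption: the `Signals` fields, the `choose_tier` function, the default of 5 mentions, the 7-day window, and the `external_budget` counter standing in for rate-limit headroom.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Signals:
    mention_count: int    # Frequency: how often the entity appears
    last_seen: datetime   # Recency: time of the most recent mention
    linked_to_mit: bool   # Relevance: tied to an active project or MIT


def choose_tier(
    signals: Signals,
    now: datetime,
    external_budget: int,
    frequent_threshold: int = 5,                   # assumed value
    recent_window: timedelta = timedelta(days=7),  # assumed value
) -> int:
    """Return 1, 2, or 3 using illustrative thresholds."""
    is_frequent = signals.mention_count >= frequent_threshold
    is_recent = now - signals.last_seen <= recent_window

    # Tier 3 spends external calls, so it also requires remaining budget.
    wants_deep = signals.linked_to_mit or (is_frequent and is_recent)
    if wants_deep and external_budget > 0:
        return 3
    # Any single trigger is enough for local (passive) enrichment.
    if is_frequent or is_recent or signals.linked_to_mit:
        return 2
    return 1
```

In this sketch an entity that qualifies for Tier 3 falls back to Tier 2 when the budget is exhausted, which is one way to honor the rate-limit constraint without dropping the entity entirely.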
Related Concepts
- [[Entity Extraction]]
- [[AI Agent Skills]]
- [[Hybrid Retrieval]]
- [[Knowledge Graph]]
Sources
001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md