GBrain AI Agent Knowledge System
GBrain is an open-source personal knowledge system designed specifically for AI Agents. Created by Garry Tan (President & CEO of Y Combinator), it functions as a "second brain" that allows agents to query personal data before interactions and update it afterwards, creating a compound interest effect where the system becomes "smarter the more it is used."^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md]
Core Architecture
GBrain operates on a three-layer architecture centered around a git repository as the single source of truth^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md]:
- Brain Repo: A git repository containing Markdown files that serves as the system's record of truth. Humans can edit this directly^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- GBrain (Retrieval Layer): A search engine powered by Postgres and `pgvector`. It handles hybrid search capabilities (vector + keyword) and exposes data to the agent^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- AI Agent: The consumer of the knowledge, utilizing 25 distinct "skills" to define how to read, enrich, and write to the brain^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
The system creates a closed loop: the agent retrieves context from the brain before acting and updates the brain with new insights or entities after interacting^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
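The read-before, write-after loop described above can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions: the `Brain` interface, the `InMemoryBrain` class, and the `runTurn` function are hypothetical names rather than GBrain's actual API, and an in-memory map stands in for the git repository and the retrieval layer.

```typescript
// Hypothetical interface for illustration; this is not GBrain's real API.
interface Brain {
  query(topic: string): string[];
  update(topic: string, insight: string): void;
}

// In-memory stand-in for the brain repo plus retrieval layer (an assumption).
class InMemoryBrain implements Brain {
  private notes = new Map<string, string[]>();

  query(topic: string): string[] {
    // Return a copy so callers cannot mutate stored notes.
    return [...(this.notes.get(topic) ?? [])];
  }

  update(topic: string, insight: string): void {
    const existing = this.notes.get(topic) ?? [];
    this.notes.set(topic, [...existing, insight]);
  }
}

// One turn of the loop: retrieve context, act on it, write the result back.
function runTurn(
  brain: Brain,
  topic: string,
  act: (context: string[]) => string,
): string {
  const context = brain.query(topic); // read before acting
  const insight = act(context);       // the agent's actual work
  brain.update(topic, insight);       // write after acting
  return insight;
}

const brain = new InMemoryBrain();
const summarize = (ctx: string[]): string => `seen ${ctx.length} prior note(s)`;
const first = runTurn(brain, "acme", summarize);
const second = runTurn(brain, "acme", summarize);
```

The second turn sees the note written by the first, which is the compounding behavior the loop is meant to produce.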
Knowledge Model
GBrain structures information using a Compiled Truth + Timeline model^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md]. Each page contains:
- Frontmatter: Metadata including `type` (person, company, concept, deal, etc.), `title`, and `tags`^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Compiled Truth: The current best understanding of a topic, rewritten as new evidence emerges^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Timeline: An append-only log of events (e.g., meetings, updates) that is never edited, only added to^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
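The three parts of a page map naturally onto a small data structure. The sketch below is an assumption-laden illustration, not GBrain's schema: the `BrainPage` and `TimelineEntry` types, the helper functions, the sample data, and the rendered layout (YAML-style frontmatter followed by a `## Timeline` heading) are all hypothetical. It shows the key asymmetry in the model: compiled truth is replaced, while the timeline only grows.

```typescript
// Hypothetical types for illustration; GBrain's real schema may differ.
interface TimelineEntry {
  date: string; // ISO date string, e.g. "2025-01-15"
  text: string;
}

interface BrainPage {
  type: string; // e.g. "person", "company", "concept", "deal"
  title: string;
  tags: string[];
  compiledTruth: string;
  timeline: readonly TimelineEntry[];
}

// Timeline is append-only: existing entries are copied, never edited.
function appendEvent(page: BrainPage, entry: TimelineEntry): BrainPage {
  return { ...page, timeline: [...page.timeline, entry] };
}

// Compiled truth is rewritten wholesale; the timeline is left untouched.
function rewriteTruth(page: BrainPage, compiledTruth: string): BrainPage {
  return { ...page, compiledTruth };
}

// Assumed Markdown layout: frontmatter, compiled truth, then the timeline.
function renderPage(page: BrainPage): string {
  const frontmatter = [
    "---",
    `type: ${page.type}`,
    `title: ${page.title}`,
    `tags: [${page.tags.join(", ")}]`,
    "---",
  ].join("\n");
  const timeline = page.timeline
    .map((entry) => `- ${entry.date}: ${entry.text}`)
    .join("\n");
  return `${frontmatter}\n\n${page.compiledTruth}\n\n## Timeline\n${timeline}\n`;
}

// Made-up sample data, for demonstration only.
const original: BrainPage = {
  type: "company",
  title: "Acme",
  tags: ["example"],
  compiledTruth: "Acme is an example company.",
  timeline: [{ date: "2025-01-15", text: "Intro meeting." }],
};

const updated = rewriteTruth(
  appendEvent(original, { date: "2025-02-01", text: "Follow-up call." }),
  "Acme is an example company; a follow-up call has taken place.",
);
```

Returning new objects instead of mutating in place keeps the earlier version of the page intact, which mirrors how a git-backed store preserves history.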
Search & Retrieval
The retrieval system utilizes Hybrid Search to find relevant context^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Query Expansion: Uses Anthropic Claude Haiku to expand queries before searching^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Dual Search: Combines vector search (HNSW cosine similarity using `text-embedding-3-large`) and keyword search (Postgres `tsvector` + `ts_rank` + `pg_trgm`)^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Fusion: RRF (Reciprocal Rank Fusion) merges the two ranked result lists, scoring each document with the formula `score = sum(1/(60 + rank))`, summed over the lists in which the document appears^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Chunking Strategies: Employs three distinct strategies depending on content: Recursive (for timelines), Semantic (for compiled truth), and LLM-guided (for high-value content)^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
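The fusion step is simple enough to show directly. Below is a minimal TypeScript sketch of Reciprocal Rank Fusion using the constant 60 from the formula above. It is not taken from GBrain's code: the function name `reciprocalRankFusion`, the 1-based rank convention, and the sample document ids are assumptions made for illustration.

```typescript
// Fuse several ranked lists of document ids (best match first) with RRF.
// Assumption: ranks are 1-based, so the top hit contributes 1 / (k + 1).
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

// Made-up result lists standing in for the vector and keyword searches.
const vectorHits = ["a", "b", "c"];
const keywordHits = ["b", "d", "a"];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
const fusedOrder = fused.map(([id]) => id); // ["b", "a", "d", "c"]
```

Here `b` scores 1/62 + 1/61 and `a` scores 1/61 + 1/63, so both outrank documents that appear in only one list. Because only ranks are used, the vector and keyword scores never need to be normalized to a common scale.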
Agent Skills
GBrain's functionality is defined by 25 skills that manage the lifecycle of information^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
Core Operations
- signal-detector: Runs on every message to capture original ideas and entity mentions^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- brain-ops: Intercepts external API calls to perform a read-enrich-write cycle^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- enrich: Tiered enrichment (Tier 1/2/3) that creates or updates person/company pages^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
Ingestion & Maintenance
- ingest & media-ingest: Routes inputs (articles, PDFs, audio, GitHub repos) into the brain, extracting entities and creating cross-links^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- meeting-ingestion: Processes meeting transcripts to create pages and enrich attendee profiles^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- maintain: Performs health checks: fixing broken links, auditing citations, and identifying stale pages^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
Daily Workflow
- daily-task-prep: Provides morning briefings with calendar context and background on attendees^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- briefing: Generates daily summaries of active deals, citations, and meeting contexts^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
Technical Stack
GBrain is built for performance and ease of deployment^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Runtime: Bun (TypeScript)^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Database: Supports both PGLite (embedded, local Postgres 17.5) for zero-config setups and Supabase for hosted environments^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Vector Search: `pgvector` with HNSW indexing^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Integration: Available as a CLI, an MCP Server (30+ tools for Claude Code/Cursor), or a TypeScript library^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
Key Features
- Git-Native: The brain is a standard git repository, allowing for version control, human editing, and easy synchronization^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- MCP Support: Exposes over 30 tools via the Model Context Protocol for integration with IDEs and agents^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
- Automated Workflows: Supports integrations like Voice-to-Brain, Email-to-Brain, and Calendar-to-Brain for automatic data capture^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
Sources
001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md