GBrain AI Agent Knowledge System

GBrain is an open-source personal knowledge system designed for AI agents. Created by Garry Tan (President & CEO of Y Combinator), it functions as a "second brain": an agent queries personal data before an interaction and updates it afterwards, producing a compounding effect in which the system becomes "smarter the more it is used."^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md]

Core Architecture

GBrain operates on a three-layer architecture centered around a git repository as the single source of truth^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md]:

  • Brain Repo: A git repository containing Markdown files that serves as the system's record of truth. Humans can edit this directly^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • GBrain (Retrieval Layer): A search engine powered by Postgres and pgvector. It handles hybrid search capabilities (vector + keyword) and exposes data to the agent^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • AI Agent: The consumer of the knowledge. It uses 25 distinct "skills" that define how to read, enrich, and write to the brain^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].

The system creates a closed loop: the agent retrieves context from the brain before acting and updates the brain with new insights or entities after interacting^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
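The read-before, write-after loop can be sketched in a few lines. This is a minimal illustration, not GBrain's actual API: the `Brain` interface, the `InMemoryBrain` stand-in, and `handleInteraction` are hypothetical names, and a real agent would call an LLM where the comment indicates.

```typescript
// Hypothetical brain interface; GBrain's real API may differ.
interface Brain {
  query(topic: string): string[];
  update(topic: string, note: string): void;
}

// In-memory stand-in for the retrieval layer, for illustration only.
class InMemoryBrain implements Brain {
  private notes = new Map<string, string[]>();

  query(topic: string): string[] {
    // Return a copy so callers cannot mutate stored notes.
    return (this.notes.get(topic) ?? []).slice();
  }

  update(topic: string, note: string): void {
    const existing = this.notes.get(topic) ?? [];
    this.notes.set(topic, existing.concat(note));
  }
}

// One turn of the loop: read context, act, write back what was learned.
function handleInteraction(brain: Brain, topic: string, newInsight: string): number {
  const context = brain.query(topic); // 1. retrieve before acting
  // 2. act (a real agent would pass `context` to a model here)
  brain.update(topic, newInsight); // 3. update after the interaction
  return context.length; // how much context was available this turn
}

const brain = new InMemoryBrain();
const firstTurn = handleInteraction(brain, "acme", "Met the founder.");
const secondTurn = handleInteraction(brain, "acme", "Seed round closed.");
console.log(firstTurn, secondTurn); // 0 1
```

The second turn sees the note written by the first, which is the compounding behavior the loop is meant to produce.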

Knowledge Model

GBrain structures information using a Compiled Truth + Timeline model^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md]. Each page contains:

  • Frontmatter: Metadata including type (person, company, concept, deal, etc.), title, and tags^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • Compiled Truth: The current best understanding of a topic, rewritten as new evidence emerges^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • Timeline: An append-only log of events (e.g., meetings, updates) that is never edited, only added to^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
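The difference between the two body sections is easiest to see as types. The sketch below is an assumed in-memory shape, not GBrain's schema: the interface and field names (`BrainPage`, `compiledTruth`, `TimelineEntry`, and so on) and the sample data are hypothetical. It shows compiled truth being replaced wholesale while the timeline only grows.

```typescript
// Hypothetical in-memory shape of a brain page; field names are assumptions.
interface Frontmatter {
  type: string; // e.g. "person", "company", "concept", "deal"
  title: string;
  tags: string[];
}

interface TimelineEntry {
  date: string; // ISO date string
  text: string;
}

interface BrainPage {
  frontmatter: Frontmatter;
  compiledTruth: string;
  timeline: readonly TimelineEntry[];
}

// Compiled truth is rewritten: the old text is replaced, the timeline is untouched.
function rewriteCompiledTruth(page: BrainPage, text: string): BrainPage {
  return { ...page, compiledTruth: text };
}

// The timeline is append-only: existing entries are kept as-is, one is added.
function appendTimeline(page: BrainPage, entry: TimelineEntry): BrainPage {
  return { ...page, timeline: page.timeline.concat(entry) };
}

// Made-up sample page for illustration.
const original: BrainPage = {
  frontmatter: { type: "company", title: "Acme", tags: ["portfolio"] },
  compiledTruth: "Acme is raising a seed round.",
  timeline: [{ date: "2025-01-10", text: "Intro call with founder." }],
};

const updated = rewriteCompiledTruth(
  appendTimeline(original, { date: "2025-02-01", text: "Seed round closed." }),
  "Acme has closed its seed round.",
);

console.log(updated.compiledTruth, updated.timeline.length);
```

Marking `timeline` as `readonly` lets the compiler reject in-place edits, which mirrors the "never edited, only added to" rule.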

Search & Retrieval

The retrieval layer uses hybrid search, combining vector and keyword results, to find relevant context^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].

  • Query Expansion: Uses Anthropic Claude Haiku to expand queries before searching^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • Dual Search: Combines vector search (HNSW cosine similarity using text-embedding-3-large) and keyword search (Postgres tsvector + ts_rank + pg_trgm)^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
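  • Fusion: Reciprocal Rank Fusion (RRF) merges the vector and keyword result lists with the formula score = sum(1/(60 + rank)), where the sum runs over every list in which a result appears and rank is its position in that list^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].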
  • Chunking Strategies: Employs three distinct strategies depending on content: Recursive (for timelines), Semantic (for compiled truth), and LLM-guided (for high-value content)^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
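The fusion step is small enough to show in full. The following is a generic sketch of Reciprocal Rank Fusion with the constant 60 from the formula above, not GBrain's own code. It assumes 1-based ranks and no duplicate IDs within a list; the function name `rrfFuse` and the page IDs are hypothetical.

```typescript
// Reciprocal Rank Fusion over several ranked lists of page IDs.
// Each list is ordered best-first; ranks are assumed to be 1-based.
const RRF_K = 60; // constant from score = sum(1/(60 + rank))

function rrfFuse(rankedLists: string[][], k: number = RRF_K): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Made-up result lists standing in for vector and keyword search output.
const vectorHits = ["people/alice", "companies/acme", "concepts/rrf"];
const keywordHits = ["companies/acme", "deals/acme-seed", "people/alice"];

const fused = rrfFuse([vectorHits, keywordHits]);
for (const [id, score] of fused) {
  console.log(id, score.toFixed(4));
}
```

In this example `companies/acme` (ranks 2 and 1) edges out `people/alice` (ranks 1 and 3), and both beat the pages that appear in only one list. Because RRF uses ranks rather than raw scores, cosine similarities and ts_rank values never need to be normalized against each other.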

Agent Skills

GBrain's functionality is defined by 25 skills that manage the lifecycle of information^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].

Core Operations

  • signal-detector: Runs on every message to capture original ideas and entity mentions^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • brain-ops: Intercepts external API calls to perform a read-enrich-write cycle^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • enrich: Tiered enrichment (Tier 1/2/3) that creates or updates person/company pages^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].

Ingestion & Maintenance

  • ingest & media-ingest: Routes inputs (articles, PDFs, audio, GitHub repos) into the brain, extracting entities and creating cross-links^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • meeting-ingestion: Processes meeting transcripts to create pages and enrich attendee profiles^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • maintain: Performs health checks: fixes broken links, audits citations, and identifies stale pages^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].

Daily Workflow

  • daily-task-prep: Provides morning briefings with calendar context and background on attendees^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • briefing: Generates daily summaries of active deals, citations, and meeting contexts^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].

Technical Stack

GBrain is built for performance and ease of deployment^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].

  • Runtime: Bun (TypeScript)^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • Database: Supports both PGLite (embedded, local Postgres 17.5) for zero-config setups and Supabase for hosted environments^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • Vector Search: pgvector with HNSW indexing^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • Integration: Available as a CLI, an MCP Server (30+ tools for Claude Code/Cursor), or a TypeScript library^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].

Key Features

  • Git-Native: The brain is a standard git repository, allowing for version control, human editing, and easy synchronization^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • MCP Support: Exposes over 30 tools via the Model Context Protocol for integration with IDEs and agents^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].
  • Automated Workflows: Supports integrations like Voice-to-Brain, Email-to-Brain, and Calendar-to-Brain for automatic data capture^[001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md].

Sources

  • 001-TODO__GBrain_-_AI_Agent_个人知识库与混合检索引擎.md