Graphify
Graphify is an open-source tool and skill that converts arbitrary folders of code, documentation, papers, images, or videos into a queryable [[Knowledge Graph]]. It runs as a skill for AI programming assistants, letting them answer questions about codebase structure and design decisions from the graph rather than from the raw files, which reduces token consumption^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md].
Core Problem & Solution
Traditional AI programming assistants struggle with large codebases in two ways: they repeatedly re-read entire files, which consumes large numbers of tokens, and they have difficulty connecting heterogeneous materials such as code, documentation, and media^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md].
Graphify addresses this by extracting structure, establishing relationships, and persisting a knowledge graph. Subsequent AI queries interact with this compressed graph rather than the raw source files, reportedly reducing query token consumption by up to 71.5x in specific benchmarks^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md].
Key Features
| Feature | Description |
|---|---|
| Multi-modal Input | Supports code (25 languages via tree-sitter), PDFs, Markdown, screenshots, whiteboard photos, videos, and audio^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md]. |
| Local Processing | Code parsing via tree-sitter and media transcription via faster-whisper are performed locally to minimize LLM API costs^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md]. |
| Leiden Clustering | Runs Leiden community detection on the graph topology to group related nodes, with no dependency on a vector database^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md]. |
| Confidence Labels | Tags relationships as EXTRACTED, INFERRED, or AMBIGUOUS to distinguish between factual parsing and AI guesswork^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md]. |
| Incremental Updates | SHA256 caching ensures that only changed files are re-processed^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md]. |
| MCP Server | Can run as a Model Context Protocol server to expose tools like query_graph, get_node, and shortest_path^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md]. |
| Git Integration | Supports post-commit and post-checkout hooks to automatically rebuild the graph^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md]. |
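The incremental-update row in the table above rests on content hashing: a file is re-processed only when its SHA-256 digest differs from the one recorded on the previous run. A minimal Python sketch of that idea follows. The cache layout (a dict mapping relative path to hex digest) and the names `file_digest` and `changed_files` are hypothetical, chosen for illustration; they are not Graphify's actual internals.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(root: Path, cache: dict[str, str]) -> list[Path]:
    """Return files under root whose digest differs from the cached one.

    The cache (hypothetical layout: relative path -> hex digest) is updated
    in place, so a second call with no edits in between returns [].
    """
    changed = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = str(path.relative_to(root))
        digest = file_digest(path)
        if cache.get(key) != digest:
            cache[key] = digest
            changed.append(path)
    return changed
```

This sketch does not handle deleted files; a fuller version would also drop cache entries whose paths no longer exist.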
Workflow: Three-Pass Scanning
Graphify processes input data through a three-pass pipeline^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md]:
- Pass 1 (AST Scan): Uses tree-sitter locally to parse code files, extracting classes, functions, imports, call graphs, docstrings, and design comments without using LLM tokens^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md].
- Pass 2 (Media Transcription): Uses faster-whisper locally to transcribe audio and video. Domain-aware prompts improve transcription accuracy^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md].
- Pass 3 (Semantic Extraction): Parallel LLM sub-agents (Claude/GPT) process documents, papers, images, and transcribed text to extract concepts, relationships, and design decisions^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md].
The extracted data is merged into a NetworkX graph, analyzed with Leiden community detection, and written out as an HTML visualization, a JSON graph file, and a Markdown report^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md].
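The merge step can be pictured as combining the edge lists from each pass into one graph while keeping a confidence label on every relationship. The sketch below uses plain dictionaries instead of NetworkX so that it runs with the standard library alone. The `Edge` type, the rule that the most certain label wins when two passes report the same relationship, and the sample node names are all illustrative assumptions, not Graphify's documented behavior.

```python
from dataclasses import dataclass

# Lower rank means more certain. The label names come from the feature table;
# the ordering used to resolve duplicates is an assumption for this sketch.
RANK = {"EXTRACTED": 0, "INFERRED": 1, "AMBIGUOUS": 2}


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: str
    confidence: str  # one of the keys in RANK


def merge_passes(*passes: list[Edge]) -> dict[tuple[str, str, str], str]:
    """Merge edge lists, keeping the most certain label per relationship."""
    merged: dict[tuple[str, str, str], str] = {}
    for edges in passes:
        for e in edges:
            if e.confidence not in RANK:
                raise ValueError(f"unknown confidence label: {e.confidence}")
            key = (e.source, e.target, e.relation)
            current = merged.get(key)
            if current is None or RANK[e.confidence] < RANK[current]:
                merged[key] = e.confidence
    return merged


# Hypothetical sample data: one edge found by parsing, two proposed by an LLM.
ast_pass = [Edge("auth.py", "DigestAuth", "defines", "EXTRACTED")]
llm_pass = [
    Edge("auth.py", "DigestAuth", "defines", "INFERRED"),
    Edge("DigestAuth", "Response", "related_to", "INFERRED"),
]
graph = merge_passes(ast_pass, llm_pass)
```

Here the relationship reported by both passes keeps its EXTRACTED label, so a later query can still tell parsed facts from model guesses.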
Usage & Installation
The package is published on PyPI as graphifyy (spelled with two "y"s); the installed command is graphify^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md].
Installation
pip install graphifyy
graphify install
Common Commands
# Generate graph for current directory
/graphify .
# Deep inference mode
/graphify ./raw --mode deep
# Incremental update (only changed files)
/graphify ./raw --update
# Export to Obsidian vault
/graphify ./raw --obsidian
# Watch for file changes and auto-rebuild
/graphify ./raw --watch
# Query the graph
graphify query "show the auth flow"
graphify path "DigestAuth" "Response"
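A path query such as the last command above amounts to a shortest-path search between two named nodes. The following Python sketch shows the idea with a breadth-first search over an undirected adjacency map. The edge list is made-up sample data, and `shortest_path` here is a hypothetical helper, not the implementation behind the graphify path command.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (source, target) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shortest_path(adj, start, goal):
    """Return one shortest path from start to goal as a list, or None."""
    if start not in adj or goal not in adj:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj[node]):  # sorted for deterministic output
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical sample edges; real node names depend on the scanned project.
edges = [
    ("DigestAuth", "AuthBase"),
    ("AuthBase", "Session"),
    ("Session", "Response"),
    ("DigestAuth", "Request"),
]
adj = build_adjacency(edges)
path = shortest_path(adj, "DigestAuth", "Response")
```

Breadth-first search is enough when every edge counts equally; a weighted graph would need a different algorithm such as Dijkstra's.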
Platform Support
Graphify provides "always-on" integration for numerous AI coding platforms^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md].
- Claude Code: Uses CLAUDE.md and PreToolUse hooks.
- Cursor: Installs rules to .cursor/rules/graphify.mdc.
- Aider / Hermes / Codex / Trae: Integrates via AGENTS.md or platform-specific skill files.
- VS Code Copilot: Installs copilot-instructions.md.
Output Artifacts
The tool generates a graphify-out/ directory containing^[001-TODO__Graphify_-_AI编程助手知识图谱技能.md]:
* graph.html: An interactive visualization (browser-based).
* GRAPH_REPORT.md: A report containing "God nodes" (central concepts), "Surprising Connections", and suggested questions.
* graph.json: The persistent graph data usable by the MCP server or other tools.
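Because graph.json is a JSON file, other tools can read it directly. The sketch below ranks nodes by degree as a rough proxy for the "God nodes" listed in the report. It assumes a node-link layout with top-level `nodes` and `links` arrays, a shape NetworkX can also export. The source does not describe the actual schema of graph.json or how the report selects its central nodes, so the key names and the degree-based ranking are assumptions.

```python
import json
from collections import Counter


def top_nodes(graph: dict, n: int = 3) -> list[tuple[str, int]]:
    """Return the n highest-degree node ids with their degree."""
    # Start every node at zero so isolated nodes still appear in the ranking.
    degree = Counter({node["id"]: 0 for node in graph["nodes"]})
    for link in graph["links"]:
        degree[link["source"]] += 1
        degree[link["target"]] += 1
    # Sort by degree descending, then by id so ties are deterministic.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Inline sample standing in for the contents of graphify-out/graph.json.
# The "nodes"/"links" keys are an assumed schema, not a documented one.
sample = json.loads("""
{
  "nodes": [{"id": "Session"}, {"id": "DigestAuth"},
            {"id": "Response"}, {"id": "README"}],
  "links": [{"source": "Session", "target": "DigestAuth"},
            {"source": "Session", "target": "Response"},
            {"source": "DigestAuth", "target": "Response"},
            {"source": "Session", "target": "README"}]
}
""")
ranking = top_nodes(sample, n=2)
```

To run this against a real file, replace the inline sample with `json.load` on the opened file and adjust the key names to whatever the file actually contains.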
Related Concepts
- [[Tree-sitter]]: The underlying parsing engine used for code structure extraction.
- Model Context Protocol (MCP): The protocol standard that Graphify's MCP server uses to expose its tools to AI agents.
- [[Knowledge Graph]]: The fundamental data structure Graphify generates.
Sources
001-TODO__Graphify_-_AI编程助手知识图谱技能.md