title: Memory Architecture Guide for Hermes Agent — Triple-Stack Persistent AI Memory
description: "Hermes Agent memory architecture guide covering the triple stack: Honcho peer memory, GBrain organizational knowledge, memcore-cloud cross-session recall, GraphRAG, and Session DB. Production-tested persistent AI agent memory."
category: knowledge
tags: [hermes-agent, memory, honcho, gbrain, memcore-cloud, graphrag, persistent-memory, knowledge-management]
last_updated: 2026-06-16
Hermes Agent Memory Architecture — The Triple Stack for Persistent AI Memory
Why Three Systems?
| Problem | Without Memory | Solution | System |
|---|---|---|---|
| "Who am I talking to?" | Agent forgets users between sessions | Peer identity, preferences, bans | Honcho |
| "Where is that file?" | Agent can't find code/docs it created | Organizational knowledge indexing | GBrain |
| "What happened last session?" | Agent repeats work, misses context | Cross-session context injection | memcore-cloud |
These aren't competitors — they solve different problems. Run all three.
Layer 1: Honcho — Peer Memory
What it solves: Persistent identity. Who the user is, what platforms are banned, what decisions were made, what rules apply.
Setup (2 minutes):
# Add Honcho MCP to Hermes
hermes mcp add honcho -- npx mcp-remote https://mcp.honcho.dev \
--header "Authorization: Bearer YOUR_HONCHO_API_KEY" \
--header "X-Honcho-Workspace-ID: your-workspace" \
--header "X-Honcho-User-Name: your-agent-name"
What it stores:
Peer: user
├── Preferences: stored and retrieved across sessions
├── Platform bans: tracked per platform
├── Rules: business rules and communication preferences
├── Decisions: dated decision log
└── Context: role, autonomy level, communication style
Peer: corpusiq-agent
├── Self-conclusions: What I learned, what I fixed
├── Session summaries: Previous session handoffs
└── Improvement log: Skills patched, patterns discovered
Key Honcho tools in Hermes:
mcp_honcho_get_peer_context — Load everything about a peer
mcp_honcho_chat — Ask questions about peer knowledge
mcp_honcho_search — Semantic search across conversations
mcp_honcho_schedule_dream — Queue memory consolidation
Pattern: Pre-flight gate
# Before EVERY external action, check Honcho
peer = mcp_honcho_get_peer_context(target_peer_id="user")
# Check: Is this platform banned?
# Check: Any rules about this action?
# Check: Was this already done?
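The three checks in the pre-flight gate can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Honcho's API: the `PeerContext` shape, its field names, the sample data, and `preflight_check` are all hypothetical stand-ins for whatever `mcp_honcho_get_peer_context` returns in your setup.

```python
from dataclasses import dataclass, field


@dataclass
class PeerContext:
    """Hypothetical, simplified view of what a peer-context lookup returns."""
    banned_platforms: set[str] = field(default_factory=set)
    rules: dict[str, str] = field(default_factory=dict)  # action -> rule text
    completed_actions: set[str] = field(default_factory=set)  # "platform:action"


def preflight_check(peer: PeerContext, platform: str, action: str) -> tuple[bool, str]:
    """Return (allowed, reason) for an external action on a platform."""
    if platform in peer.banned_platforms:
        return False, f"platform '{platform}' is banned"
    if action in peer.rules:
        return False, f"rule applies: {peer.rules[action]}"
    if f"{platform}:{action}" in peer.completed_actions:
        return False, "already done"
    return True, "ok"


# Illustrative data only; real values would come from the peer context.
peer = PeerContext(
    banned_platforms={"example-forum"},
    rules={"cold_dm": "never send unsolicited direct messages"},
    completed_actions={"email:weekly_report"},
)

print(preflight_check(peer, "example-forum", "post"))
print(preflight_check(peer, "email", "weekly_report"))
print(preflight_check(peer, "email", "reply"))
```

Returning a reason alongside the boolean lets the agent explain why it skipped an action instead of failing silently.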
Layer 2: GBrain — Organizational Knowledge
What it solves: Where the files are, which code does what, and what was built when. This is organizational knowledge that persists across sessions.
Setup (10 minutes):
git clone https://github.com/garrytan/gbrain
cd gbrain
./setup.sh
# GBrain indexes your project files into pglite with 768d embeddings
What it stores:
Organization: CorpusIQ
├── Files: 729 indexed (code, docs, configs, scripts)
├── Embeddings: nomic-embed-text, 768-dimension
├── Categories: infrastructure, video, governance, email, social
└── Relationships: "heygen-video-generator.py" → depends on → "heygen.json"
Query pattern:
# "What videos did we generate this week?"
gbrain.search("video pipeline generated files MP4 week June 2026")
# "Where is the email template?"
gbrain.search("HTML email template operator time waste campaign")
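Embedding-based search of this kind typically ranks files by vector similarity to the query. The sketch below shows that idea with cosine similarity over toy 3-dimensional vectors. It is an assumption about the general technique, not GBrain's implementation: the `search` function, the vectors, and the `email-template.html` filename are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index: dict[str, list[float]], query_vec: list[float], k: int = 2) -> list[str]:
    """Return the k file paths whose embeddings are closest to the query."""
    ranked = sorted(index, key=lambda path: cosine(index[path], query_vec), reverse=True)
    return ranked[:k]


# Toy 3-dimensional embeddings; a real index would hold 768-dimension vectors.
index = {
    "heygen-video-generator.py": [0.9, 0.1, 0.0],
    "heygen.json": [0.8, 0.2, 0.1],
    "email-template.html": [0.0, 0.1, 0.9],
}

video_query = [1.0, 0.0, 0.0]  # stands in for an embedded "video pipeline" query
print(search(index, video_query))
```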
Dream Cycle (3 AM daily):
Nightly consolidation:
├── Merge duplicate observations
├── Strengthen frequently-accessed paths
├── Prune stale/irrelevant entries
└── Update peer cards with new conclusions
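The first three consolidation steps can be sketched as one pass over stored observations. This is a rough illustration, assuming a hypothetical `Observation` record, naive text-based duplicate detection, and a 30-day staleness cutoff; none of these come from GBrain or Honcho, and the sample observations are invented.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Observation:
    text: str
    last_accessed: date
    access_count: int = 1


def consolidate(observations: list[Observation], today: date,
                max_age_days: int = 30) -> list[Observation]:
    """Merge duplicates, drop stale entries, and rank by access count."""
    merged: dict[str, Observation] = {}
    for obs in observations:
        key = obs.text.strip().lower()  # naive duplicate detection
        if key in merged:
            kept = merged[key]
            kept.access_count += obs.access_count
            kept.last_accessed = max(kept.last_accessed, obs.last_accessed)
        else:
            merged[key] = Observation(obs.text.strip(), obs.last_accessed, obs.access_count)
    cutoff = today - timedelta(days=max_age_days)
    fresh = [o for o in merged.values() if o.last_accessed >= cutoff]
    # Most frequently accessed entries first.
    return sorted(fresh, key=lambda o: o.access_count, reverse=True)


notes = [
    Observation("HeyGen background URLs are broken", date(2026, 6, 10), 2),
    Observation("heygen background urls are broken ", date(2026, 6, 15), 3),
    Observation("Old staging server address", date(2026, 3, 1), 1),
    Observation("User prefers short emails", date(2026, 6, 12), 1),
]
result = consolidate(notes, today=date(2026, 6, 16))
for obs in result:
    print(obs.access_count, obs.text)
```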
Layer 3: memcore-cloud — Cross-Session Recall
What it solves: "What happened last session?" Full conversation recall with raw source tracking.
Setup (5 minutes):
pip install memcore-cloud
memcore-cloud init
# It runs as a background service
# Injects context into every Hermes turn automatically
Architecture:
Session N-1                   Session N
    │                             │
    ▼                             ▼
memcore-cloud ──► injects ──► Hermes turn
    │
    ├── Raw sources: What files were read, what URLs were fetched
    ├── FTS5 recall: Search past conversations instantly
    └── Context window: Relevant history injected automatically
What it enables:
| Without memcore-cloud | With memcore-cloud |
|---|---|
| "What did I do yesterday?" → Blank stare | "You generated a video, posted 3 tweets, replied to 2 emails" |
| Repeats the same research | "Already researched this — here's what we found" |
| Forgets tool quirks | "HeyGen background URLs are broken, use default" |
| Requires user to re-explain | Remembers preferences across sessions |
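Per-turn context injection can be pictured as "select relevant history, then prepend it to the message." The sketch below uses simple word overlap and a character budget. It is an assumption about the general pattern, not memcore-cloud's actual selection logic or API: `select_context`, `build_turn`, and the budget value are hypothetical.

```python
def select_context(history: list[str], user_message: str,
                   budget_chars: int = 200) -> list[str]:
    """Pick past-session notes that share words with the message, within a size budget."""
    query_words = set(user_message.lower().split())
    scored = []
    for note in history:
        overlap = len(query_words & set(note.lower().split()))
        if overlap:
            scored.append((overlap, note))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep history order

    selected, used = [], 0
    for _, note in scored:
        if used + len(note) > budget_chars:
            continue  # skip notes that would exceed the budget
        selected.append(note)
        used += len(note)
    return selected


def build_turn(history: list[str], user_message: str) -> str:
    """Prepend the selected history to the user's message."""
    context = select_context(history, user_message)
    if not context:
        return user_message
    lines = "\n".join(f"- {note}" for note in context)
    return f"Relevant history:\n{lines}\n\nUser: {user_message}"


history = [
    "HeyGen background URLs are broken, use default",
    "Posted 3 tweets",
    "Replied to 2 emails",
]
message = "Generate a HeyGen video with a custom background"
print(build_turn(history, message))
```

A production system would more likely rank by embeddings or FTS5 relevance than by raw word overlap; the budget check is the part that carries over.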
Layer 4: GraphRAG — Relationship Memory
What it solves: How entities and concepts connect — not just what documents say.
When to use:

- Complex codebases where files reference each other
- Business operations with multiple connected systems
- Multi-agent architectures with dependencies
Setup (30 minutes):
pip install graphrag
graphrag init --root ./graphrag-data
# Index your knowledge base
graphrag index --root ./graphrag-data
Query pattern:
"Which cron jobs depend on the Gmail API token?"
→ Returns: Email Monitor, Email Responder, Forward Handler, Job Reply Forwarder
→ With: Token location, refresh schedule, failure handling per cron
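The dependency question above is, at its core, a reverse traversal of a graph. The sketch below answers it over a hand-built adjacency dict. It does not use the GraphRAG library: the `dependents` function, the edges between jobs, and the `Video Pipeline` node are illustrative assumptions, with the job names borrowed from the example query.

```python
from collections import deque


def dependents(graph: dict[str, list[str]], target: str) -> list[str]:
    """Return every node that depends on target, directly or transitively."""
    # Invert the edges: resource -> things that depend on it.
    reverse: dict[str, list[str]] = {}
    for node, deps in graph.items():
        for dep in deps:
            reverse.setdefault(dep, []).append(node)

    found, queue, seen = [], deque([target]), {target}
    while queue:
        current = queue.popleft()
        for parent in reverse.get(current, []):
            if parent not in seen:
                seen.add(parent)
                found.append(parent)
                queue.append(parent)
    return sorted(found)


# node -> what it depends on (edges are illustrative)
graph = {
    "Email Monitor": ["Gmail API token"],
    "Email Responder": ["Gmail API token"],
    "Forward Handler": ["Email Monitor"],
    "Job Reply Forwarder": ["Forward Handler"],
    "Video Pipeline": ["heygen.json"],
}
print(dependents(graph, "Gmail API token"))
```

The transitive step is what a plain vector search misses: `Job Reply Forwarder` never mentions the token directly, yet it breaks when the token expires.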
Layer 5: Session DB — Hermes Built-In
What it is: A SQLite database at ~/.hermes/profiles/<profile>/state.db with FTS5 full-text search. It records every conversation, every tool call, and every result.
How to use:
# Search past sessions from CLI
hermes session search "HeyGen video failed"
# From within a session, use session_search tool
session_search(query="email campaign June")
What's stored:

- Every message (user + assistant)
- Every tool call + result
- Session metadata (model, profile, timestamps)
- Auto-summarized session handoffs
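To show what FTS5-backed search over stored messages looks like, here is a small in-memory sketch using Python's built-in `sqlite3` module. The `messages` table, its columns, and the sample rows are assumptions for illustration; the real `state.db` schema is not described here, and the example assumes your Python's SQLite build includes FTS5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one FTS5 table holding the role and text of each message.
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(role, content)")
conn.executemany(
    "INSERT INTO messages (role, content) VALUES (?, ?)",
    [
        ("user", "Draft the email campaign for June"),
        ("assistant", "HeyGen video failed: background URL rejected"),
        ("user", "Post the tweets"),
    ],
)


def session_search(query: str) -> list[str]:
    """Full-text search over stored messages, best match first."""
    rows = conn.execute(
        "SELECT content FROM messages WHERE messages MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]


print(session_search("heygen video failed"))
print(session_search("email campaign June"))
```

FTS5's default tokenizer is case-insensitive for ASCII text, so `heygen` matches `HeyGen`; multiple bare terms are combined with an implicit AND.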
The Complete Memory Flow
SESSION START
│
├── 1. Honcho: Load peer context
│ └── Who is the user? What are the rules? What platforms are banned?
│
├── 2. GBrain: Query organizational knowledge
│ └── Where are the files? What was built? What's the architecture?
│
├── 3. memcore-cloud: Inject cross-session context
│ └── What happened last session? What was decided? What's pending?
│
├── 4. Session DB: Load today's conversation
│ └── What have we already discussed this session?
│
▼
AGENT OPERATES WITH FULL CONTEXT
│
▼
DURING SESSION
│
├── Every action: Log to Session DB (automatic)
├── New observations: Write to Honcho
├── New files: GBrain picks up on next index
└── memcore-cloud: Continuously updating context window
│
▼
SESSION END (or every 30 min)
│
├── 1. Honcho: Queue dream for consolidation
├── 2. GBrain: Re-index changed files
├── 3. memcore-cloud: Save raw sources
└── 4. Session DB: Auto-compact if needed
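The session-start portion of this flow can be sketched as an ordered list of loaders whose results are joined into one context block. This is a hedged illustration only: `build_session_context`, the stub loaders, and their return strings are hypothetical and stand in for the real Honcho, GBrain, memcore-cloud, and Session DB integrations.

```python
from typing import Callable

Loader = Callable[[], str]


def build_session_context(loaders: list[tuple[str, Loader]]) -> str:
    """Call each memory layer in order and join the results into one context block.

    A layer that raises is recorded as unavailable, so one outage
    does not block the whole session.
    """
    sections = []
    for name, load in loaders:
        try:
            text = load()
        except Exception as exc:  # degrade gracefully per layer
            text = f"(unavailable: {exc})"
        sections.append(f"## {name}\n{text}")
    return "\n\n".join(sections)


def broken_loader() -> str:
    """Simulates a layer whose backing service is down."""
    raise TimeoutError("service not reachable")


# Stub loaders standing in for the real integrations.
loaders = [
    ("Honcho", lambda: "Peer: user; banned platforms: none recorded"),
    ("GBrain", lambda: "729 indexed files"),
    ("memcore-cloud", broken_loader),
    ("Session DB", lambda: "No messages yet this session"),
]
context = build_session_context(loaders)
print(context)
```

Keeping the layers as an ordered list makes the load order explicit and lets you drop or reorder a layer without touching the orchestration code.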
Why Not Just One Memory System?
| Approach | Good For | Fails At |
|---|---|---|
| Single Vector DB | Semantic document search | Peer identity, conversation history, cross-session context |
| Honcho only | Peer modeling, conversations | Code/document indexing, file relationships |
| GBrain only | File/code knowledge | "Who is the user?" — no peer modeling |
| memcore-cloud only | Cross-session recall | Organizational knowledge indexing |
| All three | Complete agent memory | Nothing — this is the production stack |
The cost of NOT running all three: Your agent forgets who it's talking to, can't find its own files, and repeats the same research every session.
Production Numbers
From our deployment (as of June 16, 2026):
| System | Data Volume | Query Speed |
|---|---|---|
| Honcho | 200+ peer conclusions, 50+ sessions | <100ms semantic search |
| GBrain | 729 indexed files, 768d vectors | <200ms search |
| memcore-cloud | Cross-session context auto-injected | Per-turn, transparent |
| Session DB | 50K+ messages across sessions | FTS5 instant |