Architecture
Overview
xbrain runs as 30+ Docker containers on a single VM, connected via a private Docker network
(xbrain_net). All external traffic enters through nginx on ports 80 and 443. Internal
services never expose ports to the internet.
Since Phase 11 (Brain Monitor + Superadmin Dashboard, 2026-05-17) the stack also includes
brain-janitor — a daily cron container that hard-purges soft-deleted entities
from Postgres, Qdrant, and Neo4j after the 30-day retention window. Phase 12 introduced the
GitHub App auth model (multi-callback + refresh tokens + installation
webhooks), replacing the legacy OAuth App + long-lived GITHUB_API_PAT.
The central invariant is memory-api: every data point from every frontend (LibreChat, Open WebUI, Chrome extension, agents) passes through memory-api before persisting. memory-api enforces the 7-field tagging contract and routes data to the right stores: PostgreSQL (event store), Qdrant (vector search), and Neo4j (knowledge graph).
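As a rough illustration of the contract check, the sketch below rejects a payload when any mandatory tag is missing. The names in REQUIRED_TAGS are placeholders: only team_scope, truth_level, and source are mentioned on this page, and the real contract has 7 fields that this section does not enumerate. The function name and the (status, body) return shape are likewise assumptions, not memory-api's actual code.

```python
from typing import Any

# Placeholder subset: the real contract has 7 mandatory fields, which this
# page does not enumerate. Only these three names appear in the text.
REQUIRED_TAGS = ("team_scope", "truth_level", "source")


def check_tags(
    payload: dict[str, Any],
    required: tuple[str, ...] = REQUIRED_TAGS,
) -> tuple[int, dict[str, Any]]:
    """Return (http_status, body): 422 if a mandatory tag is missing or empty."""
    missing = [name for name in required if payload.get(name) in (None, "")]
    if missing:
        return 422, {"detail": f"missing mandatory tags: {', '.join(missing)}"}
    return 200, {"detail": "ok"}


if __name__ == "__main__":
    # "source" is absent, so this payload is rejected.
    print(check_tags({"team_scope": "core", "truth_level": "WORKING"}))
```

Rejecting at the API boundary means no store ever sees an untagged item, which is what lets Qdrant and Neo4j queries rely on the tags being present.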
Architecture Diagram
             ┌────────────────┐
Browser ────►│ nginx (80/443) │
             └───────┬────────┘
                     │
      ┌──────────────┼───────────────┐
      │              │               │
┌─────▼─────┐  ┌─────▼──────┐  ┌─────▼──────┐
│ LibreChat │  │ Open WebUI │  │ memory-api │
└─────┬─────┘  └─────┬──────┘  └─────┬──────┘
      │              │               │
┌─────▼──────────────▼──────┐  ┌─────▼──────┐
│ librechat-bridge          │  │ PostgreSQL │
│ openwebui-pipeline        │  │ Qdrant     │
└───────────────────────────┘  │ Neo4j      │
                               └────────────┘

┌───────────────┐  ┌────────────────┐
│ agent-runtime │  │ MCP Gateway    │
│ (LangGraph)   │  │ mcp-scraper    │
└───────┬───────┘  │ mcp-drive-read │
        │          │ mcp-calendar   │
┌───────▼───────┐  │ mcp-deck       │
│ graphiti-svc  │  └────────────────┘
└───────────────┘

┌───────────────┐  ┌──────────────────────────┐
│ drive-sync    │  │ Langfuse (observability) │
└───────────────┘  │ ClickHouse / Redis       │
                   │ MinIO (Chainguard)       │
                   └──────────────────────────┘

All containers: xbrain_net (Docker bridge network)
Container Reference
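The 25 services introduced in Phases 1–5, plus brain-janitor from Phase 11. The Phase column indicates when each service is first introduced; services from earlier phases remain active in later phases. Note that projects-dashboard is a static site on Firebase Hosting rather than a container on the VM.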
| Container | Image | Phase | Purpose | RAM (idle) |
|---|---|---|---|---|
| nginx | nginx:1.27-alpine | 1 | Reverse proxy, TLS termination, routing to all services | 64 MB |
| postgres | postgres:17 | 1 | Primary relational DB: event store, audit logs, user/team data | 256 MB |
| qdrant | qdrant/qdrant:v1.17.1 | 1 | Vector store: semantic search, scoped by team_scope + truth_level | 300 MB |
| memory-api | xbrain/memory-api | 1 | Core API: tagging contract enforcer, routes to PostgreSQL + Qdrant + Neo4j | 384 MB |
| librechat-mongo | mongo:7 | 1 | LibreChat conversation and message storage (MongoDB) | 300 MB |
| librechat-meili | getmeili/meilisearch:v1.10 | 1 | LibreChat full-text search index | 192 MB |
| librechat | librechat/librechat:v0.8.2-rc2 | 1 | Primary chat frontend: Claude, GPT, Grok support, MCP-ready | 400 MB |
| open-webui | ghcr.io/open-webui/open-webui:v0.9.0 | 1 | Secondary chat + admin UI, RAG testing, agent experimentation | 600 MB |
| librechat-bridge | xbrain/librechat-bridge | 1 | Listens for LibreChat messages, forwards to memory-api with JWT auth | 128 MB |
| openwebui-pipeline | xbrain/openwebui-pipeline | 1 | Open WebUI pipeline plugin: intercepts messages, sends to memory-api | 192 MB |
| agent-runtime | xbrain/agent-runtime | 2 | LangGraph agent executor with human-in-the-loop (HITL) support | 384 MB |
| langfuse | langfuse/langfuse:3 | 2 | LLM observability web UI: traces, scores, evaluations | 512 MB |
| langfuse-worker | langfuse/langfuse-worker:3 | 2 | Langfuse background processing: async ingestion, exports | 512 MB |
| clickhouse | clickhouse/clickhouse-server | 2 | OLAP backend for Langfuse trace storage and analytics | 1.5 GB |
| redis | redis:7 | 2 | Langfuse cache, job queues, rate limiting | 50 MB |
| minio | cgr.dev/chainguard/minio | 2 | S3-compatible object storage: PDFs, images, Langfuse blobs | 256 MB |
| neo4j | neo4j:2026.04.0-community | 3 | Knowledge graph: entity nodes, MENTIONS edges, relationship lineage | 1 GB |
| mcp-gateway | xbrain/mcp-gateway | 3 | MCP tool registry and proxy: routes LibreChat MCP calls to sidecars | 256 MB |
| mcp-scraper | xbrain/mcp-scraper | 3 | Web scraping MCP tool (port 8100): Playwright-based content extraction | 128 MB |
| mcp-drive-read | xbrain/mcp-drive-read | 3 | Google Drive read/write MCP tool (port 8101): file listing, content fetch | 128 MB |
| mcp-calendar | xbrain/mcp-calendar | 3 | Google Calendar read MCP tool (port 8102): event listing, availability | 128 MB |
| drive-sync | xbrain/drive-sync | 3 | Incremental Google Drive sync + Push Notifications (webhooks) | 128 MB |
| mcp-deck | xbrain/mcp-deck | 4 | PPTX generation MCP tool: creates slide decks from structured data | 256 MB |
| graphiti-service | xbrain/graphiti-service | 5 | Temporal fact extraction (port 8300): enriches Neo4j graph, fail-soft | 512 MB |
| projects-dashboard | Firebase Hosting (static) | 5 | Team project dashboard: React SPA, deployed on Firebase Hosting | N/A |
| brain-janitor | xbrain/brain-janitor:phase11 | 11 | Daily cron (03:00 UTC): hard-purges deleted_at < NOW() - 30 days rows from Postgres, Qdrant, and Neo4j. APScheduler-based, sentinel file at /tmp/brain-janitor-alive. | 128 MB |
Phase 11 — Brain Monitor + Superadmin Dashboard
Phase 11 (shipped 2026-05-17) extends the universal truth-level + soft-delete contract from
memory_items to every brainable entity: conversations,
messages, team_messages, tasks, contacts,
plus Granola transcripts. Three columns (truth_level, deleted_at,
deleted_by) are now mandatory on every entity table.
- Migration 0017_brain_monitor_base patches the 5 missing tables with the columns and per-type backfill defaults (e.g. task→WORKING, contact→VALIDATED).
- Migration 0018_brain_events_view creates the v_brain_events view, a UNION ALL across the 7 entity types that powers the universal feed at GET /v1/brain/events.
- A new brain-janitor container runs the 30-day hard-purge cron (Postgres first, then Qdrant points, then Neo4j nodes; idempotent).
- Superadmin endpoints (/v1/admin/brain/*) gate behind ADMIN_USER_SUBS with synchronous audit_log writes before any data read. See the Brain Monitor guide.
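The purge order used by brain-janitor can be sketched with in-memory stand-ins. This is a minimal illustration, not the container's code: the three dicts stand in for Postgres, Qdrant, and Neo4j, and the function names are hypothetical. It only shows the cutoff rule (deleted_at older than 30 days) and the store order, and that a second run finds nothing left to purge.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # retention window stated in the text


def expired_ids(rows: dict[str, dict], now: datetime) -> list[str]:
    """IDs of rows whose deleted_at is older than now - RETENTION."""
    cutoff = now - RETENTION
    return sorted(
        row_id
        for row_id, row in rows.items()
        if row.get("deleted_at") is not None and row["deleted_at"] < cutoff
    )


def purge(postgres: dict, qdrant: dict, neo4j: dict, now: datetime) -> list[str]:
    """Hard-purge expired entities: Postgres first, then Qdrant, then Neo4j."""
    ids = expired_ids(postgres, now)
    for store in (postgres, qdrant, neo4j):  # each store is finished before the next
        for entity_id in ids:
            store.pop(entity_id, None)  # a missing key is not an error
    return ids


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    pg = {"a": {"deleted_at": now - timedelta(days=31)}, "b": {"deleted_at": None}}
    print(purge(pg, {"a": "vector"}, {"a": "node"}, now))
```

Because deletes tolerate missing keys, running the job twice in a row is harmless. How the real job recovers from a crash between stores is not described here and is outside this sketch.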
Phase 12 — GitHub App migration
Phase 12 (shipped 2026-05-17) retired the legacy OAuth App + long-lived
GITHUB_API_PAT in favour of a GitHub App. The new model unlocks multi-callback
(web + Chrome extension), short-lived installation tokens for org-membership lookups, and
user-token refresh (~6 month rolling sessions).
- Server-side: PyJWT[crypto]>=2.10 in memory-api for RS256 signing of the App JWT. Three token types: App JWT (10 min), installation token ghs_ (1h, cached in-process 55 min), user-to-server ghu_ (8h) + ghr_ refresh (6 months).
- Migration 0019_github_app_install adds the installations table and the encrypted token columns on users: github_access_token_enc, github_refresh_token_enc, an HMAC github_access_token_hash for O(log n) Bearer lookup.
- Webhook handler POST /v1/webhooks/github/installation verifies HMAC-SHA256 and upserts on installation.created/deleted/suspend/unsuspend.
- The Chrome extension and the web app share the same GitHub App client ID (Iv23liVnZvIN0Lo6isof) because GitHub Apps support multiple callback URLs natively. See the GitHub Auth guide.
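The HMAC-SHA256 check on the installation webhook can be sketched with the standard library alone. GitHub sends the signature in the X-Hub-Signature-256 header as sha256= followed by a hex digest of the raw request body. The function name below is hypothetical and the way memory-api wires this into its handler is not shown on this page.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check an X-Hub-Signature-256 header value against the raw request body."""
    prefix = "sha256="
    if not signature_header.startswith(prefix):
        return False
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    received = signature_header[len(prefix):]
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected.encode(), received.encode())


if __name__ == "__main__":
    secret = b"example-webhook-secret"  # placeholder, not a real secret
    body = b'{"action": "created"}'
    header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    print(verify_github_signature(secret, body, header))
```

The digest must be computed over the bytes exactly as received, before any JSON parsing, since re-serialising the payload can change whitespace and invalidate the signature.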
Authentication recap (post-Phase 12)
Sign-in goes through the xbrain GitHub App (Phase 12). Tokens are
stored Fernet-encrypted on users with HMAC-indexed lookup. Per-org installs
unlock the auto-grant team-membership flow from Phase 10. Google OAuth and the LibreChat
OAuth App (xbrain LibreChat) remain available as separate, scoped legacy
paths; the GITHUB_API_PAT env var is removed.
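One reason to keep a separate HMAC column is that Fernet ciphertexts include a random IV, so the same token encrypts differently each time and cannot be found by an equality lookup. A keyed hash is deterministic and can be indexed. The sketch below assumes HMAC-SHA256 and uses a dict as a stand-in for the users table; the key, names, and user id are illustrative only.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical server-side key; a real deployment would load it from a secret store.
INDEX_KEY = b"example-index-key"


def token_hash(token: str, key: bytes = INDEX_KEY) -> str:
    """Deterministic keyed hash of a bearer token, suitable for an indexed column."""
    return hmac.new(key, token.encode("utf-8"), hashlib.sha256).hexdigest()


# Stand-in for the users table, keyed by github_access_token_hash.
users_by_hash: dict[str, str] = {token_hash("ghu_example"): "user-42"}


def user_for_bearer(token: str) -> Optional[str]:
    """Resolve a bearer token to a user id without decrypting any stored token."""
    return users_by_hash.get(token_hash(token))


if __name__ == "__main__":
    print(user_for_bearer("ghu_example"))
    print(user_for_bearer("ghu_unknown"))
```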
Network Architecture
All containers connect on xbrain_net, a Docker bridge network defined in
docker-compose.yml. Container-to-container communication uses service names
as hostnames (e.g., http://memory-api:8000).
Exposed ports
| Port | Service | Accessible from |
|---|---|---|
| 80 | nginx | Internet (HTTP → redirects to 443) |
| 443 | nginx (with Cloudflare) | Internet (HTTPS) |
Internal-only ports
| Port | Service | Notes |
|---|---|---|
| 8000 | memory-api | FastAPI — all frontends and agents write here |
| 9100 | agent-runtime | LangGraph HTTP API |
| 8080 | mcp-gateway | MCP tool registry + proxy |
| 8100 | mcp-scraper | Web scraping MCP sidecar |
| 8101 | mcp-drive-read | Google Drive MCP sidecar |
| 8102 | mcp-calendar | Google Calendar MCP sidecar |
| 8300 | graphiti-service | Temporal fact extraction service |
| 6333 | qdrant | Qdrant HTTP API (gRPC listens on 6334 by default) |
| 7687 | neo4j | Neo4j Bolt protocol |
| 5432 | postgres | PostgreSQL |
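Since service names double as hostnames on xbrain_net, an internal URL is just the name plus the port from the table. The helper below is a hypothetical convenience, not part of xbrain. It covers only the HTTP services and assumes plain http inside the network, which the text confirms for memory-api but not explicitly for the others.

```python
# Internal ports from the table above, keyed by Docker service name.
INTERNAL_PORTS = {
    "memory-api": 8000,
    "agent-runtime": 9100,
    "mcp-gateway": 8080,
    "mcp-scraper": 8100,
    "mcp-drive-read": 8101,
    "mcp-calendar": 8102,
    "graphiti-service": 8300,
    "qdrant": 6333,
}


def service_url(name: str, path: str = "") -> str:
    """Build an in-network URL; raises KeyError for an unknown service."""
    base = f"http://{name}:{INTERNAL_PORTS[name]}"
    return f"{base}/{path.lstrip('/')}" if path else base


if __name__ == "__main__":
    print(service_url("memory-api", "/v1/messages"))
```

These URLs resolve only from containers attached to xbrain_net; from the host or the internet, traffic still has to go through nginx.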
No host networking
No container uses network_mode: host. All inter-service communication
stays within xbrain_net. Only nginx is bound to host ports 80 and 443.
Data Flow — Chat Message
Here is what happens from the moment a user sends a message in LibreChat to the moment it becomes a searchable memory item:
- User sends a message in LibreChat.
- LibreChat saves the message to MongoDB (librechat-mongo), its own conversation history.
- librechat-bridge detects the new message via the LibreChat webhook/event.
- bridge sends POST /v1/messages to memory-api with a Bearer JWT.
- memory-api enforces the tagging contract: all 7 mandatory fields must be present or the request returns 422.
- Message is stored in PostgreSQL (event store + audit log) and Qdrant (vector embedding for semantic search).
- If the message payload includes an entities field, the Neo4j outbox is updated by a background worker.
- graphiti-service enriches the knowledge graph with temporal facts. This call is fail-soft and async (a graphiti-service outage does not block the memory write).
- Audit log entry written to PostgreSQL with timestamp, source, and team_scope.
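The write path and its fail-soft enrichment step can be sketched as follows. Lists stand in for PostgreSQL, Qdrant, and the Neo4j outbox, and enrich is a hypothetical callable standing in for the graphiti-service call. The real call is asynchronous; the sketch runs it inline only to show that an enrichment failure never undoes the write.

```python
import logging
from typing import Any, Callable

log = logging.getLogger("memory-api-sketch")


def persist_message(
    message: dict[str, Any],
    event_store: list[dict],
    vector_store: list[dict],
    graph_outbox: list[dict],
    enrich: Callable[[dict], None],
) -> bool:
    """Write to the primary stores, then attempt enrichment without failing the write.

    Returns True if enrichment succeeded, False if it was skipped over an error.
    """
    event_store.append(message)   # PostgreSQL stand-in (event store + audit log)
    vector_store.append(message)  # Qdrant stand-in (embedding step omitted)
    if message.get("entities"):   # queue graph work only when entities are present
        graph_outbox.append(message)
    try:
        enrich(message)           # graphiti-service stand-in
    except Exception:             # fail-soft: log and keep the stored message
        log.warning("enrichment failed; memory write kept", exc_info=True)
        return False
    return True


if __name__ == "__main__":
    def unavailable(_: dict) -> None:
        raise ConnectionError("graphiti-service is down")

    events, vectors, outbox = [], [], []
    ok = persist_message({"text": "hi", "entities": ["xbrain"]},
                         events, vectors, outbox, unavailable)
    print(ok, len(events), len(vectors), len(outbox))
```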
Open WebUI follows the same flow
Messages from Open WebUI pass through openwebui-pipeline instead of librechat-bridge, but arrive at the same memory-api endpoint with the same tagging contract.
VM Sizing
xbrain's container count grows with each phase. Plan your VM size before deploying a new phase — attempting Phase 2 on a 4 GB VM will cause OOM kills.
| Phase | VM | RAM | Cost / mo | Notes |
|---|---|---|---|---|
| Phase 1 | e2-medium | 4 GB | ~25€ | ~2.2 GB used. Monitor OOM. Do not add Phase 2 services without upgrade. |
| Phase 2 | e2-standard-2 | 8 GB | ~49€ | Upgrade before adding mem0 + LangGraph + ClickHouse. |
| Phase 3+ | e2-standard-4 | 16 GB | ~98€ | Or split: xbrain on e2-standard-2 + Langfuse on e2-small (~62€/mo total). |
See the Deployment guide for detailed instructions on how to resize a GCP VM without data loss.
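To sanity-check a VM size, the idle figures from the Container Reference table can be summed per phase. The sketch below copies those figures by hand, assumes 1 GB = 1024 MB, and leaves out projects-dashboard (listed as N/A). Idle numbers are approximate and exclude load, so the totals are a floor, not a forecast of actual usage.

```python
# Idle RAM in MB per phase, copied from the Container Reference table.
# Assumes 1 GB = 1024 MB; projects-dashboard (N/A) is excluded.
IDLE_MB_BY_PHASE = {
    1: [64, 256, 300, 384, 300, 192, 400, 600, 128, 192],
    2: [384, 512, 512, 1536, 50, 256],
    3: [1024, 256, 128, 128, 128, 128],
    4: [256],
    5: [512],
    11: [128],
}


def cumulative_idle_mb(up_to_phase: int) -> int:
    """Total idle RAM of every container introduced up to and including a phase."""
    return sum(
        sum(sizes)
        for phase, sizes in IDLE_MB_BY_PHASE.items()
        if phase <= up_to_phase
    )


if __name__ == "__main__":
    for phase in sorted(IDLE_MB_BY_PHASE):
        total = cumulative_idle_mb(phase)
        print(f"through phase {phase}: {total} MB ({total / 1024:.1f} GB)")
```

With these inputs the idle total through Phase 3 is 7858 MB, about 7.7 GB, which already sits at the edge of an 8 GB VM before any request load.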