Google Drive Sync
Overview
xbrain continuously synchronizes Google Drive folders to your team's memory. When a file is added
or updated in a mapped folder, it's automatically processed and indexed in xbrain with full
tagging — team_scope, project_scope, truth_level=WORKING.
The drive-sync service handles the sync loop. It runs in its own container and
communicates with memory-api to upsert content. Each mapped folder gets a unique
mapping_id (UUID) that ties the Drive credentials to the correct team and project scope.
How It Works
Mode 1 — Push Webhooks (Primary)
When a Drive file changes, Google sends a push notification to xbrain within seconds. This is the default for mapped folders and requires no polling overhead.
Drive file updated
│
▼
Google → POST /v1/drive/webhook (xbrain)
│
▼
drive-sync processes the changed file
│
▼
memory-api upsert (truth_level=WORKING)
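As a rough illustration of the first step in this flow, the sketch below decides whether an incoming notification should trigger processing. It assumes the handler receives the request headers as a mapping; `should_process` is a hypothetical name, not part of drive-sync. Google sends a one-off notification with `X-Goog-Resource-State: sync` when a channel is created, which carries no file change.

```python
from typing import Mapping


def should_process(headers: Mapping[str, str]) -> bool:
    """Return True when a Drive push notification signals a real change.

    Hypothetical helper: the real drive-sync handler may differ.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    state = normalized.get("x-goog-resource-state", "")
    # The initial "sync" message only confirms that the channel exists;
    # it should be acknowledged but not treated as a file change.
    return state not in ("", "sync")
```

A handler built this way would still answer every notification with a success status and only hand real changes to the processing step.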
Mode 2 — Polling Fallback (5-minute intervals)
If webhooks are unavailable (e.g. the VM is temporarily unreachable by Google), drive-sync
falls back to polling Drive every 5 minutes for changes. Only modified files are re-processed
— the sync is incremental and uses Drive's pageToken mechanism to track position.
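The incremental polling described above can be sketched as follows. This is a minimal illustration, not the drive-sync implementation: `poll_once` and `fetch_page` are hypothetical names, and `fetch_page` stands in for a call to the Drive changes listing, which returns a next-page token while more pages remain and a new start token on the last page.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One page of results: (changed file IDs, token for the next page or None,
# token to start from on the next poll or None).
Page = Tuple[List[str], Optional[str], Optional[str]]


def poll_once(fetch_page: Callable[[str], Page], saved_token: str) -> Tuple[List[str], str]:
    """Collect every change since saved_token and return the token to save."""
    changed: List[str] = []
    token = saved_token
    while True:
        file_ids, next_page, new_start = fetch_page(token)
        changed.extend(file_ids)
        if next_page is not None:
            # More pages remain for this poll; keep walking.
            token = next_page
            continue
        # Last page: the new start token marks where the next poll resumes.
        # If it is missing, reuse the current token so no change is lost.
        return changed, new_start if new_start is not None else token


# Usage with an in-memory stand-in for the Drive API (hypothetical data).
fake_pages: Dict[str, Page] = {
    "t0": (["fileA"], "t1", None),
    "t1": (["fileB"], None, "t2"),
}
changed, token = poll_once(fake_pages.__getitem__, "t0")
# changed == ["fileA", "fileB"] and token == "t2"
```

Persisting the returned token between polls is what keeps each 5-minute cycle limited to files modified since the previous one.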
Setting Up Drive Sync
Step 1 — OAuth authorization (admin only)
An admin must authorize xbrain to access the team's Google Drive. The authorization endpoint redirects to the Google OAuth consent screen; after consent, the credentials are stored in PostgreSQL, encrypted with Fernet.
# Admin initiates OAuth flow
GET https://api.grooveos.app/v1/admin/drive/auth
# → Redirects to Google OAuth consent screen
# → After consent, credentials stored with Fernet encryption in PostgreSQL
Step 2 — Map a folder
Once authorized, map a specific Drive folder to a team and project scope. The
folder_id is the last path segment of the folder's Drive URL (the part after /folders/).
curl -X POST https://api.grooveos.app/v1/admin/drive/mappings \
  -H "Authorization: Bearer $ADMIN_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "team_scope": "excalibur",
    "project_scope": "fundraising",
    "folder_id": "1ABC...xyz",
    "folder_name": "Fundraising Docs"
  }'
# Returns: {"mapping_id": "uuid-...", "status": "active"}
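If you prefer not to copy the folder_id by hand, a small helper can pull it out of a folder URL. This is a sketch under the assumption that the URL follows the usual Drive folder layout, with the ID as the final path segment; `folder_id_from_url` is a hypothetical name and the example URL uses a placeholder ID.

```python
from urllib.parse import urlparse


def folder_id_from_url(url: str) -> str:
    """Return the last path segment of a Drive folder URL."""
    # urlparse separates the path from query parameters such as ?usp=sharing.
    path = urlparse(url).path.rstrip("/")
    folder_id = path.rsplit("/", 1)[-1]
    if not folder_id:
        raise ValueError(f"no folder ID found in {url!r}")
    return folder_id


# Placeholder ID for illustration only.
example = folder_id_from_url("https://drive.google.com/drive/folders/EXAMPLE_ID?usp=sharing")
# example == "EXAMPLE_ID"
```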
Step 3 — Webhook auto-registered
After mapping, drive-sync automatically registers a Drive push notification channel pointing
to POST /v1/drive/webhook. Webhook channels expire after 7 days by default —
drive-sync renews them automatically before expiry using a background scheduler.
Info
The webhook channel renewal is handled automatically by the drive-sync scheduler. No manual intervention is required after initial mapping.
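To make the renewal step concrete, here is a minimal sketch of how a scheduler might pick the channels to renew. The `Channel` record, the function name, and the one-day safety margin are all assumptions for illustration; the source does not state how far ahead of expiry drive-sync renews.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


@dataclass
class Channel:
    mapping_id: str
    expires_at: datetime  # timezone-aware expiry recorded at registration


def channels_due_for_renewal(
    channels: Iterable[Channel],
    now: datetime,
    margin: timedelta = timedelta(days=1),  # assumed margin, not from the source
) -> List[Channel]:
    """Return channels that expire within `margin` of `now` or already expired."""
    return [c for c in channels if c.expires_at - now <= margin]


# Usage with hypothetical channels.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
fresh = Channel("mapping-a", now + timedelta(days=6))
stale = Channel("mapping-b", now + timedelta(hours=12))
due = channels_due_for_renewal([fresh, stale], now)
# due == [stale]
```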
Multi-Folder Mapping
Multiple Drive folders can be mapped to the same team, each with a distinct
project_scope. This lets you scope Drive content to the relevant project without
creating separate team accounts.
# Map fundraising folder
POST /v1/admin/drive/mappings
{"team_scope": "excalibur", "project_scope": "fundraising", "folder_id": "1ABC..."}
# Map engineering folder (same team, different project scope)
POST /v1/admin/drive/mappings
{"team_scope": "excalibur", "project_scope": "engineering", "folder_id": "2DEF..."}
Files in the fundraising folder get project_scope='fundraising', and engineering
files get project_scope='engineering'. They are indexed and searchable
independently within xbrain memory.
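The scoping rule above amounts to a lookup from a file's mapped folder to its tags. The sketch below illustrates that idea only; `FolderMapping`, `tags_for_file`, and the folder IDs are hypothetical, and the real service keys its mappings by mapping_id in PostgreSQL rather than in an in-memory dict.

```python
from typing import Dict, NamedTuple, Optional


class FolderMapping(NamedTuple):
    team_scope: str
    project_scope: str


def tags_for_file(
    parent_folder_id: str, mappings: Dict[str, FolderMapping]
) -> Optional[Dict[str, str]]:
    """Return the tags for a file in a mapped folder, or None if unmapped."""
    mapping = mappings.get(parent_folder_id)
    if mapping is None:
        return None
    return {
        "team_scope": mapping.team_scope,
        "project_scope": mapping.project_scope,
        "truth_level": "WORKING",  # Drive content is indexed as WORKING
    }


# Placeholder folder IDs for illustration only.
mappings = {
    "folder-fundraising": FolderMapping("excalibur", "fundraising"),
    "folder-engineering": FolderMapping("excalibur", "engineering"),
}
tags = tags_for_file("folder-engineering", mappings)
# tags["project_scope"] == "engineering"
```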
List Current Mappings
curl https://api.grooveos.app/v1/admin/drive/mappings \
-H "Authorization: Bearer $ADMIN_JWT"
# Returns a list of all active folder mappings for the team
Sync Status
Check what has been synced by querying memory-api with the drive-sync source filter:
curl "https://api.grooveos.app/v1/memory/search?q=drive+sync&source=drive-sync" \
-H "Authorization: Bearer $JWT" \
-H "X-Team-Scope: excalibur"
Security
Drive credentials are stored encrypted using Fernet encryption. Never commit OAuth tokens,
service account keys, or the FERNET_KEY value to the repository. These must
be injected via environment variables or a secrets manager.
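One way to follow this guidance is to read the key from the environment at startup and fail fast if it is missing or malformed. The sketch below assumes the variable is named FERNET_KEY as above; `load_fernet_key` is a hypothetical helper. It relies only on the fact that a Fernet key is 32 bytes encoded as URL-safe base64, and it returns the key without ever logging it.

```python
import base64
import binascii
import os


def load_fernet_key(env_var: str = "FERNET_KEY") -> bytes:
    """Read and sanity-check a Fernet key from the environment."""
    raw = os.environ.get(env_var)
    if not raw:
        raise RuntimeError(f"{env_var} is not set; inject it via the environment")
    try:
        decoded = base64.urlsafe_b64decode(raw)
    except (binascii.Error, ValueError) as exc:
        raise RuntimeError(f"{env_var} is not valid URL-safe base64") from exc
    # A Fernet key decodes to exactly 32 bytes.
    if len(decoded) != 32:
        raise RuntimeError(f"{env_var} must decode to 32 bytes")
    return raw.encode("ascii")
```

The returned bytes could then be passed to whichever Fernet implementation the service uses.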
Supported File Types
drive-sync extracts text from supported formats and passes it to memory-api. Unsupported formats are logged and skipped — they do not cause sync failures.
| File Type | Processing | Notes |
|---|---|---|
| Google Docs | Text extraction | Exported as plain text via Drive export API |
| Google Sheets | Text extraction | First sheet only; cell values joined as text |
| PDF | Text extraction | Via PyMuPDF; scanned PDFs without OCR may be empty |
| Markdown (.md) | Direct | No conversion needed; indexed as-is |
| Text (.txt) | Direct | No conversion needed; indexed as-is |
| Images (.png, .jpg, etc.) | Skipped | Not yet supported — logged but not indexed |
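The table can be read as a dispatch on file type. The sketch below expresses it as a lookup keyed by MIME type, with anything unlisted logged and skipped rather than raising. The function name, the mode labels, and the choice to key on MIME type are assumptions; drive-sync may key on extensions or other metadata instead.

```python
import logging

logger = logging.getLogger("drive-sync")

# Processing mode per MIME type, mirroring the table above.
PROCESSING_BY_MIME = {
    "application/vnd.google-apps.document": "text_extraction",     # Google Docs
    "application/vnd.google-apps.spreadsheet": "text_extraction",  # Google Sheets
    "application/pdf": "text_extraction",                          # PDF
    "text/markdown": "direct",                                     # Markdown
    "text/plain": "direct",                                        # Text
}


def processing_for(mime_type: str) -> str:
    """Return the processing mode, or "skipped" for unsupported types."""
    mode = PROCESSING_BY_MIME.get(mime_type)
    if mode is None:
        # Unsupported formats are logged and skipped, never raised as errors.
        logger.info("skipping unsupported file type: %s", mime_type)
        return "skipped"
    return mode
```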
Architecture Notes
The OAuth state parameter used during the Drive authorization flow is set to the
mapping_id UUID — not the team_scope. This ensures the callback
correctly resolves which mapping to associate credentials with, even when multiple mappings
are in progress simultaneously.
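A minimal sketch of this round trip: the mapping_id goes out as the `state` parameter and is read back, and validated as a UUID, when Google redirects to the callback. The function names, the redirect URI, and the read-only Drive scope are assumptions for illustration; the source does not say which scope xbrain requests.

```python
import uuid
from urllib.parse import parse_qs, urlencode, urlparse

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_auth_url(client_id: str, redirect_uri: str, mapping_id: str) -> str:
    """Build a Google OAuth consent URL carrying mapping_id as state."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        # Assumed scope; the actual scope used by xbrain is not documented here.
        "scope": "https://www.googleapis.com/auth/drive.readonly",
        "access_type": "offline",
        "state": mapping_id,
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"


def mapping_id_from_callback(callback_url: str) -> str:
    """Extract and validate the mapping_id from the OAuth callback URL."""
    values = parse_qs(urlparse(callback_url).query).get("state", [])
    if not values:
        raise ValueError("callback is missing the state parameter")
    # uuid.UUID raises ValueError for anything that is not a well-formed UUID.
    return str(uuid.UUID(values[0]))
```

Because each in-flight authorization carries its own mapping_id, two admins authorizing different folders at the same time cannot have their callbacks confused.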
The push webhook endpoint — POST /v1/drive/webhook — is public (no auth
header required) because Google Drive sends notifications without Bearer tokens. Request
authenticity is verified using the channel ID and resource ID returned during webhook
registration.
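The check described above might look like the following sketch. Google includes the `X-Goog-Channel-ID` and `X-Goog-Resource-ID` headers on each notification; the code compares them with the pair saved at registration. The function name and the in-memory `registered` mapping are hypothetical stand-ins for however drive-sync stores its channels.

```python
import hmac
from typing import Mapping


def is_authentic(headers: Mapping[str, str], registered: Mapping[str, str]) -> bool:
    """Check a notification against channels saved at registration.

    `registered` maps channel ID -> resource ID (hypothetical storage shape).
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    channel_id = normalized.get("x-goog-channel-id", "")
    resource_id = normalized.get("x-goog-resource-id", "")
    expected = registered.get(channel_id)
    if expected is None:
        return False
    # Constant-time comparison avoids leaking how much of the ID matched.
    return hmac.compare_digest(resource_id.encode(), expected.encode())
```

Requests that fail this check can be dropped before any file processing begins.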