feat: Add Alateya, Clan, Eonarch agents + fix gateway-router connection
## Agents Added

- Alateya: R&D, biotech, innovations
- Clan (Spirit): Community spirit agent
- Eonarch: Consciousness evolution agent

## Changes

- docker-compose.node1.yml: Added tokens for all 3 new agents
- gateway-bot/http_api.py: Added configs and webhook endpoints
- gateway-bot/clan_prompt.txt: New prompt file
- gateway-bot/eonarch_prompt.txt: New prompt file

## Fixes

- Fixed ROUTER_URL from :9102 to :8000 (internal container port)
- All 9 Telegram agents now working

## Documentation

- Created PROJECT-MASTER-INDEX.md - single entry point
- Added various status documents and scripts

Tokens configured:

- Helion, NUTRA, Agromatrix (existing)
- Alateya, Clan, Eonarch (new)
- Druid, GreenFood, DAARWIZZ (configured)
docs/RUNBOOK_NODE1_RECOVERY_SAFETY.md (new file, +150 lines)
# Runbook: NODE1 Recovery & Safety

## Purpose

Quickly restore NODE1 after incidents (Telegram webhook 500, router DNS, NATS/worker, Grafana crash-loop) and avoid accidentally stopping the wrong stack.

## Quick links / aliases

- `./stack-node1 ps|up|down|logs` (node1 stack)
- `./stack-staging ps|up|down|logs` (staging stack)
- NODE1 Docker network: `dagi-network` (for `nats-box`)

## Scope (NODE1 stack)

- dagi-gateway-node1 (9300)
- dagi-router-node1 (router API)
- dagi-nats-node1 (4222, JetStream enabled)
- crewai-nats-worker
- dagi-memory-service-node1 (8000)
- dagi-qdrant-node1 (6333)
- dagi-postgres (5432)
- dagi-redis-node1 (6379)
- dagi-neo4j-node1 (7474/7687)
- prometheus (9090)
- grafana
- dagi-crawl4ai-node1 (11235)
- control-plane (9200)
- other node1 services as defined in docker-compose.node1.yml

## Safety rules (DO THIS FIRST)

1) Always set the project name for NODE1:
   - `export COMPOSE_PROJECT_NAME=dagi_node1`
2) Always use the correct compose file:
   - `-f docker-compose.node1.yml`
3) Never run `docker compose down` without verifying the target:
   - `docker compose -f docker-compose.node1.yml ps`
4) If staging exists, it MUST have a different `COMPOSE_PROJECT_NAME` and networks.

## Quick status

- `docker compose -f docker-compose.node1.yml ps`
- `docker compose -f docker-compose.node1.yml logs --tail=80 dagi-gateway-node1 dagi-router-node1 dagi-nats-node1 crewai-nats-worker grafana`

## Standard restart order (most incidents)

1) NATS (foundation)
2) Router (dependency for Gateway routing)
3) Gateway (webhooks)
4) Worker (async jobs)
5) Grafana (observability only)

Commands:

- `docker compose -f docker-compose.node1.yml up -d dagi-nats-node1`
- `docker compose -f docker-compose.node1.yml up -d dagi-router-node1`
- `docker compose -f docker-compose.node1.yml up -d dagi-gateway-node1`
- `docker compose -f docker-compose.node1.yml up -d crewai-nats-worker`
- `docker compose -f docker-compose.node1.yml up -d grafana`

## Incident playbooks

### A) Telegram webhook returns 500 (e.g. /greenfood/telegram/webhook)

Symptoms:

- 500 responses from gateway
- gateway logs show router request failures

Check:

- `docker logs --tail=200 dagi-gateway-node1 | grep -E "webhook|Router request failed|GREENFOOD"`
- `docker compose -f docker-compose.node1.yml ps | grep -E "dagi-gateway-node1|dagi-router-node1"`

Fix:

1) Ensure the router is healthy:
   - `docker logs --tail=120 dagi-router-node1`
   - `docker inspect --format '{{json .State.Health}}' dagi-router-node1`
2) Ensure the gateway can resolve the router (Docker DNS):
   - `docker exec -it dagi-gateway-node1 getent hosts router || true`
3) Restart router + gateway:
   - `docker restart dagi-router-node1`
   - `docker restart dagi-gateway-node1`
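The `getent hosts router` check in step 2 runs inside the gateway container, where Docker DNS serves the `router` name. The same resolvability test can be scripted; a minimal stdlib sketch (the helper name is hypothetical — from the host, `router` will normally not resolve, only from inside the node1 network):

```python
import socket

def can_resolve(hostname: str) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False

# Inside the gateway container this should be True for "router";
# run from the host it will normally be False.
print(can_resolve("localhost"))
```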
Root cause examples:

- router container crash-loop → DNS name `router` not resolvable
- ROUTER_URL points to a non-existent host/service in the node1 network

### B) Router crash-loop on startup (Pydantic / config errors)

Symptoms:

- router restarting
- traceback in `docker logs dagi-router-node1`

Fix:

1) Read the first error in the logs:
   - `docker logs --tail=200 dagi-router-node1`
2) Hotfix, then rebuild/recreate if needed:
   - code fix (previous example: `temperature: float = 0.2`)
   - `docker compose -f docker-compose.node1.yml up -d --build --force-recreate dagi-router-node1`

### C) NATS worker shows Subscription failed / NotFoundError

Symptoms:

- worker logs mention `NotFoundError`
- worker cannot subscribe / consume tasks

Check:

- `docker logs --tail=200 crewai-nats-worker`
- `docker logs --tail=200 dagi-nats-node1 | grep -i jetstream`

Fix (JetStream):

1) Ensure JetStream is enabled (NATS started with `-js`).
2) Ensure the required stream exists (example used on NODE1):
   - Stream: `STREAM_AGENT_RUN`
   - Subjects: `agent.run.>`
3) Using nats-box (inside the node1 network):
   - `docker run --rm -it --network <NODE1_NETWORK> natsio/nats-box:latest sh`
   - create the stream/consumer required by the worker subjects
4) Restart the worker:
   - `docker restart crewai-nats-worker`
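The worker's `NotFoundError` typically means no stream covers the subject it consumes. NATS subject wildcards are easy to misread, so a pure-Python sketch of the matching semantics behind `agent.run.>` (illustrative only, not part of the worker code; `*` matches exactly one token, `>` matches one or more trailing tokens):

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style subject match: '*' = one token, '>' = one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' must match at least one remaining token
            return len(s_tokens) > i
        if i >= len(s_tokens) or (p != "*" and p != s_tokens[i]):
            return False
    return len(s_tokens) == len(p_tokens)

# The STREAM_AGENT_RUN subjects filter:
print(subject_matches("agent.run.>", "agent.run.helion"))   # True
print(subject_matches("agent.run.>", "agent.ping.helion"))  # False
```

Note that `agent.run.>` does not match the bare subject `agent.run`; a worker publishing there would still hit `NotFoundError`.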
### D) Grafana crash-loop due to a provisioning alert rule

Symptoms:

- grafana restarting
- logs mention an invalid alert rule / relative time range `From: 0s, To: 0s`

Fix:

1) Identify the failing rule file:
   - `docker logs --tail=200 grafana`
2) Fix the provisioning YAML (example path used on NODE1):
   - `/opt/microdao-daarion/monitoring/grafana/provisioning/alerting/alerts.yml`
   - Ensure the rule has a valid `relativeTimeRange`
3) Restart grafana:
   - `docker restart grafana`

## Post-recovery verification checklist

1) Core health:
   - `docker compose -f docker-compose.node1.yml ps | grep -E "Up|healthy"`
2) Router reachable from gateway:
   - `docker exec -it dagi-gateway-node1 getent hosts router`
3) NATS OK:
   - `docker logs --tail=80 dagi-nats-node1 | grep -i "JetStream\|Server is ready"`
4) Worker subscribed:
   - `docker logs --tail=120 crewai-nats-worker | grep -E "Subscribed|Subscription OK|NotFoundError" || true`
5) GREENFOOD policy sanity:
   - advertising post → ignored
   - direct question → reply of ≤ 3 sentences

## Known configuration anchors (update when changed)

- GREENFOOD trading group: `t.me/+SPm1OV-pDJZhZGFi`
- ROUTER_URL used by the gateway: `http://router:8000` (must resolve inside the node1 network)
- NATS_URL: `nats://nats:4222`
- JetStream Stream: `STREAM_AGENT_RUN` (`agent.run.>`)
- Grafana alerts provisioning file: `monitoring/grafana/provisioning/alerting/alerts.yml`

## Appendix: common commands

- `docker compose -f docker-compose.node1.yml ps`
- `docker compose -f docker-compose.node1.yml logs -f <service>`
- `docker restart <container>`
- `docker compose -f docker-compose.node1.yml up -d --build --force-recreate <service>`
- `docker system df`
docs/hardcode_vs_config.md (new file, +123 lines)
# Hardcoded Values vs Configuration

## 🔴 HARDCODE - What is it?

**Hardcoding** = values "baked" directly into the code that cannot change without editing the code.

### Example of hardcoding:

```python
# ❌ HARDCODED - value is embedded in the code
local_model = "qwen3-8b"  # Changing the model means editing the code!
```

### Problems with hardcoding:

1. **Code must be edited** to change a value
2. **Cannot change without restarting** the service
3. **Hard to test** different configurations
4. **Inflexible** - the same code for all environments (dev/prod)

---

## ✅ CONFIGURATION - What is it?

**Configuration** = values stored separately from the code (in files, environment variables, a DB) that can change without editing the code.

### Example of configuration:

```yaml
# ✅ CONFIG - value lives in a separate file, router-config.yml
llm_profiles:
  qwen3_science_8b:
    provider: ollama
    model: qwen3:8b
    max_tokens: 2048
```

```python
# ✅ CODE reads from the config
llm_profile = router_config.get("llm_profiles", {}).get("qwen3_science_8b")
model = llm_profile.get("model")  # Taken from the config, not hardcoded!
```

### Advantages of configuration:

1. **Change without editing code** - just edit the YAML file
2. **Different configs for different environments** (dev/prod/staging)
3. **Easy to test** - just create a test-config.yml
4. **Flexible** - one codebase, many configurations

---

## 📊 COMPARISON

| Aspect | Hardcode | Configuration |
|--------|----------|---------------|
| **Where stored?** | In the code | In separate files |
| **How to change?** | Edit the code | Edit the config |
| **Restart needed?** | Yes (redeploy code) | Yes (restart service) |
| **Flexibility** | Low | High |
| **Testability** | Hard | Easy |

---

## 🔧 WHAT WE FIXED

### BEFORE (hardcoded):

```python
# ❌ Hardcoded - the model is always "qwen3-8b"
local_model = "qwen3-8b"
```

### AFTER (from config):

```python
# ✅ Read from the config
if llm_profile.get("provider") == "ollama":
    ollama_model = llm_profile.get("model", "qwen3:8b")
    local_model = ollama_model.replace(":", "-")  # qwen3:8b → qwen3-8b
```

### Result:

- ✅ The model comes from `router-config.yml`
- ✅ Changing the config changes the behavior
- ✅ No code edits needed to change the model
- ✅ Different agents can use different local models

---

## 💡 WHEN TO USE WHICH?

### Hardcode - only for:

- Constants (π = 3.14, API version)
- Values that will never change
- Technical details (timeout = 5.0 s)

### Configuration - for:

- LLM models
- API keys
- Service URLs
- Parameters (temperature, max_tokens)
- Agent settings

---

## 📝 EXAMPLE FROM OUR PROJECT

### router-config.yml (configuration):

```yaml
agents:
  helion:
    default_llm: qwen3_science_8b  # ← Can be changed here

llm_profiles:
  qwen3_science_8b:
    provider: ollama
    model: qwen3:8b  # ← The model can be changed here
```

### main.py (code):

```python
# Read from the config
default_llm = agent_config.get("default_llm", "qwen3-8b")
llm_profile = llm_profiles.get(default_llm, {})
model = llm_profile.get("model")  # ← Taken from the config!
```
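The full lookup chain (agent → `default_llm` → profile → model → local model id) can be sketched end to end. The config is inlined as a dict so the example is self-contained; the names mirror `router-config.yml`, and the fallback values are assumptions:

```python
# Inlined stand-in for the parsed router-config.yml
router_config = {
    "agents": {"helion": {"default_llm": "qwen3_science_8b"}},
    "llm_profiles": {
        "qwen3_science_8b": {"provider": "ollama", "model": "qwen3:8b", "max_tokens": 2048},
    },
}

def resolve_local_model(agent: str) -> str:
    """Resolve an agent's local model id entirely from config, with fallbacks."""
    agent_cfg = router_config.get("agents", {}).get(agent, {})
    default_llm = agent_cfg.get("default_llm", "qwen3-8b")
    profile = router_config.get("llm_profiles", {}).get(default_llm, {})
    model = profile.get("model", "qwen3:8b")
    # Ollama tag "qwen3:8b" -> local model id "qwen3-8b"
    return model.replace(":", "-")

print(resolve_local_model("helion"))  # qwen3-8b
```

Changing `model` in the YAML (here: the dict) changes the result without touching this function, which is the point of the refactor.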
**Now the model can be changed just by editing the YAML file!** 🎉
docs/memory/CUTOVER_CHECKLIST.md (new file, +183 lines)
# Qdrant Canonical Migration - Cutover Checklist

**Status:** GO
**Date:** 2026-01-26
**Risk Level:** Operational (security invariants verified)

---

## Pre-Cutover Verification

### Security Invariants (VERIFIED ✅)

| Invariant | Status |
|-----------|--------|
| `tenant_id` always required | ✅ |
| `agent_ids ⊆ allowed_agent_ids` (or admin) | ✅ |
| Admin default: no private | ✅ |
| Empty `should` → error | ✅ |
| Private only via owner | ✅ |
| Qdrant `match.any` format | ✅ |

---

## Cutover Steps

### 1. Deploy

```bash
# Copy code to NODE1
scp -r services/memory/qdrant/ root@NODE1:/opt/microdao-daarion/services/memory/
scp docs/memory/canonical_collections.yaml root@NODE1:/opt/microdao-daarion/docs/memory/
scp scripts/qdrant_*.py root@NODE1:/opt/microdao-daarion/scripts/

# IMPORTANT: Verify dim/metric in canonical_collections.yaml matches live embedding
# Current: dim=1024, metric=cosine

# Restart service that owns Qdrant reads/writes
docker compose restart memory-service
# OR
systemctl restart memory-service
```

### 2. Migration

```bash
# Dry run - MUST pass before real migration
python3 scripts/qdrant_migrate_to_canonical.py --all --dry-run 2>&1 | tee migration_dry_run.log

# Verify dry run output:
# - Target collection name(s) shown
# - Per-collection counts listed
# - Zero dim/metric mismatches (unless --skip-dim-check used)

# Real migration
python3 scripts/qdrant_migrate_to_canonical.py --all --continue-on-error 2>&1 | tee migration_$(date +%Y%m%d_%H%M%S).log

# Review summary:
# - Collections processed: X/Y
# - Points migrated: N
# - Errors: should be 0 or minimal
```

### 3. Parity Check

```bash
python3 scripts/qdrant_parity_check.py --agents helion,nutra,druid 2>&1 | tee parity_check.log

# Requirements:
# - Count parity within tolerance
# - topK overlap threshold passes
# - Schema validation passes
```
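"Count parity within tolerance" reduces to a simple relative-difference check; a sketch of that logic (the 1% tolerance is an assumption — `qdrant_parity_check.py` may use a different threshold):

```python
def counts_match(legacy: int, canonical: int, tolerance: float = 0.01) -> bool:
    """True if the canonical count is within `tolerance` (relative) of the legacy count."""
    if legacy == 0:
        return canonical == 0
    return abs(canonical - legacy) / legacy <= tolerance

print(counts_match(10_000, 9_950))  # True  (0.5% drift)
print(counts_match(10_000, 9_800))  # False (2% drift)
```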
### 4. Dual-Read Window

```bash
# Enable dual-read for validation
export DUAL_READ_OLD=true

# Restart service to pick up env change
docker compose restart memory-service
```

**Validation queries (must pass):**

| Query Type | Expected Result |
|------------|-----------------|
| Agent-only (Helion) | Returns own docs, no other agents |
| Multi-agent (DAARWIZZ) | Returns from allowed agents only |
| Private visibility | Only owner sees private |

```bash
# Run smoke test
python3 scripts/qdrant_smoke_test.py --host localhost
```

### 5. Cutover

```bash
# Disable dual-read
export DUAL_READ_OLD=false

# Ensure no legacy writes
export DUAL_WRITE_OLD=false
# OR remove these env vars entirely

# Restart service
docker compose restart memory-service

# Verify service is healthy
curl -s http://localhost:8000/health
```

### 6. Post-Cutover Guard

```bash
# Keep legacy collections for the rollback window (recommended: 7 days)
# DO NOT delete legacy collections yet

# After the rollback window (7 days):
# 1. Run one more parity check
python3 scripts/qdrant_parity_check.py --all

# 2. If parity passes, delete legacy collections
# WARNING: This is irreversible
# python3 -c "
# from qdrant_client import QdrantClient
# client = QdrantClient(host='localhost', port=6333)
# legacy = ['helion_docs', 'nutra_messages', ...]  # list all legacy
# for col in legacy:
#     client.delete_collection(col)
#     print(f'Deleted: {col}')
# "
```

---

## Rollback Procedure

If issues arise after cutover:

```bash
# 1. Re-enable dual-read from legacy
export DUAL_READ_OLD=true
export DUAL_WRITE_OLD=true  # if needed

# 2. Restart service
docker compose restart memory-service

# 3. Investigate issues
# - Check migration logs
# - Check parity results
# - Review error messages

# 4. If canonical data is corrupted, switch to legacy-only mode:
# (requires a code change to bypass canonical reads)
```

---

## Files Reference

| File | Purpose |
|------|---------|
| `services/memory/qdrant/` | Canonical Qdrant module |
| `docs/memory/canonical_collections.yaml` | Collection config |
| `docs/memory/cm_payload_v1.md` | Payload schema docs |
| `scripts/qdrant_migrate_to_canonical.py` | Migration tool |
| `scripts/qdrant_parity_check.py` | Parity verification |
| `scripts/qdrant_smoke_test.py` | Security smoke test |

---

## Sign-off

- [ ] Dry run passed
- [ ] Migration completed
- [ ] Parity check passed
- [ ] Dual-read validation passed
- [ ] Cutover completed
- [ ] Post-cutover health verified
- [ ] Rollback window started (7 days)
- [ ] Legacy collections deleted (after rollback window)
docs/memory/canonical_collections.yaml (new file, +142 lines)
# Canonical Qdrant Collections Configuration
# Version: 1.0
# Last Updated: 2026-01-26

# Default embedding configuration
embedding:
  text:
    model: "cohere-embed-multilingual-v3"
    dim: 1024
    metric: "cosine"
  code:
    model: "openai-text-embedding-3-small"
    dim: 1536
    metric: "cosine"

# Canonical collections
collections:
  text:
    name: "cm_text_1024_v1"
    dim: 1024
    metric: "cosine"
    description: "Main text embeddings collection"
    payload_indexes:
      - field: "tenant_id"
        type: "keyword"
      - field: "team_id"
        type: "keyword"
      - field: "agent_id"
        type: "keyword"
      - field: "scope"
        type: "keyword"
      - field: "visibility"
        type: "keyword"
      - field: "indexed"
        type: "bool"
      - field: "source_id"
        type: "keyword"
      - field: "tags"
        type: "keyword"
      - field: "created_at"
        type: "datetime"

# Tenant configuration
tenants:
  daarion:
    id: "t_daarion"
    default_team: "team_core"

# Team configuration
teams:
  core:
    id: "team_core"
    tenant_id: "t_daarion"

# Agent slug mapping (legacy name -> canonical slug)
agent_slugs:
  helion: "agt_helion"
  Helion: "agt_helion"
  HELION: "agt_helion"

  nutra: "agt_nutra"
  Nutra: "agt_nutra"
  NUTRA: "agt_nutra"

  druid: "agt_druid"
  Druid: "agt_druid"
  DRUID: "agt_druid"

  greenfood: "agt_greenfood"
  Greenfood: "agt_greenfood"
  GREENFOOD: "agt_greenfood"

  agromatrix: "agt_agromatrix"
  AgroMatrix: "agt_agromatrix"
  AGROMATRIX: "agt_agromatrix"

  daarwizz: "agt_daarwizz"
  Daarwizz: "agt_daarwizz"
  DAARWIZZ: "agt_daarwizz"

  alateya: "agt_alateya"
  Alateya: "agt_alateya"
  ALATEYA: "agt_alateya"
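The case variants in `agent_slugs` all collapse to one canonical slug, so the mapping can equally be computed; a sketch (assuming legacy names differ only by case — the explicit YAML table remains the source of truth):

```python
KNOWN_AGENTS = {"helion", "nutra", "druid", "greenfood", "agromatrix", "daarwizz", "alateya"}

def canonical_agent_slug(legacy_name: str) -> str:
    """Map a legacy agent name (any casing) to its canonical agt_<slug> id."""
    slug = legacy_name.lower()
    if slug not in KNOWN_AGENTS:
        raise ValueError(f"unknown agent: {legacy_name!r}")
    return f"agt_{slug}"

print(canonical_agent_slug("HELION"))      # agt_helion
print(canonical_agent_slug("AgroMatrix"))  # agt_agromatrix
```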
# Legacy collection mapping rules
legacy_collection_mapping:
  # Pattern: collection_name_regex -> (agent_slug_group, scope, tags)
  patterns:
    - regex: "^([a-z]+)_docs$"
      agent_group: 1
      scope: "docs"
      tags: []

    - regex: "^([a-z]+)_messages$"
      agent_group: 1
      scope: "messages"
      tags: []

    - regex: "^([a-z]+)_memory_items$"
      agent_group: 1
      scope: "memory"
      tags: []

    - regex: "^([a-z]+)_artifacts$"
      agent_group: 1
      scope: "artifacts"
      tags: []

    - regex: "^druid_legal_kb$"
      agent_group: null
      agent_id: "agt_druid"
      scope: "docs"
      tags: ["legal_kb"]

    - regex: "^nutra_food_knowledge$"
      agent_group: null
      agent_id: "agt_nutra"
      scope: "docs"
      tags: ["food_kb"]

    - regex: "^memories$"
      agent_group: null
      scope: "memory"
      tags: []

    - regex: "^messages$"
      agent_group: null
      scope: "messages"
      tags: []
# Feature flags for migration
feature_flags:
  dual_write_enabled: false
  dual_read_enabled: false
  canonical_write_only: false
  legacy_read_fallback: true

# Defaults for migration
migration_defaults:
  visibility: "confidential"
  owner_kind: "agent"
  indexed: true
docs/memory/cm_payload_v1.md (new file, +216 lines)
# Co-Memory Payload Schema v1 (cm_payload_v1)

**Version:** 1.0
**Status:** Canonical
**Last Updated:** 2026-01-26

## Overview

This document defines the canonical payload schema for all vectors stored in Qdrant across the DAARION platform. The schema enables:

- **Unlimited agents** without creating new collections
- **Fine-grained access control** via payload filters
- **Multi-tenant isolation** via tenant_id
- **Consistent querying** across all memory types

## Design Principles

1. **One collection = one embedding space** (same dim + metric)
2. **No per-agent collections** - agents identified by the `agent_id` field
3. **Access control via payload** - visibility + ACL fields
4. **Stable identifiers** - ULIDs for all entities

---

## Collection Naming Convention

```
cm_<type>_<dim>_v<version>
```

Examples:

- `cm_text_1024_v1` - text embeddings, 1024 dimensions
- `cm_code_768_v1` - code embeddings, 768 dimensions
- `cm_mm_512_v1` - multimodal embeddings, 512 dimensions

---

## Payload Schema

### Required Fields (MVP)

| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `schema_version` | string | Always `"cm_payload_v1"` | `"cm_payload_v1"` |
| `tenant_id` | string | Tenant identifier | `"t_daarion"` |
| `team_id` | string | Team identifier (nullable) | `"team_core"` |
| `project_id` | string | Project identifier (nullable) | `"proj_helion"` |
| `agent_id` | string | Agent identifier (nullable) | `"agt_helion"` |
| `owner_kind` | enum | Owner type | `"agent"` / `"team"` / `"user"` |
| `owner_id` | string | Owner identifier | `"agt_helion"` |
| `scope` | enum | Content type | `"docs"` / `"messages"` / `"memory"` / `"artifacts"` / `"signals"` |
| `visibility` | enum | Access level | `"public"` / `"confidential"` / `"private"` |
| `indexed` | boolean | Searchable by AI | `true` |
| `source_kind` | enum | Source type | `"document"` / `"wiki"` / `"message"` / `"artifact"` / `"web"` / `"code"` |
| `source_id` | string | Source identifier | `"doc_01HQ..."` |
| `chunk.chunk_id` | string | Chunk identifier | `"chk_01HQ..."` |
| `chunk.chunk_idx` | integer | Chunk index in source | `0` |
| `fingerprint` | string | Content hash (SHA256) | `"a1b2c3..."` |
| `created_at` | string | ISO 8601 timestamp | `"2026-01-26T12:00:00Z"` |
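The `fingerprint` field is a SHA256 content hash used for deduplication. A sketch of producing it in the `sha256:<hex>` form shown in the examples (the exact text normalization applied before hashing is an assumption):

```python
import hashlib

def fingerprint(text: str) -> str:
    """Content hash in the 'sha256:<hex>' form used by cm_payload_v1."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"

print(fingerprint("hello"))
```

Two chunks with the same normalized text yield the same fingerprint, so the migration can skip points that already exist in the canonical collection.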
### Optional Fields (Recommended)

| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `acl.read_team_ids` | array[string] | Teams with read access | `["team_core"]` |
| `acl.read_agent_ids` | array[string] | Agents with read access | `["agt_nutra"]` |
| `acl.read_role_ids` | array[string] | Roles with read access | `["role_admin"]` |
| `tags` | array[string] | Content tags | `["legal_kb", "contracts"]` |
| `lang` | string | Language code | `"uk"` / `"en"` |
| `importance` | float | Importance score 0-1 | `0.8` |
| `ttl_days` | integer | Auto-delete after N days | `365` |
| `embedding.model` | string | Embedding model ID | `"cohere-embed-v3"` |
| `embedding.dim` | integer | Vector dimension | `1024` |
| `embedding.metric` | string | Distance metric | `"cosine"` |
| `updated_at` | string | Last update timestamp | `"2026-01-26T12:00:00Z"` |

---

## Identifier Formats

| Entity | Prefix | Format | Example |
|--------|--------|--------|---------|
| Tenant | `t_` | `t_<slug>` | `t_daarion` |
| Team | `team_` | `team_<slug>` | `team_core` |
| Project | `proj_` | `proj_<slug>` | `proj_helion` |
| Agent | `agt_` | `agt_<slug>` | `agt_helion` |
| Document | `doc_` | `doc_<ulid>` | `doc_01HQXYZ...` |
| Message | `msg_` | `msg_<ulid>` | `msg_01HQXYZ...` |
| Artifact | `art_` | `art_<ulid>` | `art_01HQXYZ...` |
| Chunk | `chk_` | `chk_<ulid>` | `chk_01HQXYZ...` |

---

## Scope Enum

| Value | Description | Typical Sources |
|-------|-------------|-----------------|
| `docs` | Documents, knowledge bases | PDF, Google Docs, Wiki |
| `messages` | Conversations | Telegram, Slack, Email |
| `memory` | Agent memory items | Session notes, learned facts |
| `artifacts` | Generated content | Reports, presentations |
| `signals` | Events, notifications | System events |

---

## Visibility Enum

| Value | Access Rule |
|-------|-------------|
| `public` | Anyone in tenant/team can read |
| `confidential` | Owner + ACL-granted readers |
| `private` | Only owner can read |

---

## Access Control Rules

### Private Content

```python
visibility == "private" AND owner_kind == request.owner_kind AND owner_id == request.owner_id
```

### Confidential Content

```python
visibility == "confidential" AND (
    (owner_kind == request.owner_kind AND owner_id == request.owner_id) OR
    request.agent_id IN acl.read_agent_ids OR
    request.team_id IN acl.read_team_ids OR
    request.role_id IN acl.read_role_ids
)
```

### Public Content

```python
visibility == "public" AND team_id == request.team_id
```
---

## Migration Mapping (Legacy Collections)

| Old Collection Pattern | New Payload |
|------------------------|-------------|
| `helion_docs` | `agent_id="agt_helion"`, `scope="docs"` |
| `nutra_messages` | `agent_id="agt_nutra"`, `scope="messages"` |
| `druid_legal_kb` | `agent_id="agt_druid"`, `scope="docs"`, `tags=["legal_kb"]` |
| `nutra_food_knowledge` | `agent_id="agt_nutra"`, `scope="docs"`, `tags=["food_kb"]` |
| `*_memory_items` | `scope="memory"` |
| `*_artifacts` | `scope="artifacts"` |

---

## Example Payloads

### Document Chunk (Helion Knowledge Base)

```json
{
  "schema_version": "cm_payload_v1",
  "tenant_id": "t_daarion",
  "team_id": "team_core",
  "project_id": "proj_helion",
  "agent_id": "agt_helion",
  "owner_kind": "agent",
  "owner_id": "agt_helion",
  "scope": "docs",
  "visibility": "confidential",
  "indexed": true,
  "source_kind": "document",
  "source_id": "doc_01HQ8K9X2NPQR3FGJKLM5678",
  "chunk": {
    "chunk_id": "chk_01HQ8K9X3MPQR3FGJKLM9012",
    "chunk_idx": 0
  },
  "fingerprint": "sha256:a1b2c3d4e5f6...",
  "created_at": "2026-01-26T12:00:00Z",
  "tags": ["product", "features"],
  "lang": "uk",
  "embedding": {
    "model": "cohere-embed-multilingual-v3",
    "dim": 1024,
    "metric": "cosine"
  }
}
```

### Message (Telegram Conversation)

```json
{
  "schema_version": "cm_payload_v1",
  "tenant_id": "t_daarion",
  "team_id": "team_core",
  "agent_id": "agt_helion",
  "owner_kind": "user",
  "owner_id": "user_tg_123456",
  "scope": "messages",
  "visibility": "private",
  "indexed": true,
  "source_kind": "message",
  "source_id": "msg_01HQ8K9X4NPQR3FGJKLM3456",
  "chunk": {
    "chunk_id": "chk_01HQ8K9X5MPQR3FGJKLM7890",
    "chunk_idx": 0
  },
  "fingerprint": "sha256:b2c3d4e5f6g7...",
  "created_at": "2026-01-26T12:05:00Z",
  "channel_id": "tg_chat_789"
}
```

---

## Changelog

- **v1.0** (2026-01-26): Initial canonical schema
docs/memory/cm_payload_v1.schema.json (new file, +178 lines)
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://daarion.city/schemas/cm_payload_v1.json",
  "title": "Co-Memory Payload Schema v1",
  "description": "Canonical payload schema for Qdrant vectors in DAARION platform",
  "type": "object",
  "required": [
    "schema_version",
    "tenant_id",
    "owner_kind",
    "owner_id",
    "scope",
    "visibility",
    "indexed",
    "source_kind",
    "source_id",
    "chunk",
    "fingerprint",
    "created_at"
  ],
  "properties": {
    "schema_version": {
      "type": "string",
      "const": "cm_payload_v1",
      "description": "Schema version identifier"
    },
    "tenant_id": {
      "type": "string",
      "pattern": "^t_[a-z0-9_]+$",
      "description": "Tenant identifier (t_<slug>)"
    },
    "team_id": {
      "type": ["string", "null"],
      "pattern": "^team_[a-z0-9_]+$",
      "description": "Team identifier (team_<slug>)"
    },
    "project_id": {
      "type": ["string", "null"],
      "pattern": "^proj_[a-z0-9_]+$",
      "description": "Project identifier (proj_<slug>)"
    },
    "agent_id": {
      "type": ["string", "null"],
      "pattern": "^agt_[a-z0-9_]+$",
      "description": "Agent identifier (agt_<slug>)"
    },
    "owner_kind": {
      "type": "string",
      "enum": ["user", "team", "agent"],
      "description": "Type of owner"
    },
    "owner_id": {
      "type": "string",
      "minLength": 1,
      "description": "Owner identifier"
    },
    "scope": {
      "type": "string",
      "enum": ["docs", "messages", "memory", "artifacts", "signals"],
      "description": "Content type/scope"
    },
    "visibility": {
      "type": "string",
      "enum": ["public", "confidential", "private"],
      "description": "Access visibility level"
    },
    "indexed": {
      "type": "boolean",
      "description": "Whether content is searchable by AI"
    },
    "source_kind": {
      "type": "string",
      "enum": ["document", "wiki", "message", "artifact", "web", "code"],
      "description": "Type of source content"
    },
    "source_id": {
      "type": "string",
      "pattern": "^(doc|msg|art|web|code)_[A-Za-z0-9]+$",
      "description": "Source identifier with type prefix"
    },
    "chunk": {
      "type": "object",
      "required": ["chunk_id", "chunk_idx"],
      "properties": {
        "chunk_id": {
          "type": "string",
          "pattern": "^chk_[A-Za-z0-9]+$",
          "description": "Chunk identifier"
        },
        "chunk_idx": {
          "type": "integer",
          "minimum": 0,
          "description": "Chunk index within source"
        }
      }
    },
    "fingerprint": {
      "type": "string",
      "minLength": 1,
      "description": "Content hash for deduplication"
    },
    "created_at": {
      "type": "string",
      "format": "date-time",
      "description": "Creation timestamp (ISO 8601)"
    },
    "updated_at": {
      "type": ["string", "null"],
      "format": "date-time",
      "description": "Last update timestamp (ISO 8601)"
    },
    "acl": {
      "type": "object",
      "properties": {
        "read_team_ids": {
          "type": "array",
          "items": {"type": "string"},
          "description": "Teams with read access"
        },
        "read_agent_ids": {
          "type": "array",
          "items": {"type": "string"},
          "description": "Agents with read access"
        },
        "read_role_ids": {
          "type": "array",
          "items": {"type": "string"},
          "description": "Roles with read access"
        }
      }
    },
    "tags": {
      "type": "array",
      "items": {"type": "string"},
      "description": "Content tags for filtering"
    },
    "lang": {
      "type": ["string", "null"],
      "pattern": "^[a-z]{2}(-[A-Z]{2})?$",
      "description": "Language code (ISO 639-1)"
    },
    "importance": {
      "type": ["number", "null"],
      "minimum": 0,
      "maximum": 1,
      "description": "Importance score (0-1)"
    },
    "ttl_days": {
      "type": ["integer", "null"],
      "minimum": 1,
      "description": "Auto-delete after N days"
    },
    "channel_id": {
      "type": ["string", "null"],
      "description": "Channel/chat identifier for messages"
    },
    "embedding": {
      "type": "object",
      "properties": {
        "model": {
          "type": "string",
          "description": "Embedding model identifier"
        },
        "dim": {
          "type": "integer",
          "minimum": 1,
          "description": "Vector dimension"
        },
        "metric": {
          "type": "string",
          "enum": ["cosine", "dot", "euclidean"],
          "description": "Distance metric"
        }
      }
    }
  },
  "additionalProperties": true
}
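For a quick sanity check without a full JSON Schema validator, the required fields and a couple of the patterns above can be verified with the stdlib; a minimal sketch (not a replacement for proper validation against the schema file):

```python
import re

REQUIRED = ["schema_version", "tenant_id", "owner_kind", "owner_id", "scope",
            "visibility", "indexed", "source_kind", "source_id", "chunk",
            "fingerprint", "created_at"]
PATTERNS = {
    "tenant_id": r"^t_[a-z0-9_]+$",
    "source_id": r"^(doc|msg|art|web|code)_[A-Za-z0-9]+$",
}

def check_payload(payload: dict) -> list[str]:
    """Return a list of violations (empty list = passes these spot checks)."""
    errors = [f"missing: {key}" for key in REQUIRED if key not in payload]
    for key, pattern in PATTERNS.items():
        value = payload.get(key)
        if isinstance(value, str) and not re.match(pattern, value):
            errors.append(f"bad format: {key}={value!r}")
    if "schema_version" in payload and payload["schema_version"] != "cm_payload_v1":
        errors.append("schema_version must be 'cm_payload_v1'")
    return errors

print(check_payload({"schema_version": "cm_payload_v1", "tenant_id": "T_DAARION"}))
```

For production checks, validating payloads against `cm_payload_v1.schema.json` with the `jsonschema` package covers the full schema, including enums and nested `chunk` requirements.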