feat: Add Alateya, Clan, Eonarch agents + fix gateway-router connection

## Agents Added
- Alateya: R&D, biotech, innovations
- Clan (Spirit): Community spirit agent
- Eonarch: Consciousness evolution agent

## Changes
- docker-compose.node1.yml: Added tokens for all 3 new agents
- gateway-bot/http_api.py: Added configs and webhook endpoints
- gateway-bot/clan_prompt.txt: New prompt file
- gateway-bot/eonarch_prompt.txt: New prompt file

## Fixes
- Fixed ROUTER_URL from :9102 to :8000 (internal container port)
- All 9 Telegram agents now working

## Documentation
- Created PROJECT-MASTER-INDEX.md - single entry point
- Added various status documents and scripts

Tokens configured:
- Helion, NUTRA, Agromatrix (existing)
- Alateya, Clan, Eonarch (new)
- Druid, GreenFood, DAARWIZZ (configured)
Author: Apple
Date: 2026-01-28 06:40:34 -08:00
Parent: 4aeb69e7ae
Commit: 0c8bef82f4
120 changed files with 21905 additions and 425 deletions


@@ -0,0 +1,150 @@
# Runbook: NODE1 Recovery & Safety
## Purpose
Quickly restore NODE1 after failures (Telegram webhook 500, router DNS, NATS/worker, Grafana crash-loop) and avoid accidentally stopping the wrong stack.
## Quick links / aliases
- `./stack-node1 ps|up|down|logs` (node1 stack)
- `./stack-staging ps|up|down|logs` (staging stack)
- NODE1 Docker network: `dagi-network` (used by `nats-box`)
## Scope (NODE1 stack)
- dagi-gateway-node1 (9300)
- dagi-router-node1 (router API)
- dagi-nats-node1 (4222, JetStream enabled)
- crewai-nats-worker
- dagi-memory-service-node1 (8000)
- dagi-qdrant-node1 (6333)
- dagi-postgres (5432)
- dagi-redis-node1 (6379)
- dagi-neo4j-node1 (7474/7687)
- prometheus (9090)
- grafana
- dagi-crawl4ai-node1 (11235)
- control-plane (9200)
- other node1 services as defined in docker-compose.node1.yml
## Safety rules (DO THIS FIRST)
1) Always set project name for NODE1:
- `export COMPOSE_PROJECT_NAME=dagi_node1`
2) Always use the correct compose file:
- `-f docker-compose.node1.yml`
3) Never run `docker compose down` without verifying target:
- `docker compose -f docker-compose.node1.yml ps`
4) If staging exists, it MUST have a different `COMPOSE_PROJECT_NAME` and networks.
## Quick status
- `docker compose -f docker-compose.node1.yml ps`
- `docker compose -f docker-compose.node1.yml logs --tail=80 dagi-gateway-node1 dagi-router-node1 dagi-nats-node1 crewai-nats-worker grafana`
## Standard restart order (most incidents)
1) NATS (foundation)
2) Router (dependency for Gateway routing)
3) Gateway (webhooks)
4) Worker (async jobs)
5) Grafana (observability only)
Commands:
- `docker compose -f docker-compose.node1.yml up -d dagi-nats-node1`
- `docker compose -f docker-compose.node1.yml up -d dagi-router-node1`
- `docker compose -f docker-compose.node1.yml up -d dagi-gateway-node1`
- `docker compose -f docker-compose.node1.yml up -d crewai-nats-worker`
- `docker compose -f docker-compose.node1.yml up -d grafana`
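The restart order above can also be scripted. A minimal sketch (service and compose-file names taken from this runbook; `restart_all` shells out to `docker compose`, so treat it as illustrative, not a tested tool):

```python
"""Sketch: bring up NODE1 services in dependency order."""
import subprocess

COMPOSE = ["docker", "compose", "-f", "docker-compose.node1.yml"]

# Order matters: NATS first (foundation), Grafana last (observability only).
RESTART_ORDER = [
    "dagi-nats-node1",
    "dagi-router-node1",
    "dagi-gateway-node1",
    "crewai-nats-worker",
    "grafana",
]

def restart_commands() -> list[list[str]]:
    """Build one `up -d` command per service, preserving order."""
    return [COMPOSE + ["up", "-d", svc] for svc in RESTART_ORDER]

def restart_all() -> None:
    for cmd in restart_commands():
        subprocess.run(cmd, check=True)
```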
## Incident playbooks
### A) Telegram webhook returns 500 (e.g. /greenfood/telegram/webhook)
Symptoms:
- 500 responses from gateway
- gateway logs show router request failures
Check:
- `docker logs --tail=200 dagi-gateway-node1 | grep -E "webhook|Router request failed|GREENFOOD"`
- `docker compose -f docker-compose.node1.yml ps | grep -E "dagi-gateway-node1|dagi-router-node1"`
Fix:
1) Ensure router is healthy:
- `docker logs --tail=120 dagi-router-node1`
- `docker inspect --format '{{json .State.Health}}' dagi-router-node1`
2) Ensure gateway can resolve router (Docker DNS):
- `docker exec -it dagi-gateway-node1 getent hosts router || true`
3) Restart router + gateway:
- `docker restart dagi-router-node1`
- `docker restart dagi-gateway-node1`
Root cause examples:
- router container crash-loop → DNS name `router` not resolvable
- ROUTER_URL points to non-existing host/service in node1 network
### B) Router crash-loop on startup (Pydantic / config errors)
Symptoms:
- router restarting
- traceback in `docker logs dagi-router-node1`
Fix:
1) Read the first error in logs:
- `docker logs --tail=200 dagi-router-node1`
2) Hotfix then rebuild/recreate if needed:
- apply the code fix (prior example: `temperature: float = 0.2`)
- `docker compose -f docker-compose.node1.yml up -d --build --force-recreate dagi-router-node1`
### C) NATS worker shows Subscription failed / NotFoundError
Symptoms:
- worker logs mention `NotFoundError`
- worker cannot subscribe / consume tasks
Check:
- `docker logs --tail=200 crewai-nats-worker`
- `docker logs --tail=200 dagi-nats-node1 | grep -i jetstream`
Fix (JetStream):
1) Ensure JetStream enabled (NATS started with `-js`).
2) Ensure required stream exists (example used on NODE1):
- Stream: `STREAM_AGENT_RUN`
- Subjects: `agent.run.>`
3) Using nats-box (inside node1 network):
- `docker run --rm -it --network dagi-network natsio/nats-box:latest sh`
- create stream/consumer as required by worker subjects
4) Restart worker:
- `docker restart crewai-nats-worker`
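To sanity-check offline that the worker's subjects are covered by the stream's `agent.run.>` filter, standard NATS subject-matching rules (`*` matches exactly one token, a trailing `>` matches one or more remaining tokens) can be sketched without the client library:

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Minimal NATS subject matcher: `*` matches exactly one token,
    a trailing `>` matches one or more remaining tokens."""
    pat, sub = pattern.split("."), subject.split(".")
    for i, tok in enumerate(pat):
        if tok == ">":
            return len(sub) > i  # `>` needs at least one more token
        if i >= len(sub):
            return False
        if tok not in ("*", sub[i]):
            return False
    return len(pat) == len(sub)
```

For example, `subject_matches("agent.run.>", "agent.run.started")` is true, while the bare subject `agent.run` does not match the filter.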
### D) Grafana crash-loop due to provisioning alert rule
Symptoms:
- grafana restarting
- logs mention invalid alert rule / relative time range `From: 0s, To: 0s`
Fix:
1) Identify failing rule file:
- `docker logs --tail=200 grafana`
2) Fix provisioning yaml (example path used on NODE1):
- `/opt/microdao-daarion/monitoring/grafana/provisioning/alerting/alerts.yml`
- Ensure rule has valid `relativeTimeRange`
3) Restart grafana:
- `docker restart grafana`
## Post-recovery verification checklist
1) Core health:
- `docker compose -f docker-compose.node1.yml ps | grep -E "Up|healthy"`
2) Router reachable from gateway:
- `docker exec -it dagi-gateway-node1 getent hosts router`
3) NATS OK:
- `docker logs --tail=80 dagi-nats-node1 | grep -i "JetStream\|Server is ready"`
4) Worker subscribed:
- `docker logs --tail=120 crewai-nats-worker | grep -E "Subscribed|Subscription OK|NotFoundError" || true`
5) GREENFOOD policy sanity:
- an advertising post → ignored
- a direct question → an answer of at most 3 sentences
## Known configuration anchors (update when changed)
- GREENFOOD trading group: `t.me/+SPm1OV-pDJZhZGFi`
- ROUTER_URL used by gateway: `http://router:8000` (must resolve inside node1 network)
- NATS_URL: `nats://nats:4222`
- JetStream Stream: `STREAM_AGENT_RUN` (`agent.run.>`)
- Grafana alerts provisioning file: `monitoring/grafana/provisioning/alerting/alerts.yml`
## Appendix: common commands
- `docker compose -f docker-compose.node1.yml ps`
- `docker compose -f docker-compose.node1.yml logs -f <service>`
- `docker restart <container>`
- `docker compose -f docker-compose.node1.yml up -d --build --force-recreate <service>`
- `docker system df`

docs/hardcode_vs_config.md

@@ -0,0 +1,123 @@
# Hardcode vs Configuration
## 🔴 HARDCODE - What is it?
**Hardcode** = values "baked" directly into the code that cannot be changed without editing the code.
### Hardcode example:
```python
# ❌ HARDCODE - value lives directly in the code
local_model = "qwen3-8b"  # Changing the model means editing the code!
```
### Problems with hardcode:
1. **Code edits required** to change a value
2. **No change without restarting** the service
3. **Hard to test** different configurations
4. **Inflexible** - the same code for every environment (dev/prod)
---
## ✅ CONFIGURATION (Config) - What is it?
**Configuration** = values stored separately from the code (in files, environment variables, a database) that can be changed without editing the code.
### Configuration example:
```yaml
# ✅ CONFIG - values live in a separate file, router-config.yml
llm_profiles:
  qwen3_science_8b:
    provider: ollama
    model: qwen3:8b
    max_tokens: 2048
```
```python
# ✅ CODE reads from the config
llm_profile = router_config.get("llm_profiles", {}).get("qwen3_science_8b")
model = llm_profile.get("model")  # Taken from the config, not hardcoded!
```
### Advantages of configuration:
1. **Change without editing code** - just edit the YAML file
2. **Different configs for different environments** (dev/prod/staging)
3. **Easy to test** - create a test-config.yml
4. **Flexible** - one codebase, many configurations
---
## 📊 COMPARISON
| Aspect | Hardcode | Configuration |
|--------|----------|---------------|
| **Where is it stored?** | In the code | In separate files |
| **How to change it?** | Edit the code | Edit the config |
| **Restart needed?** | Yes (recompilation) | Yes (service restart) |
| **Flexibility** | Low | High |
| **Testing** | Hard | Easy |
---
## 🔧 WHAT DID WE FIX?
### BEFORE (hardcode):
```python
# ❌ Hardcode - the model is always "qwen3-8b"
local_model = "qwen3-8b"
```
### AFTER (from config):
```python
# ✅ Read from the config
if llm_profile.get("provider") == "ollama":
    ollama_model = llm_profile.get("model", "qwen3:8b")
    local_model = ollama_model.replace(":", "-")  # qwen3:8b → qwen3-8b
```
### Result:
- ✅ The model comes from `router-config.yml`
- ✅ Change the config → the behavior changes
- ✅ No code edits needed to change the model
- ✅ Different agents can use different local models
---
## 💡 WHEN TO USE WHICH?
### Hardcode - only for:
- Constants (π = 3.14, API version)
- Values that will never change
- Technical details (timeout = 5.0 s)
### Configuration - for:
- LLM models
- API keys
- Service URLs
- Parameters (temperature, max_tokens)
- Agent settings
---
## 📝 EXAMPLE FROM OUR PROJECT
### router-config.yml (configuration):
```yaml
agents:
  helion:
    default_llm: qwen3_science_8b  # ← Change it here
llm_profiles:
  qwen3_science_8b:
    provider: ollama
    model: qwen3:8b  # ← Change the model here
```
### main.py (code):
```python
# Read from the config
default_llm = agent_config.get("default_llm", "qwen3-8b")
llm_profile = llm_profiles.get(default_llm, {})
model = llm_profile.get("model")  # ← Taken from the config!
```
**Now the model can be changed simply by editing the YAML file!** 🎉
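As an executable sketch of this lookup, here is the resolution logic with an inline dict standing in for the parsed router-config.yml (field names follow this document; the helper name is illustrative):

```python
# Inline stand-in for the parsed router-config.yml.
router_config = {
    "agents": {"helion": {"default_llm": "qwen3_science_8b"}},
    "llm_profiles": {
        "qwen3_science_8b": {"provider": "ollama", "model": "qwen3:8b"},
    },
}

def resolve_local_model(agent: str) -> str:
    """Resolve an agent's local model name from the config, with fallbacks."""
    agent_cfg = router_config["agents"].get(agent, {})
    default_llm = agent_cfg.get("default_llm", "qwen3-8b")
    profile = router_config["llm_profiles"].get(default_llm, {})
    if profile.get("provider") == "ollama":
        # Ollama names look like "qwen3:8b"; the local name uses "qwen3-8b".
        return profile.get("model", "qwen3:8b").replace(":", "-")
    return default_llm
```

With this config, `resolve_local_model("helion")` yields `"qwen3-8b"`; changing `model` in the YAML changes the result without touching the code.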


@@ -0,0 +1,183 @@
# Qdrant Canonical Migration - Cutover Checklist
**Status:** GO
**Date:** 2026-01-26
**Risk Level:** Operational (security invariants verified)
---
## Pre-Cutover Verification
### Security Invariants (VERIFIED ✅)
| Invariant | Status |
|-----------|--------|
| `tenant_id` always required | ✅ |
| `agent_ids ⊆ allowed_agent_ids` (or admin) | ✅ |
| Admin default: no private | ✅ |
| Empty `should` → error | ✅ |
| Private only via owner | ✅ |
| Qdrant `match.any` format | ✅ |
---
## Cutover Steps
### 1. Deploy
```bash
# Copy code to NODE1
scp -r services/memory/qdrant/ root@NODE1:/opt/microdao-daarion/services/memory/
scp docs/memory/canonical_collections.yaml root@NODE1:/opt/microdao-daarion/docs/memory/
scp scripts/qdrant_*.py root@NODE1:/opt/microdao-daarion/scripts/
# IMPORTANT: Verify dim/metric in canonical_collections.yaml matches live embedding
# Current: dim=1024, metric=cosine
# Restart service that owns Qdrant reads/writes
docker compose restart memory-service
# OR
systemctl restart memory-service
```
### 2. Migration
```bash
# Dry run - MUST pass before real migration
python3 scripts/qdrant_migrate_to_canonical.py --all --dry-run 2>&1 | tee migration_dry_run.log
# Verify dry run output:
# - Target collection name(s) shown
# - Per-collection counts listed
# - Zero dim/metric mismatches (unless --skip-dim-check used)
# Real migration
python3 scripts/qdrant_migrate_to_canonical.py --all --continue-on-error 2>&1 | tee migration_$(date +%Y%m%d_%H%M%S).log
# Review summary:
# - Collections processed: X/Y
# - Points migrated: N
# - Errors: should be 0 or minimal
```
### 3. Parity Check
```bash
python3 scripts/qdrant_parity_check.py --agents helion,nutra,druid 2>&1 | tee parity_check.log
# Requirements:
# - Count parity within tolerance
# - topK overlap threshold passes
# - Schema validation passes
```
### 4. Dual-Read Window
```bash
# Enable dual-read for validation
export DUAL_READ_OLD=true
# Restart service to pick up env change
docker compose restart memory-service
```
**Validation queries (must pass):**
| Query Type | Expected Result |
|------------|-----------------|
| Agent-only (Helion) | Returns own docs, no other agents |
| Multi-agent (DAARWIZZ) | Returns from allowed agents only |
| Private visibility | Only owner sees private |
```bash
# Run smoke test
python3 scripts/qdrant_smoke_test.py --host localhost
```
### 5. Cutover
```bash
# Disable dual-read
export DUAL_READ_OLD=false
# Ensure no legacy writes
export DUAL_WRITE_OLD=false
# OR remove these env vars entirely
# Restart service
docker compose restart memory-service
# Verify service is healthy
curl -s http://localhost:8000/health
```
### 6. Post-Cutover Guard
```bash
# Keep legacy collections for rollback window (recommended: 7 days)
# DO NOT delete legacy collections yet
# After rollback window (7 days):
# 1. Run one more parity check
python3 scripts/qdrant_parity_check.py --all
# 2. If parity passes, delete legacy collections
# WARNING: This is irreversible
# python3 -c "
# from qdrant_client import QdrantClient
# client = QdrantClient(host='localhost', port=6333)
# legacy = ['helion_docs', 'nutra_messages', ...] # list all legacy
# for col in legacy:
# client.delete_collection(col)
# print(f'Deleted: {col}')
# "
```
---
## Rollback Procedure
If issues arise after cutover:
```bash
# 1. Re-enable dual-read from legacy
export DUAL_READ_OLD=true
export DUAL_WRITE_OLD=true # if needed
# 2. Restart service
docker compose restart memory-service
# 3. Investigate issues
# - Check migration logs
# - Check parity results
# - Review error messages
# 4. If canonical data is corrupted, switch to legacy-only mode:
# (requires code change to bypass canonical reads)
```
---
## Files Reference
| File | Purpose |
|------|---------|
| `services/memory/qdrant/` | Canonical Qdrant module |
| `docs/memory/canonical_collections.yaml` | Collection config |
| `docs/memory/cm_payload_v1.md` | Payload schema docs |
| `scripts/qdrant_migrate_to_canonical.py` | Migration tool |
| `scripts/qdrant_parity_check.py` | Parity verification |
| `scripts/qdrant_smoke_test.py` | Security smoke test |
---
## Sign-off
- [ ] Dry run passed
- [ ] Migration completed
- [ ] Parity check passed
- [ ] Dual-read validation passed
- [ ] Cutover completed
- [ ] Post-cutover health verified
- [ ] Rollback window started (7 days)
- [ ] Legacy collections deleted (after rollback window)


@@ -0,0 +1,142 @@
# Canonical Qdrant Collections Configuration
# Version: 1.0
# Last Updated: 2026-01-26
# Default embedding configuration
embedding:
  text:
    model: "cohere-embed-multilingual-v3"
    dim: 1024
    metric: "cosine"
  code:
    model: "openai-text-embedding-3-small"
    dim: 1536
    metric: "cosine"
# Canonical collections
collections:
  text:
    name: "cm_text_1024_v1"
    dim: 1024
    metric: "cosine"
    description: "Main text embeddings collection"
    payload_indexes:
      - field: "tenant_id"
        type: "keyword"
      - field: "team_id"
        type: "keyword"
      - field: "agent_id"
        type: "keyword"
      - field: "scope"
        type: "keyword"
      - field: "visibility"
        type: "keyword"
      - field: "indexed"
        type: "bool"
      - field: "source_id"
        type: "keyword"
      - field: "tags"
        type: "keyword"
      - field: "created_at"
        type: "datetime"
# Tenant configuration
tenants:
  daarion:
    id: "t_daarion"
    default_team: "team_core"
# Team configuration
teams:
  core:
    id: "team_core"
    tenant_id: "t_daarion"
# Agent slug mapping (legacy name -> canonical slug)
agent_slugs:
  helion: "agt_helion"
  Helion: "agt_helion"
  HELION: "agt_helion"
  nutra: "agt_nutra"
  Nutra: "agt_nutra"
  NUTRA: "agt_nutra"
  druid: "agt_druid"
  Druid: "agt_druid"
  DRUID: "agt_druid"
  greenfood: "agt_greenfood"
  Greenfood: "agt_greenfood"
  GREENFOOD: "agt_greenfood"
  agromatrix: "agt_agromatrix"
  AgroMatrix: "agt_agromatrix"
  AGROMATRIX: "agt_agromatrix"
  daarwizz: "agt_daarwizz"
  Daarwizz: "agt_daarwizz"
  DAARWIZZ: "agt_daarwizz"
  alateya: "agt_alateya"
  Alateya: "agt_alateya"
  ALATEYA: "agt_alateya"
# Legacy collection mapping rules
legacy_collection_mapping:
  # Pattern: collection_name_regex -> (agent_slug_group, scope, tags)
  patterns:
    - regex: "^([a-z]+)_docs$"
      agent_group: 1
      scope: "docs"
      tags: []
    - regex: "^([a-z]+)_messages$"
      agent_group: 1
      scope: "messages"
      tags: []
    - regex: "^([a-z]+)_memory_items$"
      agent_group: 1
      scope: "memory"
      tags: []
    - regex: "^([a-z]+)_artifacts$"
      agent_group: 1
      scope: "artifacts"
      tags: []
    - regex: "^druid_legal_kb$"
      agent_group: null
      agent_id: "agt_druid"
      scope: "docs"
      tags: ["legal_kb"]
    - regex: "^nutra_food_knowledge$"
      agent_group: null
      agent_id: "agt_nutra"
      scope: "docs"
      tags: ["food_kb"]
    - regex: "^memories$"
      agent_group: null
      scope: "memory"
      tags: []
    - regex: "^messages$"
      agent_group: null
      scope: "messages"
      tags: []
# Feature flags for migration
feature_flags:
  dual_write_enabled: false
  dual_read_enabled: false
  canonical_write_only: false
  legacy_read_fallback: true
# Defaults for migration
migration_defaults:
  visibility: "confidential"
  owner_kind: "agent"
  indexed: true


@@ -0,0 +1,216 @@
# Co-Memory Payload Schema v1 (cm_payload_v1)
**Version:** 1.0
**Status:** Canonical
**Last Updated:** 2026-01-26
## Overview
This document defines the canonical payload schema for all vectors stored in Qdrant across the DAARION platform. The schema enables:
- **Unlimited agents** without creating new collections
- **Fine-grained access control** via payload filters
- **Multi-tenant isolation** via tenant_id
- **Consistent querying** across all memory types
## Design Principles
1. **One collection = one embedding space** (same dim + metric)
2. **No per-agent collections** - agents identified by `agent_id` field
3. **Access control via payload** - visibility + ACL fields
4. **Stable identifiers** - ULIDs for all entities
---
## Collection Naming Convention
```
cm_<type>_<dim>_v<version>
```
Examples:
- `cm_text_1024_v1` - text embeddings, 1024 dimensions
- `cm_code_768_v1` - code embeddings, 768 dimensions
- `cm_mm_512_v1` - multimodal embeddings, 512 dimensions
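The convention can be captured in a one-line helper (a sketch; the function name is illustrative, the format follows the convention above):

```python
def collection_name(kind: str, dim: int, version: int = 1) -> str:
    """Build a canonical collection name: cm_<type>_<dim>_v<version>."""
    return f"cm_{kind}_{dim}_v{version}"
```

For example, `collection_name("text", 1024)` produces `cm_text_1024_v1`.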
---
## Payload Schema
### Required Fields (MVP)
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `schema_version` | string | Always `"cm_payload_v1"` | `"cm_payload_v1"` |
| `tenant_id` | string | Tenant identifier | `"t_daarion"` |
| `team_id` | string | Team identifier (nullable) | `"team_core"` |
| `project_id` | string | Project identifier (nullable) | `"proj_helion"` |
| `agent_id` | string | Agent identifier (nullable) | `"agt_helion"` |
| `owner_kind` | enum | Owner type | `"agent"` / `"team"` / `"user"` |
| `owner_id` | string | Owner identifier | `"agt_helion"` |
| `scope` | enum | Content type | `"docs"` / `"messages"` / `"memory"` / `"artifacts"` / `"signals"` |
| `visibility` | enum | Access level | `"public"` / `"confidential"` / `"private"` |
| `indexed` | boolean | Searchable by AI | `true` |
| `source_kind` | enum | Source type | `"document"` / `"wiki"` / `"message"` / `"artifact"` / `"web"` / `"code"` |
| `source_id` | string | Source identifier | `"doc_01HQ..."` |
| `chunk.chunk_id` | string | Chunk identifier | `"chk_01HQ..."` |
| `chunk.chunk_idx` | integer | Chunk index in source | `0` |
| `fingerprint` | string | Content hash (SHA256) | `"a1b2c3..."` |
| `created_at` | string | ISO 8601 timestamp | `"2026-01-26T12:00:00Z"` |
### Optional Fields (Recommended)
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `acl.read_team_ids` | array[string] | Teams with read access | `["team_core"]` |
| `acl.read_agent_ids` | array[string] | Agents with read access | `["agt_nutra"]` |
| `acl.read_role_ids` | array[string] | Roles with read access | `["role_admin"]` |
| `tags` | array[string] | Content tags | `["legal_kb", "contracts"]` |
| `lang` | string | Language code | `"uk"` / `"en"` |
| `importance` | float | Importance score 0-1 | `0.8` |
| `ttl_days` | integer | Auto-delete after N days | `365` |
| `embedding.model` | string | Embedding model ID | `"cohere-embed-v3"` |
| `embedding.dim` | integer | Vector dimension | `1024` |
| `embedding.metric` | string | Distance metric | `"cosine"` |
| `updated_at` | string | Last update timestamp | `"2026-01-26T12:00:00Z"` |
---
## Identifier Formats
| Entity | Prefix | Format | Example |
|--------|--------|--------|---------|
| Tenant | `t_` | `t_<slug>` | `t_daarion` |
| Team | `team_` | `team_<slug>` | `team_core` |
| Project | `proj_` | `proj_<slug>` | `proj_helion` |
| Agent | `agt_` | `agt_<slug>` | `agt_helion` |
| Document | `doc_` | `doc_<ulid>` | `doc_01HQXYZ...` |
| Message | `msg_` | `msg_<ulid>` | `msg_01HQXYZ...` |
| Artifact | `art_` | `art_<ulid>` | `art_01HQXYZ...` |
| Chunk | `chk_` | `chk_<ulid>` | `chk_01HQXYZ...` |
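A prefixed identifier can be minted along these lines. This is an illustrative ULID-style generator (48-bit millisecond timestamp plus 80 random bits, Crockford base32), not the platform's actual ID library:

```python
import secrets
import time

# Crockford base32 alphabet used by ULID (no I, L, O, U).
_B32 = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def _encode(value: int, length: int) -> str:
    """Encode an integer as fixed-width Crockford base32."""
    out = []
    for _ in range(length):
        out.append(_B32[value & 31])
        value >>= 5
    return "".join(reversed(out))

def make_id(prefix: str) -> str:
    """Illustrative ULID-style id: 10 chars of timestamp + 16 chars of
    randomness, with a type prefix (doc_, msg_, art_, chk_, ...)."""
    ts = int(time.time() * 1000)   # 48-bit ms timestamp -> 10 chars
    rand = secrets.randbits(80)    # 80 random bits -> 16 chars
    return f"{prefix}_{_encode(ts, 10)}{_encode(rand, 16)}"
```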
---
## Scope Enum
| Value | Description | Typical Sources |
|-------|-------------|-----------------|
| `docs` | Documents, knowledge bases | PDF, Google Docs, Wiki |
| `messages` | Conversations | Telegram, Slack, Email |
| `memory` | Agent memory items | Session notes, learned facts |
| `artifacts` | Generated content | Reports, presentations |
| `signals` | Events, notifications | System events |
---
## Visibility Enum
| Value | Access Rule |
|-------|-------------|
| `public` | Anyone in tenant/team can read |
| `confidential` | Owner + ACL-granted readers |
| `private` | Only owner can read |
---
## Access Control Rules
### Private Content
```python
visibility == "private" AND owner_kind == request.owner_kind AND owner_id == request.owner_id
```
### Confidential Content
```python
visibility == "confidential" AND (
(owner_kind == request.owner_kind AND owner_id == request.owner_id) OR
request.agent_id IN acl.read_agent_ids OR
request.team_id IN acl.read_team_ids OR
request.role_id IN acl.read_role_ids
)
```
### Public Content
```python
visibility == "public" AND team_id == request.team_id
```
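The three rules above can be combined into a single predicate. A sketch over plain dicts (the payload field names follow this schema; the request shape with `agent_id`/`team_id`/`role_id` keys is an assumption):

```python
def can_read(payload: dict, request: dict) -> bool:
    """Evaluate the private/confidential/public visibility rules."""
    owner_match = (
        payload["owner_kind"] == request.get("owner_kind")
        and payload["owner_id"] == request.get("owner_id")
    )
    vis = payload["visibility"]
    if vis == "private":
        return owner_match  # only the owner
    if vis == "confidential":
        acl = payload.get("acl", {})
        return (
            owner_match
            or request.get("agent_id") in acl.get("read_agent_ids", [])
            or request.get("team_id") in acl.get("read_team_ids", [])
            or request.get("role_id") in acl.get("read_role_ids", [])
        )
    if vis == "public":
        return payload.get("team_id") == request.get("team_id")
    return False  # unknown visibility -> deny
```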
---
## Migration Mapping (Legacy Collections)
| Old Collection Pattern | New Payload |
|------------------------|-------------|
| `helion_docs` | `agent_id="agt_helion"`, `scope="docs"` |
| `nutra_messages` | `agent_id="agt_nutra"`, `scope="messages"` |
| `druid_legal_kb` | `agent_id="agt_druid"`, `scope="docs"`, `tags=["legal_kb"]` |
| `nutra_food_knowledge` | `agent_id="agt_nutra"`, `scope="docs"`, `tags=["food_kb"]` |
| `*_memory_items` | `scope="memory"` |
| `*_artifacts` | `scope="artifacts"` |
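The mapping table above is regex-driven; a sketch of how a migration tool might apply it (rule list mirrors the table, with the special-case collections listed first so they win over the generic patterns):

```python
import re

# (pattern, builder) pairs; builders return (agent_id, scope, tags).
RULES = [
    (r"^druid_legal_kb$", lambda m: ("agt_druid", "docs", ["legal_kb"])),
    (r"^nutra_food_knowledge$", lambda m: ("agt_nutra", "docs", ["food_kb"])),
    (r"^([a-z]+)_docs$", lambda m: (f"agt_{m.group(1)}", "docs", [])),
    (r"^([a-z]+)_messages$", lambda m: (f"agt_{m.group(1)}", "messages", [])),
    (r"^([a-z]+)_memory_items$", lambda m: (f"agt_{m.group(1)}", "memory", [])),
    (r"^([a-z]+)_artifacts$", lambda m: (f"agt_{m.group(1)}", "artifacts", [])),
]

def map_legacy(collection: str):
    """Return (agent_id, scope, tags) for a legacy collection, else None."""
    for pattern, build in RULES:
        m = re.match(pattern, collection)
        if m:
            return build(m)
    return None
```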
---
## Example Payloads
### Document Chunk (Helion Knowledge Base)
```json
{
"schema_version": "cm_payload_v1",
"tenant_id": "t_daarion",
"team_id": "team_core",
"project_id": "proj_helion",
"agent_id": "agt_helion",
"owner_kind": "agent",
"owner_id": "agt_helion",
"scope": "docs",
"visibility": "confidential",
"indexed": true,
"source_kind": "document",
"source_id": "doc_01HQ8K9X2NPQR3FGJKLM5678",
"chunk": {
"chunk_id": "chk_01HQ8K9X3MPQR3FGJKLM9012",
"chunk_idx": 0
},
"fingerprint": "sha256:a1b2c3d4e5f6...",
"created_at": "2026-01-26T12:00:00Z",
"tags": ["product", "features"],
"lang": "uk",
"embedding": {
"model": "cohere-embed-multilingual-v3",
"dim": 1024,
"metric": "cosine"
}
}
```
### Message (Telegram Conversation)
```json
{
"schema_version": "cm_payload_v1",
"tenant_id": "t_daarion",
"team_id": "team_core",
"agent_id": "agt_helion",
"owner_kind": "user",
"owner_id": "user_tg_123456",
"scope": "messages",
"visibility": "private",
"indexed": true,
"source_kind": "message",
"source_id": "msg_01HQ8K9X4NPQR3FGJKLM3456",
"chunk": {
"chunk_id": "chk_01HQ8K9X5MPQR3FGJKLM7890",
"chunk_idx": 0
},
"fingerprint": "sha256:b2c3d4e5f6g7...",
"created_at": "2026-01-26T12:05:00Z",
"channel_id": "tg_chat_789"
}
```
---
## Changelog
- **v1.0** (2026-01-26): Initial canonical schema


@@ -0,0 +1,178 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://daarion.city/schemas/cm_payload_v1.json",
  "title": "Co-Memory Payload Schema v1",
  "description": "Canonical payload schema for Qdrant vectors in DAARION platform",
  "type": "object",
  "required": [
    "schema_version",
    "tenant_id",
    "owner_kind",
    "owner_id",
    "scope",
    "visibility",
    "indexed",
    "source_kind",
    "source_id",
    "chunk",
    "fingerprint",
    "created_at"
  ],
  "properties": {
    "schema_version": {
      "type": "string",
      "const": "cm_payload_v1",
      "description": "Schema version identifier"
    },
    "tenant_id": {
      "type": "string",
      "pattern": "^t_[a-z0-9_]+$",
      "description": "Tenant identifier (t_<slug>)"
    },
    "team_id": {
      "type": ["string", "null"],
      "pattern": "^team_[a-z0-9_]+$",
      "description": "Team identifier (team_<slug>)"
    },
    "project_id": {
      "type": ["string", "null"],
      "pattern": "^proj_[a-z0-9_]+$",
      "description": "Project identifier (proj_<slug>)"
    },
    "agent_id": {
      "type": ["string", "null"],
      "pattern": "^agt_[a-z0-9_]+$",
      "description": "Agent identifier (agt_<slug>)"
    },
    "owner_kind": {
      "type": "string",
      "enum": ["user", "team", "agent"],
      "description": "Type of owner"
    },
    "owner_id": {
      "type": "string",
      "minLength": 1,
      "description": "Owner identifier"
    },
    "scope": {
      "type": "string",
      "enum": ["docs", "messages", "memory", "artifacts", "signals"],
      "description": "Content type/scope"
    },
    "visibility": {
      "type": "string",
      "enum": ["public", "confidential", "private"],
      "description": "Access visibility level"
    },
    "indexed": {
      "type": "boolean",
      "description": "Whether content is searchable by AI"
    },
    "source_kind": {
      "type": "string",
      "enum": ["document", "wiki", "message", "artifact", "web", "code"],
      "description": "Type of source content"
    },
    "source_id": {
      "type": "string",
      "pattern": "^(doc|msg|art|web|code)_[A-Za-z0-9]+$",
      "description": "Source identifier with type prefix"
    },
    "chunk": {
      "type": "object",
      "required": ["chunk_id", "chunk_idx"],
      "properties": {
        "chunk_id": {
          "type": "string",
          "pattern": "^chk_[A-Za-z0-9]+$",
          "description": "Chunk identifier"
        },
        "chunk_idx": {
          "type": "integer",
          "minimum": 0,
          "description": "Chunk index within source"
        }
      }
    },
    "fingerprint": {
      "type": "string",
      "minLength": 1,
      "description": "Content hash for deduplication"
    },
    "created_at": {
      "type": "string",
      "format": "date-time",
      "description": "Creation timestamp (ISO 8601)"
    },
    "updated_at": {
      "type": ["string", "null"],
      "format": "date-time",
      "description": "Last update timestamp (ISO 8601)"
    },
    "acl": {
      "type": "object",
      "properties": {
        "read_team_ids": {
          "type": "array",
          "items": {"type": "string"},
          "description": "Teams with read access"
        },
        "read_agent_ids": {
          "type": "array",
          "items": {"type": "string"},
          "description": "Agents with read access"
        },
        "read_role_ids": {
          "type": "array",
          "items": {"type": "string"},
          "description": "Roles with read access"
        }
      }
    },
    "tags": {
      "type": "array",
      "items": {"type": "string"},
      "description": "Content tags for filtering"
    },
    "lang": {
      "type": ["string", "null"],
      "pattern": "^[a-z]{2}(-[A-Z]{2})?$",
      "description": "Language code (ISO 639-1)"
    },
    "importance": {
      "type": ["number", "null"],
      "minimum": 0,
      "maximum": 1,
      "description": "Importance score (0-1)"
    },
    "ttl_days": {
      "type": ["integer", "null"],
      "minimum": 1,
      "description": "Auto-delete after N days"
    },
    "channel_id": {
      "type": ["string", "null"],
      "description": "Channel/chat identifier for messages"
    },
    "embedding": {
      "type": "object",
      "properties": {
        "model": {
          "type": "string",
          "description": "Embedding model identifier"
        },
        "dim": {
          "type": "integer",
          "minimum": 1,
          "description": "Vector dimension"
        },
        "metric": {
          "type": "string",
          "enum": ["cosine", "dot", "euclidean"],
          "description": "Distance metric"
        }
      }
    }
  },
  "additionalProperties": true
}