# Co-Memory Payload Schema v1 (cm_payload_v1)

**Version:** 1.0
**Status:** Canonical
**Last Updated:** 2026-01-26

## Overview

This document defines the canonical payload schema for all vectors stored in Qdrant across the DAARION platform.

The schema enables:

- **Unlimited agents** without creating new collections
- **Fine-grained access control** via payload filters
- **Multi-tenant isolation** via `tenant_id`
- **Consistent querying** across all memory types

## Design Principles

1. **One collection = one embedding space** (same dim + metric)
2. **No per-agent collections** - agents are identified by the `agent_id` field
3. **Access control via payload** - visibility + ACL fields
4. **Stable identifiers** - ULIDs for all entities

---

## Collection Naming Convention

```
cm_<kind>_<dim>_v<version>
```

Examples:

- `cm_text_1024_v1` - text embeddings, 1024 dimensions
- `cm_code_768_v1` - code embeddings, 768 dimensions
- `cm_mm_512_v1` - multimodal embeddings, 512 dimensions

---

## Payload Schema

### Required Fields (MVP)

| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `schema_version` | string | Always `"cm_payload_v1"` | `"cm_payload_v1"` |
| `tenant_id` | string | Tenant identifier | `"t_daarion"` |
| `team_id` | string | Team identifier (nullable) | `"team_core"` |
| `project_id` | string | Project identifier (nullable) | `"proj_helion"` |
| `agent_id` | string | Agent identifier (nullable) | `"agt_helion"` |
| `owner_kind` | enum | Owner type | `"agent"` / `"team"` / `"user"` |
| `owner_id` | string | Owner identifier | `"agt_helion"` |
| `scope` | enum | Content type | `"docs"` / `"messages"` / `"memory"` / `"artifacts"` / `"signals"` |
| `visibility` | enum | Access level | `"public"` / `"confidential"` / `"private"` |
| `indexed` | boolean | Searchable by AI | `true` |
| `source_kind` | enum | Source type | `"document"` / `"wiki"` / `"message"` / `"artifact"` / `"web"` / `"code"` |
| `source_id` | string | Source identifier | `"doc_01HQ..."` |
| `chunk.chunk_id` | string | Chunk identifier | `"chk_01HQ..."` |
| `chunk.chunk_idx` | integer | Chunk index in source | `0` |
| `fingerprint` | string | Content hash (SHA-256) | `"sha256:a1b2c3..."` |
| `created_at` | string | ISO 8601 timestamp | `"2026-01-26T12:00:00Z"` |

### Optional Fields (Recommended)

| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `acl.read_team_ids` | array[string] | Teams with read access | `["team_core"]` |
| `acl.read_agent_ids` | array[string] | Agents with read access | `["agt_nutra"]` |
| `acl.read_role_ids` | array[string] | Roles with read access | `["role_admin"]` |
| `tags` | array[string] | Content tags | `["legal_kb", "contracts"]` |
| `lang` | string | Language code | `"uk"` / `"en"` |
| `importance` | float | Importance score, 0 to 1 | `0.8` |
| `ttl_days` | integer | Auto-delete after N days | `365` |
| `embedding.model` | string | Embedding model ID | `"cohere-embed-v3"` |
| `embedding.dim` | integer | Vector dimension | `1024` |
| `embedding.metric` | string | Distance metric | `"cosine"` |
| `updated_at` | string | Last update timestamp (ISO 8601) | `"2026-01-26T12:00:00Z"` |

---

## Identifier Formats

| Entity | Prefix | Format | Example |
|--------|--------|--------|---------|
| Tenant | `t_` | `t_<slug>` | `t_daarion` |
| Team | `team_` | `team_<slug>` | `team_core` |
| Project | `proj_` | `proj_<slug>` | `proj_helion` |
| Agent | `agt_` | `agt_<slug>` | `agt_helion` |
| Document | `doc_` | `doc_<ulid>` | `doc_01HQXYZ...` |
| Message | `msg_` | `msg_<ulid>` | `msg_01HQXYZ...` |
| Artifact | `art_` | `art_<ulid>` | `art_01HQXYZ...` |
| Chunk | `chk_` | `chk_<ulid>` | `chk_01HQXYZ...` |

---

## Scope Enum

| Value | Description | Typical Sources |
|-------|-------------|-----------------|
| `docs` | Documents, knowledge bases | PDF, Google Docs, Wiki |
| `messages` | Conversations | Telegram, Slack, Email |
| `memory` | Agent memory items | Session notes, learned facts |
| `artifacts` | Generated content | Reports, presentations |
| `signals` | Events, notifications | System events |

---

## Visibility Enum

| Value | Access Rule |
|-------|-------------|
| `public` | Anyone in the tenant/team can read |
| `confidential` | Owner + ACL-granted readers |
| `private` | Only the owner can read |

---

## Access Control Rules

In the rules below, bare names are payload fields and `request` is the caller's identity.

### Private Content

```python
(
    visibility == "private"
    and owner_kind == request.owner_kind
    and owner_id == request.owner_id
)
```

### Confidential Content

```python
(
    visibility == "confidential"
    and (
        (owner_kind == request.owner_kind and owner_id == request.owner_id)
        or request.agent_id in acl.read_agent_ids
        or request.team_id in acl.read_team_ids
        or request.role_id in acl.read_role_ids
    )
)
```

### Public Content

```python
(
    visibility == "public"
    and team_id == request.team_id
)
```

---

## Migration Mapping (Legacy Collections)

| Old Collection Pattern | New Payload |
|------------------------|-------------|
| `helion_docs` | `agent_id="agt_helion"`, `scope="docs"` |
| `nutra_messages` | `agent_id="agt_nutra"`, `scope="messages"` |
| `druid_legal_kb` | `agent_id="agt_druid"`, `scope="docs"`, `tags=["legal_kb"]` |
| `nutra_food_knowledge` | `agent_id="agt_nutra"`, `scope="docs"`, `tags=["food_kb"]` |
| `*_memory_items` | `scope="memory"` |
| `*_artifacts` | `scope="artifacts"` |

---

## Example Payloads

### Document Chunk (Helion Knowledge Base)

```json
{
  "schema_version": "cm_payload_v1",
  "tenant_id": "t_daarion",
  "team_id": "team_core",
  "project_id": "proj_helion",
  "agent_id": "agt_helion",
  "owner_kind": "agent",
  "owner_id": "agt_helion",
  "scope": "docs",
  "visibility": "confidential",
  "indexed": true,
  "source_kind": "document",
  "source_id": "doc_01HQ8K9X2NPQR3FGJKLM5678",
  "chunk": {
    "chunk_id": "chk_01HQ8K9X3MPQR3FGJKLM9012",
    "chunk_idx": 0
  },
  "fingerprint": "sha256:a1b2c3d4e5f6...",
  "created_at": "2026-01-26T12:00:00Z",
  "tags": ["product", "features"],
  "lang": "uk",
  "embedding": {
    "model": "cohere-embed-multilingual-v3",
    "dim": 1024,
    "metric": "cosine"
  }
}
```

### Message (Telegram Conversation)

```json
{
  "schema_version": "cm_payload_v1",
  "tenant_id": "t_daarion",
  "team_id": "team_core",
  "agent_id": "agt_helion",
  "owner_kind": "user",
  "owner_id": "user_tg_123456",
  "scope": "messages",
  "visibility": "private",
  "indexed": true,
  "source_kind": "message",
  "source_id": "msg_01HQ8K9X4NPQR3FGJKLM3456",
  "chunk": {
    "chunk_id": "chk_01HQ8K9X5MPQR3FGJKLM7890",
    "chunk_idx": 0
  },
  "fingerprint": "sha256:b2c3d4e5f6a7...",
  "created_at": "2026-01-26T12:05:00Z",
  "channel_id": "tg_chat_789"
}
```

---

## Changelog

- **v1.0** (2026-01-26): Initial canonical schema