Data Retention Policy

Last Updated: 2026-01-19
Owner: Platform Team


Overview

This document defines data retention policies for all data stores in the DAARION platform to prevent unbounded storage growth.


1. NATS JetStream Streams

| Stream | Max Age | Max Size | Retention Policy | Action |
|---|---|---|---|---|
| MESSAGES | 7 days | 10GB | limits | Auto-purge after 7d |
| AGENT_RUNS | 7 days | 5GB | limits | Auto-purge after 7d |
| ATTACHMENTS | 30 days | 50GB | limits | Auto-purge after 30d |
| MEMORY | 30 days | 20GB | limits | Auto-purge after 30d |
| AUDIT | 90 days | 100GB | limits | Archive after 90d |

Configuration

# In NATS config
streams:
  MESSAGES:
    max_age: 7d
    max_bytes: 10GB
  AGENT_RUNS:
    max_age: 7d
    max_bytes: 5GB
  ATTACHMENTS:
    max_age: 30d
    max_bytes: 50GB
  MEMORY:
    max_age: 30d
    max_bytes: 20GB
  AUDIT:
    max_age: 90d
    max_bytes: 100GB
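
If the streams are created through the JetStream API instead of a config file, the same limits have to be given in the API's units: `max_age` in nanoseconds and `max_bytes` in bytes. The sketch below converts the table into those units. The names `POLICY` and `to_stream_config` are illustrative, and reading "GB" as 1024³ bytes is an assumption; neither comes from the platform's code.

```python
# Sketch: express the stream retention table in JetStream API units.
# POLICY and to_stream_config are illustrative names, not existing platform code.

NANOS_PER_DAY = 24 * 60 * 60 * 10**9
BYTES_PER_GB = 1024**3  # assumption: "GB" in the table means 1024^3 bytes

# stream name -> (max age in days, max size in GB), copied from the table
POLICY = {
    "MESSAGES": (7, 10),
    "AGENT_RUNS": (7, 5),
    "ATTACHMENTS": (30, 50),
    "MEMORY": (30, 20),
    "AUDIT": (90, 100),
}


def to_stream_config(name: str) -> dict:
    """Return the retention-related fields of a JetStream stream config."""
    days, gigabytes = POLICY[name]
    return {
        "name": name,
        "retention": "limits",
        "max_age": days * NANOS_PER_DAY,        # nanoseconds
        "max_bytes": gigabytes * BYTES_PER_GB,  # bytes
    }


if __name__ == "__main__":
    for stream in POLICY:
        print(to_stream_config(stream))
```

Keeping the policy in one dictionary lets the table and the deployed stream settings be compared mechanically.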

2. Application Logs

| Service | Retention | Location | Action |
|---|---|---|---|
| Gateway | 7 days | /opt/microdao-daarion/logs/gateway/ | Rotate daily, delete after 7d |
| Router | 7 days | /opt/microdao-daarion/logs/router/ | Rotate daily, delete after 7d |
| Memory Service | 7 days | /opt/microdao-daarion/logs/memory/ | Rotate daily, delete after 7d |
| All Workers | 7 days | /opt/microdao-daarion/logs/workers/ | Rotate daily, delete after 7d |

Log Rotation Script

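#!/bin/bash
# /opt/microdao-daarion/scripts/rotate_logs.sh
set -euo pipefail

LOG_DIR="/opt/microdao-daarion/logs"
RETENTION_DAYS=7

# Delete rotated logs (*.log.1, *.log.gz, ...) older than the retention window
find "$LOG_DIR" -name "*.log.*" -type f -mtime "+${RETENTION_DAYS}" -delete

# Compress plain *.log files that have not been written to within the window;
# the resulting *.log.gz files are removed by the delete pass on a later run
find "$LOG_DIR" -name "*.log" -type f -mtime "+${RETENTION_DAYS}" -exec gzip {} +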

Cron: 0 2 * * * /opt/microdao-daarion/scripts/rotate_logs.sh
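
One detail of the script is easy to misread: `find -mtime +7` discards the fractional part of a file's age in days, so it matches only files whose age is at least 8 full days. The sketch below reproduces that rule in Python so the delete pass can be dry-run without removing anything. The function names are hypothetical helpers, not existing platform code.

```python
# Sketch: dry-run of the delete pass, emulating `find -mtime +N` semantics.
# matches_mtime_plus and rotation_candidates are hypothetical helper names.
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86400


def matches_mtime_plus(path: Path, days: int, now: Optional[float] = None) -> bool:
    """True if `find -mtime +days` would match: whole 24h periods of age > days."""
    now = time.time() if now is None else now
    age_periods = int((now - path.stat().st_mtime) // SECONDS_PER_DAY)
    return age_periods > days


def rotation_candidates(log_dir: Path, days: int = 7,
                        now: Optional[float] = None) -> List[Path]:
    """List rotated logs (*.log.*) that the delete pass would remove."""
    return sorted(
        p for p in log_dir.rglob("*.log.*")
        if p.is_file() and matches_mtime_plus(p, days, now)
    )


if __name__ == "__main__":
    for candidate in rotation_candidates(Path("/opt/microdao-daarion/logs")):
        print(candidate)
```

Because the job runs once a day, a rotated log can be close to 9 days old before it is actually deleted.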


3. Attachments & Artifacts

| Type | Retention | Location | Action |
|---|---|---|---|
| Uploaded files | 30 days | /data/uploads/ | Delete after 30d of inactivity |
| Processed artifacts | 90 days | /data/artifacts/ | Archive to cold storage after 90d |
| Temporary files | 1 day | /tmp/ | Auto-cleanup on service restart |

Cleanup Script

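#!/bin/bash
# /opt/microdao-daarion/scripts/cleanup_attachments.sh
set -euo pipefail

UPLOAD_DIR="/data/uploads"
ARTIFACT_DIR="/data/artifacts"
ARCHIVE_DIR="/data/archive"

# Delete uploads not modified for more than 30 days
find "$UPLOAD_DIR" -type f -mtime +30 -delete

# Archive artifacts older than 90 days, keeping their relative paths.
# Passing the path as an argument avoids building an invalid archive name
# from the full path; a file is deleted only if its archive was created.
cd "$ARTIFACT_DIR"
find . -type f -mtime +90 -exec sh -c '
  dest="$1/${2#./}.tar.gz"
  mkdir -p "$(dirname "$dest")" && tar -czf "$dest" "$2"
' sh "$ARCHIVE_DIR" {} \; -delete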

Cron: 0 3 * * * /opt/microdao-daarion/scripts/cleanup_attachments.sh


4. Redis Cache

| Key Pattern | TTL | Action |
|---|---|---|
| idemp:* | 24 hours | Auto-expire (handled by Redis) |
| cache:* | 1 hour | Auto-expire |
| session:* | 7 days | Auto-expire |
| rate_limit:* | 1 hour | Auto-expire |

No manual cleanup is needed: Redis expires keys automatically, provided every key is written with its TTL.

Monitoring

# Check Redis memory usage
redis-cli INFO memory

# Check key counts by pattern
redis-cli --scan --pattern "idemp:*" | wc -l
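
Counting keys does not show whether they will ever expire. The Redis `TTL` command returns -1 for a key with no expiry and -2 for a key that does not exist, so a periodic check can flag keys that break the policy above. The sketch below takes the TTL lookup as a callable so it runs without a server; against a live instance the callable could be the `ttl` method of a redis-py client, which is an assumption about the tooling. `EXPECTED_TTL` and `keys_violating_ttl` are illustrative names.

```python
# Sketch: flag keys whose TTL is missing or longer than the policy allows.
# EXPECTED_TTL and keys_violating_ttl are illustrative names.
from typing import Callable, Iterable, List

# Expected TTLs in seconds per key prefix, copied from the table
EXPECTED_TTL = {
    "idemp:": 24 * 3600,
    "cache:": 3600,
    "session:": 7 * 24 * 3600,
    "rate_limit:": 3600,
}


def keys_violating_ttl(keys: Iterable[str],
                       ttl_of: Callable[[str], int]) -> List[str]:
    """Return keys that have no expiry or a TTL above the policy."""
    bad = []
    for key in keys:
        limit = next(
            (ttl for prefix, ttl in EXPECTED_TTL.items() if key.startswith(prefix)),
            None,
        )
        if limit is None:
            continue  # key pattern is not covered by the policy
        ttl = ttl_of(key)  # Redis TTL: -1 = no expiry, -2 = key does not exist
        if ttl == -1 or ttl > limit:
            bad.append(key)
    return bad


if __name__ == "__main__":
    # Stand-in for a live Redis: key -> remaining TTL in seconds
    fake = {"idemp:1": 120, "cache:a": -1, "session:x": 900000, "other": -1}
    print(keys_violating_ttl(fake, fake.__getitem__))
```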

5. PostgreSQL

| Table | Retention | Action |
|---|---|---|
| sessions | 30 days | DELETE FROM sessions WHERE created_at < NOW() - INTERVAL '30 days' |
| audit_log | 90 days | DELETE FROM audit_log WHERE timestamp < NOW() - INTERVAL '90 days' |
| facts | Keep all | No deletion (critical data) |
| helion_mentors | Keep all | No deletion (critical data) |

Cleanup Script

-- /opt/microdao-daarion/scripts/cleanup_postgres.sql
BEGIN;

DELETE FROM sessions
WHERE created_at < NOW() - INTERVAL '30 days';

DELETE FROM audit_log
WHERE timestamp < NOW() - INTERVAL '90 days';

COMMIT;

-- VACUUM cannot run inside a transaction block, so it follows the COMMIT
VACUUM ANALYZE sessions;
VACUUM ANALYZE audit_log;

Cron: 0 4 * * 0 psql -U daarion -d daarion_main -f /opt/microdao-daarion/scripts/cleanup_postgres.sql


6. Qdrant Collections

| Collection | Retention | Action |
|---|---|---|
| *_messages | 90 days | Delete points older than 90d |
| *_docs | 180 days | Delete points older than 180d |
| *_kb | Keep all | No deletion (knowledge base) |

Cleanup Script

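# /opt/microdao-daarion/scripts/cleanup_qdrant.py
from datetime import datetime, timedelta, timezone

from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://qdrant:6333")

# Retention in days per collection-name suffix, matching the table above.
# *_kb collections have no entry and are never cleaned.
RETENTION_DAYS = {"_messages": 90, "_docs": 180}

now = datetime.now(timezone.utc)

for info in client.get_collections().collections:
    for suffix, days in RETENTION_DAYS.items():
        if not info.name.endswith(suffix):
            continue
        cutoff = now - timedelta(days=days)
        # Assumes the "timestamp" payload field stores RFC 3339 datetimes;
        # DatetimeRange needs Qdrant and qdrant-client 1.8 or newer.
        client.delete(
            collection_name=info.name,
            points_selector=models.FilterSelector(
                filter=models.Filter(
                    must=[
                        models.FieldCondition(
                            key="timestamp",
                            range=models.DatetimeRange(lt=cutoff),
                        )
                    ]
                )
            ),
        )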

Cron: 0 5 * * 0 python3 /opt/microdao-daarion/scripts/cleanup_qdrant.py


7. Neo4j Graph Database

| Node Type | Retention | Action |
|---|---|---|
| Events | 90 days | Delete nodes with timestamp < NOW() - 90 days |
| Relationships | Keep all | No deletion (structural data) |
| User nodes | Keep all | No deletion (identity data) |

Cleanup Script

// /opt/microdao-daarion/scripts/cleanup_neo4j.cypher
MATCH (e:Event)
WHERE e.timestamp < datetime() - duration({days: 90})
DETACH DELETE e;

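Cron: 0 6 * * 0 cypher-shell -u neo4j -f /opt/microdao-daarion/scripts/cleanup_neo4j.cypher

Supply the password through the NEO4J_PASSWORD environment variable, which cypher-shell reads, instead of writing it into the crontab or this document.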


8. Monitoring & Alerts

Storage Usage Alerts

  • Disk usage > 80%: Alert to ops channel
  • Stream size > 80% of max: Alert to ops channel
  • Log directory > 10GB: Alert to ops channel
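
The three thresholds above can be checked with the standard library alone. The sketch below is one possible shape for such a check. The function names are hypothetical, how the current stream size is obtained is left to the caller, and delivery to the ops channel is not shown.

```python
# Sketch: evaluate the storage alert thresholds listed above.
# All function names are hypothetical; alert delivery is out of scope.
import os
import shutil
from typing import Optional

DISK_ALERT_PERCENT = 80.0
STREAM_ALERT_FRACTION = 0.8
LOG_DIR_ALERT_BYTES = 10 * 1024**3  # assumption: "10GB" means 10 * 1024^3 bytes


def percent_used(used: int, total: int) -> float:
    """Used space as a percentage of total (0.0 for an empty filesystem)."""
    return 100.0 * used / total if total else 0.0


def disk_alert(path: str = "/",
               threshold: float = DISK_ALERT_PERCENT) -> Optional[str]:
    """Return an alert message if disk usage at `path` exceeds the threshold."""
    usage = shutil.disk_usage(path)
    pct = percent_used(usage.used, usage.total)
    if pct > threshold:
        return f"Disk usage at {path} is {pct:.1f}% (threshold {threshold}%)"
    return None


def stream_over_limit(current_bytes: int, max_bytes: int,
                      fraction: float = STREAM_ALERT_FRACTION) -> bool:
    """True if a stream has used more than `fraction` of its size limit."""
    return current_bytes > fraction * max_bytes


def dir_size_bytes(root: str) -> int:
    """Total size of all regular files under `root`."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


if __name__ == "__main__":
    print(disk_alert("/") or "disk usage OK")
```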

Metrics to Track

  • Total storage used per service
  • Retention policy compliance (age of oldest data)
  • Cleanup job success/failure rates
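
For file-based stores, retention compliance can be approximated by the age of the oldest file in a directory. The sketch below computes it and compares it with the retention period. The helper names are hypothetical, and the default two-day grace period is an assumption: it reflects that `find -mtime +N` combined with a daily job can leave a file in place until it is almost N + 2 days old.

```python
# Sketch: "age of oldest data" metric for a directory tree.
# oldest_file_age_days and is_compliant are hypothetical helper names.
import time
from pathlib import Path
from typing import Optional


def oldest_file_age_days(root: Path, now: Optional[float] = None) -> Optional[float]:
    """Age in days of the oldest file under root, or None if there are no files."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in root.rglob("*") if p.is_file()]
    if not mtimes:
        return None
    return (now - min(mtimes)) / 86400


def is_compliant(root: Path, retention_days: int, grace_days: int = 2,
                 now: Optional[float] = None) -> bool:
    """True if nothing under root is older than retention plus the grace period."""
    age = oldest_file_age_days(root, now)
    return age is None or age <= retention_days + grace_days


if __name__ == "__main__":
    logs = Path("/opt/microdao-daarion/logs")
    print(oldest_file_age_days(logs), is_compliant(logs, retention_days=7))
```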

9. Emergency Cleanup

If storage is critically low:

  1. Immediate actions:

    # Delete old logs
    find /opt/microdao-daarion/logs -name "*.log.*" -mtime +3 -delete
    
    # Delete old uploads
    find /data/uploads -type f -mtime +7 -delete
    
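    # Vacuum PostgreSQL. Note: VACUUM FULL locks each table exclusively and
    # needs free disk space for a full copy of the table while it rewrites it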
    psql -U daarion -d daarion_main -c "VACUUM FULL;"
    
  2. Reduce retention temporarily:

    • Set NATS stream max_age to 3 days
    • Reduce log retention to 3 days
    • Archive old attachments immediately

10. Compliance Notes

  • Audit logs: Keep for 90 days minimum (compliance requirement)
  • User data: Follow GDPR and delete on request
  • Backups: Keep for 30 days, then archive to cold storage

Review Schedule

  • Weekly: Check storage usage metrics
  • Monthly: Review retention policy effectiveness
  • Quarterly: Adjust retention periods based on usage patterns