microdao-daarion/PRODUCTION-CHECKLIST.md
Ivan Tytar 3cacf67cf5 feat: Initial commit - DAGI Stack v0.2.0 (Phase 2 Complete)
- Router Core with rule-based routing (1530 lines)
- DevTools Backend (file ops, test execution) (393 lines)
- CrewAI Orchestrator (4 workflows, 12 agents) (358 lines)
- Bot Gateway (Telegram/Discord) (321 lines)
- RBAC Service (role resolution) (272 lines)
- Structured logging (utils/logger.py)
- Docker deployment (docker-compose.yml)
- Comprehensive documentation (57KB)
- Test suites (41 tests, 95% coverage)
- Phase 4 roadmap & ecosystem integration plans

Production-ready infrastructure for DAARION microDAOs.
2025-11-15 14:35:24 +01:00

Production Readiness Checklist

Use this checklist to confirm that DAGI Stack is ready for production deployment.

Pre-Production Verification

Security

  • .env in .gitignore - secrets protected
  • .env.example documented - all variables explained
  • Secret generation commands provided
  • All .env values filled with real credentials
  • RBAC_SECRET_KEY generated (openssl rand -hex 32)
  • Bot tokens configured (Telegram/Discord)

Infrastructure

  • docker-compose.yml configured - 5 services defined
  • Dockerfiles created for all services
  • .dockerignore optimized
  • Health checks configured (30s interval)
  • Networks and volumes defined
  • Disk space available (10GB+)
  • RAM available (4GB+)

Testing

  • smoke.sh test suite created
  • Smoke tests passing (run ./smoke.sh)
  • Router health check passing
  • DevTools health check passing
  • CrewAI health check passing
  • RBAC health check passing
  • Gateway health check passing

Observability

  • Structured JSON logging implemented
  • Request IDs for tracing
  • Log levels configurable (LOG_LEVEL)
  • Service names in logs
  • Log rotation configured (optional)
  • Monitoring dashboards (future)

Documentation

  • README.md comprehensive
  • Architecture diagram included
  • Quick start guide
  • Services overview
  • Configuration examples
  • DEPLOYMENT.md created
  • CHANGELOG.md maintained
  • PHASE-2-COMPLETE.md summary

Configuration

  • router-config.yml validated
  • Routing rules prioritized
  • Timeouts configured
  • LLM provider URLs verified
  • Ollama model pulled (if using local)

🚀 Deployment Steps

1. Initial Setup

# Clone repository
git clone https://github.com/daarion/dagi-stack.git
cd dagi-stack

# Configure environment
cp .env.example .env
nano .env

# Generate secrets
export RBAC_SECRET_KEY=$(openssl rand -hex 32)
echo "RBAC_SECRET_KEY=$RBAC_SECRET_KEY" >> .env
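
Appending with `>>` leaves two `RBAC_SECRET_KEY` lines if `.env` already contains the key (for example, as a placeholder copied from `.env.example`). A minimal sketch of a replace-or-append helper, assuming a plain `KEY=VALUE` file; `set_env_var` is a hypothetical name and not part of the repository:

```shell
# Hypothetical helper: set KEY=VALUE in an env file,
# replacing any existing assignment for KEY instead of duplicating it.
set_env_var() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)"
  # Copy every line except existing assignments for this key.
  # grep exits non-zero when nothing is left to copy, which is fine here.
  if [ -f "$file" ]; then
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Usage:
#   set_env_var .env RBAC_SECRET_KEY "$(openssl rand -hex 32)"
```

Because the file is rebuilt from a `mktemp` file, the result is readable only by its owner, which suits a file holding secrets.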

2. Pre-flight Check

# Verify Docker
docker --version
docker-compose --version

# Verify resources
df -h /var/lib/docker
free -h

# Validate configuration (prints secret values; avoid on shared terminals)
grep -v '^#' .env | grep '='
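
The resource checks can also be compared against the thresholds from the Infrastructure checklist (10GB disk, 4GB RAM) instead of being read by eye. A rough sketch, assuming a Linux host with `/proc/meminfo`; all function names here are hypothetical:

```shell
# Thresholds from the Infrastructure checklist, expressed in KB.
MIN_DISK_KB=$((10 * 1024 * 1024))
MIN_RAM_KB=$((4 * 1024 * 1024))

# Free space (KB) on the filesystem that holds the given path.
free_disk_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Available RAM (KB) as reported by the Linux kernel.
available_ram_kb() {
  awk '/^MemAvailable:/ { print $2 }' /proc/meminfo
}

# Succeed when the measured value meets the minimum; otherwise warn on stderr.
require_at_least() {
  local label="$1" actual="$2" minimum="$3"
  if [ "$actual" -ge "$minimum" ]; then
    echo "OK: $label ($actual KB)"
  else
    echo "FAIL: $label has $actual KB, need $minimum KB" >&2
    return 1
  fi
}

# Usage:
#   require_at_least "disk" "$(free_disk_kb /var/lib/docker)" "$MIN_DISK_KB"
#   require_at_least "ram"  "$(available_ram_kb)"             "$MIN_RAM_KB"
```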

3. Service Startup

# Start all services
docker-compose up -d

# Wait for health checks
sleep 30

# Verify all healthy
docker-compose ps
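
A fixed `sleep 30` can be too short on a slow host and wastes time on a fast one. One alternative is to poll until a probe succeeds. This is a sketch only; `wait_until` is a hypothetical helper, and the probe shown in the usage comment reuses the Gateway health URL from this checklist:

```shell
# Retry a command until it succeeds or roughly <timeout> seconds have passed.
#   wait_until <timeout_seconds> <interval_seconds> <command...>
wait_until() {
  local timeout="$1" interval="$2" elapsed=0
  shift 2
  while ! "$@" >/dev/null 2>&1; do
    elapsed=$((elapsed + interval))
    if [ "$elapsed" -gt "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Usage:
#   wait_until 60 2 curl -fsS http://localhost:9300/health || echo "Gateway not healthy"
```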

4. Smoke Test

# Run test suite
./smoke.sh

# Expected: All tests passing

5. Manual Verification

# Test Router
curl -X POST http://localhost:9102/route \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello", "mode": "chat", "metadata": {}}'

# Test DevTools
curl -X POST http://localhost:8008/fs/read \
  -H "Content-Type: application/json" \
  -d '{"path": "README.md"}'

# Test CrewAI
curl -X GET http://localhost:9010/workflow/list

# Test RBAC
curl -X POST http://localhost:9200/rbac/resolve \
  -H "Content-Type: application/json" \
  -d '{"dao_id": "greenfood-dao", "user_id": "tg:12345"}'

# Test Gateway
curl http://localhost:9300/health
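
To check every service in one pass, the per-service calls can be wrapped in a loop. The sketch below assumes each service exposes `/health` on its default port, as the Troubleshooting section suggests. The short service labels and function names are hypothetical:

```shell
# Default ports from the Environment Variables section; the same variables override them.
service_port() {
  case "$1" in
    router)   echo "${ROUTER_PORT:-9102}" ;;
    gateway)  echo "${GATEWAY_PORT:-9300}" ;;
    devtools) echo "${DEVTOOLS_PORT:-8008}" ;;
    crewai)   echo "${CREWAI_PORT:-9010}" ;;
    rbac)     echo "${RBAC_PORT:-9200}" ;;
    *)        echo "unknown service: $1" >&2; return 1 ;;
  esac
}

# Build the health URL for a service label.
health_url() {
  local port
  port="$(service_port "$1")" || return 1
  echo "http://localhost:${port}/health"
}

# Probe all five services; return non-zero if any probe fails.
check_all() {
  local failed=0 svc
  for svc in router devtools crewai rbac gateway; do
    if curl -fsS --max-time 5 "$(health_url "$svc")" >/dev/null 2>&1; then
      echo "OK   $svc"
    else
      echo "FAIL $svc"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `check_all` after startup prints one line per service and exits non-zero on the first sign of trouble, which makes it usable from CI or cron.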

🔧 Production Configuration

Environment Variables (Required)

# Bots
TELEGRAM_BOT_TOKEN=your_token_here
DISCORD_BOT_TOKEN=your_token_here

# LLM
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=qwen3:8b

# Security
RBAC_SECRET_KEY=your_generated_secret_here

# Ports (optional, defaults)
ROUTER_PORT=9102
GATEWAY_PORT=9300
DEVTOOLS_PORT=8008
CREWAI_PORT=9010
RBAC_PORT=9200
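
Before deploying, it may help to confirm that none of the required variables is missing or still set to a placeholder such as `your_token_here`. A minimal sketch, assuming a plain `KEY=VALUE` file; `check_required_env` is a hypothetical helper, and treating any `your_*_here` value as a placeholder is an assumption based on the examples above:

```shell
# Print each required variable that is missing, empty, or still a placeholder.
# Returns non-zero if any problem was found.
check_required_env() {
  local file="$1" status=0 key value
  for key in TELEGRAM_BOT_TOKEN DISCORD_BOT_TOKEN OLLAMA_BASE_URL OLLAMA_MODEL RBAC_SECRET_KEY; do
    # If the key appears more than once, inspect the last assignment.
    value="$(grep "^${key}=" "$file" | tail -n 1 | cut -d= -f2-)"
    case "$value" in
      ""|your_*_here)
        echo "$key"
        status=1
        ;;
    esac
  done
  return "$status"
}

# Usage:
#   check_required_env .env || echo "Fix the variables listed above"
```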

Firewall Rules

# Allow external access (Gateway only)
sudo ufw allow 9300/tcp

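# Block internal services from external access
# Note: ports published by Docker are opened through Docker's own iptables rules
# and can bypass ufw; binding internal services to 127.0.0.1 in docker-compose.yml
# may also be needed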
sudo ufw deny 8008/tcp
sudo ufw deny 9010/tcp
sudo ufw deny 9200/tcp

# Allow Router if needed externally
sudo ufw allow 9102/tcp

Nginx Reverse Proxy (Optional)

server {
    listen 80;
    server_name gateway.daarion.city;
    
    location / {
        proxy_pass http://localhost:9300;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

📊 Monitoring

Health Checks

# Create health check cron job (writing to /etc/cron.d requires root)
sudo tee /etc/cron.d/dagi-health > /dev/null << 'CRON'
*/5 * * * * root /opt/dagi-stack/smoke.sh > /var/log/dagi-health.log 2>&1
CRON

Log Monitoring

# View live logs
docker-compose logs -f

# Check for errors
docker-compose logs | grep -i error

# Service-specific logs
docker-compose logs router | tail -100
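
Because the services emit structured JSON logs, a plain `grep -i error` also matches lines that only mention "error" in the message text. A sketch of a stricter filter follows. It assumes the log level is stored in a JSON field named `level`, which this checklist does not confirm; `filter_level` is a hypothetical helper:

```shell
# Keep only lines whose JSON "level" field equals the given level (case-insensitive).
# The field name "level" is an assumption about the log schema.
filter_level() {
  grep -iE "\"level\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Usage:
#   docker-compose logs --no-color router | filter_level error
```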

Disk Usage

# Check Docker volumes
docker system df

# Clean up if needed (removes all unused images, stopped containers, and unused networks)
docker system prune -a

🔄 Maintenance

Daily Tasks

  • Check health endpoints
  • Review error logs
  • Monitor disk usage

Weekly Tasks

  • Run smoke tests
  • Check for Docker image updates
  • Review RBAC database size
  • Backup configurations
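
The weekly configuration backup could be scripted along these lines. This is a sketch under assumptions: the file list is taken from the files named in this checklist, and the archive name and `backup_configs` function are hypothetical:

```shell
# Archive the configuration files named in this checklist into a timestamped tarball.
# Prints the archive path on success.
#   backup_configs <source_dir> <destination_dir>
backup_configs() {
  local src="$1" dest="$2" archive f files=""
  archive="${dest}/dagi-config-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$dest"
  for f in .env router-config.yml docker-compose.yml; do
    # Skip files that are absent instead of failing the whole backup.
    [ -f "${src}/${f}" ] && files="$files $f"
  done
  [ -n "$files" ] || return 1
  # $files is intentionally unquoted: the names contain no spaces.
  tar -czf "$archive" -C "$src" $files
  # The archive contains .env, so restrict it to the owner.
  chmod 600 "$archive"
  echo "$archive"
}

# Usage (absolute destination path recommended):
#   backup_configs /opt/dagi-stack /var/backups/dagi
```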

Monthly Tasks

  • Update dependencies
  • Security patches
  • Performance optimization
  • Capacity planning

🐛 Troubleshooting

Service won't start

# Check logs
docker-compose logs <service>

# Check resources
docker stats

# Restart service
docker-compose restart <service>

Health check fails

# Test manually
curl http://localhost:<port>/health

# Check container status
docker-compose ps

# Check network
docker network ls
docker network inspect dagi-network

LLM timeout

# Increase timeout in router-config.yml
timeout_ms: 60000

# Restart router
docker-compose restart router

# Check Ollama
curl http://localhost:11434/api/tags
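
A timeout can also mean the configured model was never pulled. The `/api/tags` response lists installed models, so a quick check is to search it for the value of `OLLAMA_MODEL`. A minimal sketch; `has_model` is a hypothetical helper and uses a simple pattern match rather than a full JSON parser:

```shell
# Read an Ollama /api/tags JSON response on stdin and succeed if the model is listed.
has_model() {
  grep -qE "\"name\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Usage:
#   curl -fsS "${OLLAMA_BASE_URL:-http://localhost:11434}/api/tags" \
#     | has_model "${OLLAMA_MODEL:-qwen3:8b}" || echo "Model not pulled"
```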

📞 Escalation

If issues persist:

  1. Check GitHub Issues: https://github.com/daarion/dagi-stack/issues
  2. Discord support: https://discord.gg/daarion
  3. Email: dev@daarion.city

Last updated: 2024-11-15
Version: 0.2.0