Complete snapshot of /opt/microdao-daarion/ from NODE1 (144.76.224.179).
This represents the actual running production code that has diverged
significantly from the previous main branch.
Key changes from old main:
- Gateway (http_api.py): expanded from ~40KB to 164KB with full agent support
- Router: new /v1/agents/{id}/infer endpoint with vision + DeepSeek routing
- Behavior Policy: SOWA v2.2 (3-level: FULL/ACK/SILENT)
- Agent Registry: config/agent_registry.yml as single source of truth
- 13 agents configured (was 3)
- Memory service integration
- CrewAI teams and roles
Excluded from snapshot: venv/, .env, data/, backups, .tgz archives
Co-authored-by: Cursor <cursoragent@cursor.com>
- docker-compose.node1.yml: Add network aliases (router, gateway,
memory-service, qdrant, nats, neo4j) to eliminate manual
`docker network connect --alias` commands
- docker-compose.node1.yml: ROUTER_URL now uses env variable with
fallback: ${ROUTER_URL:-http://router:8000}
- docker-compose.node1.yml: Increase router healthcheck start_period
to 30s and retries to 5
- .gitignore: Add noda1-credentials.local.mdc (local-only SSH creds)
- scripts/node1/verify_agents.sh: Improved output with agent list
- docs: Add NODA1-AGENT-VERIFICATION.md, NODA1-AGENT-ARCHITECTURE.md,
NODA1-VERIFICATION-REPORT-2026-02-03.md
- config/README.md: How to add new agents
- .cursor/rules/, .cursor/skills/: NODA1 operations skill for Cursor
Root cause fixed: Gateway could not resolve 'router' DNS name when
Router container was named 'dagi-staging-router' without alias.
Co-authored-by: Cursor <cursoragent@cursor.com>
ROOT CAUSE: Monitor was doing DROP DATABASE when NODE2 agents were missing,
but the backup didn't have NODE2 agents, causing an infinite loop.
FIX:
- FULL RECOVERY (DROP DATABASE) only when MicroDAOs < 5 (critical data loss)
- SOFT RECOVERY (just sync agents) when MicroDAOs exist but agents missing
- Prefer backup with NODE2 agents (full_backup_with_node2*.sql)
- Never DROP DATABASE if MicroDAOs exist
This prevents the daily data loss issue.
- Check for at least 45 NODE2 agents (out of 50 expected)
- This prevents false positives when only core agents exist
- Better detection of actual data loss
- Add monitor-db-stability.sh for automatic recovery
- Improve PostgreSQL shutdown settings to prevent data loss
- Add checkpoint and WAL settings for better persistence
- This script was trying to assign test agents (ag_atlas, etc.) to NODE2
- Use sync-node2-dagi-agents.py instead for loading real agents
- Test agents are now automatically removed by health check
- Add apply-migrations.sh for automatic migration application
- Add ensure-db-persistence.sh for database integrity checks
- Add db-health-check.sh for periodic health monitoring
- Improve PostgreSQL configuration in docker-compose.db.yml
- Add proper shutdown settings to prevent data loss
- Fix get_dagi_router_agents to use router_healthy from node_cache first
- Fallback to direct API call only if cache is unavailable
- This fixes NODE2 agents showing as 'stale' when router is actually healthy
- Fix CITY_SERVICE_URL in scripts (remove /api/city, use /api)
- Add NGINX reverse proxy config for assets.daarion.space
- Add script to migrate assets from /static/uploads to MinIO
- Add script to update asset URLs in database after migration
- Created city-lobby room as main public chat with DAARWIZZ
- Fixed /api/city/rooms proxy to use correct backend path (/api/v1/city/rooms)
- Updated district rooms with zone keys (leadership, system, engineering, etc.)
- Set MicroDAO lobbies as primary rooms
- Created seed_city_rooms.py script
- Created TASK_PHASE_CITY_ROOMS_AND_PUBLIC_CHAT_v1.md
Total: 35 rooms, 31 public, 10 districts
- Add migration 042_node_cache_router_metrics.sql
- Node guardian now collects router health and sends in heartbeat
- City-service uses cached router_healthy from node_cache
- This allows NODE2 router status to be displayed correctly
- Add migration 041_node_local_endpoints.sql
- Add get_node_endpoints() to repo_city.py
- Update routes_city.py to use DB endpoints instead of hardcoded URLs
- Update node-guardian-loop.py to use NODE_SWAPPER_URL/NODE_ROUTER_URL env vars
- Update launchd plist for NODE2 with router URL