fix(ops): Add network aliases and stabilize DNS for NODA1

- docker-compose.node1.yml: Add network aliases (router, gateway,
  memory-service, qdrant, nats, neo4j) to eliminate manual
  `docker network connect --alias` commands
- docker-compose.node1.yml: ROUTER_URL now uses env variable with
  fallback: ${ROUTER_URL:-http://router:8000}
- docker-compose.node1.yml: Increase router healthcheck start_period
  to 30s and retries to 5
- .gitignore: Add noda1-credentials.local.mdc (local-only SSH creds)
- scripts/node1/verify_agents.sh: Improve output to include the agent list
- docs: Add NODA1-AGENT-VERIFICATION.md, NODA1-AGENT-ARCHITECTURE.md,
  NODA1-VERIFICATION-REPORT-2026-02-03.md
- config/README.md: Document how to add new agents
- .cursor/rules/, .cursor/skills/: NODA1 operations skill for Cursor
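
For illustration, the `${ROUTER_URL:-http://router:8000}` fallback can be mirrored in Python. This is a minimal sketch, not code from this repository: `resolve_router_url` and `DEFAULT_ROUTER_URL` are hypothetical names. In Compose interpolation, the `:-` form applies the default when the variable is unset or empty.

```python
import os

# Default taken from the compose file; the constant name is hypothetical.
DEFAULT_ROUTER_URL = "http://router:8000"


def resolve_router_url(env=None):
    """Mimic ${ROUTER_URL:-http://router:8000}.

    The default is used when ROUTER_URL is unset OR set to an empty string,
    which is what the ':-' form means in Compose interpolation.
    """
    env = os.environ if env is None else env
    return env.get("ROUTER_URL") or DEFAULT_ROUTER_URL
```

With this logic, a host whose router container is still named `dagi-staging-router` can override the value in `.env` while other hosts keep the `router` alias.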

Root cause fixed: Gateway could not resolve the 'router' DNS name when
the Router container was named 'dagi-staging-router' and had no
network alias.
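
One way to confirm the fix from inside a container attached to `dagi-network` is to check that each alias resolves. The following is a sketch under assumptions: `unresolved_aliases` is a hypothetical helper, not part of this commit, and the alias list is copied from the change description above.

```python
import socket

# Aliases added in docker-compose.node1.yml by this commit.
ALIASES = ["router", "gateway", "memory-service", "qdrant", "nats", "neo4j"]


def unresolved_aliases(names, resolve=socket.gethostbyname):
    """Return the subset of names that fail DNS resolution.

    `resolve` is injectable so the check can be exercised without a network.
    """
    missing = []
    for name in names:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing
```

Run inside the gateway container, `unresolved_aliases(ALIASES)` would be expected to return an empty list once the aliases are in place.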

Co-authored-by: Cursor <cursoragent@cursor.com>
Apple
2026-02-03 05:55:56 -08:00
parent 8f046e7226
commit a46a70c014
10 changed files with 537 additions and 15 deletions


@@ -37,14 +37,16 @@ services:
       - ./services/router/router_config.yaml:/app/router_config.yaml:ro
       - ./logs:/app/logs
     networks:
-      - dagi-network
+      dagi-network:
+        aliases:
+          - router
     restart: unless-stopped
     healthcheck:
       test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8000/health')\""]
       interval: 30s
       timeout: 10s
-      retries: 3
-      start_period: 10s
+      retries: 5
+      start_period: 30s
 
   # Swapper Service for NODE1 - Dynamic LLM + OCR model loading
   swapper-service:
@@ -124,7 +126,8 @@ services:
     ports:
       - "9300:9300"
     environment:
-      - ROUTER_URL=http://router:8000
+      # On NODA1, if the router container is named dagi-staging-router, set ROUTER_URL=http://dagi-staging-router:8000 in .env
+      - ROUTER_URL=${ROUTER_URL:-http://router:8000}
       - SERVICE_ID=gateway
       - SERVICE_ROLE=gateway
       - BRAND_INTAKE_URL=http://brand-intake:9211
@@ -165,7 +168,9 @@ services:
       - router
       - memory-service
     networks:
-      - dagi-network
+      dagi-network:
+        aliases:
+          - gateway
     restart: unless-stopped
     healthcheck:
       test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:9300/health')\""]
@@ -184,7 +189,9 @@ services:
     volumes:
       - nats-data-node1:/data
     networks:
-      - dagi-network
+      dagi-network:
+        aliases:
+          - nats
     restart: unless-stopped
 
   # MinIO Object Storage
@@ -433,7 +440,9 @@ services:
     depends_on:
       - qdrant
     networks:
-      - dagi-network
+      dagi-network:
+        aliases:
+          - memory-service
     restart: unless-stopped
     healthcheck:
       test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8000/health')\""]
@@ -472,7 +481,9 @@ services:
     volumes:
       - qdrant-data-node1:/qdrant/storage
     networks:
-      - dagi-network
+      dagi-network:
+        aliases:
+          - qdrant
     restart: unless-stopped
     healthcheck:
       test: ["CMD-SHELL", "wget -qO- http://localhost:6333/healthz || exit 1"]
@@ -496,7 +507,9 @@ services:
       - neo4j-data-node1:/data
       - neo4j-logs-node1:/logs
     networks:
-      - dagi-network
+      dagi-network:
+        aliases:
+          - neo4j
     restart: unless-stopped
     healthcheck:
       test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:7474"]
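
The router, gateway, and memory-service healthchecks above inline a one-line Python probe. As a rough sketch of what that one-liner does, the same check can be written as a small function; `probe` is a hypothetical name, and the explicit timeout is an assumption added here (the inline version relies on the healthcheck's own `timeout: 10s`).

```python
import urllib.error
import urllib.request


def probe(url, timeout=10.0):
    """Return True if the URL answers with a non-error HTTP status.

    urllib raises HTTPError for 4xx/5xx responses and URLError (an OSError
    subclass) when the connection fails, so both count as unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        return False
```

In a healthcheck, a `False` result would correspond to the inline command exiting non-zero. With this commit, such failures during the first 30 seconds (`start_period`) do not count toward the 5 `retries`.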