# AGENT REGISTRY Decision (NODE1 Runtime)

Date: 2026-02-16

Scope: Decide how to reconcile `config/agent_registry.yml` with the real NODE1 runtime architecture.

Source policy: Runtime-first (facts from `/opt/microdao-daarion` on NODE1).

## Runtime Facts (Verified)

### NODE1 current state
- Runtime root: `/opt/microdao-daarion`
- Branch/HEAD: `codex/inventory-audit-20260214` / `6fcd406d36fa04be78073c039bca759baea10e7b`
- Core health:
  - Router `9102` healthy
  - Gateway `9300` healthy
  - Swapper `8890` healthy
- Canary status:
  - `ops/canary_all.sh` -> PASS
  - `ops/canary_senpai_osr_guard.sh` -> PASS

### Agents observed in runtime files
- `config/agent_registry.yml`:
  - Total agents: 15
  - Internal agents: `monitor`, `devtools`
  - `comfy`: absent
- `config/router_agents.json`:
  - Total agents: 16
  - `comfy`: present
- `gateway /health`:
  - `agents_count`: 13 user-facing agents

Conclusion: the NODE1 runtime has a documented mismatch between its two registry files:
- the source registry (`agent_registry.yml`) lists 15 agents (without `comfy`)
- the generated router registry (`router_agents.json`) lists 16 agents (with `comfy`)

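
The mismatch above comes down to a set difference between the agent IDs in the source registry and those in the generated file. A minimal sketch of that comparison follows. It assumes the IDs have already been extracted from both files (the file layouts are not shown in this document, so loading is left out), and the function name `diff_agent_ids` and the `agent_NN` placeholder IDs are hypothetical.

```python
def diff_agent_ids(source_ids, generated_ids):
    """Return agent IDs that appear on only one side of the source/generated pair."""
    source, generated = set(source_ids), set(generated_ids)
    return {
        "only_in_source": sorted(source - generated),
        "only_in_generated": sorted(generated - source),
    }


# Placeholder IDs stand in for the real agent names, which are not listed here.
source = [f"agent_{i:02d}" for i in range(1, 14)] + ["monitor", "devtools"]  # 15 entries
generated = source + ["comfy"]  # 16 entries

report = diff_agent_ids(source, generated)
print(report)  # {'only_in_source': [], 'only_in_generated': ['comfy']}
```

An empty result on both sides would indicate that the source and generated registries agree.
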
## Connectivity Facts (NODE3/NODE4)

### From this workstation
- SSH `zevs@212.8.58.133:33147` -> `Network is unreachable`
- SSH `zevss@212.8.58.133:33148` -> `Network is unreachable`

### From NODE1 to NODE3/NODE4 address
- `212.8.58.133:8880` (expected Comfy API) -> timeout
- `212.8.58.133:33147` -> `No route to host`
- `212.8.58.133:33148` -> `No route to host`

Conclusion: NODE1 currently has no reliable network path to NODE3/NODE4 services.

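
The port checks above were run with `nc` and `curl`. As a rough equivalent, a TCP probe might be sketched in Python like this. The helper names `tcp_reachable` and `probe` and the 3-second default timeout are assumptions for illustration, not part of the NODE1 tooling.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return (ok, detail) for a single TCP connection attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers timeouts, "No route to host", "Network is unreachable", and refusals.
        return False, f"{type(exc).__name__}: {exc}"


def probe(targets, timeout=3.0):
    """Check each (host, port) pair and return a dict keyed by 'host:port'."""
    return {f"{h}:{p}": tcp_reachable(h, p, timeout) for h, p in targets}
```

Calling `probe([("212.8.58.133", 8880), ("212.8.58.133", 33147), ("212.8.58.133", 33148)])` from NODE1 would repeat the three checks listed above. A successful TCP connect only shows that the port accepts connections, not that the service behind it is healthy.
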
## Decision

Decision ID: `ADR-NODE1-REGISTRY-2026-02-16-A`

1. For NODE1 production, treat `comfy` as disabled/unavailable until connectivity to NODE3 is restored.
2. Align registry artifacts so they describe the actual runtime, not an aspirational topology:
   - `config/agent_registry.yml` and generated outputs must be consistent on NODE1.
   - Do not keep `comfy` in generated runtime registries while NODE1 cannot reach the Comfy endpoint.
3. Keep the existing media-delivery code paths in gateway/router (safe and already validated), but mark external generation as conditional on a reachable endpoint.

Rationale:
- Prevents hidden routing to unreachable services.
- Removes ambiguity between the source registry and generated files.
- Matches observed healthy production behavior (13 user-facing + 2 internal).

## Operational Rules Until NODE3/NODE4 Access Is Restored

1. Do not advertise `comfy` as active in NODE1 runtime registries.
2. Keep `COMFY_AGENT_URL` as an optional env variable only (non-authoritative for agent availability).
3. To enable `comfy` on NODE1, require:
   - successful TCP check to `212.8.58.133:8880`
   - successful API health call
   - post-enable canary pass

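
The three requirements in rule 3 can be read as two pre-checks followed by a verification after the switch. The sketch below shows one possible ordering. The function name `enable_external_agent`, the status strings, and the rollback on a failed canary are all assumptions; this document does not prescribe a rollback step.

```python
def enable_external_agent(pre_checks, enable, post_check, disable):
    """Enable an agent only if all pre-checks pass, and revert if the post-check fails.

    Returns a short status string: 'blocked:<name>', 'rolled_back', or 'enabled'.
    """
    for name, check in pre_checks:
        if not check():
            return f"blocked:{name}"
    enable()
    if not post_check():
        # Assumed behavior: a failed canary reverts the change.
        disable()
        return "rolled_back"
    return "enabled"


# Stand-in checks; real ones would wrap the TCP check, the API health call, and the canary run.
state = {"comfy": False}
status = enable_external_agent(
    pre_checks=[("tcp_8880", lambda: False), ("api_health", lambda: True)],
    enable=lambda: state.update(comfy=True),
    post_check=lambda: True,
    disable=lambda: state.update(comfy=False),
)
print(status, state)  # blocked:tcp_8880 {'comfy': False}
```

Because the pre-checks run in order and stop at the first failure, the agent is never switched on while the endpoint is unreachable.
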
## Required Follow-up Actions

1. Reconcile the registry source/generation pipeline on the canonical repo:
   - ensure one deterministic generated set from `config/agent_registry.yml`
   - remove stale generated artifacts that conflict with the source
2. Add an explicit status field for external agents (example: `enabled`, `reachable`) to avoid binary present/absent confusion.
3. Add a pre-deploy guard:
   - if an external agent's endpoint is unreachable, block publishing that agent to NODE1 runtime registries.

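
Follow-up items 2 and 3 could fit together roughly as sketched below: each entry carries explicit `enabled` and `reachable` fields, and the guard publishes an external agent only when both are true. The `AgentEntry` class, the `external` flag, and the `publishable` function are hypothetical names, not the existing registry schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentEntry:
    agent_id: str
    external: bool = False   # hypothetical: True when the agent runs on another node
    enabled: bool = True     # operator intent
    reachable: bool = True   # result of the latest endpoint check


def publishable(agents):
    """Split entries into those safe to publish and those the guard blocks."""
    publish, blocked = [], []
    for agent in agents:
        ok = agent.enabled and (not agent.external or agent.reachable)
        (publish if ok else blocked).append(agent.agent_id)
    return publish, blocked


entries = [
    AgentEntry("monitor"),
    AgentEntry("devtools"),
    AgentEntry("comfy", external=True, enabled=True, reachable=False),
]
publish, blocked = publishable(entries)
print(publish, blocked)  # ['monitor', 'devtools'] ['comfy']
```

Keeping `enabled` and `reachable` separate lets the registry record that `comfy` is intended to run while still keeping it out of the published NODE1 set.
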
## Verification Commands (Used)

On NODE1:
- `python3` check of `config/agent_registry.yml` and `config/router_agents.json` counts
- `curl http://127.0.0.1:9300/health`
- `ops/canary_all.sh`
- `ops/canary_senpai_osr_guard.sh`
- `nc` and `curl` checks to `212.8.58.133:{8880,33147,33148}`