# NODE Architecture Reconciliation Plan (NODE1 + NODE3 + NODE4)

Date: 2026-02-16
Policy: Runtime-first for current state, roadmap-preserving for NODE3/NODE4.

## 1) Documents Confirmed (Legacy/Planning Set)

Found in worktrees (not in the root of the current main tree):
- `.worktrees/origin-main/IMPLEMENTATION-STATUS.md`
- `.worktrees/origin-main/ARCHITECTURE-150-NODES.md`
- `.worktrees/origin-main/infrastructure/auth/AUTH-IMPLEMENTATION-PLAN.md`
- `.worktrees/origin-main/infrastructure/matrix-gateway/README.md`

Identical copies found in:
- `.worktrees/docs-node1-sync/...`

These files are valid architecture/program documents (dated 2026-01-10), but they do not exactly reflect the NODE1 runtime code state as of 2026-02-16.

## 2) Current Runtime Truth (NODE1)

- Runtime root: `/opt/microdao-daarion`
- Router/Gateway/Swapper healthy.
- Canary suite passing:
  - `ops/canary_all.sh`
  - `ops/canary_senpai_osr_guard.sh`
- Router endpoint contract in runtime:
  - active: `POST /v1/agents/{agent_id}/infer`
  - not active: `POST /route`

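
The endpoint contract above can be expressed as a small classifier, which may help when auditing older docs or client code that still reference `POST /route`. This is a minimal sketch: the function name `classify_route`, the return labels, and the agent id `example-agent` are illustrative assumptions, not part of the runtime.

```python
import re

# Path pattern for the route listed as active above.
# Assumption: agent_id is a single path segment without slashes.
ACTIVE_ROUTE = re.compile(r"^/v1/agents/(?P<agent_id>[^/]+)/infer$")


def classify_route(method: str, path: str) -> str:
    """Return 'active', 'not_active', or 'unknown' for a request line."""
    if method.upper() != "POST":
        return "unknown"
    if ACTIVE_ROUTE.match(path):
        return "active"
    if path == "/route":
        # Listed above as not active in the current runtime.
        return "not_active"
    return "unknown"


if __name__ == "__main__":
    samples = [
        ("POST", "/v1/agents/example-agent/infer"),  # hypothetical agent id
        ("POST", "/route"),
    ]
    for method, path in samples:
        print(method, path, "->", classify_route(method, path))
```

Anything the contract does not mention is reported as `unknown` rather than guessed at.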
## 3) NODE3/NODE4 Policy (Do NOT remove from architecture)

NODE3/NODE4 remain part of the target architecture and deployment plan.

Current status (observed at time of writing, 2026-02-16):
- From laptop: `212.8.58.133:33147` and `:33148` unreachable.
- From NODE1: `212.8.58.133:8880` timeout, `:33147/:33148` no route.

Interpretation:
- This is a connectivity/runtime availability issue, not an architecture removal decision.
- Keep NODE3/NODE4 in docs and topology as `planned/temporarily_unreachable`.

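
The distinction between "timeout" and "no route" in the observations above can be reproduced with a plain TCP connect attempt. The sketch below uses only the Python standard library; the function names, the outcome labels, and the 3-second timeout are assumptions for illustration, and the host and ports are copied from the status list.

```python
import errno
import socket
from typing import List, Tuple

# Targets copied from the status list above.
TARGETS: List[Tuple[str, int]] = [
    ("212.8.58.133", 8880),
    ("212.8.58.133", 33147),
    ("212.8.58.133", 33148),
]


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connect and return a coarse outcome label."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except socket.timeout:
        # No answer within the timeout (e.g. packets silently dropped).
        return "timeout"
    except ConnectionRefusedError:
        # Host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "no_route"
        return f"error:{exc.errno}"


def report(targets: List[Tuple[str, int]] = TARGETS, timeout: float = 3.0) -> None:
    """Print one line per target; intended to be run from each vantage point."""
    for host, port in targets:
        print(f"{host}:{port} -> {probe_tcp(host, port, timeout)}")
```

Calling `report()` from both the laptop and NODE1 would give comparable output per vantage point. A successful TCP connect only shows that the port is open; it does not replace a service-level health check.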
## 4) Operating Model Until Connectivity Restored

Use explicit mode labeling:
- `ACTIVE`: reachable and health-checked.
- `DEGRADED`: included in architecture but currently unreachable.
- `DISABLED`: intentionally turned off (not the case for NODE3/NODE4 now).

Current recommendation:
- NODE1: `ACTIVE`
- NODE3: `DEGRADED`
- NODE4: `DEGRADED`

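
One way to keep the three labels consistent is to derive them from observed facts rather than assigning them by hand. The following is a sketch under stated assumptions: `NodeMode` and `resolve_mode` are hypothetical names, and treating a node that is reachable but fails its health check as `DEGRADED` is an assumption, since the definitions above do not cover that case.

```python
from enum import Enum


class NodeMode(Enum):
    ACTIVE = "ACTIVE"
    DEGRADED = "DEGRADED"
    DISABLED = "DISABLED"


def resolve_mode(enabled: bool, reachable: bool, healthy: bool) -> NodeMode:
    """Map observed facts about a node to one of the three mode labels."""
    if not enabled:
        # Intentionally turned off, regardless of connectivity.
        return NodeMode.DISABLED
    if reachable and healthy:
        return NodeMode.ACTIVE
    # Still part of the architecture, but not usable right now.
    # Assumption: "reachable but unhealthy" also lands here.
    return NodeMode.DEGRADED


if __name__ == "__main__":
    # Inputs mirror the current recommendation in this section.
    print("NODE1", resolve_mode(True, True, True).value)
    print("NODE3", resolve_mode(True, False, False).value)
    print("NODE4", resolve_mode(True, False, False).value)
```

Keeping `enabled` as a separate input preserves the difference between a deliberate shutdown and an outage, which is the point of having both `DISABLED` and `DEGRADED`.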
## 5) Reconciliation Rules

1. Do not delete NODE3/NODE4 docs, routes, or architecture references.
2. Mark external generation dependencies as conditional on reachability checks.
3. Runtime registries/config must not advertise unavailable external agents as locally active.
4. Keep roadmap docs (150 nodes, auth, matrix gateway) as strategic references; do not treat them as runtime contract files.

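
Rules 1 and 3 together imply filtering at advertisement time rather than deleting entries. The sketch below illustrates that idea only: the registry shape (`agent_id -> {"node": ...}`), the agent ids, and the function name are hypothetical, and the actual schema of `config/agent_registry.yml` is not assumed here.

```python
from typing import Dict, List


def advertised_agents(registry: Dict[str, dict], node_modes: Dict[str, str]) -> List[str]:
    """Return ids of agents hosted on ACTIVE nodes.

    The registry itself is left untouched, so entries for unreachable
    nodes stay on record (rule 1) but are not advertised (rule 3).
    """
    return sorted(
        agent_id
        for agent_id, entry in registry.items()
        # Nodes with no recorded mode are treated as not advertisable.
        if node_modes.get(entry.get("node"), "DEGRADED") == "ACTIVE"
    )


if __name__ == "__main__":
    # Hypothetical registry content, for illustration only.
    registry = {
        "local-agent": {"node": "NODE1"},
        "media-agent": {"node": "NODE3"},
    }
    node_modes = {"NODE1": "ACTIVE", "NODE3": "DEGRADED", "NODE4": "DEGRADED"}
    print(advertised_agents(registry, node_modes))
    print(sorted(registry))  # the full registry still lists both agents
```

Because the filter is a pure function of the registry and the current modes, switching a node back to `ACTIVE` re-advertises its agents without any registry edit.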
## 6) Action Plan (No Risk to Production)

1. Create a single "Architecture Status Board" document that maps:
   - planned topology (NODE1/2/3/4...)
   - current health/reachability per node
   - last verified timestamp.
2. Add preflight checks for external node dependencies in deployment scripts:
   - TCP check
   - service health check
   - fallback behavior logging.
3. Resolve registry drift:
   - align `config/agent_registry.yml` and generated registry artifacts on NODE1 runtime.
4. After NODE3/NODE4 connectivity returns:
   - run connectivity proof
   - run media generation smoke
   - switch node status from `DEGRADED` to `ACTIVE`.

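
Step 1 could start as a generated Markdown table so the board is never edited by hand. This is a minimal sketch: the `NodeStatus` type, the column names, and the timestamp used in the demo are placeholders, not recorded verification data.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class NodeStatus:
    name: str
    mode: str  # one of ACTIVE / DEGRADED / DISABLED
    last_verified: datetime  # timezone-aware timestamp of the last check


def render_status_board(rows: List[NodeStatus]) -> str:
    """Render node statuses as a Markdown table."""
    lines = [
        "| Node | Mode | Last verified (UTC) |",
        "| --- | --- | --- |",
    ]
    for row in rows:
        stamp = row.last_verified.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M")
        lines.append(f"| {row.name} | `{row.mode}` | {stamp} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder timestamp; a real board would record when each check ran.
    checked = datetime(2026, 2, 16, 12, 0, tzinfo=timezone.utc)
    board = render_status_board([
        NodeStatus("NODE1", "ACTIVE", checked),
        NodeStatus("NODE3", "DEGRADED", checked),
        NodeStatus("NODE4", "DEGRADED", checked),
    ])
    print(board)
```

Feeding this from the preflight results in step 2 would keep the board and the deployment checks from drifting apart.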
## 7) Decision Summary

- Keep NODE3/NODE4 in architecture and planning.
- Use runtime-first truth for what is currently active.
- Maintain explicit degraded-mode status instead of silent exclusion.