P2: Global multi-node model selection + NCS on NODA1
Architecture for 150+ nodes:
- global_capabilities_client.py: NATS scatter-gather discovery using
  wildcard subject node.*.capabilities.get, with zero static node lists.
  New nodes auto-register by deploying NCS and subscribing to NATS.
  Dead nodes expire from cache after 3x TTL automatically.

Multi-node model_select.py:
- ModelSelection now includes node, local, via_nats fields
- select_best_model prefers local candidates, then remote
- Prefer list resolution: local first, remote second
- All logged per request: node, runtime, model, local/remote

NODA1 compose:
- Added node-capabilities service (NCS) to docker-compose.node1.yml
- NATS subscription: node.noda1.capabilities.get
- Router env: NODE_CAPABILITIES_URL + ENABLE_GLOBAL_CAPS_NATS=true

NODA2 compose:
- Router env: ENABLE_GLOBAL_CAPS_NATS=true

Router main.py:
- Startup: initializes global_capabilities_client (NATS connect + first
  discovery). Falls back to local-only capabilities_client if unavailable.
- /infer: uses get_global_capabilities() for cross-node model pool
- Offload support: send_offload_request(node_id, type, payload) via NATS

Verified on NODA2:
- Global caps: 1 node, 14 models (NODA1 not yet deployed)
- Sofiia: cloud_grok → grok-4-1-fast-reasoning (OK)
- Helion: NCS → qwen3:14b local (OK)
- When NODA1 deploys NCS, its models appear automatically via NATS discovery

Made-with: Cursor
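The cache expiry and local-first selection described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: only the ModelSelection field names (node, local, via_nats) and the select_best_model name come from the message. The CapsCache class, the shape of the model dicts, the "example-*" names, and the two-pass reading of "local first, remote second" are all assumptions.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ModelSelection:
    # node/local/via_nats are named in the commit message; model/runtime are
    # taken from the "logged per request" list. Types are assumed.
    model: str
    runtime: str
    node: str
    local: bool
    via_nats: bool


class CapsCache:
    """Hypothetical per-node cache; a node expires after 3x TTL without refresh."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, List[dict]]] = {}

    def update(self, node_id: str, models: List[dict]) -> None:
        # Record the models a node reported, stamped with the current time.
        self._entries[node_id] = (self.clock(), models)

    def live_nodes(self) -> Dict[str, List[dict]]:
        # Drop nodes whose last report is older than 3x TTL.
        cutoff = self.clock() - 3 * self.ttl
        self._entries = {n: e for n, e in self._entries.items() if e[0] >= cutoff}
        return {n: models for n, (_, models) in self._entries.items()}


def select_best_model(cache: CapsCache, local_node: str,
                      prefer: List[str]) -> Optional[ModelSelection]:
    nodes = cache.live_nodes()
    # Pass 1: resolve the prefer list against the local node only.
    for name in prefer:
        for m in nodes.get(local_node, []):
            if m["name"] == name:
                return ModelSelection(name, m["runtime"], local_node, True, False)
    # Pass 2: resolve the prefer list against remote nodes (sorted for determinism).
    for name in prefer:
        for node in sorted(n for n in nodes if n != local_node):
            for m in nodes[node]:
                if m["name"] == name:
                    return ModelSelection(name, m["runtime"], node, False, True)
    return None
```

Under this reading, a less-preferred local model wins over a more-preferred remote one. Whether the real select_best_model makes the same trade-off is not stated in the message.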
@@ -25,8 +25,9 @@ services:
       - XAI_API_KEY=${XAI_API_KEY}
       - GROK_API_KEY=${XAI_API_KEY}
       - DEEPSEEK_API_KEY=${DEEPSEEK_API_KEY:-}
-      # ── Node Capabilities ─────────────────────────────────────────────────
+      # ── Node Capabilities (multi-node model selection) ────────────────────
       - NODE_CAPABILITIES_URL=http://node-capabilities:8099/capabilities
+      - ENABLE_GLOBAL_CAPS_NATS=true
       # ── Persistence backends ──────────────────────────────────────────────
       - ALERT_BACKEND=postgres
       - ALERT_DATABASE_URL=${ALERT_DATABASE_URL:-${DATABASE_URL}}
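As a rough illustration of how a router could consume the two environment variables in this hunk, the sketch below picks a capabilities source at startup. The variable names come from the diff. The function names, the accepted truthy spellings, and the nats_available flag are hypothetical, and the actual parsing in main.py may differ.

```python
import os
from typing import Mapping

# Assumed set of truthy spellings; the diff itself only shows "true".
_TRUTHY = {"1", "true", "yes", "on"}


def global_caps_enabled(env: Mapping[str, str] = os.environ) -> bool:
    # ENABLE_GLOBAL_CAPS_NATS is treated as off when unset or empty.
    return env.get("ENABLE_GLOBAL_CAPS_NATS", "").strip().lower() in _TRUTHY


def capabilities_source(env: Mapping[str, str] = os.environ,
                        nats_available: bool = True) -> str:
    # Use NATS discovery only when the flag is on and NATS is reachable;
    # otherwise fall back to the local NCS endpoint, if one is configured.
    if global_caps_enabled(env) and nats_available:
        return "global-nats"
    url = env.get("NODE_CAPABILITIES_URL", "")
    return f"local:{url}" if url else "none"
```

For example, with the values from this hunk and NATS unreachable, capabilities_source would return the local NCS URL prefixed with "local:".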