Apple
a605b8c43e
P3.1: GPU/Queue-aware routing — NCS metrics + scoring-based model selection
NCS (services/node-capabilities/metrics.py):
- NodeLoad: inflight_jobs, queue_depth, concurrency_limit, estimated_wait_ms,
  cpu_load_1m, mem_pressure (macOS + Linux), rtt_ms_to_hub
- RuntimeLoad: per-runtime healthy, p50_ms, p95_ms from rolling 50-sample window
- POST /capabilities/report_latency for node-worker → NCS reporting
- NCS fetches worker metrics via NODE_WORKER_URL
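
The rolling-window percentiles in RuntimeLoad can be sketched as below. This is a minimal illustration, not the code in metrics.py: the class name RollingLatency, its methods, and the nearest-rank percentile rule are assumptions; only the 50-sample window and the p50/p95 outputs come from this commit message.

```python
import math
from collections import deque


class RollingLatency:
    """Hypothetical helper: keeps the most recent `window` latency samples."""

    def __init__(self, window: int = 50):
        # deque(maxlen=...) silently drops the oldest sample once full.
        self.samples = deque(maxlen=window)

    def record(self, latency_ms: float) -> None:
        self.samples.append(float(latency_ms))

    def percentile(self, q: float):
        """Nearest-rank percentile (an assumed method); None when empty."""
        if not self.samples:
            return None
        ordered = sorted(self.samples)
        rank = max(1, math.ceil(q / 100 * len(ordered)))
        return ordered[rank - 1]


buf = RollingLatency()
for ms in range(1, 101):  # only samples 51..100 remain in the window
    buf.record(ms)
p50, p95 = buf.percentile(50), buf.percentile(95)
```

With a bounded window, old samples age out on their own, so a runtime that recovers from a slow period stops being penalized after 50 newer jobs.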
Node Worker:
- GET /metrics endpoint (inflight, concurrency, latency buffers)
- Latency tracking per job type (llm/vision) with rolling buffer
- Fire-and-forget latency reporting to NCS after each successful job
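
One way to read "fire-and-forget" here: the worker hands the report to a background thread and never lets a reporting failure affect the job result. The sketch below assumes that reading. The function name report_latency_async, the injected send callable, and the payload keys are all hypothetical; the real worker presumably posts to /capabilities/report_latency, but its client code is not shown in this commit message.

```python
import threading


def report_latency_async(send, payload):
    """Run send(payload) on a daemon thread and swallow any error.

    `send` is an injected callable (hypothetical), e.g. a function that
    POSTs the payload to the NCS /capabilities/report_latency endpoint.
    """

    def _run():
        try:
            send(payload)
        except Exception:
            # Fire-and-forget: a failed report must not fail the job.
            pass

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()
    return thread  # returned only so callers or tests can join if they wish


# Usage with a stand-in sender; the payload keys are illustrative only.
sent = []
t = report_latency_async(sent.append, {"job_type": "llm", "latency_ms": 412.0})
t.join()
```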
Router (model_select v3):
- score_candidate(): wait + model_latency + cross_node_penalty + prefer_bonus
- LOCAL_THRESHOLD_MS=250: prefer the local node when its score is within this threshold of the best remote score
- ModelSelection.score field for observability
- Structured [score] logs with chosen node, model, and score breakdown
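
The scoring terms listed above can be sketched as follows. This is an assumption-laden illustration, not the model_select v3 code: the Candidate fields, the use of rtt_ms_to_hub as the cross-node penalty, the negative prefer bonus, the "lower score wins" convention, and the select() helper are all hypothetical. Only the four score terms and LOCAL_THRESHOLD_MS=250 come from this commit message.

```python
from dataclasses import dataclass

LOCAL_THRESHOLD_MS = 250  # value stated in the commit message


@dataclass
class Candidate:  # hypothetical shape, for illustration only
    node: str
    model: str
    is_local: bool
    estimated_wait_ms: float
    p50_ms: float
    rtt_ms_to_hub: float = 0.0
    preferred: bool = False


def score_candidate(c: Candidate, prefer_bonus_ms: float = -50.0) -> float:
    """wait + model_latency + cross_node_penalty + prefer_bonus (lower is better)."""
    cross_node_penalty = 0.0 if c.is_local else c.rtt_ms_to_hub  # assumed penalty
    bonus = prefer_bonus_ms if c.preferred else 0.0  # assumed to reduce the score
    return c.estimated_wait_ms + c.p50_ms + cross_node_penalty + bonus


def select(candidates):
    """Pick the lowest score, but keep work local when it is close enough."""
    best = min(candidates, key=score_candidate)
    local = [c for c in candidates if c.is_local]
    if not best.is_local and local:
        best_local = min(local, key=score_candidate)
        if score_candidate(best_local) - score_candidate(best) <= LOCAL_THRESHOLD_MS:
            return best_local
    return best


busy_local = Candidate("node-a", "m1", True, estimated_wait_ms=300, p50_ms=400)
idle_remote = Candidate("node-b", "m1", False, estimated_wait_ms=0, p50_ms=400,
                        rtt_ms_to_hub=100)
chosen = select([busy_local, idle_remote])  # local: 700 vs remote: 500
```

In this example the remote node scores 200 ms better, which is inside the threshold, so the local node is still chosen; a local queue that pushes the gap past 250 ms would route the job remotely.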
Tests: 19 new (12 scoring + 7 NCS metrics); all 36 pass
Docs: ops/runbook_p3_1.md, ops/CHANGELOG_FABRIC.md
No breaking changes to JobRequest/JobResponse or capabilities schema.
Made-with: Cursor
2026-02-27 02:55:44 -08:00