P3.2+P3.3+P3.4: NODA1 node-worker + NATS auth config + Prometheus counters
P3.2 — Multi-node deployment: - Added node-worker service to docker-compose.node1.yml (NODE_ID=noda1) - NCS NODA1 now has NODE_WORKER_URL for metrics collection - Fixed NODE_ID consistency: router NODA1 uses 'noda1' - NODA2 node-worker/NCS gets NCS_REPORT_URL for latency reporting P3.3 — NATS accounts/auth (opt-in config): - config/nats-server.conf with 3 accounts: SYS, FABRIC, APP - Per-user topic permissions (router, ncs, node_worker) - Leafnode listener :7422 with auth - Not yet activated (requires credential provisioning) P3.4 — Prometheus counters: - Router /fabric_metrics: caps_refresh, caps_stale, model_select, offload_total, breaker_state, score_ms histogram - Node Worker /prom_metrics: jobs_total, inflight gauge, latency_ms histogram - NCS /prom_metrics: runtime_health, runtime_p50/p95, node_wait_ms - All bound to 127.0.0.1 (not externally exposed) Made-with: Cursor
This commit is contained in:
@@ -31,6 +31,16 @@ async def metrics():
|
||||
return worker.get_metrics()
|
||||
|
||||
|
||||
@app.get("/prom_metrics")
|
||||
async def prom_metrics():
|
||||
from fastapi.responses import Response
|
||||
import fabric_metrics as fm
|
||||
data = fm.get_metrics_text()
|
||||
if data:
|
||||
return Response(content=data, media_type="text/plain; charset=utf-8")
|
||||
return {"error": "prometheus_client not installed"}
|
||||
|
||||
|
||||
@app.on_event("startup")
|
||||
async def startup():
|
||||
global _nats_client
|
||||
|
||||
Reference in New Issue
Block a user