microdao-daarion/services/senpai-md-consumer/senpai/md_consumer/metrics.py
Apple 09dee24342 feat: MD pipeline — market-data-service hardening + SenpAI NATS consumer
Producer (market-data-service):
- Backpressure: smart drop policy (drop heartbeats first, then quotes; trades preserved)
- Heartbeat monitor: synthetic HeartbeatEvent on provider silence
- Graceful shutdown: WS→bus→storage→DB engine cleanup sequence
- Bybit V5 public WS provider (backup for Binance, no API key needed)
- FailoverManager: health-based provider switching with recovery
- NATS output adapter: md.events.{type}.{symbol} for SenpAI
- /bus-stats endpoint for backpressure monitoring
- Dockerfile + docker-compose.node1.yml integration
- 36 tests (parsing + bus + failover), requirements.lock

Consumer (senpai-md-consumer):
- NATSConsumer: subscribe md.events.>, queue group senpai-md, backpressure
- State store: LatestState + RollingWindow (deque, 60s)
- Feature engine: 11 features (mid, spread, VWAP, return, vol, latency)
- Rule-based signals: long/short on return+volume+spread conditions
- Publisher: rate-limited features + signals + alerts to NATS
- HTTP API: /health, /metrics, /state/latest, /features/latest, /stats
- 10 Prometheus metrics
- Dockerfile + docker-compose.senpai.yml
- 41 tests (parsing + state + features + rate-limit), requirements.lock
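A minimal sketch of the consumer's subscription setup, assuming the nats-py client. `run_consumer`, `parse_subject`, and the handler body are illustrative names, not the service's actual API; only the subject pattern `md.events.>` and queue group `senpai-md` come from the commit message.

```python
def parse_subject(subject: str) -> tuple[str, str]:
    """Split 'md.events.{type}.{symbol}' into (event_type, symbol)."""
    _, _, event_type, symbol = subject.split(".", 3)
    return event_type, symbol


async def run_consumer(url: str = "nats://localhost:4222") -> None:
    import nats  # nats-py; imported lazily so the parsing helper stays stdlib-only

    nc = await nats.connect(url)

    async def handler(msg):
        event_type, symbol = parse_subject(msg.subject)
        ...  # update state store, compute features, emit signals

    # Queue group lets multiple consumer replicas share the event stream
    await nc.subscribe("md.events.>", queue="senpai-md", cb=handler)
```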

CI: ruff + pytest + smoke import for both services
Tests: 77 total passed, lint clean
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-09 11:46:15 -08:00


"""
Prometheus metrics for SenpAI market-data consumer.
"""
from prometheus_client import Counter, Gauge, Histogram

# ── Inbound events ─────────────────────────────────────────────────────
EVENTS_IN = Counter(
    "senpai_events_in_total",
    "Total events received from NATS",
    ["event_type", "provider"],
)
EVENTS_DROPPED = Counter(
    "senpai_events_dropped_total",
    "Events dropped due to backpressure or errors",
    ["reason", "event_type"],
)

# ── Queue ──────────────────────────────────────────────────────────────
QUEUE_FILL = Gauge(
    "senpai_queue_fill_ratio",
    "Internal processing queue fill ratio (0..1)",
)
QUEUE_SIZE = Gauge(
    "senpai_queue_size",
    "Current number of items in processing queue",
)

# ── Processing ─────────────────────────────────────────────────────────
PROCESSING_LATENCY = Histogram(
    "senpai_processing_latency_ms",
    "End-to-end processing latency (NATS receive to feature publish) in ms",
    buckets=[0.1, 0.5, 1, 2, 5, 10, 25, 50, 100, 250],
)

# ── Feature publishing ─────────────────────────────────────────────────
FEATURE_PUBLISH = Counter(
    "senpai_feature_publish_total",
    "Total feature snapshots published to NATS",
    ["symbol"],
)
FEATURE_PUBLISH_ERRORS = Counter(
    "senpai_feature_publish_errors_total",
    "Failed feature publishes",
    ["symbol"],
)

# ── Signals ────────────────────────────────────────────────────────────
SIGNALS_EMITTED = Counter(
    "senpai_signals_emitted_total",
    "Trade signals emitted",
    ["symbol", "direction"],
)
ALERTS_EMITTED = Counter(
    "senpai_alerts_emitted_total",
    "Alerts emitted",
    ["alert_type"],
)

# ── NATS connection ───────────────────────────────────────────────────
NATS_CONNECTED = Gauge(
    "senpai_nats_connected",
    "Whether NATS connection is alive (1=yes, 0=no)",
)
NATS_RECONNECTS = Counter(
    "senpai_nats_reconnects_total",
    "Number of NATS reconnections",
)