# Market Data Service (SenpAI)

Real-time market data collection and normalization for the SenpAI/Gordon trading agent, built on a unified event model.

## Features

- Domain events: `TradeEvent`, `QuoteEvent`, `BookL2Event`, `HeartbeatEvent`
- Provider interface: `MarketDataProvider` ABC with `connect`/`subscribe`/`stream`/`close`
- Async `EventBus` with fan-out to multiple consumers
- `BinanceProvider`: public WebSocket (trades + bookTicker), no API key needed, auto-reconnect with exponential backoff, heartbeat-timeout detection
- `AlpacaProvider`: IEX real-time data with paper-trading auth; falls back to dry-run mode (heartbeats only) when no keys are configured
- `StorageConsumer`: SQLite (via async SQLAlchemy) plus a JSONL append-only log
- `MetricsConsumer`: Prometheus counters, latency histograms, events/sec gauge
- `PrintConsumer`: sampled structured logging (1 in 100 events)
- Config: pydantic-settings + `.env`; all secrets come from environment variables
- Tests: 19/19 passing (Binance parse, Alpaca parse, bus smoke tests)
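The unified event model normalizes every provider's payload into a small set of domain events. A minimal sketch of what these could look like as frozen dataclasses (field names and defaults here are illustrative assumptions, not the actual schema):

```python
from dataclasses import dataclass, field
import time


@dataclass(frozen=True)
class Event:
    """Base class for normalized market events (illustrative sketch)."""
    provider: str
    symbol: str
    ts_exchange: float  # event time reported by the exchange (epoch seconds)
    ts_recv: float = field(default_factory=time.time)  # local receive time


@dataclass(frozen=True)
class TradeEvent(Event):
    price: float = 0.0
    qty: float = 0.0


@dataclass(frozen=True)
class QuoteEvent(Event):
    bid: float = 0.0
    ask: float = 0.0
    bid_qty: float = 0.0
    ask_qty: float = 0.0


trade = TradeEvent(provider="binance", symbol="BTCUSDT",
                   ts_exchange=1700000000.123, price=43000.5, qty=0.01)
```

Keeping the exchange timestamp separate from the local receive timestamp is what makes the exchange-to-receive latency metric possible downstream.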
## Quick Start

### 1. Install

```bash
cd services/market-data-service
pip install -r requirements.txt
```

### 2. Copy config

```bash
cp .env.example .env
```

### 3. Run (Binance — no keys needed)

```bash
python -m app run --provider binance --symbols BTCUSDT,ETHUSDT
```
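Under the hood, the Binance adapter converts raw `trade` stream messages into domain events. The combined-stream trade payload carries fields like `s` (symbol), `p` (price), `q` (quantity), and `T` (trade time in ms); a hedged sketch of a `_parse`-style function (the real adapter's names and return type may differ):

```python
import json


def parse_binance_trade(raw: str) -> dict:
    """Normalize a raw Binance trade message into a flat event dict (illustrative)."""
    msg = json.loads(raw)
    data = msg.get("data", msg)  # combined streams wrap the payload in "data"
    return {
        "type": "trade",
        "provider": "binance",
        "symbol": data["s"],
        "price": float(data["p"]),        # Binance sends prices as strings
        "qty": float(data["q"]),
        "ts_exchange": data["T"] / 1000,  # ms -> epoch seconds
    }


raw = ('{"stream":"btcusdt@trade","data":{"e":"trade","s":"BTCUSDT",'
       '"p":"43000.50","q":"0.010","T":1700000000123}}')
event = parse_binance_trade(raw)
```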
### 4. Run (Alpaca — paper trading)

First, get free paper-trading API keys:

- Sign up at https://app.alpaca.markets
- Switch to Paper Trading in the dashboard
- Go to API Keys → Generate New Key
- Add to `.env`:

  ```
  ALPACA_KEY=your_key_here
  ALPACA_SECRET=your_secret_here
  ALPACA_DRY_RUN=false
  ```

- Run:

  ```bash
  python -m app run --provider alpaca --symbols AAPL,TSLA
  ```

Without keys, Alpaca runs in dry-run mode (heartbeats only).

### 5. Run both providers

```bash
python -m app run --provider all --symbols BTCUSDT,AAPL
```
## HTTP Endpoints

Once running, the service exposes:

| Endpoint | Description |
|---|---|
| `GET /health` | Service health check |
| `GET /metrics` | Prometheus metrics |
| `GET /latest?symbol=BTCUSDT` | Latest trade + quote from SQLite |

Default port: 8891 (configurable via `HTTP_PORT`).
## View Data

### SQLite

```bash
sqlite3 market_data.db "SELECT * FROM trades ORDER BY ts_recv DESC LIMIT 5;"
```
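The same query works from Python with the stdlib `sqlite3` module. The snippet below builds a throwaway in-memory table purely to illustrate the query shape (the real `trades` schema likely has more columns than shown here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (symbol TEXT, price REAL, qty REAL, ts_recv REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?)",
    [("BTCUSDT", 43000.5, 0.01, 1700000000.1),
     ("ETHUSDT", 2300.25, 0.5, 1700000001.2)],
)

# Same shape as the CLI query: newest trades first
rows = conn.execute(
    "SELECT * FROM trades ORDER BY ts_recv DESC LIMIT 5"
).fetchall()
```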
### JSONL Event Log

```bash
tail -5 events.jsonl | python -m json.tool
```

### Prometheus Metrics

```bash
curl http://localhost:8891/metrics
```

Key metrics:

- `market_events_total` — events by provider/type/symbol
- `market_exchange_latency_ms` — exchange-to-receive latency
- `market_events_per_second` — throughput gauge
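To clarify how these numbers are derived: exchange-to-receive latency is the gap between the exchange's event timestamp and the local receive time, and events/sec is a count over a sliding window. A plain-Python sketch of both (the service itself presumably registers these with a Prometheus client library; names here are illustrative):

```python
from collections import deque


def exchange_latency_ms(ts_exchange: float, ts_recv: float) -> float:
    """Latency between the exchange event time and local receipt, in ms."""
    return (ts_recv - ts_exchange) * 1000.0


class ThroughputGauge:
    """Events/sec over a sliding window of recent event timestamps."""

    def __init__(self, window_s: float = 10.0):
        self.window_s = window_s
        self._times: deque[float] = deque()

    def record(self, now: float) -> None:
        self._times.append(now)
        # Drop timestamps that fell out of the window
        while self._times and now - self._times[0] > self.window_s:
            self._times.popleft()

    def rate(self) -> float:
        return len(self._times) / self.window_s


gauge = ThroughputGauge(window_s=10.0)
for i in range(100):
    gauge.record(100.0 + i * 0.1)  # 100 events over ~10 seconds
```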
## Architecture

```
Provider (Binance/Alpaca)
  │ raw WebSocket messages
  ▼
Adapter (_parse → domain Event)
  │ TradeEvent / QuoteEvent / BookL2Event
  ▼
EventBus (asyncio.Queue fan-out)
  ├─▶ StorageConsumer → SQLite + JSONL
  ├─▶ MetricsConsumer → Prometheus counters/histograms
  └─▶ PrintConsumer   → structured log (sampled 1/100)
```
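The fan-out stage above can be sketched with one `asyncio.Queue` per consumer: each published event is copied into every subscriber's queue, so a slow consumer does not drop events for the others. A simplified illustration, not the service's exact implementation:

```python
import asyncio


class EventBus:
    """Fan-out: every published event is copied to each subscriber's queue."""

    def __init__(self) -> None:
        self._queues: list[asyncio.Queue] = []

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue(maxsize=1000)
        self._queues.append(q)
        return q

    async def publish(self, event: object) -> None:
        for q in self._queues:
            await q.put(event)  # applies backpressure if a queue is full


async def main() -> list:
    bus = EventBus()
    storage_q = bus.subscribe()
    metrics_q = bus.subscribe()
    await bus.publish({"type": "trade", "symbol": "BTCUSDT"})
    # Each subscriber got its own copy of the event
    return [storage_q.get_nowait(), metrics_q.get_nowait()]


received = asyncio.run(main())
```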
## Adding a New Provider

1. Create `app/providers/your_provider.py`.
2. Subclass `MarketDataProvider`:

   ```python
   from collections.abc import AsyncIterator

   from app.providers import MarketDataProvider
   from app.domain.events import Event, TradeEvent


   class YourProvider(MarketDataProvider):
       name = "your_provider"

       async def connect(self) -> None:
           # Establish connection
           ...

       async def subscribe(self, symbols: list[str]) -> None:
           # Subscribe to streams
           ...

       async def stream(self) -> AsyncIterator[Event]:
           # Yield normalized events, handle reconnect
           while True:
               raw = await self._receive()
               yield self._parse(raw)

       async def close(self) -> None:
           ...
   ```

3. Register it in `app/providers/__init__.py`:

   ```python
   from app.providers.your_provider import YourProvider

   registry = {
       ...
       "your_provider": YourProvider,
   }
   ```

4. Run:

   ```bash
   python -m app run --provider your_provider --symbols ...
   ```
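The `stream()` stub above says "handle reconnect"; the auto-reconnect with exponential backoff used by the Binance provider could be sketched like this (the delay schedule, cap, and jitter here are illustrative assumptions, not the service's actual parameters):

```python
import asyncio
import random


async def with_backoff(connect, base: float = 0.5, cap: float = 30.0,
                       max_attempts: int = 8):
    """Retry `connect()` with exponential backoff plus jitter (illustrative)."""
    for attempt in range(max_attempts):
        try:
            return await connect()
        except ConnectionError:
            delay = min(cap, base * 2 ** attempt)   # 0.5s, 1s, 2s, ... capped
            delay *= 0.5 + random.random() / 2       # jitter: 50-100% of delay
            await asyncio.sleep(delay)
    raise ConnectionError(f"gave up after {max_attempts} attempts")


# Demo: a flaky connection that fails twice, then succeeds
attempts = 0


async def flaky_connect():
    global attempts
    attempts += 1
    if attempts < 3:
        raise ConnectionError("ws closed")
    return "connected"


result = asyncio.run(with_backoff(flaky_connect, base=0.01))
```

The jitter keeps many reconnecting clients from hammering the exchange in lockstep after an outage.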
## Tests

```bash
pytest tests/ -v
```
## TODO: Future Providers
- CoinAPI (REST + WebSocket, paid tier)
- IQFeed (US equities, DTN subscription)
- Polygon.io (real-time + historical)
- Interactive Brokers TWS API