feat: Add presence heartbeat for Matrix online status

- matrix-gateway: POST /internal/matrix/presence/online endpoint
- usePresenceHeartbeat hook with activity tracking
- Auto-away after 5 min of inactivity
- Set offline on page close/visibility change
- Integrated into the MatrixChatRoom component
Author: Apple
Date: 2025-11-27 00:19:40 -08:00
Parent: 5bed515852
Commit: 3de3c8cb36
6371 changed files with 1317450 additions and 932 deletions

services/usage-engine/Dockerfile

@@ -0,0 +1,24 @@
FROM python:3.11-slim
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application
COPY . .
# Expose port
EXPOSE 7013
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import httpx; httpx.get('http://localhost:7013/health').raise_for_status()"
# Run
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7013"]

services/usage-engine/README.md

@@ -0,0 +1,363 @@
# Usage Engine
**Port:** 7013
**Purpose:** Collect and report usage metrics for DAARION
## Features
**Collectors (NATS):**
- `usage.llm` — LLM call tracking
- `usage.tool` — Tool execution tracking
- `usage.agent` — Agent invocation tracking
- `messaging.message.created` — Message tracking
**Aggregators (PostgreSQL):**
- Summary by microDAO, agent, time period
- Model usage breakdown
- Agent activity breakdown
- Tool usage breakdown
**API:**
- `/internal/usage/summary` — Comprehensive usage report
- `/internal/usage/models` — Model-specific usage
- `/internal/usage/agents` — Agent-specific usage
- `/internal/usage/tools` — Tool-specific usage
## API
### GET /internal/usage/summary
Get comprehensive usage summary:
```bash
curl "http://localhost:7013/internal/usage/summary?microdao_id=microdao:7&period_hours=24"
```
**Response:**
```json
{
  "summary": {
    "period_start": "2025-11-23T12:00:00Z",
    "period_end": "2025-11-24T12:00:00Z",
    "microdao_id": "microdao:7",
    "llm_calls_total": 145,
    "llm_tokens_total": 87432,
    "tool_calls_total": 23,
    "agent_invocations_total": 56,
    "messages_sent": 342
  },
  "models": [
    {
      "model": "gpt-4.1-mini",
      "provider": "openai",
      "calls": 120,
      "tokens": 75000,
      "avg_latency_ms": 1250
    }
  ],
  "agents": [
    {
      "agent_id": "agent:sofia",
      "invocations": 45,
      "llm_calls": 120,
      "tool_calls": 15,
      "total_tokens": 60000
    }
  ],
  "tools": [
    {
      "tool_id": "projects.list",
      "tool_name": "List Projects",
      "calls": 12,
      "success_rate": 0.95,
      "avg_latency_ms": 450
    }
  ]
}
```
### Query Parameters
- `microdao_id` — Filter by microDAO (optional)
- `agent_id` — Filter by agent (optional)
- `period_hours` — Time period (1-720 hours, default 24)
### GET /internal/usage/models
Model usage breakdown:
```bash
curl "http://localhost:7013/internal/usage/models?period_hours=168"
```
### GET /internal/usage/agents
Agent activity breakdown:
```bash
curl "http://localhost:7013/internal/usage/agents?microdao_id=microdao:7"
```
### GET /internal/usage/tools
Tool execution breakdown:
```bash
curl "http://localhost:7013/internal/usage/tools?period_hours=24"
```
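The endpoints above are plain GETs, so they are easy to script. A small Python helper (hypothetical, not part of this commit) that builds the same query strings the curl examples use:

```python
# Hypothetical URL helper mirroring the curl examples; the base URL and the
# httpx usage shown in the comment are assumptions — any HTTP client works.
from urllib.parse import urlencode

def summary_url(base="http://localhost:7013", microdao_id=None,
                agent_id=None, period_hours=24):
    """Build a /internal/usage/summary URL, omitting unset filters."""
    params = {"period_hours": period_hours}
    if microdao_id:
        params["microdao_id"] = microdao_id
    if agent_id:
        params["agent_id"] = agent_id
    return f"{base}/internal/usage/summary?{urlencode(params)}"

# With a running service:
#   import httpx
#   report = httpx.get(summary_url(microdao_id="microdao:7")).json()
#   print(report["summary"]["llm_tokens_total"])
```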
## NATS Integration
### Published Events (None)
Usage Engine only consumes events.
### Consumed Events
#### 1. usage.llm
From: `llm-proxy`
```json
{
"event_id": "evt-123",
"timestamp": "2025-11-24T12:00:00Z",
"actor_id": "user:93",
"actor_type": "human",
"agent_id": "agent:sofia",
"microdao_id": "microdao:7",
"model": "gpt-4.1-mini",
"provider": "openai",
"prompt_tokens": 450,
"completion_tokens": 120,
"total_tokens": 570,
"latency_ms": 1250,
"success": true
}
```
#### 2. usage.tool
From: `toolcore`
```json
{
"event_id": "evt-456",
"timestamp": "2025-11-24T12:01:00Z",
"actor_id": "agent:sofia",
"actor_type": "agent",
"agent_id": "agent:sofia",
"microdao_id": "microdao:7",
"tool_id": "projects.list",
"tool_name": "List Projects",
"success": true,
"latency_ms": 450
}
```
#### 3. usage.agent
From: `agent-runtime`
```json
{
"event_id": "evt-789",
"timestamp": "2025-11-24T12:02:00Z",
"agent_id": "agent:sofia",
"microdao_id": "microdao:7",
"channel_id": "channel-uuid",
"trigger": "message",
"duration_ms": 3450,
"llm_calls": 2,
"tool_calls": 1,
"success": true
}
```
#### 4. messaging.message.created
From: `messaging-service`
```json
{
"channel_id": "channel-uuid",
"message_id": "msg-uuid",
"sender_id": "user:93",
"sender_type": "human",
"microdao_id": "microdao:7",
"created_at": "2025-11-24T12:03:00Z"
}
```
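The collector validates each payload with a pydantic model before storing it (the full models appear in `models.py` in this commit). A minimal, self-contained sketch for the `usage.llm` payload above, with `actor_type` simplified to a plain string:

```python
# Minimal validation sketch for a usage.llm payload; field names and types
# mirror the event schema documented above.
from datetime import datetime
from typing import Optional
from pydantic import BaseModel

class LlmUsageEvent(BaseModel):
    event_id: str
    timestamp: datetime
    actor_id: str
    actor_type: str
    agent_id: Optional[str] = None
    microdao_id: Optional[str] = None
    model: str
    provider: str
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    latency_ms: int
    success: bool = True

# Same dict shape the collector receives off the wire
event = LlmUsageEvent(**{
    "event_id": "evt-123",
    "timestamp": "2025-11-24T12:00:00Z",
    "actor_id": "user:93",
    "actor_type": "human",
    "agent_id": "agent:sofia",
    "microdao_id": "microdao:7",
    "model": "gpt-4.1-mini",
    "provider": "openai",
    "prompt_tokens": 450,
    "completion_tokens": 120,
    "total_tokens": 570,
    "latency_ms": 1250,
    "success": True,
})
```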
## Database Schema
### usage_llm
```sql
CREATE TABLE usage_llm (
    event_id TEXT PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    actor_id TEXT NOT NULL,
    actor_type TEXT NOT NULL,
    agent_id TEXT,
    microdao_id TEXT,
    model TEXT NOT NULL,
    provider TEXT NOT NULL,
    prompt_tokens INT NOT NULL,
    completion_tokens INT NOT NULL,
    total_tokens INT NOT NULL,
    latency_ms INT NOT NULL,
    success BOOLEAN NOT NULL DEFAULT true,
    error TEXT,
    metadata JSONB
);
CREATE INDEX idx_usage_llm_timestamp ON usage_llm(timestamp DESC);
CREATE INDEX idx_usage_llm_microdao ON usage_llm(microdao_id, timestamp DESC);
CREATE INDEX idx_usage_llm_agent ON usage_llm(agent_id, timestamp DESC);
```
### usage_tool
```sql
CREATE TABLE usage_tool (
    event_id TEXT PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    actor_id TEXT NOT NULL,
    actor_type TEXT NOT NULL,
    agent_id TEXT,
    microdao_id TEXT,
    tool_id TEXT NOT NULL,
    tool_name TEXT NOT NULL,
    success BOOLEAN NOT NULL,
    latency_ms INT NOT NULL,
    error TEXT,
    metadata JSONB
);
CREATE INDEX idx_usage_tool_timestamp ON usage_tool(timestamp DESC);
CREATE INDEX idx_usage_tool_microdao ON usage_tool(microdao_id, timestamp DESC);
```
### usage_agent
```sql
CREATE TABLE usage_agent (
    event_id TEXT PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    agent_id TEXT NOT NULL,
    microdao_id TEXT,
    channel_id TEXT,
    trigger TEXT NOT NULL,
    duration_ms INT NOT NULL,
    llm_calls INT DEFAULT 0,
    tool_calls INT DEFAULT 0,
    success BOOLEAN NOT NULL DEFAULT true,
    error TEXT,
    metadata JSONB
);
CREATE INDEX idx_usage_agent_timestamp ON usage_agent(timestamp DESC);
CREATE INDEX idx_usage_agent_id ON usage_agent(agent_id, timestamp DESC);
```
### usage_message
```sql
CREATE TABLE usage_message (
    event_id TEXT PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    actor_id TEXT NOT NULL,
    actor_type TEXT NOT NULL,
    microdao_id TEXT NOT NULL,
    channel_id TEXT NOT NULL,
    message_length INT NOT NULL,
    metadata JSONB
);
CREATE INDEX idx_usage_message_timestamp ON usage_message(timestamp DESC);
CREATE INDEX idx_usage_message_microdao ON usage_message(microdao_id, timestamp DESC);
```
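The DDL above is defined in this commit, but the commit does not show where it runs. One hypothetical approach is an idempotent bootstrap applied at startup; `ensure_schema` and the `IF NOT EXISTS` guards are assumptions, not part of this commit, and only `usage_llm` is spelled out here — the other tables follow the same pattern:

```python
# Hypothetical startup migration sketch: applies the documented schema
# idempotently through the service's asyncpg pool.
DDL = [
    """CREATE TABLE IF NOT EXISTS usage_llm (
        event_id TEXT PRIMARY KEY,
        timestamp TIMESTAMPTZ NOT NULL,
        actor_id TEXT NOT NULL,
        actor_type TEXT NOT NULL,
        agent_id TEXT,
        microdao_id TEXT,
        model TEXT NOT NULL,
        provider TEXT NOT NULL,
        prompt_tokens INT NOT NULL,
        completion_tokens INT NOT NULL,
        total_tokens INT NOT NULL,
        latency_ms INT NOT NULL,
        success BOOLEAN NOT NULL DEFAULT true,
        error TEXT,
        metadata JSONB
    )""",
    """CREATE INDEX IF NOT EXISTS idx_usage_llm_timestamp
        ON usage_llm(timestamp DESC)""",
    # ...usage_tool, usage_agent, usage_message follow the same pattern
]

async def ensure_schema(pool) -> None:
    """Apply the schema once at startup; IF NOT EXISTS makes re-runs safe."""
    async with pool.acquire() as conn:
        for stmt in DDL:
            await conn.execute(stmt)
```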
## Setup
### Local Development
```bash
cd services/usage-engine
pip install -r requirements.txt
export DATABASE_URL="postgresql://..."
export NATS_URL="nats://localhost:4222"
python main.py
```
### Docker
```bash
docker build -t usage-engine .
docker run -p 7013:7013 \
-e DATABASE_URL="postgresql://..." \
-e NATS_URL="nats://nats:4222" \
usage-engine
```
## Testing
### Publish Test Events
```bash
# LLM event
nats pub usage.llm '{"event_id":"test-1","timestamp":"2025-11-24T12:00:00Z",...}'
# Check aggregation
curl "http://localhost:7013/internal/usage/summary?period_hours=1"
```
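The bash example elides most of the event fields. A helper that builds a complete, schema-conformant test event makes publishing from Python easier (`make_test_llm_event` is hypothetical; the field names follow the `usage.llm` schema documented above):

```python
# Hypothetical helper that fills in the fields the bash example elides.
import json
import uuid
from datetime import datetime, timezone

def make_test_llm_event(model="gpt-4.1-mini", prompt_tokens=450,
                        completion_tokens=120):
    """Build a complete usage.llm test payload."""
    return {
        "event_id": f"test-{uuid.uuid4()}",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor_id": "user:test",
        "actor_type": "human",
        "agent_id": "agent:test",
        "microdao_id": "microdao:test",
        "model": model,
        "provider": "openai",
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
        "latency_ms": 100,
        "success": True,
    }

# To publish (requires a running NATS server):
#   nc = await nats.connect("nats://localhost:4222")
#   await nc.publish("usage.llm", json.dumps(make_test_llm_event()).encode())
```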
## Integration
### llm-proxy Integration
After every LLM call:
```python
await publish_nats_event("usage.llm", {
    "event_id": str(uuid4()),
    "timestamp": datetime.utcnow().isoformat(),
    "model": model,
    "total_tokens": usage.total_tokens,
    # ...
})
```
### toolcore Integration
After every tool execution:
```python
await publish_nats_event("usage.tool", {
    "event_id": str(uuid4()),
    "tool_id": tool_id,
    "success": success,
    # ...
})
```
### agent-runtime Integration
After every agent invocation:
```python
await publish_nats_event("usage.agent", {
    "event_id": str(uuid4()),
    "agent_id": agent_id,
    "llm_calls": llm_call_count,
    "tool_calls": tool_call_count,
    # ...
})
```
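`publish_nats_event` is referenced in the snippets above but not defined in this commit. A minimal sketch of what such a helper could look like with nats-py — the `nc` parameter and the best-effort error handling are assumptions:

```python
# Hypothetical publish helper; usage tracking should never block or fail
# the caller, so errors are logged and swallowed.
import json

async def publish_nats_event(nc, subject: str, payload: dict) -> None:
    """Fire-and-forget publish of a usage event to NATS."""
    try:
        await nc.publish(subject, json.dumps(payload).encode())
    except Exception as e:
        # Usage events are best-effort: log and continue.
        print(f"usage publish failed for {subject}: {e}")
```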
## Roadmap
### Phase 4 (Current):
- ✅ NATS collectors
- ✅ PostgreSQL storage
- ✅ Basic aggregation API
### Phase 5:
- 🔜 Real-time dashboards (WebSockets)
- 🔜 Cost estimation (per model)
- 🔜 Billing integration
- 🔜 Quota management
- 🔜 Anomaly detection
---
**Status:** ✅ Phase 4 Ready
**Version:** 1.0.0
**Last Updated:** 2025-11-24

services/usage-engine/aggregators.py

@@ -0,0 +1,239 @@
"""
Usage Data Aggregators
Queries and aggregates usage data from database
"""
import asyncpg
from datetime import datetime, timedelta
from typing import Optional, List
from models import (
UsageSummary,
ModelUsage,
AgentUsage,
ToolUsage,
UsageQueryRequest
)
class UsageAggregator:
"""Aggregates usage data for reporting"""
def __init__(self, db_pool: asyncpg.Pool):
self.db_pool = db_pool
async def get_summary(
self,
microdao_id: Optional[str] = None,
agent_id: Optional[str] = None,
period_hours: int = 24
) -> UsageSummary:
"""Get aggregated usage summary"""
period_start = datetime.utcnow() - timedelta(hours=period_hours)
period_end = datetime.utcnow()
async with self.db_pool.acquire() as conn:
# LLM stats
llm_stats = await conn.fetchrow("""
SELECT
COUNT(*) as calls,
SUM(total_tokens) as tokens_total,
SUM(prompt_tokens) as tokens_prompt,
SUM(completion_tokens) as tokens_completion,
AVG(latency_ms) as latency_avg
FROM usage_llm
WHERE timestamp >= $1 AND timestamp <= $2
AND ($3::text IS NULL OR microdao_id = $3)
AND ($4::text IS NULL OR agent_id = $4)
""", period_start, period_end, microdao_id, agent_id)
# Tool stats
tool_stats = await conn.fetchrow("""
SELECT
COUNT(*) as calls,
SUM(CASE WHEN success THEN 1 ELSE 0 END) as success,
SUM(CASE WHEN NOT success THEN 1 ELSE 0 END) as failed,
AVG(latency_ms) as latency_avg
FROM usage_tool
WHERE timestamp >= $1 AND timestamp <= $2
AND ($3::text IS NULL OR microdao_id = $3)
AND ($4::text IS NULL OR agent_id = $4)
""", period_start, period_end, microdao_id, agent_id)
# Agent stats
agent_stats = await conn.fetchrow("""
SELECT
COUNT(*) as invocations,
SUM(CASE WHEN success THEN 1 ELSE 0 END) as success,
SUM(CASE WHEN NOT success THEN 1 ELSE 0 END) as failed
FROM usage_agent
WHERE timestamp >= $1 AND timestamp <= $2
AND ($3::text IS NULL OR microdao_id = $3)
AND ($4::text IS NULL OR agent_id = $4)
""", period_start, period_end, microdao_id, agent_id)
# Message stats
message_stats = await conn.fetchrow("""
SELECT
COUNT(*) as sent,
SUM(message_length) as total_length
FROM usage_message
WHERE timestamp >= $1 AND timestamp <= $2
AND ($3::text IS NULL OR microdao_id = $3)
""", period_start, period_end, microdao_id)
return UsageSummary(
period_start=period_start,
period_end=period_end,
microdao_id=microdao_id,
agent_id=agent_id,
llm_calls_total=llm_stats['calls'] or 0,
llm_tokens_total=llm_stats['tokens_total'] or 0,
llm_tokens_prompt=llm_stats['tokens_prompt'] or 0,
llm_tokens_completion=llm_stats['tokens_completion'] or 0,
llm_latency_avg_ms=float(llm_stats['latency_avg'] or 0),
tool_calls_total=tool_stats['calls'] or 0,
tool_calls_success=tool_stats['success'] or 0,
tool_calls_failed=tool_stats['failed'] or 0,
tool_latency_avg_ms=float(tool_stats['latency_avg'] or 0),
agent_invocations_total=agent_stats['invocations'] or 0,
agent_invocations_success=agent_stats['success'] or 0,
agent_invocations_failed=agent_stats['failed'] or 0,
messages_sent=message_stats['sent'] or 0,
messages_total_length=message_stats['total_length'] or 0
)
async def get_model_breakdown(
self,
microdao_id: Optional[str] = None,
period_hours: int = 24
) -> List[ModelUsage]:
"""Get usage breakdown by model"""
period_start = datetime.utcnow() - timedelta(hours=period_hours)
period_end = datetime.utcnow()
async with self.db_pool.acquire() as conn:
rows = await conn.fetch("""
SELECT
model,
provider,
COUNT(*) as calls,
SUM(total_tokens) as tokens,
AVG(latency_ms) as latency_avg
FROM usage_llm
WHERE timestamp >= $1 AND timestamp <= $2
AND ($3::text IS NULL OR microdao_id = $3)
GROUP BY model, provider
ORDER BY tokens DESC
LIMIT 20
""", period_start, period_end, microdao_id)
return [
ModelUsage(
model=row['model'],
provider=row['provider'],
calls=row['calls'],
tokens=row['tokens'] or 0,
avg_latency_ms=float(row['latency_avg'] or 0)
)
for row in rows
]
async def get_agent_breakdown(
self,
microdao_id: Optional[str] = None,
period_hours: int = 24
) -> List[AgentUsage]:
"""Get usage breakdown by agent"""
period_start = datetime.utcnow() - timedelta(hours=period_hours)
period_end = datetime.utcnow()
async with self.db_pool.acquire() as conn:
rows = await conn.fetch("""
SELECT
a.agent_id,
COUNT(DISTINCT a.event_id) as invocations,
COALESCE(SUM(a.llm_calls), 0) as llm_calls,
COALESCE(SUM(a.tool_calls), 0) as tool_calls,
COALESCE(llm.tokens, 0) as total_tokens,
COALESCE(msg.messages, 0) as messages_sent
FROM usage_agent a
LEFT JOIN (
SELECT agent_id, SUM(total_tokens) as tokens
FROM usage_llm
WHERE timestamp >= $1 AND timestamp <= $2
AND ($3::text IS NULL OR microdao_id = $3)
GROUP BY agent_id
) llm ON llm.agent_id = a.agent_id
LEFT JOIN (
SELECT actor_id, COUNT(*) as messages
FROM usage_message
WHERE timestamp >= $1 AND timestamp <= $2
AND actor_type = 'agent'
AND ($3::text IS NULL OR microdao_id = $3)
GROUP BY actor_id
) msg ON msg.actor_id = a.agent_id
WHERE a.timestamp >= $1 AND a.timestamp <= $2
AND ($3::text IS NULL OR a.microdao_id = $3)
GROUP BY a.agent_id, llm.tokens, msg.messages
ORDER BY invocations DESC
LIMIT 20
""", period_start, period_end, microdao_id)
return [
AgentUsage(
agent_id=row['agent_id'],
invocations=row['invocations'],
llm_calls=row['llm_calls'],
tool_calls=row['tool_calls'],
messages_sent=row['messages_sent'],
total_tokens=row['total_tokens']
)
for row in rows
]
async def get_tool_breakdown(
self,
microdao_id: Optional[str] = None,
period_hours: int = 24
) -> List[ToolUsage]:
"""Get usage breakdown by tool"""
period_start = datetime.utcnow() - timedelta(hours=period_hours)
period_end = datetime.utcnow()
async with self.db_pool.acquire() as conn:
rows = await conn.fetch("""
SELECT
tool_id,
tool_name,
COUNT(*) as calls,
AVG(CASE WHEN success THEN 1.0 ELSE 0.0 END) as success_rate,
AVG(latency_ms) as latency_avg
FROM usage_tool
WHERE timestamp >= $1 AND timestamp <= $2
AND ($3::text IS NULL OR microdao_id = $3)
GROUP BY tool_id, tool_name
ORDER BY calls DESC
LIMIT 20
""", period_start, period_end, microdao_id)
return [
ToolUsage(
tool_id=row['tool_id'],
tool_name=row['tool_name'],
calls=row['calls'],
success_rate=float(row['success_rate'] or 0),
avg_latency_ms=float(row['latency_avg'] or 0)
)
for row in rows
]

services/usage-engine/collectors.py

@@ -0,0 +1,184 @@
"""
Usage Event Collectors (NATS Listeners)
Collects usage events from various services via NATS
"""
import json
import asyncio
import asyncpg
from datetime import datetime
from typing import Optional
from models import (
LlmUsageEvent,
ToolUsageEvent,
AgentInvocationEvent,
MessageUsageEvent,
UsageEventType
)
class UsageCollector:
"""Collects and stores usage events from NATS"""
def __init__(self, nats_client, db_pool: asyncpg.Pool):
self.nc = nats_client
self.db_pool = db_pool
self.subscriptions = []
async def start(self):
"""Subscribe to all usage subjects"""
print("🎧 Starting usage collectors...")
# Subscribe to LLM usage
sub_llm = await self.nc.subscribe("usage.llm", cb=self._handle_llm_event)
self.subscriptions.append(sub_llm)
print("✅ Subscribed to usage.llm")
# Subscribe to Tool usage
sub_tool = await self.nc.subscribe("usage.tool", cb=self._handle_tool_event)
self.subscriptions.append(sub_tool)
print("✅ Subscribed to usage.tool")
# Subscribe to Agent invocations
sub_agent = await self.nc.subscribe("usage.agent", cb=self._handle_agent_event)
self.subscriptions.append(sub_agent)
print("✅ Subscribed to usage.agent")
# Subscribe to Message events
sub_message = await self.nc.subscribe("messaging.message.created", cb=self._handle_message_event)
self.subscriptions.append(sub_message)
print("✅ Subscribed to messaging.message.created")
print("🎧 All collectors active")
async def stop(self):
"""Unsubscribe from all subjects"""
for sub in self.subscriptions:
await sub.unsubscribe()
print("🛑 All collectors stopped")
# ========================================================================
# Event Handlers
# ========================================================================
async def _handle_llm_event(self, msg):
"""Handle LLM usage event"""
try:
data = json.loads(msg.data.decode())
event = LlmUsageEvent(**data)
await self._store_llm_event(event)
print(f"📊 LLM usage: {event.model} | {event.total_tokens} tokens | {event.latency_ms}ms")
except Exception as e:
print(f"❌ Error handling LLM event: {e}")
async def _handle_tool_event(self, msg):
"""Handle tool usage event"""
try:
data = json.loads(msg.data.decode())
event = ToolUsageEvent(**data)
await self._store_tool_event(event)
print(f"📊 Tool usage: {event.tool_id} | success={event.success} | {event.latency_ms}ms")
except Exception as e:
print(f"❌ Error handling tool event: {e}")
async def _handle_agent_event(self, msg):
"""Handle agent invocation event"""
try:
data = json.loads(msg.data.decode())
event = AgentInvocationEvent(**data)
await self._store_agent_event(event)
print(f"📊 Agent invocation: {event.agent_id} | {event.duration_ms}ms | LLM:{event.llm_calls} Tool:{event.tool_calls}")
except Exception as e:
print(f"❌ Error handling agent event: {e}")
async def _handle_message_event(self, msg):
"""Handle message sent event"""
try:
data = json.loads(msg.data.decode())
# Convert messaging event to usage event
event = MessageUsageEvent(
event_id=data.get("message_id", "unknown"),
timestamp=datetime.fromisoformat(data.get("created_at", datetime.utcnow().isoformat())),
actor_id=data.get("sender_id", "unknown"),
actor_type=data.get("sender_type", "human"),
microdao_id=data.get("microdao_id", "unknown"),
channel_id=data.get("channel_id", "unknown"),
message_length=len(data.get("content_preview", "")),
metadata={"matrix_event_id": data.get("matrix_event_id")}
)
await self._store_message_event(event)
print(f"📊 Message sent: {event.actor_id} | {event.message_length} chars")
except Exception as e:
print(f"❌ Error handling message event: {e}")
# ========================================================================
# Database Storage
# ========================================================================
async def _store_llm_event(self, event: LlmUsageEvent):
"""Store LLM usage event to database"""
async with self.db_pool.acquire() as conn:
await conn.execute("""
INSERT INTO usage_llm
(event_id, timestamp, actor_id, actor_type, agent_id, microdao_id,
model, provider, prompt_tokens, completion_tokens, total_tokens,
latency_ms, success, error, metadata)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15)
ON CONFLICT (event_id) DO NOTHING
""",
event.event_id, event.timestamp, event.actor_id, event.actor_type.value,
event.agent_id, event.microdao_id, event.model, event.provider,
event.prompt_tokens, event.completion_tokens, event.total_tokens,
event.latency_ms, event.success, event.error,
json.dumps(event.metadata or {})
)
async def _store_tool_event(self, event: ToolUsageEvent):
"""Store tool usage event to database"""
async with self.db_pool.acquire() as conn:
await conn.execute("""
INSERT INTO usage_tool
(event_id, timestamp, actor_id, actor_type, agent_id, microdao_id,
tool_id, tool_name, success, latency_ms, error, metadata)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12)
ON CONFLICT (event_id) DO NOTHING
""",
event.event_id, event.timestamp, event.actor_id, event.actor_type.value,
event.agent_id, event.microdao_id, event.tool_id, event.tool_name,
event.success, event.latency_ms, event.error,
json.dumps(event.metadata or {})
)
async def _store_agent_event(self, event: AgentInvocationEvent):
"""Store agent invocation event to database"""
async with self.db_pool.acquire() as conn:
await conn.execute("""
INSERT INTO usage_agent
(event_id, timestamp, agent_id, microdao_id, channel_id,
trigger, duration_ms, llm_calls, tool_calls, success, error, metadata)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12)
ON CONFLICT (event_id) DO NOTHING
""",
event.event_id, event.timestamp, event.agent_id, event.microdao_id,
event.channel_id, event.trigger, event.duration_ms, event.llm_calls,
event.tool_calls, event.success, event.error,
json.dumps(event.metadata or {})
)
async def _store_message_event(self, event: MessageUsageEvent):
"""Store message usage event to database"""
async with self.db_pool.acquire() as conn:
await conn.execute("""
INSERT INTO usage_message
(event_id, timestamp, actor_id, actor_type, microdao_id, channel_id,
message_length, metadata)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
ON CONFLICT (event_id) DO NOTHING
""",
event.event_id, event.timestamp, event.actor_id, event.actor_type.value,
event.microdao_id, event.channel_id, event.message_length,
json.dumps(event.metadata or {})
)

services/usage-engine/main.py

@@ -0,0 +1,221 @@
"""
DAARION Usage Engine
Port: 7013
Collects and reports usage metrics (LLM, Tools, Agents, Messages)
"""
import os
import asyncio
import asyncpg
import nats
from contextlib import asynccontextmanager
from fastapi import FastAPI, HTTPException, Query
from fastapi.middleware.cors import CORSMiddleware
from typing import Optional
from models import UsageQueryRequest, UsageQueryResponse
from collectors import UsageCollector
from aggregators import UsageAggregator
# ============================================================================
# Configuration
# ============================================================================
DATABASE_URL = os.getenv("DATABASE_URL", "postgresql://postgres:postgres@localhost:5432/daarion")
NATS_URL = os.getenv("NATS_URL", "nats://nats:4222")
# ============================================================================
# Global State
# ============================================================================
db_pool: Optional[asyncpg.Pool] = None
nc: Optional[nats.NATS] = None
collector: Optional[UsageCollector] = None
aggregator: Optional[UsageAggregator] = None
# ============================================================================
# App Setup
# ============================================================================
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Startup and shutdown"""
global db_pool, nc, collector, aggregator
print("🚀 Starting Usage Engine...")
# Database
db_pool = await asyncpg.create_pool(DATABASE_URL, min_size=2, max_size=10)
print("✅ Database pool created")
# NATS
try:
nc = await nats.connect(NATS_URL)
print(f"✅ Connected to NATS at {NATS_URL}")
except Exception as e:
print(f"❌ Failed to connect to NATS: {e}")
nc = None
# Collector
if nc:
collector = UsageCollector(nc, db_pool)
await collector.start()
else:
print("⚠️ NATS not available, collector disabled")
# Aggregator
aggregator = UsageAggregator(db_pool)
print("✅ Aggregator ready")
print("✅ Usage Engine ready")
yield
# Shutdown
print("🛑 Shutting down Usage Engine...")
if collector:
await collector.stop()
if nc:
await nc.close()
if db_pool:
await db_pool.close()
app = FastAPI(
title="DAARION Usage Engine",
version="1.0.0",
description="Usage tracking and reporting for LLM, Tools, Agents",
lifespan=lifespan
)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# ============================================================================
# API Endpoints
# ============================================================================
@app.get("/internal/usage/summary", response_model=UsageQueryResponse)
async def get_usage_summary(
microdao_id: Optional[str] = Query(None),
agent_id: Optional[str] = Query(None),
period_hours: int = Query(24, ge=1, le=720)
):
"""
Get aggregated usage summary
Query parameters:
- microdao_id: Filter by microDAO (optional)
- agent_id: Filter by agent (optional)
- period_hours: Time period (1-720 hours, default 24)
"""
if not aggregator:
raise HTTPException(500, "Aggregator not initialized")
# Get summary
summary = await aggregator.get_summary(
microdao_id=microdao_id,
agent_id=agent_id,
period_hours=period_hours
)
# Get breakdowns
models = await aggregator.get_model_breakdown(
microdao_id=microdao_id,
period_hours=period_hours
)
agents = await aggregator.get_agent_breakdown(
microdao_id=microdao_id,
period_hours=period_hours
)
tools = await aggregator.get_tool_breakdown(
microdao_id=microdao_id,
period_hours=period_hours
)
return UsageQueryResponse(
summary=summary,
models=models,
agents=agents,
tools=tools
)
@app.get("/internal/usage/models")
async def get_model_usage(
microdao_id: Optional[str] = Query(None),
period_hours: int = Query(24, ge=1, le=720)
):
"""Get usage breakdown by model"""
if not aggregator:
raise HTTPException(500, "Aggregator not initialized")
models = await aggregator.get_model_breakdown(
microdao_id=microdao_id,
period_hours=period_hours
)
return {"models": models}
@app.get("/internal/usage/agents")
async def get_agent_usage(
microdao_id: Optional[str] = Query(None),
period_hours: int = Query(24, ge=1, le=720)
):
"""Get usage breakdown by agent"""
if not aggregator:
raise HTTPException(500, "Aggregator not initialized")
agents = await aggregator.get_agent_breakdown(
microdao_id=microdao_id,
period_hours=period_hours
)
return {"agents": agents}
@app.get("/internal/usage/tools")
async def get_tool_usage(
microdao_id: Optional[str] = Query(None),
period_hours: int = Query(24, ge=1, le=720)
):
"""Get usage breakdown by tool"""
if not aggregator:
raise HTTPException(500, "Aggregator not initialized")
tools = await aggregator.get_tool_breakdown(
microdao_id=microdao_id,
period_hours=period_hours
)
return {"tools": tools}
@app.get("/health")
async def health():
"""Health check"""
return {
"status": "ok",
"service": "usage-engine",
"nats_connected": nc is not None,
"collector_active": collector is not None,
"aggregator_ready": aggregator is not None
}
# ============================================================================
# Run
# ============================================================================
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=7013)

services/usage-engine/models.py

@@ -0,0 +1,161 @@
"""
Usage Engine Data Models
Tracks LLM calls, tool executions, agent invocations
"""
from pydantic import BaseModel, Field
from datetime import datetime
from typing import Optional, Dict, Any
from enum import Enum
class UsageEventType(str, Enum):
LLM_CALL = "llm_call"
TOOL_CALL = "tool_call"
AGENT_INVOCATION = "agent_invocation"
MESSAGE_SENT = "message_sent"
class ActorType(str, Enum):
HUMAN = "human"
AGENT = "agent"
SERVICE = "service"
# ============================================================================
# Usage Events (inbound from NATS)
# ============================================================================
class LlmUsageEvent(BaseModel):
"""LLM call usage event from llm-proxy"""
event_id: str
timestamp: datetime
actor_id: str
actor_type: ActorType
agent_id: Optional[str] = None
microdao_id: Optional[str] = None
model: str
provider: str # "openai", "deepseek", "local"
prompt_tokens: int
completion_tokens: int
total_tokens: int
latency_ms: int
success: bool = True
error: Optional[str] = None
metadata: Optional[Dict[str, Any]] = None
class ToolUsageEvent(BaseModel):
"""Tool execution usage event from toolcore"""
event_id: str
timestamp: datetime
actor_id: str
actor_type: ActorType
agent_id: Optional[str] = None
microdao_id: Optional[str] = None
tool_id: str
tool_name: str
success: bool
latency_ms: int
error: Optional[str] = None
metadata: Optional[Dict[str, Any]] = None
class AgentInvocationEvent(BaseModel):
"""Agent invocation usage event from agent-runtime"""
event_id: str
timestamp: datetime
agent_id: str
microdao_id: Optional[str] = None
channel_id: Optional[str] = None
trigger: str # "message", "scheduled", "manual"
duration_ms: int
llm_calls: int = 0
tool_calls: int = 0
success: bool = True
error: Optional[str] = None
metadata: Optional[Dict[str, Any]] = None
class MessageUsageEvent(BaseModel):
"""Message sent usage event from messaging-service"""
event_id: str
timestamp: datetime
actor_id: str
actor_type: ActorType
microdao_id: str
channel_id: str
message_length: int
metadata: Optional[Dict[str, Any]] = None
# ============================================================================
# Aggregated Usage Reports (outbound API)
# ============================================================================
class UsageSummary(BaseModel):
"""Aggregated usage summary"""
period_start: datetime
period_end: datetime
microdao_id: Optional[str] = None
agent_id: Optional[str] = None
# LLM stats
llm_calls_total: int = 0
llm_tokens_total: int = 0
llm_tokens_prompt: int = 0
llm_tokens_completion: int = 0
llm_latency_avg_ms: float = 0.0
# Tool stats
tool_calls_total: int = 0
tool_calls_success: int = 0
tool_calls_failed: int = 0
tool_latency_avg_ms: float = 0.0
# Agent stats
agent_invocations_total: int = 0
agent_invocations_success: int = 0
agent_invocations_failed: int = 0
# Message stats
messages_sent: int = 0
messages_total_length: int = 0
class ModelUsage(BaseModel):
"""Usage by model"""
model: str
provider: str
calls: int
tokens: int
avg_latency_ms: float
class AgentUsage(BaseModel):
"""Usage by agent"""
agent_id: str
invocations: int
llm_calls: int
tool_calls: int
messages_sent: int
total_tokens: int
class ToolUsage(BaseModel):
"""Usage by tool"""
tool_id: str
tool_name: str
calls: int
success_rate: float
avg_latency_ms: float
# ============================================================================
# API Request/Response Models
# ============================================================================
class UsageQueryRequest(BaseModel):
"""Request for usage summary"""
microdao_id: Optional[str] = None
agent_id: Optional[str] = None
period_hours: int = Field(24, ge=1, le=720) # 1h - 30 days
class UsageQueryResponse(BaseModel):
"""Response for usage summary"""
summary: UsageSummary
models: list[ModelUsage] = []
agents: list[AgentUsage] = []
tools: list[ToolUsage] = []

services/usage-engine/requirements.txt

@@ -0,0 +1,10 @@
fastapi==0.109.0
uvicorn[standard]==0.27.0
pydantic==2.5.3
asyncpg==0.29.0
nats-py==2.6.0
httpx==0.26.0  # used by the Dockerfile HEALTHCHECK
python-multipart==0.0.6