
AISTALK ↔ Sofiia Console — Integration Contract

Version: 1.0
Date: 2026-02-25
Status: STUB READY — integration pending AISTALK implementation


Overview

AISTALK connects to Sofiia Console BFF (sofiia-console, port 8002) via one WebSocket channel and three HTTP endpoints:

| Channel | Direction | Protocol |
| --- | --- | --- |
| /ws/events | BFF → AISTALK | WebSocket (text/JSON) |
| /api/chat/send | AISTALK → BFF | HTTP POST |
| /api/voice/stt | AISTALK → BFF | HTTP POST multipart |
| /api/voice/tts | AISTALK → BFF | HTTP POST → audio stream |

1. WebSocket Event Stream: /ws/events

AISTALK connects as a subscriber to receive all platform events in real time.

Connection

ws://<BFF_HOST>:8002/ws/events

Optional auth header (if SOFIIA_CONSOLE_API_KEY is set):

X-API-Key: <key>

Keep-alive (ping/pong)

Client should send {"type":"ping"} every 10-30 s.
Server responds with {"type":"pong","ts":"..."}.
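
A minimal sketch of the keep-alive frames, assuming they travel as JSON text frames on the same socket. The helper names `make_ping` and `is_pong` and the 20 s interval are illustrative assumptions, not part of the contract; the actual send/receive calls depend on the WebSocket client AISTALK uses.

```python
import json

PING_INTERVAL_S = 20  # assumption: one value inside the intended ping window


def make_ping() -> str:
    """Serialize the keep-alive frame the client sends."""
    return json.dumps({"type": "ping"}, separators=(",", ":"))


def is_pong(raw: str) -> bool:
    """Return True if a received text frame is the server's pong reply."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(msg, dict) and msg.get("type") == "pong"
```

Pong frames carry only `type` and `ts`, so a client would likely filter them out with `is_pong` before treating a frame as a platform event.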

Event Envelope

Every event has this shape:

{
  "v": 1,
  "type": "<event_type>",
  "ts": "2026-02-25T12:34:56.789Z",
  "project_id": "default",
  "session_id": "sess_abc123",
  "user_id": "console_user",
  "data": { ... }
}
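
The envelope above could be parsed on the AISTALK side roughly as follows. This is a sketch under assumptions: the `Event` dataclass and `parse_event` are hypothetical names, and treating every top-level field except `data` as required is a reading of "every event has this shape", not something the BFF is confirmed to guarantee.

```python
import json
from dataclasses import dataclass, field
from typing import Any

REQUIRED = ("v", "type", "ts", "project_id", "session_id", "user_id")


@dataclass(frozen=True)
class Event:
    v: int
    type: str
    ts: str
    project_id: str
    session_id: str
    user_id: str
    data: dict[str, Any] = field(default_factory=dict)


def parse_event(raw: str) -> Event:
    """Decode one WS text frame into an Event, rejecting incomplete envelopes."""
    obj = json.loads(raw)
    if not isinstance(obj, dict):
        raise ValueError("envelope must be a JSON object")
    missing = [k for k in REQUIRED if k not in obj]
    if missing:
        raise ValueError(f"envelope missing fields: {missing}")
    return Event(**{k: obj[k] for k in REQUIRED}, data=obj.get("data") or {})
```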

Event Types AISTALK Should Consume

chat.message — user sent a message

{
  "data": {
    "text": "...",
    "provider": "ollama|router",
    "model": "ollama:glm-4.7-flash:32k"
  }
}

chat.reply — Sofiia replied

{
  "data": {
    "text": "...",
    "provider": "ollama|router",
    "model": "...",
    "latency_ms": 1234
  }
}

AISTALK should convert this text to speech via /api/voice/tts when the voice channel is active.

voice.stt — STT lifecycle

{
  "data": {
    "phase": "start|done|error",
    "elapsed_ms": 456
  }
}

AISTALK uses phase=start to mute its own mic; phase=done to unmute.

voice.tts — TTS lifecycle

{
  "data": {
    "phase": "start|done|error",
    "voice": "Polina",
    "elapsed_ms": 789
  }
}

AISTALK uses phase=start to begin audio playback; phase=done as end signal.

ops.run — governance operation result

{
  "data": {
    "name": "risk_dashboard|pressure_dashboard|backlog_generate_weekly|release_check",
    "ok": true,
    "elapsed_ms": 999
  }
}

nodes.status — node network heartbeat (every 15s)

{
  "data": {
    "bff_uptime_s": 3600,
    "ws_clients": 2,
    "nodes": [
      {"id": "NODA1", "online": true, "router_ok": true, "router_latency_ms": 12},
      {"id": "NODA2", "online": true, "router_ok": true, "router_latency_ms": 5}
    ],
    "nodes_ts": "2026-02-25T12:34:50Z"
  }
}

error — platform error

{
  "data": {
    "where": "bff|router|memory|ollama",
    "message": "...",
    "code": "optional_code"
  }
}

Event Types AISTALK Should Ignore

  • tool.called / tool.result — internal governance, not relevant for voice
  • Any type not listed above — forward compatibility, AISTALK must not crash on unknown types
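
One way to honor the consume/ignore split is a small dispatcher that silently skips anything it does not recognize. This is a sketch only: `dispatch`, `CONSUMED`, and the handler signature are illustrative assumptions.

```python
from typing import Callable

Handler = Callable[[dict], None]

# Event types this contract asks AISTALK to consume.
CONSUMED = {
    "chat.message", "chat.reply", "voice.stt", "voice.tts",
    "ops.run", "nodes.status", "error",
}


def dispatch(event: dict, handlers: dict[str, Handler]) -> bool:
    """Route an event to its handler; return False if it was ignored."""
    etype = event.get("type")
    if etype not in CONSUMED:
        return False  # tool.called, tool.result, and unknown future types
    handler = handlers.get(etype)
    if handler is None:
        return False  # consumed type, but nothing registered for it
    handler(event.get("data", {}))
    return True
```

Returning a boolean instead of raising keeps unknown types from ever reaching an exception path, which is the forward-compatibility behavior required above.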

2. Sending Text to Sofiia: POST /api/chat/send

AISTALK sends user text (transcribed from voice or typed):

POST http://<BFF_HOST>:8002/api/chat/send
Content-Type: application/json
X-API-Key: <key>

{
  "message": "Sofiia, покажи risk dashboard",
  "model": "ollama:glm-4.7-flash:32k",
  "project_id": "aistalk",
  "session_id": "aistalk_sess_<uuid>",
  "user_id": "aistalk_user",
  "provider": "ollama"
}

Response:

{
  "ok": true,
  "project_id": "aistalk",
  "session_id": "aistalk_sess_...",
  "user_id": "aistalk_user",
  "response": "Ось Risk Dashboard...",
  "model": "ollama:glm-4.7-flash:32k",
  "backend": "ollama",
  "meta": {"latency_ms": 1234, "tokens_est": 87}
}

AISTALK should use the response field text for TTS.
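
A sketch of building this request and extracting the reply with the Python standard library. `build_chat_request` and `reply_text` are hypothetical helpers; treating a falsy `ok` as failure is an assumption, since the contract only shows the success shape.

```python
import json
import urllib.request
from typing import Optional


def build_chat_request(base_url: str, message: str, session_id: str,
                       api_key: Optional[str] = None) -> urllib.request.Request:
    """Prepare (but do not send) the POST /api/chat/send request."""
    body = {
        "message": message,
        "model": "ollama:glm-4.7-flash:32k",
        "project_id": "aistalk",
        "session_id": session_id,
        "user_id": "aistalk_user",
        "provider": "ollama",
    }
    headers = {"Content-Type": "application/json"}
    if api_key:  # the header is only needed when the BFF has a key configured
        headers["X-API-Key"] = api_key
    return urllib.request.Request(
        f"{base_url}/api/chat/send",
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def reply_text(response_body: bytes) -> str:
    """Pull the text to speak out of the JSON response."""
    payload = json.loads(response_body)
    if not payload.get("ok"):  # assumption: failures carry ok=false
        raise RuntimeError("chat/send reported failure")
    return payload["response"]
```

The request would then be sent with `urllib.request.urlopen(req)` or any other HTTP client.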


3. Speech-to-Text: POST /api/voice/stt

POST http://<BFF_HOST>:8002/api/voice/stt?session_id=<sid>&project_id=<pid>
Content-Type: multipart/form-data
X-API-Key: <key>

audio=<binary; MIME: audio/webm or audio/wav>

Response:

{
  "text": "Sofiia, покажи risk dashboard",
  "language": "uk",
  "segments": [...]
}

Audio constraints:

  • Max size: no hard limit, but keep under 10MB per chunk
  • Format: audio/webm (Opus) or audio/wav
  • Duration: up to 60s per chunk
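
The constraints above can be checked client-side before uploading. A minimal sketch, assuming "10MB" means 10 MiB and that MIME parameters such as `;codecs=opus` may be present; `check_chunk` is a hypothetical helper.

```python
MAX_CHUNK_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB
MAX_CHUNK_SECONDS = 60.0
ALLOWED_MIME = {"audio/webm", "audio/wav"}


def check_chunk(size_bytes: int, duration_s: float, mime: str) -> list[str]:
    """Return a list of constraint violations; an empty list means OK."""
    problems = []
    base_mime = mime.split(";", 1)[0].strip().lower()  # drop codec parameters
    if base_mime not in ALLOWED_MIME:
        problems.append(f"unsupported format: {base_mime}")
    if size_bytes >= MAX_CHUNK_BYTES:
        problems.append("chunk is 10 MB or larger")
    if duration_s > MAX_CHUNK_SECONDS:
        problems.append("chunk is longer than 60 s")
    return problems
```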

4. Text-to-Speech: POST /api/voice/tts

POST http://<BFF_HOST>:8002/api/voice/tts
Content-Type: application/json
X-API-Key: <key>

{
  "text": "Ось Risk Dashboard для gateway...",
  "voice": "default",
  "speed": 1.0,
  "session_id": "aistalk_sess_...",
  "project_id": "aistalk"
}

Response: audio/wav binary stream (or audio/mpeg).

Voice options (Ukrainian):

| Voice | Description |
| --- | --- |
| default | Polina Neural (uk-UA, edge-tts) |
| Ostap | Ostap Neural (uk-UA, edge-tts) |
| Milena | Milena (macOS, fallback) |
| Yuri | Yuri (macOS, fallback) |

Text limit: 500 chars per call (BFF enforces). Split longer responses.
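
Splitting a long reply for the 500-character limit might look like this. It is a sketch, not the BFF's own logic: `split_for_tts` is a hypothetical helper, and counting Python characters (code points) is an assumption about how the BFF measures the limit.

```python
MAX_TTS_CHARS = 500


def split_for_tts(text: str, limit: int = MAX_TTS_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, cutting at spaces when possible."""
    chunks = []
    rest = text.strip()
    while len(rest) > limit:
        cut = rest.rfind(" ", 0, limit + 1)  # last space at index <= limit
        if cut <= 0:
            cut = limit  # no space available: hard cut mid-word
        chunks.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks
```

Each chunk would then go out as its own POST /api/voice/tts call, played back in order.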


5. AISTALK Adapter Interface (BFF-side stub)

File: services/sofiia-console/app/adapters/aistalk.py

class AISTALKAdapter:
    def send_text(self, project_id, session_id, text) -> None: ...
    def send_audio(self, project_id, session_id, audio_bytes, mime) -> None: ...
    def handle_event(self, event: dict) -> None: ...   # called on chat.reply, ops.run, etc.
    def on_event(self, event: dict) -> None: ...       # alias for handle_event

Activation:

AISTALK_ENABLED=true
AISTALK_URL=http://<aistalk-bridge>:<port>
AISTALK_API_KEY=<optional>

Currently the adapter is a no-op stub that only logs. Replace send_text / send_audio / handle_event with actual HTTP/WebSocket calls to the AISTALK bridge when ready.
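
Reading the activation variables could be sketched as below. `AistalkConfig` and `load_config` are hypothetical names, and accepting only the literal value `true` (case-insensitive) as "enabled" is an assumption; the real adapter may parse the flag differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AistalkConfig:
    enabled: bool
    url: Optional[str]
    api_key: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> AistalkConfig:
    """Build the adapter config from environment-style key/value pairs."""
    enabled = env.get("AISTALK_ENABLED", "").strip().lower() == "true"
    return AistalkConfig(
        enabled=enabled,
        url=env.get("AISTALK_URL") or None,          # empty string counts as unset
        api_key=env.get("AISTALK_API_KEY") or None,  # optional
    )
```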


6. Session Identity

AISTALK must use consistent project_id and session_id across all calls in one conversation:

project_id: "aistalk"           # fixed
session_id: "aistalk_sess_<uuid>"  # new UUID per conversation
user_id:    "aistalk_user"      # fixed or per-user identity

This ensures memory continuity in memory-service and proper WS event filtering.
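
A sketch of generating one identity per conversation and reusing it for every call. `Identity` and `new_conversation` are hypothetical names; the `aistalk_sess_` prefix follows the request examples in this contract.

```python
import uuid
from dataclasses import dataclass

SESSION_PREFIX = "aistalk_sess_"  # prefix as used in the request examples


@dataclass(frozen=True)
class Identity:
    project_id: str
    session_id: str
    user_id: str


def new_conversation(user_id: str = "aistalk_user") -> Identity:
    """Create the identity triple to reuse across chat, STT, and TTS calls."""
    return Identity(
        project_id="aistalk",
        session_id=f"{SESSION_PREFIX}{uuid.uuid4()}",
        user_id=user_id,
    )
```

Making the dataclass frozen guards against accidentally changing `session_id` in the middle of a conversation.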


7. Rate Limits (BFF enforces)

| Endpoint | Limit |
| --- | --- |
| /api/chat/send | 30 req/min per IP |
| /api/voice/stt | 20 req/min per IP |
| /api/voice/tts | 30 req/min per IP |

AISTALK should implement backoff on HTTP 429.
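
Backoff on HTTP 429 could be implemented along these lines. This is a sketch with assumed parameters (1 s base delay, 30 s cap, 5 attempts); the contract does not prescribe a schedule, and `call_with_backoff` takes a hypothetical `send` callable that performs the request and returns the HTTP status code.

```python
import time
from typing import Callable


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential delay for retry number `attempt` (0-based), capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(send: Callable[[], int], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Call `send` until it returns something other than 429 or attempts run out."""
    status = send()
    for attempt in range(max_attempts - 1):
        if status != 429:
            break
        sleep(backoff_delay(attempt))
        status = send()
    return status
```

Injecting `sleep` keeps the helper testable; adding random jitter to the delay is a common refinement when several clients share one IP.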


8. Hello World Verification

# 1. Connect WS
wscat -c ws://localhost:8002/ws/events

# 2. Send a message
curl -X POST http://localhost:8002/api/chat/send \
  -H "Content-Type: application/json" \
  -d '{"message":"привіт Sofiia","model":"ollama:glm-4.7-flash:32k","project_id":"aistalk","session_id":"test_001","user_id":"aistalk_user"}'

# 3. WS should receive chat.message + chat.reply events

# 4. TTS test
curl -X POST http://localhost:8002/api/voice/tts \
  -H "Content-Type: application/json" \
  -d '{"text":"Привіт! Я Sofiia.","voice":"default"}' \
  --output test.wav && afplay test.wav   # afplay is macOS-only; use another player elsewhere

9. Full-Duplex Voice Flow (AISTALK sequence)

User speaks
  → AISTALK records audio
  → POST /api/voice/stt  (receives text)
  → POST /api/chat/send  (receives reply text)
  → POST /api/voice/tts  (receives audio)
  → AISTALK plays audio

WS events observed:
  voice.stt {phase:start} → voice.stt {phase:done}
  → chat.message → chat.reply
  → voice.tts {phase:start} → voice.tts {phase:done}

Echo cancellation: AISTALK must mute its microphone during TTS playback (voice.tts phase=start → mute, phase=done → unmute).
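
The mute rule can be driven directly from the WS stream. A minimal sketch: `MicGate` is a hypothetical class, and unmuting on `phase=error` is an assumption, since the contract only specifies `start` and `done`.

```python
class MicGate:
    """Tracks whether the microphone should be muted, based on voice.tts events."""

    def __init__(self) -> None:
        self.muted = False

    def on_event(self, event: dict) -> bool:
        """Update state from one WS event and return the current mute flag."""
        if event.get("type") == "voice.tts":
            phase = event.get("data", {}).get("phase")
            if phase == "start":
                self.muted = True   # playback begins: stop capturing
            elif phase in ("done", "error"):
                self.muted = False  # assumption: error also ends playback
        return self.muted
```

Usage: feed every decoded event to `gate.on_event(event)` and apply the returned flag to the audio capture device.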