# AISTALK ↔ Sofiia Console — Integration Contract

Version: 1.0
Date: 2026-02-25
Status: **STUB READY** — integration pending AISTALK implementation

---

## Overview

AISTALK connects to Sofiia Console BFF (`sofiia-console`, port 8002) over two transport types: a WebSocket event stream (BFF → AISTALK) and HTTP endpoints (AISTALK → BFF):

| Channel | Direction | Protocol |
|---|---|---|
| `/ws/events` | BFF → AISTALK | WebSocket (text/JSON) |
| `/api/chat/send` | AISTALK → BFF | HTTP POST |
| `/api/voice/stt` | AISTALK → BFF | HTTP POST multipart |
| `/api/voice/tts` | AISTALK → BFF | HTTP POST → audio stream |

---

## 1. WebSocket Event Stream: `/ws/events`

AISTALK connects as a subscriber and receives all platform events in real time.

### Connection

```
ws://<host>:8002/ws/events
```

Optional auth header (if `SOFIIA_CONSOLE_API_KEY` is set):

```
X-API-Key: <key>
```

### Keep-alive (ping/pong)

The client should send `{"type":"ping"}` every 10–30 s. The server responds with `{"type":"pong","ts":"..."}`.

### Event Envelope

Every event has this shape:

```json
{
  "v": 1,
  "type": "<event_type>",
  "ts": "2026-02-25T12:34:56.789Z",
  "project_id": "default",
  "session_id": "sess_abc123",
  "user_id": "console_user",
  "data": { ... }
}
```

### Event Types AISTALK Should Consume

#### `chat.message` — user sent a message

```json
{ "data": { "text": "...", "provider": "ollama|router", "model": "ollama:glm-4.7-flash:32k" } }
```

#### `chat.reply` — Sofiia replied

```json
{ "data": { "text": "...", "provider": "ollama|router", "model": "...", "latency_ms": 1234 } }
```

> AISTALK should TTS this text (if the voice channel is active) via `/api/voice/tts`.

#### `voice.stt` — STT lifecycle

```json
{ "data": { "phase": "start|done|error", "elapsed_ms": 456 } }
```

> AISTALK uses `phase=start` to mute its own mic and `phase=done` to unmute.

#### `voice.tts` — TTS lifecycle

```json
{ "data": { "phase": "start|done|error", "voice": "Polina", "elapsed_ms": 789 } }
```

> AISTALK uses `phase=start` to begin audio playback and `phase=done` as the end signal.
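The consumption rules for the event types above can be sketched as a minimal dispatcher. This is an illustrative sketch, not the AISTALK implementation; the action labels (`mute_mic`, `tts:...`) and the `dispatch` helper are hypothetical names:

```python
import json

# Event types from this contract that AISTALK acts on; anything else is
# ignored for forward compatibility — unknown types must never crash AISTALK.
CONSUMED = {"chat.message", "chat.reply", "voice.stt", "voice.tts", "pong"}

def dispatch(raw: str):
    """Parse one /ws/events frame and return an action label, or None to ignore."""
    event = json.loads(raw)
    etype = event.get("type", "")
    if etype not in CONSUMED:
        return None                       # unknown / internal event: ignore
    data = event.get("data", {})
    if etype == "chat.reply":
        return "tts:" + data["text"]      # speak the reply via /api/voice/tts
    phase = data.get("phase")
    if etype in ("voice.stt", "voice.tts") and phase == "start":
        return "mute_mic"                 # avoid echo / self-transcription
    if etype in ("voice.stt", "voice.tts") and phase == "done":
        return "unmute_mic"
    return "log"                          # other consumed events: just observe

frame = '{"v":1,"type":"chat.reply","ts":"2026-02-25T12:34:56Z","data":{"text":"..."}}'
print(dispatch(frame))                    # tts:...
```

Routing on `type` first and `data.phase` second keeps the mic mute/unmute logic in one place, which matters for the echo-cancellation requirement in section 9.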
#### `ops.run` — governance operation result

```json
{ "data": { "name": "risk_dashboard|pressure_dashboard|backlog_generate_weekly|release_check", "ok": true, "elapsed_ms": 999 } }
```

#### `nodes.status` — node network heartbeat (every 15 s)

```json
{
  "data": {
    "bff_uptime_s": 3600,
    "ws_clients": 2,
    "nodes": [
      {"id": "NODA1", "online": true, "router_ok": true, "router_latency_ms": 12},
      {"id": "NODA2", "online": true, "router_ok": true, "router_latency_ms": 5}
    ],
    "nodes_ts": "2026-02-25T12:34:50Z"
  }
}
```

#### `error` — platform error

```json
{ "data": { "where": "bff|router|memory|ollama", "message": "...", "code": "optional_code" } }
```

### Event Types AISTALK Should Ignore

- `tool.called` / `tool.result` — internal governance, not relevant for voice
- Any `type` not listed above — for forward compatibility, AISTALK must not crash on unknown types

---

## 2. Sending Text to Sofiia: `POST /api/chat/send`

AISTALK sends user text (transcribed from voice or typed):

```http
POST http://<host>:8002/api/chat/send
Content-Type: application/json
X-API-Key: <key>

{
  "message": "Sofiia, покажи risk dashboard",
  "model": "ollama:glm-4.7-flash:32k",
  "project_id": "aistalk",
  "session_id": "aistalk_sess_<uuid>",
  "user_id": "aistalk_user",
  "provider": "ollama"
}
```

Response:

```json
{
  "ok": true,
  "project_id": "aistalk",
  "session_id": "aistalk_sess_...",
  "user_id": "aistalk_user",
  "response": "Ось Risk Dashboard...",
  "model": "ollama:glm-4.7-flash:32k",
  "backend": "ollama",
  "meta": {"latency_ms": 1234, "tokens_est": 87}
}
```

AISTALK should use the text in the `response` field for TTS.

---

## 3. Speech-to-Text: `POST /api/voice/stt`

```http
POST http://<host>:8002/api/voice/stt?session_id=<session_id>&project_id=<project_id>
Content-Type: multipart/form-data
X-API-Key: <key>

audio=<audio file>
```

Response:

```json
{
  "text": "Sofiia, покажи risk dashboard",
  "language": "uk",
  "segments": [...]
}
```

Audio constraints:

- Max size: no hard limit, but keep each chunk under 10 MB
- Format: `audio/webm` (Opus) or `audio/wav`
- Duration: up to 60 s per chunk

---

## 4. Text-to-Speech: `POST /api/voice/tts`

```http
POST http://<host>:8002/api/voice/tts
Content-Type: application/json
X-API-Key: <key>

{
  "text": "Ось Risk Dashboard для gateway...",
  "voice": "default",
  "speed": 1.0,
  "session_id": "aistalk_sess_...",
  "project_id": "aistalk"
}
```

Response: `audio/wav` binary stream (or `audio/mpeg`).

Voice options (Ukrainian):

| voice | description |
|---|---|
| `default` | Polina Neural (uk-UA, edge-tts) |
| `Ostap` | Ostap Neural (uk-UA, edge-tts) |
| `Milena` | Milena (macOS, fallback) |
| `Yuri` | Yuri (macOS, fallback) |

Text limit: 500 characters per call (enforced by the BFF). Split longer responses.

---

## 5. AISTALK Adapter Interface (BFF-side stub)

File: `services/sofiia-console/app/adapters/aistalk.py`

```python
class AISTALKAdapter:
    def send_text(self, project_id, session_id, text) -> None: ...
    def send_audio(self, project_id, session_id, audio_bytes, mime) -> None: ...
    def handle_event(self, event: dict) -> None: ...  # called on chat.reply, ops.run, etc.
    def on_event(self, event: dict) -> None: ...      # alias
```

Activation:

```env
AISTALK_ENABLED=true
AISTALK_URL=http://<host>:<port>
AISTALK_API_KEY=<key>
```

Currently the adapter is a **noop stub** that only logs. Replace `send_text` / `send_audio` / `handle_event` with actual HTTP/WebSocket calls to the AISTALK bridge when ready.

---

## 6. Session Identity

AISTALK must use consistent `project_id` and `session_id` across all calls in one conversation:

```
project_id: "aistalk"          # fixed
session_id: "aistalk_<uuid>"   # new UUID per conversation
user_id:    "aistalk_user"     # fixed, or a per-user identity
```

This ensures memory continuity in memory-service and proper WS event filtering.

---

## 7. Rate Limits (BFF enforces)

| Endpoint | Limit |
|---|---|
| `/api/chat/send` | 30 req/min per IP |
| `/api/voice/stt` | 20 req/min per IP |
| `/api/voice/tts` | 30 req/min per IP |

AISTALK should implement backoff on HTTP 429.

---

## 8. Hello World Verification

```bash
# 1. Connect WS
wscat -c ws://localhost:8002/ws/events

# 2. Send a message
curl -X POST http://localhost:8002/api/chat/send \
  -H "Content-Type: application/json" \
  -d '{"message":"привіт Sofiia","model":"ollama:glm-4.7-flash:32k","project_id":"aistalk","session_id":"test_001","user_id":"aistalk_user"}'

# 3. The WS client should receive chat.message + chat.reply events

# 4. TTS test
curl -X POST http://localhost:8002/api/voice/tts \
  -H "Content-Type: application/json" \
  -d '{"text":"Привіт! Я Sofiia.","voice":"default"}' \
  --output test.wav && afplay test.wav
```

---

## 9. Full-Duplex Voice Flow (AISTALK sequence)

```
User speaks
  → AISTALK records audio
  → POST /api/voice/stt   (receives text)
  → POST /api/chat/send   (receives reply text)
  → POST /api/voice/tts   (receives audio)
  → AISTALK plays audio

WS events observed:
  voice.stt {phase:start} → voice.stt {phase:done}
  → chat.message → chat.reply
  → voice.tts {phase:start} → voice.tts {phase:done}
```

Echo cancellation: AISTALK must mute its microphone during TTS playback (`voice.tts phase=start` → mute, `phase=done` → unmute).
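The STT → chat → TTS sequence can be sketched as one function. This is a sketch under assumptions: the `voice_turn` and `fake_post` names are illustrative, and the HTTP transport is injected as a callable (a real client would POST to the endpoints in sections 2–4 with the `X-API-Key` header and multipart audio for STT):

```python
from typing import Callable

def voice_turn(audio: bytes, post: Callable, session_id: str) -> bytes:
    """One full-duplex turn: STT -> chat -> TTS, per sections 2-4.

    `post(path, payload)` returns parsed JSON for STT/chat and raw audio
    bytes for TTS (an assumption of this sketch).
    """
    ids = {"project_id": "aistalk", "session_id": session_id,
           "user_id": "aistalk_user"}
    text = post("/api/voice/stt", {"audio": audio, **ids})["text"]
    reply = post("/api/chat/send", {"message": text, **ids})["response"]
    # The BFF enforces a 500-char limit per TTS call, so split long replies.
    audio_out = b""
    for i in range(0, len(reply), 500):
        audio_out += post("/api/voice/tts",
                          {"text": reply[i:i + 500], "voice": "default", **ids})
    return audio_out

# Stub transport so the flow can be exercised without a running BFF.
def fake_post(path, payload):
    if path == "/api/voice/stt":
        return {"text": "привіт Sofiia"}
    if path == "/api/chat/send":
        return {"response": "Привіт!"}
    return b"RIFFWAV"   # /api/voice/tts returns an audio byte stream

print(voice_turn(b"\x00", fake_post, "aistalk_sess_test"))  # b'RIFFWAV'
```

Injecting the transport keeps the turn logic (including the 500-character TTS chunking) testable independently of the network; the same function then also gives a natural place to hook the 429 backoff from section 7.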