AISTALK ↔ Sofiia Console — Integration Contract
Version: 1.0
Date: 2026-02-25
Status: STUB READY — integration pending AISTALK implementation
Overview
AISTALK connects to the Sofiia Console BFF (sofiia-console, port 8002) through one WebSocket stream and three HTTP endpoints:
| Channel | Direction | Protocol |
|---|---|---|
| /ws/events | BFF → AISTALK | WebSocket (text/JSON) |
| /api/chat/send | AISTALK → BFF | HTTP POST |
| /api/voice/stt | AISTALK → BFF | HTTP POST multipart |
| /api/voice/tts | AISTALK → BFF | HTTP POST → audio stream |
1. WebSocket Event Stream: /ws/events
AISTALK connects as a subscriber to receive all platform events in real time.
Connection
ws://<BFF_HOST>:8002/ws/events
Optional auth header (if SOFIIA_CONSOLE_API_KEY is set):
X-API-Key: <key>
Keep-alive (ping/pong)
Client should send {"type":"ping"} every 10–30s.
Server responds with {"type":"pong","ts":"..."}.
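A minimal sketch of the client side of the keep-alive, assuming frames are exchanged as JSON text. The helper names and the 20-second default are illustrative choices inside the 10–30s window, not part of the contract.

```python
import json

PING_INTERVAL_S = 20.0  # assumption: any value within the 10-30 s window is acceptable


def ping_frame() -> str:
    """Serialize the ping message exactly as shown in the contract."""
    return json.dumps({"type": "ping"}, separators=(",", ":"))


def is_pong(raw: str) -> bool:
    """True if a received text frame is a pong reply."""
    try:
        msg = json.loads(raw)
    except ValueError:
        return False
    return isinstance(msg, dict) and msg.get("type") == "pong"


def ping_due(last_ping: float, now: float, interval: float = PING_INTERVAL_S) -> bool:
    """True once at least `interval` seconds have passed since the last ping."""
    return (now - last_ping) >= interval
```

A receive loop would call `ping_due` with a monotonic clock and send `ping_frame()` whenever it returns true.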
Event Envelope
Every event has this shape:
{
"v": 1,
"type": "<event_type>",
"ts": "2026-02-25T12:34:56.789Z",
"project_id": "default",
"session_id": "sess_abc123",
"user_id": "console_user",
"data": { ... }
}
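The envelope can be checked on the AISTALK side before any handler runs. The sketch below is one way to do that; the function name `parse_envelope`, treating every listed key as required, and rejecting versions other than 1 are assumptions made for illustration.

```python
import json

# Keys shown in the envelope above; treating all of them as required is an assumption.
REQUIRED_KEYS = ("v", "type", "ts", "project_id", "session_id", "user_id", "data")


def parse_envelope(raw: str) -> dict:
    """Decode one WebSocket text frame and check that it looks like a v1 envelope."""
    event = json.loads(raw)
    if not isinstance(event, dict):
        raise ValueError("envelope must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in event]
    if missing:
        raise ValueError(f"envelope is missing keys: {missing}")
    if event["v"] != 1:
        raise ValueError(f"unsupported envelope version: {event['v']}")
    return event
```

Pong frames carry only `type` and `ts`, so a client using this check would need to filter them out before calling it.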
Event Types AISTALK Should Consume
chat.message — user sent a message
{
"data": {
"text": "...",
"provider": "ollama|router",
"model": "ollama:glm-4.7-flash:32k"
}
}
chat.reply — Sofiia replied
{
"data": {
"text": "...",
"provider": "ollama|router",
"model": "...",
"latency_ms": 1234
}
}
AISTALK should TTS this text (if voice channel is active) via /api/voice/tts.
voice.stt — STT lifecycle
{
"data": {
"phase": "start|done|error",
"elapsed_ms": 456
}
}
AISTALK uses phase=start to mute its own mic and phase=done to unmute.
voice.tts — TTS lifecycle
{
"data": {
"phase": "start|done|error",
"voice": "Polina",
"elapsed_ms": 789
}
}
AISTALK uses phase=start to begin audio playback and phase=done as the end signal.
ops.run — governance operation result
{
"data": {
"name": "risk_dashboard|pressure_dashboard|backlog_generate_weekly|release_check",
"ok": true,
"elapsed_ms": 999
}
}
nodes.status — node network heartbeat (every 15s)
{
"data": {
"bff_uptime_s": 3600,
"ws_clients": 2,
"nodes": [
{"id": "NODA1", "online": true, "router_ok": true, "router_latency_ms": 12},
{"id": "NODA2", "online": true, "router_ok": true, "router_latency_ms": 5}
],
"nodes_ts": "2026-02-25T12:34:50Z"
}
}
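One possible use of the nodes.status payload is to surface unhealthy nodes and detect a missing heartbeat. The sketch below assumes a node counts as unhealthy when either `online` or `router_ok` is false, and that three missed 15-second intervals mean the stream is stale; both thresholds and both function names are illustrative.

```python
from typing import List

HEARTBEAT_INTERVAL_S = 15.0  # interval stated in the contract


def unhealthy_nodes(data: dict) -> List[str]:
    """Return ids of nodes that are offline or whose router check failed."""
    return [
        node.get("id", "<unknown>")
        for node in data.get("nodes", [])
        if not (node.get("online") and node.get("router_ok"))
    ]


def heartbeat_stale(last_seen: float, now: float, missed: int = 3) -> bool:
    """True when no nodes.status event arrived for more than `missed` intervals."""
    return (now - last_seen) > HEARTBEAT_INTERVAL_S * missed
```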
error — platform error
{
"data": {
"where": "bff|router|memory|ollama",
"message": "...",
"code": "optional_code"
}
}
Event Types AISTALK Should Ignore
- tool.called / tool.result: internal governance, not relevant for voice
- Any type not listed above: ignore for forward compatibility; AISTALK must not crash on unknown types
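The consume/ignore rule can be sketched as a small dispatcher that silently drops anything it does not recognize. The `dispatch` function and the handler-table shape are hypothetical; only the event type names come from the contract.

```python
from typing import Callable, Dict

Handler = Callable[[dict], None]

# Event types the contract asks AISTALK to consume.
CONSUMED_TYPES = {
    "chat.message", "chat.reply", "voice.stt", "voice.tts",
    "ops.run", "nodes.status", "error",
}


def dispatch(event: dict, handlers: Dict[str, Handler]) -> bool:
    """Route an event to its handler; return False when the event is ignored."""
    event_type = event.get("type")
    if not isinstance(event_type, str) or event_type not in CONSUMED_TYPES:
        return False  # tool.* and unknown future types are dropped without raising
    handler = handlers.get(event_type)
    if handler is None:
        return False  # consumed type, but this client registered no handler for it
    handler(event.get("data", {}))
    return True
```

Returning a boolean instead of raising keeps the receive loop alive when the BFF starts emitting new event types.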
2. Sending Text to Sofiia: POST /api/chat/send
AISTALK sends user text (transcribed from voice or typed):
POST http://<BFF_HOST>:8002/api/chat/send
Content-Type: application/json
X-API-Key: <key>
{
"message": "Sofiia, покажи risk dashboard",
"model": "ollama:glm-4.7-flash:32k",
"project_id": "aistalk",
"session_id": "aistalk_sess_<uuid>",
"user_id": "aistalk_user",
"provider": "ollama"
}
Response:
{
"ok": true,
"project_id": "aistalk",
"session_id": "aistalk_sess_...",
"user_id": "aistalk_user",
"response": "Ось Risk Dashboard...",
"model": "ollama:glm-4.7-flash:32k",
"backend": "ollama",
"meta": {"latency_ms": 1234, "tokens_est": 87}
}
AISTALK should use the response field text for TTS.
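A minimal sketch of building this request with the Python standard library and extracting the text for TTS. The function names and the `base_url` parameter are assumptions; the payload fields mirror the example above, and treating `ok: false` as an error is a guess about failure semantics the contract does not spell out.

```python
import json
import urllib.request
from typing import Optional


def build_chat_request(
    base_url: str,
    message: str,
    session_id: str,
    api_key: Optional[str] = None,
    model: str = "ollama:glm-4.7-flash:32k",
) -> urllib.request.Request:
    """Build (but do not send) the POST /api/chat/send request."""
    payload = {
        "message": message,
        "model": model,
        "project_id": "aistalk",
        "session_id": session_id,
        "user_id": "aistalk_user",
        "provider": "ollama",
    }
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["X-API-Key"] = api_key
    return urllib.request.Request(
        f"{base_url}/api/chat/send",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def reply_text(response_body: bytes) -> str:
    """Return the `response` field that AISTALK should pass to TTS."""
    body = json.loads(response_body)
    if not body.get("ok"):
        raise RuntimeError("chat/send did not return ok=true")
    return body["response"]
```

The request can then be sent with `urllib.request.urlopen(req)` and the body passed to `reply_text`.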
3. Speech-to-Text: POST /api/voice/stt
POST http://<BFF_HOST>:8002/api/voice/stt?session_id=<sid>&project_id=<pid>
Content-Type: multipart/form-data
X-API-Key: <key>
audio=<binary; MIME: audio/webm or audio/wav>
Response:
{
"text": "Sofiia, покажи risk dashboard",
"language": "uk",
"segments": [...]
}
Audio constraints:
- Max size: no hard limit, but keep under 10MB per chunk
- Format: audio/webm (Opus) or audio/wav
- Duration: up to 60s per chunk
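These constraints can be checked on the client before uploading. The sketch below is illustrative only: the function name is hypothetical, "10MB" is read as 10 MiB, and the duration is assumed to be known to the caller (for example from the recorder).

```python
from typing import List

MAX_CHUNK_BYTES = 10 * 1024 * 1024  # "10MB" read as 10 MiB; an assumption
MAX_CHUNK_SECONDS = 60.0
ALLOWED_MIME = {"audio/webm", "audio/wav"}


def audio_chunk_problems(audio: bytes, mime: str, duration_s: float) -> List[str]:
    """Return a list of constraint violations; an empty list means the chunk is fine."""
    problems = []
    # Drop parameters such as ";codecs=opus" before comparing the MIME type.
    base_mime = mime.split(";", 1)[0].strip().lower()
    if base_mime not in ALLOWED_MIME:
        problems.append(f"unsupported MIME type: {mime}")
    if len(audio) >= MAX_CHUNK_BYTES:
        problems.append(f"chunk is {len(audio)} bytes; keep it under {MAX_CHUNK_BYTES}")
    if duration_s > MAX_CHUNK_SECONDS:
        problems.append(f"chunk is {duration_s:.1f}s; limit is {MAX_CHUNK_SECONDS:.0f}s")
    return problems
```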
4. Text-to-Speech: POST /api/voice/tts
POST http://<BFF_HOST>:8002/api/voice/tts
Content-Type: application/json
X-API-Key: <key>
{
"text": "Ось Risk Dashboard для gateway...",
"voice": "default",
"speed": 1.0,
"session_id": "aistalk_sess_...",
"project_id": "aistalk"
}
Response: audio/wav binary stream (or audio/mpeg).
Voice options (Ukrainian):
| voice | description |
|---|---|
| default | Polina Neural (uk-UA, edge-tts) |
| Ostap | Ostap Neural (uk-UA, edge-tts) |
| Milena | Milena (macOS, fallback) |
| Yuri | Yuri (macOS, fallback) |
Text limit: 500 chars per call (BFF enforces). Split longer responses.
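One way to split longer responses is to cut at sentence boundaries and hard-wrap only when a single sentence exceeds the limit. The sketch below assumes the 500-character limit is counted in Python string characters; how the BFF actually counts is not stated, and `split_for_tts` is a hypothetical helper name.

```python
import re
from typing import List

MAX_TTS_CHARS = 500  # limit stated in the contract


def split_for_tts(text: str, limit: int = MAX_TTS_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would be sent as its own /api/voice/tts call and the audio played back in order.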
5. AISTALK Adapter Interface (BFF-side stub)
File: services/sofiia-console/app/adapters/aistalk.py
class AISTALKAdapter:
    def send_text(self, project_id, session_id, text) -> None: ...
    def send_audio(self, project_id, session_id, audio_bytes, mime) -> None: ...
    def handle_event(self, event: dict) -> None: ...  # called on chat.reply, ops.run etc.
    def on_event(self, event: dict) -> None: ...  # alias
Activation:
AISTALK_ENABLED=true
AISTALK_URL=http://<aistalk-bridge>:<port>
AISTALK_API_KEY=<optional>
Currently the adapter is a no-op stub that only logs. Replace send_text / send_audio / handle_event with actual HTTP/WebSocket calls to the AISTALK bridge when it is ready.
6. Session Identity
AISTALK must use consistent project_id and session_id across all calls in one conversation:
project_id: "aistalk" # fixed
session_id: "aistalk_<uuid>" # new UUID per conversation
user_id: "aistalk_user" # fixed or per-user identity
This ensures memory continuity in memory-service and proper WS event filtering.
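A small sketch of keeping the three identifiers together for the lifetime of one conversation. The `SessionIdentity` class and `new_conversation` helper are hypothetical; the session prefix follows the form shown in this section and is a parameter in case a deployment uses a different one.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionIdentity:
    """Immutable identity reused by every call in one conversation."""
    project_id: str
    session_id: str
    user_id: str

    def as_fields(self) -> dict:
        """Fields to merge into a chat/send or voice/tts JSON body."""
        return {
            "project_id": self.project_id,
            "session_id": self.session_id,
            "user_id": self.user_id,
        }


def new_conversation(user_id: str = "aistalk_user", prefix: str = "aistalk_") -> SessionIdentity:
    """Create the identity for a new conversation with a fresh UUID."""
    return SessionIdentity("aistalk", f"{prefix}{uuid.uuid4()}", user_id)
```

Freezing the dataclass makes it harder to change the session_id by accident in the middle of a conversation.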
7. Rate Limits (BFF enforces)
| Endpoint | Limit |
|---|---|
| /api/chat/send | 30 req/min per IP |
| /api/voice/stt | 20 req/min per IP |
| /api/voice/tts | 30 req/min per IP |
AISTALK should implement backoff on HTTP 429.
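The backoff could follow a capped exponential schedule, as in the sketch below. The base delay, the cap, and the optional jitter are illustrative values, and honoring a Retry-After header is an assumption: the contract does not say whether the BFF sends one.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 30.0,
    jitter: float = 0.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based) after an HTTP 429."""
    if retry_after is not None:
        try:
            # Use the server hint when it is a plain number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # e.g. an HTTP-date value; fall back to exponential backoff
    delay = min(cap, base * (2 ** attempt))
    # Optional random spread so several clients do not retry in lockstep.
    return delay + random.uniform(0.0, jitter)
```

With the defaults, successive retries wait 1, 2, 4, 8, 16 and then 30 seconds.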
8. Hello World Verification
# 1. Connect WS
wscat -c ws://localhost:8002/ws/events
# 2. Send a message
curl -X POST http://localhost:8002/api/chat/send \
-H "Content-Type: application/json" \
-d '{"message":"привіт Sofiia","model":"ollama:glm-4.7-flash:32k","project_id":"aistalk","session_id":"test_001","user_id":"aistalk_user"}'
# 3. WS should receive chat.message + chat.reply events
# 4. TTS test
curl -X POST http://localhost:8002/api/voice/tts \
-H "Content-Type: application/json" \
-d '{"text":"Привіт! Я Sofiia.","voice":"default"}' \
--output test.wav && afplay test.wav
9. Full-Duplex Voice Flow (AISTALK sequence)
User speaks
→ AISTALK records audio
→ POST /api/voice/stt (receives text)
→ POST /api/chat/send (receives reply text)
→ POST /api/voice/tts (receives audio)
→ AISTALK plays audio
WS events observed:
voice.stt {phase:start} → voice.stt {phase:done}
→ chat.message → chat.reply
→ voice.tts {phase:start} → voice.tts {phase:done}
Echo cancellation: AISTALK must mute its microphone during TTS playback (voice.tts phase=start → mute, phase=done → unmute).
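The mute rule can be expressed as a small state holder driven by WS events. The `MicGate` class is a hypothetical name, and unmuting on phase=error is an assumption added so the microphone cannot stay muted after a failed TTS call; the contract itself only names start and done.

```python
class MicGate:
    """Tracks whether the microphone should be muted, driven by voice.tts events."""

    def __init__(self) -> None:
        self.muted = False

    def on_event(self, event: dict) -> bool:
        """Update the state from one event and return the current muted flag."""
        if event.get("type") != "voice.tts":
            return self.muted  # other event types do not affect the gate
        phase = event.get("data", {}).get("phase")
        if phase == "start":
            self.muted = True
        elif phase in ("done", "error"):
            # Unmuting on "error" is an assumption; the contract names only start/done.
            self.muted = False
        return self.muted
```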