# Experience Bus Phase-4 (Gateway Hooks)
## Scope
- Source: `gateway` (Telegram webhook path).
- Emits `agent.experience.v1.<agent_id>` events with:
  - `source="gateway"`
  - `request_id`/`correlation_id`
  - `policy.sowa_decision` + normalized `reason`
  - `feedback.user_signal` (`none|positive|negative|retry|timeout`)
- Optional DB append to `agent_experience_events` (fail-open).
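The event fields listed above can be sketched in Python as follows. This is an illustration only, not the gateway's actual code: the helper names `build_subject` and `build_gateway_event`, the exact nesting of fields (including placing `reason` under `policy`), and the placeholder decision and reason values are assumptions. Only the subject prefix and the field names come from this document.

```python
import json
import uuid

SUBJECT_PREFIX = "agent.experience.v1"  # mirrors EXPERIENCE_SUBJECT_PREFIX
USER_SIGNALS = {"none", "positive", "negative", "retry", "timeout"}


def build_subject(agent_id, prefix=SUBJECT_PREFIX):
    """Return the per-agent subject, e.g. agent.experience.v1.helion."""
    return f"{prefix}.{agent_id}"


def build_gateway_event(agent_id, request_id, sowa_decision, reason, user_signal="none"):
    """Assemble a gateway event dict (hypothetical layout, see lead-in)."""
    if user_signal not in USER_SIGNALS:
        raise ValueError(f"unknown user_signal: {user_signal!r}")
    return {
        "source": "gateway",
        "agent_id": agent_id,
        # The document treats request_id and correlation_id as the same value.
        "request_id": request_id,
        "correlation_id": request_id,
        "policy": {"sowa_decision": sowa_decision, "reason": reason},
        "feedback": {"user_signal": user_signal},
    }


if __name__ == "__main__":
    # "example_decision" and "example_reason" are placeholders, not real values.
    event = build_gateway_event(
        "helion", uuid.uuid4().hex, "example_decision", "example_reason", "positive"
    )
    print(build_subject("helion"))
    print(json.dumps(event, indent=2))
```

Rejecting unknown `user_signal` values at construction time keeps the `gateway_user_signal_total` label set bounded to the five values listed above.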
## Environment (gateway)
- `NATS_URL=nats://nats:4222`
- `EXPERIENCE_BUS_ENABLED=true`
- `EXPERIENCE_ENABLE_NATS=true`
- `EXPERIENCE_ENABLE_DB=true`
- `EXPERIENCE_STREAM_NAME=EXPERIENCE`
- `EXPERIENCE_SUBJECT_PREFIX=agent.experience.v1`
- `EXPERIENCE_DATABASE_URL=postgresql://<user>:<pass>@<host>:5432/daarion_memory`
- `GATEWAY_USER_SIGNAL_RETRY_WINDOW_SECONDS=30`
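A loader for these variables might look like the sketch below. The class name `ExperienceConfig`, the fallback defaults, the set of accepted truthy strings, and the rule that `EXPERIENCE_BUS_ENABLED` acts as a master switch over the NATS and DB flags are all assumptions made for illustration. The document only lists the variable names and example values.

```python
import os
from dataclasses import dataclass


def _env_bool(env, name, default):
    """Parse common truthy strings; any other value counts as False."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class ExperienceConfig:
    nats_url: str
    bus_enabled: bool
    enable_nats: bool
    enable_db: bool
    stream_name: str
    subject_prefix: str
    database_url: str
    retry_window_seconds: int

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        bus = _env_bool(env, "EXPERIENCE_BUS_ENABLED", False)
        return cls(
            nats_url=env.get("NATS_URL", "nats://nats:4222"),
            bus_enabled=bus,
            # Assumption: a sink is active only when the master switch is on.
            enable_nats=bus and _env_bool(env, "EXPERIENCE_ENABLE_NATS", False),
            enable_db=bus and _env_bool(env, "EXPERIENCE_ENABLE_DB", False),
            stream_name=env.get("EXPERIENCE_STREAM_NAME", "EXPERIENCE"),
            subject_prefix=env.get("EXPERIENCE_SUBJECT_PREFIX", "agent.experience.v1"),
            database_url=env.get("EXPERIENCE_DATABASE_URL", ""),
            retry_window_seconds=int(
                env.get("GATEWAY_USER_SIGNAL_RETRY_WINDOW_SECONDS", "30")
            ),
        )
```

Passing a plain dict as `env` makes the loader easy to exercise in tests without touching the process environment.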
## Metrics
- `gateway_experience_published_total{status="ok|err"}`
- `gateway_policy_decisions_total{sowa_decision,reason}`
- `gateway_user_signal_total{user_signal}`
- `gateway_webhook_latency_ms`
## Correlation contract
- Gateway creates `request_id` (`correlation_id`) per webhook cycle.
- Gateway forwards it to router via:
  - `metadata.request_id`
  - `metadata.trace_id`
  - `X-Request-Id` header
- Router writes the same `request_id` into its event payload so gateway and router records can be joined.
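The contract above can be sketched as a pair of helpers. The function names, the UUID4 hex format for the id, and the shape of the router request body are assumptions. The document only fixes the three places the id travels in: `metadata.request_id`, `metadata.trace_id`, and the `X-Request-Id` header.

```python
import uuid


def new_request_id():
    """Create one id per webhook cycle (UUID4 hex is an assumed format)."""
    return uuid.uuid4().hex


def build_router_call(request_id, payload):
    """Return (headers, body) carrying the same id in all three places."""
    headers = {"X-Request-Id": request_id, "content-type": "application/json"}
    # Copy existing metadata so the caller's payload is not mutated.
    metadata = dict(payload.get("metadata", {}))
    metadata["request_id"] = request_id
    metadata["trace_id"] = request_id
    body = {**payload, "metadata": metadata}
    return headers, body


if __name__ == "__main__":
    rid = new_request_id()
    headers, body = build_router_call(rid, {"prompt": "hello"})
    print(headers["X-Request-Id"] == body["metadata"]["request_id"])
```

Generating the id once and threading it through every outgoing field is what allows the join on `request_id` in the smoke test's final query.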
## Smoke
```bash
# 1) Send webhook payload (agent-specific endpoint); "дякую" is Ukrainian for "thank you"
curl -sS -X POST "http://127.0.0.1:9300/helion/telegram/webhook" \
  -H "content-type: application/json" \
  -d '{
    "update_id": 900001,
    "message": {
      "message_id": 101,
      "date": 1760000000,
      "text": "дякую",
      "chat": {"id": "smoke-chat-1", "type": "private"},
      "from": {"id": 7001, "username": "smoke_user", "is_bot": false}
    }
  }'
# 2) Retry signal: send the same text twice in quick succession ("перевір" is Ukrainian for "check")
curl -sS -X POST "http://127.0.0.1:9300/helion/telegram/webhook" \
  -H "content-type: application/json" \
  -d '{
    "update_id": 900002,
    "message": {
      "message_id": 102,
      "date": 1760000005,
      "text": "перевір",
      "chat": {"id": "smoke-chat-1", "type": "private"},
      "from": {"id": 7001, "username": "smoke_user", "is_bot": false}
    }
  }'
curl -sS -X POST "http://127.0.0.1:9300/helion/telegram/webhook" \
  -H "content-type: application/json" \
  -d '{
    "update_id": 900003,
    "message": {
      "message_id": 103,
      "date": 1760000010,
      "text": "перевір",
      "chat": {"id": "smoke-chat-1", "type": "private"},
      "from": {"id": 7001, "username": "smoke_user", "is_bot": false}
    }
  }'
# 3) Verify metrics
curl -sS http://127.0.0.1:9300/metrics | grep -E 'gateway_experience_published_total|gateway_policy_decisions_total|gateway_user_signal_total|gateway_webhook_latency_ms'
# 4) Verify DB rows
docker exec dagi-postgres psql -U daarion -d daarion_memory -tAc \
  "SELECT count(*) FROM agent_experience_events WHERE source='gateway' AND ts > now()-interval '10 minutes';"
# 5) Verify correlation join (gateway <-> router)
docker exec dagi-postgres psql -U daarion -d daarion_memory -P pager=off -c \
  "SELECT source, agent_id, request_id, task_type, ts
   FROM agent_experience_events
   WHERE ts > now()-interval '10 minutes'
     AND source IN ('gateway','router')
   ORDER BY ts DESC LIMIT 40;"
```
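Step 2 of the smoke test depends on the gateway classifying a repeated message as `retry`. One way such detection could work is sketched below. The class name `RetryDetector`, the matching rule (same chat, same user, same whitespace-normalized lowercase text), and the in-memory store are assumptions rather than the gateway's confirmed behavior. The 30-second default mirrors `GATEWAY_USER_SIGNAL_RETRY_WINDOW_SECONDS`.

```python
import time


class RetryDetector:
    """Flag a message as a retry when a user repeats the same text in time."""

    def __init__(self, window_seconds=30.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._last = {}  # (chat_id, user_id) -> (normalized_text, timestamp)

    def classify(self, chat_id, user_id, text):
        """Return "retry" or "none" and remember this message for next time."""
        now = self._clock()
        key = (chat_id, user_id)
        normalized = " ".join(text.lower().split())
        previous = self._last.get(key)
        self._last[key] = (normalized, now)
        if previous is None:
            return "none"
        prev_text, prev_ts = previous
        if prev_text == normalized and now - prev_ts <= self.window_seconds:
            return "retry"
        return "none"


if __name__ == "__main__":
    detector = RetryDetector()
    for text in ["дякую", "перевір", "перевір"]:
        print(text, detector.classify("smoke-chat-1", 7001, text))
```

Run against the three smoke payloads in order, this sketch classifies only the third message as `retry`. It covers the retry signal alone; how the gateway derives `positive`, `negative`, or `timeout` is not described in this document.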
## Acceptance
- Gateway publishes and stores events without blocking the webhook path.
- `request_id` joins gateway and router records for the same conversation turn.
- `policy.sowa_decision` and `feedback.user_signal` are present in the gateway `raw` event.
- If NATS or the DB is unavailable, the webhook still follows its normal success path (fail-open telemetry).
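The fail-open requirement can be illustrated with a small asyncio sketch. The function names, the `sinks` mapping of name to async callable, the 0.5-second timeout, and the response shape are assumptions made for the example. Real NATS and database clients would be wrapped in such callables. The `ok`/`err` result strings mirror the `status` label on `gateway_experience_published_total`.

```python
import asyncio
import logging

log = logging.getLogger("experience_bus")


async def publish_fail_open(event, sinks, timeout=0.5):
    """Try every sink; log failures and never raise to the caller."""
    results = {}
    for name, send in sinks.items():
        try:
            await asyncio.wait_for(send(event), timeout=timeout)
            results[name] = "ok"
        except Exception as exc:  # also covers timeouts from wait_for
            log.warning("experience sink %s failed: %r", name, exc)
            results[name] = "err"
    return results


async def handle_webhook(update, sinks):
    """Build the reply first, then emit telemetry as a background task."""
    response = {"ok": True}
    event = {"source": "gateway", "update_id": update.get("update_id")}
    # The task is returned so the caller can keep a reference to it.
    task = asyncio.create_task(publish_fail_open(event, sinks))
    return response, task
```

Because every sink error is caught and the publish runs as a separate task, a NATS or database outage changes only the recorded status, not the webhook response.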