Experience Bus Phase-4 (Gateway Hooks)

Scope

  • Source: gateway (Telegram webhook path).
  • Emits agent.experience.v1.<agent_id> events with:
    • source="gateway"
    • request_id/correlation_id
    • policy.sowa_decision + normalized reason
    • feedback.user_signal (none|positive|negative|retry|timeout)
  • Optionally appends each event to the agent_experience_events table (fail-open: a DB error never fails the webhook).
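
To make the subject pattern and event fields above concrete, here is a minimal shell sketch. Both helper names are hypothetical, the JSON nesting is an assumption inferred from the dotted field names, and the sample values passed in (such as a decision of "allow") are placeholders rather than values documented here.

```shell
# Hypothetical helper: build the NATS subject for one agent.
# The default prefix mirrors EXPERIENCE_SUBJECT_PREFIX from this document.
experience_subject() {
  local agent_id="$1"
  printf '%s.%s\n' "${EXPERIENCE_SUBJECT_PREFIX:-agent.experience.v1}" "$agent_id"
}

# Hypothetical helper: render a minimal event body.
# Only the field names come from this document; the nesting is assumed.
# Values are not JSON-escaped, so pass plain tokens only.
experience_event() {
  local request_id="$1" sowa_decision="$2" reason="$3" user_signal="$4"
  printf '{"source":"gateway","request_id":"%s","policy":{"sowa_decision":"%s","reason":"%s"},"feedback":{"user_signal":"%s"}}\n' \
    "$request_id" "$sowa_decision" "$reason" "$user_signal"
}
```

For example, `experience_subject helion` yields the subject used by the helion endpoint in the smoke test.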

Environment (gateway)

  • NATS_URL=nats://nats:4222
  • EXPERIENCE_BUS_ENABLED=true
  • EXPERIENCE_ENABLE_NATS=true
  • EXPERIENCE_ENABLE_DB=true
  • EXPERIENCE_STREAM_NAME=EXPERIENCE
  • EXPERIENCE_SUBJECT_PREFIX=agent.experience.v1
  • EXPERIENCE_DATABASE_URL=postgresql://<user>:<pass>@<host>:5432/daarion_memory
  • GATEWAY_USER_SIGNAL_RETRY_WINDOW_SECONDS=30
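
The retry window can be illustrated with a small classifier. This is a sketch under assumptions: the rule "same text from the same chat within the window counts as a retry" is inferred from the smoke test, the inclusive boundary is a guess, and `classify_retry` is a hypothetical name, not the gateway's actual function.

```shell
# Hypothetical classifier: prints "retry" when the same text repeats
# within the retry window, otherwise "none".
# Usage: classify_retry PREV_TEXT PREV_TS CUR_TEXT CUR_TS  (timestamps in seconds)
classify_retry() {
  local prev_text="$1" prev_ts="$2" cur_text="$3" cur_ts="$4"
  local window="${GATEWAY_USER_SIGNAL_RETRY_WINDOW_SECONDS:-30}"
  local delta=$((cur_ts - prev_ts))
  # Assumed rule: identical text, non-negative gap, gap no longer than the window.
  if [ "$prev_text" = "$cur_text" ] && [ "$delta" -ge 0 ] && [ "$delta" -le "$window" ]; then
    echo retry
  else
    echo none
  fi
}
```

With the default of 30 seconds, two identical messages 5 seconds apart classify as a retry, while the same pair 31 seconds apart does not.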

Metrics

  • gateway_experience_published_total{status="ok|err"}
  • gateway_policy_decisions_total{sowa_decision,reason}
  • gateway_user_signal_total{user_signal}
  • gateway_webhook_latency_ms
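
A single grep with alternation succeeds as soon as any one metric is present. The sketch below checks each of the four metric names separately. `check_gateway_metrics` is a hypothetical helper, and it assumes the endpoint serves the Prometheus text exposition format.

```shell
# Hypothetical check: read metrics text on stdin and report every
# expected metric name that has no sample line.
check_gateway_metrics() {
  local body missing=0 name
  body="$(cat)"
  for name in gateway_experience_published_total gateway_policy_decisions_total \
              gateway_user_signal_total gateway_webhook_latency_ms; do
    # Match sample lines only (name at line start), not "# HELP" comments.
    # A prefix match also accepts suffixed series such as *_count or *_bucket.
    if ! printf '%s\n' "$body" | grep -q "^${name}"; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (against the gateway from the smoke test):
#   curl -sS http://127.0.0.1:9300/metrics | check_gateway_metrics
```

Depending on the metrics client library, a labeled counter may have no sample line until it is first incremented, so run this after sending at least one webhook.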

Correlation contract

  • Gateway creates one request_id (also used as the correlation_id) per webhook cycle.
  • Gateway forwards it to the router via:
    • metadata.request_id
    • metadata.trace_id
    • X-Request-Id header
  • Router writes the same request_id into its event payload, so gateway and router records can be joined.
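
The contract above can be sketched in shell as follows. All three function names are hypothetical, the ID format is an assumption (this document does not specify one), the top-level "metadata" key is an assumed body shape, and ROUTER_URL is a placeholder because the router endpoint is not named here.

```shell
# Hypothetical ID generator; the real gateway's ID format is not specified.
new_request_id() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen | tr 'A-Z' 'a-z'
  elif [ -r /proc/sys/kernel/random/uuid ]; then
    cat /proc/sys/kernel/random/uuid
  else
    # Fallback: timestamp plus PID, unique enough for a smoke test only.
    printf 'req-%s-%s\n' "$(date +%s)" "$$"
  fi
}

# Render the metadata object with the same ID in both fields.
router_metadata() {
  printf '{"metadata":{"request_id":"%s","trace_id":"%s"}}\n' "$1" "$1"
}

# Forward the ID three ways: two body fields plus the X-Request-Id header.
# Fails fast if ROUTER_URL is unset instead of guessing an endpoint.
forward_to_router() {
  local rid="$1"
  curl -sS -X POST "${ROUTER_URL:?set ROUTER_URL to the router endpoint}" \
    -H "content-type: application/json" \
    -H "X-Request-Id: $rid" \
    -d "$(router_metadata "$rid")"
}
```

Carrying one value in all three places means the join still works if the router reads only the header or only the body.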

Smoke

# 1) Send webhook payload (agent-specific endpoint; "дякую" is Ukrainian for "thank you")
curl -sS -X POST "http://127.0.0.1:9300/helion/telegram/webhook" \
  -H "content-type: application/json" \
  -d '{
    "update_id": 900001,
    "message": {
      "message_id": 101,
      "date": 1760000000,
      "text": "дякую",
      "chat": {"id": "smoke-chat-1", "type": "private"},
      "from": {"id": 7001, "username": "smoke_user", "is_bot": false}
    }
  }'

# 2) Retry signal: send the same text twice in quick succession ("перевір" is Ukrainian for "check")
curl -sS -X POST "http://127.0.0.1:9300/helion/telegram/webhook" \
  -H "content-type: application/json" \
  -d '{
    "update_id": 900002,
    "message": {
      "message_id": 102,
      "date": 1760000005,
      "text": "перевір",
      "chat": {"id": "smoke-chat-1", "type": "private"},
      "from": {"id": 7001, "username": "smoke_user", "is_bot": false}
    }
  }'

curl -sS -X POST "http://127.0.0.1:9300/helion/telegram/webhook" \
  -H "content-type: application/json" \
  -d '{
    "update_id": 900003,
    "message": {
      "message_id": 103,
      "date": 1760000010,
      "text": "перевір",
      "chat": {"id": "smoke-chat-1", "type": "private"},
      "from": {"id": 7001, "username": "smoke_user", "is_bot": false}
    }
  }'

# 3) Verify metrics
curl -sS http://127.0.0.1:9300/metrics | grep -E 'gateway_experience_published_total|gateway_policy_decisions_total|gateway_user_signal_total|gateway_webhook_latency_ms'

# 4) Verify DB rows
docker exec dagi-postgres psql -U daarion -d daarion_memory -tAc \
  "SELECT count(*) FROM agent_experience_events WHERE source='gateway' AND ts > now()-interval '10 minutes';"

# 5) Verify correlation join (gateway <-> router)
docker exec dagi-postgres psql -U daarion -d daarion_memory -P pager=off -c \
  "SELECT source, agent_id, request_id, task_type, ts
     FROM agent_experience_events
    WHERE ts > now()-interval '10 minutes'
      AND source IN ('gateway','router')
 ORDER BY ts DESC LIMIT 40;"
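
The query in step 5 lists recent rows but leaves the join to the reader. A stricter check, sketched below, prints only the request_id values that appear under both sources. `joined_request_ids` is a hypothetical helper, and it assumes psql's default unaligned output with "|" as the field separator.

```shell
# Hypothetical checker: stdin is "source|request_id" rows (psql -tA output).
# Prints each request_id seen under both gateway and router, sorted.
joined_request_ids() {
  awk -F'|' '
    $2 == ""        { next }        # skip rows without a request_id
    $1 == "gateway" { gw[$2] = 1 }
    $1 == "router"  { rt[$2] = 1 }
    END { for (id in gw) if (id in rt) print id }
  ' | sort
}

# Usage (same container and database as the steps above):
#   docker exec dagi-postgres psql -U daarion -d daarion_memory -tAc \
#     "SELECT source, request_id FROM agent_experience_events
#       WHERE ts > now()-interval '10 minutes'
#         AND source IN ('gateway','router');" | joined_request_ids
```

Empty output after running steps 1 and 2 would suggest the router is not receiving or not recording the gateway's request_id.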

Acceptance

  • Gateway publishes and stores events without blocking the webhook path.
  • request_id joins the gateway and router records for the same conversation turn.
  • policy.sowa_decision and feedback.user_signal are present in the raw gateway event.
  • If NATS or the DB is unavailable, the webhook still returns its normal success response (telemetry is fail-open).
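
The fail-open behavior in the last criterion can be sketched as a wrapper that swallows telemetry failures. This illustrates the pattern only and is not the gateway's implementation. `fail_open`, `publish_event`, and `handle_webhook` are hypothetical names, and `publish_event` is a stand-in that always fails to simulate NATS or the DB being down.

```shell
# Hypothetical wrapper: run a telemetry command, log any failure to stderr,
# and always return success so the caller's path is unaffected.
fail_open() {
  if ! "$@"; then
    echo "telemetry failed (ignored): $*" >&2
  fi
  return 0
}

# Stand-in publisher that always fails, simulating an outage.
publish_event() {
  return 1
}

# Hypothetical handler: telemetry is attempted, the response is unchanged.
handle_webhook() {
  fail_open publish_event "$1"
  echo "200 OK"
}
```

This wrapper covers only the failure case. To keep a slow backend from delaying the response, the telemetry call would also need a time bound or to run off the request path.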