# Experience Bus Phase-3 (Router Runtime Retrieval)

## Scope

- Read path only in `router` before `/v1/agents/{id}/infer`.
- Retrieves lessons from `agent_lessons` and injects a compact block:
  - `Operational Lessons (apply if relevant)`
- Attach policy:
  - after last error / latency spike: always-on, `K=7`
  - otherwise sampled attach, default `10%`, `K=3`
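
The attach policy above can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the router's actual code: `attach_k` and its parameter names are hypothetical, and the optional `rng` argument exists only so that sampling can be made deterministic in tests.

```python
import random
from typing import Optional


def attach_k(
    recent_signal: bool,
    sample_pct: float = 10.0,
    k_sampled: int = 3,
    k_always: int = 7,
    rng: Optional[random.Random] = None,
) -> int:
    """Return how many lessons to attach; 0 means skip retrieval (hypothetical helper)."""
    if recent_signal:
        # A recent error or latency spike was seen: always attach, K=7.
        return k_always
    # Otherwise attach for roughly sample_pct percent of requests, K=3.
    roll = (rng or random).random() * 100.0
    return k_sampled if roll < sample_pct else 0
```

With the defaults, about one request in ten triggers a `K=3` retrieval, and every request after a recent failure signal gets `K=7`.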

## Environment

- `LESSONS_ATTACH_ENABLED=true`
- `LESSONS_DATABASE_URL=postgresql://<user>:<pass>@<host>:5432/daarion_memory`
- `LESSONS_ATTACH_MIN=3`
- `LESSONS_ATTACH_MAX=7`
- `LESSONS_ATTACH_SAMPLE_PCT=10`
- `LESSONS_ATTACH_TIMEOUT_MS=25`
- `LESSONS_ATTACH_MAX_CHARS=1200`
- `LESSONS_SIGNAL_CACHE_TTL_SECONDS=300`
- `EXPERIENCE_LATENCY_SPIKE_MS=5000`
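
One way to read these variables into a typed settings object is sketched below. `LessonsConfig` and `load_config` are hypothetical names, and using the values listed above as fallbacks is an assumption; the router's real defaults may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LessonsConfig:
    enabled: bool
    database_url: Optional[str]
    attach_min: int
    attach_max: int
    sample_pct: float
    timeout_ms: int
    max_chars: int
    signal_cache_ttl_s: int
    latency_spike_ms: int


def load_config(env: Optional[Mapping[str, str]] = None) -> LessonsConfig:
    """Build a config from environment variables (fallback values are assumptions)."""
    env = os.environ if env is None else env

    def num(name: str, fallback: int) -> int:
        # int() accepts both the string from the environment and the int fallback.
        return int(env.get(name, fallback))

    flag = env.get("LESSONS_ATTACH_ENABLED", "true").strip().lower()
    return LessonsConfig(
        enabled=flag in ("1", "true", "yes"),
        database_url=env.get("LESSONS_DATABASE_URL"),
        attach_min=num("LESSONS_ATTACH_MIN", 3),
        attach_max=num("LESSONS_ATTACH_MAX", 7),
        sample_pct=float(env.get("LESSONS_ATTACH_SAMPLE_PCT", 10)),
        timeout_ms=num("LESSONS_ATTACH_TIMEOUT_MS", 25),
        max_chars=num("LESSONS_ATTACH_MAX_CHARS", 1200),
        signal_cache_ttl_s=num("LESSONS_SIGNAL_CACHE_TTL_SECONDS", 300),
        latency_spike_ms=num("EXPERIENCE_LATENCY_SPIKE_MS", 5000),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test.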

## Metrics

- `lessons_retrieved_total{status="ok|timeout|err"}`
- `lessons_attached_total{count="0|1-3|4-7"}`
- `lessons_attach_latency_ms`
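
The `count` label values suggest that the number of attached lessons is bucketed before it is recorded. A minimal sketch of such a mapping follows; `attached_count_bucket` is a hypothetical helper, and folding any count above 7 into the last bucket is an assumption.

```python
def attached_count_bucket(count: int) -> str:
    """Map a number of attached lessons to a `count` label value (hypothetical helper)."""
    if count <= 0:
        return "0"
    if count <= 3:
        return "1-3"
    # Assumption: counts above 7 should not occur, so they share the last bucket.
    return "4-7"
```

Bucketing keeps the label cardinality fixed at three values regardless of `K`.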

## Safety

- Lessons block never includes raw user text.
- Guard filters skip lessons containing prompt-injection-like markers:
  - `ignore previous`, `system:`, `developer:`, fenced code blocks.
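
The guard filter and the compact block can be sketched together as follows. This is an illustration only: `is_safe_lesson` and `build_lessons_block` are hypothetical names, case-insensitive substring matching is an assumption, and so is the way the character budget (`LESSONS_ATTACH_MAX_CHARS`) is applied.

```python
from typing import Iterable

HEADER = "Operational Lessons (apply if relevant)"
FENCE = "`" * 3  # marker for a fenced code block
MARKERS = ("ignore previous", "system:", "developer:", FENCE)


def is_safe_lesson(text: str) -> bool:
    """Return False if the lesson contains a prompt-injection-like marker."""
    lowered = text.lower()
    return not any(marker in lowered for marker in MARKERS)


def build_lessons_block(lessons: Iterable[str], max_chars: int = 1200) -> str:
    """Render safe lessons under the header, stopping before max_chars is exceeded."""
    lines = [HEADER]
    used = len(HEADER)
    for text in lessons:
        if not is_safe_lesson(text):
            continue  # guard filter: skip the suspicious lesson entirely
        line = f"- {text}"
        if used + 1 + len(line) > max_chars:  # +1 for the joining newline
            break
        lines.append(line)
        used += 1 + len(line)
    # Return an empty string when no lesson survived, so nothing is injected.
    return "\n".join(lines) if len(lines) > 1 else ""
```

Skipping a flagged lesson rather than sanitizing it keeps the filter simple and avoids injecting partially cleaned text.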

## Smoke

```bash
# 1) Seed synthetic lessons for one agent (example: agromatrix)
# The SQL goes through a quoted heredoc so the double quotes inside the
# JSON literals reach psql intact instead of being consumed by the shell.
docker exec -i dagi-postgres psql -U daarion -d daarion_memory <<'SQL'
INSERT INTO agent_lessons (lesson_id, lesson_key, ts, scope, agent_id, task_type, trigger, action, avoid, signals, evidence, raw)
SELECT
  gen_random_uuid(),
  md5(random()::text || clock_timestamp()::text),
  now() - (g * interval '1 minute'),
  'agent',
  'agromatrix',
  'infer',
  'when retrying after model timeout',
  'switch provider or reduce token budget first',
  'avoid repeating the same failed provider with same payload',
  '{"error_class":"TimeoutError","provider":"deepseek","model":"deepseek-chat","profile":"reasoning"}'::jsonb,
  '{"count":3}'::jsonb,
  '{}'::jsonb
FROM generate_series(1,10) g;
SQL

# 2) Send infer calls
for i in $(seq 1 20); do
  curl -sS -m 12 -o /dev/null \
    -X POST "http://127.0.0.1:9102/v1/agents/agromatrix/infer" \
    -H "content-type: application/json" \
    -d "{\"prompt\":\"phase3-smoke-${i}\",\"metadata\":{\"agent_id\":\"agromatrix\"}}" || true
done

# 3) Check metrics
curl -sS http://127.0.0.1:9102/metrics | grep -E 'lessons_retrieved_total|lessons_attached_total|lessons_attach_latency_ms'

# 4) Simulate DB issue (optional): lessons retrieval should fail-open and infer remains 200
# (temporarily point LESSONS_DATABASE_URL to bad DSN + restart router)
```

## Acceptance

- Router logs include `lessons_attached=<k>` during sampled or always-on retrieval.
- Infer path remains healthy when the lessons DB is unavailable.
- p95 infer latency impact stays controlled at the default `10%` sampling rate.
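
The fail-open behaviour expected when the lessons DB is unavailable can be sketched with a bounded wait around the retrieval call. This is a minimal sketch, assuming an async router: `fetch_lessons_fail_open` and the `fetch` callable are hypothetical, and the returned status strings simply mirror the `status` label of `lessons_retrieved_total`.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


async def fetch_lessons_fail_open(
    fetch: Callable[[], Awaitable[List[str]]],
    timeout_ms: int = 25,
) -> Tuple[str, List[str]]:
    """Run `fetch` with a deadline; never raise, so the infer path continues."""
    try:
        lessons = await asyncio.wait_for(fetch(), timeout=timeout_ms / 1000)
        return "ok", lessons
    except asyncio.TimeoutError:
        # Retrieval exceeded the budget (LESSONS_ATTACH_TIMEOUT_MS): attach nothing.
        return "timeout", []
    except Exception:
        # Bad DSN, connection refused, query error: attach nothing.
        return "err", []
```

In every failure case the caller receives an empty list, so the request proceeds without a lessons block instead of returning an error.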