# Experience Bus Phase-3 (Router Runtime Retrieval) ## Scope - Read path only in `router` before `/v1/agents/{id}/infer`. - Retrieves lessons from `agent_lessons` and injects a compact block: - `Operational Lessons (apply if relevant)` - Attach policy: - after last error / latency spike: always-on, `K=7` - otherwise sampled attach, default `10%`, `K=3` ## Environment - `LESSONS_ATTACH_ENABLED=true` - `LESSONS_DATABASE_URL=postgresql://:@:5432/daarion_memory` - `LESSONS_ATTACH_MIN=3` - `LESSONS_ATTACH_MAX=7` - `LESSONS_ATTACH_SAMPLE_PCT=10` - `LESSONS_ATTACH_TIMEOUT_MS=25` - `LESSONS_ATTACH_MAX_CHARS=1200` - `LESSONS_SIGNAL_CACHE_TTL_SECONDS=300` - `EXPERIENCE_LATENCY_SPIKE_MS=5000` ## Metrics - `lessons_retrieved_total{status="ok|timeout|err"}` - `lessons_attached_total{count="0|1-3|4-7"}` - `lessons_attach_latency_ms` ## Safety - Lessons block never includes raw user text. - Guard filters skip lessons containing prompt-injection-like markers: - `ignore previous`, `system:`, `developer:`, fenced code blocks. ## Smoke ```bash # 1) Seed synthetic lessons for one agent (example: agromatrix) docker exec dagi-postgres psql -U daarion -d daarion_memory -c " INSERT INTO agent_lessons (lesson_id, lesson_key, ts, scope, agent_id, task_type, trigger, action, avoid, signals, evidence, raw) SELECT gen_random_uuid(), md5(random()::text || clock_timestamp()::text), now() - (g * interval '1 minute'), 'agent', 'agromatrix', 'infer', 'when retrying after model timeout', 'switch provider or reduce token budget first', 'avoid repeating the same failed provider with same payload', '{"error_class":"TimeoutError","provider":"deepseek","model":"deepseek-chat","profile":"reasoning"}'::jsonb, '{"count":3}'::jsonb, '{}'::jsonb FROM generate_series(1,10) g;" # 2) Send infer calls for i in $(seq 1 20); do curl -sS -m 12 -o /dev/null \ -X POST "http://127.0.0.1:9102/v1/agents/agromatrix/infer" \ -H "content-type: application/json" \ -d "{\"prompt\":\"phase3-smoke-${i}\",\"metadata\":{\"agent_id\":\"agromatrix\"}}" || true done # 3) Check metrics curl -sS http://127.0.0.1:9102/metrics | grep -E 'lessons_retrieved_total|lessons_attached_total|lessons_attach_latency_ms' # 4) Simulate DB issue (optional): lessons retrieval should fail-open and infer remains 200 # (temporarily point LESSONS_DATABASE_URL to bad DSN + restart router) ``` ## Acceptance - Router logs include `lessons_attached=` during sampled or always-on retrieval. - Infer path remains healthy when lessons DB is unavailable. - p95 infer latency impact stays controlled at sampling `10%`.