microdao-daarion/docs/CHAOS_TEST_REPORT.md
Apple ef3473db21 snapshot: NODE1 production state 2026-02-09
Complete snapshot of /opt/microdao-daarion/ from NODE1 (144.76.224.179).
This represents the actual running production code that has diverged
significantly from the previous main branch.

Key changes from old main:
- Gateway (http_api.py): expanded from ~40KB to 164KB with full agent support
- Router: new /v1/agents/{id}/infer endpoint with vision + DeepSeek routing
- Behavior Policy: SOWA v2.2 (3-level: FULL/ACK/SILENT)
- Agent Registry: config/agent_registry.yml as single source of truth
- 13 agents configured (was 3)
- Memory service integration
- CrewAI teams and roles

Excluded from snapshot: venv/, .env, data/, backups, .tgz archives

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-09 08:46:46 -08:00

Chaos Test Report

Baseline (staging)

  • Time (UTC): 2026-01-19 15:49:15
  • Streams: reported lag=0 during tests (jsz ok)

| Test | Start/End (UTC) | Max Lag | DLQ Peak | p95 Latency (ms) | Unique Success | Notes |
|---|---|---|---|---|---|---|
| A Kill Worker | 2026-01-19 15:49:12Z / 15:51:20Z | 0 | 0 | 39374.01 | 100% (50/60 unique) | restart crewai-worker |
| B Kill Router | 2026-01-19 15:51:21Z / 15:53:29Z | 0 | 0 | 6174.56 | 100% (50/60 unique) | restart router |
| C Block Postgres | 2026-01-19 15:53:30Z / 15:55:38Z | 0 | 0 | 6207.92 | 100% (50/60 unique) | stop/start postgres 60s |
| D DLQ Replay | 2026-01-19 16:08:30Z / 16:09:10Z | n/a | 1→0 | n/a | completed | forced fail + replay, job_id=dlq-test-1768839356 |
| A Kill Worker | 2026-01-19 16:56:53 / 2026-01-19 16:59:01 | 0 | 0 | 57974.01 | 100.00% | restart crewai-worker |
| B Kill Router | 2026-01-19 16:59:01 / 2026-01-19 17:01:10 | 0 | 0 | 6183.63 | 100.00% | restart router |
| C Block Postgres | 2026-01-19 17:01:10 / 2026-01-19 17:03:19 | 0 | 0 | 6206.32 | 100.00% | stop/start postgres 60s |
| D DLQ Replay | 2026-01-19 17:03:19 / 2026-01-19 17:03:32 | n/a | see log | n/a | n/a | dlq_replay.py |
| A Kill Worker | 2026-01-19 17:04:15 / 2026-01-19 17:06:24 | 0 | 0 | 76807.84 | 100.00% | restart crewai-worker |
| B Kill Router | 2026-01-19 17:06:24 / 2026-01-19 17:08:33 | 0 | 0 | 6171.86 | 100.00% | restart router |
| C Block Postgres | 2026-01-19 17:08:33 / 2026-01-19 17:10:41 | 0 | 0 | 6210.77 | 100.00% | stop/start postgres 60s |
| D DLQ Replay | 2026-01-19 17:10:41 / 2026-01-19 17:10:54 | n/a | see log | n/a | n/a | dlq_replay.py |
| A Kill Worker | 2026-01-19 17:13:25 / 2026-01-19 17:15:34 | 0 | 0 | 96020.54 | 100.00% | restart crewai-worker |
| B Kill Router | 2026-01-19 17:15:34 / 2026-01-19 17:17:43 | 0 | 0 | 6169.57 | 100.00% | restart router |
| C Block Postgres | 2026-01-19 17:17:43 / 2026-01-19 17:19:51 | 0 | 0 | 6212.49 | 100.00% | stop/start postgres 60s |
| D DLQ Replay | 2026-01-19 17:19:51 / 2026-01-19 17:20:08 | n/a | see log | n/a | completed | forced fail + replay, job_id=dlq-test-1768838617, subject=agent.run.completed.helion, replay_count=n/a |
| A Kill Worker | 2026-01-19 17:20:51 / 2026-01-19 17:23:00 | 0 | 0 | 115620.04 | 100.00% | restart crewai-worker |
| B Kill Router | 2026-01-19 17:23:00 / 2026-01-19 17:25:08 | 0 | 0 | 6175.69 | 100.00% | restart router |
| C Block Postgres | 2026-01-19 17:25:08 / 2026-01-19 17:27:17 | 0 | 0 | 5950.39 | 100.00% | stop/start postgres 60s |
| D DLQ Replay | 2026-01-19 17:27:17 / 2026-01-19 17:27:34 | n/a | see log | n/a | completed | forced fail + replay, job_id=dlq-test-1768838617, subject=agent.run.completed.helion, replay_count=n/a |
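
The report does not say how the p95 latency and unique-success figures were derived. As a minimal sketch of one plausible method, the Python below computes a nearest-rank 95th percentile and a de-duplicated success rate. The function names, the `(job_id, succeeded)` input shape, and the choice of nearest-rank are all assumptions for illustration and are not taken from the test harness.

```python
def p95_nearest_rank(latencies_ms):
    """Return the 95th percentile of the samples using the nearest-rank method.

    Assumption: the report's p95 may use a different percentile definition.
    """
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    # 1-based rank = ceil(0.95 * n), computed with integers to avoid float error.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def unique_success(results):
    """Summarize success per unique job.

    `results` is an iterable of (job_id, succeeded) pairs (hypothetical shape).
    A job counts as successful if any of its deliveries succeeded, so
    redeliveries after a restart are not counted twice.
    Returns (unique_ok, unique_total, percent).
    """
    status = {}
    for job_id, ok in results:
        status[job_id] = status.get(job_id, False) or bool(ok)
    total = len(status)
    ok_count = sum(1 for succeeded in status.values() if succeeded)
    percent = 100.0 * ok_count / total if total else 0.0
    return ok_count, total, percent


if __name__ == "__main__":
    # Illustrative values only, not data from the chaos runs.
    samples = [120.0, 135.5, 150.2, 6100.0, 142.3]
    deliveries = [("job-1", True), ("job-1", True), ("job-2", False), ("job-2", True)]
    print(p95_nearest_rank(samples))
    print(unique_success(deliveries))
```

With few samples, nearest-rank p95 returns the slowest request or close to it, so one request that waits out a worker restart can dominate the figure. That may be why the Kill Worker rows show a much higher p95 than the other scenarios, though the report does not confirm this.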