Apple
a24dae8e18
feat(matrix-bridge-dagi): add backpressure queue with N workers (H2)
...
Reader + N workers architecture:
Reader: sync_poll → rate_check → dedupe → queue.put_nowait()
Workers (WORKER_CONCURRENCY, default 2): queue.get() → invoke → send → audit
Drop policy (queue full):
- put_nowait() raises QueueFull → dropped immediately (reader never blocks)
- audit matrix.queue_full + on_queue_dropped callback
- metric: matrix_bridge_queue_dropped_total{room_id,agent_id}
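The reader/worker split and the drop policy above can be sketched roughly as follows. This is a minimal illustration, not the bridge's actual code: `reader`, `worker`, and `run_demo` are hypothetical names, the `dropped` list stands in for the audit event and the dropped counter, and the demo enqueues everything before the workers start only to make the drop deterministic.

```python
import asyncio


async def reader(events, queue, dropped):
    """Enqueue without blocking; on a full queue, record the drop and move on."""
    for event in events:
        try:
            queue.put_nowait(event)
        except asyncio.QueueFull:
            # Stand-in for the audit event and the dropped_total counter.
            dropped.append(event)


async def worker(queue, processed):
    """Pull events until cancelled; task_done() lets queue.join() track progress."""
    while True:
        event = await queue.get()
        try:
            await asyncio.sleep(0)  # stand-in for invoke -> send -> audit
            processed.append(event)
        finally:
            queue.task_done()


async def run_demo(events, max_events=3, concurrency=2):
    queue = asyncio.Queue(maxsize=max_events)
    processed, dropped = [], []
    # Demo only: fill the queue first so the overflow is predictable.
    await reader(events, queue, dropped)
    workers = [asyncio.create_task(worker(queue, processed))
               for _ in range(concurrency)]
    await queue.join()
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return processed, dropped
```

With five events and a queue of three, the first three are processed and the last two are dropped without the reader ever awaiting.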
Graceful shutdown:
1. stop_event → reader exits loop
2. queue.join() with QUEUE_DRAIN_TIMEOUT_S (default 5s) → workers finish in-flight
3. worker tasks cancelled
New config env vars:
QUEUE_MAX_EVENTS (default 100)
WORKER_CONCURRENCY (default 2)
QUEUE_DRAIN_TIMEOUT_S (default 5)
New metrics (H3 additions):
matrix_bridge_queue_size (gauge)
matrix_bridge_queue_dropped_total (counter)
matrix_bridge_queue_wait_seconds (histogram, buckets: 0.01…30s)
/health: queue.size, queue.max, queue.workers
MatrixIngressLoop: queue_size + worker_count properties
6 queue tests: enqueue/process, full-drop-audit, concurrency barrier,
graceful drain, wait metric, rate-limit-before-enqueue
Total: 71 passed
Made-with: Cursor
2026-03-05 01:07:04 -08:00
Apple
a4e95482bc
feat(matrix-bridge-dagi): add rate limiting (H1) and metrics (H3)
...
H1 — InMemoryRateLimiter (sliding window, no Redis):
- Per-room: RATE_LIMIT_ROOM_RPM (default 20/min)
- Per-sender: RATE_LIMIT_SENDER_RPM (default 10/min)
- Room checked before sender — sender quota not charged on room block
- Blocked messages: audit matrix.rate_limited + on_rate_limited callback
- reset() for ops/test, stats() exposed in /health
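The room-before-sender ordering can be illustrated with a small in-memory sliding window. This is a sketch under assumptions, not the InMemoryRateLimiter itself: the class name, the `check` signature, and the choice not to charge the room when the sender is blocked are all hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    def __init__(self, room_rpm=20, sender_rpm=10, window_s=60.0,
                 clock=time.monotonic):
        self.room_rpm, self.sender_rpm = room_rpm, sender_rpm
        self.window_s, self.clock = window_s, clock
        self._rooms = defaultdict(deque)    # room_id -> accepted timestamps
        self._senders = defaultdict(deque)  # sender  -> accepted timestamps

    def _prune(self, hits, now):
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()

    def check(self, room_id, sender):
        """Return (allowed, limit_type); limit_type is None when allowed."""
        now = self.clock()
        room_hits = self._rooms[room_id]
        self._prune(room_hits, now)
        if len(room_hits) >= self.room_rpm:
            return False, "room"  # sender quota is never touched here
        sender_hits = self._senders[sender]
        self._prune(sender_hits, now)
        if len(sender_hits) >= self.sender_rpm:
            # Assumption: a sender block does not charge the room either.
            return False, "sender"
        room_hits.append(now)
        sender_hits.append(now)
        return True, None
```

Injecting `clock` keeps the limiter testable without sleeping, which is one plausible reason to expose a `reset()` for tests as well.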
H3 — Extended Prometheus metrics:
- matrix_bridge_rate_limited_total{room_id,agent_id,limit_type}
- matrix_bridge_send_duration_seconds histogram (invoke was already there)
- matrix_bridge_invoke_duration_seconds buckets tuned for LLM latency
- matrix_bridge_rate_limiter_active_rooms/senders gauges
- on_invoke_latency + on_send_latency callbacks wired in ingress loop
16 new tests: rate limiter unit (13) + ingress integration (3)
Total: 65 passed
Made-with: Cursor
2026-03-05 00:54:14 -08:00
Apple
313d777c84
ops(nginx): finalize matrix.daarion.space HTTPS config with Synapse proxy
...
Made-with: Cursor
2026-03-05 00:42:28 -08:00
Apple
e5480e92db
ops(nginx): add matrix.daarion.space vhost config (HTTP + HTTPS template)
...
Made-with: Cursor
2026-03-03 09:00:23 -08:00
Apple
b27dd79ece
fix(sofiia-console): pass SOFIIA_INTERNAL_TOKEN env var to container
...
Made-with: Cursor
2026-03-03 08:08:23 -08:00
Apple
cad3663508
feat(matrix-bridge-dagi): add egress, audit integration, fix router endpoint (PR-M1.4)
...
Closes the full Matrix ↔ DAGI loop:
Egress:
- invoke Router POST /v1/agents/{agent_id}/infer (field: prompt, response: response)
- send_text() reply to Matrix room with idempotent txn_id = make_txn_id(room_id, event_id)
- empty reply → skip send (no spam)
- reply truncated to 4000 chars if needed
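One way the egress rules above might look in code. The commit names `make_txn_id(room_id, event_id)` but does not show its body, so the hashing scheme and the `dagi_` prefix below are assumptions; `prepare_reply` is a hypothetical helper, and stripping whitespace before the emptiness check is also assumed.

```python
import hashlib

MAX_REPLY_CHARS = 4000  # truncation limit stated in the commit message


def make_txn_id(room_id: str, event_id: str) -> str:
    """Derive a stable transaction id so a retried send reuses the same id.

    Assumption: a SHA-256 digest of room and event ids; the real scheme may differ.
    """
    digest = hashlib.sha256(f"{room_id}|{event_id}".encode("utf-8")).hexdigest()
    return f"dagi_{digest[:32]}"


def prepare_reply(text):
    """Return the text to send, or None when the reply should be skipped."""
    text = (text or "").strip()
    if not text:
        return None  # empty reply: skip the send entirely
    return text[:MAX_REPLY_CHARS]
```

Because the id depends only on the inbound event, resending after a timeout cannot produce a second reply to the same message, provided the homeserver deduplicates on the transaction id.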
Audit (via sofiia-console POST /api/audit/internal):
- matrix.message.received (on ingress)
- matrix.agent.replied (on successful reply)
- matrix.error (on router/send failure, with error_code)
- fire-and-forget: audit failures never crash the loop
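The fire-and-forget rule can be sketched as a wrapper that swallows and logs any audit failure. `safe_audit` and the `write_event` callable are hypothetical; the real bridge posts to sofiia-console, which this sketch does not model.

```python
import asyncio
import logging

log = logging.getLogger("matrix_bridge.audit")


async def safe_audit(write_event, event_type, **fields):
    """Try to write one audit event; never let a failure reach the caller.

    write_event is any awaitable callable taking (event_type, fields).
    Returns True on success and False when the write failed.
    """
    try:
        await write_event(event_type, fields)
        return True
    except Exception:
        # Log and continue: losing an audit row must not stop message handling.
        log.warning("audit write failed for %s", event_type, exc_info=True)
        return False
```

`asyncio.CancelledError` is not a subclass of `Exception` on Python 3.8+, so task cancellation still propagates through this wrapper during shutdown.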
Router URL fix:
- DAGI_GATEWAY_URL now points to dagi-router-node1:8000 (not gateway:9300)
- Session ID: stable per room — matrix:{room_localpart} (memory context)
9 tests: invoke endpoint, fallback fields, audit write, full cycle,
dedupe, empty reply skip, metric callbacks
Made-with: Cursor
2026-03-03 08:06:49 -08:00
Apple
8d564fbbe5
feat(sofiia-console): add internal audit ingest endpoint for trusted services
...
Adds POST /api/audit/internal authenticated via X-Internal-Service-Token header
(SOFIIA_INTERNAL_TOKEN env). Allows matrix-bridge-dagi and other internal services
to write audit events without team keys. Reuses existing audit_log() + db layer.
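A minimal sketch of the header check, independent of any web framework. `is_trusted_service` is a hypothetical name, and treating an unset SOFIIA_INTERNAL_TOKEN as "reject everything" is an assumption about the intended behavior.

```python
import hmac

HEADER_NAME = "X-Internal-Service-Token"


def is_trusted_service(headers, expected_token):
    """Check the internal-service header against the configured token.

    headers is a plain mapping of header name to value.
    """
    if not expected_token:
        # Assumption: with no token configured, the endpoint accepts no one.
        return False
    supplied = headers.get(HEADER_NAME, "")
    # compare_digest avoids leaking the match length through timing.
    return hmac.compare_digest(supplied.encode("utf-8"),
                               expected_token.encode("utf-8"))
```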
Made-with: Cursor
2026-03-03 08:03:49 -08:00
Apple
88bdaf214b
fix(matrix-bridge-dagi): add BRIDGE_ROOM_MAP to docker-compose env
...
Made-with: Cursor
2026-03-03 07:52:59 -08:00
Apple
dbfab78f02
feat(matrix-bridge-dagi): add room mapping, ingress loop, synapse setup (PR-M1.2 + PR-M1.3)
...
PR-M1.2 — room-to-agent mapping:
- adds room_mapping.py: parse BRIDGE_ROOM_MAP (format: agent:!room_id:server)
- RoomMappingConfig with O(1) room→agent lookup, agent allowlist check
- /bridge/mappings endpoint (read-only ops summary, no secrets)
- health endpoint now includes mappings_count
- 21 tests for parsing, validation, allowlist, summary
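A parsing sketch for the `agent:!room_id:server` format. The commit does not say how multiple entries are separated, so the comma separator is an assumption, as are the function name and the exact validation rules.

```python
def parse_room_map(raw, allowed_agents):
    """Parse 'agent:!room_id:server' entries into a room -> agent dict.

    Assumption: entries are comma-separated. Raises ValueError on bad input.
    """
    mapping = {}
    for entry in (part.strip() for part in raw.split(",")):
        if not entry:
            continue
        # Split on the first colon only: the room id itself contains a colon.
        agent, sep, room = entry.partition(":")
        if not sep or not agent:
            raise ValueError(f"malformed entry: {entry!r}")
        if not room.startswith("!") or ":" not in room:
            raise ValueError(f"invalid room id: {room!r}")
        if agent not in allowed_agents:
            raise ValueError(f"agent not in allowlist: {agent!r}")
        if room in mapping:
            raise ValueError(f"room mapped twice: {room!r}")
        mapping[room] = agent
    return mapping
```

Keying the dict by room id gives the O(1) room-to-agent lookup mentioned above.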
PR-M1.3 — Matrix ingress loop:
- adds ingress.py: MatrixIngressLoop asyncio task
- sync_poll → extract → dedupe → _invoke_gateway (POST /v1/invoke)
- gateway payload: agent_id, node_id, message, metadata (transport, room_id, event_id, sender)
- exponential backoff on errors (2s..60s)
- joins all mapped rooms at startup
- metric callbacks: on_message_received, on_gateway_error
- graceful shutdown via asyncio.Event
- 5 ingress tests (invoke, dedupe, callbacks, empty-map idle)
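The 2s..60s backoff might be generated like this. The doubling factor is an assumption (the commit gives only the bounds), and `backoff_delays` and `first_delays` are hypothetical names.

```python
from itertools import islice


def backoff_delays(base_s=2.0, cap_s=60.0):
    """Yield retry delays that double from base_s and saturate at cap_s."""
    delay = base_s
    while True:
        yield delay
        delay = min(delay * 2, cap_s)


def first_delays(n):
    """Convenience for inspection: the first n delays as a list."""
    return list(islice(backoff_delays(), n))
```

A loop would typically create a fresh generator after each successful sync so the delay resets to the base value.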
Synapse setup (docker-compose.synapse-node1.yml):
- fixed volume: bind mount ./synapse-data instead of named volume
- added port mapping 127.0.0.1:8008:8008
Synapse running on NODA1 (localhost:8008), bot @dagi_bridge:daarion.space created,
room !QwHczWXgefDHBEVkTH:daarion.space created, all 4 values in .env on NODA1.
Made-with: Cursor
2026-03-03 07:51:13 -08:00
Apple
d8506da179
feat(matrix-bridge-dagi): add matrix client wrapper and synapse setup (PR-M1.1)
...
- adds MatrixClient with send_text/sync_poll/join_room/whoami (idempotent via txn_id)
- LRU dedupe for incoming event_ids (2048 capacity)
- exponential backoff retry (max 3 attempts) for 429/5xx/network errors
- extract_room_messages: filters own messages, non-text, duplicates
- health endpoint now probes matrix_reachable + gateway_reachable at startup
- adds docker-compose.synapse-node1.yml (Synapse + Postgres for NODA1)
- adds ops/runbook-matrix-setup.md (10-step setup: DNS, config, bot, room, .env)
- 19 tests passing, no real Synapse required
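The LRU dedupe for event ids can be sketched with an `OrderedDict`. The class and method names are hypothetical; only the 2048 capacity comes from the commit message.

```python
from collections import OrderedDict


class LRUDedupe:
    """Remember the most recent event ids, evicting the least recently seen."""

    def __init__(self, capacity=2048):
        self.capacity = capacity
        self._seen = OrderedDict()

    def seen_before(self, event_id):
        """Return True for a repeat; otherwise record the id and return False."""
        if event_id in self._seen:
            self._seen.move_to_end(event_id)  # refresh recency
            return True
        self._seen[event_id] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest entry
        return False
```

The bounded size trades perfect deduplication for constant memory: an id older than the last 2048 distinct events would be treated as new again.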
Made-with: Cursor
2026-03-03 07:38:54 -08:00
Apple
1d8482f4c1
feat(matrix-bridge-dagi): scaffold service with health, metrics and config (PR-M1.0)
...
New service: services/matrix-bridge-dagi/
- app/config.py: BridgeConfig dataclass, load_config() with full env validation
(MATRIX_HOMESERVER_URL, MATRIX_ACCESS_TOKEN, MATRIX_USER_ID, SOFIIA_ROOM_ID,
DAGI_GATEWAY_URL, SOFIIA_CONSOLE_URL, SOFIIA_INTERNAL_TOKEN, rate limits)
- app/main.py: FastAPI app with lifespan, GET /health, GET /metrics (prometheus)
health returns: ok, node_id, homeserver, bridge_user, sofiia_room_id,
allowed_agents, gateway, uptime_s; graceful error state when config missing
- requirements.txt: fastapi, uvicorn, httpx, prometheus-client, pyyaml
- Dockerfile: python:3.11-slim, port 7030, BUILD_SHA/BUILD_TIME args
docker-compose.matrix-bridge-node1.yml:
- standalone override file (node1 network, port 127.0.0.1:7030)
- all env vars wired: MATRIX_*, SOFIIA_ROOM_ID, DAGI_GATEWAY_URL,
SOFIIA_CONSOLE_URL, SOFIIA_INTERNAL_TOKEN, rate limit policy
- healthcheck, restart: unless-stopped
DoD: config validates, health/metrics respond, imports clean
Made-with: Cursor
2026-03-03 07:28:24 -08:00
Apple
5994a3a56f
feat(node-capabilities): add voice HA capability pass-through from node-worker
...
Made-with: Cursor
2026-03-03 07:15:39 -08:00
Apple
fa749fa56c
chore(infra): add NODA2 setup files, docker-compose configs and root config
...
- AGENTS.md: Sofiia Chief AI Architect role definition
- SOFIIA_IN_OPENCODE.md, SOFIIA_NODA2_SETUP.md: NODA2 setup documentation
- agromatrix_stepan_noda1_APPLY.md, agromatrix_stepan_noda1_prod.patch: AgroMatrix production patch
- docker-compose.memory-node2.yml: memory service for NODA2
- docker-compose.node2-sofiia-supervisor.yml: sofiia supervisor for NODA2
- gateway-bot/gateway_boot.py, monitor_prompt.txt, vision_guard.py: gateway extras
- models/Modelfile.qwen3.5-35b-a3b: Qwen model definition for NODA3
- opencode.json: OpenCode providers and agents config
- scripts/init-sofiia-memory.py, scripts/node2/*, start-memory-node2.sh: NODA2 init scripts
- setup_sofiia_node2.sh: NODA2 full setup script
Made-with: Cursor
2026-03-03 07:15:20 -08:00
Apple
67225a39fa
docs(platform): add policy configs, runbooks, ops scripts and platform documentation
...
Config policies (17 files): alert_routing, architecture_pressure, backlog,
cost_weights, data_governance, incident_escalation, incident_intelligence,
network_allowlist, nodes_registry, observability_sources, rbac_tools_matrix,
release_gate, risk_attribution, risk_policy, slo_policy, tool_limits, tools_rollout
Ops (22 files): Caddyfile, calendar compose, grafana voice dashboard,
deployments/incidents logs, runbooks for alerts/audit/backlog/incidents/sofiia/voice,
cron jobs, scripts (alert_triage, audit_cleanup, migrate_*, governance, schedule),
task_registry, voice alerts/ha/latency/policy
Docs (30+ files): HUMANIZED_STEPAN v2.7-v3 changelogs and runbooks,
NODA1/NODA2 status and setup, audit index and traces, backlog, incident,
supervisor, tools, voice, opencode, release, risk, aistalk, spacebot
Made-with: Cursor
2026-03-03 07:14:53 -08:00
Apple
129e4ea1fc
feat(platform): add new services, tools, tests and crews modules
...
New router intelligence modules (26 files): alert_ingest/store, audit_store,
architecture_pressure, backlog_generator/store, cost_analyzer, data_governance,
dependency_scanner, drift_analyzer, incident_* (5 files), llm_enrichment,
platform_priority_digest, provider_budget, release_check_runner, risk_* (6 files),
signature_state_store, sofiia_auto_router, tool_governance
New services:
- sofiia-console: Dockerfile, adapters/, monitor/nodes/ops/voice modules, launchd, react static
- memory-service: integration_endpoints, integrations, voice_endpoints, static UI
- aurora-service: full app suite (analysis, job_store, orchestrator, reporting, schemas, subagents)
- sofiia-supervisor: new supervisor service
- aistalk-bridge-lite: Telegram bridge lite
- calendar-service: CalDAV calendar service with reminders
- mlx-stt-service / mlx-tts-service: Apple Silicon speech services
- binance-bot-monitor: market monitor service
- node-worker: STT/TTS memory providers
New tools (9): agent_email, browser_tool, contract_tool, observability_tool,
oncall_tool, pr_reviewer_tool, repo_tool, safe_code_executor, secure_vault
New crews: agromatrix_crew (12 modules: depth_classifier, doc_facts, doc_focus,
farm_state, light_reply, llm_factory, memory_manager, proactivity, reflection_engine,
session_context, style_adapter, telemetry)
Tests: 85+ test files for all new modules
Made-with: Cursor
2026-03-03 07:14:14 -08:00
Apple
e9dedffa48
feat(production): sync all modified production files to git
...
Includes updates across gateway, router, node-worker, memory-service,
aurora-service, swapper, sofiia-console UI and node2 infrastructure:
- gateway-bot: Dockerfile, http_api.py, druid/aistalk prompts, doc_service
- services/router: main.py, router-config.yml, fabric_metrics, memory_retrieval,
offload_client, prompt_builder
- services/node-worker: worker.py, main.py, config.py, fabric_metrics
- services/memory-service: Dockerfile, database.py, main.py, requirements
- services/aurora-service: main.py (+399), kling.py, quality_report.py
- services/swapper-service: main.py, swapper_config_node2.yaml
- services/sofiia-console: static/index.html (console UI update)
- config: agent_registry, crewai_agents/teams, router_agents
- ops/fabric_preflight.sh: updated preflight checks
- router-config.yml, docker-compose.node2.yml: infra updates
- docs: NODA1-AGENT-ARCHITECTURE, fabric_contract updated
Made-with: Cursor
2026-03-03 07:13:29 -08:00
Apple
9aac835882
chore(git): fix .gitignore — remove duplicate node_modules, add .venv-macos and runtime artifacts
...
- remove 13 duplicate 'node_modules' lines (cursor auto-added)
- add .venv-macos/ (aurora-service Python venv, 24k files)
- add ops/preflight_snapshots/, ops/voice_audit_results/, ops/voice_latency_report.json
- add *.bak and router-config.yml.bak backup files
- add services/sofiia-console/data/ (runbook runner artifacts dir)
Made-with: Cursor
2026-03-03 07:13:03 -08:00
Apple
2962d33a3b
feat(sofiia-console): add artifacts list endpoint + team onboarding doc
...
- runbook_artifacts.py: adds list_run_artifacts() returning files with
names, paths, sizes, mtime_utc from release_artifacts/<run_id>/
- runbook_runs_router.py: adds GET /api/runbooks/runs/{run_id}/artifacts
- docs/runbook/team-onboarding-console.md: one-page team onboarding doc
covering access, rehearsal run steps, audit auth model (strict, no
localhost bypass), artifacts location, abort procedure
Made-with: Cursor
2026-03-03 06:55:49 -08:00
Apple
e0bea910b9
feat(sofiia-console): add multi-user team key auth + fix aurora DNS env
...
- auth.py: adds SOFIIA_CONSOLE_TEAM_KEYS="name:key,..." support;
require_auth now returns identity ("operator"/"user:<name>") for audit;
validate_any_key checks primary + team keys; login sets per-user cookie
- main.py: auth/login+check endpoints return identity field;
imports validate_any_key, _expected_team_cookie_tokens from auth
- docker-compose.node1.yml: adds SOFIIA_CONSOLE_TEAM_KEYS env var;
adds AURORA_SERVICE_URL=http://127.0.0.1:9401 to prevent DNS lookup
failure for aurora-service (not deployed on NODA1)
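A sketch of how the `name:key,...` format and the returned identity could fit together. `parse_team_keys` and `identify` are hypothetical stand-ins for the logic in auth.py, and silently skipping malformed entries is an assumption.

```python
import hmac


def parse_team_keys(raw):
    """Parse 'name:key,name:key' into a dict; malformed entries are skipped."""
    keys = {}
    for entry in raw.split(","):
        name, sep, key = entry.strip().partition(":")
        if sep and name and key:
            keys[name] = key
    return keys


def identify(supplied, primary_key, team_keys):
    """Return 'operator', 'user:<name>', or None when no key matches."""
    candidate = supplied.encode("utf-8")
    if primary_key and hmac.compare_digest(candidate, primary_key.encode("utf-8")):
        return "operator"
    for name, key in team_keys.items():
        if hmac.compare_digest(candidate, key.encode("utf-8")):
            return f"user:{name}"
    return None
```

Returning the identity rather than a boolean is what lets audit records attribute an action to a specific team member.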
Made-with: Cursor
2026-03-03 06:38:26 -08:00
Apple
32989525fb
feat(sofiia-prompt): add hardware dev Sergiy Plis (@vetr369) to development team
...
- adds Development Team section with Сергій Миколайович Пліс (@vetr369)
as Hardware Engineer & Infrastructure Specialist for DAGI nodes
- grants developer-level access to technical node/infra information
Made-with: Cursor
2026-03-03 05:07:59 -08:00
Apple
8879da1e7f
feat(sofiia-console): add auto-evidence and post-review generation from runbook runs
...
- adds runbook_artifacts.py: server-side render of release_evidence.md and
post_review.md from DB step results (no shell); saves to
SOFIIA_DATA_DIR/release_artifacts/<run_id>/
- evidence: auto-fills preflight/smoke/script outcomes, step table, timestamps
- post_review: auto-fills metadata, smoke results, incidents from step statuses;
leaves [TODO] markers for manual observation sections
- adds POST /api/runbooks/runs/{run_id}/evidence and /post_review endpoints
- updates runbook_runs.evidence_path in DB after render
- adds 11 tests covering file creation, key sections, TODO markers, 404s, API
Made-with: Cursor
2026-03-03 05:07:52 -08:00
Apple
0603184524
feat(sofiia-console): add safe script executor for allowlisted runbook steps
...
- adds safe_executor.py: REPO_ROOT confinement, strict script allowlist,
env key allowlist (STRICT/SOFIIA_URL/BFF_A/BFF_B/NODE_ID/AGENT_ID),
stdin=DEVNULL, 8KB output cap, timeout clamp (max 300s), non-root warn
- integrates script action_type into runbook_runner: next_step handles
http_check and script branches; running_as_root -> step_status=warn
- extends runbook_parser: rehearsal-v1 now includes 3 built-in script steps
(preflight, idempotency smoke, generate evidence) after http_checks
- adds tests/test_sofiia_safe_executor.py: 12 tests covering path traversal,
absolute path, non-allowlist, env drop, timeout, exit_code, mocked subprocess
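The confinement checks listed above can be approximated as below. This is an illustration only: the function names are hypothetical, the script paths in the usage are made up, and the lower bound of 1 second in `clamp_timeout` is an assumption. The env key set and the 300-second ceiling come from the commit message.

```python
import os
from pathlib import Path

ENV_ALLOWLIST = {"STRICT", "SOFIIA_URL", "BFF_A", "BFF_B", "NODE_ID", "AGENT_ID"}
MAX_TIMEOUT_S = 300


def resolve_script(repo_root, relative_path, allowlist):
    """Return the resolved script path, or raise ValueError if it is not allowed."""
    if os.path.isabs(relative_path):
        raise ValueError("absolute paths are not allowed")
    if relative_path not in allowlist:
        raise ValueError("script is not allowlisted")
    root = Path(repo_root).resolve()
    target = (root / relative_path).resolve()
    # Defense in depth: even an allowlisted entry must stay under the repo root.
    if root not in target.parents:
        raise ValueError("path escapes the repository root")
    return target


def filter_env(env):
    """Keep only allowlisted environment keys."""
    return {k: v for k, v in env.items() if k in ENV_ALLOWLIST}


def clamp_timeout(requested_s):
    """Clamp to [1, MAX_TIMEOUT_S]; the lower bound is an assumption."""
    return max(1, min(int(requested_s), MAX_TIMEOUT_S))
```

The result of `resolve_script` would then be passed to `subprocess.run` with `stdin=subprocess.DEVNULL`, the filtered environment, and the clamped timeout.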
Made-with: Cursor
2026-03-03 04:57:22 -08:00
Apple
ad8bddf595
feat(sofiia-console): add guided runbook runner with http checks and audit integration
...
adds runbook_runs/runbook_steps state machine
parses markdown runbooks into guided steps
supports allowlisted http_check (health/metrics/audit)
integrates runbook execution with audit trail
exposes authenticated runbook runs API
Made-with: Cursor
2026-03-03 04:49:19 -08:00
Apple
4db1774a34
feat(sofiia-console): rank runbook search results with bm25
...
FTS path: score = bm25(docs_chunks_fts), ORDER BY score ASC; LIKE fallback: score null; test asserts score key present
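A self-contained sketch of the two search paths. The table name `docs_chunks_fts` comes from the commit; the `docs_chunks` table, the single `text` column, and both function names are assumptions. SQLite's `bm25()` returns lower values for better matches, which is why ascending order puts the best hit first.

```python
import sqlite3


def build_index(chunks):
    """Create an in-memory index; FTS5 is used only if this SQLite build has it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs_chunks (text TEXT)")
    conn.executemany("INSERT INTO docs_chunks VALUES (?)", [(c,) for c in chunks])
    try:
        conn.execute("CREATE VIRTUAL TABLE docs_chunks_fts USING fts5(text)")
        conn.execute("INSERT INTO docs_chunks_fts (text) SELECT text FROM docs_chunks")
    except sqlite3.OperationalError:
        pass  # no FTS5 module: search() will take the LIKE path
    return conn


def search(conn, query, limit=5):
    """Return [{'text', 'score'}]; score is None on the LIKE fallback."""
    try:
        rows = conn.execute(
            "SELECT text, bm25(docs_chunks_fts) AS score "
            "FROM docs_chunks_fts WHERE docs_chunks_fts MATCH ? "
            "ORDER BY score ASC LIMIT ?",
            (query, limit),
        ).fetchall()
    except sqlite3.OperationalError:
        # Missing FTS table or an unparsable MATCH expression.
        rows = conn.execute(
            "SELECT text, NULL AS score FROM docs_chunks "
            "WHERE text LIKE ? LIMIT ?",
            (f"%{query}%", limit),
        ).fetchall()
    return [{"text": text, "score": score} for text, score in rows]
```

Keeping the `score` key present on both paths means callers do not need to know which backend answered.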
Made-with: Cursor
2026-03-03 04:36:52 -08:00
Apple
63fec4371a
feat(sofiia-console): add runbooks index status endpoint
...
GET /api/runbooks/status returns docs_root, indexed_files, indexed_chunks, last_indexed_at, fts_available; adds docs_index_meta table, updated on each rebuild
Made-with: Cursor
2026-03-03 04:35:18 -08:00
Apple
ef3ff80645
feat(sofiia-console): add docs index and runbook search API (FTS5)
...
adds SQLite docs index (files/chunks + FTS5) and CLI rebuild
exposes authenticated runbook search/preview/raw endpoints
Made-with: Cursor
2026-03-03 04:26:34 -08:00
Apple
bddb6cd75a
docs(dev): index release evidence template in runbook README
...
Made-with: Cursor
2026-03-03 04:00:15 -08:00
Apple
3c199be6d3
docs(dev): index release and rehearsal runbooks in docs/runbook
...
Made-with: Cursor
2026-03-03 03:55:29 -08:00
Apple
55a5e541df
docs(dev): add v1 30-min rehearsal execution checklist
...
includes preflight, restart, smoke, observation, evidence steps
defines success criteria and metrics to collect for next-step decision
Made-with: Cursor
2026-03-03 03:54:53 -08:00
Apple
ad74e4c0ba
docs(dev): add sofiia-console post-release review template
...
Made-with: Cursor
2026-03-02 10:20:24 -08:00
Apple
3df414d35a
docs(dev): add sofiia-console v1 technical release announcement
...
Made-with: Cursor
2026-03-02 10:17:53 -08:00
Apple
e75fd334bf
ops(dev): add release evidence auto-generator script
...
Made-with: Cursor
2026-03-02 10:13:06 -08:00
Apple
47073ba761
docs(dev): add release runbook for sofiia-console
...
Made-with: Cursor
2026-03-02 10:00:08 -08:00
Apple
6a0d2ff103
ops(dev): extend preflight with audit retention checks
...
Made-with: Cursor
2026-03-02 09:59:22 -08:00
Apple
1d18634c01
ops(dev): add audit retention pruning script
...
Made-with: Cursor
2026-03-02 09:47:39 -08:00
Apple
e2c2333b6f
feat(sofiia-console): protect audit endpoint with admin token
...
Made-with: Cursor
2026-03-02 09:42:10 -08:00
Apple
11e0ba7264
feat(sofiia-console): add audit query endpoint with cursor pagination
...
Made-with: Cursor
2026-03-02 09:36:11 -08:00
Apple
9e70fc83d2
ops(dev): add secrets rotation runbook and sofiia-console preflight checks
...
Made-with: Cursor
2026-03-02 09:32:18 -08:00
Apple
3246440ac8
feat(sofiia-console): add audit trail for operator actions
...
Made-with: Cursor
2026-03-02 09:29:14 -08:00
Apple
9b89ace2fc
feat(sofiia-console): add rate limiting for chat send (per-chat and per-operator)
...
Made-with: Cursor
2026-03-02 09:24:21 -08:00
Apple
de8002eacd
ops(dev): add redis idempotency A/B smoke script
...
Made-with: Cursor
2026-03-02 09:14:28 -08:00
Apple
d85aa507a2
docs(dev): add redis docker-compose smoke snippet for sofiia-console
...
Made-with: Cursor
2026-03-02 09:11:45 -08:00
Apple
9f085509dd
test(sofiia-console): cover redis idempotency backend
...
Made-with: Cursor
2026-03-02 09:08:54 -08:00
Apple
3b16739671
feat(sofiia-console): add RedisIdempotencyStore backend
...
Made-with: Cursor
2026-03-02 09:08:52 -08:00
Apple
0b30775ac1
feat(sofiia-console): add structured json logging for chat ops
...
Made-with: Cursor
2026-03-02 08:24:54 -08:00
Apple
98555aa483
test(sofiia-console): add multi-node e2e routing test
...
Made-with: Cursor
2026-03-02 08:18:59 -08:00
Apple
e504df7dfa
feat(sofiia-console): harden cursor pagination with tie-breaker
...
Version cursor payloads and keep backward compatibility while adding dedicated tie-breaker regression coverage for equal timestamps to prevent pagination duplicates and gaps.
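The tie-breaker idea can be shown on an in-memory list: ordering and filtering on the pair (timestamp, id) instead of the timestamp alone. Everything here is a hypothetical sketch, including the cursor layout (base64 JSON with a `v` field) and the handling of older cursors that lack an id.

```python
import base64
import json


def encode_cursor(ts, row_id):
    payload = json.dumps({"v": 2, "ts": ts, "id": row_id}).encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii")


def decode_cursor(token):
    data = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    # Assumption: legacy cursors carry only a timestamp, so id may be absent.
    return data["ts"], data.get("id")


def page(rows, limit, cursor=None):
    """Return (items, next_cursor), newest first, stable across equal timestamps."""
    ordered = sorted(rows, key=lambda r: (r["ts"], r["id"]), reverse=True)
    if cursor:
        ts, row_id = decode_cursor(cursor)
        if row_id is None:
            ordered = [r for r in ordered if r["ts"] < ts]
        else:
            # Tuple comparison: the id breaks ties between equal timestamps.
            ordered = [r for r in ordered if (r["ts"], r["id"]) < (ts, row_id)]
    items = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = encode_cursor(items[-1]["ts"], items[-1]["id"]) if has_more else None
    return items, next_cursor
```

With a timestamp-only cursor, rows sharing the boundary timestamp would be either repeated or skipped; the id component removes that ambiguity.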
Made-with: Cursor
2026-03-02 08:12:19 -08:00
Apple
0c626943d6
refactor(sofiia-console): extract idempotency store abstraction
...
Move idempotency TTL/LRU logic into a dedicated store module with a swap-ready interface and wire chat send flow to use store get/set semantics without changing API behavior.
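A sketch of what a store with get/set semantics plus TTL and LRU eviction could look like. The class name, defaults, and injected clock are assumptions; the point is the narrow interface that another backend could implement.

```python
import time
from collections import OrderedDict


class InMemoryIdempotencyStore:
    """get/set store with per-entry TTL and a bounded, LRU-evicted size."""

    def __init__(self, ttl_s=300.0, max_entries=1024, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._max_entries = max_entries
        self._clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: behave as if never stored
            return None
        self._items.move_to_end(key)
        return value

    def set(self, key, value):
        self._items[key] = (self._clock() + self._ttl_s, value)
        self._items.move_to_end(key)
        while len(self._items) > self._max_entries:
            self._items.popitem(last=False)  # evict least recently used
```

A send handler would call `get(idempotency_key)` first and replay the stored response on a hit, calling `set` only after a successful send.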
Made-with: Cursor
2026-03-02 08:11:13 -08:00
Apple
b9c548f1a6
test(sofiia-console): cover noda2 router_url fallback in legacy local run
...
Add regression coverage for router URL resolution when NODE_ID is unset and ROUTER_URL is present, and verify explicit NODES_NODA2_ROUTER_URL keeps higher priority.
Made-with: Cursor
2026-03-02 08:00:35 -08:00
Apple
93f94030f4
feat(sofiia-console): expose /metrics and add basic ops counters
...
Expose Prometheus-style metrics endpoint and add counters for send requests, idempotency replays, and cursor pagination calls, including a safe in-process fallback exposition when prometheus_client is unavailable.
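The in-process fallback might be as small as a counter class and a renderer for the Prometheus text exposition format. The class, the metric name in the usage, and the lock are assumptions made for illustration.

```python
import threading


class FallbackCounter:
    """Tiny stand-in for a Prometheus counter when prometheus_client is absent."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._value = 0
        self._lock = threading.Lock()

    def inc(self, amount=1):
        with self._lock:
            self._value += amount

    @property
    def value(self):
        return self._value


def render(counters):
    """Render counters in the Prometheus text exposition format."""
    lines = []
    for counter in counters:
        lines.append(f"# HELP {counter.name} {counter.help_text}")
        lines.append(f"# TYPE {counter.name} counter")
        lines.append(f"{counter.name} {counter.value}")
    return "\n".join(lines) + "\n"
```

The output of `render` can be served from /metrics with content type `text/plain; version=0.0.4`, which is what Prometheus scrapers expect for this format.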
Made-with: Cursor
2026-03-02 04:52:04 -08:00