# ─── DAARION Operational Scheduled Jobs ─────────────────────────────────────
#
# Add these entries to `/etc/cron.d/daarion-ops` (NODE1, as ops user)
# or use `crontab -e`.
#
# Format: minute hour dom month dow user command
# All times UTC (TZ=UTC is set below).
#
# Requires:
#   REPO_ROOT=/path/to/microdao-daarion
#   ROUTER_URL=http://localhost:8000 (or http://dagi-router-node1:8000)
#   DATABASE_URL=postgresql://... (if using Postgres backends)
#   ALERT_DATABASE_URL=... (optional, overrides DATABASE_URL for alerts)
#
# Replace /opt/daarion/microdao-daarion and the python3 path as needed.
#
# Note: cron does not support backslash line continuation, so every job
# below must be written as a single line.
SHELL=/bin/bash
TZ=UTC
REPO_ROOT=/opt/daarion/microdao-daarion
PYTHON=/usr/local/bin/python3
ROUTER_URL=http://localhost:8000
# Note: cron does NOT expand variables inside environment assignments (see
# crontab(5)), so RUN_JOB must spell out the literal interpreter and repo
# paths rather than reference $PYTHON / $REPO_ROOT.
RUN_JOB=/usr/local/bin/python3 /opt/daarion/microdao-daarion/ops/scripts/run_governance_job.py

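# Before installing, a quick format check can catch typos. The helper below is
# an illustrative sketch (not part of the repo): it prints any line that is not
# blank, not a comment, not a VAR=value assignment, and not a seven-field
# cron.d entry (5 time fields + user + command).

```shell
#!/usr/bin/env bash
# Hypothetical lint for a cron.d file; prints offending lines, if any.
lint_crond() {
  grep -Ev '^\s*(#|$)' "$1" \
    | grep -Ev '^[A-Za-z_][A-Za-z0-9_]*=' \
    | grep -Ev '^(\S+\s+){5}\S+' || true
}
```

# Usage: `lint_crond /etc/cron.d/daarion-ops` prints nothing when every line parses.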
# ── Daily 03:30 — Audit JSONL cleanup (enforce retention_days=30) ────────────
30 3 * * * ops $PYTHON $REPO_ROOT/ops/scripts/audit_cleanup.py --audit-dir $REPO_ROOT/ops/audit --retention-days 30 >> /var/log/daarion/audit_cleanup.log 2>&1

# ── Daily 09:00 — FinOps cost digest (saves to ops/reports/cost/) ─────────────
0 9 * * * ops $PYTHON $REPO_ROOT/ops/scripts/schedule_jobs.py daily_cost_digest >> /var/log/daarion/cost_digest.log 2>&1

# ── Daily 09:10 — Privacy audit digest (saves to ops/reports/privacy/) ─────────
10 9 * * * ops $PYTHON $REPO_ROOT/ops/scripts/schedule_jobs.py daily_privacy_digest >> /var/log/daarion/privacy_digest.log 2>&1

# ── Weekly Monday 02:00 — Full drift analysis (saves to ops/reports/drift/) ────
0 2 * * 1 ops $PYTHON $REPO_ROOT/ops/scripts/schedule_jobs.py weekly_drift_full >> /var/log/daarion/drift_full.log 2>&1

# ═══════════════════════════════════════════════════════════════════════════════
# ── GOVERNANCE ENGINE — Risk / Pressure / Backlog Jobs ───────────────────────
# ═══════════════════════════════════════════════════════════════════════════════
# All governance jobs use run_governance_job.py → POST /v1/tools/execute.
# Logs are append-only (safe); rotate them daily via logrotate.

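# For debugging, the request these jobs presumably make can be reproduced by
# hand. The exact payload field names below are an assumption inferred from the
# --tool / --action / --params-json flags; check run_governance_job.py before
# relying on them.

```shell
#!/usr/bin/env bash
# Sketch of the assumed /v1/tools/execute payload (field names are a guess).
build_body() {
  printf '{"tool":"%s","action":"%s","params":%s}' "$1" "$2" "$3"
}
body=$(build_body risk_history_tool snapshot '{"env":"prod"}')
echo "$body"
# curl -sS -X POST "$ROUTER_URL/v1/tools/execute" \
#   -H 'Content-Type: application/json' -d "$body"
```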
# ── Hourly — Risk score snapshot (saves to risk_history_store) ───────────────
0 * * * * ops $RUN_JOB --tool risk_history_tool --action snapshot --params-json '{"env":"prod"}' >> /var/log/daarion/risk_snapshot.log 2>&1

# ── Daily 09:00 — Daily Risk Digest (saves to ops/reports/risk/YYYY-MM-DD.*) ─
0 9 * * * ops $RUN_JOB --tool risk_history_tool --action digest --params-json '{"env":"prod"}' >> /var/log/daarion/risk_digest.log 2>&1

# ── Daily 03:20 — Risk history cleanup (remove old snapshots) ────────────────
20 3 * * * ops $RUN_JOB --tool risk_history_tool --action cleanup --params-json '{}' >> /var/log/daarion/risk_cleanup.log 2>&1

# ── Monday 06:00 — Weekly Platform Priority Digest (ops/reports/platform/YYYY-WW.*) ─
0 6 * * 1 ops $RUN_JOB --tool architecture_pressure_tool --action digest --params-json '{"env":"prod"}' >> /var/log/daarion/platform_digest.log 2>&1

# ── Monday 06:20 — Weekly Backlog Auto-Generation (20 min after platform digest) ─
20 6 * * 1 ops $RUN_JOB --tool backlog_tool --action auto_generate_weekly --params-json '{"env":"prod"}' >> /var/log/daarion/backlog_generate.log 2>&1

# ── Daily 03:40 — Backlog cleanup (remove done/canceled items older than 180d) ─
40 3 * * * ops $RUN_JOB --tool backlog_tool --action cleanup --params-json '{"env":"prod","retention_days":180}' >> /var/log/daarion/backlog_cleanup.log 2>&1

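# Every job above appends to /var/log/daarion/*.log, so a logrotate policy is
# needed to keep them bounded. A minimal sketch for /etc/logrotate.d/daarion
# (rotation count and copytruncate are assumptions; adjust to taste):

```
/var/log/daarion/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
```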
# ═══════════════════════════════════════════════════════════════════════════════
# ── VOICE CANARY — Runtime health check (NODA2) ───────────────────────────────
# ═══════════════════════════════════════════════════════════════════════════════
# Runs every 7 minutes: a live synthesis test for Polina + Ostap.
# Writes ops/voice_canary_last.json for voice_policy_update.py.
# Sends an alert webhook if voices fail or degrade.
# Does NOT hard-fail (runtime mode) — alerting handles escalation.
#
# Required env (set at the top of this file or in /etc/cron.d/daarion-ops):
#   MEMORY_SERVICE_URL=http://localhost:8000 (or the docker service name on NODA2)
#   ALERT_WEBHOOK_URL=<slack/telegram webhook> (optional)
#   PUSHGATEWAY_URL=http://localhost:9091 (optional, for Prometheus)

MEMORY_SERVICE_URL=http://localhost:8000

# MEMORY_SERVICE_URL (and ALERT_WEBHOOK_URL / PUSHGATEWAY_URL, if set above)
# are inherited from the crontab environment; no per-command re-export needed.
*/7 * * * * ops $PYTHON $REPO_ROOT/ops/scripts/voice_canary.py --mode runtime >> /var/log/daarion/voice_canary.log 2>&1
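# Because the canary does not hard-fail, the main silent-failure mode is a
# stale ops/voice_canary_last.json. A hypothetical freshness check (not in the
# repo) that a monitoring job could alert on, assuming the */7 schedule above:

```shell
#!/usr/bin/env bash
# True if the file was modified within two canary intervals (~15 min).
canary_fresh() {
  [ -n "$(find "$1" -mmin -15 2>/dev/null)" ]
}
if ! canary_fresh ops/voice_canary_last.json; then
  echo "voice canary stale or missing: ops/voice_canary_last.json"
fi
```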