microdao-daarion/docs/incident/followups.md
Apple 67225a39fa docs(platform): add policy configs, runbooks, ops scripts and platform documentation
2026-03-03 07:14:53 -08:00


# Follow-up Tracker & Release Gate
## Overview
Follow-ups are structured action items attached to incidents via `incident_append_event` with `type=followup`. The `followup_watch` gate in `release_check` uses them to block or warn about releases for services with unresolved issues.
## Follow-up Event Schema
When appending a follow-up event to an incident:
```json
{
  "action": "incident_append_event",
  "incident_id": "inc_20250123_0900_abc1",
  "type": "followup",
  "message": "Upgrade postgres driver",
  "meta": {
    "title": "Upgrade postgres driver to fix connection leak",
    "owner": "sofiia",
    "priority": "P1",
    "due_date": "2025-02-01T00:00:00Z",
    "status": "open",
    "links": ["https://github.com/org/repo/issues/42"]
  }
}
```
### Meta Fields
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `title` | string | yes | Short description |
| `owner` | string | yes | Agent ID or handle |
| `priority` | enum | yes | P0, P1, P2, P3 |
| `due_date` | ISO8601 | yes | Deadline |
| `status` | enum | yes | open, done, cancelled |
| `links` | array | no | Related PRs/issues/ADRs |
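
The field rules in the table above can be checked before an event is appended. The following is a minimal sketch, not the project's validator: the function name `validate_followup_meta` and the error messages are hypothetical, and only the field names, required flags, and enum values come from the table.

```python
from datetime import datetime

# Field names and allowed values are taken from the Meta Fields table.
REQUIRED = ("title", "owner", "priority", "due_date", "status")
PRIORITIES = {"P0", "P1", "P2", "P3"}
STATUSES = {"open", "done", "cancelled"}


def validate_followup_meta(meta: dict) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means valid)."""
    errors = [f"missing required field: {name}" for name in REQUIRED if name not in meta]
    if "priority" in meta and meta["priority"] not in PRIORITIES:
        errors.append(f"invalid priority: {meta['priority']!r}")
    if "status" in meta and meta["status"] not in STATUSES:
        errors.append(f"invalid status: {meta['status']!r}")
    if "due_date" in meta:
        try:
            # fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
            # so normalise it to an explicit UTC offset first.
            datetime.fromisoformat(str(meta["due_date"]).replace("Z", "+00:00"))
        except ValueError:
            errors.append(f"due_date is not ISO8601: {meta['due_date']!r}")
    if not isinstance(meta.get("links", []), list):
        errors.append("links must be an array")
    return errors
```

Calling it on the `meta` object from the example event returns an empty list; removing `owner` or setting `priority` to a value outside P0..P3 yields one message per problem.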
## oncall_tool: incident_followups_summary
Summarises open incidents and overdue follow-ups for a service.
### Request
```json
{
  "action": "incident_followups_summary",
  "service": "gateway",
  "env": "prod",
  "window_days": 30
}
```
### Response
```json
{
  "open_incidents": [
    {"id": "inc_...", "severity": "P1", "status": "open", "started_at": "...", "title": "..."}
  ],
  "overdue_followups": [
    {"incident_id": "inc_...", "title": "...", "due_date": "...", "priority": "P1", "owner": "sofiia"}
  ],
  "stats": {
    "open_incidents": 1,
    "overdue": 1,
    "total_open_followups": 3
  }
}
```
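
To illustrate how a response of this shape could be derived, here is a rough sketch under several assumptions: incidents are plain dicts that have already been filtered by service, environment, and window; each carries a hypothetical `followups` list holding the `meta` objects of its follow-up events; and all timestamps include a UTC offset or `Z`. The function names are illustrative, not the tool's real internals.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO8601 timestamp, normalising a trailing 'Z' to +00:00."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarise_followups(incidents: list[dict], now: datetime) -> dict:
    """Hypothetical reducer producing the response shape shown above."""
    open_incidents = [
        {key: inc[key] for key in ("id", "severity", "status", "started_at", "title")}
        for inc in incidents
        if inc["status"] == "open"
    ]
    # Assumption: open follow-ups are counted across every incident passed in,
    # including incidents that are no longer open themselves.
    open_followups = [
        (inc["id"], fu)
        for inc in incidents
        for fu in inc.get("followups", [])
        if fu["status"] == "open"
    ]
    overdue = [
        {
            "incident_id": inc_id,
            "title": fu["title"],
            "due_date": fu["due_date"],
            "priority": fu["priority"],
            "owner": fu["owner"],
        }
        for inc_id, fu in open_followups
        if parse_ts(fu["due_date"]) < now  # open and past its deadline
    ]
    return {
        "open_incidents": open_incidents,
        "overdue_followups": overdue,
        "stats": {
            "open_incidents": len(open_incidents),
            "overdue": len(overdue),
            "total_open_followups": len(open_followups),
        },
    }
```

Passing `now` explicitly rather than reading the clock inside the function keeps the sketch deterministic and easy to test.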
## Release Gate: followup_watch
### Behaviour per GatePolicy mode
| Mode | Behaviour |
|------|-----------|
| `off` | Gate skipped entirely |
| `warn` | Always passes (`pass=true`); adds recommendations for open P0/P1 incidents and overdue follow-ups |
| `strict` | Blocks release (`pass=false`) if open incidents match `fail_on` severities or overdue follow-ups exist |
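
The three modes can be read as a small decision function. The sketch below is an illustration only: the function name `evaluate_followup_watch`, the `skipped` and `recommendations` result keys, and the wording of the recommendation strings are assumptions. It also assumes that `warn` mode reuses the `fail_on` severities (P0/P1 by default) when choosing which open incidents to mention.

```python
def evaluate_followup_watch(summary: dict, mode: str = "warn",
                            fail_on: tuple = ("P0", "P1")) -> dict:
    """Hypothetical gate evaluation over an incident_followups_summary response."""
    if mode == "off":
        # Gate skipped entirely: nothing is inspected.
        return {"skipped": True, "pass": True, "recommendations": []}
    if mode not in ("warn", "strict"):
        raise ValueError(f"unknown followup_watch mode: {mode!r}")

    blocking = [inc for inc in summary["open_incidents"] if inc["severity"] in fail_on]
    overdue = summary["overdue_followups"]
    recommendations = [
        f"Resolve open {inc['severity']} incident {inc['id']}" for inc in blocking
    ] + [
        f"Close overdue follow-up '{fu['title']}' (owner: {fu['owner']})" for fu in overdue
    ]
    # warn never blocks; strict blocks on any matching incident or overdue follow-up.
    passed = True if mode == "warn" else not (blocking or overdue)
    return {"skipped": False, "pass": passed, "recommendations": recommendations}
```

With one open P1 incident and no overdue follow-ups, `warn` returns `pass` as true with one recommendation, while `strict` returns `pass` as false for the same input.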
### Configuration
In `config/release_gate_policy.yml`:
```yaml
followup_watch:
  mode: "warn"            # off | warn | strict
  fail_on: ["P0", "P1"]   # Severities that block in strict mode
```
### release_check inputs
| Input | Type | Default | Description |
|-------|------|---------|-------------|
| `run_followup_watch` | bool | true | Enable/disable gate |
| `followup_watch_window_days` | int | 30 | Incident scan window |
| `followup_watch_env` | string | "any" | Filter by environment |
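
One way to model these inputs and their defaults is a small dataclass. This is a sketch under assumptions: the class name `FollowupWatchInputs` and both methods are hypothetical, and forwarding the window and environment to an `incident_followups_summary` request is a guess at how the gate obtains its data, not documented behaviour.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FollowupWatchInputs:
    # Names and defaults mirror the release_check inputs table.
    run_followup_watch: bool = True
    followup_watch_window_days: int = 30
    followup_watch_env: str = "any"

    @classmethod
    def from_request(cls, request: dict) -> "FollowupWatchInputs":
        """Pick the known keys out of a larger release_check request."""
        names = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in request.items() if k in names})

    def to_summary_request(self, service: str) -> dict:
        """Assumed mapping onto the incident_followups_summary request shape.

        Whether the tool accepts "any" as an env value is not confirmed here.
        """
        return {
            "action": "incident_followups_summary",
            "service": service,
            "env": self.followup_watch_env,
            "window_days": self.followup_watch_window_days,
        }
```

Omitted keys fall back to the documented defaults, so `FollowupWatchInputs.from_request({})` describes an enabled gate with a 30-day window across any environment.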
## RBAC
`incident_followups_summary` requires `tools.oncall.read` entitlement.
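
As an illustration of that requirement, a caller-side check might look like the sketch below. The lookup table and the `is_allowed` function are hypothetical; only the action name and the `tools.oncall.read` entitlement come from this document. Denying actions that have no entry is a design choice of the sketch, not documented behaviour.

```python
# Hypothetical action -> entitlement map; only this one pairing is documented.
REQUIRED_ENTITLEMENTS = {
    "incident_followups_summary": "tools.oncall.read",
}


def is_allowed(action: str, entitlements: set[str]) -> bool:
    """Return True when the caller holds the entitlement the action requires."""
    required = REQUIRED_ENTITLEMENTS.get(action)
    # Unknown actions are denied by default in this sketch.
    return required is not None and required in entitlements
```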