# Follow-up Tracker & Release Gate

## Overview

Follow-ups are structured action items attached to incidents via `incident_append_event` with `type=followup`. The `followup_watch` gate in `release_check` uses them to block or warn about releases for services with unresolved issues.

## Follow-up Event Schema

When appending a follow-up event to an incident:

```json
{
  "action": "incident_append_event",
  "incident_id": "inc_20250123_0900_abc1",
  "type": "followup",
  "message": "Upgrade postgres driver",
  "meta": {
    "title": "Upgrade postgres driver to fix connection leak",
    "owner": "sofiia",
    "priority": "P1",
    "due_date": "2025-02-01T00:00:00Z",
    "status": "open",
    "links": ["https://github.com/org/repo/issues/42"]
  }
}
```

### Meta Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `title` | string | yes | Short description |
| `owner` | string | yes | Agent ID or handle |
| `priority` | enum | yes | P0, P1, P2, P3 |
| `due_date` | ISO8601 | yes | Deadline |
| `status` | enum | yes | open, done, cancelled |
| `links` | array | no | Related PRs/issues/ADRs |

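### Example: checking `meta` before appending

The field rules in the table above can be checked client-side before a follow-up is sent. The following is a minimal sketch, not the tool's own validation: the function name `validate_followup_meta` and its error messages are hypothetical, and it assumes `due_date` is an ISO 8601 string that Python's `datetime.fromisoformat` can parse once a trailing `Z` is rewritten as `+00:00`.

```python
from datetime import datetime

# Allowed values copied from the Meta Fields table.
PRIORITIES = {"P0", "P1", "P2", "P3"}
STATUSES = {"open", "done", "cancelled"}
REQUIRED = ("title", "owner", "priority", "due_date", "status")


def validate_followup_meta(meta):
    """Hypothetical helper: return a list of problems (empty means OK)."""
    errors = ["missing required field: " + name
              for name in REQUIRED if name not in meta]

    if "priority" in meta and meta["priority"] not in PRIORITIES:
        errors.append("invalid priority: %r" % (meta["priority"],))

    if "status" in meta and meta["status"] not in STATUSES:
        errors.append("invalid status: %r" % (meta["status"],))

    if "due_date" in meta:
        try:
            # Rewrite a trailing "Z" so older Python versions accept it.
            datetime.fromisoformat(str(meta["due_date"]).replace("Z", "+00:00"))
        except ValueError:
            errors.append("due_date is not ISO 8601: %r" % (meta["due_date"],))

    # `links` is optional, but must be an array when present.
    if "links" in meta and not isinstance(meta["links"], list):
        errors.append("links must be an array")

    return errors
```

Calling it on the `meta` object from the schema example returns an empty list; a `meta` with only a `title` returns one "missing required field" entry for each of the other four required fields.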
## oncall_tool: incident_followups_summary

Summarises open incidents and overdue follow-ups for a service.

### Request

```json
{
  "action": "incident_followups_summary",
  "service": "gateway",
  "env": "prod",
  "window_days": 30
}
```

### Response

```json
{
  "open_incidents": [
    {"id": "inc_...", "severity": "P1", "status": "open", "started_at": "...", "title": "..."}
  ],
  "overdue_followups": [
    {"incident_id": "inc_...", "title": "...", "due_date": "...", "priority": "P1", "owner": "sofiia"}
  ],
  "stats": {
    "open_incidents": 1,
    "overdue": 1,
    "total_open_followups": 3
  }
}
```

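### Example: how the summary could be derived

To make the response fields concrete, here is a rough sketch of how such a summary might be computed from incident records. It is an illustration only, not the tool's implementation. The input shape (incident dicts carrying an `events` list) and the function name `summarise_followups` are hypothetical, and it assumes a follow-up counts as overdue when its `status` is `open` and its `due_date` lies in the past.

```python
from datetime import datetime, timezone


def parse_ts(value):
    # Assumes timezone-qualified ISO 8601 strings such as "2025-02-01T00:00:00Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarise_followups(incidents, now=None):
    """Hypothetical helper mirroring the response shape shown above."""
    now = now or datetime.now(timezone.utc)
    open_incidents, overdue, total_open = [], [], 0

    for inc in incidents:
        if inc["status"] == "open":
            open_incidents.append(
                {k: inc[k] for k in ("id", "severity", "status", "started_at", "title")}
            )
        for event in inc.get("events", []):
            if event.get("type") != "followup":
                continue  # only follow-up events are counted
            meta = event["meta"]
            if meta["status"] != "open":
                continue  # done/cancelled follow-ups are ignored
            total_open += 1
            if parse_ts(meta["due_date"]) < now:
                overdue.append({
                    "incident_id": inc["id"],
                    "title": meta["title"],
                    "due_date": meta["due_date"],
                    "priority": meta["priority"],
                    "owner": meta["owner"],
                })

    return {
        "open_incidents": open_incidents,
        "overdue_followups": overdue,
        "stats": {
            "open_incidents": len(open_incidents),
            "overdue": len(overdue),
            "total_open_followups": total_open,
        },
    }
```

Passing `now` explicitly keeps the result deterministic, which is convenient when testing the overdue logic against fixed dates.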
## Release Gate: followup_watch

### Behaviour per GatePolicy mode

| Mode | Behaviour |
|------|-----------|
| `off` | Gate skipped entirely |
| `warn` | Always passes (`pass=true`); adds recommendations for open P0/P1 incidents and overdue follow-ups |
| `strict` | Blocks release (`pass=false`) if any open incident matches a `fail_on` severity or any overdue follow-up exists |

### Configuration

In `config/release_gate_policy.yml`:

```yaml
followup_watch:
  mode: "warn"            # off | warn | strict
  fail_on: ["P0", "P1"]   # Severities that block in strict mode
```

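### Example: mode semantics

The three modes can be read as a small decision function over the `incident_followups_summary` response. The sketch below is one possible reading, not the gate's actual code. The function name `evaluate_followup_watch` and the result keys `skipped` and `recommendations` are hypothetical. It also assumes that `warn` mode uses the `fail_on` list to decide which open incidents get a recommendation, which matches the documented P0/P1 behaviour under the default config.

```python
def evaluate_followup_watch(policy, summary):
    """Hypothetical gate evaluation for one followup_watch run."""
    mode = policy["mode"]
    if mode == "off":
        # Gate skipped entirely.
        return {"pass": True, "skipped": True, "recommendations": []}
    if mode not in ("warn", "strict"):
        raise ValueError("unknown followup_watch mode: %r" % (mode,))

    fail_on = set(policy.get("fail_on", []))
    blocking = [i for i in summary["open_incidents"] if i["severity"] in fail_on]
    overdue = summary["overdue_followups"]

    recommendations = [
        "Resolve open %s incident %s" % (i["severity"], i["id"]) for i in blocking
    ]
    recommendations += [
        "Close overdue follow-up '%s' (owner: %s)" % (f["title"], f["owner"])
        for f in overdue
    ]

    # warn never blocks; strict blocks on any matching incident or overdue item.
    passed = True if mode == "warn" else not (blocking or overdue)
    return {"pass": passed, "skipped": False, "recommendations": recommendations}
```

With the default policy above (`mode: "warn"`), a service with one open P1 incident and one overdue follow-up would still pass, but with two recommendations attached. Switching `mode` to `strict` turns the same input into `pass=false`.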
### release_check inputs

| Input | Type | Default | Description |
|-------|------|---------|-------------|
| `run_followup_watch` | bool | true | Enable/disable gate |
| `followup_watch_window_days` | int | 30 | Incident scan window |
| `followup_watch_env` | string | "any" | Filter by environment |

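### Example: applying input defaults

As an illustration of how the defaults in the table interact, the sketch below merges caller-supplied inputs with those defaults and builds a matching `incident_followups_summary` request. How `release_check` actually invokes the tool is not specified here, so the helper name `build_followup_request` and the direct mapping of `followup_watch_env` to the request's `env` field are assumptions.

```python
# Defaults copied from the release_check inputs table.
DEFAULTS = {
    "run_followup_watch": True,
    "followup_watch_window_days": 30,
    "followup_watch_env": "any",
}


def build_followup_request(service, inputs=None):
    """Hypothetical helper: return a summary request, or None if the gate is disabled."""
    merged = {**DEFAULTS, **(inputs or {})}
    if not merged["run_followup_watch"]:
        return None  # gate disabled for this release_check run
    return {
        "action": "incident_followups_summary",
        "service": service,
        "env": merged["followup_watch_env"],            # assumed 1:1 mapping
        "window_days": merged["followup_watch_window_days"],
    }
```

For example, `build_followup_request("gateway", {"followup_watch_env": "prod"})` yields the request shown in the Request section, while passing `{"run_followup_watch": False}` returns `None`.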
## RBAC

`incident_followups_summary` requires the `tools.oncall.read` entitlement.