feat: Add presence heartbeat for Matrix online status
- matrix-gateway: POST /internal/matrix/presence/online endpoint
- usePresenceHeartbeat hook with activity tracking
- Auto away after 5 min inactivity
- Offline on page close/visibility change
- Integrated in MatrixChatRoom component
docs/PHASE4_DETAILED_PLAN.md
@@ -0,0 +1,606 @@
# 📋 PHASE 4: SECURITY LAYER - Detailed Plan

**Goal:** A full-fledged security layer for DAARION
**Timeline:** 4-6 weeks (or 3-4 hours automated)
**Dependencies:** Phases 1-3 complete

---

## 🎯 OVERVIEW

Phase 4 adds critical security infrastructure:

```
┌─────────────────────────────────────────┐
│         SECURITY LAYER (Phase 4)        │
├─────────────────────────────────────────┤
│                                         │
│  1. AUTH SERVICE                        │
│     └─ Identity & Sessions              │
│                                         │
│  2. PDP SERVICE (Policy Decision)       │
│     └─ Centralized access control       │
│                                         │
│  3. PEP HOOKS (Policy Enforcement)      │
│     └─ Enforce decisions in services    │
│                                         │
│  4. USAGE ENGINE                        │
│     └─ Track LLM/Tools/Agent usage      │
│                                         │
│  5. AUDIT LOG                           │
│     └─ Security events & compliance     │
│                                         │
└─────────────────────────────────────────┘
```

---

## 📦 DELIVERABLES (40+ files)

### 1. **auth-service** (8 files) ✅ COMPLETE
```
services/auth-service/
├── models.py              ✅ ActorIdentity, SessionToken, ApiKey
├── actor_context.py       ✅ build_actor_context, require_actor
├── routes_sessions.py     ✅ /auth/login, /me, /logout
├── routes_api_keys.py     ✅ /auth/api-keys CRUD
├── main.py                ✅ FastAPI app + DB tables
├── requirements.txt       ✅
├── Dockerfile             ✅
└── README.md              ✅ Complete documentation
```

**Port:** 7011
**Status:** ✅ Working
**Features:**
- Mock login (3 test users)
- Session tokens (7-day expiry)
- API keys with optional expiration
- ActorContext helper for other services

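The 7-day session expiry listed above can be illustrated with a minimal sketch. The field names on `SessionToken` and the helpers `issue_session` and `is_valid` are assumptions made for illustration; the actual definitions in `models.py` may differ.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

SESSION_TTL = timedelta(days=7)  # the 7-day expiry from the feature list


@dataclass(frozen=True)
class SessionToken:
    # Hypothetical shape; the real model lives in auth-service models.py.
    token: str
    actor_id: str
    expires_at: datetime


def issue_session(actor_id: str, now: Optional[datetime] = None) -> SessionToken:
    """Create an opaque random token that expires SESSION_TTL after `now`."""
    now = now or datetime.now(timezone.utc)
    return SessionToken(secrets.token_urlsafe(32), actor_id, now + SESSION_TTL)


def is_valid(session: SessionToken, now: Optional[datetime] = None) -> bool:
    """A session is valid strictly before its expiry time."""
    now = now or datetime.now(timezone.utc)
    return now < session.expires_at
```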
---

### 2. **pdp-service** (8 files) 🔄 20% COMPLETE
```
services/pdp-service/
├── models.py              ✅ PolicyRequest, PolicyDecision
├── engine.py              🔜 Policy evaluation logic
├── policy_store.py        🔜 Config-based policy storage
├── main.py                🔜 FastAPI app
├── config.yaml            🔜 microDAO/channel policies
├── requirements.txt       🔜
├── Dockerfile             🔜
└── README.md              🔜 Complete documentation
```

**Port:** 7012
**Purpose:** Centralized Policy Decision Point

**Key Features:**
- Evaluate access requests (actor + action + resource)
- Config-based policies (v1)
- Support for:
  - MicroDAO access (owner/admin/member)
  - Channel access (SEND_MESSAGE, READ)
  - Tool execution (EXEC_TOOL)
  - Agent management (MANAGE)
  - Usage viewing (VIEW_USAGE)

**Policy Types:**

#### MicroDAO Policies
```yaml
microdao_policies:
  - microdao_id: "microdao:daarion"
    owners: ["user:1"]
    admins: ["user:1", "user:93"]
    members: ["user:*"]  # All users
```

#### Channel Policies
```yaml
channel_policies:
  - channel_id: "channel-uuid-123"
    microdao_id: "microdao:daarion"
    allowed_roles: ["member", "admin", "owner"]
    blocked_users: []
```

#### Tool Policies
```yaml
tool_policies:
  - tool_id: "projects.list"
    allowed_agents: ["agent:sofia", "agent:pm"]
    allowed_user_roles: ["admin", "owner"]
```

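To show how config entries like these could be turned into a role lookup, here is a minimal sketch. It assumes the YAML has already been parsed into plain Python objects and that `*` in an entry such as `user:*` is meant as a glob wildcard; `matches` and `resolve_role` are hypothetical helper names, not part of the planned `policy_store.py`.

```python
from fnmatch import fnmatchcase
from typing import List, Optional

# Mirrors the microdao_policies example, already parsed from YAML.
MICRODAO_POLICIES = [
    {
        "microdao_id": "microdao:daarion",
        "owners": ["user:1"],
        "admins": ["user:1", "user:93"],
        "members": ["user:*"],
    }
]

# Checked from strongest to weakest so an owner is never reported as a member.
ROLE_KEYS = (("owner", "owners"), ("admin", "admins"), ("member", "members"))


def matches(actor_id: str, patterns: List[str]) -> bool:
    """True if the actor id matches any pattern (assumed glob semantics)."""
    return any(fnmatchcase(actor_id, pattern) for pattern in patterns)


def resolve_role(
    actor_id: str, microdao_id: str, policies=MICRODAO_POLICIES
) -> Optional[str]:
    """Return the strongest matching role, or None when nothing matches."""
    for policy in policies:
        if policy["microdao_id"] != microdao_id:
            continue
        for role, key in ROLE_KEYS:
            if matches(actor_id, policy.get(key, [])):
                return role
    return None
```

With the example config, `user:93` resolves to `admin`, any other `user:` id resolves to `member`, and an agent id resolves to no role at all.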
**Policy Evaluation Logic:**

```python
def evaluate(request: PolicyRequest) -> PolicyDecision:
    # permit(), deny() and the is_*() helpers are assumed to live in engine.py.
    actor, resource = request.actor, request.resource

    # 1. System Admin bypass (careful!)
    if "system_admin" in actor.roles:
        return permit("system_admin")

    # 2. Resource-specific rules
    if resource.type == "microdao":
        if is_microdao_owner(actor, resource):
            return permit("microdao_owner")
        if is_microdao_admin(actor, resource):
            return permit("microdao_admin")
        if request.action == "read" and is_member(actor, resource):
            return permit("member")
        return deny("not_authorized")

    if resource.type == "channel":
        if not is_channel_member(actor, resource):
            return deny("not_channel_member")
        if request.action == "send_message":
            if is_blocked(actor, resource):
                return deny("blocked")
        return permit("channel_member")

    if resource.type == "tool":
        # get_tool_policy() is a hypothetical lookup of the tool_policies entry.
        tool = get_tool_policy(resource.id)
        if actor.actor_id in tool.allowed_agents:
            return permit("allowed_agent")
        return deny("tool_not_allowed")

    # Default deny
    return deny("no_matching_policy")
```

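The channel branch of the logic above can be exercised in isolation, which may help when writing the PDP unit tests. The sketch below is self-contained and rests on assumptions: `PolicyDecision` is reduced to `effect` and `reason`, channel membership is approximated by checking the actor's role against `allowed_roles`, and `ChannelPolicy` and `evaluate_channel` are illustrative names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class PolicyDecision:
    effect: str  # "permit" or "deny"
    reason: str


def permit(reason: str) -> PolicyDecision:
    return PolicyDecision("permit", reason)


def deny(reason: str) -> PolicyDecision:
    return PolicyDecision("deny", reason)


@dataclass
class ChannelPolicy:
    # Same fields as one channel_policies entry in the config.
    channel_id: str
    allowed_roles: List[str]
    blocked_users: List[str] = field(default_factory=list)


def evaluate_channel(
    actor_id: str, role: Optional[str], action: str, policy: ChannelPolicy
) -> PolicyDecision:
    """Decide a channel request; anything not explicitly allowed is denied."""
    if role not in policy.allowed_roles:
        return deny("not_channel_member")
    if action == "send_message" and actor_id in policy.blocked_users:
        return deny("blocked")
    if action in ("send_message", "read"):
        return permit("channel_member")
    return deny("no_matching_policy")
```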
---

### 3. **usage-engine** (8 files) 🔜 0% COMPLETE
```
services/usage-engine/
├── models.py              🔜 LlmUsageEvent, ToolUsageEvent
├── collectors.py          🔜 NATS listeners
├── aggregators.py         🔜 Aggregate stats
├── reporters.py           🔜 API endpoints
├── main.py                🔜 FastAPI app
├── requirements.txt       🔜
├── Dockerfile             🔜
└── README.md              🔜 Complete documentation
```

**Port:** 7013
**Purpose:** Usage tracking & billing foundation

**NATS Subjects:**
- `usage.llm`: LLM calls (from llm-proxy)
- `usage.tool`: Tool executions (from toolcore)
- `usage.agent`: Agent invocations (from agent-runtime)

**Events:**

#### LLM Usage Event
```json
{
  "event_id": "evt-123",
  "timestamp": "2025-11-24T12:34:56Z",
  "actor": {
    "actor_id": "user:93",
    "actor_type": "human",
    "microdao_ids": ["microdao:7"]
  },
  "agent_id": "agent:sofia",
  "microdao_id": "microdao:7",
  "model": "gpt-4.1-mini",
  "provider": "openai",
  "prompt_tokens": 1234,
  "completion_tokens": 567,
  "total_tokens": 1801,
  "latency_ms": 2345,
  "cost_usd": 0.0234
}
```

#### Tool Usage Event
```json
{
  "event_id": "evt-456",
  "timestamp": "2025-11-24T12:35:00Z",
  "actor": {
    "actor_id": "agent:sofia",
    "actor_type": "agent"
  },
  "agent_id": "agent:sofia",
  "microdao_id": "microdao:7",
  "tool_id": "projects.list",
  "success": true,
  "latency_ms": 123,
  "result_size_bytes": 4567
}
```

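A collector will probably want to validate each message before storing it. The following sketch decodes a `usage.llm` payload shaped like the example and rejects inconsistent token counts. The flat field layout of `LlmUsageEvent` and the function `parse_llm_usage` are assumptions; the planned `models.py` may be structured differently.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LlmUsageEvent:
    # Hypothetical flattened form of the JSON event shown above.
    event_id: str
    timestamp: str
    actor_id: str
    model: str
    provider: str
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    agent_id: Optional[str] = None
    microdao_id: Optional[str] = None
    latency_ms: Optional[int] = None
    cost_usd: Optional[float] = None


def parse_llm_usage(payload: bytes) -> LlmUsageEvent:
    """Decode a usage.llm message body and check its token arithmetic."""
    data = json.loads(payload)
    event = LlmUsageEvent(
        event_id=data["event_id"],
        timestamp=data["timestamp"],
        actor_id=data["actor"]["actor_id"],
        model=data["model"],
        provider=data["provider"],
        prompt_tokens=int(data["prompt_tokens"]),
        completion_tokens=int(data["completion_tokens"]),
        total_tokens=int(data["total_tokens"]),
        agent_id=data.get("agent_id"),
        microdao_id=data.get("microdao_id"),
        latency_ms=data.get("latency_ms"),
        cost_usd=data.get("cost_usd"),
    )
    if min(event.prompt_tokens, event.completion_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if event.total_tokens != event.prompt_tokens + event.completion_tokens:
        raise ValueError("total_tokens must equal prompt_tokens + completion_tokens")
    return event
```

Rejecting bad events at the edge keeps the aggregation queries simple, since they can then trust `total_tokens` without recomputing it.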
**API Endpoints:**

```http
GET /internal/usage/summary?microdao_id=microdao:7&period=24h
→ Aggregate stats (tokens, calls, cost)

GET /internal/usage/agents?microdao_id=microdao:7&period=7d
→ Top agents by usage

GET /internal/usage/models?period=24h
→ Model distribution

GET /internal/usage/costs?microdao_id=microdao:7&period=30d
→ Cost breakdown
```

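The `period` query parameter in these endpoints takes values such as `24h`, `7d` and `30d`. One way to interpret it is sketched below, together with an in-memory version of the summary aggregation. The accepted grammar (a positive integer followed by `h` or `d`) and the names `parse_period` and `summarize` are assumptions; a real implementation would most likely aggregate in SQL instead.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"h": "hours", "d": "days"}


def parse_period(period: str) -> timedelta:
    """Turn '24h' or '7d' into a timedelta; reject anything else."""
    match = re.fullmatch(r"([1-9]\d*)([hd])", period)
    if match is None:
        raise ValueError(f"unsupported period: {period!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def summarize(events, now: datetime, period: str) -> dict:
    """Aggregate calls, tokens and cost for events inside the period."""
    cutoff = now - parse_period(period)
    recent = [e for e in events if e["timestamp"] >= cutoff]
    return {
        "calls": len(recent),
        "total_tokens": sum(e["total_tokens"] for e in recent),
        "cost_usd": sum(e["cost_usd"] for e in recent),
    }
```

Validating the period up front means a malformed value such as `period=abc` can be answered with a 4xx error instead of reaching the database.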
**Database Tables:**

```sql
CREATE TABLE usage_llm (
    id UUID PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    actor_id TEXT NOT NULL,
    agent_id TEXT,
    microdao_id TEXT,
    model TEXT NOT NULL,
    provider TEXT NOT NULL,
    prompt_tokens INT NOT NULL,
    completion_tokens INT NOT NULL,
    total_tokens INT NOT NULL,
    latency_ms INT,
    cost_usd DECIMAL(10, 6)
);

CREATE TABLE usage_tool (
    id UUID PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    actor_id TEXT NOT NULL,
    agent_id TEXT,
    microdao_id TEXT,
    tool_id TEXT NOT NULL,
    success BOOLEAN NOT NULL,
    latency_ms INT,
    result_size_bytes INT
);

-- Indexes for fast queries
CREATE INDEX idx_usage_llm_microdao_time ON usage_llm(microdao_id, timestamp DESC);
CREATE INDEX idx_usage_llm_agent ON usage_llm(agent_id, timestamp DESC);
CREATE INDEX idx_usage_tool_microdao ON usage_tool(microdao_id, timestamp DESC);
```

---

### 4. **PEP Integration** (3 services) 🔜 0% COMPLETE

#### 4.1 messaging-service PEP
**File:** `services/messaging-service/pep_middleware.py`

```python
import asyncpg

from auth_service_client import get_actor_context
from pdp_service_client import evaluate_policy


async def check_send_message_permission(
    actor_id: str,
    channel_id: str,
    db_pool: asyncpg.Pool,
) -> bool:
    """Check if actor can send message to channel"""

    # 1. Get actor context
    actor = await get_actor_context(actor_id, db_pool)

    # 2. Evaluate policy
    decision = await evaluate_policy(
        actor=actor,
        action="send_message",
        resource={"type": "channel", "id": channel_id},
    )

    # 3. Return decision
    return decision.effect == "permit"
```

**Integration Points:**
- `POST /api/messaging/channels/{channel_id}/messages`: check before send
- `POST /api/messaging/channels`: check MANAGE permission
- `POST /api/messaging/channels/{channel_id}/members`: check INVITE permission

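At each integration point the boolean check has to become an actual refusal, and the plan does not say what happens when the PDP is unreachable. A common choice is to fail closed, sketched below. `PermissionDenied`, `enforce` and the one-second timeout are assumptions for illustration; `check` stands for any coroutine function such as `check_send_message_permission`.

```python
import asyncio
from typing import Awaitable, Callable


class PermissionDenied(Exception):
    """Raised when a request must not proceed."""


async def enforce(
    check: Callable[..., Awaitable[bool]], *args, timeout_s: float = 1.0
) -> None:
    """Raise PermissionDenied unless check(*args) returns True in time."""
    try:
        allowed = await asyncio.wait_for(check(*args), timeout=timeout_s)
    except Exception as exc:
        # Fail closed: a timeout or PDP error is treated as a deny.
        raise PermissionDenied("pdp_unavailable") from exc
    if not allowed:
        raise PermissionDenied("not_authorized")
```

A route handler could call `await enforce(check_send_message_permission, actor_id, channel_id, db_pool)` before writing the message and map `PermissionDenied` to an HTTP 403 response.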
#### 4.2 agent-runtime PEP
**File:** `services/agent-runtime/pep_client.py`

```python
from pdp_service_client import evaluate_policy

# ActorIdentity is the auth-service model; its import path depends on how
# the shared models are packaged, so it is not spelled out here.


async def check_tool_execution_permission(
    agent_id: str,
    tool_id: str,
    microdao_id: str,
) -> bool:
    """Check if agent can execute tool"""

    # Build agent actor
    actor = ActorIdentity(
        actor_id=agent_id,
        actor_type="agent",
        microdao_ids=[microdao_id],
        roles=["agent"],
    )

    # Evaluate
    decision = await evaluate_policy(
        actor=actor,
        action="exec_tool",
        resource={"type": "tool", "id": tool_id},
    )

    return decision.effect == "permit"
```

**Integration:** Before calling toolcore in `handle_invocation()`

#### 4.3 toolcore PEP
**Already has:** `allowed_agents` in registry
**Additional:** Cross-check with PDP for user-initiated tool calls

---

### 5. **Audit Log** (1 migration) 🔜 0% COMPLETE

**File:** `migrations/004_create_security_audit.sql`

```sql
CREATE TABLE security_audit (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    actor_id TEXT NOT NULL,
    actor_type TEXT NOT NULL,
    action TEXT NOT NULL,
    resource_type TEXT NOT NULL,
    resource_id TEXT NOT NULL,
    decision TEXT NOT NULL,  -- permit/deny
    reason TEXT,
    context JSONB,
    ip_address INET,
    user_agent TEXT
);

CREATE INDEX idx_audit_timestamp ON security_audit(timestamp DESC);
CREATE INDEX idx_audit_actor ON security_audit(actor_id, timestamp DESC);
CREATE INDEX idx_audit_decision ON security_audit(decision, timestamp DESC);
CREATE INDEX idx_audit_resource ON security_audit(resource_type, resource_id);
```

**PDP Integration:**
After every `evaluate()` call, write to audit log:

```python
import json
from typing import Optional


async def log_audit_event(
    request: PolicyRequest,
    decision: PolicyDecision,
    context: Optional[dict] = None,
) -> None:
    """Write audit log entry"""
    # `db` is assumed to be the service's shared database connection or pool.
    await db.execute(
        """
        INSERT INTO security_audit
            (actor_id, actor_type, action, resource_type, resource_id,
             decision, reason, context)
        VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
        """,
        request.actor.actor_id,
        request.actor.actor_type,
        request.action,
        request.resource.type,
        request.resource.id,
        decision.effect,
        decision.reason,
        json.dumps(context or {}),
    )
```

**NATS Security Events:**
- `security.suspicious`: publish on:
  - Multiple deny events (>5 in 1 min)
  - Unusual tool execution attempts
  - Privilege escalation attempts

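The "more than 5 denies in 1 minute" trigger can be implemented with a per-actor sliding window. The sketch below keeps state in memory and only reports when the threshold is crossed; publishing to `security.suspicious` is left to the caller. `DenyRateDetector` and its method names are hypothetical, and a multi-instance PDP would need shared state instead.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class DenyRateDetector:
    """Flags actors with more than `threshold` denies inside the window."""

    def __init__(self, threshold: int = 5, window_s: float = 60.0) -> None:
        self.threshold = threshold
        self.window_s = window_s
        self._denies: Dict[str, Deque[float]] = defaultdict(deque)

    def record_deny(self, actor_id: str, ts: float) -> bool:
        """Record a deny at time `ts` (seconds); True means suspicious."""
        window = self._denies[actor_id]
        window.append(ts)
        # Drop denies that have aged out of the window.
        while window and ts - window[0] >= self.window_s:
            window.popleft()
        return len(window) > self.threshold
```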
---

### 6. **Infrastructure** (3 files) 🔜 0% COMPLETE

#### 6.1 docker-compose.phase4.yml
```yaml
services:
  auth-service:
    build: ./services/auth-service
    ports: ["7011:7011"]
    environment:
      - DATABASE_URL=postgresql://...

  pdp-service:
    build: ./services/pdp-service
    ports: ["7012:7012"]
    environment:
      - DATABASE_URL=postgresql://...

  usage-engine:
    build: ./services/usage-engine
    ports: ["7013:7013"]
    environment:
      - DATABASE_URL=postgresql://...
      - NATS_URL=nats://nats:4222

  # + All Phase 3 services
  llm-proxy:
    environment:
      - AUTH_SERVICE_URL=http://auth-service:7011

  # etc...
```

#### 6.2 scripts/start-phase4.sh
#### 6.3 scripts/stop-phase4.sh

---

### 7. **Documentation** (4 files) 🔜 0% COMPLETE

#### 7.1 docs/AUTH_SERVICE_SPEC.md
- Actor model
- Session management
- API keys
- Integration guide

#### 7.2 docs/PDP_SPEC.md
- Policy model
- Evaluation logic
- Policy configuration
- Adding new rules

#### 7.3 docs/USAGE_ENGINE_SPEC.md
- Event model
- NATS integration
- Aggregation queries
- Billing foundation

#### 7.4 PHASE4_READY.md
- Overview
- Quick start
- Testing guide
- Production readiness

---

## 📊 IMPLEMENTATION ROADMAP

### Week 1: Core Services
- ✅ auth-service (complete)
- 🔄 pdp-service (20% → 100%)
- 🔜 usage-engine (0% → 100%)

### Week 2: Integration
- 🔜 PEP hooks (messaging-service)
- 🔜 PEP hooks (agent-runtime)
- 🔜 PEP hooks (toolcore)

### Week 3: Audit & Testing
- 🔜 Audit log migration
- 🔜 Security events (NATS)
- 🔜 E2E testing

### Week 4: Documentation & Polish
- 🔜 All docs (4 files)
- 🔜 docker-compose
- 🔜 Scripts
- 🔜 PHASE4_READY.md

---

## 🎯 ACCEPTANCE CRITERIA

### Auth Service: ✅
- [x] Login works with mock users
- [x] Session tokens created & validated
- [x] API keys CRUD functional
- [x] actor_context helper ready

### PDP Service: 🔜
- [ ] /internal/pdp/evaluate works
- [ ] MicroDAO access rules
- [ ] Channel access rules
- [ ] Tool execution rules
- [ ] 10+ unit tests

### PEP Integration: 🔜
- [ ] messaging-service blocks unauthorized sends
- [ ] agent-runtime checks tool permissions
- [ ] toolcore enforces allowed_agents

### Usage Engine: 🔜
- [ ] usage.llm events collected
- [ ] usage.tool events collected
- [ ] /internal/usage/summary works
- [ ] Database tables created

### Audit Log: 🔜
- [ ] security_audit table exists
- [ ] PDP writes every decision
- [ ] Can query last 100 events
- [ ] security.suspicious events published

### Infrastructure: 🔜
- [ ] docker-compose.phase4.yml works
- [ ] All services healthy
- [ ] Start/stop scripts functional
- [ ] Documentation complete

---

## 🚀 QUICK START (After Complete)

```bash
# 1. Start Phase 4
./scripts/start-phase4.sh

# 2. Test Auth (JSON body, so the content type is set explicitly)
curl -X POST http://localhost:7011/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email": "user@daarion.city"}'

# 3. Test PDP
curl -X POST http://localhost:7012/internal/pdp/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "actor": {...},
    "action": "send_message",
    "resource": {"type": "channel", "id": "..."}
  }'

# 4. Check Usage (URL quoted so the shell does not expand "?")
curl "http://localhost:7013/internal/usage/summary?period=24h"

# 5. View Audit
docker exec daarion-postgres psql -U postgres -d daarion \
  -c "SELECT * FROM security_audit ORDER BY timestamp DESC LIMIT 10;"
```

---

## 🔜 AFTER PHASE 4

### Phase 5: Advanced Features
- Real Passkey integration
- OAuth2 providers
- Advanced policy language (ABAC)
- Dynamic policy updates
- Cost allocation & billing
- Security analytics dashboard

### Phase 6: Production Hardening
- Rate limiting (Redis)
- DDoS protection
- Penetration testing
- Security audit
- Compliance certification

---

## 📚 RESOURCES

**Specs:**
- Phase 4 Master Task (user-provided)
- [PHASE4_STARTED.md](../PHASE4_STARTED.md)

**Related:**
- [PHASE3_IMPLEMENTATION_COMPLETE.md](../PHASE3_IMPLEMENTATION_COMPLETE.md)
- [ALL_PHASES_STATUS.md](../ALL_PHASES_STATUS.md)

**Standards:**
- RBAC (Role-Based Access Control)
- ABAC (Attribute-Based Access Control)
- OAuth 2.0 / OpenID Connect
- Audit logging best practices

---

**Status:** 📋 Detailed Plan Complete
**Next:** Continue Implementation
**Version:** 1.0.0
**Last Updated:** 2025-11-24