# 📋 PHASE 4: SECURITY LAYER - Detailed Plan

**Goal:** A full-fledged security layer for DAARION
**Timeline:** 4-6 weeks (or 3-4 hours automated)
**Dependencies:** Phases 1-3 complete

---

## 🎯 OVERVIEW

Phase 4 adds critical security infrastructure:

```
┌─────────────────────────────────────────┐
│        SECURITY LAYER (Phase 4)         │
├─────────────────────────────────────────┤
│                                         │
│  1. AUTH SERVICE                        │
│     └─ Identity & Sessions              │
│                                         │
│  2. PDP SERVICE (Policy Decision)       │
│     └─ Centralized access control       │
│                                         │
│  3. PEP HOOKS (Policy Enforcement)      │
│     └─ Enforce decisions in services    │
│                                         │
│  4. USAGE ENGINE                        │
│     └─ Track LLM/Tools/Agent usage      │
│                                         │
│  5. AUDIT LOG                           │
│     └─ Security events & compliance     │
│                                         │
└─────────────────────────────────────────┘
```

---

## 📦 DELIVERABLES (40+ files)

### 1. **auth-service** (8 files) ✅ COMPLETE
```
services/auth-service/
├── models.py              ✅ ActorIdentity, SessionToken, ApiKey
├── actor_context.py       ✅ build_actor_context, require_actor
├── routes_sessions.py     ✅ /auth/login, /me, /logout
├── routes_api_keys.py     ✅ /auth/api-keys CRUD
├── main.py                ✅ FastAPI app + DB tables
├── requirements.txt       ✅
├── Dockerfile             ✅
└── README.md              ✅ Complete documentation
```

**Port:** 7011
**Status:** ✅ Working
**Features:**
- Mock login (3 test users)
- Session tokens (7-day expiry)
- API keys with optional expiration
- ActorContext helper for other services
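
The 7-day session expiry can be illustrated with a minimal sketch. The `SessionToken` name comes from `models.py` above, but its fields, the `issue_session` and `is_valid` helpers, and the choice of an opaque random token are assumptions made for illustration, not the actual auth-service code.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

SESSION_TTL = timedelta(days=7)  # the 7-day expiry listed in the features


@dataclass
class SessionToken:
    # Hypothetical fields; the real model in models.py may differ.
    token: str
    actor_id: str
    expires_at: datetime


def issue_session(actor_id: str, now: Optional[datetime] = None) -> SessionToken:
    """Create an opaque random token that expires SESSION_TTL after `now`."""
    now = now or datetime.now(timezone.utc)
    return SessionToken(
        token=secrets.token_urlsafe(32),
        actor_id=actor_id,
        expires_at=now + SESSION_TTL,
    )


def is_valid(session: SessionToken, now: Optional[datetime] = None) -> bool:
    """A session is valid strictly before its expiry time."""
    now = now or datetime.now(timezone.utc)
    return now < session.expires_at
```

Passing `now` explicitly keeps the expiry logic testable without patching the clock.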

---

### 2. **pdp-service** (8 files) 🔄 20% COMPLETE
```
services/pdp-service/
├── models.py              ✅ PolicyRequest, PolicyDecision
├── engine.py              🔜 Policy evaluation logic
├── policy_store.py        🔜 Config-based policy storage
├── main.py                🔜 FastAPI app
├── config.yaml            🔜 microDAO/channel policies
├── requirements.txt       🔜
├── Dockerfile             🔜
└── README.md              🔜 Complete documentation
```

**Port:** 7012
**Purpose:** Centralized Policy Decision Point

**Key Features:**
- Evaluate access requests (actor + action + resource)
- Config-based policies (v1)
- Support for:
  - MicroDAO access (owner/admin/member)
  - Channel access (SEND_MESSAGE, READ)
  - Tool execution (EXEC_TOOL)
  - Agent management (MANAGE)
  - Usage viewing (VIEW_USAGE)

**Policy Types:**

#### MicroDAO Policies
```yaml
microdao_policies:
  - microdao_id: "microdao:daarion"
    owners: ["user:1"]
    admins: ["user:1", "user:93"]
    members: ["user:*"]  # All users
```
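
One way a policy store might resolve a role from the MicroDAO policy above is sketched here. It assumes the YAML has already been parsed into Python objects and that `user:*` is a glob-style wildcard; `MICRODAO_POLICIES` and `resolve_role` are hypothetical names, not part of the planned `policy_store.py`.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Mirrors the YAML example, already parsed into Python objects.
MICRODAO_POLICIES = [
    {
        "microdao_id": "microdao:daarion",
        "owners": ["user:1"],
        "admins": ["user:1", "user:93"],
        "members": ["user:*"],
    }
]

# Ordered from most to least privileged so the strongest role wins.
ROLE_KEYS = (("owner", "owners"), ("admin", "admins"), ("member", "members"))


def resolve_role(actor_id: str, microdao_id: str) -> Optional[str]:
    """Return the strongest role the actor holds in the microDAO, or None."""
    for policy in MICRODAO_POLICIES:
        if policy["microdao_id"] != microdao_id:
            continue
        for role, key in ROLE_KEYS:
            # Assumption: entries are glob patterns, so "user:*" matches any user.
            if any(fnmatchcase(actor_id, pattern) for pattern in policy[key]):
                return role
    return None
```

Because `user:1` appears in both `owners` and `admins`, checking roles in privilege order matters: the actor resolves to `owner`, while an agent ID such as `agent:sofia` matches no pattern and gets `None`.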

#### Channel Policies
```yaml
channel_policies:
  - channel_id: "channel-uuid-123"
    microdao_id: "microdao:daarion"
    allowed_roles: ["member", "admin", "owner"]
    blocked_users: []
```

#### Tool Policies
```yaml
tool_policies:
  - tool_id: "projects.list"
    allowed_agents: ["agent:sofia", "agent:pm"]
    allowed_user_roles: ["admin", "owner"]
```

**Policy Evaluation Logic:**

```python
# Sketch of engine.py. The helpers permit(), deny(), is_*() and
# get_tool_policy() are assumed to be defined alongside it.
def evaluate(request: PolicyRequest) -> PolicyDecision:
    actor, resource = request.actor, request.resource

    # 1. System Admin bypass (careful!)
    if "system_admin" in actor.roles:
        return permit("system_admin")

    # 2. Resource-specific rules
    if resource.type == "microdao":
        if is_microdao_owner(actor, resource):
            return permit("microdao_owner")
        if is_microdao_admin(actor, resource):
            return permit("microdao_admin")
        if request.action == "read" and is_member(actor, resource):
            return permit("member")
        return deny("not_authorized")

    if resource.type == "channel":
        if not is_channel_member(actor, resource):
            return deny("not_channel_member")
        if request.action == "send_message":
            if is_blocked(actor, resource):
                return deny("blocked")
        return permit("channel_member")

    if resource.type == "tool":
        # Assumption: the tool policy is looked up by tool_id in the policy store
        tool = get_tool_policy(resource.id)
        if tool and actor.actor_id in tool.allowed_agents:
            return permit("allowed_agent")
        return deny("tool_not_allowed")

    # Default deny
    return deny("no_matching_policy")
```

---

### 3. **usage-engine** (8 files) 🔜 0% COMPLETE
```
services/usage-engine/
├── models.py              🔜 LlmUsageEvent, ToolUsageEvent
├── collectors.py          🔜 NATS listeners
├── aggregators.py         🔜 Aggregate stats
├── reporters.py           🔜 API endpoints
├── main.py                🔜 FastAPI app
├── requirements.txt       🔜
├── Dockerfile             🔜
└── README.md              🔜 Complete documentation
```

**Port:** 7013
**Purpose:** Usage tracking & billing foundation

**NATS Subjects:**
- `usage.llm`: LLM calls (from llm-proxy)
- `usage.tool`: Tool executions (from toolcore)
- `usage.agent`: Agent invocations (from agent-runtime)

**Events:**

#### LLM Usage Event
```json
{
  "event_id": "evt-123",
  "timestamp": "2025-11-24T12:34:56Z",
  "actor": {
    "actor_id": "user:93",
    "actor_type": "human",
    "microdao_ids": ["microdao:7"]
  },
  "agent_id": "agent:sofia",
  "microdao_id": "microdao:7",
  "model": "gpt-4.1-mini",
  "provider": "openai",
  "prompt_tokens": 1234,
  "completion_tokens": 567,
  "total_tokens": 1801,
  "latency_ms": 2345,
  "cost_usd": 0.0234
}
```
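
A collector could decode such a `usage.llm` payload and reject events whose token counts are inconsistent before they are stored. The sketch below covers only a subset of the fields; the `LlmUsageEvent` name comes from the planned `models.py`, while its field selection, the `parse_llm_event` helper, and the consistency check are assumptions.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class LlmUsageEvent:
    # Hypothetical subset of the event fields shown in the JSON example.
    event_id: str
    actor_id: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    cost_usd: float


def parse_llm_event(payload: bytes) -> LlmUsageEvent:
    """Decode a usage.llm message and reject inconsistent token counts."""
    data = json.loads(payload)
    event = LlmUsageEvent(
        event_id=data["event_id"],
        actor_id=data["actor"]["actor_id"],
        model=data["model"],
        prompt_tokens=int(data["prompt_tokens"]),
        completion_tokens=int(data["completion_tokens"]),
        total_tokens=int(data["total_tokens"]),
        cost_usd=float(data["cost_usd"]),
    )
    if event.total_tokens != event.prompt_tokens + event.completion_tokens:
        raise ValueError(f"token mismatch in {event.event_id}")
    return event
```

The example event passes this check, since 1234 + 567 = 1801.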

#### Tool Usage Event
```json
{
  "event_id": "evt-456",
  "timestamp": "2025-11-24T12:35:00Z",
  "actor": {
    "actor_id": "agent:sofia",
    "actor_type": "agent"
  },
  "agent_id": "agent:sofia",
  "microdao_id": "microdao:7",
  "tool_id": "projects.list",
  "success": true,
  "latency_ms": 123,
  "result_size_bytes": 4567
}
```

**API Endpoints:**

```http
GET /internal/usage/summary?microdao_id=microdao:7&period=24h
→ Aggregate stats (tokens, calls, cost)

GET /internal/usage/agents?microdao_id=microdao:7&period=7d
→ Top agents by usage

GET /internal/usage/models?period=24h
→ Model distribution

GET /internal/usage/costs?microdao_id=microdao:7&period=30d
→ Cost breakdown
```
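
The `period` parameter and the summary aggregation might work roughly as follows. This is an in-memory sketch only: it assumes `period` accepts just hour and day suffixes (the only forms shown above), and `parse_period` and `summarize` are hypothetical helpers. A real implementation would more likely aggregate in SQL.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"h": "hours", "d": "days"}


def parse_period(period: str) -> timedelta:
    """Turn '24h', '7d' or '30d' into a timedelta; reject anything else."""
    match = re.fullmatch(r"(\d+)([hd])", period)
    if match is None:
        raise ValueError(f"unsupported period: {period!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: amount})


def summarize(events: list, period: str, now: datetime) -> dict:
    """Aggregate calls, tokens and cost for events newer than now - period."""
    cutoff = now - parse_period(period)
    recent = [e for e in events if e["timestamp"] >= cutoff]
    return {
        "calls": len(recent),
        "total_tokens": sum(e["total_tokens"] for e in recent),
        "cost_usd": round(sum(e["cost_usd"] for e in recent), 6),
    }
```

Rounding the cost to six decimal places matches the `DECIMAL(10, 6)` precision of the `cost_usd` column.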

**Database Tables:**

```sql
CREATE TABLE usage_llm (
    id UUID PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    actor_id TEXT NOT NULL,
    agent_id TEXT,
    microdao_id TEXT,
    model TEXT NOT NULL,
    provider TEXT NOT NULL,
    prompt_tokens INT NOT NULL,
    completion_tokens INT NOT NULL,
    total_tokens INT NOT NULL,
    latency_ms INT,
    cost_usd DECIMAL(10, 6)
);

CREATE TABLE usage_tool (
    id UUID PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    actor_id TEXT NOT NULL,
    agent_id TEXT,
    microdao_id TEXT,
    tool_id TEXT NOT NULL,
    success BOOLEAN NOT NULL,
    latency_ms INT,
    result_size_bytes INT
);

-- Indexes for fast queries
CREATE INDEX idx_usage_llm_microdao_time ON usage_llm(microdao_id, timestamp DESC);
CREATE INDEX idx_usage_llm_agent ON usage_llm(agent_id, timestamp DESC);
CREATE INDEX idx_usage_tool_microdao ON usage_tool(microdao_id, timestamp DESC);
```

---

### 4. **PEP Integration** (3 services) 🔜 0% COMPLETE

#### 4.1 messaging-service PEP
**File:** `services/messaging-service/pep_middleware.py`

```python
import asyncpg

from auth_service_client import get_actor_context
from pdp_service_client import evaluate_policy


async def check_send_message_permission(
    actor_id: str,
    channel_id: str,
    db_pool: asyncpg.Pool,
) -> bool:
    """Check if actor can send message to channel"""

    # 1. Get actor context
    actor = await get_actor_context(actor_id, db_pool)

    # 2. Evaluate policy
    decision = await evaluate_policy(
        actor=actor,
        action="send_message",
        resource={"type": "channel", "id": channel_id},
    )

    # 3. Return decision
    return decision.effect == "permit"
```

**Integration Points:**
- `POST /api/messaging/channels/{channel_id}/messages`: check before send
- `POST /api/messaging/channels`: check MANAGE permission
- `POST /api/messaging/channels/{channel_id}/members`: check INVITE permission

#### 4.2 agent-runtime PEP
**File:** `services/agent-runtime/pep_client.py`

```python
# Assumption: ActorIdentity is exposed by the auth-service client package;
# adjust the import path to wherever the shared model actually lives.
from auth_service_client import ActorIdentity
from pdp_service_client import evaluate_policy


async def check_tool_execution_permission(
    agent_id: str,
    tool_id: str,
    microdao_id: str,
) -> bool:
    """Check if agent can execute tool"""

    # Build agent actor
    actor = ActorIdentity(
        actor_id=agent_id,
        actor_type="agent",
        microdao_ids=[microdao_id],
        roles=["agent"],
    )

    # Evaluate
    decision = await evaluate_policy(
        actor=actor,
        action="exec_tool",
        resource={"type": "tool", "id": tool_id},
    )

    return decision.effect == "permit"
```

**Integration:** Before calling toolcore in `handle_invocation()`

#### 4.3 toolcore PEP
**Already has:** `allowed_agents` in registry
**Additional:** Cross-check with PDP for user-initiated tool calls

---

### 5. **Audit Log** (1 migration) 🔜 0% COMPLETE

**File:** `migrations/004_create_security_audit.sql`

```sql
CREATE TABLE security_audit (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    actor_id TEXT NOT NULL,
    actor_type TEXT NOT NULL,
    action TEXT NOT NULL,
    resource_type TEXT NOT NULL,
    resource_id TEXT NOT NULL,
    decision TEXT NOT NULL,  -- permit/deny
    reason TEXT,
    context JSONB,
    ip_address INET,
    user_agent TEXT
);

CREATE INDEX idx_audit_timestamp ON security_audit(timestamp DESC);
CREATE INDEX idx_audit_actor ON security_audit(actor_id, timestamp DESC);
CREATE INDEX idx_audit_decision ON security_audit(decision, timestamp DESC);
CREATE INDEX idx_audit_resource ON security_audit(resource_type, resource_id);
```

**PDP Integration:**
After every `evaluate()` call, write to audit log:

```python
import json
from typing import Optional


async def log_audit_event(
    request: PolicyRequest,
    decision: PolicyDecision,
    context: Optional[dict] = None,
) -> None:
    """Write audit log entry"""
    # `db` is the service's database connection or pool, created elsewhere
    await db.execute(
        """
        INSERT INTO security_audit
            (actor_id, actor_type, action, resource_type, resource_id,
             decision, reason, context)
        VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
        """,
        request.actor.actor_id,
        request.actor.actor_type,
        request.action,
        request.resource.type,
        request.resource.id,
        decision.effect,
        decision.reason,
        json.dumps(context or {}),
    )
```

**NATS Security Events:**
- `security.suspicious`: publish on:
  - Multiple deny events (>5 in 1 min)
  - Unusual tool execution attempts
  - Privilege escalation attempts
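
The "more than 5 denies in 1 minute" trigger could be detected with a per-actor sliding window, sketched below. `DenyRateDetector` and its constants are hypothetical, and the state is held in process memory, which would not be shared across multiple PDP instances.

```python
from collections import defaultdict, deque

DENY_THRESHOLD = 5      # suspicious once there are more than 5 denies...
WINDOW_SECONDS = 60.0   # ...within one minute


class DenyRateDetector:
    """Tracks recent deny timestamps per actor in a sliding window."""

    def __init__(self) -> None:
        self._denies = defaultdict(deque)

    def record_deny(self, actor_id: str, ts: float) -> bool:
        """Record a deny at time `ts` (seconds); return True if suspicious."""
        window = self._denies[actor_id]
        window.append(ts)
        # Drop timestamps that have fallen out of the window.
        while window and ts - window[0] >= WINDOW_SECONDS:
            window.popleft()
        return len(window) > DENY_THRESHOLD
```

When `record_deny` returns `True`, the caller would publish an event on `security.suspicious`; the publishing code is left out because it depends on the NATS client the services use.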

---

### 6. **Infrastructure** (3 files) 🔜 0% COMPLETE

#### 6.1 docker-compose.phase4.yml
```yaml
services:
  auth-service:
    build: ./services/auth-service
    ports: ["7011:7011"]
    environment:
      - DATABASE_URL=postgresql://...

  pdp-service:
    build: ./services/pdp-service
    ports: ["7012:7012"]
    environment:
      - DATABASE_URL=postgresql://...

  usage-engine:
    build: ./services/usage-engine
    ports: ["7013:7013"]
    environment:
      - DATABASE_URL=postgresql://...
      - NATS_URL=nats://nats:4222

  # + All Phase 3 services
  llm-proxy:
    environment:
      - AUTH_SERVICE_URL=http://auth-service:7011

  # etc...
```

#### 6.2 scripts/start-phase4.sh
#### 6.3 scripts/stop-phase4.sh

---

### 7. **Documentation** (4 files) 🔜 0% COMPLETE

#### 7.1 docs/AUTH_SERVICE_SPEC.md
- Actor model
- Session management
- API keys
- Integration guide

#### 7.2 docs/PDP_SPEC.md
- Policy model
- Evaluation logic
- Policy configuration
- Adding new rules

#### 7.3 docs/USAGE_ENGINE_SPEC.md
- Event model
- NATS integration
- Aggregation queries
- Billing foundation

#### 7.4 PHASE4_READY.md
- Overview
- Quick start
- Testing guide
- Production readiness

---

## 📊 IMPLEMENTATION ROADMAP

### Week 1: Core Services
- ✅ auth-service (complete)
- 🔄 pdp-service (20% → 100%)
- 🔜 usage-engine (0% → 100%)

### Week 2: Integration
- 🔜 PEP hooks (messaging-service)
- 🔜 PEP hooks (agent-runtime)
- 🔜 PEP hooks (toolcore)

### Week 3: Audit & Testing
- 🔜 Audit log migration
- 🔜 Security events (NATS)
- 🔜 E2E testing

### Week 4: Documentation & Polish
- 🔜 All docs (4 files)
- 🔜 docker-compose
- 🔜 Scripts
- 🔜 PHASE4_READY.md

---

## 🎯 ACCEPTANCE CRITERIA

### Auth Service: ✅
- [x] Login works with mock users
- [x] Session tokens created & validated
- [x] API keys CRUD functional
- [x] actor_context helper ready

### PDP Service: 🔜
- [ ] /internal/pdp/evaluate works
- [ ] MicroDAO access rules
- [ ] Channel access rules
- [ ] Tool execution rules
- [ ] 10+ unit tests

### PEP Integration: 🔜
- [ ] messaging-service blocks unauthorized sends
- [ ] agent-runtime checks tool permissions
- [ ] toolcore enforces allowed_agents

### Usage Engine: 🔜
- [ ] usage.llm events collected
- [ ] usage.tool events collected
- [ ] /internal/usage/summary works
- [ ] Database tables created

### Audit Log: 🔜
- [ ] security_audit table exists
- [ ] PDP writes every decision
- [ ] Can query last 100 events
- [ ] security.suspicious events published

### Infrastructure: 🔜
- [ ] docker-compose.phase4.yml works
- [ ] All services healthy
- [ ] Start/stop scripts functional
- [ ] Documentation complete

---

## 🚀 QUICK START (After Complete)

```bash
# 1. Start Phase 4
./scripts/start-phase4.sh

# 2. Test Auth (JSON bodies need an explicit Content-Type)
curl -X POST http://localhost:7011/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email": "user@daarion.city"}'

# 3. Test PDP
curl -X POST http://localhost:7012/internal/pdp/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "actor": {...},
    "action": "send_message",
    "resource": {"type": "channel", "id": "..."}
  }'

# 4. Check Usage (URL quoted so zsh does not glob the "?")
curl "http://localhost:7013/internal/usage/summary?period=24h"

# 5. View Audit
docker exec daarion-postgres psql -U postgres -d daarion \
  -c "SELECT * FROM security_audit ORDER BY timestamp DESC LIMIT 10;"
```

---

## 🔜 AFTER PHASE 4

### Phase 5: Advanced Features
- Real Passkey integration
- OAuth2 providers
- Advanced policy language (ABAC)
- Dynamic policy updates
- Cost allocation & billing
- Security analytics dashboard

### Phase 6: Production Hardening
- Rate limiting (Redis)
- DDoS protection
- Penetration testing
- Security audit
- Compliance certification

---

## 📚 RESOURCES

**Specs:**
- Phase 4 Master Task (user-provided)
- [PHASE4_STARTED.md](../PHASE4_STARTED.md)

**Related:**
- [PHASE3_IMPLEMENTATION_COMPLETE.md](../PHASE3_IMPLEMENTATION_COMPLETE.md)
- [ALL_PHASES_STATUS.md](../ALL_PHASES_STATUS.md)

**Standards:**
- RBAC (Role-Based Access Control)
- ABAC (Attribute-Based Access Control)
- OAuth 2.0 / OpenID Connect
- Audit logging best practices

---

**Status:** 📋 Detailed Plan Complete
**Next:** Continue Implementation
**Version:** 1.0.0
**Last Updated:** 2025-11-24