PHASE 3 ROADMAP: Core Agent Services
After Phase 2 Agent Integration
Status: Planning → SPEC READY
Master Task: PHASE3_MASTER_TASK.md
Summary: PHASE3_READY.md
Priority: High
Estimated Time: 6-8 weeks
Dependencies: Phase 2 complete
Goal
Replace Phase 2 stubs with production-ready services:
- Real LLM Proxy (multi-provider routing)
- Real Agent Memory (RAG + vector DB)
- Tool Registry (agent actions)
- Agent Blueprint Management (CRUD + versioning)
Phase 3 Components
1. LLM Proxy Service (2 weeks)
Purpose: Centralized LLM gateway with routing, rate limiting, and cost tracking
Features:
- Multi-provider support (OpenAI, Anthropic, DeepSeek, Local)
- Model selection & routing
- Rate limiting per agent/microDAO
- Cost tracking & billing
- Streaming support
- Error handling & retries
- Prompt sanitization
API:
POST /internal/llm/proxy
{
"model": "gpt-4",
"messages": [...],
"stream": false,
"max_tokens": 1000,
"agent_id": "agent:sofia",
"microdao_id": "microdao:daarion"
}
GET /internal/llm/models
→ List available models
GET /internal/llm/usage?agent_id=agent:sofia&period=30d
→ Usage statistics
Tech Stack:
- FastAPI
- httpx for provider calls
- Redis for rate limiting
- PostgreSQL for usage tracking
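The model selection & routing feature can be sketched as a static model → provider table with availability-aware failover. Everything here (provider names, model lists, the `route_model` function) is illustrative, not the service's real API; actual provider calls would go through httpx with retries:

```python
# Sketch of the routing step only; the table contents are assumptions.
PROVIDERS: dict[str, list[str]] = {
    "openai": ["gpt-4", "gpt-4.1", "gpt-3.5-turbo"],
    "anthropic": ["claude-3-opus", "claude-3-sonnet"],
    "deepseek": ["deepseek-chat"],
}

def route_model(model: str, available: set[str]) -> str:
    """Return the first available provider that serves `model`."""
    for provider, models in PROVIDERS.items():
        if model in models and provider in available:
            return provider
    raise LookupError(f"no available provider for model {model!r}")
```

A retry layer would catch the provider's transport errors, mark the provider unavailable, and re-route through the same function.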
Files:
services/llm-proxy/
├── main.py
├── providers/
│   ├── openai.py
│   ├── anthropic.py
│   ├── deepseek.py
│   └── local.py
├── routing.py
├── rate_limiter.py
├── cost_tracker.py
├── models.py
└── config.yaml
2. Agent Memory Service (2 weeks)
Purpose: Persistent memory + RAG for agents
Features:
- Short-term memory (recent context)
- Mid-term memory (session/task memory)
- Long-term memory (knowledge base)
- Vector search (RAG)
- Memory indexing (from channel history)
- Memory pruning (for cost/performance)
- Per-agent & per-microDAO isolation
API:
POST /internal/agent-memory/query
{
"agent_id": "agent:sofia",
"microdao_id": "microdao:daarion",
"query": "What did we discuss about Phase 2?",
"k": 5,
"memory_types": ["mid_term", "long_term"]
}
→ Top-k relevant memories
POST /internal/agent-memory/store
{
"agent_id": "agent:sofia",
"microdao_id": "microdao:daarion",
"memory_type": "mid_term",
"content": {
"user_message": "...",
"agent_reply": "...",
"context": {...}
}
}
→ Store new memory
GET /internal/agent-memory/agents/{agent_id}/stats
→ Memory usage stats
Tech Stack:
- FastAPI
- PostgreSQL (structured memory)
- Qdrant/Weaviate/ChromaDB (vector DB for RAG)
- LangChain/LlamaIndex (RAG helpers)
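The top-k query behind /internal/agent-memory/query reduces to cosine similarity over stored embeddings, filtered by memory_type. A dependency-free sketch for illustration; a real deployment would delegate search to the vector DB, and the function and field names here are assumptions:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 when either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def query_memories(query_vec, memories, k=5, memory_types=None):
    """memories: dicts with 'embedding', 'memory_type', 'content'."""
    pool = [m for m in memories
            if memory_types is None or m["memory_type"] in memory_types]
    pool.sort(key=lambda m: cosine(query_vec, m["embedding"]), reverse=True)
    return pool[:k]
```

The memory_types filter is what lets the API request above restrict a query to ["mid_term", "long_term"].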
Files:
services/agent-memory/
├── main.py
├── vector_store.py
├── memory_manager.py
├── rag_engine.py
├── indexer.py
├── models.py
└── config.yaml
3. Tool Registry Service (1.5 weeks)
Purpose: Centralized tool definitions & execution for agents
Features:
- Tool catalog (list all available tools)
- Tool execution (secure sandbox)
- Tool permissions (agent → tool mapping)
- Tool versioning
- Execution logs & auditing
Tools (initial set):
- create_task(channel_id, title, description)
- create_followup(user_id, message_id, reminder_text, due_date)
- search_docs(query)
- create_project(microdao_id, name, description)
- summarize_channel(channel_id, period)
- send_notification(user_id, text)
API:
GET /internal/tools/catalog
→ List all tools
POST /internal/tools/execute
{
"tool_name": "create_task",
"agent_id": "agent:sofia",
"microdao_id": "microdao:daarion",
"parameters": {
"channel_id": "...",
"title": "Review Phase 2",
"description": "..."
}
}
→ Execute tool, return result
GET /internal/tools/agents/{agent_id}/permissions
→ List tools agent can use
Tech Stack:
- FastAPI
- Dynamic tool loading (plugins)
- Sandboxed execution (Docker/gVisor)
- PostgreSQL (tool definitions, permissions, logs)
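The catalog/permissions/execute flow can be sketched with a decorator-based registry. The names and the in-memory permission map below are illustrative assumptions, not the service's real interfaces; real state would live in PostgreSQL and execution would happen inside the sandbox:

```python
from typing import Any, Callable

TOOLS: dict[str, Callable[..., Any]] = {}
PERMISSIONS: dict[str, set[str]] = {}  # agent_id -> allowed tool names

def register_tool(name: str):
    """Add a function to the tool catalog under `name`."""
    def wrap(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return wrap

@register_tool("create_task")
def create_task(channel_id: str, title: str, description: str = "") -> dict:
    # A real implementation would call the tasks service.
    return {"status": "created", "channel_id": channel_id, "title": title}

def execute(tool_name: str, agent_id: str, parameters: dict) -> Any:
    """Permission check first, then dispatch (mirrors POST /internal/tools/execute)."""
    if tool_name not in PERMISSIONS.get(agent_id, set()):
        raise PermissionError(f"{agent_id} may not use {tool_name}")
    return TOOLS[tool_name](**parameters)
```

Listing TOOLS.keys() gives the catalog; PERMISSIONS[agent_id] answers the per-agent permissions endpoint.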
Files:
services/tool-registry/
├── main.py
├── catalog.py
├── executor.py
├── sandbox.py
├── permissions.py
├── tools/
│   ├── task_tools.py
│   ├── project_tools.py
│   ├── notification_tools.py
│   └── ...
└── config.yaml
4. Agent Blueprint Service (1 week)
Purpose: CRUD + versioning for agent definitions
Features:
- Create/Read/Update/Delete agent blueprints
- Blueprint versioning
- Blueprint templates (archetypes)
- Blueprint validation
- Blueprint inheritance
API:
GET /internal/agents/blueprints
→ List all blueprints
POST /internal/agents/blueprints
{
"code": "sofia_prime_v2",
"name": "Sofia Prime v2",
"model": "gpt-4.1",
"instructions": "...",
"capabilities": {...},
"tools": ["create_task", "summarize_channel"]
}
→ Create blueprint
GET /internal/agents/blueprints/{blueprint_id}
→ Get blueprint
GET /internal/agents/{agent_id}/blueprint
→ Get blueprint for specific agent instance
PUT /internal/agents/blueprints/{blueprint_id}
→ Update blueprint (creates new version)
Tech Stack:
- FastAPI
- PostgreSQL (blueprints, versions)
- YAML/JSON schema validation
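Validation plus versioning amounts to a schema check and a monotonically increasing version per blueprint code. A minimal sketch; the required field set is inferred from the POST example above and the helper names are assumptions:

```python
REQUIRED_FIELDS = {"code", "name", "model", "instructions"}

def validate_blueprint(bp: dict) -> list[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - bp.keys())]
    if not isinstance(bp.get("tools", []), list):
        errors.append("tools must be a list of tool names")
    return errors

def next_version(existing_versions: list[int]) -> int:
    """PUT never overwrites: it writes the next version number."""
    return max(existing_versions, default=0) + 1
```

Keeping old versions immutable is what lets a running agent pin a blueprint version while edits accumulate.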
Files:
services/agents-service/
├── main.py
├── blueprints/
│   ├── crud.py
│   ├── versioning.py
│   ├── validation.py
│   └── templates.py
├── models.py
└── config.yaml
5. Integration Updates (1 week)
Update agent-runtime to use real services:
# Before (Phase 2):
blueprint = await load_agent_blueprint(agent_id)  # Mock
memory = await query_memory(...)                  # Stub
llm_response = await generate_response(...)       # Stub

# After (Phase 3):
blueprint = await agents_service.get_blueprint(agent_id)  # Real
memory = await memory_service.query(...)                  # Real RAG
llm_response = await llm_proxy.generate(...)              # Real multi-provider

# NEW: Tool usage
if llm_suggests_tool_use:
    tool_result = await tool_registry.execute(tool_name, parameters)
    # Add tool result to context, call LLM again
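The "call LLM again" step is a loop with a bounded tool budget. A hedged sketch, assuming async clients exposing `generate` and `execute` methods; these are illustrative names, not the final interfaces:

```python
async def run_turn(llm, tools, messages: list, max_rounds: int = 3) -> str:
    """One user turn: alternate LLM calls and tool executions until the
    model answers in plain text or the tool budget runs out."""
    for _ in range(max_rounds):
        reply = await llm.generate(messages)
        call = reply.get("tool_call")
        if call is None:
            return reply["text"]
        result = await tools.execute(call["name"], call["parameters"])
        # Feed the tool result back into the context for the next round.
        messages.append(
            {"role": "tool", "name": call["name"], "content": result})
    return "Sorry, I could not finish that within the tool budget."
```

Bounding the rounds keeps a confused model from looping on tool calls indefinitely.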
Timeline
Week 1-2: LLM Proxy
- Week 1: Core routing + OpenAI provider
- Week 2: Multi-provider + rate limiting + cost tracking
Week 3-4: Agent Memory
- Week 3: Vector store setup + basic RAG
- Week 4: Memory management + indexing
Week 5-6: Tool Registry
- Week 5: Catalog + basic tools (task, followup)
- Week 6: Executor + permissions + sandboxing
Week 7: Agent Blueprint Service
- CRUD + versioning + validation
Week 8: Integration & Testing
- Update agent-runtime
- E2E testing
- Performance optimization
- Documentation
Testing Strategy
LLM Proxy Testing:
- Unit: Each provider (OpenAI, Anthropic, etc.)
- Integration: Rate limiting, cost tracking
- Load: 100 concurrent requests
- Failover: Provider unavailable scenarios
Agent Memory Testing:
- RAG accuracy: Retrieve relevant memories
- Memory indexing: Auto-index from channels
- Vector search performance: < 500ms
- Memory pruning: Clean old memories
Tool Registry Testing:
- Tool execution: All tools work
- Permissions: Agent cannot use unauthorized tools
- Sandboxing: Tools cannot escape sandbox
- Audit logs: All executions logged
E2E Testing:
- User asks agent to create task → Task created
- User asks agent to summarize → Summary posted
- Agent uses memory correctly in replies
- Multiple providers work (switch between OpenAI/DeepSeek)
Acceptance Criteria
Phase 3 Complete When:
- LLM Proxy supports 3+ providers
- Agent Memory RAG works (< 500ms queries)
- Tool Registry has 5+ working tools
- Agent Blueprint CRUD works
- agent-runtime integrated with all services
- E2E: User → Agent (with tool use) → Result
- Cost tracking shows LLM usage per agent
- Memory usage shows per agent/microDAO
- All services pass health checks
- Documentation complete
Success Metrics
| Metric | Target |
|---|---|
| LLM response time | < 2s (non-streaming) |
| Memory query time | < 500ms |
| Tool execution time | < 3s |
| E2E agent reply | < 5s (with tool use) |
| LLM cost per request | < $0.05 |
| System uptime | > 99.5% |
Dependencies
External Services:
- OpenAI API (for GPT-4)
- Anthropic API (for Claude, optional)
- DeepSeek API (optional)
- Qdrant/Weaviate (for vector DB)
Internal Services:
- PostgreSQL (for all structured data)
- Redis (for rate limiting, caching)
- NATS (for events)
Optional Enhancements (Phase 3.5)
LLM Proxy:
- Streaming SSE support
- Local model support (Ollama, vLLM)
- Prompt caching
- A/B testing for prompts
Agent Memory:
- Hierarchical memory (microDAO → team → agent)
- Memory sharing between agents
- Memory snapshots (save/restore agent state)
- Memory analytics dashboard
Tool Registry:
- Tool marketplace (community tools)
- Tool composition (chain tools)
- Visual tool builder
- Tool usage analytics
Quick Start (After Phase 2)
To prepare for Phase 3:
# 1. Review Phase 3 roadmap
cat docs/tasks/PHASE3_ROADMAP.md
# 2. Set up external services
# - Get OpenAI API key
# - Set up Qdrant (Docker or cloud)
# - Set up Redis
# 3. Start with LLM Proxy
mkdir -p services/llm-proxy
cd services/llm-proxy
# Follow PHASE3_LLM_PROXY_TASK.md (to be created)
Task Files (To Be Created)
After Phase 2 complete, create detailed tasks:
- TASK_PHASE3_LLM_PROXY.md (2 weeks)
- TASK_PHASE3_AGENT_MEMORY.md (2 weeks)
- TASK_PHASE3_TOOL_REGISTRY.md (1.5 weeks)
- TASK_PHASE3_BLUEPRINT_SERVICE.md (1 week)
- TASK_PHASE3_INTEGRATION.md (1 week)
Architecture Evolution
Phase 1 (Complete):
User → Frontend → messaging-service → Matrix → Frontend
Phase 2 (Current):
User → Messenger → agent_filter → Router → agent-runtime (stub) → Reply
Phase 3 (Target):
User → Messenger
    ↓
agent_filter → Router → agent-runtime
    ↓
    ├─ LLM Proxy → [OpenAI | Anthropic | DeepSeek]
    ├─ Agent Memory → [Vector DB | PostgreSQL]
    ├─ Tool Registry → [Task | Project | Notification tools]
    └─ Agent Blueprint → [Definitions | Versions]
    ↓
Reply with tool results
Current Status
- Phase 1: Messenger Core (Complete)
- Phase 2: Agent Integration (In Progress)
- Phase 3: Core Services (This Roadmap)
- Phase 4: Advanced Features (TBD)
Ready for Phase 3?
First complete Phase 2, then return to this roadmap for detailed implementation tasks.
Version: 1.0.0
Date: 2025-11-24
Status: Planning