🚀 PHASE 3 READY — LLM Proxy + Memory + Tools
Status: 📋 Ready to implement
Dependencies: Phase 2 complete ✅
Estimated Time: 6-8 weeks
Priority: High
🎯 Goal
Make the DAARION agents truly intelligent:
- LLM Proxy — a single entry point for all LLM requests (OpenAI, DeepSeek, Local)
- Memory Orchestrator — a unified API for short/mid/long-term memory
- Toolcore — a tool registry plus safe execution
Phase 3 = Infrastructure for Agent Intelligence
📦 What Will Be Built
1. LLM Proxy Service
Port: 7007
Purpose: Unified LLM gateway
Features:
- ✅ Multi-provider support (OpenAI, DeepSeek, Local)
- ✅ Model routing (logical → physical models)
- ✅ Usage logging (tokens, latency per agent)
- ✅ Rate limiting per agent
- ✅ Cost tracking hooks
API:
```
POST /internal/llm/proxy
{
  "model": "gpt-4.1-mini",
  "messages": [...],
  "metadata": { "agent_id": "...", "microdao_id": "..." }
}
```
Deliverables: 10 files
main.py, models.py, router.py, providers/ (OpenAI, DeepSeek, Local), config.yaml, Dockerfile, README.md
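The logical-to-physical model routing described above can be sketched as a small lookup table with a graceful fallback. The `ROUTES` table and `resolve_model` helper are illustrative names, not part of the actual service:

```python
# Sketch of logical -> physical model routing, assuming a config-driven
# routing table. All names here are illustrative.
ROUTES = {
    # logical model -> (provider, physical model)
    "gpt-4.1-mini": ("openai", "gpt-4.1-mini"),
    "deepseek-chat": ("deepseek", "deepseek-chat"),
    "local-default": ("local", "llama3"),
}

def resolve_model(logical: str) -> tuple[str, str]:
    """Map a logical model name to a (provider, physical_model) pair."""
    try:
        return ROUTES[logical]
    except KeyError:
        # Graceful fallback: route unknown models to the local provider.
        return ("local", "llama3")
```

In the real service the table would come from `config.yaml`, so models can be remapped without a code change.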
2. Memory Orchestrator Service
Port: 7008
Purpose: Unified memory API
Features:
- ✅ Short-term memory (channel context)
- ✅ Mid-term memory (agent RAG)
- ✅ Long-term memory (knowledge base)
- ✅ Vector search (embeddings)
- ✅ Memory indexing pipeline
API:
```
POST /internal/agent-memory/query
{
  "agent_id": "agent:sofia",
  "microdao_id": "microdao:7",
  "query": "What were recent changes?",
  "limit": 5
}

POST /internal/agent-memory/store
{
  "agent_id": "...",
  "content": { "user_message": "...", "agent_reply": "..." }
}
```
Deliverables: 9 files
main.py, models.py, router.py, backends/ (PostgreSQL, Vector Store, KB), embedding_client.py, config.yaml, README.md
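The "simple cosine" vector search from the acceptance criteria can be sketched in a few lines. The memory record shape (`{"embedding": ...}`) and function names are assumptions for illustration:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def query_memories(query_vec: list[float], memories: list[dict], limit: int = 5) -> list[dict]:
    """Rank stored memories by similarity to the query embedding, return the top `limit`."""
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m["embedding"]), reverse=True)
    return ranked[:limit]
```

This is enough for a first pass; once volume grows, the same ranking would move into pgvector or a dedicated vector store.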
3. Toolcore Service
Port: 7009
Purpose: Tool registry + execution
Features:
- ✅ Tool registry (config-based → DB-backed later)
- ✅ Permission checks (agent → tool mapping)
- ✅ HTTP executor (call external services)
- ✅ Python executor (optional, for internal functions)
- ✅ Error handling + timeouts
API:
```
GET /internal/tools
→ List available tools

POST /internal/tools/call
{
  "tool_id": "projects.list",
  "agent_id": "agent:sofia",
  "args": { "microdao_id": "microdao:7" }
}
```
Deliverables: 8 files
main.py, models.py, registry.py, executors/ (HTTP, Python), config.yaml, Dockerfile, README.md
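A minimal sketch of the config-based registry with per-agent permission checks. The registry shape, `ToolError`, and `check_and_resolve` are illustrative, assuming the agent → tool allowlist model described here:

```python
# Illustrative static registry; in the service this would be loaded
# from config.yaml (and from a DB in a later phase).
TOOL_REGISTRY = {
    "projects.list": {
        "url": "http://projects-service/internal/projects",
        "allowed_agents": {"agent:sofia"},
    },
}

class ToolError(Exception):
    """Raised for unknown tools or denied permissions."""

def check_and_resolve(tool_id: str, agent_id: str) -> dict:
    """Look up a tool and enforce the agent allowlist before execution."""
    tool = TOOL_REGISTRY.get(tool_id)
    if tool is None:
        raise ToolError(f"unknown tool: {tool_id}")
    if agent_id not in tool["allowed_agents"]:
        raise ToolError(f"{agent_id} is not allowed to call {tool_id}")
    return tool
```

`POST /internal/tools/call` would run this check first, then hand the resolved entry to the HTTP executor.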
🔄 Updated Architecture
Before (Phase 2):
```
agent-runtime:
- Mock LLM responses
- Optional memory
- No tools
```
After (Phase 3):
```
agent-runtime:
  ↓
  ├─ LLM Proxy → [OpenAI | DeepSeek | Local]
  ├─ Memory Orchestrator → [Vector DB | PostgreSQL]
  └─ Toolcore → [projects.list | task.create | ...]
```
🎯 Acceptance Criteria
LLM Proxy:
- ✅ 2+ providers working (e.g., OpenAI + Local stub)
- ✅ Model routing from config
- ✅ Usage logging per agent
- ✅ Health checks pass
Memory Orchestrator:
- ✅ Query returns relevant memories
- ✅ Store saves new memories
- ✅ Vector search works (simple cosine)
- ✅ agent-runtime integration
Toolcore:
- ✅ Tool registry loaded from config
- ✅ 1+ tool working (e.g., projects.list)
- ✅ Permission checks work
- ✅ HTTP executor functional
E2E:
- ✅ Agent uses real LLM (not mock)
- ✅ Agent uses memory (RAG)
- ✅ Agent can call tools
- ✅ Full flow: User → Agent (with tool) → Reply
📅 Timeline
| Week | Focus | Deliverables |
|---|---|---|
| 1-2 | LLM Proxy | Service + 2 providers |
| 3-4 | Memory Orchestrator | Service + vector search |
| 5-6 | Toolcore | Service + 1 tool |
| 7 | Integration | Update agent-runtime |
| 8 | Testing | E2E + optimization |
Total: 8 weeks planned (6-8 weeks is the realistic range)
🚀 How to Start
Option 1: Cursor AI
```bash
# Copy the Phase 3 master task
cat docs/tasks/PHASE3_MASTER_TASK.md | pbcopy
# Paste into Cursor AI
# Wait for implementation (~1-2 hours per service)
```
Option 2: Manual
```bash
# 1. Start with the LLM Proxy
mkdir -p services/llm-proxy
cd services/llm-proxy
# Follow PHASE3_MASTER_TASK.md

# 2. Then the Memory Orchestrator
mkdir -p services/memory-orchestrator
# ...

# 3. Then Toolcore
mkdir -p services/toolcore
# ...
```
🔗 Key Files
Specification:
- PHASE3_MASTER_TASK.md ⭐ Main task
- PHASE3_ROADMAP.md — Detailed planning
Phase 2 (Complete):
- PHASE2_COMPLETE.md — What's already built
- IMPLEMENTATION_SUMMARY.md
💡 Key Concepts
LLM Proxy:
- Logical models (gpt-4.1-mini) → Physical providers (OpenAI API)
- Routing via config
- Cost tracking per agent
- Graceful fallbacks
Memory Orchestrator:
- Short-term: Recent channel messages
- Mid-term: RAG embeddings (conversations, tasks)
- Long-term: Knowledge base (docs, roadmaps)
- Vector search for relevance
Toolcore:
- Static registry (config.yaml) → Dynamic registry (DB) later
- HTTP executor: Call external services
- Permission model: Agent → Tool allowlist
- Error handling: Timeouts, retries
📊 Service Ports
| Service | Port | Purpose |
|---|---|---|
| messaging-service | 7004 | REST + WebSocket |
| agent-filter | 7005 | Filtering |
| agent-runtime | 7006 | Agent execution |
| llm-proxy | 7007 | LLM gateway ✨ |
| memory-orchestrator | 7008 | Memory API ✨ |
| toolcore | 7009 | Tool execution ✨ |
| router | 8000 | Event routing |
🎓 What You'll Learn
Technologies:
- LLM API integration (OpenAI, DeepSeek)
- Vector embeddings + similarity search
- Tool execution patterns
- Provider abstraction
- Cost tracking
- Rate limiting
Architecture:
- Gateway pattern (LLM Proxy)
- Orchestrator pattern (Memory)
- Registry pattern (Toolcore)
- Multi-provider routing
- Graceful degradation
🐛 Expected Challenges
LLM Proxy:
- API key management
- Rate limits from providers
- Cost control
- Streaming support (Phase 3.5)
Mitigation:
- Environment variables for keys
- In-memory rate limiting
- Usage logging
- Streaming as TODO
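The in-memory rate limiting mitigation can be sketched as a per-agent token bucket. Class and parameter names are illustrative, not from the service:

```python
import time

class TokenBucket:
    """Simple in-memory rate limiter: one bucket per agent."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The proxy would keep a `dict[agent_id, TokenBucket]` and reject (or queue) requests when `allow()` returns False.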
Memory Orchestrator:
- Vector search performance
- Embedding generation latency
- Memory indexing pipeline
- Relevance tuning
Mitigation:
- Simple cosine similarity first
- Async embedding generation
- Background indexing jobs
- A/B testing for relevance
Toolcore:
- Tool permission model
- Execution sandboxing
- Error handling
- Tool discovery
Mitigation:
- Config-based permissions v1
- HTTP executor with timeouts
- Comprehensive error types
- Static registry → DB later
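The "HTTP executor with timeouts" mitigation can be sketched with Python's stdlib. The real service would likely use an async client; `http_execute` and the result shape are hypothetical:

```python
import json
import urllib.request

def http_execute(url: str, args: dict, timeout: float = 5.0) -> dict:
    """POST tool args to an external service, converting failures into error results."""
    req = urllib.request.Request(
        url,
        data=json.dumps(args).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return {"ok": True, "result": json.loads(resp.read())}
    except Exception as exc:  # timeouts, connection errors, bad JSON
        return {"ok": False, "error": type(exc).__name__}
```

Returning a structured error instead of raising lets the agent-runtime report a failed tool call to the LLM rather than crashing the turn.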
🔜 After Phase 3
Phase 3.5 (Optional Enhancements):
- Streaming LLM responses
- Advanced memory strategies
- Tool composition
- Agent-to-agent communication
Phase 4 (Next Major):
- Usage & Billing system
- Security (PDP/PEP)
- Advanced monitoring
- Agent marketplace
✅ Checklist Before Starting
Prerequisites:
- ✅ Phase 2 complete and tested
- ✅ NATS running
- ✅ PostgreSQL running
- ✅ Docker Compose working
- ✅ OpenAI API key (optional, can use local)
Recommended:
- Local LLM setup (Ollama/vLLM) for testing
- Vector DB exploration (pgvector extension)
- Review existing tools in your stack
🎉 Success Looks Like
After Phase 3:
- ✅ Agent Sofia uses real GPT-4 (not mock)
- ✅ Agent remembers past conversations (RAG)
- ✅ Agent can list projects (tool execution)
- ✅ All flows < 5s latency
- ✅ Usage tracked per agent
- ✅ Production ready
Example Flow:
```
User: "Sofia, what's new in project X?"
  ↓
agent-runtime:
  1. Query memory (past discussions about project X)
  2. Call tool: projects.list(microdao_id)
  3. Build prompt with context + tool results
  4. Call LLM Proxy (GPT-4)
  5. Post reply
  ↓
Sofia: "Project X has 3 new tasks:
  1. Finish Phase 2 testing
  2. Start Phase 3 LLM integration
  3. Update the documentation
  The last update was yesterday."
```
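The five-step flow above can be sketched with stubbed service clients; every name here is an illustrative stand-in for the real Phase 3 services:

```python
class MemoryStub:
    """Stand-in for the Memory Orchestrator client."""
    def query(self, agent_id, query, limit=5):
        return ["past discussion about project X"]

class ToolsStub:
    """Stand-in for the Toolcore client."""
    def call(self, tool_id, agent_id, args):
        return ["Finish Phase 2 testing", "Start Phase 3 LLM integration"]

class LLMStub:
    """Stand-in for the LLM Proxy client."""
    def complete(self, model, prompt):
        return "Project X has 2 new tasks."

def handle_message(user_msg, agent_id, microdao_id, memory, tools, llm):
    # 1. Query memory for relevant context
    context = memory.query(agent_id=agent_id, query=user_msg)
    # 2. Call a tool for fresh data
    projects = tools.call("projects.list", agent_id, {"microdao_id": microdao_id})
    # 3. Build the prompt from context + tool results
    prompt = f"Context: {context}\nProjects: {projects}\nUser: {user_msg}"
    # 4. Call the LLM Proxy
    reply = llm.complete(model="gpt-4.1-mini", prompt=prompt)
    # 5. Post the reply (here: just return it)
    return reply
```

Swapping each stub for the real HTTP client is exactly the Week 7 integration work in the timeline.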
📞 Next Actions
This Week:
- ✅ Review PHASE3_MASTER_TASK.md
- ✅ Decide: Cursor AI or manual
- ✅ Set up OpenAI API key (or local LLM)
- ✅ Review tool requirements
Next Week:
- 🔜 Start LLM Proxy implementation
- 🔜 Test with 2 providers
- 🔜 Integrate with agent-runtime
Status: 📋 ALL SPECS READY
Version: 1.0.0
Last Updated: 2025-11-24
READY TO BUILD PHASE 3! 🚀