microdao-daarion/PHASE3_READY.md

🚀 PHASE 3 READY — LLM Proxy + Memory + Tools

Status: 📋 Ready to implement
Dependencies: Phase 2 complete
Estimated Time: 6-8 weeks
Priority: High


🎯 Goal

Make the DAARION agents truly intelligent:

  • LLM Proxy — a single entry point for all LLM requests (OpenAI, DeepSeek, Local)
  • Memory Orchestrator — a unified API for short-, mid-, and long-term memory
  • Toolcore — tool registry + safe execution

Phase 3 = Infrastructure for Agent Intelligence


📦 What Will Be Built

1. LLM Proxy Service

Port: 7007
Purpose: Unified LLM gateway

Features:

  • Multi-provider support (OpenAI, DeepSeek, Local)
  • Model routing (logical → physical models)
  • Usage logging (tokens, latency per agent)
  • Rate limiting per agent
  • Cost tracking hooks

API:

POST /internal/llm/proxy
{
  "model": "gpt-4.1-mini",
  "messages": [...],
  "metadata": { "agent_id": "...", "microdao_id": "..." }
}

Deliverables: 10 files

  • main.py, models.py, router.py
  • providers/ (OpenAI, DeepSeek, Local)
  • config.yaml, Dockerfile, README.md
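The core of the proxy is mapping logical model names to physical providers. A minimal sketch of that routing, assuming a config shape with per-model entries and a fallback (names here are illustrative, not the actual DAARION config schema):

```python
# Hypothetical routing table, as it might be loaded from config.yaml.
ROUTING = {
    "gpt-4.1-mini": {"provider": "openai", "model": "gpt-4.1-mini"},
    "local-small": {"provider": "local", "model": "llama-3.1-8b"},
}
# Graceful fallback for unknown logical names.
FALLBACK = {"provider": "local", "model": "llama-3.1-8b"}

def route(logical_model: str) -> dict:
    """Resolve a logical model name to a physical provider/model pair."""
    return ROUTING.get(logical_model, FALLBACK)
```

Keeping the table in config (not code) lets you repoint a logical model at a new provider without touching the agents.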

2. Memory Orchestrator Service

Port: 7008
Purpose: Unified memory API

Features:

  • Short-term memory (channel context)
  • Mid-term memory (agent RAG)
  • Long-term memory (knowledge base)
  • Vector search (embeddings)
  • Memory indexing pipeline

API:

POST /internal/agent-memory/query
{
  "agent_id": "agent:sofia",
  "microdao_id": "microdao:7",
  "query": "What were recent changes?",
  "limit": 5
}

POST /internal/agent-memory/store
{
  "agent_id": "...",
  "content": { "user_message": "...", "agent_reply": "..." }
}

Deliverables: 9 files

  • main.py, models.py, router.py
  • backends/ (PostgreSQL, Vector Store, KB)
  • embedding_client.py, config.yaml, README.md
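The "simple cosine" vector search mentioned in the acceptance criteria can be sketched in a few lines; memory ids and embeddings here are placeholders, and a real backend would delegate this to the vector store:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec: list[float], memories: list[tuple[str, list[float]]], k: int = 5):
    """Rank (memory_id, embedding) pairs by similarity to the query vector."""
    scored = [(mid, cosine(query_vec, emb)) for mid, emb in memories]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:k]
```

This brute-force scan is fine for a first version; swapping in an indexed vector store later changes only the backend, not the `/query` contract.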

3. Toolcore Service

Port: 7009
Purpose: Tool registry + execution

Features:

  • Tool registry (config-based → DB-backed later)
  • Permission checks (agent → tool mapping)
  • HTTP executor (call external services)
  • Python executor (optional, for internal functions)
  • Error handling + timeouts

API:

GET /internal/tools
→ List available tools

POST /internal/tools/call
{
  "tool_id": "projects.list",
  "agent_id": "agent:sofia",
  "args": { "microdao_id": "microdao:7" }
}

Deliverables: 8 files

  • main.py, models.py, registry.py
  • executors/ (HTTP, Python)
  • config.yaml, Dockerfile, README.md
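The registry plus permission check can be sketched as below. The registry shape, backend URL, and agent ids are assumptions mirroring the examples above, not the actual Toolcore schema:

```python
# Hypothetical config-based tool registry with per-agent allowlists.
REGISTRY = {
    "projects.list": {
        "url": "http://example-backend:7010/projects",  # illustrative URL
        "timeout_s": 5,
        "allowed_agents": {"agent:sofia"},
    },
}

class ToolError(Exception):
    """Raised for unknown tools or denied access."""

def authorize(tool_id: str, agent_id: str) -> dict:
    """Return the tool spec if the agent may call it, else raise ToolError."""
    spec = REGISTRY.get(tool_id)
    if spec is None:
        raise ToolError(f"unknown tool: {tool_id}")
    if agent_id not in spec["allowed_agents"]:
        raise ToolError(f"{agent_id} is not allowed to call {tool_id}")
    return spec
```

The HTTP executor would then take the returned spec and call `spec["url"]` with `spec["timeout_s"]` as its request timeout.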

🔄 Updated Architecture

Before (Phase 2):

agent-runtime:
  - Mock LLM responses
  - Optional memory
  - No tools

After (Phase 3):

agent-runtime:
  ↓
  ├─ LLM Proxy → [OpenAI | DeepSeek | Local]
  ├─ Memory Orchestrator → [Vector DB | PostgreSQL]
  └─ Toolcore → [projects.list | task.create | ...]

🎯 Acceptance Criteria

LLM Proxy:

  • 2+ providers working (e.g., OpenAI + Local stub)
  • Model routing from config
  • Usage logging per agent
  • Health checks pass

Memory Orchestrator:

  • Query returns relevant memories
  • Store saves new memories
  • Vector search works (simple cosine)
  • agent-runtime integration

Toolcore:

  • Tool registry loaded from config
  • 1+ tool working (e.g., projects.list)
  • Permission checks work
  • HTTP executor functional

E2E:

  • Agent uses real LLM (not mock)
  • Agent uses memory (RAG)
  • Agent can call tools
  • Full flow: User → Agent (with tool) → Reply

📅 Timeline

| Week | Focus               | Deliverables            |
|------|---------------------|-------------------------|
| 1-2  | LLM Proxy           | Service + 2 providers   |
| 3-4  | Memory Orchestrator | Service + vector search |
| 5-6  | Toolcore            | Service + 1 tool        |
| 7    | Integration         | Update agent-runtime    |
| 8    | Testing             | E2E + optimization      |

Total: 8 weeks planned (6-8 weeks is realistic)


🚀 How to Start

Option 1: Cursor AI

# Copy Phase 3 master task
cat docs/tasks/PHASE3_MASTER_TASK.md | pbcopy

# Paste into Cursor AI
# Wait for implementation (~1-2 hours per service)

Option 2: Manual

# 1. Start with LLM Proxy
mkdir -p services/llm-proxy
cd services/llm-proxy
# Follow PHASE3_MASTER_TASK.md

# 2. Then Memory Orchestrator
mkdir -p services/memory-orchestrator
# ...

# 3. Then Toolcore
mkdir -p services/toolcore
# ...

🔗 Key Files

Specification:

Phase 2 (Complete):


💡 Key Concepts

LLM Proxy:

  • Logical models (gpt-4.1-mini) → Physical providers (OpenAI API)
  • Routing via config
  • Cost tracking per agent
  • Graceful fallbacks

Memory Orchestrator:

  • Short-term: Recent channel messages
  • Mid-term: RAG embeddings (conversations, tasks)
  • Long-term: Knowledge base (docs, roadmaps)
  • Vector search for relevance

Toolcore:

  • Static registry (config.yaml) → Dynamic registry (DB) later
  • HTTP executor: Call external services
  • Permission model: Agent → Tool allowlist
  • Error handling: Timeouts, retries

📊 Service Ports

| Service             | Port | Purpose          |
|---------------------|------|------------------|
| messaging-service   | 7004 | REST + WebSocket |
| agent-filter        | 7005 | Filtering        |
| agent-runtime       | 7006 | Agent execution  |
| llm-proxy           | 7007 | LLM gateway      |
| memory-orchestrator | 7008 | Memory API       |
| toolcore            | 7009 | Tool execution   |
| router              | 8000 | Event routing    |

🎓 What You'll Learn

Technologies:

  • LLM API integration (OpenAI, DeepSeek)
  • Vector embeddings + similarity search
  • Tool execution patterns
  • Provider abstraction
  • Cost tracking
  • Rate limiting

Architecture:

  • Gateway pattern (LLM Proxy)
  • Orchestrator pattern (Memory)
  • Registry pattern (Toolcore)
  • Multi-provider routing
  • Graceful degradation

🐛 Expected Challenges

LLM Proxy:

  • API key management
  • Rate limits from providers
  • Cost control
  • Streaming support (Phase 3.5)

Mitigation:

  • Environment variables for keys
  • In-memory rate limiting
  • Usage logging
  • Streaming as TODO

Memory Orchestrator:

  • Vector search performance
  • Embedding generation latency
  • Memory indexing pipeline
  • Relevance tuning

Mitigation:

  • Simple cosine similarity first
  • Async embedding generation
  • Background indexing jobs
  • A/B testing for relevance

Toolcore:

  • Tool permission model
  • Execution sandboxing
  • Error handling
  • Tool discovery

Mitigation:

  • Config-based permissions v1
  • HTTP executor with timeouts
  • Comprehensive error types
  • Static registry → DB later

🔜 After Phase 3

Phase 3.5 (Optional Enhancements):

  • Streaming LLM responses
  • Advanced memory strategies
  • Tool composition
  • Agent-to-agent communication

Phase 4 (Next Major):

  • Usage & Billing system
  • Security (PDP/PEP)
  • Advanced monitoring
  • Agent marketplace

Checklist Before Starting

Prerequisites:

  • Phase 2 complete and tested
  • NATS running
  • PostgreSQL running
  • Docker Compose working
  • OpenAI API key (optional, can use local)
  • Local LLM setup (Ollama/vLLM) for testing
  • Vector DB exploration (pgvector extension)
  • Review existing tools in your stack

🎉 Success Looks Like

After Phase 3:

  • Agent Sofia uses real GPT-4 (not mock)
  • Agent remembers past conversations (RAG)
  • Agent can list projects (tool execution)
  • All flows < 5s latency
  • Usage tracked per agent
  • Production ready

Example Flow:

User: "Sofia, what's new in project X?"
    ↓
agent-runtime:
  1. Query memory (past discussions about project X)
  2. Call tool: projects.list(microdao_id)
  3. Build prompt with context + tool results
  4. Call LLM Proxy (GPT-4)
  5. Post reply
    ↓
Sofia: "Project X has 3 new tasks:
1. Finish Phase 2 testing
2. Start Phase 3 LLM integration
3. Update the documentation
The last update was yesterday."
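The five steps in the flow above can be sketched as a single agent-runtime handler. The `memory`, `tools`, and `llm` objects are hypothetical stand-ins for HTTP clients of the three new services (ports 7008, 7009, 7007); the real wiring is defined in PHASE3_MASTER_TASK.md:

```python
def handle_message(user_msg, memory, tools, llm,
                   agent_id="agent:sofia", microdao_id="microdao:7"):
    # 1. Query memory for relevant past context.
    context = memory.query(agent_id=agent_id, query=user_msg, limit=5)
    # 2. Call a tool for fresh data.
    projects = tools.call("projects.list", agent_id=agent_id,
                          args={"microdao_id": microdao_id})
    # 3. Build the prompt from context + tool results.
    prompt = f"Context: {context}\nProjects: {projects}\nUser: {user_msg}"
    # 4. Call the LLM via the proxy.
    reply = llm.complete(model="gpt-4.1-mini",
                         messages=[{"role": "user", "content": prompt}])
    # 5. Store the exchange and post the reply.
    memory.store(agent_id=agent_id,
                 content={"user_message": user_msg, "agent_reply": reply})
    return reply
```

Each dependency is injected, so the handler can be unit-tested with stubs before any of the three services exists.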

📞 Next Actions

This Week:

  1. Review PHASE3_MASTER_TASK.md
  2. Decide: Cursor AI or manual
  3. Set up OpenAI API key (or local LLM)
  4. Review tool requirements

Next Week:

  1. 🔜 Start LLM Proxy implementation
  2. 🔜 Test with 2 providers
  3. 🔜 Integrate with agent-runtime

Status: 📋 ALL SPECS READY
Version: 1.0.0
Last Updated: 2025-11-24

READY TO BUILD PHASE 3! 🚀