# Agent Runtime Service

**Executes agent logic: reads context, calls the LLM, and posts replies.**
## Purpose

Agent Runtime is the execution engine for DAARION agents. It processes agent invocations by:

1. Loading agent blueprints
2. Reading channel history for context
3. Querying agent memory (RAG)
4. Building prompts for the LLM
5. Generating responses
6. Posting replies back to channels
7. Storing interactions in memory
## Architecture

```
router.invoke.agent (NATS)
        ↓
agent-runtime: Execute Agent Logic
        ↓
  ├─ Load Blueprint
  ├─ Read Channel History
  ├─ Query Memory (RAG)
  ├─ Call LLM
  └─ Post Reply
        ↓
messaging-service → Matrix → Frontend
```
## Features

- **NATS Integration**: Subscribes to `router.invoke.agent`
- **Context Loading**: Fetches the last 50 messages from the channel
- **Memory Integration**: Queries agent memory for relevant context
- **LLM Integration**: Calls LLM Proxy with the full context
- **Mock LLM**: Falls back to mock responses when LLM Proxy is unavailable (Phase 2)
- **Message Posting**: Posts replies via messaging-service
- **Memory Writeback**: Stores interactions for future RAG
## API

### Health Check

```http
GET /health
```

Response:

```json
{
  "status": "ok",
  "service": "agent-runtime",
  "version": "1.0.0",
  "nats_connected": true
}
```

### Test Invocation (Manual)

```http
POST /internal/agent-runtime/test-channel
Content-Type: application/json

{
  "agent_id": "agent:sofia",
  "entrypoint": "channel_message",
  "payload": {
    "channel_id": "7c72d497-27aa-4e75-bb2f-4a4a21d4f91f",
    "microdao_id": "microdao:daarion",
    "rewrite_prompt": null
  }
}
```

Response:

```json
{
  "status": "processed",
  "agent_id": "agent:sofia"
}
```
## Configuration

**File**: `config.yaml`

```yaml
nats:
  servers: ["nats://nats:4222"]
  invocation_subject: "router.invoke.agent"

services:
  messaging: "http://messaging-service:7004"
  agent_memory: "http://agent-memory:7008"
  llm_proxy: "http://llm-proxy:7007"

llm:
  default_model: "gpt-4"
  max_tokens: 1000
  temperature: 0.7

memory:
  query_top_k: 5
  enable_writeback: true
```
## Environment Variables

- `NATS_URL`: NATS server URL (default: `nats://nats:4222`)
- `MESSAGING_SERVICE_URL`: messaging-service URL (default: `http://messaging-service:7004`)
- `AGENT_MEMORY_URL`: agent-memory URL (default: `http://agent-memory:7008`)
- `LLM_PROXY_URL`: LLM Proxy URL (default: `http://llm-proxy:7007`)
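These can be read with plain `os.environ` lookups, using the `config.yaml` values as defaults (the exact precedence in `main.py` is an assumption):

```python
import os

# Environment variables override the config.yaml defaults (assumed precedence).
NATS_URL = os.environ.get("NATS_URL", "nats://nats:4222")
MESSAGING_SERVICE_URL = os.environ.get("MESSAGING_SERVICE_URL", "http://messaging-service:7004")
AGENT_MEMORY_URL = os.environ.get("AGENT_MEMORY_URL", "http://agent-memory:7008")
LLM_PROXY_URL = os.environ.get("LLM_PROXY_URL", "http://llm-proxy:7007")
```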

## Running Locally

### Install Dependencies

```bash
pip install -r requirements.txt
```

### Run Service

```bash
uvicorn main:app --reload --port 7006
```

### Test

```bash
# Health check
curl http://localhost:7006/health

# Test invocation
curl -X POST http://localhost:7006/internal/agent-runtime/test-channel \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "agent:sofia",
    "entrypoint": "channel_message",
    "payload": {
      "channel_id": "7c72d497-27aa-4e75-bb2f-4a4a21d4f91f",
      "microdao_id": "microdao:daarion"
    }
  }'
```
## Docker

### Build

```bash
docker build -t daarion/agent-runtime:latest .
```

### Run

```bash
docker run -p 7006:7006 \
  -e MESSAGING_SERVICE_URL=http://messaging-service:7004 \
  -e NATS_URL=nats://nats:4222 \
  daarion/agent-runtime:latest
```
## Agent Execution Flow

### 1. Receive Invocation

From NATS subject `router.invoke.agent`:

```json
{
  "agent_id": "agent:sofia",
  "entrypoint": "channel_message",
  "payload": {
    "channel_id": "uuid",
    "message_id": "uuid",
    "matrix_event_id": "$event:server",
    "microdao_id": "microdao:X",
    "rewrite_prompt": "Optional prompt modification"
  }
}
```
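A minimal sketch of parsing this payload; the dataclass fields mirror the example above, while the NATS wiring (commented) and `handle_invocation` are illustrative, not the service's actual code:

```python
import json
from dataclasses import dataclass, field

@dataclass
class AgentInvocation:
    """Invocation payload; field names taken from the JSON example above."""
    agent_id: str
    entrypoint: str
    payload: dict = field(default_factory=dict)

def parse_invocation(raw: bytes) -> AgentInvocation:
    """Decode a raw NATS message body into an AgentInvocation."""
    data = json.loads(raw)
    return AgentInvocation(data["agent_id"], data["entrypoint"], data.get("payload", {}))

# Wiring it to NATS (requires the nats-py package; handler name is hypothetical):
#
#   nc = await nats.connect("nats://nats:4222")
#   async def on_invoke(msg):
#       await handle_invocation(parse_invocation(msg.data))
#   await nc.subscribe("router.invoke.agent", cb=on_invoke)
```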

### 2. Load Blueprint

Currently mocked (Phase 2); will call `agents-service` in Phase 3:

```python
blueprint = {
    "name": "Sofia-Prime",
    "model": "gpt-4",
    "instructions": "System prompt...",
    "capabilities": {...},
    "tools": ["create_task", "summarize_channel"]
}
```
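The Development section below passes these same fields to an `AgentBlueprint` constructor; a plausible dataclass matching both examples (the exact model in `main.py` may differ):

```python
from dataclasses import dataclass, field

@dataclass
class AgentBlueprint:
    """Blueprint shape assumed from the mock above and the Development example below."""
    id: str
    name: str
    model: str
    instructions: str
    capabilities: dict = field(default_factory=dict)
    tools: list = field(default_factory=list)
```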

### 3. Load Context

Fetch the last 50 messages from `messaging-service`:

```http
GET /api/messaging/channels/{channel_id}/messages?limit=50
```

### 4. Query Memory

RAG query to `agent-memory`:

```http
POST /internal/agent-memory/query

{
  "agent_id": "agent:sofia",
  "microdao_id": "microdao:daarion",
  "query": "last user message",
  "k": 5
}
```
### 5. Build Prompt

Combine:

- System prompt (from blueprint)
- Rewrite prompt (if quiet hours)
- Memory context (top-K results)
- Recent messages (last 10)

### 6. Generate Response

Call LLM Proxy:

```http
POST /internal/llm/proxy

{
  "model": "gpt-4",
  "messages": [
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."},
    ...
  ]
}
```

Falls back to a mock response if LLM Proxy is unavailable.
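The fallback can be sketched with the LLM call injected as a coroutine, so any failure drops to a mock reply; `mock_reply` here is a placeholder (the service's keyword-based mock is described below):

```python
import asyncio

def mock_reply(text: str) -> str:
    """Placeholder fallback reply."""
    return f"(mock) Received: {text}"

async def generate_response(call_llm, messages: list) -> str:
    """Try the real LLM call; fall back to a mock reply on any failure."""
    try:
        return await call_llm(messages)
    except Exception:
        return mock_reply(messages[-1]["content"])

# Simulate LLM Proxy being down:
async def unavailable(_messages):
    raise ConnectionError("llm-proxy unreachable")

print(asyncio.run(generate_response(unavailable, [{"role": "user", "content": "hello"}])))
# → (mock) Received: hello
```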

### 7. Post Reply

Post to messaging-service:

```http
POST /internal/agents/{agent_id}/post-to-channel

{
  "channel_id": "uuid",
  "text": "Agent response..."
}
```
### 8. Store Memory

(Optional) Store the interaction:

```http
POST /internal/agent-memory/store

{
  "agent_id": "agent:sofia",
  "microdao_id": "microdao:daarion",
  "content": {
    "user_message": "...",
    "agent_reply": "..."
  }
}
```
## Mock LLM (Phase 2)

When LLM Proxy is not available, agent-runtime uses mock responses based on keywords:

- "привіт" / "hello" → Greeting
- "допомож" / "help" → Help menu
- "дяку" / "thank" → Thanks acknowledgment
- "phase 2" → Phase 2 explanation
- "?" → Question response

This allows testing the full flow without a real LLM in Phase 2.
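A sketch of this keyword routing; the returned strings are just the category names from the list above, and the service's actual reply texts will differ:

```python
def mock_llm_response(text: str) -> str:
    """Map keywords in the user message to a canned reply category."""
    lowered = text.lower()
    if "привіт" in lowered or "hello" in lowered:
        return "Greeting"
    if "допомож" in lowered or "help" in lowered:
        return "Help menu"
    if "дяку" in lowered or "thank" in lowered:
        return "Thanks acknowledgment"
    if "phase 2" in lowered:
        return "Phase 2 explanation"
    if "?" in text:
        return "Question response"
    return "Default reply"
```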

## NATS Events

### Subscribes To

- **Subject**: `router.invoke.agent`
- **Payload**: `AgentInvocation`

### Publishes To

None directly (replies go via messaging-service over HTTP).
## Monitoring

### Logs

```bash
# Docker
docker logs -f agent-runtime

# Look for:
# ✅ Connected to NATS
# ✅ Subscribed to router.invoke.agent
# 🤖 Processing agent invocation
# 📝 Agent: agent:sofia
# ✅ Loaded blueprint
# ✅ Fetched N messages
# 💬 User message: ...
# 🤔 Generating response...
# ✅ Generated response
# ✅ Posted message to channel
# ✅ Agent replied successfully
```
### Metrics (Future)

- Total invocations processed
- Average response time
- LLM calls per agent
- Success/failure rate
- Memory query latency
## Troubleshooting

### Agent Not Responding

**Check logs**:

```bash
docker logs agent-runtime | tail -50
```

**Common issues**:

1. Not receiving invocations from the router
2. Cannot fetch channel messages
3. LLM error (check the mock fallback)
4. Cannot post to the channel

### Cannot Post to Channel

```
❌ HTTP error posting message: 404
Endpoint not found. You may need to add it to messaging-service.
```

**Solution**: Ensure messaging-service exposes the internal endpoint:

```http
POST /internal/agents/{agent_id}/post-to-channel
```

### Memory Errors

```
⚠️ Agent Memory service not available (Phase 2 - OK)
```

This is expected in Phase 2; the agent continues without memory.
## Phase 2 vs Phase 3

### Phase 2 (Current)

- ✅ NATS subscription
- ✅ Channel history reading
- ✅ Mock LLM responses
- ✅ Message posting
- ⚠️ Mock agent blueprint
- ⚠️ Memory service optional

### Phase 3 (Future)

- Real agent blueprint loading from agents-service
- Real LLM via LLM Proxy (OpenAI, Anthropic, DeepSeek)
- Full RAG with a vector DB
- Tool invocation (create_task, etc.)
- Advanced prompt engineering
- Multi-agent coordination
## Development

### Adding New Capabilities

Edit `load_agent_blueprint()` in `main.py`:

```python
async def load_agent_blueprint(agent_id: str) -> AgentBlueprint:
    return AgentBlueprint(
        id=agent_id,
        name="Sofia-Prime",
        model="gpt-4",
        instructions="Your custom instructions...",
        capabilities={
            "can_create_tasks": True,
            "can_use_tools": True,    # NEW
            "can_access_docs": True   # NEW
        },
        tools=["new_tool", ...]  # NEW
    )
```
### Testing Locally

1. Start messaging-service (or mock it)
2. Start NATS (or run in test mode)
3. Run agent-runtime
4. Send a test invocation via HTTP

```bash
curl -X POST http://localhost:7006/internal/agent-runtime/test-channel \
  -H "Content-Type: application/json" \
  -d @test_invocation.json
```
## Future Enhancements

- [ ] Tool invocation support
- [ ] Multi-turn conversations
- [ ] Streaming responses
- [ ] Context window management
- [ ] Agent personality customization
- [ ] A/B testing for prompts
- [ ] Agent analytics dashboard
## Version

**Current**: 1.0.0  
**Status**: Production Ready (Phase 2)  
**Last Updated**: 2025-11-24