fix(critical): Senpai using Helion's memory — 3 root causes fixed

1. YAML structure bug: Senpai was under `policies:` instead of `agents:` in router-config.yml. The router couldn't find the Senpai config → no routing rule → fallback to local model.

2. tool_manager agent_id not passed: the memory_search and graph_query tools were called without agent_id → defaulted to "helion" → ALL agents' tool calls searched Helion's Qdrant collections. Fixed: agent_id now flows from main.py → execute_tool → _memory_search.

3. Config not mounted: router-config.yml was baked into the Docker image, so host changes had no effect. Added a volume mount in docker-compose.

Also added:
- Sofiia agent config + routing rule (was completely missing)
- Senpai routing rule: cloud_deepseek (was falling back to local qwen3:8b)
- Anti-echo instruction for memory brief injection

Deployed and verified on NODE1: Senpai now searches senpai_* collections.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-09 10:00:08 -08:00
parent 3b924118be
commit b9f7ca8ecf
3 changed files with 56 additions and 11 deletions
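Root cause 2 can be illustrated with a minimal, self-contained sketch of the propagation path. It is not the actual router code. The functions `collection_for`, `memory_search`, and `execute_tool` are simplified stand-ins, and the `<agent_id>_memories` collection name is an assumption. The commit itself only confirms that Senpai's collections match `senpai_*` and that the fallback agent is "helion".

```python
import asyncio
from typing import Dict, Optional

# Fallback agent named in the commit message.
DEFAULT_AGENT_ID = "helion"


def collection_for(agent_id: Optional[str]) -> str:
    """Hypothetical naming scheme: '<agent_id>_memories' (assumed, not from the source)."""
    return f"{agent_id or DEFAULT_AGENT_ID}_memories"


async def memory_search(args: Dict, agent_id: Optional[str] = None) -> str:
    """Stand-in for ToolManager._memory_search: returns the collection it would query."""
    return collection_for(agent_id)


async def execute_tool(tool_name: str, args: Dict, agent_id: Optional[str] = None) -> str:
    """Stand-in dispatcher: the fix is forwarding agent_id to the tool handler."""
    if tool_name == "memory_search":
        return await memory_search(args, agent_id=agent_id)
    raise ValueError(f"Unknown tool: {tool_name}")


if __name__ == "__main__":
    # With agent_id forwarded, Senpai searches its own collection.
    print(asyncio.run(execute_tool("memory_search", {"query": "x"}, agent_id="senpai")))
    # Without agent_id, the call silently falls back to Helion's collection.
    print(asyncio.run(execute_tool("memory_search", {"query": "x"})))
```

The second call shows why the original bug was easy to miss: omitting `agent_id` raises no error, it just resolves to the default agent's collection.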
--- a/services/router/tool_manager.py
+++ b/services/router/tool_manager.py
@@ -351,9 +351,9 @@ class ToolManager:
        try:
            # Priority 1: Memory/Knowledge tools
            if tool_name == "memory_search":
-                return await self._memory_search(arguments)
+                return await self._memory_search(arguments, agent_id=agent_id)
            elif tool_name == "graph_query":
-                return await self._graph_query(arguments)
+                return await self._graph_query(arguments, agent_id=agent_id)
            # Priority 2: Web tools
            elif tool_name == "web_search":
                return await self._web_search(arguments)
@@ -382,7 +382,7 @@ class ToolManager:
            logger.error(f"Tool execution failed: {e}")
            return ToolResult(success=False, result=None, error=str(e))
    
-    async def _memory_search(self, args: Dict) -> ToolResult:
+    async def _memory_search(self, args: Dict, agent_id: str = None) -> ToolResult:
        """Search in Qdrant vector memory using Router's memory_retrieval - PRIORITY 1"""
        query = args.get("query")
        
@@ -393,6 +393,7 @@ class ToolManager:
            if memory_retrieval and memory_retrieval.qdrant_client:
                results = await memory_retrieval.search_memories(
                    query=query,
+                    agent_id=agent_id or "helion",
                    limit=5
                )
                
@@ -543,7 +544,7 @@ class ToolManager:
        except Exception as e:
            return ToolResult(success=False, result=None, error=str(e))
    
-    async def _graph_query(self, args: Dict) -> ToolResult:
+    async def _graph_query(self, args: Dict, agent_id: str = None) -> ToolResult:
        """Query knowledge graph"""
        query = args.get("query")
        entity_type = args.get("entity_type")