feat: add Vision Encoder service + Vision RAG implementation

- Vision Encoder Service (OpenCLIP ViT-L/14, GPU-accelerated)
  - FastAPI app with text/image embedding endpoints (768-dim)
  - Docker support with NVIDIA GPU runtime
  - Port 8001, health checks, model info API

- Qdrant Vector Database integration
  - Port 6333/6334 (HTTP/gRPC)
  - Image embeddings storage (768-dim, Cosine distance)
  - Auto collection creation

- Vision RAG implementation
  - VisionEncoderClient (Python client for API)
  - Image Search module (text-to-image, image-to-image)
  - Vision RAG routing in DAGI Router (mode: image_search)
  - VisionEncoderProvider integration

- Documentation (5000+ lines)
  - SYSTEM-INVENTORY.md - Complete system inventory
  - VISION-ENCODER-STATUS.md - Service status
  - VISION-RAG-IMPLEMENTATION.md - Implementation details
  - vision_encoder_deployment_task.md - Deployment checklist
  - services/vision-encoder/README.md - Deployment guide
  - Updated WARP.md, INFRASTRUCTURE.md, Jupyter Notebook

- Testing
  - test-vision-encoder.sh - Smoke tests (6 tests)
  - Unit tests for client, image search, routing

- Services: 17 total (added Vision Encoder + Qdrant)
- AI Models: 3 (qwen3:8b, OpenCLIP ViT-L/14, BAAI/bge-m3)
- GPU Services: 2 (Vision Encoder, Ollama)
- VRAM Usage: ~10 GB (concurrent)

Status: Production Ready ✅
2025-11-17 05:24:36 -08:00
parent b2b51f08fb
commit 4601c6fca8
55 changed files with 13205 additions and 3 deletions
--- a/WARP.md
+++ b/WARP.md
@@ -0,0 +1,409 @@
+# WARP.md
+
+This file provides guidance to WARP (warp.dev) when working with code in this repository.
+
+## Repository Overview
+
+**DAGI Stack** (Decentralized Agentic Gateway Infrastructure) is a production-ready AI router with multi-agent orchestration, microDAO governance, and bot gateway integration. It's a microservices architecture for routing and orchestrating AI agents and LLM providers.
+
+### Infrastructure & Deployment
+
+**For complete infrastructure information** (servers, repositories, domains, deployment workflows), see:
+- **[INFRASTRUCTURE.md](./INFRASTRUCTURE.md)** — Production servers, GitHub repos, DNS, services, deployment
+- **[SYSTEM-INVENTORY.md](./SYSTEM-INVENTORY.md)** — Complete system inventory (GPU, AI models, services)
+- **[docs/infrastructure_quick_ref.ipynb](./docs/infrastructure_quick_ref.ipynb)** — Jupyter Notebook for quick search
+
+## Quick Start Commands
+
+### Development
+
+```bash
+# Start all services via Docker Compose
+docker-compose up -d
+
+# View logs for all services
+docker-compose logs -f
+
+# View logs for specific service
+docker-compose logs -f router
+docker-compose logs -f gateway
+docker-compose logs -f devtools
+docker-compose logs -f crewai
+docker-compose logs -f rbac
+
+# Stop all services
+docker-compose down
+
+# Rebuild and restart after code changes
+docker-compose up -d --build
+```
+
+### Testing
+
+```bash
+# Smoke tests - basic health checks for all services
+./smoke.sh
+
+# End-to-end tests for specific components
+./test-devtools.sh    # DevTools integration
+./test-crewai.sh      # CrewAI workflows
+./test-gateway.sh     # Gateway + RBAC
+./test-fastapi.sh     # FastAPI endpoints
+
+# RAG pipeline evaluation
+./tests/e2e_rag_pipeline.sh
+python tests/rag_eval.py
+
+# Unit tests
+python -m pytest test_config_loader.py
+python -m pytest services/parser-service/tests/
+python -m pytest services/rag-service/tests/
+```
+
+### Local Development (without Docker)
+
+```bash
+# Start Router (main service)
+python main_v2.py --config router-config.yml --port 9102
+
+# Start DevTools Backend
+cd devtools-backend && python main.py
+
+# Start CrewAI Orchestrator
+cd orchestrator && python crewai_backend.py
+
+# Start Bot Gateway
+cd gateway-bot && python main.py
+
+# Start RBAC Service
+cd microdao && python main.py
+```
+
+### Configuration
+
+```bash
+# Copy environment template
+cp .env.example .env
+
+# Edit configuration with your tokens and settings
+nano .env
+
+# Validate router configuration
+python config_loader.py
+```
+
+## Architecture
+
+### Core Services (Microservices)
+
+The DAGI Stack follows a microservices architecture with these primary services:
+
+**1. DAGI Router** (Port 9102)
+- Main routing engine that dispatches requests to appropriate providers
+- Rule-based routing with priority-ordered rules defined in `router-config.yml`
+- Handles RBAC context injection for microDAO chat mode
+- **Key files:**
+  - `main_v2.py` - FastAPI application entry point
+  - `router_app.py` - Core RouterApp class with request handling logic
+  - `routing_engine.py` - Rule matching and provider resolution
+  - `config_loader.py` - Configuration loading and validation with Pydantic models
+  - `router-config.yml` - Routing rules and provider configuration
+
+**2. Bot Gateway** (Port 9300)
+- HTTP server for bot platforms (Telegram, Discord)
+- Normalizes platform-specific messages to unified format
+- Integrates with RBAC service before forwarding to Router
+- Implements DAARWIZZ system agent
+- **Key files:** `gateway-bot/main.py`, `gateway-bot/http_api.py`, `gateway-bot/router_client.py`
+
+**3. DevTools Backend** (Port 8008)
+- Tool execution service for development tasks
+- File operations (read/write), test execution, notebook execution
+- Security: path validation, size limits
+- **Key files:** `devtools-backend/main.py`
+
+**4. CrewAI Orchestrator** (Port 9010)
+- Multi-agent workflow execution
+- Pre-configured workflows: `microdao_onboarding`, `code_review`, `proposal_review`, `task_decomposition`
+- **Key files:** `orchestrator/crewai_backend.py`
+
+**5. RBAC Service** (Port 9200)
+- Role-based access control with roles: admin, member, contributor, guest
+- DAO isolation for multi-tenancy
+- **Key files:** `microdao/` directory
+
+**6. RAG Service** (Port 9500)
+- Document retrieval and question answering
+- Uses embeddings (BAAI/bge-m3) and PostgreSQL for vector storage
+- Integrates with Router for LLM calls
+- **Key files:** `services/rag-service/`
+
+**7. Memory Service** (Port 8000)
+- Agent memory and context management
+- **Key files:** `services/memory-service/`
+
+**8. Parser Service**
+- Document parsing and Q&A generation
+- 2-stage pipeline: parse → Q&A build
+- **Key files:** `services/parser-service/`
+
+### Provider System
+
+The system uses a provider abstraction to support multiple backends:
+
+- **Base Provider** (`providers/base.py`) - Abstract base class
+- **LLM Provider** (`providers/llm_provider.py`) - Ollama, DeepSeek, OpenAI
+- **DevTools Provider** (`providers/devtools_provider.py`) - Development tools
+- **CrewAI Provider** (`providers/crewai_provider.py`) - Multi-agent orchestration
+- **Provider Registry** (`providers/registry.py`) - Centralized provider initialization
+
+### Routing System
+
+**Rule-Based Routing:**
+- Rules defined in `router-config.yml` with priority ordering (lower = higher priority)
+- Each rule specifies `when` conditions (mode, agent, metadata) and `use_llm`/`use_provider`
+- Routing engine (`routing_engine.py`) matches requests to providers via `RoutingTable` class
+- Special handling for `rag_query` mode (combines Memory + RAG → LLM)
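+
+As a minimal sketch of how priority ordering plays out, a narrower rule can be given a lower number so that it is tried before a catch-all. The rule ids, priorities, and profile names below are hypothetical, and it is an assumption that several keys under `when` must all match:
+
+```yaml
+routing:
+  # Hypothetical narrow rule: tried first because 10 < 90
+  - id: devtools_chat
+    priority: 10
+    when:
+      mode: chat
+      agent: devtools
+    use_llm: my_new_provider
+  # Hypothetical catch-all for the remaining chat requests
+  - id: default_chat
+    priority: 90
+    when:
+      mode: chat
+    use_llm: my_other_provider
+```
+
+With rules like these, a `chat` request for the `devtools` agent would be expected to resolve through `devtools_chat`, and any other `chat` request through `default_chat`.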
+
+**Request Flow:**
+1. Request arrives at Router via HTTP POST `/route`
+2. RBAC context injection (if chat mode with dao_id/user_id)
+3. Rule matching in priority order
+4. Provider resolution and invocation
+5. Response returned with provider metadata
+
+### Configuration Management
+
+Configuration uses YAML + Pydantic validation:
+
+- **`router-config.yml`** - Main config file with:
+  - `node` - Node identification
+  - `llm_profiles` - LLM provider configurations
+  - `orchestrator_providers` - Orchestrator backends
+  - `agents` - Agent definitions with tools
+  - `routing` - Routing rules (priority-ordered)
+  - `telemetry` - Logging and metrics
+  - `policies` - Rate limiting, cost tracking
+
+- **`config_loader.py`** - Loads and validates config with Pydantic models:
+  - `RouterConfig` - Top-level config
+  - `LLMProfile` - LLM provider settings
+  - `AgentConfig` - Agent configuration
+  - `RoutingRule` - Individual routing rule
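+
+The top-level layout described above can be pictured as the outline below. It shows section names only: the empty values are placeholders, the container types of `node`, `orchestrator_providers`, `agents`, `telemetry`, and `policies` are assumptions, and an outline this bare is not expected to pass validation.
+
+```yaml
+# Outline only; see router-config.yml for the real contents
+node: {}                     # node identification
+llm_profiles: {}             # mapping: profile name -> LLM provider settings
+orchestrator_providers: {}   # orchestrator backends
+agents: {}                   # agent definitions with tools
+routing: []                  # list of rules, priority-ordered
+telemetry: {}                # logging and metrics
+policies: {}                 # rate limiting, cost tracking
+```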
+
+## Key Concepts
+
+### Agents and Modes
+
+**Agents:**
+- `devtools` - Development assistant (code analysis, refactoring, testing)
+- `microdao_orchestrator` - Multi-agent workflow coordinator
+- DAARWIZZ - System orchestrator agent (in Gateway)
+
+**Modes:**
+- `chat` - Standard chat with RBAC context injection
+- `devtools` - Tool execution mode (file ops, tests)
+- `crew` - CrewAI workflow orchestration
+- `rag_query` - RAG + Memory hybrid query
+- `qa_build` - Q&A generation from documents
+
+### RBAC Context Injection
+
+For microDAO chat mode, the Router automatically enriches requests with RBAC context:
+- Fetches user roles and entitlements from RBAC service
+- Injects into `payload.context.rbac` before provider call
+- See `router_app.py:handle()` for implementation
+
+### Multi-Agent Ecosystem
+
+The stack follows the DAARION.city agent hierarchy (A1-A4):
+- **A1** - DAARION.city system agents (DAARWIZZ)
+- **A2** - Platform agents (GREENFOOD, Energy Union, Water Union, etc.)
+- **A3** - Public microDAO agents
+- **A4** - Private microDAO agents
+
+See `docs/agents.md` for complete agent map.
+
+## Development Workflow
+
+### Adding a New LLM Provider
+
+1. Add profile to `router-config.yml`:
+```yaml
+llm_profiles:
+  my_new_provider:
+    provider: openai
+    base_url: https://api.example.com
+    model: my-model
+    api_key_env: MY_API_KEY
+```
+
+2. Add routing rule:
+```yaml
+routing:
+  - id: my_rule
+    priority: 50
+    when:
+      mode: custom_mode
+    use_llm: my_new_provider
+```
+
+3. Test configuration: `python config_loader.py`
+
+### Adding a New Routing Rule
+
+Rules in `router-config.yml` are evaluated in priority order (lower number = higher priority). Each rule has:
+- `id` - Unique identifier
+- `priority` - Evaluation order (1-100, lower is higher priority)
+- `when` - Matching conditions (`mode`, `agent`, `metadata_has`, `task_type`, `and`)
+- `use_llm` or `use_provider` - Target provider
+- `description` - Human-readable purpose
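+
+For illustration, a complete rule using these fields might look like the sketch below. The `image_search` mode is the Vision RAG routing mode named in this commit's message; the rule id, the priority, and the `vision_encoder` provider key are assumptions, so check `router-config.yml` for the real names.
+
+```yaml
+routing:
+  - id: vision_image_search          # hypothetical id
+    priority: 20                     # hypothetical; lower number = higher priority
+    when:
+      mode: image_search
+    use_provider: vision_encoder     # hypothetical provider key
+    description: "Route image search requests to the Vision Encoder provider"
+```
+
+After adding a rule like this, `python config_loader.py` can be used to validate the file before restarting the router.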
+
+### Debugging Routing
+
+```bash
+# Check which rule matches a request
+curl -X POST http://localhost:9102/route \
+  -H "Content-Type: application/json" \
+  -d '{"mode": "chat", "message": "test", "metadata": {}}'
+
+# View routing table
+curl http://localhost:9102/routing
+
+# Check available providers
+curl http://localhost:9102/providers
+```
+
+### Working with Docker Services
+
+```bash
+# View container status
+docker ps
+
+# Inspect container logs
+docker logs dagi-router
+docker logs -f dagi-gateway  # follow mode
+
+# Execute commands in container
+docker exec -it dagi-router bash
+
+# Restart specific service
+docker-compose restart router
+
+# Check service health
+curl http://localhost:9102/health
+```
+
+## Testing Strategy
+
+### Smoke Tests (`smoke.sh`)
+- Quick health checks for all services
+- Basic functional tests (Router→LLM, DevTools fs_read, CrewAI workflow list, RBAC resolve)
+- Run after deployment or major changes
+
+### End-to-End Tests
+- `test-devtools.sh` - Full Router→DevTools integration (file ops, tests)
+- `test-crewai.sh` - CrewAI workflow execution
+- `test-gateway.sh` - Gateway + RBAC + Router flow
+- Each test includes health checks, functional tests, and result validation
+
+### Unit Tests
+- `test_config_loader.py` - Configuration loading and validation
+- `services/parser-service/tests/` - Parser service components
+- `services/rag-service/tests/` - RAG query and ingestion
+- Use pytest: `python -m pytest <test_file>`
+
+## Common Tasks
+
+### Changing Router Configuration
+
+1. Edit `router-config.yml`
+2. Validate: `python config_loader.py`
+3. Restart router: `docker-compose restart router`
+4. Verify: `./smoke.sh`
+
+### Adding Environment Variables
+
+1. Add to `.env.example` with documentation
+2. Add to `.env` with actual value
+3. Add to `docker-compose.yml` environment section
+4. Reference in code via `os.getenv()`
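+
+Step 3 can be sketched as the fragment below, assuming a Compose service named `router` and a variable called `MY_API_KEY` (both chosen for illustration). Docker Compose substitutes `${MY_API_KEY}` from the shell environment or from the `.env` file in the project directory.
+
+```yaml
+services:
+  router:
+    environment:
+      # Interpolated from .env or the shell when the service is started
+      - MY_API_KEY=${MY_API_KEY}
+```
+
+Inside the container, the code can then read the value with `os.getenv("MY_API_KEY")`.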
+
+### Viewing Structured Logs
+
+All services use structured JSON logging. Example:
+```bash
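+# --no-log-prefix drops the "router  | " prefix; fromjson? skips lines that are not JSON
+docker-compose logs -f --no-log-prefix router | jq -R 'fromjson? | select(.level? == "ERROR")'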
+```
+
+### Testing RBAC Integration
+
+```bash
+curl -X POST http://localhost:9200/rbac/resolve \
+  -H "Content-Type: application/json" \
+  -d '{"dao_id": "greenfood-dao", "user_id": "tg:12345"}'
+```
+
+### Manual Router Requests
+
+```bash
+# Chat mode (with RBAC)
+curl -X POST http://localhost:9102/route \
+  -H "Content-Type: application/json" \
+  -d '{
+    "mode": "chat",
+    "message": "Hello",
+    "dao_id": "test-dao",
+    "user_id": "tg:123",
+    "metadata": {}
+  }'
+
+# DevTools mode
+curl -X POST http://localhost:9102/route \
+  -H "Content-Type: application/json" \
+  -d '{
+    "mode": "devtools",
+    "message": "read file",
+    "payload": {
+      "tool": "fs_read",
+      "params": {"path": "/app/README.md"}
+    }
+  }'
+```
+
+## Tech Stack
+
+- **Language:** Python 3.11+
+- **Framework:** FastAPI, Uvicorn
+- **Validation:** Pydantic
+- **Config:** YAML (PyYAML)
+- **HTTP Client:** httpx
+- **Containerization:** Docker, Docker Compose
+- **LLM Providers:** Ollama (local), DeepSeek, OpenAI
+- **Testing:** pytest, bash scripts
+- **Frontend:** React, TypeScript, Vite, TailwindCSS (for web UI)
+
+## File Structure Conventions
+
+- Root level: Main router components and entry points
+- `providers/` - Provider implementations (LLM, DevTools, CrewAI)
+- `gateway-bot/` - Bot gateway service (Telegram, Discord)
+- `devtools-backend/` - DevTools tool execution service
+- `orchestrator/` - CrewAI multi-agent orchestration
+- `microdao/` - RBAC service
+- `services/` - Additional services (RAG, Memory, Parser)
+- `tests/` - E2E tests and evaluation scripts
+- `docs/` - Documentation (including agents map)
+- `chart/` - Kubernetes Helm chart
+- Root scripts: `smoke.sh`, `test-*.sh` for testing
+
+## Important Notes
+
+- Router config is validated on startup, so syntax errors will prevent the service from starting
+- RBAC context injection only happens in `chat` mode with both `dao_id` and `user_id` present
+- All services expose `/health` endpoint for monitoring
+- Docker network `dagi-network` connects all services
+- Use structured logging - avoid print statements
+- Provider timeout defaults to 30s (configurable per profile in `router-config.yml`)
+- RAG query mode combines Memory context + RAG documents before calling LLM
+- When modifying routing rules, test with `./smoke.sh` before committing