Apple 4601c6fca8 feat: add Vision Encoder service + Vision RAG implementation

- Vision Encoder Service (OpenCLIP ViT-L/14, GPU-accelerated)
  - FastAPI app with text/image embedding endpoints (768-dim)
  - Docker support with NVIDIA GPU runtime
  - Port 8001, health checks, model info API

- Qdrant Vector Database integration
  - Port 6333/6334 (HTTP/gRPC)
  - Image embeddings storage (768-dim, Cosine distance)
  - Auto collection creation

- Vision RAG implementation
  - VisionEncoderClient (Python client for API)
  - Image Search module (text-to-image, image-to-image)
  - Vision RAG routing in DAGI Router (mode: image_search)
  - VisionEncoderProvider integration

- Documentation (5000+ lines)
  - SYSTEM-INVENTORY.md - Complete system inventory
  - VISION-ENCODER-STATUS.md - Service status
  - VISION-RAG-IMPLEMENTATION.md - Implementation details
  - vision_encoder_deployment_task.md - Deployment checklist
  - services/vision-encoder/README.md - Deployment guide
  - Updated WARP.md, INFRASTRUCTURE.md, Jupyter Notebook

- Testing
  - test-vision-encoder.sh - Smoke tests (6 tests)
  - Unit tests for client, image search, routing

- Services: 17 total (added Vision Encoder + Qdrant)
- AI Models: 3 (qwen3:8b, OpenCLIP ViT-L/14, BAAI/bge-m3)
- GPU Services: 2 (Vision Encoder, Ollama)
- VRAM Usage: ~10 GB (concurrent)

Status: Production Ready ✅

2025-11-17 05:24:36 -08:00

WARP.md

This file provides guidance to WARP (warp.dev) when working with code in this repository.

Repository Overview

DAGI Stack (Decentralized Agentic Gateway Infrastructure) is a production-ready AI router with multi-agent orchestration, microDAO governance, and bot gateway integration. It is built as a set of microservices that route requests to, and orchestrate, AI agents and LLM providers.

Infrastructure & Deployment

For complete infrastructure information (servers, repositories, domains, deployment workflows), see:

INFRASTRUCTURE.md - Production servers, GitHub repos, DNS, services, deployment
SYSTEM-INVENTORY.md - Complete system inventory (GPU, AI models, services)
docs/infrastructure_quick_ref.ipynb - Jupyter Notebook for quick search

Quick Start Commands

Development

# Start all services via Docker Compose
docker-compose up -d

# View logs for all services
docker-compose logs -f

# View logs for specific service
docker-compose logs -f router
docker-compose logs -f gateway
docker-compose logs -f devtools
docker-compose logs -f crewai
docker-compose logs -f rbac

# Stop all services
docker-compose down

# Rebuild and restart after code changes
docker-compose up -d --build

Testing

# Smoke tests - basic health checks for all services
./smoke.sh

# End-to-end tests for specific components
./test-devtools.sh    # DevTools integration
./test-crewai.sh      # CrewAI workflows
./test-gateway.sh     # Gateway + RBAC
./test-fastapi.sh     # FastAPI endpoints

# RAG pipeline evaluation
./tests/e2e_rag_pipeline.sh
python tests/rag_eval.py

# Unit tests
python -m pytest test_config_loader.py
python -m pytest services/parser-service/tests/
python -m pytest services/rag-service/tests/

Local Development (without Docker)

# Start Router (main service)
python main_v2.py --config router-config.yml --port 9102

# Start DevTools Backend
cd devtools-backend && python main.py

# Start CrewAI Orchestrator
cd orchestrator && python crewai_backend.py

# Start Bot Gateway
cd gateway-bot && python main.py

# Start RBAC Service
cd microdao && python main.py

Configuration

# Copy environment template
cp .env.example .env

# Edit configuration with your tokens and settings
nano .env

# Validate router configuration
python config_loader.py

Architecture

Core Services (Microservices)

The DAGI Stack follows a microservices architecture with these primary services:

1. DAGI Router (Port 9102)

Main routing engine that dispatches requests to appropriate providers
Rule-based routing with priority-ordered rules defined in router-config.yml
Handles RBAC context injection for microDAO chat mode
Key files:
- main_v2.py - FastAPI application entry point
- router_app.py - Core RouterApp class with request handling logic
- routing_engine.py - Rule matching and provider resolution
- config_loader.py - Configuration loading and validation with Pydantic models
- router-config.yml - Routing rules and provider configuration

2. Bot Gateway (Port 9300)

HTTP server for bot platforms (Telegram, Discord)
Normalizes platform-specific messages to unified format
Integrates with RBAC service before forwarding to Router
Implements DAARWIZZ system agent
Key files: gateway-bot/main.py, gateway-bot/http_api.py, gateway-bot/router_client.py

3. DevTools Backend (Port 8008)

Tool execution service for development tasks
File operations (read/write), test execution, notebook execution
Security: path validation, size limits
Key files: devtools-backend/main.py

4. CrewAI Orchestrator (Port 9010)

Multi-agent workflow execution
Pre-configured workflows: microdao_onboarding, code_review, proposal_review, task_decomposition
Key files: orchestrator/crewai_backend.py

5. RBAC Service (Port 9200)

Role-based access control with roles: admin, member, contributor, guest
DAO isolation for multi-tenancy
Key files: microdao/ directory

6. RAG Service (Port 9500)

Document retrieval and question answering
Uses embeddings (BAAI/bge-m3) and PostgreSQL for vector storage
Integrates with Router for LLM calls
Key files: services/rag-service/

7. Memory Service (Port 8000)

Agent memory and context management
Key files: services/memory-service/

8. Parser Service

Document parsing and Q&A generation
2-stage pipeline: parse → Q&A build
Key files: services/parser-service/

Provider System

The system uses a provider abstraction to support multiple backends:

Base Provider (providers/base.py) - Abstract base class
LLM Provider (providers/llm_provider.py) - Ollama, DeepSeek, OpenAI
DevTools Provider (providers/devtools_provider.py) - Development tools
CrewAI Provider (providers/crewai_provider.py) - Multi-agent orchestration
Provider Registry (providers/registry.py) - Centralized provider initialization
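
The relationship between the base class and the registry can be illustrated with a minimal sketch. It is not the code in providers/base.py or providers/registry.py: the `call` method, the `provider_id` attribute, and the `EchoProvider` class are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseProvider(ABC):
    """Hypothetical stand-in for the abstract class in providers/base.py."""

    def __init__(self, provider_id: str) -> None:
        self.provider_id = provider_id

    @abstractmethod
    def call(self, request: Dict[str, Any]) -> Dict[str, Any]:
        """Handle one routed request and return a response dict."""


class EchoProvider(BaseProvider):
    """Toy backend that echoes the message; a real provider would call a backend service."""

    def call(self, request: Dict[str, Any]) -> Dict[str, Any]:
        return {"provider": self.provider_id, "output": request.get("message", "")}


class ProviderRegistry:
    """Maps provider ids to instances so a router can resolve them by name."""

    def __init__(self) -> None:
        self._providers: Dict[str, BaseProvider] = {}

    def register(self, provider: BaseProvider) -> None:
        if provider.provider_id in self._providers:
            raise ValueError(f"duplicate provider id: {provider.provider_id}")
        self._providers[provider.provider_id] = provider

    def get(self, provider_id: str) -> BaseProvider:
        try:
            return self._providers[provider_id]
        except KeyError:
            raise KeyError(f"unknown provider: {provider_id}") from None


registry = ProviderRegistry()
registry.register(EchoProvider("echo"))
print(registry.get("echo").call({"message": "hi"}))
```

Keeping registration in one place means a routing rule only needs to carry a provider id, and an unknown id fails with a clear error at lookup time.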

Routing System

Rule-Based Routing:

Rules defined in router-config.yml with priority ordering (lower = higher priority)
Each rule specifies when conditions (mode, agent, metadata) and use_llm/use_provider
Routing engine (routing_engine.py) matches requests to providers via RoutingTable class
Special handling for rag_query mode (combines Memory + RAG → LLM)
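
Priority-ordered matching can be sketched as follows. This is a simplified illustration, not the RoutingTable class from routing_engine.py: it only compares top-level request keys for equality, and the rule ids and the `local_llm` profile name are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Rule:
    id: str
    priority: int
    when: Dict[str, Any] = field(default_factory=dict)
    use_llm: Optional[str] = None
    use_provider: Optional[str] = None


def matches(rule: Rule, request: Dict[str, Any]) -> bool:
    """A rule matches when every key in `when` equals the request's value."""
    return all(request.get(key) == expected for key, expected in rule.when.items())


def resolve(rules: List[Rule], request: Dict[str, Any]) -> Optional[Rule]:
    """Return the first matching rule, checking lower priority numbers first."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if matches(rule, request):
            return rule
    return None


rules = [
    Rule(id="default_chat", priority=90, when={}, use_llm="local_llm"),
    Rule(id="devtools", priority=10, when={"mode": "devtools"}, use_provider="devtools"),
    Rule(id="crew", priority=20, when={"mode": "crew"}, use_provider="crewai"),
]

print(resolve(rules, {"mode": "devtools"}).id)  # devtools
print(resolve(rules, {"mode": "chat"}).id)      # default_chat
```

A rule with an empty `when` matches everything, so giving it a high priority number makes it a catch-all that is only reached when no more specific rule applies.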

Request Flow:

Request arrives at Router via HTTP POST /route
RBAC context injection (if chat mode with dao_id/user_id)
Rule matching in priority order
Provider resolution and invocation
Response returned with provider metadata

Configuration Management

Configuration uses YAML + Pydantic validation:

router-config.yml - Main config file with:
- node - Node identification
- llm_profiles - LLM provider configurations
- orchestrator_providers - Orchestrator backends
- agents - Agent definitions with tools
- routing - Routing rules (priority-ordered)
- telemetry - Logging and metrics
- policies - Rate limiting, cost tracking
config_loader.py - Loads and validates config with Pydantic models:
- RouterConfig - Top-level config
- LLMProfile - LLM provider settings
- AgentConfig - Agent configuration
- RoutingRule - Individual routing rule
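
The kind of checks such models perform can be sketched with standard-library dataclasses. Note the swap: the repository uses Pydantic, and dataclasses are used here only to keep the example dependency-free. The field names follow the YAML keys shown in this file; the rule that exactly one of use_llm/use_provider must be set is an assumption, not something confirmed by config_loader.py.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class LLMProfile:
    provider: str
    model: str
    base_url: Optional[str] = None
    api_key_env: Optional[str] = None


@dataclass
class RoutingRule:
    id: str
    priority: int
    when: Dict[str, Any]
    use_llm: Optional[str] = None
    use_provider: Optional[str] = None
    description: str = ""

    def __post_init__(self) -> None:
        if not 1 <= self.priority <= 100:
            raise ValueError(f"rule {self.id}: priority must be in 1-100")
        # Assumption: a rule targets either an LLM profile or a provider, never both.
        if (self.use_llm is None) == (self.use_provider is None):
            raise ValueError(f"rule {self.id}: set exactly one of use_llm/use_provider")


@dataclass
class RouterConfig:
    llm_profiles: Dict[str, LLMProfile]
    routing: List[RoutingRule]


def load_config(raw: Dict[str, Any]) -> RouterConfig:
    """Build typed config from a dict such as the result of parsing the YAML file."""
    profiles = {name: LLMProfile(**body) for name, body in raw.get("llm_profiles", {}).items()}
    rules = [RoutingRule(**body) for body in raw.get("routing", [])]
    for rule in rules:
        if rule.use_llm is not None and rule.use_llm not in profiles:
            raise ValueError(f"rule {rule.id}: unknown llm profile {rule.use_llm!r}")
    return RouterConfig(llm_profiles=profiles, routing=rules)


raw = {
    "llm_profiles": {
        "my_new_provider": {
            "provider": "openai",
            "base_url": "https://api.example.com",
            "model": "my-model",
            "api_key_env": "MY_API_KEY",
        }
    },
    "routing": [
        {"id": "my_rule", "priority": 50, "when": {"mode": "custom_mode"}, "use_llm": "my_new_provider"}
    ],
}
config = load_config(raw)
print(config.routing[0].use_llm)  # my_new_provider
```

Cross-checking rule targets against the declared profiles at load time turns a typo in the config into a startup error instead of a failed request later.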

Key Concepts

Agents and Modes

Agents:

devtools - Development assistant (code analysis, refactoring, testing)
microdao_orchestrator - Multi-agent workflow coordinator
DAARWIZZ - System orchestrator agent (in Gateway)

Modes:

chat - Standard chat with RBAC context injection
devtools - Tool execution mode (file ops, tests)
crew - CrewAI workflow orchestration
rag_query - RAG + Memory hybrid query
qa_build - Q&A generation from documents

RBAC Context Injection

For microDAO chat mode, the Router automatically enriches requests with RBAC context:

Fetches user roles and entitlements from RBAC service
Injects into payload.context.rbac before provider call
See router_app.py:handle() for implementation
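
As a rough sketch of the enrichment step, assuming the request is a plain dict and the RBAC lookup is passed in as a callable. The function names and the shape of the RBAC response are hypothetical; router_app.py remains the reference for the actual behavior.

```python
import copy
from typing import Any, Callable, Dict

RbacLookup = Callable[[str, str], Dict[str, Any]]


def inject_rbac(request: Dict[str, Any], lookup: RbacLookup) -> Dict[str, Any]:
    """Return a copy of `request` with RBAC data under payload.context.rbac.

    Requests that are not in chat mode, or that lack dao_id or user_id,
    are returned unchanged.
    """
    dao_id = request.get("dao_id")
    user_id = request.get("user_id")
    if request.get("mode") != "chat" or not dao_id or not user_id:
        return request

    enriched = copy.deepcopy(request)
    context = enriched.setdefault("payload", {}).setdefault("context", {})
    context["rbac"] = lookup(dao_id, user_id)
    return enriched


def fake_lookup(dao_id: str, user_id: str) -> Dict[str, Any]:
    """Stand-in for the HTTP call to the RBAC service; the response shape is assumed."""
    return {"dao_id": dao_id, "user_id": user_id, "roles": ["member"], "entitlements": []}


request = {"mode": "chat", "message": "Hello", "dao_id": "test-dao", "user_id": "tg:123"}
enriched = inject_rbac(request, fake_lookup)
print(enriched["payload"]["context"]["rbac"]["roles"])  # ['member']
```

Working on a deep copy keeps the caller's request untouched, which makes the step easy to test in isolation.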

Multi-Agent Ecosystem

The ecosystem follows the DAARION.city agent hierarchy (A1-A4):

A1 - DAARION.city system agents (DAARWIZZ)
A2 - Platform agents (GREENFOOD, Energy Union, Water Union, etc.)
A3 - Public microDAO agents
A4 - Private microDAO agents

See docs/agents.md for complete agent map.

Development Workflow

Adding a New LLM Provider

Add profile to router-config.yml:

llm_profiles:
  my_new_provider:
    provider: openai
    base_url: https://api.example.com
    model: my-model
    api_key_env: MY_API_KEY

Add routing rule:

routing:
  - id: my_rule
    priority: 50
    when:
      mode: custom_mode
    use_llm: my_new_provider

Test configuration: python config_loader.py

Adding a New Routing Rule

Rules in router-config.yml are evaluated in priority order (lower number = higher priority). Each rule has:

id - Unique identifier
priority - Evaluation order (1-100, lower is higher priority)
when - Matching conditions (mode, agent, metadata_has, task_type, and)
use_llm or use_provider - Target provider
description - Human-readable purpose

Debugging Routing

# Check which rule matches a request
curl -X POST http://localhost:9102/route \
  -H "Content-Type: application/json" \
  -d '{"mode": "chat", "message": "test", "metadata": {}}'

# View routing table
curl http://localhost:9102/routing

# Check available providers
curl http://localhost:9102/providers

Working with Docker Services

# View container status
docker ps

# Inspect container logs
docker logs dagi-router
docker logs -f dagi-gateway  # follow mode

# Execute commands in container
docker exec -it dagi-router bash

# Restart specific service
docker-compose restart router

# Check service health
curl http://localhost:9102/health

Testing Strategy

Smoke Tests (`smoke.sh`)

Quick health checks for all services
Basic functional tests (Router→LLM, DevTools fs_read, CrewAI workflow list, RBAC resolve)
Run after deployment or major changes

End-to-End Tests

test-devtools.sh - Full Router→DevTools integration (file ops, tests)
test-crewai.sh - CrewAI workflow execution
test-gateway.sh - Gateway + RBAC + Router flow
Each test includes health checks, functional tests, and result validation

Unit Tests

test_config_loader.py - Configuration loading and validation
services/parser-service/tests/ - Parser service components
services/rag-service/tests/ - RAG query and ingestion
Use pytest: python -m pytest <test_file>

Common Tasks

Changing Router Configuration

Edit router-config.yml
Validate: python config_loader.py
Restart router: docker-compose restart router
Verify: ./smoke.sh

Adding Environment Variables

Add to .env.example with documentation
Add to .env with actual value
Add to docker-compose.yml environment section
Reference in code via os.getenv()
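
For the last step, one possible pattern is to separate required variables, which should fail fast when missing, from optional ones with defaults. The helper name `require_env` and the variable `ROUTER_PORT` are hypothetical.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a required variable, or fail early with a clear message."""
    source = os.environ if env is None else env
    value = source.get(name, "")
    if not value:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value


# Optional setting with a default; ROUTER_PORT is a hypothetical variable name.
port = int(os.getenv("ROUTER_PORT", "9102"))

# Required setting, read here from an explicit mapping for demonstration.
api_key = require_env("MY_API_KEY", {"MY_API_KEY": "example-value"})
print(api_key)
```

Accepting an explicit mapping makes the helper testable without modifying the process environment.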

Viewing Structured Logs

All services use structured JSON logging. Example:

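# --no-log-prefix drops the "service |" prefix so jq receives plain JSON lines;
# -R with fromjson? skips any line that is not valid JSON
docker-compose logs -f --no-log-prefix router | jq -R 'fromjson? | select(.level == "ERROR")'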
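
On the producing side, one JSON object per line can be emitted with the standard logging module. This is only a sketch: the `level` field matches the filter above, while the other field names and the `build_logger` helper are assumptions rather than the services' actual log schema.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


buffer = io.StringIO()
log = build_logger("router", buffer)
log.error("provider timeout")
print(buffer.getvalue().strip())
# {"level": "ERROR", "logger": "router", "message": "provider timeout"}
```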

Testing RBAC Integration

curl -X POST http://localhost:9200/rbac/resolve \
  -H "Content-Type: application/json" \
  -d '{"dao_id": "greenfood-dao", "user_id": "tg:12345"}'

Manual Router Requests

# Chat mode (with RBAC)
curl -X POST http://localhost:9102/route \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "chat",
    "message": "Hello",
    "dao_id": "test-dao",
    "user_id": "tg:123",
    "metadata": {}
  }'

# DevTools mode
curl -X POST http://localhost:9102/route \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "devtools",
    "message": "read file",
    "payload": {
      "tool": "fs_read",
      "params": {"path": "/app/README.md"}
    }
  }'
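
The same requests can be assembled from Python. The sketch below uses the standard library's urllib to stay dependency-free (the stack itself lists httpx as its HTTP client) and assumes the Router answers with a JSON body. `build_route_request` and `send` are illustrative helpers, not part of the repository.

```python
import json
import urllib.request
from typing import Any, Dict

ROUTER_URL = "http://localhost:9102/route"


def build_route_request(body: Dict[str, Any], url: str = ROUTER_URL) -> urllib.request.Request:
    """Build a JSON POST request for the Router without sending it."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> Dict[str, Any]:
    """Send the request; needs a running Router, so it is not called here."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


chat = build_route_request(
    {"mode": "chat", "message": "Hello", "dao_id": "test-dao", "user_id": "tg:123", "metadata": {}}
)
print(chat.get_method(), chat.full_url)  # POST http://localhost:9102/route
```

With the stack running, `send(chat)` would perform the call and return the parsed response.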

Tech Stack

Language: Python 3.11+
Framework: FastAPI, Uvicorn
Validation: Pydantic
Config: YAML (PyYAML)
HTTP Client: httpx
Containerization: Docker, Docker Compose
LLM Providers: Ollama (local), DeepSeek, OpenAI
Testing: pytest, bash scripts
Frontend: React, TypeScript, Vite, TailwindCSS (for web UI)

File Structure Conventions

Root level: Main router components and entry points
providers/ - Provider implementations (LLM, DevTools, CrewAI)
gateway-bot/ - Bot gateway service (Telegram, Discord)
devtools-backend/ - DevTools tool execution service
orchestrator/ - CrewAI multi-agent orchestration
microdao/ - RBAC service
services/ - Additional services (RAG, Memory, Parser)
tests/ - E2E tests and evaluation scripts
docs/ - Documentation (including agents map)
chart/ - Kubernetes Helm chart
Root scripts: smoke.sh, test-*.sh for testing

Important Notes

Router config is validated on startup - syntax errors will prevent the service from starting
RBAC context injection only happens in chat mode with both dao_id and user_id present
All services expose /health endpoint for monitoring
Docker network dagi-network connects all services
Use structured logging - avoid print statements
Provider timeout defaults to 30s (configurable per profile in router-config.yml)
RAG query mode combines Memory context + RAG documents before calling LLM
When modifying routing rules, test with ./smoke.sh before committing
