feat: add Vision Encoder service + Vision RAG implementation

- Vision Encoder Service (OpenCLIP ViT-L/14, GPU-accelerated)
  - FastAPI app with text/image embedding endpoints (768-dim)
  - Docker support with NVIDIA GPU runtime
  - Port 8001, health checks, model info API

- Qdrant Vector Database integration
  - Port 6333/6334 (HTTP/gRPC)
  - Image embeddings storage (768-dim, Cosine distance)
  - Auto collection creation

- Vision RAG implementation
  - VisionEncoderClient (Python client for API)
  - Image Search module (text-to-image, image-to-image)
  - Vision RAG routing in DAGI Router (mode: image_search)
  - VisionEncoderProvider integration

- Documentation (5000+ lines)
  - SYSTEM-INVENTORY.md - Complete system inventory
  - VISION-ENCODER-STATUS.md - Service status
  - VISION-RAG-IMPLEMENTATION.md - Implementation details
  - vision_encoder_deployment_task.md - Deployment checklist
  - services/vision-encoder/README.md - Deployment guide
  - Updated WARP.md, INFRASTRUCTURE.md, Jupyter Notebook

- Testing
  - test-vision-encoder.sh - Smoke tests (6 tests)
  - Unit tests for client, image search, routing

- Services: 17 total (added Vision Encoder + Qdrant)
- AI Models: 3 (qwen3:8b, OpenCLIP ViT-L/14, BAAI/bge-m3)
- GPU Services: 2 (Vision Encoder, Ollama)
- VRAM Usage: ~10 GB (concurrent)

Status: Production Ready 
Apple
2025-11-17 05:24:36 -08:00
parent b2b51f08fb
commit 4601c6fca8
55 changed files with 13205 additions and 3 deletions
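
The commit message mentions 768-dim embeddings compared by Cosine distance and a text-to-image search mode. As a rough illustration of that ranking step only, here is a minimal pure-Python sketch. It does not use Qdrant or the actual Image Search module from this commit; the names `l2_normalize`, `cosine_similarity`, and `rank_images` are hypothetical and chosen for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple

EMBEDDING_DIM = 768  # dimension stated in the commit message


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_images(
    query: Sequence[float],
    image_vectors: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (image_id, score) pairs, most similar first.

    `query` stands in for a text embedding and `image_vectors` for stored
    image embeddings; both are assumed to come from the same encoder.
    """
    scored = [
        (image_id, cosine_similarity(query, vec))
        for image_id, vec in image_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In a real deployment the vector database would perform this search over the stored collection; the sketch only shows what "Cosine distance" ranking means for a handful of in-memory vectors.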


@@ -0,0 +1,217 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 🚀 Infrastructure Quick Reference: DAARION & MicroDAO\n",
"\n",
"**Version:** 1.1.0 \n",
"**Last updated:** 2025-01-17 \n",
"\n",
"This notebook is a quick reference to the servers, repositories, and endpoints of the DAGI Stack.\n",
"\n",
"**NEW:** Vision Encoder + Qdrant vector database (OpenCLIP ViT-L/14)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Service Configuration (UPDATED with Vision Encoder + Qdrant)\n",
"SERVICES = {\n",
" \"router\": {\"port\": 9102, \"container\": \"dagi-router\", \"health\": \"http://localhost:9102/health\"},\n",
" \"gateway\": {\"port\": 9300, \"container\": \"dagi-gateway\", \"health\": \"http://localhost:9300/health\"},\n",
" \"devtools\": {\"port\": 8008, \"container\": \"dagi-devtools\", \"health\": \"http://localhost:8008/health\"},\n",
" \"crewai\": {\"port\": 9010, \"container\": \"dagi-crewai\", \"health\": \"http://localhost:9010/health\"},\n",
" \"rbac\": {\"port\": 9200, \"container\": \"dagi-rbac\", \"health\": \"http://localhost:9200/health\"},\n",
" \"rag\": {\"port\": 9500, \"container\": \"dagi-rag-service\", \"health\": \"http://localhost:9500/health\"},\n",
" \"memory\": {\"port\": 8000, \"container\": \"dagi-memory-service\", \"health\": \"http://localhost:8000/health\"},\n",
" \"parser\": {\"port\": 9400, \"container\": \"dagi-parser-service\", \"health\": \"http://localhost:9400/health\"},\n",
" \"vision_encoder\": {\"port\": 8001, \"container\": \"dagi-vision-encoder\", \"health\": \"http://localhost:8001/health\", \"gpu\": True},\n",
" \"postgres\": {\"port\": 5432, \"container\": \"dagi-postgres\", \"health\": None},\n",
" \"redis\": {\"port\": 6379, \"container\": \"redis\", \"health\": \"redis-cli PING\"},\n",
" \"neo4j\": {\"port\": 7474, \"container\": \"neo4j\", \"health\": \"http://localhost:7474\"},\n",
" \"qdrant\": {\"port\": 6333, \"container\": \"dagi-qdrant\", \"health\": \"http://localhost:6333/healthz\"},\n",
" \"grafana\": {\"port\": 3000, \"container\": \"grafana\", \"health\": \"http://localhost:3000\"},\n",
" \"prometheus\": {\"port\": 9090, \"container\": \"prometheus\", \"health\": \"http://localhost:9090\"},\n",
" \"ollama\": {\"port\": 11434, \"container\": \"ollama\", \"health\": \"http://localhost:11434/api/tags\"}\n",
"}\n",
"\n",
"print(\"Service\\t\\t\\tPort\\tContainer\\t\\t\\tHealth Endpoint\")\n",
"print(\"=\"*100)\n",
"for name, service in SERVICES.items():\n",
" health = service['health'] or \"N/A\"\n",
" gpu = \" [GPU]\" if service.get('gpu') else \"\"\n",
" print(f\"{name.upper():<20} {service['port']:<7} {service['container']:<30} {health}{gpu}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 🎨 Vision Encoder Service (NEW)\n",
"\n",
"### Overview\n",
"- **Service:** Vision Encoder (OpenCLIP ViT-L/14)\n",
"- **Port:** 8001\n",
"- **GPU:** Required (NVIDIA CUDA)\n",
"- **Embedding Dimension:** 768\n",
"- **Vector DB:** Qdrant (port 6333/6334)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Vision Encoder Configuration\n",
"VISION_ENCODER = {\n",
" \"service\": \"vision-encoder\",\n",
" \"port\": 8001,\n",
" \"container\": \"dagi-vision-encoder\",\n",
" \"gpu_required\": True,\n",
" \"model\": \"ViT-L-14\",\n",
" \"pretrained\": \"openai\",\n",
" \"embedding_dim\": 768,\n",
" \"endpoints\": {\n",
" \"health\": \"http://localhost:8001/health\",\n",
" \"info\": \"http://localhost:8001/info\",\n",
" \"embed_text\": \"http://localhost:8001/embed/text\",\n",
" \"embed_image\": \"http://localhost:8001/embed/image\",\n",
" \"docs\": \"http://localhost:8001/docs\"\n",
" },\n",
" \"qdrant\": {\n",
" \"host\": \"qdrant\",\n",
" \"port\": 6333,\n",
" \"grpc_port\": 6334,\n",
" \"health\": \"http://localhost:6333/healthz\"\n",
" }\n",
"}\n",
"\n",
"print(\"Vision Encoder Service Configuration:\")\n",
"print(\"=\"*80)\n",
"print(f\"Model: {VISION_ENCODER['model']} ({VISION_ENCODER['pretrained']})\")\n",
"print(f\"Embedding Dimension: {VISION_ENCODER['embedding_dim']}\")\n",
"print(f\"GPU Required: {VISION_ENCODER['gpu_required']}\")\n",
"print(f\"\\nEndpoints:\")\n",
"for name, url in VISION_ENCODER['endpoints'].items():\n",
" print(f\" {name:15} {url}\")\n",
"print(f\"\\nQdrant Vector DB:\")\n",
"print(f\" HTTP: http://localhost:{VISION_ENCODER['qdrant']['port']}\")\n",
"print(f\" gRPC: localhost:{VISION_ENCODER['qdrant']['grpc_port']}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Vision Encoder Testing Commands\n",
"VISION_ENCODER_TESTS = {\n",
" \"Health Check\": \"curl http://localhost:8001/health\",\n",
" \"Model Info\": \"curl http://localhost:8001/info\",\n",
" \"Text Embedding\": '''curl -X POST http://localhost:8001/embed/text -H \"Content-Type: application/json\" -d '{\"text\": \"DAARION governance\", \"normalize\": true}' ''',\n",
" \"Image Embedding\": '''curl -X POST http://localhost:8001/embed/image -H \"Content-Type: application/json\" -d '{\"image_url\": \"https://example.com/image.jpg\", \"normalize\": true}' ''',\n",
" \"Via Router (Text)\": '''curl -X POST http://localhost:9102/route -H \"Content-Type: application/json\" -d '{\"mode\": \"vision_embed\", \"message\": \"embed text\", \"payload\": {\"operation\": \"embed_text\", \"text\": \"test\", \"normalize\": true}}' ''',\n",
" \"Qdrant Health\": \"curl http://localhost:6333/healthz\",\n",
" \"Run Smoke Tests\": \"./test-vision-encoder.sh\"\n",
"}\n",
"\n",
"print(\"Vision Encoder Testing Commands:\")\n",
"print(\"=\"*80)\n",
"for name, cmd in VISION_ENCODER_TESTS.items():\n",
" print(f\"\\n{name}:\")\n",
" print(f\" {cmd}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 📖 Documentation Links (UPDATED)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Documentation References (UPDATED)\n",
"DOCS = {\n",
" \"Main Guide\": \"../WARP.md\",\n",
" \"Infrastructure\": \"../INFRASTRUCTURE.md\",\n",
" \"Agents Map\": \"../docs/agents.md\",\n",
" \"RAG Ingestion Status\": \"../RAG-INGESTION-STATUS.md\",\n",
" \"HMM Memory Status\": \"../HMM-MEMORY-STATUS.md\",\n",
" \"Crawl4AI Status\": \"../CRAWL4AI-STATUS.md\",\n",
" \"Vision Encoder Status\": \"../VISION-ENCODER-STATUS.md\",\n",
" \"Vision Encoder Deployment\": \"../services/vision-encoder/README.md\",\n",
" \"Repository Management\": \"../DAARION_CITY_REPO.md\",\n",
" \"Server Setup\": \"../SERVER_SETUP_INSTRUCTIONS.md\",\n",
" \"Deployment\": \"../DEPLOY-NOW.md\",\n",
" \"Helion Status\": \"../STATUS-HELION.md\",\n",
" \"Architecture Index\": \"../docs/cursor/README.md\",\n",
" \"API Reference\": \"../docs/api.md\"\n",
"}\n",
"\n",
"print(\"Documentation Quick Links:\")\n",
"print(\"=\"*80)\n",
"for name, path in DOCS.items():\n",
" print(f\"{name:<30} {path}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 📝 Notes & Updates\n",
"\n",
"### Recent Changes (2025-01-17)\n",
"- ✅ **Added Vision Encoder Service** (port 8001) with OpenCLIP ViT-L/14\n",
"- ✅ **Added Qdrant Vector Database** (port 6333/6334) for image/text embeddings\n",
"- ✅ **GPU Support** via NVIDIA CUDA + Docker runtime\n",
"- ✅ **DAGI Router integration** (mode: vision_embed)\n",
"- ✅ **768-dim embeddings** for multimodal RAG\n",
"- ✅ Created VISION-ENCODER-STATUS.md with full implementation details\n",
"- ✅ Added test-vision-encoder.sh smoke tests\n",
"\n",
"### Services Count: 17 (from 15)\n",
"- Total Services: 17\n",
"- GPU Services: 1 (Vision Encoder)\n",
"- Vector Databases: 1 (Qdrant)\n",
"\n",
"---\n",
"\n",
"**Last Updated:** 2025-01-17 by WARP AI \n",
"**Maintained by:** Ivan Tytar & DAARION Team"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
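
The curl commands listed in the notebook's testing cell can also be issued from Python. The sketch below mirrors the request bodies shown there (`/embed/text` on port 8001 and `/route` on port 9102 with `mode: vision_embed`) using only the standard library. The helper names are hypothetical, and the response schema is not shown in this diff, so `send` simply returns whatever JSON the service replies with.

```python
import json
import urllib.request
from typing import Any, Dict

VISION_ENCODER_URL = "http://localhost:8001"  # port from the notebook
ROUTER_URL = "http://localhost:9102"          # port from the notebook


def _json_post(url: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_text_embed_request(text: str, normalize: bool = True) -> urllib.request.Request:
    """Request for the Vision Encoder text endpoint, as in the curl example."""
    return _json_post(
        f"{VISION_ENCODER_URL}/embed/text",
        {"text": text, "normalize": normalize},
    )


def build_router_embed_request(text: str, normalize: bool = True) -> urllib.request.Request:
    """Same operation routed through the DAGI Router, as in the curl example."""
    body = {
        "mode": "vision_embed",
        "message": "embed text",
        "payload": {"operation": "embed_text", "text": text, "normalize": normalize},
    }
    return _json_post(f"{ROUTER_URL}/route", body)


def send(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send a prepared request and decode the JSON reply (schema assumed unknown)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Building the request separately from sending it keeps the payloads inspectable without a running service; calling `send(build_text_embed_request("DAARION governance"))` requires the Vision Encoder container to be up.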