feat: add Vision Encoder service + Vision RAG implementation

- Vision Encoder Service (OpenCLIP ViT-L/14, GPU-accelerated)
  - FastAPI app with text/image embedding endpoints (768-dim)
  - Docker support with NVIDIA GPU runtime
  - Port 8001, health checks, model info API
- Qdrant Vector Database integration
  - Port 6333/6334 (HTTP/gRPC)
  - Image embeddings storage (768-dim, Cosine distance)
  - Auto collection creation
- Vision RAG implementation
  - VisionEncoderClient (Python client for API)
  - Image Search module (text-to-image, image-to-image)
  - Vision RAG routing in DAGI Router (mode: image_search)
  - VisionEncoderProvider integration
- Documentation (5000+ lines)
  - SYSTEM-INVENTORY.md - Complete system inventory
  - VISION-ENCODER-STATUS.md - Service status
  - VISION-RAG-IMPLEMENTATION.md - Implementation details
  - vision_encoder_deployment_task.md - Deployment checklist
  - services/vision-encoder/README.md - Deployment guide
  - Updated WARP.md, INFRASTRUCTURE.md, Jupyter Notebook
- Testing
  - test-vision-encoder.sh - Smoke tests (6 tests)
  - Unit tests for client, image search, routing
- Services: 17 total (added Vision Encoder + Qdrant)
- AI Models: 3 (qwen3:8b, OpenCLIP ViT-L/14, BAAI/bge-m3)
- GPU Services: 2 (Vision Encoder, Ollama)
- VRAM Usage: ~10 GB (concurrent)

Status: Production Ready ✅
2025-11-17 05:24:36 -08:00
parent b2b51f08fb
commit 4601c6fca8
55 changed files with 13205 additions and 3 deletions
--- a/docs/infrastructure_quick_ref.ipynb
+++ b/docs/infrastructure_quick_ref.ipynb
@@ -0,0 +1,217 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# 🚀 Infrastructure Quick Reference — DAARION & MicroDAO\n",
+    "\n",
+    "**Version:** 1.1.0  \n",
+    "**Last updated:** 2025-01-17  \n",
+    "\n",
+    "This notebook is a quick reference for the servers, repositories, and endpoints of the DAGI Stack.\n",
+    "\n",
+    "**NEW:** Vision Encoder + Qdrant vector database (OpenCLIP ViT-L/14)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Service Configuration (UPDATED with Vision Encoder + Qdrant)\n",
+    "SERVICES = {\n",
+    "    \"router\": {\"port\": 9102, \"container\": \"dagi-router\", \"health\": \"http://localhost:9102/health\"},\n",
+    "    \"gateway\": {\"port\": 9300, \"container\": \"dagi-gateway\", \"health\": \"http://localhost:9300/health\"},\n",
+    "    \"devtools\": {\"port\": 8008, \"container\": \"dagi-devtools\", \"health\": \"http://localhost:8008/health\"},\n",
+    "    \"crewai\": {\"port\": 9010, \"container\": \"dagi-crewai\", \"health\": \"http://localhost:9010/health\"},\n",
+    "    \"rbac\": {\"port\": 9200, \"container\": \"dagi-rbac\", \"health\": \"http://localhost:9200/health\"},\n",
+    "    \"rag\": {\"port\": 9500, \"container\": \"dagi-rag-service\", \"health\": \"http://localhost:9500/health\"},\n",
+    "    \"memory\": {\"port\": 8000, \"container\": \"dagi-memory-service\", \"health\": \"http://localhost:8000/health\"},\n",
+    "    \"parser\": {\"port\": 9400, \"container\": \"dagi-parser-service\", \"health\": \"http://localhost:9400/health\"},\n",
+    "    \"vision_encoder\": {\"port\": 8001, \"container\": \"dagi-vision-encoder\", \"health\": \"http://localhost:8001/health\", \"gpu\": True},\n",
+    "    \"postgres\": {\"port\": 5432, \"container\": \"dagi-postgres\", \"health\": None},\n",
+    "    \"redis\": {\"port\": 6379, \"container\": \"redis\", \"health\": \"redis-cli PING\"},\n",
+    "    \"neo4j\": {\"port\": 7474, \"container\": \"neo4j\", \"health\": \"http://localhost:7474\"},\n",
+    "    \"qdrant\": {\"port\": 6333, \"container\": \"dagi-qdrant\", \"health\": \"http://localhost:6333/healthz\"},\n",
+    "    \"grafana\": {\"port\": 3000, \"container\": \"grafana\", \"health\": \"http://localhost:3000\"},\n",
+    "    \"prometheus\": {\"port\": 9090, \"container\": \"prometheus\", \"health\": \"http://localhost:9090\"},\n",
+    "    \"ollama\": {\"port\": 11434, \"container\": \"ollama\", \"health\": \"http://localhost:11434/api/tags\"}\n",
+    "}\n",
+    "\n",
+    "print(f\"{'Service':<20} {'Port':<7} {'Container':<30} Health Endpoint\")\n",
+    "print(\"=\"*100)\n",
+    "for name, service in SERVICES.items():\n",
+    "    health = service['health'] or \"N/A\"\n",
+    "    gpu = \" [GPU]\" if service.get('gpu') else \"\"\n",
+    "    print(f\"{name.upper():<20} {service['port']:<7} {service['container']:<30} {health}{gpu}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 🎨 Vision Encoder Service (NEW)\n",
+    "\n",
+    "### Overview\n",
+    "- **Service:** Vision Encoder (OpenCLIP ViT-L/14)\n",
+    "- **Port:** 8001\n",
+    "- **GPU:** Required (NVIDIA CUDA)\n",
+    "- **Embedding Dimension:** 768\n",
+    "- **Vector DB:** Qdrant (port 6333/6334)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Vision Encoder Configuration\n",
+    "VISION_ENCODER = {\n",
+    "    \"service\": \"vision-encoder\",\n",
+    "    \"port\": 8001,\n",
+    "    \"container\": \"dagi-vision-encoder\",\n",
+    "    \"gpu_required\": True,\n",
+    "    \"model\": \"ViT-L-14\",\n",
+    "    \"pretrained\": \"openai\",\n",
+    "    \"embedding_dim\": 768,\n",
+    "    \"endpoints\": {\n",
+    "        \"health\": \"http://localhost:8001/health\",\n",
+    "        \"info\": \"http://localhost:8001/info\",\n",
+    "        \"embed_text\": \"http://localhost:8001/embed/text\",\n",
+    "        \"embed_image\": \"http://localhost:8001/embed/image\",\n",
+    "        \"docs\": \"http://localhost:8001/docs\"\n",
+    "    },\n",
+    "    \"qdrant\": {\n",
+    "        \"host\": \"qdrant\",\n",
+    "        \"port\": 6333,\n",
+    "        \"grpc_port\": 6334,\n",
+    "        \"health\": \"http://localhost:6333/healthz\"\n",
+    "    }\n",
+    "}\n",
+    "\n",
+    "print(\"Vision Encoder Service Configuration:\")\n",
+    "print(\"=\"*80)\n",
+    "print(f\"Model: {VISION_ENCODER['model']} ({VISION_ENCODER['pretrained']})\")\n",
+    "print(f\"Embedding Dimension: {VISION_ENCODER['embedding_dim']}\")\n",
+    "print(f\"GPU Required: {VISION_ENCODER['gpu_required']}\")\n",
+    "print(\"\\nEndpoints:\")\n",
+    "for name, url in VISION_ENCODER['endpoints'].items():\n",
+    "    print(f\"  {name:15} {url}\")\n",
+    "print(\"\\nQdrant Vector DB:\")\n",
+    "print(f\"  HTTP:  http://localhost:{VISION_ENCODER['qdrant']['port']}\")\n",
+    "print(f\"  gRPC:  localhost:{VISION_ENCODER['qdrant']['grpc_port']}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Vision Encoder Testing Commands\n",
+    "VISION_ENCODER_TESTS = {\n",
+    "    \"Health Check\": \"curl http://localhost:8001/health\",\n",
+    "    \"Model Info\": \"curl http://localhost:8001/info\",\n",
+    "    \"Text Embedding\": '''curl -X POST http://localhost:8001/embed/text -H \"Content-Type: application/json\" -d '{\"text\": \"DAARION governance\", \"normalize\": true}' ''',\n",
+    "    \"Image Embedding\": '''curl -X POST http://localhost:8001/embed/image -H \"Content-Type: application/json\" -d '{\"image_url\": \"https://example.com/image.jpg\", \"normalize\": true}' ''',\n",
+    "    \"Via Router (Text)\": '''curl -X POST http://localhost:9102/route -H \"Content-Type: application/json\" -d '{\"mode\": \"vision_embed\", \"message\": \"embed text\", \"payload\": {\"operation\": \"embed_text\", \"text\": \"test\", \"normalize\": true}}' ''',\n",
+    "    \"Qdrant Health\": \"curl http://localhost:6333/healthz\",\n",
+    "    \"Run Smoke Tests\": \"./test-vision-encoder.sh\"\n",
+    "}\n",
+    "\n",
+    "print(\"Vision Encoder Testing Commands:\")\n",
+    "print(\"=\"*80)\n",
+    "for name, cmd in VISION_ENCODER_TESTS.items():\n",
+    "    print(f\"\\n{name}:\")\n",
+    "    print(f\"  {cmd}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 📖 Documentation Links (UPDATED)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Documentation References (UPDATED)\n",
+    "DOCS = {\n",
+    "    \"Main Guide\": \"../WARP.md\",\n",
+    "    \"Infrastructure\": \"../INFRASTRUCTURE.md\",\n",
+    "    \"Agents Map\": \"../docs/agents.md\",\n",
+    "    \"RAG Ingestion Status\": \"../RAG-INGESTION-STATUS.md\",\n",
+    "    \"HMM Memory Status\": \"../HMM-MEMORY-STATUS.md\",\n",
+    "    \"Crawl4AI Status\": \"../CRAWL4AI-STATUS.md\",\n",
+    "    \"Vision Encoder Status\": \"../VISION-ENCODER-STATUS.md\",\n",
+    "    \"Vision Encoder Deployment\": \"../services/vision-encoder/README.md\",\n",
+    "    \"Repository Management\": \"../DAARION_CITY_REPO.md\",\n",
+    "    \"Server Setup\": \"../SERVER_SETUP_INSTRUCTIONS.md\",\n",
+    "    \"Deployment\": \"../DEPLOY-NOW.md\",\n",
+    "    \"Helion Status\": \"../STATUS-HELION.md\",\n",
+    "    \"Architecture Index\": \"../docs/cursor/README.md\",\n",
+    "    \"API Reference\": \"../docs/api.md\"\n",
+    "}\n",
+    "\n",
+    "print(\"Documentation Quick Links:\")\n",
+    "print(\"=\"*80)\n",
+    "for name, path in DOCS.items():\n",
+    "    print(f\"{name:<30} {path}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 📝 Notes & Updates\n",
+    "\n",
+    "### Recent Changes (2025-01-17)\n",
+    "- ✅ **Added Vision Encoder Service** (port 8001) with OpenCLIP ViT-L/14\n",
+    "- ✅ **Added Qdrant Vector Database** (port 6333/6334) for image/text embeddings\n",
+    "- ✅ **GPU Support** via NVIDIA CUDA + Docker runtime\n",
+    "- ✅ **DAGI Router integration** (mode: vision_embed)\n",
+    "- ✅ **768-dim embeddings** for multimodal RAG\n",
+    "- ✅ Created VISION-ENCODER-STATUS.md with full implementation details\n",
+    "- ✅ Added test-vision-encoder.sh smoke tests\n",
+    "\n",
+    "### Services Count: 17 (from 15)\n",
+    "- Total Services: 17\n",
+    "- GPU Services: 2 (Vision Encoder, Ollama)\n",
+    "- Vector Databases: 1 (Qdrant)\n",
+    "\n",
+    "---\n",
+    "\n",
+    "**Last Updated:** 2025-01-17 by WARP AI  \n",
+    "**Maintained by:** Ivan Tytar & DAARION Team"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
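
Reviewer sketch (not part of this commit): the notebook prints health URLs and curl commands but never calls them. Assuming the ports and paths listed in the notebook, a small standard-library helper could poll the health endpoints and build the same POST body as the `Text Embedding` curl command. The names `HEALTH_URLS`, `http_status`, `check_services`, and `build_embed_text_request` are hypothetical, and the response schema of `/embed/text` is not shown in the diff, so the sketch stops at building the request.

```python
import json
import urllib.error
import urllib.request

# Subset of the notebook's SERVICES mapping (URLs copied from the diff).
HEALTH_URLS = {
    "router": "http://localhost:9102/health",
    "vision_encoder": "http://localhost:8001/health",
    "qdrant": "http://localhost:6333/healthz",
}


def http_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def check_services(urls, fetch=http_status):
    """Map each service name to True if its health URL returned HTTP 200."""
    return {name: fetch(url) == 200 for name, url in urls.items()}


def build_embed_text_request(text, normalize=True,
                             base_url="http://localhost:8001"):
    """Build (but do not send) the POST request used by the curl example."""
    body = json.dumps({"text": text, "normalize": normalize}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed/text",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Calling `check_services(HEALTH_URLS)` on the host running the stack returns a name-to-boolean mapping. The `fetch` parameter exists only so the check can be exercised without a network, for example with a stub function in a unit test.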