# 🚀 Vision Encoder Deployment — Quick Guide

**Server:** 144.76.224.179 (Hetzner GEX44 #2844465)
**Status:** ✅ Code pushed to GitHub
**Ready to deploy:** YES

---

## ⚡ Quick Deploy (One Command)

SSH to the server and run the automated script:

```bash
ssh root@144.76.224.179 'cd /opt/microdao-daarion && git pull origin main && ./deploy-vision-encoder.sh'
```

**That's it!** The script will:

- ✅ Pull the latest code
- ✅ Check the GPU & Docker GPU runtime
- ✅ Build the Vision Encoder image
- ✅ Start Vision Encoder + Qdrant
- ✅ Run health checks
- ✅ Run smoke tests
- ✅ Show GPU status

---

## 📋 Manual Deploy (Step by Step)

If you prefer manual deployment:

### 1. SSH to the Server

```bash
ssh root@144.76.224.179
```

### 2. Navigate to the Project

```bash
cd /opt/microdao-daarion
```

### 3. Pull the Latest Code

```bash
git pull origin main
```

### 4. Check the GPU

```bash
nvidia-smi
```

This should show an NVIDIA GPU with ~24 GB of VRAM.

### 5. Build the Vision Encoder

```bash
docker-compose build vision-encoder
```

This takes 5-10 minutes (downloads PyTorch + OpenCLIP).

### 6. Start the Services

```bash
docker-compose up -d vision-encoder qdrant
```

### 7. Check the Logs

```bash
docker-compose logs -f vision-encoder
```

Wait for: `"Model loaded successfully. Embedding dimension: 768"`

### 8. Verify Health

```bash
curl http://localhost:8001/health
curl http://localhost:6333/healthz
```

### 9. Create the Qdrant Collection

```bash
curl -X PUT http://localhost:6333/collections/daarion_images \
  -H "Content-Type: application/json" \
  -d '{"vectors": {"size": 768, "distance": "Cosine"}}'
```

### 10. Run Smoke Tests

```bash
chmod +x ./test-vision-encoder.sh
./test-vision-encoder.sh
```

### 11. Monitor the GPU

```bash
watch -n 1 nvidia-smi
```

The Vision Encoder should be using ~4 GB of VRAM.

---

## 🔍 Verification

### Check All Services

```bash
docker-compose ps
```

All 17 services should be "Up":

- dagi-router (9102)
- dagi-gateway (9300)
- dagi-devtools (8008)
- dagi-crewai (9010)
- dagi-rbac (9200)
- dagi-rag-service (9500)
- dagi-memory-service (8000)
- dagi-parser-service (9400)
- **dagi-vision-encoder (8001)** ← NEW
- dagi-postgres (5432)
- redis (6379)
- neo4j (7687/7474)
- **dagi-qdrant (6333/6334)** ← NEW
- grafana (3000)
- prometheus (9090)
- neo4j-exporter (9091)
- ollama (11434)

### Test the Vision Encoder API

```bash
# Text embedding
curl -X POST http://localhost:8001/embed/text \
  -H "Content-Type: application/json" \
  -d '{"text": "DAARION tokenomics", "normalize": true}'

# Should return: {"embedding": [...], "dimension": 768, ...}
```

### Test via the Router

```bash
curl -X POST http://localhost:9102/route \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "vision_embed",
    "message": "embed text",
    "payload": {
      "operation": "embed_text",
      "text": "DAARION governance",
      "normalize": true
    }
  }'
```
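For scripted verification, the same endpoints can be exercised from Python. This is a minimal sketch that relies only on the routes and response fields shown above (`/health` with its `device` field, `/embed/text` returning `embedding` and `dimension`); the `requests` library is an assumed dependency.

```python
"""Minimal scripted check of the Vision Encoder API.

Relies only on the routes and fields documented above: GET /health
(with a "device" field) and POST /embed/text (returning "embedding"
and "dimension"). The `requests` dependency is an assumption.
"""
import requests

BASE_URL = "http://localhost:8001"

# Health: "device" should read "cuda" when the GPU is in use.
health = requests.get(f"{BASE_URL}/health", timeout=10).json()
print(f"device: {health.get('device')}")

# Text embedding: ViT-L-14 should yield a 768-dimensional vector.
resp = requests.post(
    f"{BASE_URL}/embed/text",
    json={"text": "DAARION governance", "normalize": True},
    timeout=30,
)
resp.raise_for_status()
data = resp.json()
assert data["dimension"] == 768, f"unexpected dimension: {data['dimension']}"
print(f"embedding length: {len(data['embedding'])}")
```

The same script makes a convenient post-deploy gate in CI: it fails loudly if the service fell back to CPU-sized latency or the model loaded with the wrong embedding width.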
---

## 📊 Expected Results

### GPU Usage

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05       Driver Version: 535.104.05   CUDA Version: 12.2 |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 35%   52C    P2    85W / 350W |   4096MiB / 24576MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+
```

**VRAM Allocation:**

- Vision Encoder: ~4 GB (always loaded)
- Ollama (qwen3:8b): ~6 GB (when active)
- Available: ~14 GB

### Service Logs

Vision Encoder startup logs:

```json
{"timestamp": "2025-01-17 13:00:00", "level": "INFO", "message": "Starting vision-encoder service..."}
{"timestamp": "2025-01-17 13:00:01", "level": "INFO", "message": "Loading model ViT-L-14 with pretrained weights openai"}
{"timestamp": "2025-01-17 13:00:01", "level": "INFO", "message": "Device: cuda"}
{"timestamp": "2025-01-17 13:00:15", "level": "INFO", "message": "Model loaded successfully. Embedding dimension: 768"}
{"timestamp": "2025-01-17 13:00:15", "level": "INFO", "message": "GPU: NVIDIA GeForce RTX 3090, Memory: 24.00 GB"}
{"timestamp": "2025-01-17 13:00:15", "level": "INFO", "message": "Uvicorn running on http://0.0.0.0:8001"}
```

---

## 🐛 Troubleshooting

### Problem: GPU not detected

**Check:**

```bash
nvidia-smi
```

**Fix:**

```bash
# Install NVIDIA drivers (if needed)
sudo apt install nvidia-driver-535
sudo reboot

# Install NVIDIA Container Toolkit
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```

### Problem: Vision Encoder using CPU instead of GPU

**Check the device:**

```bash
curl http://localhost:8001/health | jq '.device'
```

If it returns `"cpu"`:

1. Check the GPU runtime: `docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi`
2. Restart the Vision Encoder: `docker-compose restart vision-encoder`
3. Check the logs: `docker-compose logs vision-encoder`

### Problem: Out of Memory

**Check GPU memory:**

```bash
nvidia-smi
```

**Solutions:**

1. Use a smaller model: edit `docker-compose.yml` → `MODEL_NAME=ViT-B-32` (2 GB instead of 4 GB)
2. Stop Ollama temporarily: `docker stop ollama`
3. Restart the service: `docker-compose restart vision-encoder`

---

## 📖 Documentation

- **[SYSTEM-INVENTORY.md](./SYSTEM-INVENTORY.md)** — Complete system inventory (GPU, models, services)
- **[VISION-ENCODER-STATUS.md](./VISION-ENCODER-STATUS.md)** — Vision Encoder service status
- **[VISION-RAG-IMPLEMENTATION.md](./VISION-RAG-IMPLEMENTATION.md)** — Implementation details
- **[services/vision-encoder/README.md](./services/vision-encoder/README.md)** — Full deployment guide
- **[docs/cursor/vision_encoder_deployment_task.md](./docs/cursor/vision_encoder_deployment_task.md)** — Deployment checklist

---

## ✅ Deployment Checklist

**Before Deployment:**

- [x] Code committed to Git
- [x] Code pushed to GitHub
- [x] Documentation updated
- [x] Tests created
- [x] Deploy script created

**After Deployment:**

- [ ] Vision Encoder running (port 8001)
- [ ] Qdrant running (port 6333)
- [ ] Health checks passing
- [ ] Smoke tests passing
- [ ] GPU detected and used (~4 GB VRAM)
- [ ] Qdrant collection created
- [ ] Integration with the Router working

---

## 🎯 Next Steps After Deployment

### 1. Index Existing Images

```bash
# Example: Index images from Parser Service output
python scripts/index_images.py --dao-id daarion --directory /data/images
```
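Since `scripts/index_images.py` itself isn't reproduced in this guide, the sketch below shows roughly what such an indexing pass could look like. The image-embedding route is a hypothetical stand-in (only `/embed/text` is documented above), and the upsert uses Qdrant's standard REST API against the `daarion_images` collection created in step 9; the `dao_id` payload field mirrors the CLI flag in the example command.

```python
"""Sketch of an indexing pass like the one scripts/index_images.py performs.

Assumptions not confirmed by this guide: the Vision Encoder exposes an
image-embedding route, shown here as a HYPOTHETICAL /embed/image multipart
upload; adapt it to the service's actual API. The Qdrant upsert uses the
standard REST endpoint for the collection created in step 9.
"""
import uuid
from pathlib import Path

import requests

ENCODER_URL = "http://localhost:8001"
QDRANT_URL = "http://localhost:6333"
COLLECTION = "daarion_images"


def index_directory(directory: str, dao_id: str) -> None:
    for path in sorted(Path(directory).glob("*.png")):
        # HYPOTHETICAL endpoint: the guide only documents /embed/text.
        with path.open("rb") as fh:
            resp = requests.post(
                f"{ENCODER_URL}/embed/image",
                files={"file": (path.name, fh, "image/png")},
                timeout=60,
            )
        resp.raise_for_status()
        vector = resp.json()["embedding"]  # 768 floats for ViT-L-14

        # Standard Qdrant REST upsert (PUT /collections/{name}/points).
        requests.put(
            f"{QDRANT_URL}/collections/{COLLECTION}/points",
            json={
                "points": [{
                    "id": str(uuid.uuid4()),
                    "vector": vector,
                    "payload": {"dao_id": dao_id, "source_path": str(path)},
                }]
            },
            timeout=30,
        ).raise_for_status()


if __name__ == "__main__":
    index_directory("/data/images", dao_id="daarion")
```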
### 2. Test Image Search

```bash
# Text-to-image search
curl -X POST http://localhost:9102/route \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "image_search",
    "message": "find tokenomics diagrams",
    "dao_id": "daarion",
    "payload": {"top_k": 5}
  }'
```

### 3. Monitor Performance

```bash
# GPU usage
watch -n 1 nvidia-smi

# Service logs
docker-compose logs -f vision-encoder

# Request metrics
curl http://localhost:9090/metrics | grep vision_encoder
```

---

**Status:** ✅ Ready to Deploy
**Last Updated:** 2025-01-17
**Maintained by:** Ivan Tytar & DAARION Team