- One-command deploy via automated script
- Manual step-by-step deployment
- Verification checklist
- Troubleshooting guide
- Expected results and GPU usage
- Next steps after deployment
🚀 Vision Encoder Deployment — Quick Guide
Server: 144.76.224.179 (Hetzner GEX44 #2844465)
Status: ✅ Code pushed to GitHub
Ready to deploy: YES
⚡ Quick Deploy (One Command)
SSH to server and run automated script:
ssh root@144.76.224.179 'cd /opt/microdao-daarion && git pull origin main && ./deploy-vision-encoder.sh'
That's it! The script will:
- ✅ Pull latest code
- ✅ Check GPU & Docker GPU runtime
- ✅ Build Vision Encoder image
- ✅ Start Vision Encoder + Qdrant
- ✅ Run health checks
- ✅ Run smoke tests
- ✅ Show GPU status
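The steps above can be sketched as a single script. This is a hedged reconstruction based only on the bullet list — the actual deploy-vision-encoder.sh in the repo is authoritative and may differ:

```shell
#!/usr/bin/env bash
# Sketch of the automated deploy flow, reconstructed from the step list above.
set -euo pipefail

deploy() {
  git pull origin main
  nvidia-smi > /dev/null                       # fail fast if no GPU / driver
  docker-compose build vision-encoder
  docker-compose up -d vision-encoder qdrant
  curl -sf http://localhost:8001/health > /dev/null
  curl -sf http://localhost:6333/healthz > /dev/null
  ./test-vision-encoder.sh
  nvidia-smi                                   # show final GPU status
}

# deploy   # uncomment (or call) to run on the server
```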
📋 Manual Deploy (Step by Step)
If you prefer manual deployment:
1. SSH to Server
ssh root@144.76.224.179
2. Navigate to Project
cd /opt/microdao-daarion
3. Pull Latest Code
git pull origin main
4. Check GPU
nvidia-smi
Should show NVIDIA GPU with ~24 GB VRAM.
5. Build Vision Encoder
docker-compose build vision-encoder
This takes 5-10 minutes (downloads PyTorch + OpenCLIP).
6. Start Services
docker-compose up -d vision-encoder qdrant
7. Check Logs
docker-compose logs -f vision-encoder
Wait for: "Model loaded successfully. Embedding dimension: 768"
8. Verify Health
curl http://localhost:8001/health
curl http://localhost:6333/healthz
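Right after `docker-compose up`, the health endpoint may refuse connections while the model is still loading. A small retry helper (an illustration, not part of the repo) makes the check robust:

```shell
# wait_for RETRIES DELAY CMD...: re-run CMD until it succeeds, sleeping
# DELAY seconds between attempts. Returns non-zero if all attempts fail.
wait_for() {
  local retries=$1 delay=$2 i
  shift 2
  for ((i = 0; i < retries; i++)); do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Usage on the server (retries the health check for up to ~60 s):
# wait_for 30 2 curl -sf http://localhost:8001/health
```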
9. Create Qdrant Collection
curl -X PUT http://localhost:6333/collections/daarion_images \
-H "Content-Type: application/json" \
-d '{"vectors": {"size": 768, "distance": "Cosine"}}'
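Re-running this PUT against an existing collection returns an error. A hypothetical idempotent wrapper, assuming Qdrant's standard REST API (`GET /collections/{name}` returns 200 when the collection exists):

```shell
# ensure_collection NAME SIZE: create the Qdrant collection only if missing.
ensure_collection() {
  local name=$1 size=$2
  if curl -sf "http://localhost:6333/collections/${name}" > /dev/null; then
    echo "collection ${name} already exists"
  else
    curl -sf -X PUT "http://localhost:6333/collections/${name}" \
      -H "Content-Type: application/json" \
      -d "{\"vectors\": {\"size\": ${size}, \"distance\": \"Cosine\"}}"
  fi
}

# ensure_collection daarion_images 768
```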
10. Run Smoke Tests
chmod +x ./test-vision-encoder.sh
./test-vision-encoder.sh
11. Monitor GPU
watch -n 1 nvidia-smi
Should show Vision Encoder using ~4 GB VRAM.
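For scripted monitoring, nvidia-smi's query mode emits CSV instead of the full table. The parsing helper below is an illustration; it extracts the used-memory figure from a line like `4096 MiB, 24576 MiB`:

```shell
# used_mib "4096 MiB, 24576 MiB" -> 4096
used_mib() {
  printf '%s\n' "$1" | awk -F', ' '{print $1}' | awk '{print $1}'
}

# Live usage on the server:
# used_mib "$(nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader)"
```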
🔍 Verification
Check All Services
docker-compose ps
All 17 services should be "Up":
- dagi-router (9102)
- dagi-gateway (9300)
- dagi-devtools (8008)
- dagi-crewai (9010)
- dagi-rbac (9200)
- dagi-rag-service (9500)
- dagi-memory-service (8000)
- dagi-parser-service (9400)
- dagi-vision-encoder (8001) ← NEW
- dagi-postgres (5432)
- redis (6379)
- neo4j (7687/7474)
- dagi-qdrant (6333/6334) ← NEW
- grafana (3000)
- prometheus (9090)
- neo4j-exporter (9091)
- ollama (11434)
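Eyeballing 17 rows is error-prone. A small helper (an illustration, not part of the repo) greps the `docker-compose ps` output for each expected service:

```shell
# check_services PS_OUTPUT SERVICE...: report any service whose row does
# not contain "Up"; returns non-zero if anything is down.
check_services() {
  local ps_output=$1 svc missing=0
  shift
  for svc in "$@"; do
    if ! printf '%s\n' "$ps_output" | grep -q "${svc}.*Up"; then
      echo "NOT RUNNING: ${svc}"
      missing=1
    fi
  done
  return $missing
}

# Usage on the server:
# check_services "$(docker-compose ps)" dagi-vision-encoder dagi-qdrant
```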
Test Vision Encoder API
# Text embedding (the sample text below is Ukrainian for "DAARION tokenomics")
curl -X POST http://localhost:8001/embed/text \
-H "Content-Type: application/json" \
-d '{"text": "токеноміка DAARION", "normalize": true}'
# Should return: {"embedding": [...], "dimension": 768, ...}
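To assert on the response programmatically (e.g. in a CI smoke test), the JSON can be parsed with python3. Here `RESPONSE` is a hard-coded stand-in for the curl output above:

```shell
# In practice: RESPONSE=$(curl -s -X POST http://localhost:8001/embed/text ...)
RESPONSE='{"embedding": [0.1, 0.2], "dimension": 768}'
DIM=$(printf '%s' "$RESPONSE" | python3 -c 'import sys, json; print(json.load(sys.stdin)["dimension"])')
echo "embedding dimension: $DIM"
```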
Test via Router
curl -X POST http://localhost:9102/route \
-H "Content-Type: application/json" \
-d '{
"mode": "vision_embed",
"message": "embed text",
"payload": {
"operation": "embed_text",
"text": "DAARION governance",
"normalize": true
}
}'
📊 Expected Results
GPU Usage
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05 Driver Version: 535.104.05 CUDA Version: 12.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce... Off | 00000000:01:00.0 Off | N/A |
| 35% 52C P2 85W / 350W | 4096MiB / 24576MiB | 15% Default |
+-------------------------------+----------------------+----------------------+
VRAM Allocation:
- Vision Encoder: ~4 GB (always loaded)
- Ollama (qwen3:8b): ~6 GB (when active)
- Available: ~14 GB
Service Logs
Vision Encoder startup logs:
{"timestamp": "2025-01-17 13:00:00", "level": "INFO", "message": "Starting vision-encoder service..."}
{"timestamp": "2025-01-17 13:00:01", "level": "INFO", "message": "Loading model ViT-L-14 with pretrained weights openai"}
{"timestamp": "2025-01-17 13:00:01", "level": "INFO", "message": "Device: cuda"}
{"timestamp": "2025-01-17 13:00:15", "level": "INFO", "message": "Model loaded successfully. Embedding dimension: 768"}
{"timestamp": "2025-01-17 13:00:15", "level": "INFO", "message": "GPU: NVIDIA GeForce RTX 3090, Memory: 24.00 GB"}
{"timestamp": "2025-01-17 13:00:15", "level": "INFO", "message": "Uvicorn running on http://0.0.0.0:8001"}
🐛 Troubleshooting
Problem: GPU not detected
Check:
nvidia-smi
Fix:
# Install NVIDIA drivers (if needed)
sudo apt install nvidia-driver-535
sudo reboot
# Install NVIDIA Container Toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
Problem: Vision Encoder using CPU instead of GPU
Check device:
curl http://localhost:8001/health | jq '.device'
If it returns "cpu":
- Check the Docker GPU runtime:
  docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
- Restart Vision Encoder:
  docker-compose restart vision-encoder
- Check logs:
  docker-compose logs vision-encoder
Problem: Out of Memory
Check GPU memory:
nvidia-smi
Solutions:
- Use a smaller model: edit docker-compose.yml → MODEL_NAME=ViT-B-32 (~2 GB VRAM instead of ~4 GB; note that ViT-B-32 produces 512-dim embeddings, so the Qdrant collection size must match)
- Stop Ollama temporarily:
  docker stop ollama
- Restart services:
  docker-compose restart vision-encoder
📖 Documentation
- SYSTEM-INVENTORY.md — Complete system inventory (GPU, models, services)
- VISION-ENCODER-STATUS.md — Vision Encoder service status
- VISION-RAG-IMPLEMENTATION.md — Implementation details
- services/vision-encoder/README.md — Full deployment guide
- docs/cursor/vision_encoder_deployment_task.md — Deployment checklist
✅ Deployment Checklist
Before Deployment:
- Code committed to Git
- Code pushed to GitHub
- Documentation updated
- Tests created
- Deploy script created
After Deployment:
- Vision Encoder running (port 8001)
- Qdrant running (port 6333)
- Health checks passing
- Smoke tests passing
- GPU detected and used (~4 GB VRAM)
- Qdrant collection created
- Integration with Router working
🎯 Next Steps After Deployment
1. Index Existing Images
# Example: Index images from Parser Service output
python scripts/index_images.py --dao-id daarion --directory /data/images
2. Test Image Search
# Text-to-image search (the query is Ukrainian for "find tokenomics diagrams")
curl -X POST http://localhost:9102/route \
-H "Content-Type: application/json" \
-d '{
"mode": "image_search",
"message": "знайди діаграми токеноміки",
"dao_id": "daarion",
"payload": {"top_k": 5}
}'
3. Monitor Performance
# GPU usage
watch -n 1 nvidia-smi
# Service logs
docker-compose logs -f vision-encoder
# Request metrics
curl http://localhost:9090/metrics | grep vision_encoder
Status: ✅ Ready to Deploy
Last Updated: 2025-01-17
Maintained by: Ivan Tytar & DAARION Team