
🚀 Vision Encoder Deployment — Quick Guide

Server: 144.76.224.179 (Hetzner GEX44 #2844465)
Status: Code pushed to GitHub
Ready to deploy: YES


Quick Deploy (One Command)

SSH to server and run automated script:

ssh root@144.76.224.179 'cd /opt/microdao-daarion && git pull origin main && ./deploy-vision-encoder.sh'

That's it! The script will:

  • Pull latest code
  • Check GPU & Docker GPU runtime
  • Build Vision Encoder image
  • Start Vision Encoder + Qdrant
  • Run health checks
  • Run smoke tests
  • Show GPU status

📋 Manual Deploy (Step by Step)

If you prefer manual deployment:

1. SSH to Server

ssh root@144.76.224.179

2. Navigate to Project

cd /opt/microdao-daarion

3. Pull Latest Code

git pull origin main

4. Check GPU

nvidia-smi

Should show NVIDIA GPU with ~24 GB VRAM.

5. Build Vision Encoder

docker-compose build vision-encoder

This takes 5-10 minutes (downloads PyTorch + OpenCLIP).

6. Start Services

docker-compose up -d vision-encoder qdrant

7. Check Logs

docker-compose logs -f vision-encoder

Wait for: "Model loaded successfully. Embedding dimension: 768"

8. Verify Health

curl http://localhost:8001/health
curl http://localhost:6333/healthz

9. Create Qdrant Collection

curl -X PUT http://localhost:6333/collections/daarion_images \
  -H "Content-Type: application/json" \
  -d '{"vectors": {"size": 768, "distance": "Cosine"}}'
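
The same collection can also be created from Python with just the standard library — a minimal sketch mirroring the curl call above (the URL and collection name are the ones used throughout this guide):

```python
import json
from urllib import request

QDRANT_URL = "http://localhost:6333"   # Qdrant REST port from this guide
COLLECTION = "daarion_images"

def collection_config(size=768, distance="Cosine"):
    """Vector config matching the Vision Encoder's 768-dim embeddings."""
    return {"vectors": {"size": size, "distance": distance}}

def create_collection(name=COLLECTION, base_url=QDRANT_URL):
    """PUT /collections/{name} -- the same request the curl example sends."""
    req = request.Request(
        f"{base_url}/collections/{name}",
        data=json.dumps(collection_config()).encode(),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(create_collection())
```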

10. Run Smoke Tests

chmod +x ./test-vision-encoder.sh
./test-vision-encoder.sh

11. Monitor GPU

watch -n 1 nvidia-smi

Should show Vision Encoder using ~4 GB VRAM.


🔍 Verification

Check All Services

docker-compose ps

All 17 services should be "Up":

  • dagi-router (9102)
  • dagi-gateway (9300)
  • dagi-devtools (8008)
  • dagi-crewai (9010)
  • dagi-rbac (9200)
  • dagi-rag-service (9500)
  • dagi-memory-service (8000)
  • dagi-parser-service (9400)
  • dagi-vision-encoder (8001) ← NEW
  • dagi-postgres (5432)
  • redis (6379)
  • neo4j (7687/7474)
  • dagi-qdrant (6333/6334) ← NEW
  • grafana (3000)
  • prometheus (9090)
  • neo4j-exporter (9091)
  • ollama (11434)

Test Vision Encoder API

# Text embedding
curl -X POST http://localhost:8001/embed/text \
  -H "Content-Type: application/json" \
  -d '{"text": "DAARION tokenomics", "normalize": true}'

# Should return: {"embedding": [...], "dimension": 768, ...}
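
With `"normalize": true` the service returns unit-length vectors, so cosine similarity between two embeddings reduces to a plain dot product — which is why the Qdrant collection is configured with `"distance": "Cosine"`. A toy illustration:

```python
import math

def normalize(v):
    """Scale a vector to unit length, as the encoder does with normalize=true."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Cosine similarity; for unit vectors this is just the dot product."""
    return sum(x * y for x, y in zip(a, b))

a = normalize([1.0, 2.0, 2.0])
b = normalize([2.0, 1.0, 2.0])
print(round(cosine_similarity(a, b), 4))  # → 0.8889
```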

Test via Router

curl -X POST http://localhost:9102/route \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "vision_embed",
    "message": "embed text",
    "payload": {
      "operation": "embed_text",
      "text": "DAARION governance",
      "normalize": true
    }
  }'

📊 Expected Results

GPU Usage

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05   Driver Version: 535.104.05   CUDA Version: 12.2    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce...  Off  | 00000000:01:00.0 Off |                  N/A |
| 35%   52C    P2    85W / 350W |   4096MiB / 24576MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+

VRAM Allocation:

  • Vision Encoder: ~4 GB (always loaded)
  • Ollama (qwen3:8b): ~6 GB (when active)
  • Available: ~14 GB

Service Logs

Vision Encoder startup logs:

{"timestamp": "2025-01-17 13:00:00", "level": "INFO", "message": "Starting vision-encoder service..."}
{"timestamp": "2025-01-17 13:00:01", "level": "INFO", "message": "Loading model ViT-L-14 with pretrained weights openai"}
{"timestamp": "2025-01-17 13:00:01", "level": "INFO", "message": "Device: cuda"}
{"timestamp": "2025-01-17 13:00:15", "level": "INFO", "message": "Model loaded successfully. Embedding dimension: 768"}
{"timestamp": "2025-01-17 13:00:15", "level": "INFO", "message": "GPU: NVIDIA GeForce RTX 3090, Memory: 24.00 GB"}
{"timestamp": "2025-01-17 13:00:15", "level": "INFO", "message": "Uvicorn running on http://0.0.0.0:8001"}
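
Since the logs are line-delimited JSON, readiness can be checked programmatically instead of watching `docker-compose logs` by hand. A small helper sketch (the message format is taken from the sample lines above):

```python
import json

READY_MARKER = "Model loaded successfully"

def is_ready(log_text):
    """Return the embedding dimension once the ready line appears, else None."""
    for line in log_text.splitlines():
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines (e.g. docker-compose prefixes)
        msg = entry.get("message", "")
        if READY_MARKER in msg:
            # the ready message ends with "... Embedding dimension: 768"
            return int(msg.rsplit(":", 1)[1])
    return None

sample = '{"timestamp": "2025-01-17 13:00:15", "level": "INFO", "message": "Model loaded successfully. Embedding dimension: 768"}'
print(is_ready(sample))  # → 768
```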

🐛 Troubleshooting

Problem: GPU not detected

Check:

nvidia-smi

Fix:

# Install NVIDIA drivers (if needed)
sudo apt install nvidia-driver-535
sudo reboot

# Install NVIDIA Container Toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

Problem: Vision Encoder using CPU instead of GPU

Check device:

curl http://localhost:8001/health | jq '.device'

If it returns "cpu":

  1. Check GPU runtime: docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
  2. Restart Vision Encoder: docker-compose restart vision-encoder
  3. Check logs: docker-compose logs vision-encoder

Problem: Out of Memory

Check GPU memory:

nvidia-smi

Solutions:

  1. Use a smaller model: edit docker-compose.yml and set MODEL_NAME=ViT-B-32 (~2 GB instead of ~4 GB)
  2. Stop Ollama temporarily: docker stop ollama
  3. Restart services: docker-compose restart vision-encoder

Deployment Checklist

Before Deployment:

  • Code committed to Git
  • Code pushed to GitHub
  • Documentation updated
  • Tests created
  • Deploy script created

After Deployment:

  • Vision Encoder running (port 8001)
  • Qdrant running (port 6333)
  • Health checks passing
  • Smoke tests passing
  • GPU detected and used (~4 GB VRAM)
  • Qdrant collection created
  • Integration with Router working

🎯 Next Steps After Deployment

1. Index Existing Images

# Example: Index images from Parser Service output
python scripts/index_images.py --dao-id daarion --directory /data/images
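
The contents of `scripts/index_images.py` are not shown in this guide; a rough sketch of what such an indexer might do, assuming a hypothetical `/embed/image` endpoint that mirrors `/embed/text` and returns `{"embedding": [...]}` (the real endpoint and payload may differ):

```python
import json
from urllib import request

VISION_URL = "http://localhost:8001"   # Vision Encoder (from this guide)
QDRANT_URL = "http://localhost:6333"   # Qdrant (from this guide)

def embed_image(path):
    """POST raw image bytes to a hypothetical /embed/image endpoint."""
    with open(path, "rb") as f:
        data = f.read()
    req = request.Request(
        f"{VISION_URL}/embed/image",
        data=data,
        headers={"Content-Type": "application/octet-stream"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def make_point(point_id, embedding, image_path, dao_id):
    """Build one Qdrant point, keeping the image path and DAO id as payload."""
    return {
        "id": point_id,
        "vector": embedding,
        "payload": {"path": str(image_path), "dao_id": dao_id},
    }

def upsert_points(points, collection="daarion_images"):
    """PUT the points into the collection created during deployment."""
    req = request.Request(
        f"{QDRANT_URL}/collections/{collection}/points",
        data=json.dumps({"points": points}).encode(),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```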
2. Test Image Search

# Text-to-image search
curl -X POST http://localhost:9102/route \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "image_search",
    "message": "find tokenomics diagrams",
    "dao_id": "daarion",
    "payload": {"top_k": 5}
  }'

3. Monitor Performance

# GPU usage
watch -n 1 nvidia-smi

# Service logs
docker-compose logs -f vision-encoder

# Request metrics
curl http://localhost:9090/metrics | grep vision_encoder

Status: Ready to Deploy
Last Updated: 2025-01-17
Maintained by: Ivan Tytar & DAARION Team