- Vision Encoder Service (OpenCLIP ViT-L/14, GPU-accelerated)
- FastAPI app with text/image embedding endpoints (768-dim)
- Docker support with NVIDIA GPU runtime
- Port 8001, health checks, model info API
- Qdrant Vector Database integration
- Port 6333/6334 (HTTP/gRPC)
- Image embeddings storage (768-dim, Cosine distance)
- Auto collection creation
- Vision RAG implementation
- VisionEncoderClient (Python client for API)
- Image Search module (text-to-image, image-to-image)
- Vision RAG routing in DAGI Router (mode: image_search)
- VisionEncoderProvider integration
- Documentation (5000+ lines)
- SYSTEM-INVENTORY.md - Complete system inventory
- VISION-ENCODER-STATUS.md - Service status
- VISION-RAG-IMPLEMENTATION.md - Implementation details
- vision_encoder_deployment_task.md - Deployment checklist
- services/vision-encoder/README.md - Deployment guide
- Updated WARP.md, INFRASTRUCTURE.md, Jupyter Notebook
- Testing
- test-vision-encoder.sh - Smoke tests (6 tests)
- Unit tests for client, image search, routing
- Services: 17 total (added Vision Encoder + Qdrant)
- AI Models: 3 (qwen3:8b, OpenCLIP ViT-L/14, BAAI/bge-m3)
- GPU Services: 2 (Vision Encoder, Ollama)
- VRAM Usage: ~10 GB (concurrent)
Status: Production Ready ✅
42 lines
979 B
Docker
42 lines
979 B
Docker
# Vision Encoder Service - GPU-ready Docker image
|
|
# Base: PyTorch with CUDA support
|
|
|
|
FROM pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
|
|
|
|
# Set working directory
|
|
WORKDIR /app
|
|
|
|
# Install system dependencies
|
|
RUN apt-get update && apt-get install -y \
|
|
curl \
|
|
&& rm -rf /var/lib/apt/lists/*
|
|
|
|
# Copy requirements first for better caching
|
|
COPY requirements.txt .
|
|
|
|
# Install Python dependencies
|
|
RUN pip install --no-cache-dir -r requirements.txt
|
|
|
|
# Copy application code
|
|
COPY app/ ./app/
|
|
|
|
# Create cache directory for model weights
|
|
RUN mkdir -p /root/.cache/clip
|
|
|
|
# Set environment variables
|
|
ENV PYTHONUNBUFFERED=1
|
|
ENV DEVICE=cuda
|
|
ENV MODEL_NAME=ViT-L-14
|
|
ENV MODEL_PRETRAINED=openai
|
|
ENV PORT=8001
|
|
|
|
# Expose port
|
|
EXPOSE 8001
|
|
|
|
# Health check
|
|
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
|
|
CMD curl -f http://localhost:8001/health || exit 1
|
|
|
|
# Run the application
|
|
CMD ["python", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8001"]
|