- Vision Encoder Service (OpenCLIP ViT-L/14, GPU-accelerated)
  - FastAPI app with text/image embedding endpoints (768-dim)
  - Docker support with NVIDIA GPU runtime
  - Port 8001, health checks, model info API
- Qdrant Vector Database integration
  - Port 6333/6334 (HTTP/gRPC)
  - Image embeddings storage (768-dim, Cosine distance)
  - Auto collection creation
- Vision RAG implementation
  - VisionEncoderClient (Python client for API)
  - Image Search module (text-to-image, image-to-image)
  - Vision RAG routing in DAGI Router (mode: image_search)
  - VisionEncoderProvider integration
- Documentation (5000+ lines)
  - SYSTEM-INVENTORY.md - Complete system inventory
  - VISION-ENCODER-STATUS.md - Service status
  - VISION-RAG-IMPLEMENTATION.md - Implementation details
  - vision_encoder_deployment_task.md - Deployment checklist
  - services/vision-encoder/README.md - Deployment guide
  - Updated WARP.md, INFRASTRUCTURE.md, Jupyter Notebook
- Testing
  - test-vision-encoder.sh - Smoke tests (6 tests)
  - Unit tests for client, image search, routing
- Services: 17 total (added Vision Encoder + Qdrant)
- AI Models: 3 (qwen3:8b, OpenCLIP ViT-L/14, BAAI/bge-m3)
- GPU Services: 2 (Vision Encoder, Ollama)
- VRAM Usage: ~10 GB (concurrent)
Status: Production Ready ✅
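
The notes above mention a 768-dim collection with Cosine distance and a text-to-image search path. The following is a minimal, standard-library-only sketch of what those two pieces could look like. It is not the project's actual VisionEncoderClient or Image Search module. The host `localhost`, the collection name passed in, and the helper names `build_create_collection_request`, `cosine_similarity`, and `rank_images` are assumptions made for illustration. The request is only built, never sent.

```python
import json
import math
import urllib.request

EMBEDDING_DIM = 768                      # embedding size stated in the notes
QDRANT_HTTP = "http://localhost:6333"    # assumed host; HTTP port from the notes


def build_create_collection_request(name: str) -> urllib.request.Request:
    """Build (but do not send) a Qdrant REST request that creates a collection."""
    body = {"vectors": {"size": EMBEDDING_DIM, "distance": "Cosine"}}
    return urllib.request.Request(
        url=f"{QDRANT_HTTP}/collections/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_images(query_vec, image_vecs, top_k=3):
    """Return the top_k (image_id, score) pairs, highest similarity first.

    image_vecs is a hypothetical in-memory stand-in for the vector store:
    a dict mapping image id to embedding.
    """
    scored = [(image_id, cosine_similarity(query_vec, vec))
              for image_id, vec in image_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In a real deployment the ranking step would be delegated to Qdrant's own search rather than done in Python; the in-memory version only makes the Cosine scoring explicit.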
# FastAPI and server
fastapi==0.104.1
uvicorn[standard]==0.24.0
python-multipart==0.0.6
pydantic==2.5.0
pydantic-settings==2.1.0

# Model and ML
torch>=2.0.0
transformers>=4.35.0
Pillow>=10.0.0

# PDF processing
pdf2image>=1.16.3
PyMuPDF>=1.23.0 # Alternative PDF library

# Image processing
opencv-python>=4.8.0 # Optional, for advanced image processing

# Utilities
python-dotenv>=1.0.1

# Messaging
nats-py>=2.7.0

# Testing
pytest>=7.4.0
pytest-asyncio>=0.21.0
httpx>=0.25.0 # For TestClient and Ollama client
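
The requirements above mix exact pins (`==`) for the web stack with minimum bounds (`>=`) for the ML and tooling packages. As a rough illustration of that split, the sketch below separates the two groups using only the standard library. The helper name `classify_requirements` and the simplified pattern are assumptions for this example. The pattern handles only the `name[extras]==version` and `name>=version` shapes that appear in this file, not the full requirement specifier grammar.

```python
import re

# Simplified pattern: package name, optional [extras], then == or >= and a version.
REQ_LINE = re.compile(
    r"^([A-Za-z0-9_.\-]+)(\[[^\]]+\])?\s*(==|>=)\s*([0-9][0-9A-Za-z.]*)"
)


def classify_requirements(text):
    """Split requirement lines into exactly pinned and minimum-bound lists.

    Returns (pinned, floating), each a list of (name, version) tuples.
    Comments, blank lines, and lines the pattern does not match are skipped.
    """
    pinned, floating = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop inline and full-line comments
        if not line:
            continue
        match = REQ_LINE.match(line)
        if match is None:
            continue
        name, _extras, op, version = match.groups()
        (pinned if op == "==" else floating).append((name, version))
    return pinned, floating
```

Minimum-bound entries can resolve to different versions on different install dates, so a listing like this may help decide which packages to pin before a production build.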