microdao-daarion

Author	SHA1	Message	Date
Apple	4601c6fca8	feat: add Vision Encoder service + Vision RAG implementation - Vision Encoder Service (OpenCLIP ViT-L/14, GPU-accelerated) - FastAPI app with text/image embedding endpoints (768-dim) - Docker support with NVIDIA GPU runtime - Port 8001, health checks, model info API - Qdrant Vector Database integration - Port 6333/6334 (HTTP/gRPC) - Image embeddings storage (768-dim, Cosine distance) - Auto collection creation - Vision RAG implementation - VisionEncoderClient (Python client for API) - Image Search module (text-to-image, image-to-image) - Vision RAG routing in DAGI Router (mode: image_search) - VisionEncoderProvider integration - Documentation (5000+ lines) - SYSTEM-INVENTORY.md - Complete system inventory - VISION-ENCODER-STATUS.md - Service status - VISION-RAG-IMPLEMENTATION.md - Implementation details - vision_encoder_deployment_task.md - Deployment checklist - services/vision-encoder/README.md - Deployment guide - Updated WARP.md, INFRASTRUCTURE.md, Jupyter Notebook - Testing - test-vision-encoder.sh - Smoke tests (6 tests) - Unit tests for client, image search, routing - Services: 17 total (added Vision Encoder + Qdrant) - AI Models: 3 (qwen3:8b, OpenCLIP ViT-L/14, BAAI/bge-m3) - GPU Services: 2 (Vision Encoder, Ollama) - VRAM Usage: ~10 GB (concurrent) Status: Production Ready ✅	2025-11-17 05:24:36 -08:00
Apple	00f9102e50	feat: add Ollama runtime support and RAG implementation plan Ollama Runtime: - Add ollama_client.py for Ollama API integration - Support for dots-ocr model via Ollama - Add OLLAMA_BASE_URL configuration - Update inference.py to support Ollama runtime (RUNTIME_TYPE=ollama) - Update endpoints to handle async Ollama calls - Alternative to local transformers model RAG Implementation Plan: - Create TODO-RAG.md with detailed Haystack integration plan - Document Store setup (pgvector) - Embedding model selection - Ingest pipeline (PARSER → RAG) - Query pipeline (RAG → LLM) - Integration with DAGI Router - Bot commands (/upload_doc, /ask_doc) - Testing strategy Now supports three runtime modes: 1. Local transformers (RUNTIME_TYPE=local) 2. Ollama (RUNTIME_TYPE=ollama) 3. Dummy (USE_DUMMY_PARSER=true)	2025-11-16 02:56:36 -08:00
Apple	2a353040f6	feat: add tests and integrate dots.ocr model G.2.5 - Tests: - Add pytest test suite with fixtures - test_preprocessing.py - PDF/image loading, normalization, validation - test_postprocessing.py - chunks, QA pairs, markdown generation - test_inference.py - dummy parser and inference functions - test_api.py - API endpoint tests - Add pytest.ini configuration G.1.3 - dots.ocr Integration: - Update model_loader.py with real model loading code - Support for AutoModelForVision2Seq and AutoProcessor - Device handling (CUDA/CPU/MPS) with fallback - Error handling with dummy fallback option - Update inference.py with real model inference - Process images through model - Generate and decode outputs - Parse model output to blocks - Add model_output_parser.py - Parse JSON or plain text model output - Convert to structured blocks - Layout detection support (placeholder) Dependencies: - Add pytest, pytest-asyncio, httpx for testing	2025-11-15 13:25:01 -08:00
Apple	5e7cfc019e	feat: create PARSER service skeleton with FastAPI - Create parser-service/ with full structure - Add FastAPI app with endpoints (/parse, /parse_qa, /parse_markdown, /parse_chunks) - Add Pydantic schemas (ParsedDocument, ParsedBlock, ParsedChunk, etc.) - Add runtime module with model_loader and inference (with dummy parser) - Add configuration, Dockerfile, requirements.txt - Update TODO-PARSER-RAG.md with completed tasks - Ready for dots.ocr model integration	2025-11-15 13:15:08 -08:00

4 Commits