microdao-daarion

Author	SHA1	Message	Date
Apple	be22752590	feat: integrate dots.ocr native prompt modes and 2-stage qa_pairs pipeline Prompt Modes Integration: - Create local_runtime.py with DOTS_PROMPT_MAP - Map OutputMode to native dots.ocr prompt modes (prompt_layout_all_en, prompt_ocr, etc.) - Support dict_promptmode_to_prompt from dots.ocr with fallback prompts - Add layout_only and region modes to OutputMode enum 2-Stage Q&A Pipeline: - Create qa_builder.py for 2-stage qa_pairs generation - Stage 1: PARSER (dots.ocr) → raw JSON via prompt_layout_all_en - Stage 2: LLM (DAGI Router) → Q&A pairs via mode=qa_build - Update endpoints.py to use 2-stage pipeline for qa_pairs mode - Add ROUTER_BASE_URL and ROUTER_TIMEOUT to config Updates: - Update inference.py to use local_runtime with native prompts - Update ollama_client.py to use same prompt map - Add PROMPT_MODES.md documentation	2025-11-16 04:24:03 -08:00
Apple	49272b66e6	feat: add RAG converter utilities and update integration guide RAG Converter: - Create app/utils/rag_converter.py with conversion functions - parsed_doc_to_haystack_docs() - convert ParsedDocument to Haystack format - parsed_chunks_to_haystack_docs() - convert ParsedChunk list to Haystack - validate_parsed_doc_for_rag() - validate required fields before conversion - Automatic metadata extraction (dao_id, doc_id, page, block_type) - Preserve optional fields (bbox, section, reading_order) Integration Guide: - Update with ready-to-use converter functions - Add validation examples - Complete workflow examples	2025-11-16 03:03:20 -08:00
Apple	7251e519d6	feat: enhance model output parser and add integration guide Model Output Parser: - Support multiple dots.ocr output formats (JSON, structured text, plain text) - Normalize all formats to standard ParsedBlock structure - Handle JSON with blocks/pages arrays - Parse markdown-like structured text - Fallback to plain text parsing - Better error handling and logging Schemas: - Document must-have fields for RAG (doc_id, pages, metadata.dao_id) - ParsedChunk must-have fields (text, metadata.dao_id, metadata.doc_id) - Add detailed field descriptions for RAG integration Integration Guide: - Create INTEGRATION.md with complete integration guide - Document dots.ocr output formats - Show ParsedDocument → Haystack Documents conversion - Provide DAGI Router integration examples - RAG pipeline integration with filters - Complete workflow examples - RBAC integration recommendations	2025-11-16 03:02:42 -08:00
Apple	ca05c91799	feat: complete dots.ocr integration with deployment setup Model Loader: - Update model_loader.py with complete dots.ocr loading code - Proper device detection (CUDA/CPU/MPS) with fallback - Memory optimization (low_cpu_mem_usage) - Better error handling and logging - Support for local model paths and HF Hub Docker: - Multi-stage Dockerfile (CPU/CUDA builds) - docker-compose.yml for parser-service - .dockerignore for clean builds - Model cache volume for persistence Configuration: - Support DOTS_OCR_MODEL_ID and DEVICE env vars (backward compatible) - Better defaults and environment variable handling Deployment: - Add DEPLOYMENT.md with detailed instructions - Local deployment (venv) - Docker Compose deployment - Ollama runtime setup - Troubleshooting guide Integration: - Add parser-service to main docker-compose.yml - Configure volumes and networks - Health checks and dependencies	2025-11-16 03:00:01 -08:00
Apple	8713810d72	fix: remove async call from sync function	2025-11-16 02:56:45 -08:00
Apple	00f9102e50	feat: add Ollama runtime support and RAG implementation plan Ollama Runtime: - Add ollama_client.py for Ollama API integration - Support for dots-ocr model via Ollama - Add OLLAMA_BASE_URL configuration - Update inference.py to support Ollama runtime (RUNTIME_TYPE=ollama) - Update endpoints to handle async Ollama calls - Alternative to local transformers model RAG Implementation Plan: - Create TODO-RAG.md with detailed Haystack integration plan - Document Store setup (pgvector) - Embedding model selection - Ingest pipeline (PARSER → RAG) - Query pipeline (RAG → LLM) - Integration with DAGI Router - Bot commands (/upload_doc, /ask_doc) - Testing strategy Now supports three runtime modes: 1. Local transformers (RUNTIME_TYPE=local) 2. Ollama (RUNTIME_TYPE=ollama) 3. Dummy (USE_DUMMY_PARSER=true)	2025-11-16 02:56:36 -08:00
Apple	d56ff3493d	fix: remove duplicate except blocks in model_loader	2025-11-15 13:25:23 -08:00
Apple	2a353040f6	feat: add tests and integrate dots.ocr model G.2.5 - Tests: - Add pytest test suite with fixtures - test_preprocessing.py - PDF/image loading, normalization, validation - test_postprocessing.py - chunks, QA pairs, markdown generation - test_inference.py - dummy parser and inference functions - test_api.py - API endpoint tests - Add pytest.ini configuration G.1.3 - dots.ocr Integration: - Update model_loader.py with real model loading code - Support for AutoModelForVision2Seq and AutoProcessor - Device handling (CUDA/CPU/MPS) with fallback - Error handling with dummy fallback option - Update inference.py with real model inference - Process images through model - Generate and decode outputs - Parse model output to blocks - Add model_output_parser.py - Parse JSON or plain text model output - Convert to structured blocks - Layout detection support (placeholder) Dependencies: - Add pytest, pytest-asyncio, httpx for testing	2025-11-15 13:25:01 -08:00
Apple	4befecc425	feat: implement PDF/image preprocessing, post-processing, and dots.ocr integration prep G.2.3 - PDF/Image Support: - Add preprocessing.py with PDF→images conversion (pdf2image) - Add image loading and normalization - Add file type detection and validation - Support for PDF, PNG, JPEG, WebP, TIFF G.2.4 - Pre/Post-processing: - Add postprocessing.py with structured output builders - build_chunks() - semantic chunks for RAG - build_qa_pairs() - Q&A extraction - build_markdown() - Markdown conversion - Text normalization and chunking logic G.1.3 - dots.ocr Integration Prep: - Update model_loader.py with proper error handling - Add USE_DUMMY_PARSER and ALLOW_DUMMY_FALLBACK flags - Update inference.py to work with images list - Add parse_document_from_images() function - Ready for actual model integration Configuration: - Add PDF_DPI, IMAGE_MAX_SIZE, PAGE_RANGE settings - Add parser mode flags (USE_DUMMY_PARSER, ALLOW_DUMMY_FALLBACK) API Updates: - Update endpoints to use new preprocessing pipeline - Integrate post-processing for all output modes - Remove temp file handling (work directly with bytes)	2025-11-15 13:19:07 -08:00
Apple	0f6cfe046f	fix: add missing __init__.py files for parser-service modules	2025-11-15 13:15:16 -08:00
Apple	5e7cfc019e	feat: create PARSER service skeleton with FastAPI - Create parser-service/ with full structure - Add FastAPI app with endpoints (/parse, /parse_qa, /parse_markdown, /parse_chunks) - Add Pydantic schemas (ParsedDocument, ParsedBlock, ParsedChunk, etc.) - Add runtime module with model_loader and inference (with dummy parser) - Add configuration, Dockerfile, requirements.txt - Update TODO-PARSER-RAG.md with completed tasks - Ready for dots.ocr model integration	2025-11-15 13:15:08 -08:00

11 Commits