feat: add RAG converter utilities and update integration guide
RAG Converter:
- Create app/utils/rag_converter.py with conversion functions
- parsed_doc_to_haystack_docs() - convert ParsedDocument to Haystack format
- parsed_chunks_to_haystack_docs() - convert ParsedChunk list to Haystack
- validate_parsed_doc_for_rag() - validate required fields before conversion
- Automatic metadata extraction (dao_id, doc_id, page, block_type)
- Preserve optional fields (bbox, section, reading_order)

Integration Guide:
- Update with ready-to-use converter functions
- Add validation examples
- Complete workflow examples
16
services/parser-service/app/utils/__init__.py
Normal file
@@ -0,0 +1,16 @@
"""
Utility functions for PARSER Service
"""

from app.utils.rag_converter import (
    parsed_doc_to_haystack_docs,
    parsed_chunks_to_haystack_docs,
    validate_parsed_doc_for_rag
)

__all__ = [
    "parsed_doc_to_haystack_docs",
    "parsed_chunks_to_haystack_docs",
    "validate_parsed_doc_for_rag"
]
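The body of app/utils/rag_converter.py is not part of this diff, so the following is only a rough sketch of the kind of conversion the commit message describes, not the actual implementation. The `ParsedChunk` dataclass, `chunks_to_docs_sketch()`, and `missing_required_fields()` are hypothetical stand-ins, and the output uses plain dicts that mirror the `content`/`meta` shape of a Haystack `Document` rather than importing Haystack itself.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ParsedChunk:
    """Hypothetical stand-in for the service's real ParsedChunk model."""
    text: str
    dao_id: str
    doc_id: str
    page: int
    block_type: str
    # Optional fields named in the commit message.
    bbox: Optional[List[float]] = None
    section: Optional[str] = None
    reading_order: Optional[int] = None


# Assumed split between required and optional fields, based on the commit message.
REQUIRED_FIELDS = ("text", "dao_id", "doc_id", "page", "block_type")
OPTIONAL_FIELDS = ("bbox", "section", "reading_order")


def missing_required_fields(chunk: ParsedChunk) -> List[str]:
    """Return the names of required fields that are None or empty strings."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = getattr(chunk, name)
        if value is None or value == "":
            missing.append(name)
    return missing


def chunks_to_docs_sketch(chunks: List[ParsedChunk]) -> List[Dict[str, Any]]:
    """Build Haystack-style dicts (content plus meta) from parsed chunks."""
    docs = []
    for chunk in chunks:
        # Required metadata is always copied.
        meta: Dict[str, Any] = {
            "dao_id": chunk.dao_id,
            "doc_id": chunk.doc_id,
            "page": chunk.page,
            "block_type": chunk.block_type,
        }
        # Optional metadata is preserved only when it is present.
        for name in OPTIONAL_FIELDS:
            value = getattr(chunk, name)
            if value is not None:
                meta[name] = value
        docs.append({"content": chunk.text, "meta": meta})
    return docs
```

In this sketch, validation returns a list of field names instead of raising, so a caller could decide whether to skip a chunk or reject the whole document; the real `validate_parsed_doc_for_rag()` may behave differently.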