feat: add RAG converter utilities and update integration guide

RAG Converter:
- Create app/utils/rag_converter.py with conversion functions
- parsed_doc_to_haystack_docs() - convert ParsedDocument to Haystack format
- parsed_chunks_to_haystack_docs() - convert ParsedChunk list to Haystack
- validate_parsed_doc_for_rag() - validate required fields before conversion
- Automatic metadata extraction (dao_id, doc_id, page, block_type)
- Preserve optional fields (bbox, section, reading_order)

Integration Guide:
- Update with ready-to-use converter functions
- Add validation examples
- Complete workflow examples

@@ -174,6 +174,23 @@ async def route(request: RouterRequest):
### 1. Converting ParsedDocument → Haystack Documents

**Ready-made function:** `app/utils/rag_converter.py`

```python
from app.utils.rag_converter import parsed_doc_to_haystack_docs, validate_parsed_doc_for_rag

# Fragment: assumes it runs inside a function with `parsed_doc` and `logger` in scope

# Validate before converting
is_valid, errors = validate_parsed_doc_for_rag(parsed_doc)
if not is_valid:
    logger.error(f"Document validation failed: {errors}")
    return

# Convert
haystack_docs = parsed_doc_to_haystack_docs(parsed_doc)
```
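
For orientation, here is a minimal sketch of what validation and metadata extraction of this kind might look like. It is not the code in `app/utils/rag_converter.py`: `DocSketch`, `ChunkSketch`, `validate_for_rag_sketch`, and `chunk_meta_sketch` are hypothetical stand-ins, and the choice of which fields are required is an assumption. Only the metadata key names (`dao_id`, `doc_id`, `page`, `block_type`, `bbox`, `section`, `reading_order`) and the `(is_valid, errors)` return shape come from the commit.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


# Hypothetical stand-ins for the project's ParsedChunk / ParsedDocument models.
@dataclass
class ChunkSketch:
    text: str
    page: Optional[int] = None
    block_type: Optional[str] = None
    bbox: Optional[List[float]] = None
    section: Optional[str] = None
    reading_order: Optional[int] = None


@dataclass
class DocSketch:
    doc_id: Optional[str]
    dao_id: Optional[str]
    chunks: List[ChunkSketch] = field(default_factory=list)


def validate_for_rag_sketch(doc: DocSketch) -> Tuple[bool, List[str]]:
    """Collect every problem instead of stopping at the first one."""
    errors: List[str] = []
    # Assumption: both identifiers and at least one chunk are required.
    if not doc.doc_id:
        errors.append("doc_id is required")
    if not doc.dao_id:
        errors.append("dao_id is required")
    if not doc.chunks:
        errors.append("document has no chunks")
    for i, chunk in enumerate(doc.chunks):
        if not chunk.text.strip():
            errors.append(f"chunk {i} has empty text")
    return (not errors, errors)


def chunk_meta_sketch(doc: DocSketch, chunk: ChunkSketch) -> Dict[str, Any]:
    """Build chunk metadata: core keys always, optional keys only when set."""
    meta: Dict[str, Any] = {
        "dao_id": doc.dao_id,
        "doc_id": doc.doc_id,
        "page": chunk.page,
        "block_type": chunk.block_type,
    }
    for key in ("bbox", "section", "reading_order"):
        value = getattr(chunk, key)
        if value is not None:
            meta[key] = value
    return meta
```

Returning the full list of errors, rather than raising on the first one, lets the caller log everything wrong with a document in a single pass, which matches how the snippet above logs `errors`.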

**Or manually:**

```python
from haystack.schema import Document