feat: implement RAG Service MVP with PARSER + Memory integration
RAG Service Implementation:
- Create rag-service/ with full structure (config, document_store, embedding, pipelines)
- Document Store: PostgreSQL + pgvector via Haystack
- Embedding: BAAI/bge-m3 (multilingual, 1024 dim)
- Ingest Pipeline: Convert ParsedDocument to Haystack Documents, embed, index
- Query Pipeline: Retrieve documents, generate answers via DAGI Router
- FastAPI endpoints: /ingest, /query, /health

Tests:
- Unit tests for ingest and query pipelines
- E2E test with example parsed JSON
- Test fixtures with real PARSER output example

Router Integration:
- Add mode='rag_query' routing rule in router-config.yml
- Priority 7, uses local_qwen3_8b for RAG queries

Docker:
- Add rag-service to docker-compose.yml
- Configure dependencies (router, city-db)
- Add model cache volume

Documentation:
- Complete README with API examples
- Integration guides for PARSER and Router
services/rag-service/app/document_store.py
@@ -0,0 +1,57 @@
"""
Document Store for RAG Service
Uses PostgreSQL + pgvector via Haystack
"""

import logging
from typing import Optional

from haystack.document_stores import PGVectorDocumentStore

from app.core.config import settings

logger = logging.getLogger(__name__)

# Global document store instance
_document_store: Optional[PGVectorDocumentStore] = None


def get_document_store() -> PGVectorDocumentStore:
    """
    Get or create PGVectorDocumentStore instance

    Returns:
        PGVectorDocumentStore configured with pgvector
    """
    global _document_store

    if _document_store is not None:
        return _document_store

    logger.info(f"Initializing PGVectorDocumentStore: table={settings.RAG_TABLE_NAME}")
    logger.info(f"Connection: {settings.PG_DSN.rsplit('@', 1)[-1] if '@' in settings.PG_DSN else 'hidden'}")

    try:
        _document_store = PGVectorDocumentStore(
            connection_string=settings.PG_DSN,
            embedding_dim=settings.EMBED_DIM,
            table_name=settings.RAG_TABLE_NAME,
            search_strategy=settings.SEARCH_STRATEGY,
            # Additional options
            recreate_table=False,  # Don't drop existing table
            similarity="cosine",  # Cosine similarity for embeddings
        )

        logger.info("PGVectorDocumentStore initialized successfully")
        return _document_store

    except Exception as e:
        logger.error(f"Failed to initialize DocumentStore: {e}", exc_info=True)
        raise RuntimeError(f"DocumentStore initialization failed: {e}") from e


def reset_document_store() -> None:
    """Reset global document store instance (for testing)"""
    global _document_store
    _document_store = None
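
The module above caches one store per process and exposes a reset hook for tests. The sketch below isolates that get-or-create pattern so it can run without PostgreSQL or Haystack. `FakeStore`, `get_store`, and `reset_store` are hypothetical stand-ins introduced only for illustration; they are not part of this commit.

```python
from typing import Optional


class FakeStore:
    """Hypothetical stand-in for the real document store; counts constructions."""

    created = 0

    def __init__(self) -> None:
        FakeStore.created += 1


# Module-level cache, mirroring _document_store in document_store.py
_store: Optional[FakeStore] = None


def get_store() -> FakeStore:
    """Create the store on first use, then return the cached instance."""
    global _store
    if _store is None:
        _store = FakeStore()
    return _store


def reset_store() -> None:
    """Drop the cached instance so the next get_store() builds a fresh one."""
    global _store
    _store = None


first = get_store()
second = get_store()   # served from the cache, no new construction
reset_store()
third = get_store()    # built again after the reset
```

A test suite would typically call the reset function in its teardown so that each test starts without a cached store. Note that this pattern is not guarded by a lock, so two threads racing on the first call could each construct a store.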
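
document_store.py reads four values from `app.core.config.settings`: `PG_DSN`, `EMBED_DIM`, `RAG_TABLE_NAME`, and `SEARCH_STRATEGY`. The config module itself is not shown in this diff, so the following is only a minimal sketch of what such a settings object might look like, using the standard library. The `Settings` dataclass, `load_settings`, `mask_dsn`, and every default except the 1024 embedding dimension (taken from the commit message) are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; field names match those used in document_store.py."""

    PG_DSN: str
    EMBED_DIM: int
    RAG_TABLE_NAME: str
    SEARCH_STRATEGY: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, falling back to placeholder defaults."""
    env = os.environ if env is None else env
    return Settings(
        # Placeholder DSN, not a real credential
        PG_DSN=env.get("PG_DSN", "postgresql://user:password@localhost:5432/rag"),
        # 1024 matches the BAAI/bge-m3 dimension named in the commit message
        EMBED_DIM=int(env.get("EMBED_DIM", "1024")),
        # Placeholder table name
        RAG_TABLE_NAME=env.get("RAG_TABLE_NAME", "rag_documents"),
        # Placeholder value; use whatever the installed document store accepts
        SEARCH_STRATEGY=env.get("SEARCH_STRATEGY", "exact_nearest_neighbor"),
    )


def mask_dsn(dsn: str) -> str:
    """Return only the part after the last '@' so credentials stay out of logs."""
    return dsn.rsplit("@", 1)[-1] if "@" in dsn else "hidden"


settings = load_settings({"PG_DSN": "postgresql://u:p@ss@db:5432/rag"})
masked = mask_dsn(settings.PG_DSN)
```

Splitting on the last `@` rather than the first keeps a password that itself contains `@` from leaking into the log line.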