feat: integrate dots.ocr native prompt modes and 2-stage qa_pairs pipeline
Prompt Modes Integration:
- Create local_runtime.py with DOTS_PROMPT_MAP
- Map OutputMode to native dots.ocr prompt modes (prompt_layout_all_en, prompt_ocr, etc.)
- Support dict_promptmode_to_prompt from dots.ocr with fallback prompts
- Add layout_only and region modes to OutputMode enum

2-Stage Q&A Pipeline:
- Create qa_builder.py for 2-stage qa_pairs generation
- Stage 1: PARSER (dots.ocr) → raw JSON via prompt_layout_all_en
- Stage 2: LLM (DAGI Router) → Q&A pairs via mode=qa_build
- Update endpoints.py to use the 2-stage pipeline for qa_pairs mode
- Add ROUTER_BASE_URL and ROUTER_TIMEOUT to config

Updates:
- Update inference.py to use local_runtime with native prompts
- Update ollama_client.py to use the same prompt map
- Add PROMPT_MODES.md documentation
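The commit message describes a DOTS_PROMPT_MAP in local_runtime.py that maps OutputMode values onto dots.ocr's native prompt modes, with dict_promptmode_to_prompt as the primary source and local fallback prompts. A minimal sketch of how that might look (the import path, the fallback prompt strings, and the mode names for layout_only/region are illustrative assumptions; only prompt_layout_all_en and prompt_ocr are named in the commit):

```python
from enum import Enum

# Output modes supported by the service; layout_only and region are
# the two modes added in this commit.
class OutputMode(str, Enum):
    RAW_JSON = "raw_json"
    MARKDOWN = "markdown"
    QA_PAIRS = "qa_pairs"
    CHUNKS = "chunks"
    LAYOUT_ONLY = "layout_only"
    REGION = "region"

# dict_promptmode_to_prompt ships with dots.ocr; fall back to local
# prompt strings when the package is not importable.
try:
    from dots_ocr.utils.prompts import dict_promptmode_to_prompt  # assumed import path
except ImportError:
    dict_promptmode_to_prompt = {
        "prompt_layout_all_en": "Parse the full layout and text of the page.",  # illustrative fallback
        "prompt_ocr": "Extract all text from the page.",  # illustrative fallback
    }

# Map each OutputMode to a native dots.ocr prompt mode. qa_pairs uses
# the layout prompt because it feeds stage 1 of the 2-stage pipeline;
# the layout_only/region target names below are assumptions.
DOTS_PROMPT_MAP = {
    OutputMode.RAW_JSON: "prompt_layout_all_en",
    OutputMode.MARKDOWN: "prompt_layout_all_en",
    OutputMode.QA_PAIRS: "prompt_layout_all_en",
    OutputMode.CHUNKS: "prompt_layout_all_en",
    OutputMode.LAYOUT_ONLY: "prompt_layout_only_en",  # assumed mode name
    OutputMode.REGION: "prompt_grounding_ocr",        # assumed mode name
}

def resolve_prompt(mode: OutputMode) -> str:
    """Return the prompt text for a given output mode, falling back to
    the full-layout prompt when a mode key is missing."""
    key = DOTS_PROMPT_MAP[mode]
    return dict_promptmode_to_prompt.get(key, dict_promptmode_to_prompt["prompt_layout_all_en"])
```

The fallback in resolve_prompt keeps the service usable even when dots.ocr is absent or a mapped key is unknown, which matches the "with fallback prompts" note in the commit message.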
@@ -115,7 +115,7 @@ class ParsedDocument(BaseModel):
 class ParseRequest(BaseModel):
     """Parse request"""
     doc_url: Optional[str] = Field(None, description="Document URL")
-    output_mode: Literal["raw_json", "markdown", "qa_pairs", "chunks"] = Field(
+    output_mode: Literal["raw_json", "markdown", "qa_pairs", "chunks", "layout_only", "region"] = Field(
         "raw_json", description="Output mode"
     )
     dao_id: Optional[str] = Field(None, description="DAO ID")
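Beyond this schema change, the commit wires qa_pairs through a 2-stage pipeline: stage 1 runs the dots.ocr parser with prompt_layout_all_en to get raw layout JSON, and stage 2 sends that JSON to the DAGI Router with mode=qa_build to get Q&A pairs. A sketch of how qa_builder.py might chain the stages; the function names, router endpoint path, payload keys, and response shape are assumptions (only the stage order, mode=qa_build, and the ROUTER_BASE_URL/ROUTER_TIMEOUT config keys come from the commit):

```python
import json
import urllib.request
from typing import Callable

# Placeholder values; the commit stores the real ones in config.
ROUTER_BASE_URL = "http://localhost:8080"
ROUTER_TIMEOUT = 60  # seconds

def call_router_qa_build(raw_layout: dict, base_url: str = ROUTER_BASE_URL,
                         timeout: int = ROUTER_TIMEOUT) -> list[dict]:
    """Stage 2: POST the raw layout JSON to the DAGI Router with
    mode=qa_build. Endpoint path and payload keys are assumptions."""
    payload = json.dumps({"mode": "qa_build", "layout": raw_layout}).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/generate", data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["qa_pairs"]

def build_qa_pairs(doc_url: str,
                   parse_fn: Callable[[str, str], dict],
                   qa_fn: Callable[[dict], list[dict]] = call_router_qa_build) -> list[dict]:
    """Run the 2-stage qa_pairs pipeline: PARSER (dots.ocr) -> LLM (router).

    parse_fn(doc_url, prompt_mode) is the dots.ocr parsing entry point
    (injected here so the pipeline stays testable)."""
    # Stage 1: dots.ocr parses the document with the native layout prompt.
    raw_layout = parse_fn(doc_url, "prompt_layout_all_en")
    # Stage 2: the router LLM converts the layout JSON into Q&A pairs.
    return qa_fn(raw_layout)
```

Injecting the stage callables keeps endpoints.py free to pass whichever parser backend (local runtime or ollama_client) is configured, and lets tests stub out both network hops.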
||||