Files

Apple 5290287058 feat: implement TTS, Document processing, and Memory Service /facts API

- TTS: xtts-v2 integration with voice cloning support
- Document: docling integration for PDF/DOCX/PPTX processing
- Memory Service: added /facts/upsert, /facts/{key}, /facts endpoints
- Added required dependencies (TTS, docling)

2026-01-17 08:16:37 -08:00

Dockerfile

main.py

README.md

README.md

Chandra Document Processing Service

Wrapper service around the Datalab Chandra OCR model, used for document and table processing.

Features

Document OCR with structure preservation
Table extraction with formatting
Handwriting recognition
Form processing
Output formats: Markdown, HTML, JSON

API Endpoints

Health Check

GET /health

Process Document

POST /process

Request:

file: Uploaded file (PDF or image)
doc_url: URL of the document
doc_base64: Base64-encoded document
output_format: markdown, html, or json
accurate_mode: true or false
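
A minimal sketch of how a client might assemble these parameters. It assumes the service accepts them as form fields, that exactly one document source is sent per request, and that booleans travel as the strings "true" and "false"; none of this is confirmed by the README. The helper names build_process_fields and post_process are hypothetical, and the markdown default is a choice made by this sketch, not a documented service default.

```python
import base64
import urllib.parse
import urllib.request

ALLOWED_FORMATS = {"markdown", "html", "json"}


def build_process_fields(doc_bytes=None, doc_url=None,
                         output_format="markdown", accurate_mode=False):
    """Build the form fields for POST /process (hypothetical helper)."""
    # Assumption: exactly one document source is supplied per request.
    if (doc_bytes is None) == (doc_url is None):
        raise ValueError("provide exactly one of doc_bytes or doc_url")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")
    fields = {
        "output_format": output_format,
        # Assumption: booleans are sent as the strings "true"/"false".
        "accurate_mode": "true" if accurate_mode else "false",
    }
    if doc_url is not None:
        fields["doc_url"] = doc_url
    else:
        fields["doc_base64"] = base64.b64encode(doc_bytes).decode("ascii")
    return fields


def post_process(base_url, fields, timeout=60):
    """Send the fields as a form-encoded POST and return the raw body."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    request = urllib.request.Request(
        f"{base_url}/process", data=data, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping field construction separate from the network call lets the payload be checked without a running service. Uploading through the file parameter would need a multipart body, which this sketch leaves out.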

Response:

{
  "success": true,
  "output_format": "markdown",
  "result": {
    "markdown": "...",
    "metadata": {...}
  }
}
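
One way a caller might unpack this response, sketched under the assumption that the key inside "result" always matches "output_format". Only the markdown case is shown above, so the html and json keys are a guess. The function name extract_output is hypothetical.

```python
import json


def extract_output(response):
    """Return (content, metadata) from a parsed /process response."""
    if not response.get("success"):
        raise RuntimeError("document processing was not successful")
    output_format = response["output_format"]
    result = response["result"]
    # Assumption: the payload is stored under a key named after the format.
    if output_format not in result:
        raise KeyError(f"result has no '{output_format}' entry")
    return result[output_format], result.get("metadata", {})


# Usage with a response shaped like the example above (sample values only).
sample = json.loads(
    '{"success": true, "output_format": "markdown",'
    ' "result": {"markdown": "# Title", "metadata": {}}}'
)
content, metadata = extract_output(sample)
```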

List Models

GET /models

Configuration

Environment variables:

CHANDRA_API_URL: URL to Chandra inference service (default: http://chandra-inference:8000)
CHANDRA_LICENSE_KEY: Datalab license key (if required)
CHANDRA_MODEL: Model name (chandra-small or chandra)
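
A sketch of how these variables could be read at startup. The ChandraSettings class and load_settings function are hypothetical and not taken from main.py. The README gives a default only for CHANDRA_API_URL, so the other two values stay unset (None) when absent.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_API_URL = "http://chandra-inference:8000"
KNOWN_MODELS = ("chandra-small", "chandra")


@dataclass(frozen=True)
class ChandraSettings:
    api_url: str
    license_key: Optional[str]
    model: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> ChandraSettings:
    """Read the Chandra settings from an environment-like mapping."""
    model = env.get("CHANDRA_MODEL")
    # Reject names other than the two listed above.
    if model is not None and model not in KNOWN_MODELS:
        raise ValueError(f"unknown CHANDRA_MODEL: {model}")
    return ChandraSettings(
        api_url=env.get("CHANDRA_API_URL", DEFAULT_API_URL),
        # Treat an empty license key the same as a missing one.
        license_key=env.get("CHANDRA_LICENSE_KEY") or None,
        model=model,
    )
```

Accepting any mapping instead of reading os.environ directly makes the loader easy to exercise with a plain dict.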

Integration

This service integrates with:

Router (OCR_URL and CHANDRA_URL)
Gateway (doc_service.py)
Memory Service (for storing processed documents)