# Chandra Document Processing Service
A wrapper service around the Datalab Chandra OCR model for document and table processing.
## Features
- Document OCR with structure preservation
- Table extraction with formatting
- Handwriting recognition
- Form processing
- Output formats: Markdown, HTML, JSON
## API Endpoints
### Health Check
```
GET /health
```
### Process Document
```
POST /process
```
**Request:**
- `file`: uploaded file (PDF or image)
- `doc_url`: URL of the document
- `doc_base64`: Base64-encoded document
- `output_format`: `markdown`, `html`, or `json`
- `accurate_mode`: `true` or `false`
**Response:**
```json
{
  "success": true,
  "output_format": "markdown",
  "result": {
    "markdown": "...",
    "metadata": {...}
  }
}
```
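The request fields and response shape above can be exercised from a small client. The following is a minimal sketch, not the service's own client code. The helper names (`build_process_fields`, `extract_result`, `process_document`) are hypothetical. It assumes that the endpoint accepts URL-encoded form fields and that the key under `result` matches `output_format`, as it does in the markdown example. For simplicity the helper sends either `doc_url` or `doc_base64`, and leaves multipart `file` uploads out.

```python
import base64
import json
import urllib.parse
import urllib.request

ALLOWED_FORMATS = ("markdown", "html", "json")


def build_process_fields(doc_url=None, doc_bytes=None,
                         output_format="markdown", accurate_mode=False):
    """Build the form fields for POST /process from a URL or raw bytes."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    # This helper's own restriction, not a documented rule of the service.
    if (doc_url is None) == (doc_bytes is None):
        raise ValueError("provide exactly one of doc_url or doc_bytes")
    fields = {
        "output_format": output_format,
        "accurate_mode": "true" if accurate_mode else "false",
    }
    if doc_url is not None:
        fields["doc_url"] = doc_url
    else:
        fields["doc_base64"] = base64.b64encode(doc_bytes).decode("ascii")
    return fields


def extract_result(response):
    """Return the converted document from a parsed /process response."""
    if not response.get("success"):
        raise RuntimeError("Chandra service reported a failure")
    # Assumption: the result key matches output_format, as in the example.
    return response["result"][response["output_format"]]


def process_document(base_url, fields, timeout=120):
    """POST the fields to /process and return the parsed JSON response."""
    # Assumption: the endpoint accepts URL-encoded form fields.
    data = urllib.parse.urlencode(fields).encode("ascii")
    request = urllib.request.Request(
        f"{base_url}/process", data=data, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)
```

A call would then look like `extract_result(process_document(base_url, build_process_fields(doc_url="https://example.com/report.pdf")))`, where `base_url` is wherever this service is deployed.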
### List Models
```
GET /models
```
## Configuration
Environment variables:
- `CHANDRA_API_URL`: URL to Chandra inference service (default: `http://chandra-inference:8000`)
- `CHANDRA_LICENSE_KEY`: Datalab license key (if required)
- `CHANDRA_MODEL`: Model name (`chandra-small` or `chandra`)
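As an illustration of how these variables might be read at startup, here is a minimal sketch. The `ChandraConfig` class and `load_config` function are hypothetical names, not taken from the service code. The default for `CHANDRA_API_URL` comes from the list above. The fallback of `chandra` for `CHANDRA_MODEL` is an assumption, since no default is documented.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

ALLOWED_MODELS = ("chandra-small", "chandra")


@dataclass(frozen=True)
class ChandraConfig:
    api_url: str
    license_key: Optional[str]
    model: str


def load_config(env: Mapping[str, str] = os.environ) -> ChandraConfig:
    """Read the Chandra settings from a mapping of environment variables."""
    # Assumption: fall back to "chandra" when CHANDRA_MODEL is unset.
    model = env.get("CHANDRA_MODEL", "chandra")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown CHANDRA_MODEL: {model!r}")
    return ChandraConfig(
        api_url=env.get("CHANDRA_API_URL", "http://chandra-inference:8000"),
        # An empty or missing license key is treated as "not set".
        license_key=env.get("CHANDRA_LICENSE_KEY") or None,
        model=model,
    )
```

Passing the environment as a mapping keeps the loader easy to test: `load_config({})` returns the defaults without touching the real process environment.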
## Integration
This service integrates with:
- Router (`OCR_URL` and `CHANDRA_URL`)
- Gateway (`doc_service.py`)
- Memory Service (for storing processed documents)