# Chandra Document Processing Service

A wrapper service around the Datalab Chandra OCR model for document and table processing.

## Features

- Document OCR with structure preservation
- Table extraction with formatting
- Handwriting recognition
- Form processing
- Output formats: Markdown, HTML, JSON

## API Endpoints

### Health Check
```
GET /health
```
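
A minimal sketch of a liveness probe against this endpoint, using only the Python standard library. The response body of `/health` is not documented here, so the sketch assumes that an HTTP 200 status means the service is up; the function name `is_healthy` is hypothetical.

```python
import urllib.error
import urllib.request


def is_healthy(base_url, timeout=5.0):
    """Return True if GET {base_url}/health answers with HTTP 200.

    Assumption: a 200 status is enough to treat the service as healthy.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx status.
        return False
```
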

### Process Document
```
POST /process
```

**Request:**
- `file`: Uploaded file (PDF or image)
- `doc_url`: URL of the document
- `doc_base64`: Base64-encoded document
- `output_format`: `markdown`, `html`, or `json`
- `accurate_mode`: `true` or `false`

**Response:**
```json
{
  "success": true,
  "output_format": "markdown",
  "result": {
    "markdown": "...",
    "metadata": {...}
  }
}
```
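
A minimal client-side sketch of the request fields and response handling described above. It rests on several assumptions not stated in this README: that parameters are sent as form fields, that exactly one document source is supplied per request, that booleans travel as the strings `true`/`false`, and that the result is keyed by the output format name as in the sample response. The helper names `build_process_fields` and `extract_output` are hypothetical.

```python
import base64
import json

VALID_FORMATS = ("markdown", "html", "json")


def build_process_fields(doc_url=None, doc_bytes=None,
                         output_format="markdown", accurate_mode=False):
    """Build the non-file form fields for POST /process.

    Assumption: exactly one document source is supplied per request.
    """
    if output_format not in VALID_FORMATS:
        raise ValueError(f"output_format must be one of {VALID_FORMATS}")
    if (doc_url is None) == (doc_bytes is None):
        raise ValueError("supply exactly one of doc_url or doc_bytes")

    fields = {
        "output_format": output_format,
        # Assumption: booleans are sent as the strings "true"/"false".
        "accurate_mode": "true" if accurate_mode else "false",
    }
    if doc_url is not None:
        fields["doc_url"] = doc_url
    else:
        fields["doc_base64"] = base64.b64encode(doc_bytes).decode("ascii")
    return fields


def extract_output(response_body):
    """Return the processed document from a /process JSON response."""
    payload = json.loads(response_body)
    if not payload.get("success"):
        raise RuntimeError("processing failed")
    # Assumption: the result is stored under the output format name.
    return payload["result"][payload["output_format"]]
```

The returned dictionary could then be posted with any HTTP client, for example as the form data of a `POST` to `/process`, and the response text passed to `extract_output`.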

### List Models
```
GET /models
```

## Configuration

Environment variables:
- `CHANDRA_API_URL`: URL to Chandra inference service (default: `http://chandra-inference:8000`)
- `CHANDRA_LICENSE_KEY`: Datalab license key (if required)
- `CHANDRA_MODEL`: Model name (`chandra-small` or `chandra`)
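
The variables above could be read at startup roughly as follows. This is a sketch, not the service's actual loader: the `ChandraSettings` class and `load_settings` function are hypothetical, and the fallback of `chandra` for `CHANDRA_MODEL` is an assumption, since no default is documented for it.

```python
import os
from dataclasses import dataclass
from typing import Optional

VALID_MODELS = ("chandra-small", "chandra")


@dataclass(frozen=True)
class ChandraSettings:
    api_url: str
    license_key: Optional[str]
    model: str


def load_settings(env=None):
    """Read the Chandra settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    # Assumption: "chandra" is used when CHANDRA_MODEL is unset.
    model = env.get("CHANDRA_MODEL", "chandra")
    if model not in VALID_MODELS:
        raise ValueError(f"CHANDRA_MODEL must be one of {VALID_MODELS}")
    return ChandraSettings(
        # Documented default for the inference service URL.
        api_url=env.get("CHANDRA_API_URL", "http://chandra-inference:8000"),
        # The license key is optional; an empty value is treated as unset.
        license_key=env.get("CHANDRA_LICENSE_KEY") or None,
        model=model,
    )
```
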

## Integration

This service integrates with:
- Router (`OCR_URL` and `CHANDRA_URL`)
- Gateway (`doc_service.py`)
- Memory Service (for storing processed documents)