- TTS: xtts-v2 integration with voice cloning support
- Document: docling integration for PDF/DOCX/PPTX processing
- Memory Service: added /facts/upsert, /facts/{key}, /facts endpoints
- Added required dependencies (TTS, docling)
35 lines
662 B
Plaintext
fastapi==0.104.1
uvicorn[standard]==0.24.0
httpx==0.25.2
pydantic==2.5.0
pyyaml==6.0.1
python-multipart==0.0.6

# HuggingFace dependencies for OCR models
torch>=2.0.0
torchvision>=0.15.0
transformers>=4.35.0
accelerate>=0.25.0
pillow>=10.0.0
tiktoken>=0.5.0
sentencepiece>=0.1.99
einops>=0.7.0

# STT (Speech-to-Text) dependencies
faster-whisper>=1.0.0
openai-whisper>=20231117

# Image Generation (Diffusion models)
diffusers @ git+https://github.com/huggingface/diffusers.git
safetensors>=0.4.0

# Web Scraping & Search
trafilatura>=1.6.0
duckduckgo-search>=4.0.0
lxml_html_clean>=0.1.0

# TTS (Text-to-Speech)
TTS>=0.22.0

# Document Processing
docling>=2.0.0