feat: Add Comfy Agent service for NODE3 image/video generation

- Create comfy-agent service with FastAPI + NATS integration
- ComfyUI client with HTTP/WebSocket support
- REST API: /generate/image, /generate/video, /status, /result
- NATS subjects: agent.invoke.comfy, comfy.request.*
- Async job queue with progress tracking
- Docker compose configuration for NODE3
- Update PROJECT-MASTER-INDEX.md with NODE2/NODE3 docs

Co-Authored-By: Warp <agent@warp.dev>
Author: Apple
Date: 2026-02-10 04:13:49 -08:00
Commit: c41c68dc08 (parent 6e0887abcd)
16 changed files with 815 additions and 1 deletion


@@ -0,0 +1,18 @@
__pycache__
*.pyc
*.pyo
*.pyd
.Python
venv/
.venv/
*.egg-info/
.pytest_cache/
.mypy_cache/
.coverage
htmlcov/
dist/
build/
*.log
.DS_Store
.env
.env.local


@@ -0,0 +1,13 @@
# services/comfy-agent/Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r /app/requirements.txt
COPY app /app/app
ENV PYTHONUNBUFFERED=1
EXPOSE 8880
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8880"]


@@ -0,0 +1,215 @@
# Comfy Agent Service
**Image & Video Generation Service for NODE3**
## Overview
Comfy Agent is a specialized service that interfaces with ComfyUI for AI-powered image and video generation. It provides both REST API and NATS messaging interfaces, enabling other agents in the DAARION ecosystem to request generation tasks.
## Architecture
```
NODE1 Agents → NATS → Comfy Agent (NODE3) → ComfyUI (port 8188)
                                          → LTX-2 Models (293 GB)
```
## Features
- **REST API**: Synchronous HTTP endpoints for generation requests
- **NATS Integration**: Async message-based communication with other agents
- **Job Queue**: Handles concurrent generation requests with configurable concurrency
- **Progress Tracking**: Real-time progress updates via WebSocket monitoring
- **Result Storage**: File-based storage with URL-based result retrieval
## API Endpoints
### POST /generate/image
Generate an image from a text prompt.
**Request:**
```json
{
  "prompt": "a futuristic city of gifts, ultra-detailed, cinematic",
  "negative_prompt": "blurry, low quality",
  "width": 1024,
  "height": 1024,
  "steps": 28,
  "seed": 12345
}
```
**Response:**
```json
{
  "job_id": "job_abc123...",
  "type": "text-to-image",
  "status": "queued",
  "progress": 0.0
}
```
### POST /generate/video
Generate a video from a text prompt using LTX-2.
**Request:**
```json
{
  "prompt": "a cat walking on the moon, cinematic",
  "seconds": 4,
  "fps": 24,
  "steps": 30
}
```
### GET /status/{job_id}
Check the status of a generation job.
**Response:**
```json
{
  "job_id": "job_abc123...",
  "type": "text-to-image",
  "status": "succeeded",
  "progress": 1.0,
  "result_url": "http://NODE3_IP:8880/files/job_abc123.../output.png"
}
```
### GET /result/{job_id}
Retrieve the final result (same as status).
### GET /healthz
Health check endpoint.
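A typical client submits a generation request and then polls `/status/{job_id}` until the job reaches a terminal state. A minimal polling sketch (the `poll_until_done` helper is illustrative, not part of the service; the commented usage assumes the agent runs at `http://localhost:8880`):

```python
import time
from typing import Callable, Dict

TERMINAL = {"succeeded", "failed"}  # terminal statuses per the JobStatus model

def poll_until_done(get_status: Callable[[], Dict],
                    interval: float = 2.0,
                    timeout: float = 600.0) -> Dict:
    """Call get_status() until the job reaches a terminal state or timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = get_status()
        if job["status"] in TERMINAL:
            return job
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")

# Against a running service (hypothetical usage, requires httpx):
# import httpx
# job = httpx.post("http://localhost:8880/generate/image",
#                  json={"prompt": "a futuristic city"}).json()
# final = poll_until_done(
#     lambda: httpx.get(f"http://localhost:8880/status/{job['job_id']}").json())
# print(final.get("result_url"))
```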
## NATS Integration
### Subscribed Topics
- `agent.invoke.comfy` - Main invocation channel from router
- `comfy.request.image` - Direct image generation requests
- `comfy.request.video` - Direct video generation requests
### Message Format
**Request:**
```json
{
  "type": "text-to-image",
  "workflow": {
    "1": {"class_type": "CLIPTextEncode", ...},
    "2": {"class_type": "CheckpointLoaderSimple", ...}
  }
}
```
**Response:**
```json
{
  "job_id": "job_abc123..."
}
```
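Another agent can use NATS request/reply against these subjects. A sketch of building and sending such a request (the subject and payload contract are from this README; the one-node workflow graph is a placeholder, and the commented NATS call assumes `nats-py` and a reachable broker):

```python
import json

def build_request(prompt: str) -> bytes:
    """Encode the MVP payload contract: {"type": ..., "workflow": {...}}."""
    payload = {
        "type": "text-to-image",
        "workflow": {
            # Placeholder graph; a real request carries a full ComfyUI workflow.
            "1": {"class_type": "CLIPTextEncode",
                  "inputs": {"text": prompt, "clip": ["2", 0]}},
        },
    }
    return json.dumps(payload).encode("utf-8")

# With nats-py against a live broker (hypothetical usage):
# import asyncio, nats
# async def main():
#     nc = await nats.connect("nats://144.76.224.179:4222")
#     msg = await nc.request("agent.invoke.comfy", build_request("a cat"), timeout=10)
#     print(json.loads(msg.data))  # reply carries {"job_id": "job_..."}
#     await nc.close()
# asyncio.run(main())
```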
## Configuration
Environment variables:
- `COMFYUI_HTTP` - ComfyUI HTTP endpoint (default: `http://127.0.0.1:8188`)
- `COMFYUI_WS` - ComfyUI WebSocket endpoint (default: `ws://127.0.0.1:8188/ws`)
- `NATS_URL` - NATS server URL (default: `nats://144.76.224.179:4222`)
- `STORAGE_PATH` - Path for result storage (default: `/data/comfy-results`)
- `PUBLIC_BASE_URL` - Public URL for accessing results (default: `http://212.8.58.133:8880/files`)
- `MAX_CONCURRENCY` - Max concurrent generations (default: `1`)
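The NODE3 compose configuration added in this commit is not reproduced above; a minimal sketch of a service entry wiring up these variables might look like the following (ports, paths, and the `host.docker.internal` ComfyUI address are assumptions):

```yaml
# docker-compose.yml (sketch, not the committed file)
services:
  comfy-agent:
    build: ./services/comfy-agent
    ports:
      - "8880:8880"
    environment:
      COMFYUI_HTTP: "http://host.docker.internal:8188"
      COMFYUI_WS: "ws://host.docker.internal:8188/ws"
      NATS_URL: "nats://144.76.224.179:4222"
      STORAGE_PATH: "/data/comfy-results"
      PUBLIC_BASE_URL: "http://212.8.58.133:8880/files"
      MAX_CONCURRENCY: "1"
    volumes:
      - /data/comfy-results:/data/comfy-results
    restart: unless-stopped
```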
## Development
### Local Setup
```bash
cd services/comfy-agent
python -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
pip install -r requirements.txt
# Run locally
uvicorn app.main:app --reload --port 8880
```
### Docker Build
```bash
docker build -t comfy-agent:latest .
```
### Testing
```bash
# Test image generation
curl -X POST http://localhost:8880/generate/image \
  -H "Content-Type: application/json" \
  -d '{"prompt":"a futuristic city, cyberpunk style"}'
# Check status
curl http://localhost:8880/status/job_abc123...
# Health check
curl http://localhost:8880/healthz
```
## TODO / Roadmap
1. **Workflow Templates**: Replace placeholder workflows with actual ComfyUI workflows
   - SDXL text-to-image workflow
   - LTX-2 text-to-video workflow
   - Image-to-video workflow
2. **Result Extraction**: Implement proper file extraction from ComfyUI history
   - Download images/videos via `/view` endpoint
   - Support multiple output formats (PNG, JPG, GIF, MP4)
   - Handle batch outputs
3. **Advanced Features**:
   - Workflow library management
   - Custom model loading
   - LoRA/ControlNet support
   - Batch processing
   - Queue prioritization
4. **Monitoring**:
   - Prometheus metrics
   - Grafana dashboards
   - Alert on failures
   - GPU usage tracking
5. **Storage**:
   - S3/MinIO integration for scalable storage
   - Result expiration/cleanup
   - Thumbnail generation
## Integration with Agent Registry
Add to `config/agent_registry.yml`:
```yaml
comfy:
  id: comfy
  name: Comfy
  role: Image & Video Generation Specialist
  scope: node_local
  node_id: node-3-threadripper-rtx3090
  capabilities:
    - text-to-image
    - text-to-video
    - image-to-video
    - workflow-execution
  api_endpoint: http://212.8.58.133:8880
  nats_subject: agent.invoke.comfy
```
## License
Part of the DAARION MicroDAO project.
## Maintainers
- DAARION Team
- Last Updated: 2026-02-10


@@ -0,0 +1 @@
# services/comfy-agent/app/__init__.py


@@ -0,0 +1,51 @@
# services/comfy-agent/app/api.py
from fastapi import APIRouter, HTTPException

from .models import GenerateImageRequest, GenerateVideoRequest, JobStatus
from .jobs import JOB_STORE
from .worker import enqueue

router = APIRouter()


def _build_workflow_t2i(req: GenerateImageRequest) -> dict:
    # MVP: placeholder graph; you will replace with your canonical Comfy workflow JSON.
    # Keep it deterministic and param-driven.
    return {
        "1": {"class_type": "CLIPTextEncode", "inputs": {"text": req.prompt, "clip": ["2", 0]}},
        "2": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": "sdxl.safetensors"}},
        # TODO: Add complete workflow JSON for text-to-image
    }


def _build_workflow_t2v(req: GenerateVideoRequest) -> dict:
    # MVP placeholder for the LTX-2 pipeline; replace with the actual LTX-2 workflow.
    return {
        "1": {"class_type": "CLIPTextEncode", "inputs": {"text": req.prompt, "clip": ["2", 0]}},
        # TODO: Add complete workflow JSON for text-to-video with LTX-2
    }


@router.post("/generate/image", response_model=JobStatus)
async def generate_image(req: GenerateImageRequest):
    job = JOB_STORE.create("text-to-image")
    graph = _build_workflow_t2i(req)
    enqueue(job.job_id, "text-to-image", graph)
    return JOB_STORE.get(job.job_id)


@router.post("/generate/video", response_model=JobStatus)
async def generate_video(req: GenerateVideoRequest):
    job = JOB_STORE.create("text-to-video")
    graph = _build_workflow_t2v(req)
    enqueue(job.job_id, "text-to-video", graph)
    return JOB_STORE.get(job.job_id)


@router.get("/status/{job_id}", response_model=JobStatus)
async def status(job_id: str):
    job = JOB_STORE.get(job_id)
    if not job:
        raise HTTPException(status_code=404, detail="job_not_found")
    return job


@router.get("/result/{job_id}", response_model=JobStatus)
async def result(job_id: str):
    job = JOB_STORE.get(job_id)
    if not job:
        raise HTTPException(status_code=404, detail="job_not_found")
    return job


@@ -0,0 +1,54 @@
# services/comfy-agent/app/comfyui_client.py
import json
from typing import Any, Callable, Dict, Optional

import httpx
import websockets

from .config import settings

ProgressCb = Callable[[float, str], None]


class ComfyUIClient:
    def __init__(self) -> None:
        self.http = httpx.AsyncClient(base_url=settings.COMFYUI_HTTP, timeout=60)

    async def queue_prompt(self, prompt_graph: Dict[str, Any], client_id: str) -> str:
        # ComfyUI expects: {"prompt": {...}, "client_id": "..."}
        r = await self.http.post("/prompt", json={"prompt": prompt_graph, "client_id": client_id})
        r.raise_for_status()
        data = r.json()
        # typically returns {"prompt_id": "...", "number": ...}
        return data["prompt_id"]

    async def wait_progress(self, client_id: str, prompt_id: str, on_progress: Optional[ProgressCb] = None) -> None:
        # WS emits progress/executing/status; keep generic handling
        ws_url = f"{settings.COMFYUI_WS}?clientId={client_id}"
        async with websockets.connect(ws_url, max_size=50_000_000) as ws:
            while True:
                msg = await ws.recv()
                if isinstance(msg, (bytes, bytearray)):
                    # Binary frames (e.g. preview images) are not JSON; skip them.
                    continue
                evt = json.loads(msg)
                # Best-effort progress mapping
                if evt.get("type") == "progress":
                    data = evt.get("data", {})
                    max_v = float(data.get("max", 1.0))
                    val = float(data.get("value", 0.0))
                    p = 0.0 if max_v <= 0 else min(1.0, val / max_v)
                    if on_progress:
                        on_progress(p, "progress")
                # The completion signal varies; "executing" with node=None usually means done
                if evt.get("type") == "executing":
                    data = evt.get("data", {})
                    if data.get("prompt_id") == prompt_id and data.get("node") is None:
                        if on_progress:
                            on_progress(1.0, "done")
                        return

    async def get_history(self, prompt_id: str) -> Dict[str, Any]:
        r = await self.http.get(f"/history/{prompt_id}")
        r.raise_for_status()
        return r.json()

    async def close(self) -> None:
        await self.http.aclose()


@@ -0,0 +1,22 @@
# services/comfy-agent/app/config.py
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    SERVICE_NAME: str = "comfy-agent"
    API_HOST: str = "0.0.0.0"
    API_PORT: int = 8880
    COMFYUI_HTTP: str = "http://127.0.0.1:8188"
    COMFYUI_WS: str = "ws://127.0.0.1:8188/ws"
    NATS_URL: str = "nats://144.76.224.179:4222"  # NODE1 production IP
    NATS_SUBJECT_INVOKE: str = "agent.invoke.comfy"
    NATS_SUBJECT_IMAGE: str = "comfy.request.image"
    NATS_SUBJECT_VIDEO: str = "comfy.request.video"
    STORAGE_PATH: str = "/data/comfy-results"
    PUBLIC_BASE_URL: str = "http://212.8.58.133:8880/files"  # NODE3 IP
    MAX_CONCURRENCY: int = 1  # for LTX-2, 1 is the safer starting point


settings = Settings()


@@ -0,0 +1,25 @@
# services/comfy-agent/app/jobs.py
import uuid
from typing import Dict, Optional

from .models import JobStatus, GenType


class JobStore:
    def __init__(self) -> None:
        self._jobs: Dict[str, JobStatus] = {}

    def create(self, gen_type: GenType) -> JobStatus:
        job_id = f"job_{uuid.uuid4().hex}"
        js = JobStatus(job_id=job_id, type=gen_type, status="queued", progress=0.0)
        self._jobs[job_id] = js
        return js

    def get(self, job_id: str) -> Optional[JobStatus]:
        return self._jobs.get(job_id)

    def update(self, job_id: str, **patch) -> JobStatus:
        js = self._jobs[job_id]
        updated = js.model_copy(update=patch)
        self._jobs[job_id] = updated
        return updated


JOB_STORE = JobStore()


@@ -0,0 +1,26 @@
# services/comfy-agent/app/main.py
import asyncio

from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

from .config import settings
from .api import router
from .worker import worker_loop
from .nats_client import start_nats
from .storage import ensure_storage

app = FastAPI(title="Comfy Agent Service", version="0.1.0")
app.include_router(router)


@app.on_event("startup")
async def startup():
    ensure_storage()
    # Static files for result URLs: /files/{job_id}/...
    app.mount("/files", StaticFiles(directory=settings.STORAGE_PATH), name="files")
    asyncio.create_task(worker_loop())
    await start_nats()


@app.get("/healthz")
async def healthz():
    return {"ok": True, "service": settings.SERVICE_NAME}


@@ -0,0 +1,34 @@
# services/comfy-agent/app/models.py
from typing import Any, Dict, Literal, Optional

from pydantic import BaseModel, Field

GenType = Literal["text-to-image", "text-to-video", "image-to-video"]


class GenerateImageRequest(BaseModel):
    prompt: str = Field(min_length=1)
    negative_prompt: Optional[str] = None
    width: int = 1024
    height: int = 1024
    steps: int = 28
    seed: Optional[int] = None
    workflow: Optional[str] = None
    workflow_params: Dict[str, Any] = Field(default_factory=dict)


class GenerateVideoRequest(BaseModel):
    prompt: str = Field(min_length=1)
    seconds: int = 4
    fps: int = 24
    steps: int = 30
    seed: Optional[int] = None
    workflow: Optional[str] = None
    workflow_params: Dict[str, Any] = Field(default_factory=dict)


class JobStatus(BaseModel):
    job_id: str
    type: GenType
    status: Literal["queued", "running", "succeeded", "failed"]
    progress: float = 0.0
    message: Optional[str] = None
    result_url: Optional[str] = None
    error: Optional[str] = None
    comfy_prompt_id: Optional[str] = None


@@ -0,0 +1,36 @@
# services/comfy-agent/app/nats_client.py
import json

from nats.aio.client import Client as NATS

from .config import settings
from .jobs import JOB_STORE
from .worker import enqueue


async def start_nats() -> NATS:
    nc = NATS()
    await nc.connect(servers=[settings.NATS_URL])

    async def handle(msg):
        reply = msg.reply
        payload = json.loads(msg.data.decode("utf-8"))
        # payload contract (MVP):
        # { "type": "text-to-image|text-to-video", "workflow": {...} }
        gen_type = payload.get("type", "text-to-image")
        workflow = payload.get("workflow")
        if not workflow:
            if reply:
                await nc.publish(reply, json.dumps({"error": "missing_workflow"}).encode())
            return
        job = JOB_STORE.create(gen_type)
        enqueue(job.job_id, gen_type, workflow)
        if reply:
            await nc.publish(reply, json.dumps({"job_id": job.job_id}).encode())

    await nc.subscribe(settings.NATS_SUBJECT_INVOKE, cb=handle)
    await nc.subscribe(settings.NATS_SUBJECT_IMAGE, cb=handle)
    await nc.subscribe(settings.NATS_SUBJECT_VIDEO, cb=handle)
    return nc


@@ -0,0 +1,16 @@
# services/comfy-agent/app/storage.py
import os
from pathlib import Path

from .config import settings


def ensure_storage() -> None:
    Path(settings.STORAGE_PATH).mkdir(parents=True, exist_ok=True)


def make_job_dir(job_id: str) -> str:
    ensure_storage()
    d = os.path.join(settings.STORAGE_PATH, job_id)
    Path(d).mkdir(parents=True, exist_ok=True)
    return d


def public_url(job_id: str, filename: str) -> str:
    return f"{settings.PUBLIC_BASE_URL}/{job_id}/{filename}"


@@ -0,0 +1,60 @@
# services/comfy-agent/app/worker.py
import asyncio
import json
import os
import uuid
from typing import Any, Dict, Optional, Tuple

from .jobs import JOB_STORE
from .storage import make_job_dir, public_url
from .comfyui_client import ComfyUIClient
from .config import settings

_queue: "asyncio.Queue[Tuple[str, str, Dict[str, Any]]]" = asyncio.Queue()


def enqueue(job_id: str, gen_type: str, prompt_graph: Dict[str, Any]) -> None:
    _queue.put_nowait((job_id, gen_type, prompt_graph))


async def _extract_first_output(history: Dict[str, Any], job_dir: str) -> Optional[str]:
    # ComfyUI history structure can vary; implement a conservative extraction:
    # try to find any "images" or "gifs"/"videos" outputs and download via /view.
    # For MVP: prefer /view?filename=...&type=output&subfolder=...
    # Here we return a "manifest.json" to unblock integration even if file fetching needs refinement.
    manifest_path = os.path.join(job_dir, "manifest.json")
    with open(manifest_path, "w", encoding="utf-8") as f:
        json.dump(history, f, ensure_ascii=False, indent=2)
    return "manifest.json"


async def worker_loop() -> None:
    client = ComfyUIClient()
    sem = asyncio.Semaphore(settings.MAX_CONCURRENCY)

    async def run_one(job_id: str, gen_type: str, prompt_graph: Dict[str, Any]) -> None:
        async with sem:
            JOB_STORE.update(job_id, status="running", progress=0.01)
            client_id = f"comfy-agent-{uuid.uuid4().hex}"

            def on_p(p: float, msg: str) -> None:
                JOB_STORE.update(job_id, progress=float(p), message=msg)

            try:
                prompt_id = await client.queue_prompt(prompt_graph, client_id=client_id)
                JOB_STORE.update(job_id, comfy_prompt_id=prompt_id)
                await client.wait_progress(client_id=client_id, prompt_id=prompt_id, on_progress=on_p)
                hist = await client.get_history(prompt_id)
                job_dir = make_job_dir(job_id)
                fname = await _extract_first_output(hist, job_dir)
                if not fname:
                    JOB_STORE.update(job_id, status="failed", error="No outputs found in ComfyUI history")
                    return
                url = public_url(job_id, fname)
                JOB_STORE.update(job_id, status="succeeded", progress=1.0, result_url=url)
            except Exception as e:
                JOB_STORE.update(job_id, status="failed", error=str(e))

    while True:
        job_id, gen_type, prompt_graph = await _queue.get()
        asyncio.create_task(run_one(job_id, gen_type, prompt_graph))


@@ -0,0 +1,9 @@
fastapi==0.115.0
uvicorn[standard]==0.30.6
pydantic==2.8.2
pydantic-settings==2.4.0
httpx==0.27.2
websockets==12.0
nats-py==2.7.2
python-multipart==0.0.9
orjson==3.10.7