microdao-daarion/services/comfy-agent/README.md
Apple c41c68dc08 feat: Add Comfy Agent service for NODE3 image/video generation
- Create comfy-agent service with FastAPI + NATS integration
- ComfyUI client with HTTP/WebSocket support
- REST API: /generate/image, /generate/video, /status, /result
- NATS subjects: agent.invoke.comfy, comfy.request.*
- Async job queue with progress tracking
- Docker compose configuration for NODE3
- Update PROJECT-MASTER-INDEX.md with NODE2/NODE3 docs

Co-Authored-By: Warp <agent@warp.dev>
2026-02-10 04:13:49 -08:00

Comfy Agent Service

Image & Video Generation Service for NODE3

Overview

Comfy Agent is a specialized service that interfaces with ComfyUI for AI-powered image and video generation. It provides both REST API and NATS messaging interfaces, enabling other agents in the DAARION ecosystem to request generation tasks.

Architecture

NODE1 Agents → NATS → Comfy Agent (NODE3) → ComfyUI (port 8188)
                                           → LTX-2 Models (293 GB)

Features

  • REST API: HTTP endpoints that queue generation jobs and return a job ID right away
  • NATS Integration: Async message-based communication with other agents
  • Job Queue: Handles concurrent generation requests with configurable concurrency
  • Progress Tracking: Real-time progress updates via WebSocket monitoring
  • Result Storage: File-based storage with URL-based result retrieval
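
The configurable concurrency of the job queue can be pictured with an asyncio semaphore. This is a minimal sketch under stated assumptions, not the service's actual code: run_jobs, fake_generate, and the stats dict are hypothetical names, and max_concurrency stands in for the MAX_CONCURRENCY setting.

```python
import asyncio


async def run_jobs(job_ids, max_concurrency=1):
    """Run hypothetical generation jobs, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)
    stats = {"running": 0, "peak": 0}
    results = {}

    async def fake_generate(job_id):
        # Stand-in for submitting a workflow to ComfyUI and awaiting completion.
        async with semaphore:
            stats["running"] += 1
            stats["peak"] = max(stats["peak"], stats["running"])
            await asyncio.sleep(0.01)  # simulated generation time
            stats["running"] -= 1
            results[job_id] = "succeeded"

    await asyncio.gather(*(fake_generate(job_id) for job_id in job_ids))
    return results, stats["peak"]
```

With a single GPU, a limit of 1 keeps jobs strictly sequential; jobs beyond the limit wait at the semaphore instead of being rejected.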

API Endpoints

POST /generate/image

Generate an image from a text prompt.

Request:

{
  "prompt": "a futuristic city of gifts, ultra-detailed, cinematic",
  "negative_prompt": "blurry, low quality",
  "width": 1024,
  "height": 1024,
  "steps": 28,
  "seed": 12345
}

Response:

{
  "job_id": "job_abc123...",
  "type": "text-to-image",
  "status": "queued",
  "progress": 0.0
}
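
A client could submit this request from Python with only the standard library. The sketch below is illustrative: build_image_request and submit_image_job are hypothetical helper names, and the base URL assumes the local development port used in the Testing section.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8880"  # assumption: local dev port from this README


def build_image_request(prompt, base_url=BASE_URL, **options):
    """Build (but do not send) a POST /generate/image request."""
    payload = {"prompt": prompt, **options}
    return urllib.request.Request(
        f"{base_url}/generate/image",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_image_job(prompt, **options):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_image_request(prompt, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, submit_image_job("a futuristic city", width=1024, height=1024, steps=28) would return the job description shown above, including the job_id to poll.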

POST /generate/video

Generate a video from a text prompt using LTX-2.

Request:

{
  "prompt": "a cat walking on the moon, cinematic",
  "seconds": 4,
  "fps": 24,
  "steps": 30
}

GET /status/{job_id}

Check the status of a generation job.

Response:

{
  "job_id": "job_abc123...",
  "type": "text-to-image",
  "status": "succeeded",
  "progress": 1.0,
  "result_url": "http://NODE3_IP:8880/files/job_abc123.../output.png"
}
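
Because generation is asynchronous, a caller typically polls this endpoint until the job finishes. The following is a minimal polling sketch: wait_for_job is a hypothetical helper, fetch_status is any callable that returns the status JSON as a dict, and "failed" is an assumed terminal state that this README does not document.

```python
import time


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status(job_id) until the job reaches a terminal state.

    Raises TimeoutError if the job is still pending after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(job_id)
        # "succeeded" appears in this README; "failed" is an assumption.
        if status.get("status") in ("succeeded", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status.get('status')!r}")
        sleep(interval)
```

Passing the fetch function in keeps the loop independent of the HTTP client; in practice it would wrap a GET to /status/{job_id}.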

GET /result/{job_id}

Retrieve the final result. The response is the same as for GET /status/{job_id}.

GET /healthz

Health check endpoint.

NATS Integration

Subscribed Subjects

  • agent.invoke.comfy - Main invocation channel from router
  • comfy.request.image - Direct image generation requests
  • comfy.request.video - Direct video generation requests

Message Format

Request:

{
  "type": "text-to-image",
  "workflow": {
    "1": {"class_type": "CLIPTextEncode", ...},
    "2": {"class_type": "CheckpointLoaderSimple", ...}
  }
}

Response:

{
  "job_id": "job_abc123..."
}
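
Another agent could send such a message with the nats-py client. This is a sketch under assumptions: encode_request and request_generation are hypothetical names, and it assumes the service answers with NATS request-reply, which this README does not state explicitly.

```python
import json


def encode_request(job_type, workflow):
    """Serialize a request in the shape shown above: a type plus a workflow graph."""
    if not isinstance(workflow, dict) or not workflow:
        raise ValueError("workflow must be a non-empty dict of node_id -> node")
    return json.dumps({"type": job_type, "workflow": workflow}).encode("utf-8")


async def request_generation(nats_url, subject, job_type, workflow, timeout=10.0):
    """Send the request and return the job_id from the reply.

    Assumes the Comfy Agent replies on the request's reply subject.
    """
    import nats  # third-party client: pip install nats-py

    nc = await nats.connect(nats_url)
    try:
        reply = await nc.request(subject, encode_request(job_type, workflow), timeout=timeout)
        return json.loads(reply.data)["job_id"]
    finally:
        await nc.close()
```

A caller would run it with asyncio.run(request_generation(url, "comfy.request.image", "text-to-image", workflow)), where workflow is a complete ComfyUI node graph.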

Configuration

Environment variables:

  • COMFYUI_HTTP - ComfyUI HTTP endpoint (default: http://127.0.0.1:8188)
  • COMFYUI_WS - ComfyUI WebSocket endpoint (default: ws://127.0.0.1:8188/ws)
  • NATS_URL - NATS server URL (default: nats://144.76.224.179:4222)
  • STORAGE_PATH - Path for result storage (default: /data/comfy-results)
  • PUBLIC_BASE_URL - Public URL for accessing results (default: http://212.8.58.133:8880/files)
  • MAX_CONCURRENCY - Max concurrent generations (default: 1)
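
One way to read these variables in Python is a small settings loader. The sketch below is not the service's actual config module; Settings and load_settings are hypothetical names, while the variable names and defaults are the ones listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    comfyui_http: str
    comfyui_ws: str
    nats_url: str
    storage_path: str
    public_base_url: str
    max_concurrency: int


def load_settings(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        comfyui_http=env.get("COMFYUI_HTTP", "http://127.0.0.1:8188"),
        comfyui_ws=env.get("COMFYUI_WS", "ws://127.0.0.1:8188/ws"),
        nats_url=env.get("NATS_URL", "nats://144.76.224.179:4222"),
        storage_path=env.get("STORAGE_PATH", "/data/comfy-results"),
        public_base_url=env.get("PUBLIC_BASE_URL", "http://212.8.58.133:8880/files"),
        max_concurrency=int(env.get("MAX_CONCURRENCY", "1")),
    )
```

Accepting an optional mapping makes the loader easy to exercise in tests without touching the real environment.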

Development

Local Setup

cd services/comfy-agent
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows
pip install -r requirements.txt

# Run locally
uvicorn app.main:app --reload --port 8880

Docker Build

docker build -t comfy-agent:latest .

Testing

# Test image generation
curl -X POST http://localhost:8880/generate/image \
  -H "Content-Type: application/json" \
  -d '{"prompt":"a futuristic city, cyberpunk style"}'

# Check status
curl http://localhost:8880/status/job_abc123...

# Health check
curl http://localhost:8880/healthz

TODO / Roadmap

  1. Workflow Templates: Replace placeholder workflows with actual ComfyUI workflows

    • SDXL text-to-image workflow
    • LTX-2 text-to-video workflow
    • Image-to-video workflow
  2. Result Extraction: Implement proper file extraction from ComfyUI history

    • Download images/videos via /view endpoint
    • Support multiple output formats (PNG, JPG, GIF, MP4)
    • Handle batch outputs
  3. Advanced Features:

    • Workflow library management
    • Custom model loading
    • LoRA/ControlNet support
    • Batch processing
    • Queue prioritization
  4. Monitoring:

    • Prometheus metrics
    • Grafana dashboards
    • Alert on failures
    • GPU usage tracking
  5. Storage:

    • S3/MinIO integration for scalable storage
    • Result expiration/cleanup
    • Thumbnail generation
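
For roadmap item 2, result extraction could start from a helper that turns a ComfyUI history entry into /view download URLs. This is a rough sketch: view_urls is a hypothetical name, and the history layout (an outputs mapping of node IDs to lists of file records with filename, subfolder, and type) is an assumption that should be checked against the ComfyUI version in use.

```python
from urllib.parse import urlencode


def view_urls(history_entry, comfyui_http="http://127.0.0.1:8188"):
    """Build a /view download URL for every file listed in one history entry."""
    urls = []
    for node_output in history_entry.get("outputs", {}).values():
        # Each node may expose several output lists (assumed: images, videos, ...).
        for files in node_output.values():
            if not isinstance(files, list):
                continue
            for item in files:
                if isinstance(item, dict) and "filename" in item:
                    query = urlencode({
                        "filename": item["filename"],
                        "subfolder": item.get("subfolder", ""),
                        "type": item.get("type", "output"),
                    })
                    urls.append(f"{comfyui_http}/view?{query}")
    return urls
```

Iterating over every list-valued output, rather than only one key, is a deliberate choice so that batch outputs and non-image formats are picked up without special cases.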

Integration with Agent Registry

Add to config/agent_registry.yml:

comfy:
  id: comfy
  name: Comfy
  role: Image & Video Generation Specialist
  scope: node_local
  node_id: node-3-threadripper-rtx3090
  capabilities:
    - text-to-image
    - text-to-video
    - image-to-video
    - workflow-execution
  api_endpoint: http://212.8.58.133:8880
  nats_subject: agent.invoke.comfy

License

Part of the DAARION MicroDAO project.

Maintainers

  • DAARION Team
  • Last Updated: 2026-02-10